7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
noted that the pattern `object = array.objects[--array.nr].item` could
be abstracted as `object = object_array_pop(&array)`.
Unfortunately, one of the conversions was horribly wrong. Between
grabbing the last object (i.e., peeking at it) and decreasing the object
count, the original code would sometimes return early. The updated code,
on the other hand, will always pop the last element, then maybe do the
early return without doing anything with the object.
The end result is that merge commits where all the parents have still
not been exported will simply be dropped, meaning that they will be
completely missing from the exported data.
Re-add a commit when it is not yet time to handle it. An alternative
that was considered was to peek-then-pop. That carries some risk with it
since the peeking and popping need to act on the same object, in a
concerted fashion.
Add a test that would have caught this.
Reported-by: Isaac Chou <Isaac.Chou@microfocus.com>
Analyzed-by: Isaac Chou <Isaac.Chou@microfocus.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of more involved cleanup to come, make a helper function
for doing the cleanup at the end of handle_renames. Rename the already
existing cleanup_rename[s]() to final_cleanup_rename[s](), name the new
helper initial_cleanup_rename(), and leave the big comment in the code
about why we can't do all the cleanup at once.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new function, get_diffpairs(), to compute the diff_filepairs
between two trees. While these are currently only used in
get_renames(), I want them to be available to some new functions. No
actual logic changes yet.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, if !o->detect_rename then get_renames() would return an
empty string_list, and then process_renames() would have nothing to
iterate over. It seems more straightforward to simply avoid calling
either function in that case.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_renames() has always zero'ed out diff_queued_diff.nr while only
manually free'ing diff_filepairs that did not correspond to renames.
Further, it allocated struct renames that were tucked away in the
return string_list. Make sure all of these are deallocated when we
are done with them.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The amount of logic in merge_trees() relative to renames was just a few
lines, but split it out into new handle_renames() and cleanup_renames()
functions to prepare for additional logic to be added to each. No code or
logic changes, just a new place to put stuff for when the rename detection
gains additional checks.
Note that process_renames() records pointers to various information (such
as diff_filepairs) into rename_conflict_info structs. Even though the
rename string_lists are not directly used once handle_renames() completes,
we should not immediately free the lists at the end of that function
because they store the information referenced in the rename_conflict_info,
which is used later in process_entry(). Thus the reason for a separate
cleanup_renames().
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move this function so it can re-use some others (without either
moving all of them or adding an annoying split between function
declarations and definitions). Cheat slightly by adding a blank line
for readability, and in order to silence checkpatch.pl.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I came up with the testcases in the first eight sections before coding up
the implementation. The testcases in this section were mostly ones I
thought of while coding/debugging, and which I was too lazy to insert
into the previous sections because I didn't want to re-label with all the
testcase references. :-)
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a long note about why we are not considering "partial directory
renames" for the current directory rename detection implementation.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We define 'git stash -p' as an alias for 'git stash push -p' in the
manpage. Do the same in the completion script, so all options that
can be given to 'git stash push' are being completed when the user is
using 'git stash -p --<tab>'. Currently the only additional option
the user will get is '--message', but there may be more in the future.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'save' subcommand in git stash has been deprecated in
fd2ebf14db ("stash: mark "git stash save" deprecated in the man page",
2017-10-22).
Stop showing it when the user enters 'git stash <tab>' or 'git stash
s<tab>'. Keep showing it, however, when the user enters 'git stash sa<tab>'
or any more characters of the 'save' subcommand. This is designed to
not encourage users to use 'git stash save', while still offering the
completion once it's clear that's what the user means.
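The rule above can be sketched as a small filter. This is only an
illustration: stash_candidates() is a hypothetical helper, not a
function of the completion script, and the subcommand list is an
assumption made for the example.

```shell
# Hypothetical helper: print the stash subcommands that would be offered
# for the word typed so far. The subcommand list is illustrative.
stash_candidates () {
	local cur="$1" sub
	local subcommands='push list show apply clear drop pop create branch'

	# Offer the deprecated 'save' only once the user has typed at least "sa".
	case "$cur" in
	sa*) subcommands="$subcommands save" ;;
	esac

	# Keep only the candidates that start with the current word.
	for sub in $subcommands
	do
		case "$sub" in
		"$cur"*) printf '%s\n' "$sub" ;;
		esac
	done
}
```

With this sketch, 'stash_candidates s' lists only 'show', while
'stash_candidates sa' lists 'save'.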
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description of the <repository> argument directs readers to "See the
URLS section below". When generating HTML this becomes a link to the
"GIT URLS" section. When reading the man page in a terminal, the
caption is slightly misleading. Use "GIT URLS" as the caption to avoid
any confusion.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git config` has long allowed the ability for callers to provide a 'type
specifier', which instructs `git config` to (1) ensure that incoming
values can be interpreted as that type, and (2) that outgoing values are
canonicalized under that type.
In another series, we propose to extend this functionality with
`--type=color` and `--default` to replace `--get-color`.
However, we traditionally use `--color` to mean "colorize this output",
instead of "this value should be treated as a color".
Currently, `git config` does not support this kind of colorization, but
we should be careful to avoid squatting on this option too soon, so that
`git config` can support `--color` (in the traditional sense) in the
future, if that is desired.
In this patch, we support `--type=<int|bool|bool-or-int|...>` in
addition to `--int`, `--bool`, and so on. This allows the aforementioned
upcoming patch to support querying a color value with a default via
`--type=color --default=...`, without squandering `--color`.
We retain the historic behavior of complaining when multiple,
legacy-style `--<type>` flags are given, as well as extend this to
conflicting new-style `--type=<type>` flags. `--int --type=int` (and its
commutative pair) does not complain, but `--bool --type=int` (and its
commutative pair) does.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the sequencer commits without forking when the commit message
isn't edited, all the commits that are picked have the same committer
date. If a commit is reworded, its committer date will be a later time,
as it is created by running a separate instance of 'git commit'. If
the reworded commit is followed by further picks, those later commits
will have an earlier committer date than the reworded one. This is
caused by git caching the default date used when GIT_COMMITTER_DATE is
not set. Reset the cached date before a commit is generated
in-process.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In case a patch already has In-Reply-To or References in the header
(e.g. when the patch has been created with format-patch --thread)
git-send-email should not add another pair of those headers.
This is also not allowed according to RFC 5322 Section 3.6:
https://tools.ietf.org/html/rfc5322#section-3.6
Avoid the second pair by reading the current headers into the
appropriate variables.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This target should be marked as .PHONY, just like other targets that
exist only for their side effects and do not create filesystem
entities with the same name.
Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function compute_rev_name() can return NULL sometimes (e.g. right
after 'submodule init'). The current code makes 'submodule status'
print this:
19d97bf5af05312267c2e874ee6bcf584d9e9681 sha1collisiondetection ((null))
This ugly 'null' adds no value to the user using this command. More
importantly, printf() on some platforms can't handle NULL as a string
and will crash instead of printing '(null)'.
Check for this and skip printing this part (the alternative is
printing '(n/a)' or something but I think that is just noise).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We tend to quote command line examples using `` to set them in a
monospace font. The immediate motivation for this patch is to get rid of
another instance of \--. As noted in the previous commits, \-- has a
tendency of rendering badly. Here, it renders ok (at least with
AsciiDoc 8.6.9 and Asciidoctor 1.5.4), but by getting rid of this
instance, we reduce the chances of \-- cropping up in places where it
matters more.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
In git-log.txt, we have an instance of \--, which is known to sometimes
render badly. This one is even worse than normal though, since ``\-- ''
(with or without that trailing space) appears to be entirely broken,
both in HTML and manpages, both with AsciiDoc (version 8.6.9) and
Asciidoctor (version 1.5.4).
Further down in git-log.txt we have a ``--'', which renders well. In
git-shortlog.txt, we use "\-- " (including the quotes and the space),
which happens to look fairly good. I failed to find any other similar
instances. So all in all, we quote a double-dash in three different
places and do it differently each time, with various degrees of success.
Switch all of these to `--`. This sets the double-dash in monospace and
matches what we usually do with example command line usages and options.
Note that we drop the trailing space as well, since `-- ` does not
render well. These should still be clear enough since just a few lines
above each instance, the space is clearly visible in a longer context.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Commit 1c262bb7b (doc: convert \--option to --option, 2015-05-13)
explains that we used to need to write \--option to play well with older
versions of AsciiDoc, but that we do not support such versions anymore
anyway, and that Asciidoctor literally renders \--.
With [\--], which is used to denote the optional separator between
revisions and paths, Asciidoctor renders the backslash literally.
Change all [\--] to [--]. This changes nothing for AsciiDoc version
8.6.9, but is an improvement for Asciidoctor version 1.5.4.
We use double-dashes in several list entries (\--::). In my testing, it
appears that we do need to use the backslash there, so leave those.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Rather than using a backslash in \--foo, with or without ''-quoting,
write `--foo` for better rendering. As explained in commit 1c262bb7b
(doc: convert \--option to --option, 2015-05-13), the backslash is not
needed for the versions of AsciiDoc that we support, but is rendered
literally by Asciidoctor.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
An unwanted single quote character in the paragraph documenting the
'gc.aggressiveWindow' config variable prevented the name of that
config variable from being rendered correctly, ever since that piece
of docs was added in 0d7566a5ba (Add --aggressive option to 'git gc',
2007-05-09).
Remove that single quote.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many commands support a "--force" option, frequently abbreviated as
"-f". However, "git worktree remove"'s hand-rolled OPT_BOOL forgets
to recognize the short form, despite git-worktree.txt documenting
"-f" as supported. Replace OPT_BOOL with OPT__FORCE, which provides
"-f" for free, and makes 'remove' consistent with 'add' option
parsing (which also specifies the PARSE_OPT_NOCOMPLETE flag).
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To get the names of all '$__gitcomp_builtin_*' variables caching --options
of builtin commands in order to unset them, 8b0eaa41f2 (completion:
clear cached --options when sourcing the completion script,
2018-03-22) runs a 'set |sed s///' pipeline. This works both in Bash
and in ZSH, but has a higher than necessary overhead with the extra
processes.
In Bash we can do better: run the 'compgen -v __gitcomp_builtin_'
builtin command, which lists the same variables with lower overhead,
because it needs neither a pipeline nor 'sed'.
ZSH will still continue to run that pipeline.
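A minimal sketch of the Bash variant, assuming Bash with programmable
completion enabled. The two cached variables and both function names
are hypothetical and exist only for this example.

```shell
# Hypothetical cached variables, named after the prefix mentioned above.
__gitcomp_builtin_add='--dry-run --force'
__gitcomp_builtin_rm='--cached --force'

# List the names of all variables starting with the prefix, using only
# the compgen builtin (Bash only; no 'set', no 'sed', no pipeline).
list_cached_option_vars () {
	compgen -v __gitcomp_builtin_
}

# Unset every variable found that way.
clear_cached_option_vars () {
	local var
	for var in $(compgen -v __gitcomp_builtin_)
	do
		unset "$var"
	done
}
```

Since 'set' is never invoked here, the contents of unrelated
variables such as $PS1 never pass through any text processing.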
This change also happens to work around an issue in the default Bash
version shipped in macOS (3.2.57), reported by users of the Powerline
shell prompt, which was triggered by the same commit 8b0eaa41f2 as
well. Powerline uses several Unicode Private Use Area code points to
represent some of its pretty text UI elements (arrows and what not),
and these are stored in the $PS1 variable. Apparently the 'set'
builtin of said Bash version on macOS has issues with these code
points, and produces garbled output where Powerline's special symbols
should be in the $PS1 variable. This, in turn, triggers the following
error message in the downstream 'sed' process:
sed: RE error: illegal byte sequence
Other Bash versions, notably 4.4.19 on macOS via homebrew (i.e. a
newer version on the same platform) and 3.2.25 on CentOS (i.e. a
slightly earlier version, though on a different platform) are not
affected. ZSH in macOS (the versions shipped by default or installed
via homebrew) or on other platforms isn't affected either.
With this patch neither the 'set' builtin is invoked to print garbage,
nor 'sed' to choke on it.
Issue-on-macOS-reported-by: Stephon Harris <theonestep4@gmail.com>
Issue-on-macOS-explained-by: Matthew Coleman <matt@1eanda.com>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During git-aware path completion, when a lot of path components have
to be listed, a significant amount of time is spent in
__gitcomp_file(), or more accurately in the shell loop of
__gitcompappend(), iterating over all the path components, filtering
path components matching the current word to be completed, adding
prefix path components, and placing the resulting matching paths into
the COMPREPLY array.
Now, a previous patch in this series made 'git ls-files' and 'git
diff-index' list only paths matching the current word to be completed,
so an additional filtering in __gitcomp_file() is not necessary
anymore. Adding the prefix path components could be done much more
efficiently in __git_index_files()'s 'awk' script while stripping
trailing path components and removing duplicates and quoting. And
then the resulting paths won't require any more filtering or
processing before being handed over to Bash, so we could fill the
COMPREPLY array directly.
Unfortunately, we can't simply use the __gitcomp_direct() helper
function to do that, because __gitcomp_file() does one additional
thing: it tells Bash that we are doing filename completion, so the
shell will kindly do four important things for us:
1. Append a trailing space to all filenames.
2. Append a trailing '/' to all directory names.
3. Escape any meta, globbing, separator, etc. characters.
4. List only the current path component when listing possible
completions (i.e. 'dir/subdir/f<TAB>' will list 'file1', 'file2',
etc. instead of the whole 'dir/subdir/file1',
'dir/subdir/file2').
While we could let __git_index_files()'s 'awk' script take care of the
first two points, the third one gets tricky, and we absolutely need
the shell's support for the fourth.
Add the helper function __gitcomp_file_direct(), which, just like
__gitcomp_direct(), fills the COMPREPLY array with prefiltered and
preprocessed paths without any additional processing, without a shell
loop, with just one single compound assignment, and, similar to
__gitcomp_file(), tells Bash and ZSH that we are doing filename
completion. Extend __git_index_files()'s 'awk' script a bit to
prepend any prefix path components to all listed paths. Finally,
modify __git_complete_index_file() to feed __git_index_files()'s
output to __gitcomp_file_direct() instead of __gitcomp_file().
After this patch there is no shell loop left in the path completion
code path.
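The "one single compound assignment" can be sketched as follows,
assuming Bash. fill_compreply_direct() is a hypothetical name, and the
sketch deliberately leaves out the part that tells the shell about
filename completion, which only makes sense inside a running
completion function.

```shell
# Hypothetical sketch: put newline-separated, already filtered paths
# into COMPREPLY without a shell loop.
fill_compreply_direct () {
	# Split on newlines only, so paths containing spaces stay intact.
	local IFS=$'\n'

	# One compound assignment fills the whole array. Note that the
	# unquoted expansion is also subject to pathname expansion.
	COMPREPLY=($1)
}
```

Calling it with two newline-separated paths, one of which contains a
space, leaves exactly two elements in COMPREPLY.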
This speeds up path completion when there are a lot of paths matching
the current word to be completed. In a pathological repository with
100k files in a single directory, listing all those files:
Before this patch, best of five, using GNU awk on Linux:
$ time cur=dir/ __git_complete_index_file
real 0m0.983s
user 0m1.004s
sys 0m0.033s
After:
real 0m0.313s
user 0m0.341s
sys 0m0.029s
Difference: -68.2%
Speedup: 3.1x
To see the benefits of the whole patch series, the same command with
v2.17.0:
real 0m2.736s
user 0m2.472s
sys 0m0.610s
Difference: -88.6%
Speedup: 8.7x
Note that this patch changes the output of the __git_index_files()
helper function by unconditionally prepending the prefix path
components to every listed path. This would break users' completion
scriptlets that directly run:
__gitcomp_file "$(__git_index_files ...)" "$pfx" "$cur_"
because that would add the prefix path components once more.
However, __git_index_files() is kind of a "helper function of a helper
function", and users' completion scriptlets should have been using
__git_complete_index_file() for git-aware path completion in the first
place, so this is likely not worth worrying about.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If any pathname contains backslash, double quote, tab, newline, or any
control characters, 'git ls-files' and 'git diff-index' will enclose
that pathname in double quotes and escape those special characters
using C-style one-character escape sequences or \nnn octal values.
This prevents those files from being listed during git-aware path
completion, because due to the quoting they will never match the
current word to be completed.
Extend __git_index_files()'s 'awk' script to remove all that quoting
and escaping from unique path components, so even paths containing
(almost all) such special characters can be completed.
Paths containing newline characters are still an issue, though. We
use newlines as separator character when filling the COMPREPLY array,
so a path with one or more newlines will end up split into two or more
elements in COMPREPLY, basically breaking completion. There is
nothing we can do about it without a significant performance hit, so
let's just ignore such paths for now. As far as paths with newlines
are concerned, this isn't any different from the previous behavior,
because those paths were always omitted, though in the past they were
omitted because due to the quoting they didn't match the current word
to be completed. Anyway, Bash's own filename completion (Meta-/) can
complete even those paths, if need be.
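A simplified sketch of such dequoting in 'awk', under the assumption
that quoted paths use only C-style one-character escapes and \nnn
octal values as described above. dequote_paths() is a hypothetical
name, and the sketch is only exercised with ASCII input; it is not the
script added by this patch.

```shell
# Hypothetical sketch: remove double quotes and backslash escapes from
# each input line, and skip paths that contain an escaped newline.
dequote_paths () {
	awk '
	function dequote(p,    out, i, esc, idx, dec) {
		p = substr(p, 2)	# skip the opening double quote
		while ((i = index(p, "\\")) != 0) {
			out = out substr(p, 1, i - 1)
			esc = substr(p, i + 1, 1)
			p = substr(p, i + 2)
			if ((idx = index("abtvfr\"\\", esc)) != 0) {
				# C-style one-character escape sequence
				out = out substr("\a\b\t\v\f\r\"\\", idx, 1)
			} else if (esc == "n") {
				return ""	# escaped newline: skip this path
			} else {
				# assume a \nnn octal value
				dec = esc * 64 + substr(p, 1, 1) * 8 + substr(p, 2, 1)
				out = out sprintf("%c", dec)
				p = substr(p, 3)
			}
		}
		# drop the closing double quote, if there is one
		if (substr(p, length(p), 1) == "\"")
			p = substr(p, 1, length(p) - 1)
		return out p
	}
	{
		path = (substr($0, 1, 1) == "\"") ? dequote($0) : $0
		if (path != "")
			print path
	}
	'
}
```

Unquoted lines pass through unchanged, so the function can be fed a
mix of quoted and unquoted paths.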
Note:
- We don't dequote path components right away as they are coming in,
because then we would have to dequote each directory name
repeatedly, as many times as it appears in the input, i.e. as many
times as the number of listed paths it contains. Instead, we
dequote them at the end, as we print unique path components.
- Even when a directory name itself does not contain any special
characters, it will still be quoted if any of its trailing path
components do. If a directory contains paths both with and
without special characters, then the name of that directory will
appear both quoted and unquoted in the output of 'git ls-files'
and 'git diff-index'. Consequently, we will add such a directory
name to the deduplicating associative array twice: once quoted and
once unquoted.
This means that we have to be careful after dequoting a directory
name, and only print it if we haven't seen the same directory name
unquoted.
- It would be wonderful if we could just pass '-z' to those git
commands to output \0-separated unquoted paths, and use \0 as
record separator in the 'awk' script processing their output...
this patch would be so much simpler, almost trivial even.
Unfortunately, however, POSIX and most 'awk' implementations don't
support \0 as record separator (GNU awk does support it).
- This patch makes the earlier change to list paths with
'core.quotePath=false' basically redundant, because this could
decode any \nnn-escaped non-ASCII character just fine, as well.
However, I suspect that 'git ls-files' can deal with those
non-ASCII characters faster than this updated 'awk' script; just
in case someone is burdened with tons of pathnames containing
non-ASCII characters.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During git-aware path completion, after all the trailing path
components have been removed from the output of 'git ls-files' and
'git diff-index' (see previous patch), each directory name is repeated
as many times as the number of listed paths it contains. This can be
a lot of repetitions, especially when invoking path completion close
to the root of a big worktree, which would cause a considerable
overhead downstream of __git_index_files(), in particular in the shell
loop that fills the COMPREPLY array. To reduce this overhead,
__git_index_files() runs the classic '... |sort |uniq' pattern to
remove those repetitions from the function's output.
While removing repeated directory names is effective in reducing the
number of iterations in that shell loop, it still imposes the overhead
of fork()+exec()ing two external processes, and two additional stages
in the pipeline, where a potentially large amount of data can
be passed between two subsequent pipeline stages.
Extend __git_index_files()'s 'awk' script to remove repeated path
components by first creating and filling an associative array indexed
by all encountered path components (after the trailing path components
have been removed), and then iterating over this array and printing
the indices, i.e. unique path components. This way we can remove the
'|sort |uniq' pipeline stages, and their eliminated overhead results
in faster path completion.
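A minimal sketch of the deduplication, with unique_leading_components()
as a hypothetical name. It assumes the input is one path per line and
keeps only the leading path component of each.

```shell
# Hypothetical sketch: print each leading path component once, using an
# associative array instead of '|sort |uniq'.
unique_leading_components () {
	awk -F/ '
		# Record the leading component as an array index.
		{ seen[$1] = 1 }
		# Print the indices, i.e. the unique components; the
		# iteration order is unspecified.
		END {
			for (component in seen)
				print component
		}
	'
}
```

Because the output order is unspecified, any comparison of the result
has to sort it first.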
Listing all tracked files (12) and directories (23) at the top of the
worktree in linux.git (over 62k files), i.e. what's doing all the hard
work behind 'git rm <TAB>':
Before this patch, best of five, using GNU awk on Linux:
real 0m0.069s
user 0m0.089s
sys 0m0.026s
After:
real 0m0.052s
user 0m0.072s
sys 0m0.014s
Difference: -24.6%
Note that this changes the order of elements in __git_index_files()'s
output. This is not an issue, because this function was only ever
intended to feed paths into the COMPREPLY array, and Bash will sort
its elements (according to the user's locale) anyway.
Note also that using 'awk' to remove repeated path components is also
beneficial for the performance of the next two patches:
- The first will extend this 'awk' script to dequote quoted paths in
the output of 'git ls-files' and 'git diff-index'. With this
patch it will only have to dequote unique path components, not
all.
- The second will, among other things, extend this 'awk' script to
prepend prefix path components from the command line to the
currently completed path component. Consequently, each line in
'awk's output will grow longer. Without this patch that '|sort
|uniq' would have to exchange and process that much more data.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The order of possible completion words in the COMPREPLY array doesn't
actually matter, as long as all the right words are in there, because
Bash will sort them anyway. Yet, our tests looking at the elements of
COMPREPLY always expect them to be in a specific order.
Now, this hasn't been an issue before, but the next patch is about to
optimize our git-aware path completion a bit more, and as a harmless
side effect the order of elements in COMPREPLY will change. Worse,
the order will be downright undefined, because after the next patch
path components will come directly from iterating through an
associative array in 'awk', and the order of iteration over the
elements in those arrays is undefined, and indeed different 'awk'
implementations produce different orders. Consequently, we can't get
away with simply adjusting the expected results in the affected tests.
Modify the 'test_completion' helper function to sort both the expected
and the actual results, i.e. the elements in COMPREPLY, before
comparing them, so the tests using this helper function will work
regardless of the order of elements.
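The idea can be sketched outside the test suite like this.
same_elements() is a hypothetical helper that takes two
newline-separated lists; it is not the 'test_completion' function
itself.

```shell
# Hypothetical sketch: succeed if both newline-separated lists contain
# the same lines, regardless of their order.
same_elements () {
	local expected="$1" actual="$2"

	# Sort both sides in a fixed locale before comparing them.
	[ "$(printf '%s\n' "$expected" | LC_ALL=C sort)" = \
	  "$(printf '%s\n' "$actual" | LC_ALL=C sort)" ]
}
```

A caller would pass the expected words and the elements of COMPREPLY,
each joined with newlines.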
Note that this change still leaves a bunch of tests depending on the
order of elements in COMPREPLY, tests that focus on a specific helper
function and therefore don't use the 'test_completion' helper. I
would rather deal with those later, when (if ever) the need actually
arises, than create unnecessary code churn now.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During git-aware path completion we complete one path component at a
time, i.e. 'git add <TAB>' offers only 'dir/' at first, not
'dir/subdir/file' right away, just like Bash's own filename
completion. However, since both 'git ls-files' and 'git diff-index'
dive deep into subdirectories, we have to strip all trailing path
components from the listed paths, keeping only the leading path
component. This stripping is currently done in a shell loop in
__git_index_files(), which can take a significant amount of time when
it has to iterate through a large number of paths.
Replace this shell loop with a little 'awk' script using '/' as input
field separator and printing the first field, which produces the same
output much faster.
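For illustration, here are both approaches side by side as a minimal
sketch. The function names are hypothetical, and the input is assumed
to be one path per line.

```shell
# Shell loop: one iteration per listed path.
leading_components_loop () {
	local path
	while IFS= read -r path
	do
		# Strip everything from the first slash onwards.
		printf '%s\n' "${path%%/*}"
	done
}

# awk: use the slash as field separator and print the first field.
leading_components_awk () {
	awk -F/ '{ print $1 }'
}
```

Both print the same lines for the same input; only the cost per path
differs.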
Listing all tracked files (12) and directories (23) at the top of the
worktree in linux.git (over 62k files), i.e. what's doing all the hard
work behind 'git rm <TAB>':
Before this patch, best of five, using GNU awk on Linux:
$ time cur= __git_complete_index_file
real 0m2.149s
user 0m1.307s
sys 0m1.086s
After:
real 0m0.067s
user 0m0.089s
sys 0m0.023s
Difference: -96.9%
Speedup: 32.1x
Note that this could be done with 'sed', or even with 'cut', just as
well, but the upcoming patches require 'awk's scriptability.
Note also that this change means one more fork()+exec()ed process
during path completion, adding more overhead especially on Windows,
but a later patch will more than make up for it by eliminating two
other processes in the same function.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During git-aware path completion, e.g. 'git rm dir/fil<TAB>', both
'git ls-files' and 'git diff-index' list all paths in the given 'dir/'
matching certain criteria (cached, modified, untracked, etc.)
appropriate for the given git command, even paths whose names don't
begin with 'fil'. This comes with a considerable performance
penalty when the directory in question contains a lot of paths, but
the current word can be uniquely completed or when only a handful of
those paths match the current word.
Reduce the number of iterations in this codepath from the number of
paths to the number of matching paths by specifying an appropriate
globbing pattern to 'git ls-files' and 'git diff-index' to list only
paths that match the current word to be completed.
Note that both commands treat backslashes as escape characters in
their file arguments, e.g. to preserve the literal meaning of globbing
characters, so we have to double every backslash in the globbing
pattern. This is why one of the path completion tests specifically
checks the completion of a path containing a literal backslash
character (that test still fails, though, because both commands output
such paths enclosed in double quotes and the special characters
escaped; a later patch in this series will deal with those).
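A sketch of building such a pattern, with glob_for_word() as a
hypothetical name. It uses 'sed' purely to keep the example short and
portable; that is a choice made for this sketch, and a completion
script would rather avoid the extra process.

```shell
# Hypothetical sketch: turn the word to be completed into a globbing
# pattern that matches every path beginning with that word.
glob_for_word () {
	# Double every backslash so that it keeps its literal meaning,
	# then append the trailing wildcard.
	printf '%s\n' "$1" | sed -e 's/\\/\\\\/g' -e 's/$/*/'
}
```

For the word 'Mak' this prints 'Mak*', and a word containing one
literal backslash comes out with two.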
This speeds up path completion considerably when there are a lot of
non-matching paths to be filtered out. Uniquely completing a tracked
filename at the top of the worktree in linux.git (over 62k files),
i.e. what's doing all the hard work behind 'git rm Mak<TAB>' to
complete 'Makefile':
Before this patch, best of five, on Linux:
$ time cur=Mak __git_complete_index_file
real 0m2.159s
user 0m1.299s
sys 0m1.089s
After:
real 0m0.033s
user 0m0.023s
sys 0m0.015s
Difference: -98.5%
Speedup: 65.4x
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our git-aware path completion doesn't work when it has to complete a
word already containing quoted and/or backslash-escaped characters on
the command line. The root cause of the issue is that completion
functions see all words on the command line verbatim, i.e. including
all backslash, single and double quote characters that the shell would
eventually remove when executing the finished command. These
quoting/escaping characters cause different issues depending on which
path component of the word to be completed contains them:
- The quoting/escaping is in the prefix path component(s).
Let's suppose we have a directory called 'New Dir', containing two
untracked files 'file.c' and 'file.o', and we have a gitignore
rule ignoring object files. In this case all of these:
git add New\ Dir/<TAB>
git add "New Dir/<TAB>
git add 'New Dir/<TAB>
should uniquely complete 'file.c' right away, but Bash offers both
'file.c' and 'file.o' instead. The reason for this behavior is
that our completion script uses the prefix directory name like
'git -C "New\ Dir/" ls-files ...', i.e. with the backslash inside
double quotes. Git then tries to enter a directory called
'New\ Dir', which (most likely) fails because such a directory
doesn't exist. As a result, our completion script doesn't list
any files, leaves the COMPREPLY array empty, which in turn causes
Bash to fall back to its simple filename completion and lists all
files in that directory, i.e. both 'file.c' and 'file.o'.
- The quoting/escaping is in the path component to be completed.
Let's suppose we have two untracked files 'New File.c' and
'New File.o', and we have a gitignore rule ignoring object files.
In this case all of these:
git add New\ Fi<TAB>
git add "New Fi<TAB>
git add 'New Fi<TAB>
should uniquely complete 'New File.c' right away, but Bash offers
both 'New File.c' and 'New File.o' instead. The reason for this
behavior is that our completion script uses this 'New\ Fi' or
'"New Fi' etc. word to filter matching paths, and of course none
of the potential filenames will match because of the included
backslash or double quote. The end result is the same as above:
the completion script doesn't list any files, Bash falls back to
its filename completion, which then lists the matching object file
as well.
Add the new helper function __git_dequote() [1], which removes (most
of[2]) the quoting and escaping from the word it gets as argument. To
minimize the overhead of calling this function, store its result in
the variable $dequoted_word, which is expected to be declared local in
the caller; simply printing the result would require a command
substitution imposing the overhead of fork()ing a subshell. Use this
function in __git_complete_index_file() to dequote the current word,
i.e. the path, to be completed, to avoid the above described
quoting-related issues, thereby fixing two of the failing quoted path
completion tests.
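As a rough illustration of the approach described above, here is a
minimal sketch of such a dequoting helper. It is not the actual
__git_dequote() implementation: the function name sketch_dequote is
hypothetical, and the sketch only assumes the behavior stated above
(the result is stored in $dequoted_word, and an unterminated quote is
tolerated).

```bash
# Hypothetical sketch, NOT the real __git_dequote().
# The result is stored in $dequoted_word rather than printed, so the
# caller needs no command substitution (and thus no subshell).
sketch_dequote () {
    local rest="$1" ch
    dequoted_word=""

    while [ -n "$rest" ]; do
        ch="${rest:0:1}"
        rest="${rest:1}"
        case "$ch" in
        \\)
            # Backslash outside quotes: take the next character literally.
            dequoted_word+="${rest:0:1}"
            rest="${rest:1}"
            ;;
        \')
            # Single quotes: everything up to the closing quote is literal.
            # A missing closing quote just consumes the rest of the word.
            while [ -n "$rest" ]; do
                ch="${rest:0:1}"
                rest="${rest:1}"
                if [ "$ch" = "'" ]; then
                    break
                fi
                dequoted_word+="$ch"
            done
            ;;
        \")
            # Double quotes: a backslash escapes only \, ", $ and `.
            while [ -n "$rest" ]; do
                ch="${rest:0:1}"
                rest="${rest:1}"
                case "$ch" in
                \")
                    break
                    ;;
                \\)
                    case "${rest:0:1}" in
                    \\|\"|\$|\`)
                        dequoted_word+="${rest:0:1}"
                        rest="${rest:1}"
                        ;;
                    *)
                        dequoted_word+="\\"
                        ;;
                    esac
                    ;;
                *)
                    dequoted_word+="$ch"
                    ;;
                esac
            done
            ;;
        *)
            dequoted_word+="$ch"
            ;;
        esac
    done
}
```

A caller would declare 'local dequoted_word' and then run, e.g.,
'sketch_dequote "$cur"'; all three spellings of 'New Dir/' from the
examples above end up as the same dequoted string.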
[1] The bash-completion project already has a dequote() function,
which I hoped I could borrow to deal with this, but unfortunately
it doesn't work quite well for this purpose (perhaps that's why
even the bash-completion project only rarely uses it). The main
issue is that their dequote() is implemented as:
eval printf %s "$1" 2> /dev/null
where $1 would contain the word to be completed. While it's a
short and sweet one-liner, the use of 'eval' requires that $1 is a
syntactically valid string, which is not the case when quoting the
path like 'git add "New Dir/<TAB>'. This causes 'eval' to fail,
because it can't find the matching closing double quote, and the
function returns nothing. The result is totally broken behavior,
as if the current word were empty, and the completion script would
then list all files from the current directory. This is why one
of the quoted path completion tests specifically checks the
completion of a path with an opening but without a corresponding
closing double quote character. Furthermore, the 'eval' performs
all kinds of expansions, which may or may not be desired; I think
it's the latter. Finally, using this function would require a
command substitution.
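The failure mode can be reproduced with a small sketch. The function
body is the one-liner quoted above; the variable names (demo_dir,
escaped, unterminated, expanded) are made up for this illustration.

```bash
# The eval-based one-liner quoted above, wrapped in a function.
dequote () {
    eval printf %s "$1" 2> /dev/null
}

# Hypothetical variable, used only to show that eval expands variables.
demo_dir=/tmp/demo

# A syntactically complete word is dequoted as expected.
escaped=$(dequote 'New\ Dir/')

# An unterminated double quote makes eval fail, so nothing is printed.
# "|| true" only keeps the non-zero status from aborting a "set -e" run.
unterminated=$(dequote '"New Dir/') || true

# eval also performs expansions: the literal word $demo_dir is expanded.
expanded=$(dequote '$demo_dir')

printf 'escaped=[%s] unterminated=[%s] expanded=[%s]\n' \
    "$escaped" "$unterminated" "$expanded"
```

Each call also goes through a command substitution, which is the
subshell overhead mentioned above.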
[2] Bash understands the $'string' quoting as well, which "expands to
'string', with backslash-escaped characters replaced as specified
by the ANSI C standard" (quoted from Bash manpage). Since shell
metacharacters, field separators, globbing, etc. can all be easily
entered using standard shell escaping or quoting, this type of
quoting comes in handy when dealing with control characters that
are otherwise difficult both to "type" and to see on the command
line. Because of this difficulty I would assume that people do
avoid pathnames with such control characters anyway, so I didn't
bother implementing it. This function is already way too long as
it is.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless the user has 'core.quotePath=false' somewhere in the
configuration, both 'git ls-files' and 'git diff-index' will by
default quote any pathnames that contain bytes with values higher than
0x80, and escape those bytes as '\nnn' octal values. This prevents
completing paths when the current path component to be completed
contains any non-ASCII, most notably UTF-8, characters, because none
of the listed quoted paths will match the current word on the command
line.
Set 'core.quotePath=false' for those 'git ls-files' and 'git
diff-index' invocations, so they won't consider bytes higher than 0x80
as "unusual", and won't quote pathnames containing such characters.
Note that pathnames containing backslash, double quote, or control
characters will still be quoted; a later patch in this series will
deal with those.
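The difference can be observed with a short sketch in a throwaway
repository. The file name and the use of 'ls-files --others' are
chosen only for this illustration; the completion script's actual
invocations are not shown here.

```bash
# Work in a throwaway repository so nothing else is touched.
tmp=$(mktemp -d) && cd "$tmp" && git init -q .

# Create an untracked file named "föö.txt" (UTF-8 bytes given in octal).
name=$(printf 'f\303\266\303\266.txt')
: > "$name"

# With quoting enabled the non-ASCII bytes are shown as \nnn escapes
# and the whole path is wrapped in double quotes.
quoted=$(git -c core.quotePath=true ls-files --others)

# With quoting disabled the path is listed verbatim.
unquoted=$(git -c core.quotePath=false ls-files --others)

printf '%s\n%s\n' "$quoted" "$unquoted"
```

Only the second form can match what the user actually typed on the
command line.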
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time 'git -C "" cmd' errored out with "Cannot change to
'': No such file or directory", therefore the completion script took
extra steps to run 'git -C "." cmd' instead; see fca416a41e
(completion: use "git -C $there" instead of (cd $there && git ...),
2014-10-09).
Those extra steps are not needed since 6a536e2076 (git: treat "git -C
'<path>'" as a no-op when <path> is empty, 2015-03-06), so remove
them.
While at it, also simplify how the trailing '/' is appended to the
variable holding the prefix path components.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's much easier to read, understand and modify the functions related
to git-aware path completion when they are right next to each other.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Completion functions see all words on the command line verbatim,
including any backslash-escapes, single and double quotes that might
be there. Furthermore, git commands quote pathnames if they contain
certain special characters. All these create various issues when
doing git-aware path completion.
Add a couple of failing tests to demonstrate these issues.
Later patches in this series will discuss these issues in detail as
they fix them.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though "direct ancestor" is not defined in the glossary, the
common meaning of the term is simply "parent", parents being the only
direct ancestors, and the rest of the ancestors being indirect ancestors.
As "parent" is obviously wrong in this place in the description, we
should simply say "ancestor", as everywhere else.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-blame.el mode has been superseded by Emacs's own
vc-annotate (invoked by C-x v g). Users of the git.el mode are now
much better off using either Magit or the Git backend for Emacs's own
VC mode.
These modes were added over 10 years ago when Emacs's own Git support
was much less mature, and there weren't other mature modes in the wild
or shipped with Emacs itself.
These days these modes have few if any users, and users of git aren't
well served by us shipping these (some OS's install them alongside git
by default, which is confusing and leads users astray).
So let's remove these per Alexandre Julliard's message to the
ML[1]. If someone still wants these for some reason they're better
served by hosting these elsewhere (e.g. on ELPA), instead of us
distributing them with git.
However, since downstream packagers such as Debian are packaging this
as git-el it's less disruptive to still carry these files as Elisp
code that'll error out with a message suggesting alternatives, rather
than drop the files entirely[2].
Then, rather than receive a cryptic load error when they upgrade,
existing users will get an error directing them to the README file, or
to just stop requiring these modes. I think it makes sense to link to
GitHub's hosting of contrib/emacs/README (which'll be updated by the
time users see this) so they don't have to hunt down the packaged
README on their local system.
1. "Re: [PATCH] git.el: handle default excludesfile
properly" (87muzlwhb0.fsf@winehq.org) --
https://public-inbox.org/git/87muzlwhb0.fsf@winehq.org/
2. "Re: [PATCH v3] git{,-blame}.el: remove old bitrotting Emacs
code" (20180327165751.GA4343@aiede.svl.corp.google.com) --
https://public-inbox.org/git/20180327165751.GA4343@aiede.svl.corp.google.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A signed tag has a detached signature like this:
object ...
[...more header...]
This is the tag body.
-----BEGIN PGP SIGNATURE-----
[opaque gpg data]
-----END PGP SIGNATURE-----
Our parser finds the _first_ line that appears to start a
PGP signature block, meaning we may be confused by a
signature (or a signature-like line) in the actual body.
Let's keep parsing and always find the final block, which
should be the detached signature over all of the preceding
content.
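The idea of "keep scanning and remember the last match" can be
sketched in shell as follows. This is only an illustration, not the C
parser this patch touches; the function name last_signature_line and
the sample payload are hypothetical.

```bash
# Hypothetical helper: read a tag payload on stdin and print the
# number of the LAST line that starts a PGP signature block
# (0 if there is none).
last_signature_line () {
    local line n=0 last=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        case "$line" in
        "-----BEGIN PGP SIGNATURE-----"*)
            # Do not stop at the first match; a later one overrides it.
            last=$n
            ;;
        esac
    done
    echo "$last"
}

# Sample payload whose body itself contains a signature-like block.
tag_payload='object ...
This is the tag body.
-----BEGIN PGP SIGNATURE-----
[signature pasted into the body]
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNATURE-----
[opaque gpg data]
-----END PGP SIGNATURE-----'

sig_start=$(printf '%s\n' "$tag_payload" | last_signature_line)
echo "detached signature starts at line $sig_start"
```

Stopping at the first match would report line 3 here, which is part
of the body rather than the detached signature.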
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's separate the actual line-by-line parsing of signatures
from the notion of "is this a gpg signature line". That will
make it easier to do more refactoring of this loop in future
patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We accidentally shed the "const" of our buffer by passing it
through memchr. Let's fix that, and while we're at it, move
our variable declaration inside the loop, which is the only
place that uses it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>