Commit Graph

52131 Commits

Author SHA1 Message Date
Martin Ågren
be011bbe00 fast-export: fix regression skipping some merge-commits
7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
noted that the pattern `object = array.objects[--array.nr].item` could
be abstracted as `object = object_array_pop(&array)`.

Unfortunately, one of the conversions was horribly wrong. Between
grabbing the last object (i.e., peeking at it) and decreasing the object
count, the original code would sometimes return early. The updated code
on the other hand, will always pop the last element, then maybe do the
early return without doing anything with the object.

The end result is that merge commits where all the parents have still
not been exported will simply be dropped, meaning that they will be
completely missing from the exported data.

Re-add a commit when it is not yet time to handle it. An alternative
that was considered was to peek-then-pop. That carries some risk with it
since the peeking and popping need to act on the same object, in a
concerted fashion.

Add a test that would have caught this.

Reported-by: Isaac Chou <Isaac.Chou@microfocus.com>
Analyzed-by: Isaac Chou <Isaac.Chou@microfocus.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-21 12:43:40 +09:00
Elijah Newren
ffc16c490a merge-recursive: make a helper function for cleanup for handle_renames
In anticipation of more involved cleanup to come, make a helper function
for doing the cleanup at the end of handle_renames.  Rename the already
existing cleanup_rename[s]() to final_cleanup_rename[s](), name the new
helper initial_cleanup_rename(), and leave the big comment in the code
about why we can't do all the cleanup at once.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:16 +09:00
Elijah Newren
e5257b2a0d merge-recursive: split out code for determining diff_filepairs
Create a new function, get_diffpairs() to compute the diff_filepairs
between two trees.  While these are currently only used in
get_renames(), I want them to be available to some new functions.  No
actual logic changes yet.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
3992ff0c4c merge-recursive: make !o->detect_rename codepath more obvious
Previously, if !o->detect_rename then get_renames() would return an
empty string_list, and then process_renames() would have nothing to
iterate over.  It seems more straightforward to simply avoid calling
either function in that case.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
9cfee25a82 merge-recursive: fix leaks of allocated renames and diff_filepairs
get_renames() has always zero'ed out diff_queued_diff.nr while only
manually free'ing diff_filepairs that did not correspond to renames.
Further, it allocated struct renames that were tucked away in the
return string_list.  Make sure all of these are deallocated when we
are done with them.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
f172589e59 merge-recursive: introduce new functions to handle rename logic
The amount of logic in merge_trees() relative to renames was just a few
lines, but split it out into new handle_renames() and cleanup_renames()
functions to prepare for additional logic to be added to each.  No code or
logic changes, just a new place to put stuff for when the rename detection
gains additional checks.

Note that process_renames() records pointers to various information (such
as diff_filepairs) into rename_conflict_info structs.  Even though the
rename string_lists are not directly used once handle_renames() completes,
we should not immediately free the lists at the end of that function
because they store the information referenced in the rename_conflict_info,
which is used later in process_entry().  Thus the reason for a separate
cleanup_renames().

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
9ba915577f merge-recursive: move the get_renames() function
Move this function so it can re-use some others (without either
moving all of them or adding an annoying split between function
declarations and definitions).  Cheat slightly by adding a blank line
for readability, and in order to silence checkpatch.pl.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
a7a436042a directory rename detection: tests for handling overwriting dirty files
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
a0b0a15103 directory rename detection: tests for handling overwriting untracked files
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
792e1371d9 directory rename detection: miscellaneous testcases to complete coverage
I came up with the testcases in the first eight sections before coding up
the implementation.  The testcases in this section were mostly ones I
thought of while coding/debugging, and which I was too lazy to insert
into the previous sections because I didn't want to re-label with all the
testcase references.  :-)

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
362ab315ac directory rename detection: testcases exploring possibly suboptimal merges
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
f95de9602b directory rename detection: more involved edge/corner testcases
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
f349987688 directory rename detection: testcases checking which side did the rename
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
c449947a79 directory rename detection: files/directories in the way of some renames
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:15 +09:00
Elijah Newren
de632e4ed3 directory rename detection: partially renamed directory testcase/discussion
Add a long note about why we are not considering "partial directory
renames" for the current directory rename detection implementation.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:14 +09:00
Elijah Newren
21b53733a0 directory rename detection: testcases to avoid taking detection too far
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:14 +09:00
Elijah Newren
509555d8ad directory rename detection: directory splitting testcases
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:14 +09:00
Elijah Newren
04550ab56f directory rename detection: basic testcases
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:44:14 +09:00
Thomas Gummerer
df70b190bd completion: make stash -p and alias for stash push -p
We define 'git stash -p' as an alias for 'git stash push -p' in the
manpage.  Do the same in the completion script, so all options that
can be given to 'git stash push' are being completed when the user is
using 'git stash -p --<tab>'.  Currently the only additional option
the user will get is '--message', but there may be more in the future.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:39:50 +09:00
Thomas Gummerer
0eb5a4f911 completion: stop showing 'save' for stash by default
The 'save' subcommand in git stash has been deprecated in
fd2ebf14db ("stash: mark "git stash save" deprecated in the man page",
2017-10-22).

Stop showing it when the users enters 'git stash <tab>' or 'git stash
s<tab>'.  Keep showing it however when the user enters 'git stash sa<tab>'
or any more characters of the 'save' subcommand.  This is designed to
not encourage users to use 'git stash save', but still leaving the
completion option once it's clear that's what the user means.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 10:39:48 +09:00
Todd Zullinger
73364e4f10 doc/clone: update caption for GIT URLS cross-reference
The description of the <repository> argument directs readers to "See the
URLS section below".  When generating HTML this becomes a link to the
"GIT URLS" section.  When reading the man page in a terminal, the
caption is slightly misleading.  Use "GIT URLS" as the caption to avoid
any confusion.

Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-20 08:23:56 +09:00
Taylor Blau
fb0dc3bac1 builtin/config.c: support --type=<type> as preferred alias for --<type>
`git config` has long allowed the ability for callers to provide a 'type
specifier', which instructs `git config` to (1) ensure that incoming
values can be interpreted as that type, and (2) that outgoing values are
canonicalized under that type.

In another series, we propose to extend this functionality with
`--type=color` and `--default` to replace `--get-color`.

However, we traditionally use `--color` to mean "colorize this output",
instead of "this value should be treated as a color".

Currently, `git config` does not support this kind of colorization, but
we should be careful to avoid squatting on this option too soon, so that
`git config` can support `--color` (in the traditional sense) in the
future, if that is desired.

In this patch, we support `--type=<int|bool|bool-or-int|...>` in
addition to `--int`, `--bool`, and etc. This allows the aforementioned
upcoming patch to support querying a color value with a default via
`--type=color --default=...`, without squandering `--color`.

We retain the historic behavior of complaining when multiple,
legacy-style `--<type>` flags are given, as well as extend this to
conflicting new-style `--type=<type>` flags. `--int --type=int` (and its
commutative pair) does not complain, but `--bool --type=int` (and its
commutative pair) does.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-19 11:49:19 +09:00
Johannes Sixt
12f7babd6b sequencer: reset the committer date before commits
Now that the sequencer commits without forking when the commit message
isn't edited all the commits that are picked have the same committer
date. If a commit is reworded it's committer date will be a later time
as it is created by running an separate instance of 'git commit'.  If
the reworded commit is follow by further picks, those later commits
will have an earlier committer date than the reworded one. This is
caused by git caching the default date used when GIT_COMMITTER_DATE is
not set. Reset the cached date before a commit is generated
in-process.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-19 07:36:32 +09:00
Stefan Agner
256be1d3f0 send-email: avoid duplicate In-Reply-To/References
In case a patch already has In-Reply-To or References in the header
(e.g. when the patch has been created with format-patch --thread)
git-send-email should not add another pair of those headers.
This is also not allowed according to RFC 5322 Section 3.6:
https://tools.ietf.org/html/rfc5322#section-3.6

Avoid the second pair by reading the current headers into the
appropriate variables.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-19 07:04:29 +09:00
Christian Hesse
d8698987f3 Makefile: mark perllibdir as a .PHONY target
This target should be marked as .PHONY, just like other targets that
exist only for their side effects that do not create filesystem
entities with the same name.

Signed-off-by: Christian Hesse <mail@eworm.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-19 06:45:06 +09:00
Nguyễn Thái Ngọc Duy
0b5e2ea7cf submodule--helper: don't print null in 'submodule status'
The function compute_rev_name() can return NULL sometimes (e.g. right
after 'submodule init'). The current code makes 'submodule status'
print this:

 19d97bf5af05312267c2e874ee6bcf584d9e9681 sha1collisiondetection ((null))

This ugly 'null' adds no value to the user using this command. More
importantly printf() on some platform can't handle NULL as a string
and will crash instead of printing '(null)'.

Check for this and skip printing this part (the alternative is
printing '(n/a)' or something but I think that is just noise).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-19 06:24:23 +09:00
Martin Ågren
4c57a4f8fe git-submodule.txt: quote usage in monospace, drop backslash
We tend to quote command line examples using `` to set them in a
monospace font. The immediate motivation for this patch is to get rid of
another instance of \--. As noted in the previous commits, \-- has a
tendency of rendering badly. Here, it renders ok (at least with
AsciiDoc 8.6.9 and Asciidoctor 1.5.4), but by getting rid of this
instance, we reduce the chances of \-- cropping up in places where it
matters more.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
2018-04-18 12:49:26 +09:00
Martin Ågren
6955047ff4 git-[short]log.txt: unify quoted standalone --
In git-log.txt, we have an instance of \--, which is known to sometimes
render badly. This one is even worse than normal though, since ``\-- ''
(with or without that trailing space) appears to be entirely broken,
both in HTML and manpages, both with AsciiDoc (version 8.6.9) and
Asciidoctor (version 1.5.4).

Further down in git-log.txt we have a ``--'', which renders good. In
git-shortlog.txt, we use "\-- " (including the quotes and the space),
which happens to look fairly good. I failed to find any other similar
instances. So all in all, we quote a double-dash in three different
places and do it differently each time, with various degrees of success.

Switch all of these to `--`. This sets the double-dash in monospace and
matches what we usually do with example command line usages and options.
Note that we drop the trailing space as well, since `-- ` does not
render well. These should still be clear enough since just a few lines
above each instance, the space is clearly visible in a longer context.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
2018-04-18 12:49:26 +09:00
Martin Ågren
933c758c7d doc: convert [\--] to [--]
Commit 1c262bb7b (doc: convert \--option to --option, 2015-05-13)
explains that we used to need to write \--option to play well with older
versions of AsciiDoc, but that we do not support such versions anymore
anyway, and that Asciidoctor literally renders \--.

With [\--], which is used to denote the optional separator between
revisions and paths, Asciidoctor renders the backslash literally.
Change all [\--] to [--]. This changes nothing for AsciiDoc version
8.6.9, but is an improvement for Asciidoctor version 1.5.4.

We use double-dashes in several list entries (\--::). In my testing, it
appears that we do need to use the backslash there, so leave those.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
2018-04-18 12:49:26 +09:00
Martin Ågren
9e9f132f53 doc: convert \--option to --option
Rather than using a backslash in \--foo, with or without ''-quoting,
write `--foo` for better rendering. As explained in commit 1c262bb7b
(doc: convert \--option to --option, 2015-05-13), the backslash is not
needed for the versions of AsciiDoc that we support, but is rendered
literally by Asciidoctor.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
2018-04-18 12:49:26 +09:00
SZEDER Gábor
bed21a8ad6 docs/git-gc: fix minor rendering issue
An unwanted single quote character in the paragraph documenting the
'gc.aggressiveWindow' config variable prevented the name of that
config variable from being rendered correctly, ever since that piece
of docs was added in 0d7566a5ba (Add --aggressive option to 'git gc',
2007-05-09).

Remove that single quote.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-18 12:07:23 +09:00
Stefan Beller
d228eea514 worktree: accept -f as short for --force for removal
Many commands support a "--force" option, frequently abbreviated as
"-f", however, "git worktree remove"'s hand-rolled OPT_BOOL forgets
to recognize the short form, despite git-worktree.txt documenting
"-f" as supported. Replace OPT_BOOL with OPT__FORCE, which provides
"-f" for free, and makes 'remove' consistent with 'add' option
parsing (which also specifies the PARSE_OPT_NOCOMPLETE flag).

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-18 09:19:05 +09:00
SZEDER Gábor
94408dc71c completion: reduce overhead of clearing cached --options
To get the names of all '$__git_builtin_*' variables caching --options
of builtin commands in order to unset them, 8b0eaa41f2 (completion:
clear cached --options when sourcing the completion script,
2018-03-22) runs a 'set |sed s///' pipeline.  This works both in Bash
and in ZSH, but has a higher than necessary overhead with the extra
processes.

In Bash we can do better: run the 'compgen -v __gitcomp_builtin_'
builtin command, which lists the same variables, but without a
pipeline and 'sed' it can do so with lower overhead.
ZSH will still continue to run that pipeline.

This change also happens to work around an issue in the default Bash
version shipped in macOS (3.2.57), reported by users of the Powerline
shell prompt, which was triggered by the same commit 8b0eaa41f2 as
well.  Powerline uses several Unicode Private Use Area code points to
represent some of its pretty text UI elements (arrows and what not),
and these are stored in the $PS1 variable.  Apparently the 'set'
builtin of said Bash version on macOS has issues with these code
points, and produces garbled output where Powerline's special symbols
should be in the $PS1 variable.  This, in turn, triggers the following
error message in the downstream 'sed' process:

  sed: RE error: illegal byte sequence

Other Bash versions, notably 4.4.19 on macOS via homebrew (i.e. a
newer version on the same platform) and 3.2.25 on CentOS (i.e. a
slightly earlier version, though on a different platform) are not
affected.  ZSH in macOS (the versions shipped by default or installed
via homebrew) or on other platforms isn't affected either.

With this patch neither the 'set' builtin is invoked to print garbage,
nor 'sed' to choke on it.

Issue-on-macOS-reported-by: Stephon Harris <theonestep4@gmail.com>
Issue-on-macOS-explained-by: Matthew Coleman <matt@1eanda.com>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-18 08:43:31 +09:00
SZEDER Gábor
7b00342068 completion: fill COMPREPLY directly when completing paths
During git-aware path completion, when a lot of path components have
to be listed, a significant amount of time is spent in
__gitcomp_file(), or more accurately in the shell loop of
__gitcompappend(), iterating over all the path components filtering
path components matching the current word to be completed, adding
prefix path components, and placing the resulting matching paths into
the COMPREPLY array.

Now, a previous patch in this series made 'git ls-files' and 'git
diff-index' list only paths matching the current word to be completed,
so an additional filtering in __gitcomp_file() is not necessary
anymore.  Adding the prefix path components could be done much more
efficiently in __git_index_files()'s 'awk' script while stripping
trailing path components and removing duplicates and quoting.  And
then the resulting paths won't require any more filtering or
processing before being handed over to Bash, so we could fill the
COMPREPLY array directly.

Unfortunately, we can't simply use the __gitcomp_direct() helper
function to do that, because __gitcomp_file() does one additional
thing: it tells Bash that we are doing filename completion, so the
shell will kindly do four important things for us:

  1. Append a trailing space to all filenames.
  2. Append a trailing '/' to all directory names.
  3. Escape any meta, globbing, separator, etc. characters.
  4. List only the current path component when listing possible
     completions (i.e. 'dir/subdir/f<TAB>' will list 'file1', 'file2',
     etc. instead of the whole 'dir/subdir/file1',
     'dir/subdir/file2').

While we could let __git_index_files()'s 'awk' script take care of the
first two points, the third one gets tricky, and we absolutely need
the shell's support for the fourth.

Add the helper function __gitcomp_file_direct(), which, just like
__gitcomp_direct(), fills the COMPREPLY array with prefiltered and
preprocessed paths without any additional processing, without a shell
loop, with just one single compound assignment, and, similar to
__gitcomp_file(), tells Bash and ZSH that we are doing filename
completion.  Extend __git_index_files()'s 'awk' script a bit to
prepend any prefix path components to all listed paths.  Finally,
modify __git_complete_index_file() to feed __git_index_files()'s
output to ___gitcomp_file_direct() instead of __gitcomp_file().

After this patch there is no shell loop left in the path completion
code path.

This speeds up path completion when there are a lot of paths matching
the current word to be completed.  In a pathological repository with
100k files in a single directory, listing all those files:

  Before this patch, best of five, using GNU awk on Linux:

    $ time cur=dir/ __git_complete_index_file

    real    0m0.983s
    user    0m1.004s
    sys     0m0.033s

  After:

    real    0m0.313s
    user    0m0.341s
    sys     0m0.029s

  Difference: -68.2%
  Speedup:      3.1x

  To see the benefits of the whole patch series, the same command with
  v2.17.0:

    real    0m2.736s
    user    0m2.472s
    sys     0m0.610s

  Difference: -88.6%
  Speedup:      8.7x

Note that this patch changes the output of the __git_index_files()
helper function by unconditionally prepending the prefix path
components to every listed path.  This would break users' completion
scriptlets that directly run:

  __gitcomp_file "$(__git_index_files ...)" "$pfx" "$cur_"

because that would add the prefix path components once more.
However, __git_index_files() is kind of a "helper function of a helper
function", and users' completion scriptlets should have been using
__git_complete_index_file() for git-aware path completion in the first
place, so this is likely doesn't worth worrying about.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:37 +09:00
SZEDER Gábor
193757f806 completion: improve handling quoted paths in 'git ls-files's output
If any pathname contains backslash, double quote, tab, newline, or any
control characters, 'git ls-files' and 'git diff-index' will enclose
that pathname in double quotes and escape those special characters
using C-style one-character escape sequences or \nnn octal values.
This prevents those files from being listed during git-aware path
completion, because due to the quoting they will never match the
current word to be completed.

Extend __git_index_files()'s 'awk' script to remove all that quoting
and escaping from unique path components, so even paths containing
(almost all) such special characters can be completed.

Paths containing newline characters are still an issue, though.  We
use newlines as separator character when filling the COMPREPLY array,
so a path with one or more newline will end up split to two or more
elements in COMPREPLY, basically breaking completion.  There is
nothing we can do about it without a significant performance hit, so
let's just ignore such paths for now.  As far as paths with newlines
are concerned, this isn't any different from the previous behavior,
because those paths were always omitted, though in the past they were
omitted because due to the quoting they didn't match the current word
to be completed.  Anyway, Bash's own filename completion (Meta-/) can
complete even those paths, if need be.

Note:

  - We don't dequote path components right away as they are coming in,
    because then we would have to dequote each directory name
    repeatedly, as many times as it appears in the input, i.e. as many
    times as the number of listed paths it contains.  Instead, we
    dequote them at the end, as we print unique path components.

  - Even when a directory name itself does not contain any special
    characters, it will still be quoted if any of its trailing path
    components do.  If a directory contains paths both with and
    without special characters, then the name of that directory will
    appear both quoted and unquoted in the output of 'git ls-files'
    and 'git diff-index'.  Consequently, we will add such a directory
    name to the deduplicating associative array twice: once quoted and
    once unquoted.

    This means that we have to be careful after dequoting a directory
    name, and only print it if we haven't seen the same directory name
    unquoted.

  - It would be wonderful if we could just pass '-z' to those git
    commands to output \0-separated unquoted paths, and use \0 as
    record separator in the 'awk' script processing their output...
    this patch would be so much simpler, almost trivial even.
    Unfortunately, however, POSIX and most 'awk' implementations don't
    support \0 as record separator (GNU awk does support it).

  - This patch makes the earlier change to list paths with
    'core.quotePath=false' basically redundant, because this could
    decode any \nnn-escaped non-ASCII character just fine, as well.
    However, I suspect that 'git ls-files' can deal with those
    non-ASCII characters faster than this updated 'awk' script; just
    in case someone is burdened with tons of pathnames containing
    non-ASCII characters.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
SZEDER Gábor
c1bc0a0e92 completion: remove repeated dirnames with 'awk' during path completion
During git-aware path completion, after all the trailing path
components have been removed from the output of 'git ls-files' and
'git diff-index' (see previous patch), each directory name is repeated
as many times as the number of listed paths it contains.  This can be
a lot of repetitions, especially when invoking path completion close
to the root of a big worktree, which would cause a considerable
overhead downstream of __git_index_files(), in particular in the shell
loop that fills the COMPREPLY array.  To reduce this overhead,
__git_index_files() runs the classic '... |sort |uniq' pattern to
remove those repetitions from the function's output.

While removing repeated directory names is effective in reducing the
number of iterations in that shell loop, it still imposes the overhead
of fork()+exec()ing two external processes, and two additional stages
in the pipeline, where potentially relatively large amount of data can
be passed between two subsequent pipeline stages.

Extend __git_index_files()'s 'awk' script to remove repeated path
components by first creating and filling an associative array indexed
by all encountered path components (after the trailing path components
have been removed), and then iterating over this array and printing
the indices, i.e. unique path components.  This way we can remove the
'|sort |uniq' pipeline stages, and their eliminated overhead results
in faster path completion.

Listing all tracked files (12) and directories (23) at the top of the
worktree in linux.git (over 62k files), i.e. what's doing all the hard
work behind 'git rm <TAB>':

  Before this patch, best of five, using GNU awk on Linux:

    real    0m0.069s
    user    0m0.089s
    sys     0m0.026s

  After:

    real    0m0.052s
    user    0m0.072s
    sys     0m0.014s

  Difference: -24.6%

Note that this changes order of elements in __git_index_files()'s
output.  This is not an issue, because this function was only ever
intended to feed paths into the COMPREPLY array, and Bash will sort
its elements (according to the users locale) anyway.

Note also that using 'awk' to remove repeated path components is also
beneficial for the performance of the next two patches:

  - The first will extend this 'awk' script to dequote quoted paths in
    the output of 'git ls-files' and 'git diff-index'.  With this
    patch it will only have to dequote unique path components, not
    all.

  - The second will, among other things, extend this 'awk' script to
    prepend prefix path components from the command line to the
    currently completed path component.  Consequently, each line in
    'awk's output will grow longer.  Without this patch that '|sort
    |uniq' would have to exchange and process that much more data.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
SZEDER Gábor
9703797c9d t9902-completion: ignore COMPREPLY element order in some tests
The order or possible completion words in the COMPREPLY array doesn't
actually matter, as long as all the right words are in there, because
Bash will sort them anyway.  Yet, our tests looking at the elements of
COMPREPLY always expect them to be in a specific order.

Now, this hasn't been an issue before, but the next patch is about to
optimize a bit more our git-aware path completion, and as a harmless
side effect the order of elements in COMPREPLY will change.  Worse,
the order will be downright undefined, because after the next patch
path components will come directly from iterating through an
associative array in 'awk', and the order of iteration over the
elements in those arrays is undefined, and indeed different 'awk'
implementations produce different order.  Consequently, we can't get
away with simply adjusting the expected results in the affected tests.

Modify the 'test_completion' helper function to sort both the expected
and the actual results, i.e. the elements in COMPREPLY, before
comparing them, so the tests using this helper function will work
regardless of the order of elements.

Note that this change still leaves a bunch of tests depending on the
order of elements in COMPREPLY, tests that focus on a specific helper
function and therefore don't use the 'test_completion' helper.  I
would rather deal with those later, when (if ever) the need actually
arises, than create unnecessary code churn now.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
SZEDER Gábor
105c0efff3 completion: use 'awk' to strip trailing path components
During git-aware path completion we complete one path component at a
time, i.e. 'git add <TAB>' offers only 'dir/' at first, not
'dir/subdir/file' right away, just like Bash's own filename
completion.  However, since both 'git ls-files' and 'git diff-index'
dive deep into subdirectories, we have to strip all trailing path
components from the listed paths, keeping only the leading path
component.  This stripping is currently done in a shell loop in
__git_index_files(), which can take a significant amount of time when
it has to iterate through a large number of paths.

Replace this shell loop with a little 'awk' script using '/' as input
field separator and printing the first field, which produces the same
output much faster.

Listing all tracked files (12) and directories (23) at the top of the
worktree in linux.git (over 62k files), i.e. what's doing all the hard
work behind 'git rm <TAB>':

  Before this patch, best of five, using GNU awk on Linux:

    $ time cur= __git_complete_index_file

    real    0m2.149s
    user    0m1.307s
    sys     0m1.086s

  After:

    real    0m0.067s
    user    0m0.089s
    sys     0m0.023s

  Difference: -96.9%
  Speedup:     32.1x

Note that this could be done with 'sed', or even with 'cut', just as
well, but the upcoming patches require 'awk's scriptability.

Note also that this change means one more fork()+exec()ed process
during path completion, adding more overhead especially on Windows,
but a later patch will more than make up for it by eliminating two
other processes in the same function.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
SZEDER Gábor
a364e984d1 completion: let 'ls-files' and 'diff-index' filter matching paths
During git-aware path completion, e.g. 'git rm dir/fil<TAB>', both
'git ls-files' and 'git diff-index' list all paths in the given 'dir/'
matching certain criteria (cached, modified, untracked, etc.)
appropriate for the given git command, even paths whose names don't
begin with 'fil'.  This comes with a considerable performance
penalty when the directory in question contains a lot of paths, but
the current word can be uniquely completed or when only a handful of
those paths match the current word.

Reduce the number of iterations in this codepath from the number of
paths to the number of matching paths by specifying an appropriate
globbing pattern to 'git ls-files' and 'git diff-index' to list only
paths that match the current word to be completed.

Note that both commands treat backslashes as escape characters in
their file arguments, e.g. to preserve the literal meaning of globbing
characters, so we have to double every backslash in the globbing
pattern.  This is why one of the path completion tests specifically
checks the completion of a path containing a literal backslash
character (that test still fails, though, because both commands output
such paths enclosed in double quotes and the special characters
escaped; a later patch in this series will deal with those).

This speeds up path completion considerably when there are a lot of
non-matching paths to be filtered out.  Uniquely completing a tracked
filename at the top of the worktree in linux.git (over 62k files),
i.e. what's doing all the hard work behind 'git rm Mak<TAB>' to
complete 'Makefile':

  Before this patch, best of five, on Linux:

    $ time cur=Mak __git_complete_index_file

    real    0m2.159s
    user    0m1.299s
    sys     0m1.089s

  After:

    real    0m0.033s
    user    0m0.023s
    sys     0m0.015s

  Difference: -98.5%
  Speedup:     65.4x

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
SZEDER Gábor
f12785a3a7 completion: improve handling quoted paths on the command line
Our git-aware path completion doesn't work when it has to complete a
word already containing quoted and/or backslash-escaped characters on
the command line.  The root cause of the issue is that completion
functions see all words on the command line verbatim, i.e. including
all backslash, single and double quote characters that the shell would
eventually remove when executing the finished command.  These
quoting/escaping characters cause different issues depending on which
path component of the word to be completed contains them:

  - The quoting/escaping is in the prefix path component(s).

    Let's suppose we have a directory called 'New Dir', containing two
    untracked files 'file.c' and 'file.o', and we have a gitignore
    rule ignoring object files.  In this case all of these:

      git add New\ Dir/<TAB>
      git add "New Dir/<TAB>
      git add 'New Dir/<TAB>

    should uniquely complete 'file.c' right away, but Bash offers both
    'file.c' and 'file.o' instead.  The reason for this behavior is
    that our completion script uses the prefix directory name like
    'git -C "New\ Dir/" ls-files ...", i.e. with the backslash inside
    double quotes.  Git then tries to enter a directory called
    'New\ Dir', which (most likely) fails because such a directory
    doesn't exists.  As a result our completion script doesn't list
    any files, leaves the COMPREPLY array empty, which in turn causes
    Bash to fall back to its simple filename completion and lists all
    files in that directory, i.e. both 'file.c' and 'file.o'.

  - The quoting/escaping is in the path component to be completed.

    Let's suppose we have two untracked files 'New File.c' and
    'New File.o', and we have a gitignore rule ignoring object files.
    In this case all of these:

      git add New\ Fi<TAB>
      git add "New Fi<TAB>
      git add 'New Fi<TAB>

    should uniquely complete 'New File.c' right away, but Bash offers
    both 'New File.c' and 'New File.o' instead.  The reason for this
    behavior is that our completion script uses this 'New\ Fi' or
    '"New Fi' etc. word to filter matching paths, and of course none
    of the potential filenames will match because of the included
    backslash or double quote.  The end result is the same as above:
    the completion script doesn't list any files, Bash falls back to
    its filename completion, which then lists the matching object file
    as well.

Add the new helper function __git_dequote() [1], which removes (most
of[2]) the quoting and escaping from the word it gets as argument.  To
minimize the overhead of calling this function, store its result in
the variable $dequoted_word, supposed to be declared local in the
caller; simply printing the result would require a command
substitution imposing the overhead of fork()ing a subshell.  Use this
function in __git_complete_index_file() to dequote the current word,
i.e. the path, to be completed, to avoid the above described
quoting-related issues, thereby fixing two of the failing quoted path
completion tests.

[1] The bash-completion project already has a dequote() function,
    which I hoped I could borrow to deal with this, but unfortunately
    it doesn't work quite well for this purpose (perhaps that's why
    even the bash-completion project only rarely uses it).  The main
    issue is that their dequote() is implemented as:

      eval printf %s "$1" 2> /dev/null

    where $1 would contain the word to be completed.  While it's a
    short and sweet one-liner, the use of 'eval' requires that $1 is a
    syntactically valid string, which is not the case when quoting the
    path like 'git add "New Dir/<TAB>'.  This causes 'eval' to fail,
    because it can't find the matching closing double quote, and the
    function returns nothing.  The result is totally broken behavior,
    as if the current word were empty, and the completion script would
    then list all files from the current directory.  This is why one
    of the quoted path completion tests specifically checks the
    completion of a path with an opening but without a corresponding
    closing double quote character.  Furthermore, the 'eval' performs
    all kinds of expansions, which may or may not be desired; I think
    it's the latter.  Finally, using this function would require a
    command substitution.

[2] Bash understands the $'string' quoting as well, which "expands to
    'string', with backslash-escaped characters replaced as specified
    by the ANSI C standard" (quoted from Bash manpage).  Since shell
    metacharacters, field separators, globbing, etc. can all be easily
    entered using standard shell escaping or quoting, this type of
    quoting comes in handly when dealing with control characters that
    are otherwise difficult both to "type" and to see on the command
    line.  Because of this difficulty I would assume that people do
    avoid pathnames with such control characters anyway, so I didn't
    bother implementing it.  This function is already way too long as
    it is.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
SZEDER Gábor
3dfe23ba51 completion: support completing non-ASCII pathnames
Unless the user has 'core.quotePath=false' somewhere in the
configuration, both 'git ls-files' and 'git diff-index' will by
default quote any pathnames that contain bytes with values higher than
0x80, and escape those bytes as '\nnn' octal values.  This prevents
completing paths when the current path component to be completed
contains any non-ASCII, most notably UTF-8, characters, because none
of the listed quoted paths will match the current word on the command
line.

Set 'core.quotePath=false' for those 'git ls-files' and 'git
diff-index' invocations, so they won't consider bytes higher than 0x80
as "unusual", and won't quote pathnames containing such characters.

Note that pathnames containing backslash, double quote, or control
characters will still be quoted; a later patch in this series will
deal with those.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
SZEDER Gábor
6bf0ced4e2 completion: simplify prefix path component handling during path completion
Once upon a time 'git -C "" cmd' errored out with "Cannot change to
'': No such file or directory", therefore the completion script took
extra steps to run 'git -C "." cmd' instead; see fca416a41e
(completion: use "git -C $there" instead of (cd $there && git ...),
2014-10-09).

Those extra steps are not needed since 6a536e2076 (git: treat "git -C
'<path>'" as a no-op when <path> is empty, 2015-03-06), so remove
them.

While at it, also simplify how the trailing '/' is appended to the
variable holding the prefix path components.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
SZEDER Gábor
722e31c713 completion: move __git_complete_index_file() next to its helpers
It's much easier to read, understand and modify the functions related
to git-aware path completion when they are right next to each other.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
SZEDER Gábor
5bb534a620 t9902-completion: add tests demonstrating issues with quoted pathnames
Completion functions see all words on the command line verbatim,
including any backslash-escapes, single and double quotes that might
be there.  Furthermore, git commands quote pathnames if they contain
certain special characters.  All these create various issues when
doing git-aware path completion.

Add a couple of failing tests to demonstrate these issues.

Later patches in this series will discuss these issues in detail as
they fix them.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 12:49:36 +09:00
Sergey Organov
a279b74c68 glossary: substitute "ancestor" for "direct ancestor" in 'push' description.
Even though "direct ancestor" is not defined in the glossary, the
common meaning of the term is simply "parent", parents being the only
direct ancestors, and the rest of ancestors being indirect ancestors.

As "parent" is obviously wrong in this place in the description, we
should simply say "ancestor", as everywhere else.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 11:23:23 +09:00
Tao Qingyun
adc887221f t1510-repo-setup.sh: remove useless mkdir
Signed-off-by: Tao Qingyun <845767657@qq.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-17 10:55:17 +09:00
Ævar Arnfjörð Bjarmason
6d5ed4836d git{,-blame}.el: remove old bitrotting Emacs code
The git-blame.el mode has been superseded by Emacs's own
vc-annotate (invoked by C-x v g). Users of the git.el mode are now
much better off using either Magit or the Git backend for Emacs's own
VC mode.

These modes were added over 10 years ago when Emacs's own Git support
was much less mature, and there weren't other mature modes in the wild
or shipped with Emacs itself.

These days these modes have few if any users, and users of git aren't
well served by us shipping these (some OS's install them alongside git
by default, which is confusing and leads users astray).

So let's remove these per Alexandre Julliard's message to the
ML[1]. If someone still wants these for some reason they're better
served by hosting these elsewhere (e.g. on ELPA), instead of us
distributing them with git.

However, since downstream packagers such as Debian are packaging this
as git-el it's less disruptive to still carry these files as Elisp
code that'll error out with a message suggesting alternatives, rather
than drop the files entirely[2].

Then rather than receive a cryptic load error when they upgrade
existing users will get an error directing them to the README file, or
to just stop requiring these modes. I think it makes sense to link to
GitHub's hosting of contrib/emacs/README (which'll be updated by the
time users see this) so they don't have to hunt down the packaged
README on their local system.

1. "Re: [PATCH] git.el: handle default excludesfile
   properly" (87muzlwhb0.fsf@winehq.org) --
   https://public-inbox.org/git/87muzlwhb0.fsf@winehq.org/

2. "Re: [PATCH v3] git{,-blame}.el: remove old bitrotting Emacs
   code" (20180327165751.GA4343@aiede.svl.corp.google.com) --
   https://public-inbox.org/git/20180327165751.GA4343@aiede.svl.corp.google.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16 17:25:49 +09:00
Jeff King
8b44b2be89 gpg-interface: find the last gpg signature line
A signed tag has a detached signature like this:

  object ...
  [...more header...]

  This is the tag body.

  -----BEGIN PGP SIGNATURE-----
  [opaque gpg data]
  -----END PGP SIGNATURE-----

Our parser finds the _first_ line that appears to start a
PGP signature block, meaning we may be confused by a
signature (or a signature-like line) in the actual body.
Let's keep parsing and always find the final block, which
should be the detached signature over all of the preceding
content.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16 14:15:03 +09:00
Jeff King
f68f2dd57f gpg-interface: extract gpg line matching helper
Let's separate the actual line-by-line parsing of signatures
from the notion of "is this a gpg signature line". That will
make it easier to do more refactoring of this loop in future
patches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16 14:15:03 +09:00
Jeff King
17ef3a421e gpg-interface: fix const-correctness of "eol" pointer
We accidentally shed the "const" of our buffer by passing it
through memchr. Let's fix that, and while we're at it, move
our variable declaration inside the loop, which is the only
place that uses it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ben Toews <mastahyeti@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16 14:15:03 +09:00