Commit Graph

47023 Commits

Author SHA1 Message Date
Jeff King
63d428e656 receive-pack: avoid duplicates between our refs and alternates
We de-duplicate ".have" refs among themselves, but never
check if they are duplicates of our local refs. It's not
unreasonable that they would be if we are a "--shared" or
"--reference" clone of a similar repository; we'd have all
the same tags.

We can handle this by inserting our local refs into the
oidset, but obviously not suppressing duplicates (since the
refnames are important).

Note that this also switches the order in which we advertise
refs, processing ours first and then any alternates. The
order shouldn't matter (and arguably showing our refs first
makes more sense).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
8b24b9e765 receive-pack: treat namespace .have lines like alternates
Namely, de-duplicate them. We use the same set as the
alternates, since we call them both ".have" (i.e., there is
no value in showing one versus the other).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
fea6c47f2f receive-pack: fix misleading namespace/.have comment
The comment claims that we handle alternate ".have" lines
through this function, but that hasn't been the case since
85f251045 (write_head_info(): handle "extra refs" locally,
2012-01-06).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
ab6eea6f7b receive-pack: use oidset to de-duplicate .have lines
If you have an alternate object store with a very large
number of refs, the peak memory usage of the sha1_array can
grow high, even if most of them are duplicates that end up
not being printed at all.

The similar for_each_alternate_ref() code-paths in
fetch-pack solve this by using flags in "struct object" to
de-duplicate (and so are relying on obj_hash at the core).

But we don't have a "struct object" at all in this case. We
could call lookup_unknown_object() to get one, but if our
goal is reducing memory footprint, it's not great:

 - an unknown object is as large as the largest object type
   (a commit), which is bigger than an oidset entry

 - we can free the memory after our ref advertisement, but
   "struct object" entries persist forever (and the
   receive-pack may hang around for a long time, as the
   bottleneck is often client upload bandwidth).

So let's use an oidset. Note that unlike a sha1-array it
doesn't sort the output as a side effect. However, our
output is at least stable, because for_each_alternate_ref()
will give us the sha1s in ref-sorted order.

In one particularly pathological case with an alternate that
has 60,000 unique refs out of 80 million total, this reduced
the peak heap usage of "git receive-pack . </dev/null" from
13GB to 14MB.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
29c2bd5fa8 add oidset API
This is similar to many of our uses of sha1-array, but it
overcomes one limitation of a sha1-array: when you are
de-duplicating a large input with relatively few unique
entries, sha1-array uses 20 bytes per non-unique entry.
Whereas this set will use memory linear in the number of
unique entries (albeit a few more than 20 bytes due to
hashmap overhead).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
41a078c60b fetch-pack: cache results of for_each_alternate_ref
We may run for_each_alternate_ref() twice, once in
find_common() and once in everything_local(). This operation
can be expensive, because it involves running a sub-process
which must freshly load all of the alternate's refs from
disk.

Let's cache and reuse the results between the two calls. We
can make some optimizations based on the particular use
pattern in fetch-pack to keep our memory usage down.

The first is that we only care about the sha1s, not the refs
themselves. So it's OK to store only the sha1s, and to
suppress duplicates. The natural fit would therefore be a
sha1_array.

However, sha1_array's de-duplication happens only after it
has read and sorted all entries. It still stores each
duplicate. For an alternate with a large number of refs
pointing to the same commits, this is a needless expense.

Instead, we'd prefer to eliminate duplicates before putting
them in the cache, which implies using a hash. We can
further note that fetch-pack will call parse_object() on
each alternate sha1. We can therefore keep our cache as a
set of pointers to "struct object". That gives us a place to
put our "already seen" bit with an optimized hash lookup.
And as a bonus, the object stores the sha1 for us, so
pointer-to-object is all we need.

There are two extra optimizations I didn't do here:

  - we actually store an array of pointer-to-object.
    Technically we could just walk the obj_hash table
    looking for entries with the ALTERNATE flag set (because
    our use case doesn't care about the order here).

    But that hash table may be mostly composed of
    non-ALTERNATE entries, so we'd waste time walking over
    them. So it would be a slight win in memory use, but a
    loss in CPU.

  - the items we pull out of the cache are actual "struct
    object"s, but then we feed "obj->sha1" to our
    sub-functions, which promptly call parse_object().

    This second parse is cheap, because it starts with
    lookup_object() and will bail immediately when it sees
    we've already parsed the object. We could save the extra
    hash lookup, but it would involve refactoring the
    functions we call. It may or may not be worth the
    trouble.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
a10a17877b for_each_alternate_ref: replace transport code with for-each-ref
The current method for getting the refs from an alternate is
to run upload-pack in the alternate and parse its output
using the normal transport code.  This works and is
reasonably short, but it has a very bad memory footprint
when there are a lot of refs in the alternate. There are two
problems:

  1. It reads in all of the refs before passing any back to
     us. Which means that our peak memory usage has to store
     every ref (including duplicates for peeled variants),
     even if our callback could determine that some are not
     interesting (e.g., because they point to the same sha1
     as another ref).

  2. It allocates a "struct ref" for each one. Among other
     things, this contains 3 separate 20-byte oids, along
     with the name and various pointers.  That can add up,
     especially if the callback is only interested in the
     sha1 (which it can store in a sha1_array as just 20
     bytes).

On a particularly pathological case, where the alternate had
over 80 million refs pointing to only around 60,000 unique
objects, the peak heap usage of "git clone --reference" grew
to over 25GB.

This patch instead calls git-for-each-ref in the alternate
repository, and passes each line to the callback as we read
it. That drops the peak heap of the same command to 50MB.

I considered and rejected a few alternatives.

We could read all of the refs in the alternate using our own
ref code, just as we do with submodules.  However, as memory
footprint is one of the concerns here, we want to avoid
loading those refs into our own memory as a whole.

It's possible that this will be a better technique in the
future when the ref code can more easily iterate without
loading all of packed-refs into memory.

Another option is to keep calling upload-pack, and just
parse its output ourselves in a streaming fashion. Besides
for-each-ref being simpler (we get to define the format
ourselves, and don't have to deal with speaking the git
protocol), it's more flexible for possible future changes.

For instance, it might be useful for the caller to be able
to limit the set of "interesting" alternate refs.  The
motivating example is one where many "forks" of a particular
repository share object storage, and the shared storage has
refs for each fork (which is why so many of the refs are
duplicates; each fork has the same tags).  A plausible
future optimization would be to ask for the alternate refs
for just _one_ fork (if you had some out-of-band way of
knowing which was the most interesting or important for the
current operation).

Similarly, no callbacks actually care about the symref value
of alternate refs, and as before, this patch ignores them
entirely.  However, if we wanted to add them, for-each-ref's
"%(symref)" is going to be more flexible than upload-pack,
because the latter only handles the HEAD symref due to
historical constraints.

There is one potential downside, though: unlike upload-pack,
our for-each-ref command doesn't report the peeled value of
refs. The existing code calls the alternate_ref_fn callback
twice for tags: once for the tag, and once for the peeled
value with the refname set to "ref^{}".

For the callers in fetch-pack, this doesn't matter at all.
We immediately peel each tag down to a commit either way (so
there's a slight improvement, as do not bother passing the
redundant data over the pipe). For the caller in
receive-pack, it means we will not advertise the peeled
values of tags in our alternate. However, we also don't
advertise peeled values for our _own_ tags, so this is
actually making things more consistent.

It's unclear whether receive-pack advertising peeled values
is a win or not. On one hand, giving more information to the
other side may let it omit some objects from the push. On
the other hand, for tags which both sides have, they simply
bloat the advertisement. The upload-pack advertisement of
git.git is about 30% larger than the receive-pack
advertisement due to its peeled information.

This patch omits the peeled information from
for_each_alternate_ref entirely, and leaves it up to the
caller whether they want to dig up the information.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
2429d63a46 for_each_alternate_ref: pass name/oid instead of ref struct
Breaking down the fields in the interface makes it easier to
change the backend of for_each_alternate_ref to something
that doesn't use "struct ref" internally.

The only field that callers actually look at is the oid,
anyway. The refname is kept in the interface as a plausible
thing for future code to want.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
5e8c968c64 for_each_alternate_ref: use strbuf for path allocation
We have a string with ".../objects" pointing to the
alternate object store, and overwrite bits of it to look at
other paths in the (potential) git repository holding it.
This works because the only path we care about is "refs",
which is shorter than "objects".

Using a strbuf to hold the path lets us get rid of some
magic numbers, and makes it more obvious that the memory
operations are safe.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
ece657f399 for_each_alternate_ref: stop trimming trailing slashes
The real_pathdup() function will have removed extra slashes
for us already (on top of the normalize_path() done when we
created the alternate_object_database struct in the first
place).

Incidentally, this also fixes the case where the path is
just "/", which would read off the start of the array.
That doesn't seem possible to trigger in practice, though,
as link_alt_odb_entry() blindly eats trailing slashes,
including a bare "/".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
3a1345af28 for_each_alternate_ref: handle failure from real_pathdup()
In older versions of git, if real_path() failed to resolve
the alternate object store path, we would die() with an
error. However, since 4ac9006f8 (real_path: have callers use
real_pathdup and strbuf_realpath, 2016-12-12) we use the
real_pathdup() function, which may return NULL. Since we
don't check the return value, we can segfault.

This is hard to trigger in practice, since we check that the
path is accessible before creating the alternate_object_database
struct. But it could be removed racily, or we could see a
transient filesystem error.

We could restore the original behavior by switching back to
xstrdup(real_path()).  However, dying is probably not the
best option here. This whole function is best-effort
already; there might not even be a repository around the
shared objects at all. And if the alternate store has gone
away, there are no objects to show.

So let's just quietly return, as we would if we failed to
open "refs/", or if upload-pack failed to start, etc.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 15:39:55 -08:00
Jeff King
f5022b5fed diff: print line prefix for --name-only output
If you run "git log --graph --name-only", the pathnames are
not indented to go along with their matching commits (unlike
all of the other diff formats). We need to output the line
prefix for each item before writing it.

The tests cover both --name-status and --name-only. The
former actually gets this right already, because it builds
on the --raw format functions. It's only --name-only which
uses its own code (and this fix mirrors the code in
diff_flush_raw()).

Note that the tests don't follow our usual style of setting
up the "expect" output inside the test block. This matches
the surrounding style, but more importantly it is easier to
read: we don't have to worry about embedded single-quotes,
and the leading indentation is more obvious.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 13:39:57 -08:00
René Scharfe
bec5ab8997 dir: avoid allocation in fill_directory()
Pass the match member of the first pathspec item directly to
read_directory() instead of using common_prefix() to duplicate it first,
thus avoiding memory duplication, strlen(3) and free(3).

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 13:38:41 -08:00
Nguyễn Thái Ngọc Duy
209df269a6 rev-list-options.txt: update --all about HEAD
This is the document patch for f0298cf1c6 (revision walker: include a
detached HEAD in --all - 2009-01-16).

Even though that commit is about detached HEAD, as Jeff pointed out,
always adding HEAD in that case may have subtle differences with
--source or --exclude. So the document mentions nothing about the
detached-ness.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 13:37:32 -08:00
David Aguilar
1ce515f09d t7800: replace "wc -l" with test_line_count
Make t7800 easier to debug by capturing output into temporary files and
using test_line_count to make assertions on those files.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 13:36:07 -08:00
Junio C Hamano
a83c2d2972 Merge branch 'da/difftool-dir-diff-fix' into da/t7800-cleanup
* da/difftool-dir-diff-fix:
  difftool: fix dir-diff index creation when in a subdirectory
2017-02-08 13:36:03 -08:00
David Aguilar
e66adcadfe t7800: simplify basic usage test
Use "test_line_count" instead of "wc -l", use "git -C" instead of a
subshell, and use test_expect_code when calling difftool.  Ease
debugging by capturing output into temporary files.

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 13:31:20 -08:00
Denton Liu
56c2da57fe Document the --no-gui option in difftool
Prior to this, the `--no-gui` option was not documented in the manpage.
This commit introduces this into the manpage

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08 13:30:28 -08:00
Junio C Hamano
44a6b6ce17 ref-filter: resurrect "strip" as a synonym to "lstrip"
We forgot that "strip" was introduced at 0571979bd6 ("tag: do not
show ambiguous tag names as "tags/foo"", 2016-01-25) as part of Git
2.8 (and 2.7.1) when we started calling this "lstrip" to make it
easier to explain the new "rstrip" operation.

We shouldn't have renamed the existing one; "lstrip" should have
been a new synonym that means the same thing as "strip".  Scripts
in the wild are surely using the original form already.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-07 11:50:34 -08:00
Patrick Steinhardt
2488dcab22 worktree: fix option descriptions for prune
The `verbose` and `expire` options of the `git worktree prune`
subcommand have wrong descriptions in that they pretend to relate to
objects. But as the git-worktree(1) correctly states, these options have
nothing to do with objects but only with worktrees. Fix the description
accordingly.

Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-06 10:59:25 -08:00
René Scharfe
c86000c1a7 p5302: create repositories for index-pack results explicitly
Before 7176a314 (index-pack: complain when --stdin is used outside of a
repo) index-pack silently created a non-existing target directory; now
the command refuses to work unless it's used against a valid repository.
That causes p5302 to fail, which relies on the former behavior.  Fix it
by setting up the destinations for its performance tests using git init.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-06 10:55:25 -08:00
Eric Wong
2cbad17642 completion: fix git svn authorship switches
--add-author-from and --use-log-author are for "git svn dcommit",
not "git svn (init|clone)"

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-06 10:40:39 -08:00
David Aguilar
d81345ce09 difftool: fix bug when printing usage
"git difftool -h" reports an error:

	fatal: BUG: setup_git_env called without repository

Defer repository setup so that the help option processing happens before
the repository is initialized.

Add tests to ensure that the basic usage works inside and outside of a
repository.

Signed-off-by: David Aguilar <davvid@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-06 10:13:48 -08:00
Anthony Ramine
c0ea09b2da l10n: fr.po: Fix a typo in the French translation
Signed-off-by: Anthony Ramine <n.oxyde@gmail.com>
2017-02-04 15:10:39 +01:00
Joachim Jablon
98992e9334 l10n: fr.po: Remove gender specific adjectives
Signed-off-by: Joachim Jablon <ewjoachim@gmail.com>
Reviewed-by: Jean-Noel Avila <jn.avila@free.fr>
2017-02-04 15:10:39 +01:00
Joachim Jablon
e1df01c55c l10n: fr.po: Fix typos
Reviewed-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Joachim Jablon <ewjoachim@gmail.com>
2017-02-04 15:10:39 +01:00
Jacob Keller
7326451bed reset: add an example of how to split a commit into two
It is often useful to break a commit into multiple parts that are more
logical separations. This can be tricky to learn how to do without the
brute-force method if re-writing code or commit messages from scratch.

Add a section to the git-reset documentation which shows an example
process for how to use git add -p and git commit -c HEAD@{1} to
interactively break a commit apart and re-use the original commit
message as a starting point when making the new commit message.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:31:47 -08:00
Cornelius Weig
f483a0aa2a completion: recognize more long-options
Command completion only recognizes a subset of the available options for
the various git commands. The set of recognized options needs to balance
between having all useful options and to not clutter the terminal.

This commit adds all long-options that are mentioned in the man-page
synopsis of the respective git command. Possibly dangerous options are
not included in this set, to avoid accidental data loss. The added
options are:

 - apply: --recount --directory=
 - archive: --output
 - branch: --column --no-column --sort= --points-at
 - clone: --no-single-branch --shallow-submodules
 - commit: --patch --short --date --allow-empty
 - describe: --first-parent
 - fetch, pull: --unshallow --update-shallow
 - fsck: --name-objects
 - grep: --break --heading --show-function --function-context
         --untracked --no-index
 - mergetool: --prompt --no-prompt
 - reset: --keep
 - revert: --strategy= --strategy-option=
 - shortlog: --email
 - tag: --merged --no-merged --create-reflog

Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:25:46 -08:00
Cornelius Weig
cac84960ea completion: teach remote subcommands to complete options
Git-remote needs to complete remote names, its subcommands, and options
thereof. In addition to the existing subcommand and remote name
completion, do also complete the options

 - add: --track --master --fetch --tags --no-tags --mirror=
 - set-url: --push --add --delete
 - get-url: --push --all
 - prune: --dry-run

Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:25:46 -08:00
Cornelius Weig
188fba1172 completion: teach replace to complete options
Git-replace needs to complete references and its own options. In
addition to the existing references completions, do also complete the
options --edit --graft --format= --list --delete.

Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:25:46 -08:00
Cornelius Weig
2c0f3a5314 completion: teach ls-remote to complete options
ls-remote needs to complete remote names and its own options. In
addition to the existing remote name completions, do also complete
the options --heads, --tags, --refs, --get-url, and --symref.

Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:25:46 -08:00
Cornelius Weig
bd9ab9dfc0 completion: improve bash completion for git-add
Command completion for git-add did not recognize some long-options.
This commits adds completion for all long-options that are mentioned in
the man-page synopsis. In addition, if the user specified `--update` or
`-u`, path completion will only suggest modified tracked files.

Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:25:46 -08:00
Cornelius Weig
e24a256b59 completion: add subcommand completion for rerere
Managing recorded resolutions requires command-line usage of git-rerere.
Added subcommand completion for rerere and path completion for its
subcommand forget.

Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:25:46 -08:00
Cornelius Weig
65d5a1e0a5 completion: teach submodule subcommands to complete options
Each submodule subcommand has specific long-options. Therefore, teach
bash completion to support option completion based on the current
subcommand. All long-options that are mentioned in the man-page synopsis
are added.

Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:25:46 -08:00
SZEDER Gábor
fad9484f0a completion: cache the path to the repository
After the previous changes in this series there are only a handful of
$(__gitdir) command substitutions left in the completion script, but
there is still a bit of room for improvements:

  1. The command substitution involves the forking of a subshell,
     which has considerable overhead on some platforms.

  2. There are a few cases, where this command substitution is
     executed more than once during a single completion, which means
     multiple subshells and possibly multiple 'git rev-parse'
     executions.  __gitdir() is invoked twice while completing refs
     for e.g. 'git log', 'git rebase', 'gitk', or while completing
     remote refs for 'git fetch' or 'git push'.

Both of these points can be addressed by using the
__git_find_repo_path() helper function introduced in the previous
commit:

  1. __git_find_repo_path() stores the path to the repository in a
     variable instead of printing it, so the command substitution
     around the function can be avoided.  Or rather: the command
     substitution should be avoided to make the new value of the
     variable set inside the function visible to the callers.
     (Yes, there is now a command substitution inside
     __git_find_repo_path() around each 'git rev-parse', but that's
     executed only if necessary, and only once per completion, see
     point 2. below.)

  2. $__git_repo_path, the variable holding the path to the
     repository, is declared local in the toplevel completion
     functions __git_main() and __gitk_main().  Thus, once set, the
     path is visible in all completion functions, including all
     subsequent calls to __git_find_repo_path(), meaning that they
     wouldn't have to re-discover the path to the repository.

So call __git_find_repo_path() and use $__git_repo_path instead of the
$(__gitdir) command substitution to access paths in the .git
directory.  Turn tests checking __gitdir()'s repository discovery into
tests of __git_find_repo_path() such that only the tested function
changes but the expected results don't, ensuring that repo discovery
keeps working as it did before.

As __gitdir() is not used anymore in the completion script, mark it as
deprecated and direct users' attention to __git_find_repo_path() and
$__git_repo_path.  Yet keep four __gitdir() tests to ensure that it
handles success and failure of __git_find_repo_path() and that it
still handles its optional remote argument, because users' custom
completion scriptlets might depend on it.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:41 -08:00
SZEDER Gábor
beb6ee7163 completion: extract repository discovery from __gitdir()
To prepare for caching the path to the repository in the following
commit, extract the repository discovering part of __gitdir() into the
__git_find_repo_path() helper function, which stores the found path in
the $__git_repo_path variable instead of printing it.  Make __gitdir()
a wrapper around this new function.  Declare $__git_repo_path local in
the toplevel completion functions __git_main() and __gitk_main() to
ensure that it never leaks into the environment and influences
subsequent completions (though this isn't necessary right now, as
__gitdir() is still only executed in subshells, but will matter for
the following commit).

Adjust tests checking __gitdir() or any other completion function
calling __gitdir() to perform those checks in a subshell to prevent
$__git_repo_path from leaking into the test environment.  Otherwise
leave the tests unchanged to demonstrate that this change doesn't
alter __gitdir()'s behavior.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:41 -08:00
SZEDER Gábor
a958d40f4e completion: don't guard git executions with __gitdir()
Three completion functions, namely __git_index_files(), __git_heads()
and __git_tags(), first run __gitdir() and check that the path it
outputs exists, i.e. that there is a git repository, and run a git
command only if there is one.

After the previous changes in this series there are no further uses of
__gitdir()'s output in these functions besides those checks.  And
those checks are unnecessary, because we can just execute those git
commands outside of a repository and let them error out.  We don't
perform such a check in other places either.

Remove this check and the __gitdir() call from these functions,
sparing the fork()+exec() overhead of the command substitution and the
potential 'git rev-parse' execution.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:41 -08:00
SZEDER Gábor
e15098a314 completion: consolidate silencing errors from git commands
Outputting error messages during completion is bad: they disrupt the
command line, can't be deleted, and the user is forced to Ctrl-C and
start over most of the time.  We already silence stderr of many git
commands in our Bash completion script, but there are still some in
there that can spew error messages when something goes wrong.

We could add the missing stderr redirections to all the remaining
places, but instead let's leverage that git commands are now executed
through the previously introduced __git() wrapper function, and
redirect standard error to /dev/null only in that function.  This way
we need only one redirection to take care of errors from almost all
git commands.  Redirecting standard error of the __git() wrapper
function thus became redundant, remove them.

The exceptions, i.e. the repo-independent git executions and those in
the __gitdir() function that don't go through __git() already have
their standard error silenced.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:41 -08:00
SZEDER Gábor
1cd23e9e05 completion: don't use __gitdir() for git commands
Several completion functions contain the following pattern to run git
commands respecting the path to the repository specified on the
command line:

  git --git-dir="$(__gitdir)" <cmd> <options>

This imposes the overhead of fork()ing a subshell for the command
substitution and potentially fork()+exec()ing 'git rev-parse' inside
__gitdir().

Now, if neither '--gitdir=<path>' nor '-C <path>' options are
specified on the command line, then those git commands are perfectly
capable to discover the repository on their own.  If either one or
both of those options are specified on the command line, then, again,
the git commands could discover the repository, if we pass them all of
those options from the command line.

This means we don't have to run __gitdir() at all for git commands and
can spare its fork()+exec() overhead.

Use Bash parameter expansions to check the $__git_dir variable and
$__git_C_args array and to assemble the appropriate '--git-dir=<path>'
and '-C <path>' options if either one or both are present on the
command line.  These parameter expansions are, however, rather long,
so instead of changing all git executions and make already long lines
even longer, encapsulate running git with '--git-dir=<path> -C <path>'
options into the new __git() wrapper function.  Furthermore, this
wrapper function will also enable us to silence error messages from
git commands uniformly in one place in a later commit.

There's one tricky case, though: in __git_refs() local refs are listed
with 'git for-each-ref', where "local" is not necessarily the
repository we are currently in, but it might mean a remote repository
in the filesystem (e.g. listing refs for 'git fetch /some/other/repo
<TAB>').  Use one-shot variable assignment to override $__git_dir with
the path of the repository where the refs should come from.  Although
one-shot variable assignments in front of shell functions are to be
avoided in our scripts in general, in the Bash completion script we
can do that safely.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:41 -08:00
SZEDER Gábor
80ac0744b1 completion: respect 'git -C <path>'
'git -C <path>' option(s) on the command line should be taken into
account during completion, because

  - like '--git-dir=<path>', it can lead us to a different repository,

  - a few git commands executed in the completion script do care about
    in which directory they are executed, and

  - the command for which we are providing completion might care about
    in which directory it will be executed.

However, unlike '--git-dir=<path>', the '-C <path>' option can be
specified multiple times and their effect is cumulative, so we can't
just store a single '<path>' in a variable.  Nor can we simply
concatenate a path from '-C <path1> -C <path2> ...', because e.g. (in
an arguably pathological corner case) a relative path might be
followed by an absolute path.

Instead, store all '-C <path>' options word by word in the
$__git_C_args array in the main git completion function, and pass this
array, if present, to 'git rev-parse --absolute-git-dir' when
discovering the repository in __gitdir(), and let it take care of
multiple options, relative paths, absolute paths and everything.

Also pass all '-C <path> options via the $__git_C_args array to those
git executions which require a worktree and for which it matters from
which directory they are executed from.  There are only three such
cases:

  - 'git diff-index' and 'git ls-files' in __git_ls_files_helper()
    used for git-aware filename completion, and

  - the 'git ls-tree' used for completing the 'ref:path' notation.

The other git commands executed in the completion script don't need
these '-C <path>' options, because __gitdir() already took those
options into account.  It would not hurt them, either, but let's not
induce unnecessary code churn.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:41 -08:00
SZEDER Gábor
a2f5a87626 rev-parse: add '--absolute-git-dir' option
The output of 'git rev-parse --git-dir' can be either a relative or an
absolute path, depending on whether the current working directory is
at the top of the worktree or the .git directory or not, or how the
path to the repository is specified via the '--git-dir=<path>' option
or the $GIT_DIR environment variable.  And if that output is a
relative path, then it is relative to the directory where any 'git
-C <path>' options might have led us.

This doesn't matter at all for regular scripts, because the git
wrapper automatically takes care of changing directories according to
the '-C <path>' options, and the scripts can then simply follow any
path returned by 'git rev-parse --git-dir', even if it's a relative
path.

Our Bash completion script, however, is unique in that it must run
directly in the user's interactive shell environment.  This means that
it's not executed through the git wrapper and would have to take care
of any '-C <path> options on its own, and it can't just change
directories as it pleases.  Consequently, adding support for taking
any '-C <path>' options on the command line into account during
completion turned out to be considerably more difficult, error prone
and required more subshells and git processes when it had to cope with
a relative path to the .git directory.

Help this rather special use case and teach 'git rev-parse' a new
'--absolute-git-dir' option which always outputs a canonicalized
absolute path to the .git directory, regardless of whether the path is
discovered automatically or is specified via $GIT_DIR or 'git
--git-dir=<path>'.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:41 -08:00
SZEDER Gábor
336d694ce4 completion: fix completion after 'git -C <path>'
The main completion function finds the name of the git command by
iterating through all the words on the command line in search for the
first non-option-looking word.  As it is not aware of 'git -C's
mandatory path argument, if the '-C <path>' option is present, 'path'
will be the first such word and it will be mistaken for a git command.
This breaks completion in various ways:

 - If 'path' happens to match one of the commands supported by the
   completion script, then options of that command will be offered.

 - If 'path' doesn't match a supported command and doesn't contain any
   characters not allowed in Bash identifier names, then the
   completion script does basically nothing and Bash in turn falls
   back to filename completion for all subsequent words.

 - Otherwise, if 'path' does contain such an unallowed character, then
   it leads to a more or less ugly error message in the middle of the
   command line.  The standard '/' directory separator is such a
   character, and it happens to trigger one of the uglier errors:

     $ git -C some/path <TAB>sh.exe": declare: `_git_some/path': not a valid identifier
     error: invalid key: alias.some/path

Fix this by skipping 'git -C's mandatory path argument while iterating
over the words on the command line.  Extend the relevant test with
this case and, while at it, with cases that needed similar treatment
in the past ('--git-dir', '-c', '--work-tree' and '--namespace').

Additionally, silence the standard error of the 'declare' builtins
looking for the completion function associated with the git command
and of the 'git config' query for the aliased command.  So if git ever
learns a new option with a mandatory argument in the future, then,
though the completion script will again misbehave, at least the
command line will not be utterly disrupted by those error messages.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:41 -08:00
SZEDER Gábor
7b329b9dab completion: don't offer commands when 'git --opt' needs an argument
The main git options '--git-dir', '-c', '-C', '--worktree' and
'--namespace' require an argument, but attempting completion right
after them lists git commands.

Don't offer anything right after these options, thus let Bash fall
back to filename completion, because

  - the three options '--git-dir', '-C' and '--worktree' do actually
    require a path argument, and

  - we don't complete the required argument of '-c' and '--namespace',
    and in that case the "standard" behavior of our completion script
    is to not offer anything, but fall back to filename completion.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:40 -08:00
SZEDER Gábor
91b7ea81e8 completion: list short refs from a remote given as a URL
e832f5c096 (completion: avoid ls-remote in certain scenarios,
2013-05-28) turned a 'git ls-remote <remote>' query into a 'git
for-each-ref refs/remotes/<remote>/' to improve responsiveness of
remote refs completion by avoiding potential network communication.
However, it inadvertently made impossible to complete short refs from
a remote given as a URL, e.g. 'git fetch git://server.com/repo.git
<TAB>', because there is, of course, no such thing as
'refs/remotes/git://server.com/repo.git'.

Since the previous commit we tell apart configured remotes, i.e. those
that can have a hierarchy under 'refs/remotes/', from others that
don't, including remotes given as URL, so we know when we can't use
the faster 'git for-each-ref'-based approach.

Resurrect the old, pre-e832f5c09680 'git ls-remote'-based code for the
latter case to support listing short refs from remotes given as a URL.
The code is slightly updated from the original to

  - take into account the path to the repository given on the command
    line (if any), and
  - omit 'ORIG_HEAD' from the query, as 'git ls-remote' will never
    list it anyway.

When the remote given to __git_refs() doesn't exist, then it will be
handled by this resurrected 'git ls-remote' query.  This code path
doesn't list 'HEAD' unconditionally, which has the nice side effect of
fixing two more expected test failures.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:40 -08:00
SZEDER Gábor
62a1b73216 completion: don't list 'HEAD' when trying refs completion outside of a repo
When refs completion is attempted while not in a git repository, the
completion script offers 'HEAD' erroneously.

Check early in __git_refs() that there is either a repository or a
remote to work on, and return early if neither is given.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:40 -08:00
SZEDER Gábor
69a775963b completion: list refs from remote when remote's name matches a directory
If the remote given to __git_refs() happens to match both the name of
a configured remote and the name of a directory in the current working
directory, then that directory is assumed to be a git repository, and
listing refs from that directory will be attempted.  This is wrong,
because in such a situation git commands (e.g. 'git fetch|pull|push
<remote>' whom these refs will eventually be passed to) give
precedence to the configured remote.  Therefore, __git_refs() should
list refs from the configured remote as well.

Add the helper function __git_is_configured_remote() that checks
whether its argument matches the name of a configured remote.  Use
this helper to decide how to handle the remote passed to __git_refs().

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:40 -08:00
SZEDER Gábor
5c12f642df completion: respect 'git --git-dir=<path>' when listing remote refs
In __git_refs() the git commands listing refs, both short and full,
from a given remote repository are run without giving them the path to
the git repository which might have been specified on the command line
via 'git --git-dir=<path>'.  This is bad, those git commands should
access the 'refs/remotes/<remote>/' hierarchy or the remote and
credentials configuration in that specified repository.

Use the __gitdir() helper only to find the path to the .git directory
and pass the resulting path to the 'git ls-remote' and 'for-each-ref'
executions that list remote refs.  While modifying that 'for-each-ref'
line, remove the superfluous disambiguating doubledash.

Don't use __gitdir() to check that the given remote is on the file
system: basically it performs only a single if statement for us at the
considerable cost of fork()ing a subshell for a command substitution.
We are better off to perform all the necessary checks of the remote in
__git_refs().

Though __git_refs() was the last remaining callsite that passed a
remote to __gitdir(), don't delete __gitdir()'s remote-handling part
yet, just in case some users' custom completion scriptlets depend on
it.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:40 -08:00
SZEDER Gábor
3bcb41f976 completion: fix most spots not respecting 'git --git-dir=<path>'
The completion script already respects the path to the repository
specified on the command line most of the time, here we add the
necessary '--git-dir=$(__gitdir)' options to most of the places where
git was executed without it.  The exceptions where said option is not
added are the git invocations:

  - in __git_refs() which are non-trivial and will be the subject of
    the following patch,

  - getting the list of git commands, merge strategies and archive
    formats, because these are independent from the repository and
    thus don't need it, and

  - the 'git rev-parse --git-dir' in __gitdir() itself.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:40 -08:00
SZEDER Gábor
a2f03b0ec8 completion: ensure that the repository path given on the command line exists
The __gitdir() helper function prints the path to the git repository
to its stdout or stays silent and returns with error when it can't
find a repository or when the repository given via $GIT_DIR doesn't
exist.

This is not the case, however, when the path in $__git_dir, i.e. the
path to the repository specified on the command line via 'git
--git-dir=<path>', doesn't exist: __gitdir() still outputs it as if it
were a real existing repository, making some completion functions
believe that they operate on an existing repository.

Check that the path in $__git_dir exists and return with error without
printing anything to stdout if it doesn't.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:40 -08:00
SZEDER Gábor
fb9cd42042 completion tests: add tests for the __git_refs() helper function
Check how __git_refs() lists refs in different scenarios, i.e.

  - short and full refs,
  - from a local or from a remote repository,
  - remote specified via path, name or URL,
  - with or without a repository specified on the command line,
  - non-existing remote,
  - unique remote branches for 'git checkout's tracking DWIMery,
  - not in a git repository, and
  - interesting combinations of the above.

Seven of these tests expect failure, mostly demonstrating bugs related
to listing refs from a remote repository:

  - ignoring the repository specified on the command line (2 tests),
  - listing refs from the wrong place when the name of a configured
    remote happens to match a directory,
  - listing only 'HEAD' but no short refs from a remote given as URL,
  - listing 'HEAD' even from non-existing remotes (2 tests), and
  - listing 'HEAD' when not in a repository.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03 22:18:40 -08:00