Commit Graph

17779 Commits

Author SHA1 Message Date
Junio C Hamano
b69bed22c5 Merge branch 'jk/log-cherry-pick-duplicate-patches'
When more than one commit with the same patch ID appears on one
side, "git log --cherry-pick A...B" did not exclude them all when a
commit with the same patch ID appears on the other side.  Now it
does.

* jk/log-cherry-pick-duplicate-patches:
  patch-ids: handle duplicate hashmap entries
2021-01-25 14:19:19 -08:00
Junio C Hamano
27d7c8599b Merge branch 'js/default-branch-name-tests-final-stretch'
Prepare tests not to be affected by the name of the default branch
"git init" creates.

* js/default-branch-name-tests-final-stretch: (28 commits)
  tests: drop prereq `PREPARE_FOR_MAIN_BRANCH` where no longer needed
  t99*: adjust the references to the default branch name "main"
  tests(git-p4): transition to the default branch name `main`
  t9[5-7]*: adjust the references to the default branch name "main"
  t9[0-4]*: adjust the references to the default branch name "main"
  t8*: adjust the references to the default branch name "main"
  t7[5-9]*: adjust the references to the default branch name "main"
  t7[0-4]*: adjust the references to the default branch name "main"
  t6[4-9]*: adjust the references to the default branch name "main"
  t64*: preemptively adjust alignment to prepare for `master` -> `main`
  t6[0-3]*: adjust the references to the default branch name "main"
  t5[6-9]*: adjust the references to the default branch name "main"
  t55[4-9]*: adjust the references to the default branch name "main"
  t55[23]*: adjust the references to the default branch name "main"
  t551*: adjust the references to the default branch name "main"
  t550*: adjust the references to the default branch name "main"
  t5503: prepare aligned comment for replacing `master` with `main`
  t5[0-4]*: adjust the references to the default branch name "main"
  t5323: prepare centered comment for `master` -> `main`
  t4*: adjust the references to the default branch name "main"
  ...
2021-01-25 14:19:18 -08:00
Junio C Hamano
440acfbe0c Merge branch 'dl/reflog-with-single-entry'
After expiring a reflog and making a single commit, the reflog for
the branch would record a single entry that knows both @{0} and
@{1}, but we failed to answer "what commit were we on?", i.e. @{1}

* dl/reflog-with-single-entry:
  refs: allow @{n} to work with n-sized reflog
  refs: factor out set_read_ref_cutoffs()
2021-01-25 14:19:18 -08:00
Junio C Hamano
0806279428 Merge branch 'sj/untracked-files-in-submodule-directory-is-not-dirty'
"git diff" showed a submodule working tree with untracked cruft as
"Submodule commit <objectname>-dirty", but a natural expectation is
that the "-dirty" indicator would align with "git describe --dirty",
which does not consider having untracked files in the working tree
as source of dirtiness.  The inconsistency has been fixed.

* sj/untracked-files-in-submodule-directory-is-not-dirty:
  diff: do not show submodule with untracked files as "-dirty"
2021-01-25 14:19:18 -08:00
Junio C Hamano
dfcd905069 Merge branch 'jc/deprecate-pack-redundant'
Warn loudly when the "pack-redundant" command, which has been left
stale with almost unusable performance issues, gets used, as we no
longer want to recommend its use (instead just "repack -d" instead).

* jc/deprecate-pack-redundant:
  pack-redundant: gauge the usage before proposing its removal
2021-01-25 14:19:18 -08:00
Junio C Hamano
c7b1aaf6d6 Merge branch 'jk/forbid-lf-in-git-url'
Newline characters in the host and path part of git:// URL are
now forbidden.

* jk/forbid-lf-in-git-url:
  fsck: reject .gitmodules git:// urls with newlines
  git_connect_git(): forbid newlines in host and path
2021-01-25 14:19:17 -08:00
Junio C Hamano
9e409d7e07 Merge branch 'ab/branch-sort'
The implementation of "git branch --sort" wrt the detached HEAD
display has always been hacky, which has been cleaned up.

* ab/branch-sort:
  branch: show "HEAD detached" first under reverse sort
  branch: sort detached HEAD based on a flag
  ref-filter: move ref_sorting flags to a bitfield
  ref-filter: move "cmp_fn" assignment into "else if" arm
  ref-filter: add braces to if/else if/else chain
  branch tests: add to --sort tests
  branch: change "--local" to "--list" in comment
2021-01-25 14:19:17 -08:00
Junio C Hamano
a5ac31b5b1 Merge branch 'en/diffcore-rename'
File-level rename detection updates.

* en/diffcore-rename:
  diffcore-rename: remove unnecessary duplicate entry checks
  diffcore-rename: accelerate rename_dst setup
  diffcore-rename: simplify and accelerate register_rename_src()
  t4058: explore duplicate tree entry handling in a bit more detail
  t4058: add more tests and documentation for duplicate tree entry handling
  diffcore-rename: reduce jumpiness in progress counters
  diffcore-rename: simplify limit check
  diffcore-rename: avoid usage of global in too_many_rename_candidates()
  diffcore-rename: rename num_create to num_destinations
2021-01-25 14:19:17 -08:00
Junio C Hamano
c7d6d419b0 Merge branch 'ab/mktag'
"git mktag" validates its input using its own rules before writing
a tag object---it has been updated to share the logic with "git
fsck".

* ab/mktag: (23 commits)
  mktag: add a --[no-]strict option
  mktag: mark strings for translation
  mktag: convert to parse-options
  mktag: allow omitting the header/body \n separator
  mktag: allow turning off fsck.extraHeaderEntry
  fsck: make fsck_config() re-usable
  mktag: use fsck instead of custom verify_tag()
  mktag: use puts(str) instead of printf("%s\n", str)
  mktag: remove redundant braces in one-line body "if"
  mktag: use default strbuf_read() hint
  mktag tests: test verify_object() with replaced objects
  mktag tests: improve verify_object() test coverage
  mktag tests: test "hash-object" compatibility
  mktag tests: stress test whitespace handling
  mktag tests: run "fsck" after creating "mytag"
  mktag tests: don't create "mytag" twice
  mktag tests: don't redirect stderr to a file needlessly
  mktag tests: remove needless SHA-1 hardcoding
  mktag tests: use "test_commit" helper
  mktag tests: don't needlessly use a subshell
  ...
2021-01-25 14:19:17 -08:00
Ævar Arnfjörð Bjarmason
95ca1f987e grep/pcre2: better support invalid UTF-8 haystacks
Improve the support for invalid UTF-8 haystacks given a non-ASCII
needle when using the PCREv2 backend.

This is a more complete fix for a bug I started to fix in
870eea8166 (grep: do not enter PCRE2_UTF mode on fixed matching,
2019-07-26), now that PCREv2 has the PCRE2_MATCH_INVALID_UTF mode we
can make use of it.

This fixes the sort of case described in 8a5999838e (grep: stess test
PCRE v2 on invalid UTF-8 data, 2019-07-26), i.e.:

    - The subject string is non-ASCII (e.g. "ævar")
    - We're under a is_utf8_locale(), e.g. "en_US.UTF-8", not "C"
    - We are using --ignore-case, or we're a non-fixed pattern

If those conditions were satisfied and we matched found non-valid
UTF-8 data PCREv2 might bark on it, in practice this only happened
under the JIT backend (turned on by default on most platforms).

Ultimately this fixes a "regression" in b65abcafc7 ("grep: use PCRE v2
for optimized fixed-string search", 2019-07-01), I'm putting that in
scare-quotes because before then we wouldn't properly support these
complex case-folding, locale etc. cases either, it just broke in
different ways.

There was a bug related to this the PCRE2_NO_START_OPTIMIZE flag fixed
in PCREv2 10.36. It can be worked around by setting the
PCRE2_NO_START_OPTIMIZE flag. Let's do that in those cases, and add
tests for the bug.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-24 16:09:17 -08:00
Ævar Arnfjörð Bjarmason
a4fea08b6e grep/pcre2 tests: don't rely on invalid UTF-8 data test
As noted in [1] when I originally added this test in [2] the test was
completely broken as it lacked a redirect[3]. I now think this whole
thing is overly fragile. Let's only test if we have a segfault here.

Before this the first test's "test_cmp" was pretty meaningless. We
were only testing if PCREv2 was so broken that it would spew out
something completely unrelated on stdout, which isn't very plausible.

In the second test we're relying on PCREv2 forever holding to the
current behavior of the PCRE_UTF8 flag, as opposed to learning some
optimistic graceful fallback to PCRE2_MATCH_INVALID_UTF in the
future. If that happens having this test broken under bisecting would
suck.

A follow-up commit will actually test this case in a meaningful way
under the PCRE2_MATCH_INVALID_UTF flag. Let's run this one
unconditionally, and just make sure we don't segfault.

1. e714b898c6 (t7812: expect failure for grep -i with invalid UTF-8
   data, 2019-11-29)
2. 8a5999838e (grep: stess test PCRE v2 on invalid UTF-8 data,
   2019-07-26)
3. c74b3cbb83 (t7812: add missing redirects, 2019-11-26)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-24 16:09:15 -08:00
Ævar Arnfjörð Bjarmason
7599730b7e Remove support for v1 of the PCRE library
Remove support for using version 1 of the PCRE library. Its use has
been discouraged by upstream for a long time, and it's in a
bugfix-only state.

Anyone who was relying on v1 in particular got a nudge to move to v2
in e6c531b808 (Makefile: make USE_LIBPCRE=YesPlease mean v2, not v1,
2018-03-11), which was first released as part of v2.18.0.

With this the LIBPCRE2 test prerequisites is redundant to PCRE. But
I'm keeping it for self-documentation purposes, and to avoid conflict
with other in-flight PCRE patches.

I'm also not changing all of our own "pcre2" names to "pcre", i.e. the
inverse of 6d4b5747f0 (grep: change internal *pcre* variable &
function names to be *pcre1*, 2017-05-25). I don't see the point, and
it makes the history/blame harder to read. Maybe if there's ever a
PCRE v3...

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 21:15:43 -08:00
Derrick Stolee
19a0acc83e t1092: test interesting sparse-checkout scenarios
These also document some behaviors that differ from a full checkout, and
possibly in a way that is not intended.

The test is designed to be run with "--run=1,X" where 'X' is an
interesting test case. Each test uses 'init_repos' to reset the full and
sparse copies of the initial-repo that is created by the first test
case. This also makes it possible to have test cases leave the working
directory or index in unusual states without disturbing later cases.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:20 -08:00
Derrick Stolee
3b14436364 test-lib: test_region looks for trace2 regions
From ff15d509b89edd4830d85d53cea3079a6b0c1c08 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Mon, 11 Jan 2021 08:53:09 -0500
Subject: [PATCH 8/9] test-lib: test_region looks for trace2 regions

Most test cases can verify Git's behavior using input/output
expectations or changes to the .git directory. However, sometimes we
want to check that Git did or did not run a certain section of code.
This is particularly important for performance-only features that we
want to ensure have been enabled in certain cases.

Add a new 'test_region' function that checks if a trace2 region was
entered and left in a given trace2 event log.

There is one existing test (t0500-progress-display.sh) that performs
this check already, so use the helper function instead. Note that this
changes the expectations slightly. The old test (incorrectly) used two
patterns for the 'grep' invocation, but this performs an OR of the
patterns, not an AND. This means that as long as one region_enter event
was logged, the test would succeed, even if it was not due to the
progress category.

More uses will be added in a later change.

t6423-merge-rename-directories.sh also greps for region_enter lines, but
it verifies the number of such lines, which is not the same as an
existence check.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:18 -08:00
Ævar Arnfjörð Bjarmason
db89a82b5b rm tests: actually test for SIGPIPE in SIGPIPE test
Change a test initially added in 50cd31c652 (t3600: comment on
inducing SIGPIPE in `git rm`, 2019-11-27) to explicitly test for
SIGPIPE using a pattern initially established in 7559a1be8a (unblock
and unignore SIGPIPE, 2014-09-18).

The problem with using that pattern is that it requires us to skip the
test on MINGW[1]. If we kept the test with its initial semantics[2]
we'd get coverage there, at the cost of not checking whether we
actually had SIGPIPE outside of MinGW.

Arguably we should just remove this test. Between the test added in
7559a1be8a and the change made in 12e0437f23 (common-main: call
restore_sigpipe_to_default(), 2016-07-01) it's a bit arbitrary to only
check this for "git rm".

But in lieu of having wider test coverage for other "git" subcommands
let's refactor this to explicitly test for SIGPIPE outside of MinGW,
and then just that we remove the ".git/index.lock" (as before) on all
platforms.

1. https://lore.kernel.org/git/xmqq1rec5ckf.fsf@gitster.c.googlers.com/
2. 0693f9ddad (Make sure lockfiles are unlocked when dying on SIGPIPE,
   2008-12-18)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason
60127996b5 archive tests: use a cheaper "zipinfo -h" invocation to get header
Change an invocation of zipinfo added in 19ee29401d (t5004: test ZIP
archives with many entries, 2015-08-22) to simply ask zipinfo for the
header info, rather than spewing out info about the entire archive and
race to kill it with SIGPIPE due to the downstream "head -2".

I ran across this because I'm adding a "set -o pipefail" test
mode. This won't be needed for the version of the mode that I'm
introducing (which currently relies on a patch to GNU bash), but I
think this is a good idea anyway.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason
9aebc4708a upload-pack tests: avoid a non-zero "grep" exit status
Continue changing a test that 763b47bafa (t5703: stop losing return
codes of git commands, 2019-11-27) already refactored.

This was originally added as part of a series to add support for
running under bash's "set -o pipefail", under that mode this test will
fail because sometimes there's no commits in the "objs" output.

It's easier to fix that than exempt these tests under a hypothetical
"set -o pipefail" test mode. It looks like we probably won't have
that, but once we've dug this code up let's refactor it[2] so we don't
hide a potential pipe failure.

1. https://lore.kernel.org/git/xmqqzh18o8o6.fsf@gitster.c.googlers.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
Jeff King
796c248dc1 git-svn tests: rewrite brittle tests to use "--[no-]merges".
Rewrite a brittle tests which used "rev-list" without "--[no-]merges"
to figure out if a set of commits turned into merge commits or not.

Signed-off-by: Jeff King <peff@peff.net>
[ÆAB: wrote commit message]
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason
f918a89e50 git svn mergeinfo tests: refactor "test -z" to use test_must_be_empty
Refactor some old-style test code to use test_must_be_empty instead of
"test -z". This makes a follow-up commit easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason
4669917e8f git svn mergeinfo tests: modernize redirection & quoting style
Use "<file" instead of "< file", and don't put the closing quote for
strings on an indented line. This makes a follow-up refactoring commit
easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason
ef83970059 cache-tree tests: explicitly test HEAD and index differences
The test code added in 9c4d6c0297 (cache-tree: Write updated
cache-tree after commit, 2014-07-13) used "ls-files" in lieu of
"ls-tree" because it wanted to test the data in the index, since this
test is testing the cache-tree extension.

Change the test to instead use "ls-tree" for traversal, and then
explicitly check how HEAD differs from the index. This is more easily
understood, and less fragile as numerous past bug fixes[1][2][3] to
the old code we're replacing demonstrate.

As an aside this would be a bit easier if empty pathspecs hadn't been
made an error in d426430e6e (pathspec: warn on empty strings as
pathspec, 2016-06-22) and 9e4e8a64c2 (pathspec: die on empty strings
as pathspec, 2017-06-06).

If that was still allowed this code could be simplified slightly:

	diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
	index 9bf66c9e68..0b02881f55 100755
	--- a/t/t0090-cache-tree.sh
	+++ b/t/t0090-cache-tree.sh
	@@ -18,19 +18,18 @@ cmp_cache_tree () {
	 # test-tool dump-cache-tree already verifies that all existing data is
	 # correct.
	 generate_expected_cache_tree () {
	-       pathspec="$1" &&
	-       dir="$2${2:+/}" &&
	+       pathspec="$1${1:+/}" &&
	        git ls-tree --name-only HEAD -- "$pathspec" >files &&
	        git ls-tree --name-only -d HEAD -- "$pathspec" >subtrees &&
	-       printf "SHA %s (%d entries, %d subtrees)\n" "$dir" $(wc -l <files) $(wc -l <subtrees) &&
	+       printf "SHA %s (%d entries, %d subtrees)\n" "$pathspec" $(wc -l <files) $(wc -l <subtrees) &&
	        while read subtree
	        do
	-               generate_expected_cache_tree "$pathspec/$subtree/" "$subtree" || return 1
	+               generate_expected_cache_tree "$subtree" || return 1
	        done <subtrees
	 }

	 test_cache_tree () {
	-       generate_expected_cache_tree "." >expect &&
	+       generate_expected_cache_tree >expect &&
	        cmp_cache_tree expect &&
	        rm expect actual files subtrees &&
	        git status --porcelain -- ':!status' ':!expected.status' >status &&

1. c8db708d5d (t0090: avoid passing empty string to printf %d,
   2014-09-30)
2. d69360c6b1 (t0090: tweak awk statement for Solaris
   /usr/xpg4/bin/awk, 2014-12-22)
3. 9b5a9fa60a (t0090: stop losing return codes of git commands,
   2019-11-27)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason
fa6edee776 cache-tree tests: use a sub-shell with less indirection
Change a "cd xyz && work && cd .." pattern introduced in
9c4d6c0297 (cache-tree: Write updated cache-tree after commit,
2014-07-13) to use a sub-shell instead with less indirection.

We did actually recover correctly if we failed in this function since
we were wrapped in a subshell one function call up. Let's just use the
sub-shell at the point where we want to change the directory
instead.

It's important that the "|| return 1" is outside the
subshell. Normally, we `exit 1` from within subshells[1], but that
wouldn't help us exit this loop early[1][2].

Since we can get rid of the wrapper function let's rename the main
function to drop the "rec" (for "recursion") suffix[3].

1. https://lore.kernel.org/git/CAPig+cToj8nQmyBCqC1k7DXF2vXaonCEA-fCJ4x7JBZG2ixYBw@mail.gmail.com/
2. https://lore.kernel.org/git/20150325052952.GE31924@peff.net/
3. https://lore.kernel.org/git/YARsCsgXuiXr4uFX@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason
3226725507 cache-tree tests: remove unused $2 parameter
Remove the $2 paramater. This appears to have been some
work-in-progress code from an earlier version of
9c4d6c0297 (cache-tree: Write updated cache-tree after commit,
2014-07-13) which was left in the final version.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:12 -08:00
Ævar Arnfjörð Bjarmason
3f96d75ef5 cache-tree tests: refactor for modern test style
Refactor the cache-tree test file to use our current recommended
patterns. This makes a subsequent meaningful change easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 13:25:11 -08:00
ZheNing Hu
93a7d9835f ls-files.c: add --deduplicate option
During a merge conflict, the name of a file may appear multiple
times in "git ls-files" output, once for each stage.  If you use
both `--delete` and `--modify` at the same time, the output may
mention a deleted file twice.

When none of the '-t', '-u', or '-s' options is in use, these
duplicate entries do not add much value to the output.

Introduce a new '--deduplicate' option to suppress them.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: extended doc and rewritten commit log]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 11:48:20 -08:00
Jiang Xin
822ee894f6 t5411: refactor check of refs using test_cmp_refs
Add new helper 'test_cmp_refs' to check references in a repository.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 13:09:06 -08:00
Jiang Xin
8388a64cd1 t5411: use different out file to prevent overwriting
SZEDER reported that t5411 failed in Travis CI's s390x environment a
couple of times, and could be reproduced with '--stress' test on this
specific environment.  The test failure messages might look like this:

    + test_cmp expect actual
    --- expect      2021-01-17 21:55:23.430750004 +0000
    +++ actual      2021-01-17 21:55:23.430750004 +0000
    @@ -1 +1 @@
    -<COMMIT-A> refs/heads/main
    +<COMMIT-A> refs/heads/maifatal: the remote end hung up unexpectedly
    error: last command exited with $?=1
    not ok 86 - proc-receive: not support push options (builtin protocol)

The file 'actual' is filtered from the file 'out' which contains result
of 'git show-ref' command.  Due to the error messages from other process
is written into the file 'out' accidentally, t5411 failed.  SZEDER finds
the root cause of this issue:

 - 'git push' is executed with its standard output and error redirected
   to the file 'out'.

 - 'git push' executes 'git receive-pack' internally, which inherits
   the open file descriptors, so its output and error goes into that
   same 'out' file.

 - 'git push' ends without waiting for the close of 'git-receive-pack'
   for some cases, and the file 'out' is reused for test of
   'git show-ref' afterwards.

 - A mixture of the output of 'git show-ref' abd 'git receive-pack'
   leads to this issue.

The first intuitive reaction to resolve this issue is to remove the
file 'out' after use, so that the newly created file 'out' will have a
different file descriptor and will not be overwritten by the
'git receive-pack' process.  But Johannes pointed out that removing an
open file is not possible on Windows.  So we use different temporary
file names to store the output of 'git push' to solve this issue.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 13:09:04 -08:00
Jeff King
36a317929b refs: switch peel_ref() to peel_iterated_oid()
The peel_ref() interface is confusing and error-prone:

  - it's typically used by ref iteration callbacks that have both a
    refname and oid. But since they pass only the refname, we may load
    the ref value from the filesystem again. This is inefficient, but
    also means we are open to a race if somebody simultaneously updates
    the ref. E.g., this:

      int some_ref_cb(const char *refname, const struct object_id *oid, ...)
      {
              if (!peel_ref(refname, &peeled))
                      printf("%s peels to %s",
                             oid_to_hex(oid), oid_to_hex(&peeled);
      }

    could print nonsense. It is correct to say "refname peels to..."
    (you may see the "before" value or the "after" value, either of
    which is consistent), but mentioning both oids may be mixing
    before/after values.

    Worse, whether this is possible depends on whether the optimization
    to read from the current iterator value kicks in. So it is actually
    not possible with:

      for_each_ref(some_ref_cb);

    but it _is_ possible with:

      head_ref(some_ref_cb);

    which does not use the iterator mechanism (though in practice, HEAD
    should never peel to anything, so this may not be triggerable).

  - it must take a fully-qualified refname for the read_ref_full() code
    path to work. Yet we routinely pass it partial refnames from
    callbacks to for_each_tag_ref(), etc. This happens to work when
    iterating because there we do not call read_ref_full() at all, and
    only use the passed refname to check if it is the same as the
    iterator. But the requirements for the function parameters are quite
    unclear.

Instead of taking a refname, let's instead take an oid. That fixes both
problems. It's a little funny for a "ref" function not to involve refs
at all. The key thing is that it's optimizing under the hood based on
having access to the ref iterator. So let's change the name to make it
clear why you'd want this function versus just peel_object().

There are two other directions I considered but rejected:

  - we could pass the peel information into the each_ref_fn callback.
    However, we don't know if the caller actually wants it or not. For
    packed-refs, providing it is essentially free. But for loose refs,
    we actually have to peel the object, which would be wasteful in most
    cases. We could likewise pass in a flag to the callback indicating
    whether the peeled information is known, but that complicates those
    callbacks, as they then have to decide whether to manually peel
    themselves. Plus it requires changing the interface of every
    callback, whether they care about peeling or not, and there are many
    of them.

  - we could make a function to return the peeled value of the current
    iterated ref (computing it if necessary), and BUG() otherwise. I.e.:

      int peel_current_iterated_ref(struct object_id *out);

    Each of the current callers is an each_ref_fn callback, so they'd
    mostly be happy. But:

      - we use those callbacks with functions like head_ref(), which do
        not use the iteration code. So we'd need to handle the fallback
        case there, anyway.

      - it's possible that a caller would want to call into generic code
        that sometimes is used during iteration and sometimes not. This
        encapsulates the logic to do the fast thing when possible, and
        fallback when necessary.

The implementation is mostly obvious, but I want to call out a few
things in the patch:

  - the test-tool coverage for peel_ref() is now meaningless, as it all
    collapses to a single peel_object() call (arguably they were pretty
    uninteresting before; the tricky part of that function is the
    fast-path we see during iteration, but these calls didn't trigger
    that). I've just dropped it entirely, though note that some other
    tests relied on the tags we created; I've moved that creation to the
    tests where it matters.

  - we no longer need to take a ref_store parameter, since we'd never
    look up a ref now. We do still rely on a global "current iterator"
    variable which _could_ be kept per-ref-store. But in practice this
    is only useful if there are multiple recursive iterations, at which
    point the more appropriate solution is probably a stack of
    iterators. No caller used the actual ref-store parameter anyway
    (they all call the wrapper that passes the_repository).

  - the original only kicked in the optimization when the "refname"
    pointer matched (i.e., not string comparison). We do likewise with
    the "oid" parameter here, but fall back to doing an actual oideq()
    call. This in theory lets us kick in the optimization more often,
    though in practice no current caller cares. It should never be
    wrong, though (peeling is a property of an object, so two refs
    pointing to the same object would peel identically).

  - the original took care not to touch the peeled out-parameter unless
    we found something to put in it. But no caller cares about this, and
    anyway, it is enforced by peel_object() itself (and even in the
    optimized iterator case, that's where we eventually end up). We can
    shorten the code and avoid an extra copy by just passing the
    out-parameter through the stack.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 15:51:31 -08:00
Ævar Arnfjörð Bjarmason
73c01d25fe tests: remove uses of GIT_TEST_GETTEXT_POISON=false
As noted in previous commits we are removing the use of
GIT_TEST_GETTEXT_POISON=false. These tests all relied on the facility
being off, it always is off after an earlier change, but we hadn't
removed the redundant assignments to "false" in the tests.

I'm preserving the deletion of "error" lines in 38b9197a76 (t5411:
add basic test cases for proc-receive hook, 2020-08-27), it turns out
that's useful even without GIT_TEST_GETTEXT_POISON=true in
play. Update a comment added in that commit to note that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 15:50:03 -08:00
Ævar Arnfjörð Bjarmason
d162b25f95 tests: remove support for GIT_TEST_GETTEXT_POISON
This removes the ability to inject "poison" gettext() messages via the
GIT_TEST_GETTEXT_POISON special test setup.

I initially added this as a compile-time option in bb946bba76 (i18n:
add GETTEXT_POISON to simulate unfriendly translator, 2011-02-22), and
most recently modified to be toggleable at runtime in
6cdccfce1e (i18n: make GETTEXT_POISON a runtime option, 2018-11-08)..

The reason for its removal is that the trade-off of maintaining it
v.s. what it's getting us has long since flipped. When gettext was
integrated in 5e9637c629 (i18n: add infrastructure for translating
Git with gettext, 2011-11-18) there was understandable concern on the
Git ML that in marking messages for translation en-masse we'd
inadvertently mark plumbing messages. The GETTEXT_POISON facility was
a way to smoke those out via our test suite.

Nowadays however we're done (or almost entirely done) with any marking
of messages for translation. New messages are usually marked by their
authors, who'll know whether it makes sense to translate them or
not. If not any errors in marking the messages are much more likely to
be spotted in review than in the the initial deluge of i18n patches in
the 2011-2012 era.

So let's just remove this. This leaves the test suite in a state where
we still have a lot of test_i18n, C_LOCALE_OUTPUT
etc. uses. Subsequent commits will remove those too.

The change to t/lib-rebase.sh is a selective revert of the relevant
part of f2d17068fd (i18n: rebase-interactive: mark comments of squash
for translation, 2016-06-17), and the comment in
t/t3406-rebase-message.sh is from c7108bf9ed (i18n: rebase: mark
messages for translation, 2012-07-25).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 15:50:01 -08:00
Derrick Stolee
3cf5f221be t7900: clean up some broken refs
The tests for the 'prefetch' task create remotes and fetch refs into
'refs/prefetch/<remote>/' and tags into 'refs/tags/'. These tests use
the remotes to create objects not intended to be seen by the "local"
repository.

In that sense, the incrmental-repack tasks did not have these objects
and refs in mind. That test replaces the object directory with a
specific pack-file layout for testing the batch-size logic. However,
this causes some operations to start showing warnings such as:

 error: refs/prefetch/remote1/one does not point to a valid object!
 error: refs/tags/one does not point to a valid object!

This only shows up if you run the tests verbosely and watch the output.
It caught my eye and I _thought_ that there was a bug where 'git gc' or
'git repack' wouldn't check 'refs/prefetch/' before pruning objects.
That is incorrect. Those commands do handle 'refs/prefetch/' correctly.

All that is left is to clean up the tests in t7900-maintenance.sh to
remove these tags and refs that are not being repacked for the
incremental-repack tests. Use update-ref to ensure this works with all
ref backends.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 18:46:22 -08:00
Derrick Stolee
96eaffebbf maintenance: set log.excludeDecoration durin prefetch
The 'prefetch' task fetches refs from all remotes and places them in the
refs/prefetch/<remote>/ refspace. As this task is intended to run in the
background, this allows users to keep their local data very close to the
remote servers' data while not updating the users' understanding of the
remote refs in refs/remotes/<remote>/.

However, this can clutter 'git log' decorations with copies of the refs
with the full name 'refs/prefetch/<remote>/<branch>'.

The log.excludeDecoration config option was added in a6be5e67 (log: add
log.excludeDecoration config option, 2020-05-16) for exactly this
purpose.

Ensure we set this only for users that would benefit from it by
assigning it at the beginning of the prefetch task. Other alternatives
would be during 'git maintenance register' or 'git maintenance start',
but those might assign the config even when the prefetch task is
disabled by existing config. Further, users could run 'git maintenance
run --task=prefetch' using their own scripting or scheduling. This
provides the best coverage to automatically update the config when
valuable.

It is improbable, but possible, that users might want to run the
prefetch task _and_ see these refs in their log decorations. This seems
incredibly unlikely to me, but users can always opt-in on a
command-by-command basis using --decorate-refs=refs/prefetch/.

Test that this works in a few cases. In particular, ensure that our
assignment of log.excludeDecoration=refs/prefetch/ is additive to other
existing exclusions. Further, ensure we do not add multiple copies in
multiple runs.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 18:46:22 -08:00
brian m. carlson
1fb5cf0da6 commit: ignore additional signatures when parsing signed commits
When we create a commit with multiple signatures, neither of these
signatures includes the other.  Consequently, when we produce the
payload which has been signed so we can verify the commit, we must strip
off any other signatures, or the payload will differ from what was
signed.  Do so, and in preparation for verifying with multiple
algorithms, pass the algorithm we want to verify into
parse_signed_commit.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 17:38:20 -08:00
Abhishek Kumar
8d00d7c3df commit-reach: use corrected commit dates in paint_down_to_common()
091f4cf (commit: don't use generation numbers if not needed,
2018-08-30) changed paint_down_to_common() to use commit dates instead
of generation numbers v1 (topological levels) as the performance
regressed on certain topologies. With generation number v2 (corrected
commit dates) implemented, we no longer have to rely on commit dates and
can use generation numbers.

For example, the command `git merge-base v4.8 v4.9` on the Linux
repository walks 167468 commits, taking 0.135s for committer date and
167496 commits, taking 0.157s for corrected committer date respectively.

While using corrected commit dates, Git walks nearly the same number of
commits as commit date, the process is slower as for each comparision we
have to access a commit-slab (for corrected committer date) instead of
accessing struct member (for committer date).

This change incidentally broke the fragile t6404-recursive-merge test.
t6404-recursive-merge sets up a unique repository where all commits have
the same committer date without a well-defined merge-base.

While running tests with GIT_TEST_COMMIT_GRAPH unset, we use committer
date as a heuristic in paint_down_to_common(). 6404.1 'combined merge
conflicts' merges commits in the order:
- Merge C with B to form an intermediate commit.
- Merge the intermediate commit with A.

With GIT_TEST_COMMIT_GRAPH=1, we write a commit-graph and subsequently
use the corrected committer date, which changes the order in which
commits are merged:
- Merge A with B to form an intermediate commit.
- Merge the intermediate commit with C.

While resulting repositories are equivalent, 6404.4 'virtual trees were
processed' fails with GIT_TEST_COMMIT_GRAPH=1 as we are selecting
different merge-bases and thus have different object ids for the
intermediate commits.

As this has already causes problems (as noted in 859fdc0 (commit-graph:
define GIT_TEST_COMMIT_GRAPH, 2018-08-29)), we disable commit graph
within t6404-recursive-merge.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
Abhishek Kumar
1fdc383c5c commit-graph: use generation v2 only if entire chain does
Since there are released versions of Git that understand generation
numbers in the commit-graph's CDAT chunk but do not understand the GDAT
chunk, the following scenario is possible:

1. "New" Git writes a commit-graph with the GDAT chunk.
2. "Old" Git writes a split commit-graph on top without a GDAT chunk.

If each layer of split commit-graph is treated independently, as it was
the case before this commit, with Git inspecting only the current layer
for chunk_generation_data pointer, commits in the lower layer (one with
GDAT) whould have corrected commit date as their generation number,
while commits in the upper layer would have topological levels as their
generation. Corrected commit dates usually have much larger values than
topological levels. This means that if we take two commits, one from the
upper layer, and one reachable from it in the lower layer, then the
expectation that the generation of a parent is smaller than the
generation of a child would be violated.

It is difficult to expose this issue in a test. Since we _start_ with
artificially low generation numbers, any commit walk that prioritizes
generation numbers will walk all of the commits with high generation
number before walking the commits with low generation number. In all the
cases I tried, the commit-graph layers themselves "protect" any
incorrect behavior since none of the commits in the lower layer can
reach the commits in the upper layer.

This issue would manifest itself as a performance problem in this case,
especially with something like "git log --graph" since the low
generation numbers would cause the in-degree queue to walk all of the
commits in the lower layer before allowing the topo-order queue to write
anything to output (depending on the size of the upper layer).

Therefore, When writing the new layer in split commit-graph, we write a
GDAT chunk only if the topmost layer has a GDAT chunk. This guarantees
that if a layer has GDAT chunk, all lower layers must have a GDAT chunk
as well.

Rewriting layers follows similar approach: if the topmost layer below
the set of layers being rewritten (in the split commit-graph chain)
exists, and it does not contain GDAT chunk, then the result of rewrite
does not have GDAT chunks either.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
Abhishek Kumar
e8b63005c4 commit-graph: implement generation data chunk
As discovered by Ævar, we cannot increment graph version to
distinguish between generation numbers v1 and v2 [1]. Thus, one of
pre-requistes before implementing generation number v2 was to
distinguish between graph versions in a backwards compatible manner.

We are going to introduce a new chunk called Generation DATa chunk (or
GDAT). GDAT will store corrected committer date offsets whereas CDAT
will still store topological level.

Old Git does not understand GDAT chunk and would ignore it, reading
topological levels from CDAT. New Git can parse GDAT and take advantage
of newer generation numbers, falling back to topological levels when
GDAT chunk is missing (as it would happen with a commit-graph written
by old Git).

We introduce a test environment variable 'GIT_TEST_COMMIT_GRAPH_NO_GDAT'
which forces commit-graph file to be written without generation data
chunk to emulate a commit-graph file written by old Git.

To minimize the space required to store corrrected commit date, Git
stores corrected commit date offsets into the commit-graph file, instea
of corrected commit dates. This saves us 4 bytes per commit, decreasing
the GDAT chunk size by half, but it's possible for the offset to
overflow the 4-bytes allocated for storage. As such overflows are and
should be exceedingly rare, we use the following overflow management
scheme:

We introduce a new commit-graph chunk, Generation Data OVerflow ('GDOV')
to store corrected commit dates for commits with offsets greater than
GENERATION_NUMBER_V2_OFFSET_MAX.

If the offset is greater than GENERATION_NUMBER_V2_OFFSET_MAX, we set
the MSB of the offset and the other bits store the position of corrected
commit date in GDOV chunk, similar to how Extra Edge List is maintained.

We test the overflow-related code with the following repo history:

           F - N - U
          /         \
U - N - U            N
         \          /
	  N - F - N

Where the commits denoted by U have committer date of zero seconds
since Unix epoch, the commits denoted by N have committer date of
1112354055 (default committer date for the test suite) seconds since
Unix epoch and the commits denoted by F have committer date of
(2 ^ 31 - 2) seconds since Unix epoch.

The largest offset observed is 2 ^ 31, just large enough to overflow.

[1]: https://lore.kernel.org/git/87a7gdspo4.fsf@evledraar.gmail.com/

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
Abhishek Kumar
c0ef139843 t6600-test-reach: generalize *_three_modes
In a preparatory step to implement generation number v2, we add tests to
ensure Git can read and parse commit-graph files without Generation Data
chunk. These files represent commit-graph files written by Old Git and
are neccesary for backward compatability.

We extend run_three_modes() and test_three_modes() to *_all_modes() with
the fourth mode being "commit-graph without generation data chunk".

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
Abhishek Kumar
f90fca638e commit-graph: consolidate fill_commit_graph_info
Both fill_commit_graph_info() and fill_commit_in_graph() parse
information present in commit data chunk. Let's simplify the
implementation by calling fill_commit_graph_info() within
fill_commit_in_graph().

fill_commit_graph_info() used to not load committer data from commit data
chunk. However, with the upcoming switch to using corrected committer
date as generation number v2, we will have to load committer date to
compute generation number value anyway.

e51217e15 (t5000: test tar files that overflow ustar headers,
30-06-2016) introduced a test 'generate tar with future mtime' that
creates a commit with committer date of (2^36 + 1) seconds since
EPOCH. The CDAT chunk provides 34-bits for storing committer date, thus
committer time overflows into generation number (within CDAT chunk) and
has undefined behavior.

The test used to pass as fill_commit_graph_info() would not set struct
member `date` of struct commit and load committer date from the object
database, generating a tar file with the expected mtime.

However, with corrected commit date, we will load the committer date
from CDAT chunk (truncated to lower 34-bits to populate the generation
number. Thus, Git sets date and generates tar file with the truncated
mtime.

The ustar format (the header format used by most modern tar programs)
only has room for 11 (or 12, depending on some implementations) octal
digits for the size and mtime of each file.

As the CDAT chunk is overflow by 12-octal digits but not 11-octal
digits, we split the existing tests to test both implementations
separately and add a new explicit test for 11-digit implementation.

To test the 11-octal digit implementation, we create a future commit
with committer date of 2^34 - 1, which overflows 11-octal digits without
overflowing 34-bits of the Commit Date chunks.

To test the 12-octal digit implementation, the smallest committer date
possible is 2^36 + 1, which overflows the CDAT chunk and thus
commit-graph must be disabled for the test.

Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-18 16:21:18 -08:00
Derrick Stolee
845d15d4d0 index-format: use 'cache tree' over 'cached tree'
The index has a "cache tree" extension. This corresponds to a
significant API implemented in cache-tree.[ch]. However, there are a few
places that refer to this erroneously as "cached tree". These are rare,
but notably the index-format.txt file itself makes this error.

The only other reference is in t7104-reset-hard.sh.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 23:04:38 -08:00
Junio C Hamano
d9e1cd555d Merge branch 'ad/t4129-setfacl-target-fix'
Test fix.

* ad/t4129-setfacl-target-fix:
  t4129: fix setfacl-related permissions failure
2021-01-15 21:48:46 -08:00
Junio C Hamano
2b8cef2307 Merge branch 'jk/t5516-deflake'
Test fix.

* jk/t5516-deflake:
  t5516: loosen "not our ref" error check
2021-01-15 21:48:46 -08:00
Junio C Hamano
073552d7ae Merge branch 'pb/mergetool-tool-help-fix'
Fix 2.29 regression where "git mergetool --tool-help" fails to list
all the available tools.

* pb/mergetool-tool-help-fix:
  mergetool--lib: fix '--tool-help' to correctly show available tools
2021-01-15 21:48:46 -08:00
Junio C Hamano
aa08688362 Merge branch 'ds/for-each-repo-noopfix'
"git for-each-repo --config=<var> <cmd>" should not run <cmd> for
any repository when the configuration variable <var> is not defined
even once.

* ds/for-each-repo-noopfix:
  for-each-repo: do nothing on empty config
2021-01-15 21:48:46 -08:00
Junio C Hamano
8dbabb31df Merge branch 'mt/t4129-with-setgid-dir'
Some tests expect that "ls -l" output has either '-' or 'x' for
group executable bit, but setgid bit can be inherited from parent
directory and make these fields 'S' or 's' instead, causing test
failures.

* mt/t4129-with-setgid-dir:
  t4129: don't fail if setgid is set in the test directory
2021-01-15 21:48:45 -08:00
Junio C Hamano
b2ace18759 Merge branch 'ds/maintenance-part-4'
Follow-up on the "maintenance part-3" which introduced scheduled
maintenance tasks to support platforms whose native scheduling
methods are not 'cron'.

* ds/maintenance-part-4:
  maintenance: use Windows scheduled tasks
  maintenance: use launchctl on macOS
  maintenance: include 'cron' details in docs
  maintenance: extract platform-specific scheduling
2021-01-15 21:48:45 -08:00
Junio C Hamano
f9fb9063fd Merge branch 'fc/completion-aliases-support'
Bash completion (in contrib/) update to make it easier for
end-users to add completion for their custom "git" subcommands.

* fc/completion-aliases-support:
  completion: add proper public __git_complete
  test: completion: add tests for __git_complete
  completion: bash: improve function detection
  completion: bash: add __git_have_func helper
2021-01-15 15:20:30 -08:00
Junio C Hamano
62fb47a4d3 Merge branch 'en/stash-apply-sparse-checkout'
"git stash" did not work well in a sparsely checked out working
tree.

* en/stash-apply-sparse-checkout:
  stash: fix stash application in sparse-checkouts
  stash: remove unnecessary process forking
  t7012: add a testcase demonstrating stash apply bugs in sparse checkouts
2021-01-15 15:20:29 -08:00
Junio C Hamano
1ee70a916d Merge branch 'ar/t6016-modernise'
Test update.

* ar/t6016-modernise:
  t6016: move to lib-log-graph.sh framework
2021-01-15 15:20:29 -08:00
Junio C Hamano
7bfa022993 Merge branch 'nk/perf-fsmonitor-cleanup'
Test fix.

* nk/perf-fsmonitor-cleanup:
  p7519: allow running without watchman prereq
2021-01-15 15:20:29 -08:00
Junio C Hamano
8b327f1784 Merge branch 'ma/sha1-is-a-hash'
Retire more names with "sha1" in it.

* ma/sha1-is-a-hash:
  hash-lookup: rename from sha1-lookup
  sha1-lookup: rename `sha1_pos()` as `hash_pos()`
  object-file.c: rename from sha1-file.c
  object-name.c: rename from sha1-name.c
2021-01-15 15:20:29 -08:00
Junio C Hamano
a11571bb7f Merge branch 'ma/t1300-cleanup'
Code clean-up.

* ma/t1300-cleanup:
  t1300: don't needlessly work with `core.foo` configs
  t1300: remove duplicate test for `--file no-such-file`
  t1300: remove duplicate test for `--file ../foo`
2021-01-15 15:20:28 -08:00
Junio C Hamano
9ba366f12b Merge branch 'bc/rev-parse-path-format'
"git rev-parse" can be explicitly told to give output as absolute
or relative path with the `--path-format=(absolute|relative)` option.

* bc/rev-parse-path-format:
  rev-parse: add option for absolute or relative path formatting
  abspath: add a function to resolve paths with missing components
2021-01-15 15:20:28 -08:00
Junio C Hamano
6dbbae17d9 Merge branch 'ew/decline-core-abbrev'
The configuration variable 'core.abbrev' can be set to 'no' to
force no abbreviation regardless of the hash algorithm.

* ew/decline-core-abbrev:
  core.abbrev=no disables abbreviations
2021-01-15 15:20:28 -08:00
Patrick Steinhardt
d8d77153ea config: allow specifying config entries via envvar pairs
While we currently have the `GIT_CONFIG_PARAMETERS` environment variable
which can be used to pass runtime configuration data to git processes,
it's an internal implementation detail and not supposed to be used by
end users.

Next to being for internal use only, this way of passing config entries
has a major downside: the config keys need to be parsed as they contain
both key and value in a single variable. As such, it is left to the user
to escape any potentially harmful characters in the value, which is
quite hard to do if values are controlled by a third party.

This commit thus adds a new way of adding config entries via the
environment which gets rid of this shortcoming. If the user passes the
`GIT_CONFIG_COUNT=$n` environment variable, Git will parse environment
variable pairs `GIT_CONFIG_KEY_$i` and `GIT_CONFIG_VALUE_$i` for each
`i` in `[0,n)`.

While the same can be achieved with `git -c <name>=<value>`, one may
wish to not do so for potentially sensitive information. E.g. if one
wants to set `http.extraHeader` to contain an authentication token,
doing so via `-c` would trivially leak those credentials via e.g. ps(1),
which typically also shows command arguments.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 13:03:45 -08:00
Patrick Steinhardt
1ff21c05ba config: store "git -c" variables using more robust format
The previous commit added a new format for $GIT_CONFIG_PARAMETERS which
is able to robustly handle subsections with "=" in them. Let's start
writing the new format. Unfortunately, this does much less than you'd
hope, because "git -c" itself has the same ambiguity problem! But it's
still worth doing:

  - we've now pushed the problem from the inter-process communication
    into the "-c" command-line parser. This would free us up to later
    add an unambiguous format there (e.g., separate arguments like "git
    --config key value", etc).

  - for --config-env, the parser already disallows "=" in the
    environment variable name. So:

      git --config-env section.with=equals.key=ENVVAR

    will robustly set section.with=equals.key to the contents of
    $ENVVAR.

The new test shows the improvement for --config-env.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 13:03:18 -08:00
Jeff King
f9dbb64fad config: parse more robust format in GIT_CONFIG_PARAMETERS
When we stuff config options into GIT_CONFIG_PARAMETERS, we shell-quote
each one as a single unit, like:

  'section.one=value1' 'section.two=value2'

On the reading side, we de-quote to get the individual strings, and then
parse them by splitting on the first "=" we find. This format is
ambiguous, because an "=" may appear in a subsection. So the config
represented in a file by both:

  [section "subsection=with=equals"]
  key = value

and:

  [section]
  subsection = with=equals.key=value

ends up in this flattened format like:

  'section.subsection=with=equals.key=value'

and we can't tell which was desired. We have traditionally resolved this
by taking the first "=" we see starting from the left, meaning that we
allowed arbitrary content in the value, but not in the subsection.

Let's make our environment format a bit more robust by separately
quoting the key and value. That turns those examples into:

  'section.subsection=with=equals.key'='value'

and:

  'section.subsection'='with=equals.key=value'

respectively, and we can tell the difference between them. We can detect
which format is in use for any given element of the list based on the
presence of the unquoted "=". That means we can continue to allow the
old format to work to support any callers which manually used the old
format, and we can even intermingle the two formats. The old format
wasn't documented, and nobody was supposed to be using it. But it's
likely that such callers exist in the wild, so it's nice if we can avoid
breaking them. Likewise, it may be possible to trigger an older version
of "git -c" that runs a script that calls into a newer version of "git
-c"; that new version would see the intermingled format.

This does create one complication, which is that the obvious format in
the new scheme for

  [section]
  some-bool

is:

  'section.some-bool'

with no equals. We'd mistake that for an old-style variable. And it even
has the same meaning in the old style, but:

  [section "with=equals"]
  some-bool

does not. It would be:

  'section.with=equals=some-bool'

which we'd take to mean:

  [section]
  with = equals=some-bool

in the old, ambiguous style. Likewise, we can't use:

  'section.some-bool'=''

because that's ambiguous with an actual empty string. Instead, we'll
again use the shell-quoting to give us a hint, and use:

  'section.some-bool'=

to show that we have no value.

Note that this commit just expands the reading side. We'll start writing
the new format via "git -c" in a future patch. In the meantime, the
existing "git -c" tests will make sure we didn't break reading the old
format. But we'll also add some explicit coverage of the two formats to
make sure we continue to handle the old one after we move the writing
side over.

And one final note: since we're now using the shell-quoting as a
semantically meaningful hint, this closes the door to us ever allowing
arbitrary shell quoting, like:

  'a'shell'would'be'ok'with'this'.key=value

But we have never supported that (only what sq_quote() would produce),
and we are probably better off keeping things simple, robust, and
backwards-compatible, than trying to make it easier for humans. We'll
continue not to advertise the format of the variable to users, and
instead keep "git -c" as the recommended mechanism for setting config
(even if we are trying to be kind not to break users who may be relying
on the current undocumented format).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15 13:03:18 -08:00
Junio C Hamano
2d02bc91c0 t4203: make blame output massaging more robust
In the "git blame --porcelain" output, lines that ends with three
integers may not be the line that shows a commit object with line
numbers and block length (the contents from the blamed file or the
summary field can have a line that happens to match).  Also, the
names of the author may have more than three SP separated tokens
("git blame -L242,+1 cf6de18aab Documentation/SubmittingPatches"
gives an example).  The existing "grep -E | cut" pipeline is a bit
too loose on these two points.

While they can be assumed on the test data, it is not so hard to
use the right pattern from the documented format, so let's do so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-14 21:54:52 -08:00
Denton Liu
afa80f534b t4203: stop losing return codes of git commands
In a pipe, only the return code of the last command is used. Thus, all
other commands will have their return codes masked. Rewrite pipes so
that there are no git commands upstream so that their failure is
reported.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-14 18:21:21 -08:00
Denton Liu
f9f30a0310 test-lib-functions.sh: fix usage for test_commit()
The usage comment for test_commit() shows that the --author option
should be given as `--author=<author>`. However, this is incorrect as it
only works when given as `--author <author>`. Correct this erroneous
text.

Also, for the sake of correctness, fix the description as well since we
invoke `git commit` with `--author <author>`, not `--author=<author>`.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-14 18:21:03 -08:00
Ævar Arnfjörð Bjarmason
238803cb40 mailmap doc + tests: document and test for case-insensitivity
Add documentation and more tests for case-insensitivity. The existing
test only matched on the E-Mail part, but as shown here we also match
the name with strcasecmp().

This behavior was last discussed on the mailing list in the thread
starting at [1]. It seems we're keeping it like this, so let's
document it.

1. https://lore.kernel.org/git/87czykvg19.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason
34986b773a mailmap tests: add tests for empty "<>" syntax
Add tests for mailmap's handling of "<>", which is allowed on the RHS,
but not the LHS of a "<LHS> <RHS>" pair.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason
9e2a14a889 mailmap tests: add tests for whitespace syntax
Add tests for mailmap's handling of whitespace, i.e. how it trims
space within "<>" and around author names.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason
9b391b09a0 mailmap tests: add a test for comment syntax
Add a test for mailmap comment syntax. As noted in [1] there was no
test coverage for this. Let's make sure a future change doesn't break
it.

1. https://lore.kernel.org/git/CAN0heSoKYWXqskCR=GPreSHc6twCSo1345WTmiPdrR57XSShhA@mail.gmail.com/

Reported-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason
05b5ff219c mailmap doc + tests: add better examples & test them
Change the mailmap documentation added in 0925ce4d49 (Add map_user()
and clear_mailmap() to mailmap, 2009-02-08) to continue discussing the
Jane/Joe example. I think this makes things a lot less confusing as
we're building up more complex examples using one set of data which
covers all the things we'd like to discuss.

Also add tests to assert that what our documentation says is what's
actually happening. This is mostly (or entirely) covered by existing
tests which I'm not deleting, but having these tests for the synopsis
makes it easier to follow-along while reading the tests & docs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
Ævar Arnfjörð Bjarmason
f5d79bf7dd tests: refactor a few tests to use "test_commit --append"
Refactor a few more tests to use the new "--append" option to
"test_commit". I added it for use in the mailmap tests, but this
demonstrates how useful it is in general.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason
3373518cc8 test-lib functions: add an --append option to test_commit
Add an --append option to test_commit to append <contents> to the
<file> we're writing to. This simplifies a lot of test setup, as shown
in some of the tests being changed here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason
999cfc4f45 test-lib functions: add --author support to test_commit
Add support for --author to "test_commit". This will simplify some
current and future tests, one of those is being changed here.

Let's also line-wrap the "git commit" command invocation to make diffs
that add subsequent options easier to add, as they'll only need to add
a new option line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason
76b8b8d05c test-lib functions: document arguments to test_commit
The --notick argument was added in [1] and was followed by --signoff
in [2], but neither of these commits added any documentation for these
options. When -C was added in [3] a comment was added to document it,
but not the other options. Let's document all of these options.

1. 44b85e89d7 (t7003: add test to filter a branch with a commit at
   epoch, 2012-07-12),
2. 5ed75e2a3f (cherry-pick: don't forget -s on failure, 2012-09-14).
3. 6f94351b0a (test-lib-functions.sh: teach test_commit -C <dir>,
   2016-12-08)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason
f21426e189 test-lib functions: expand "test_commit" comment template
Expand the comment template for "test_commit" to match that of
"test_commit_bulk" added in b1c36cb849 (test-lib: introduce
test_commit_bulk, 2019-07-02). It has several undocumented options,
which won't all fit on one line. Follow-up commit(s) will document
them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason
56ac194e1d mailmap: test for silent exiting on missing file/blob
That we silently ignore missing mailmap.file or mailmap.blob values is
intentional. See 938a60d64f (mailmap: clean up read_mailmap error
handling, 2012-12-12). However, nothing tested for this. Let's do that
by checking that stderr is empty in those cases.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason
c1fe7fd7e3 mailmap tests: get rid of overly complex blame fuzzing
Change a test that used a custom fuzzing function since
bfdfa3d414 (t4203 (mailmap): stop hardcoding commit ids and dates,
2010-10-15) to just use the "blame --porcelain" output instead.

We could use the same pattern as 0ba9c9a0fb (t8008: rely on
rev-parse'd HEAD instead of sha1 value, 2017-07-26) does to do this,
but there wouldn't be any point. We're not trying to test "blame"
output here in general, just that "blame" pays attention to the
mailmap.

So it's sufficient to get the blamed line(s) and authors from the
output, which is much easier with the "--porcelain" option.

It would still be possible for there to be a bug in "blame" such that
it uses the mailmap for its "--porcelain" output, but not the regular
output. Let's test for that simply by checking if specifying the
mailmap changes the output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:41 -08:00
Ævar Arnfjörð Bjarmason
400d160e39 mailmap tests: add a test for "not a blob" error
Add a test for one of the error conditions added in
938a60d64f (mailmap: clean up read_mailmap error handling,
2012-12-12).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason
fb3bbe4ea3 mailmap tests: remove redundant entry in test
Remove a redundant line in a test added in d20d654fe8 (Change current
mailmap usage to do matching on both name and email of
author/committer., 2009-02-08).

This didn't conceivably test anything useful and is most likely a
copy/paste error.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason
1db421ab85 mailmap tests: improve --stdin tests
The --stdin tests setup the "contact" file in the main setup, let's
instead set it up in the test that uses it.

Also refactor the first test so it's obvious that the point of it is
that "check-mailmap" will spew its input as-is when given no
argument. For that one we can just use the "expect" file as-is.

Also add tests for how other "--stdin" cases are handled, e.g. one
where we actually do a mapping.

For the rest of --stdin testing we just assume we're going to get the
same output. We could follow-up and make sure everything's
round-tripped through both --stdin and the file/blob backends, but I
don't think there's much point in that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason
e9931ace4f mailmap tests: modernize syntax & test idioms
Refactor the mailmap tests to:

 * Setup "actual" test files in the body of "test_expect_success"

 * Don't have X of "test_expect_success X Y" be an unquoted string.

 * Not to carry over test config between tests, and instead use
   "test_config".

 * Replace various "echo" a line-at-a-time patterns with here-docs.

 * Change a case of "log.mailmap=False" to use the lower-case
   "false". Both work, but this ends up in git-config's boolean
   parsing and these atypical values are tested for elsewhere. Let's
   use the lower-case to not draw the reader's attention to this
   abnormality.

 * Remove commentary asserting that things work a given way in favor
   of simply testing for it, i.e. in the case of a .mailmap file
   outside of the repository.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
Ævar Arnfjörð Bjarmason
9aaeac9cf7 mailmap tests: use our preferred whitespace syntax
Change these tests to use the preferred whitespace around ">",
"<<-EOF" etc. This is an initial step in larger and more meaningful
refactoring of the file, which makes a subsequent commit easier to
read.

I'm not changing the whitespace of "echo <str> > file" patterns to
"echo <str> >file" because all of those will be changed to here-docs
in a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:40 -08:00
Patrick Steinhardt
c7b190dabd fetch: implement support for atomic reference updates
When executing a fetch, then git will currently allocate one reference
transaction per reference update and directly commit it. This means that
fetches are non-atomic: even if some of the reference updates fail,
others may still succeed and modify local references.

This is fine in many scenarios, but this strategy has its downsides.

- The view of remote references may be inconsistent and may show a
  bastardized state of the remote repository.

- Batching together updates may improve performance in certain
  scenarios. While the impact probably isn't as pronounced with loose
  references, the upcoming reftable backend may benefit as it needs to
  write less files in case the update is batched.

- The reference-update hook is currently being executed twice per
  updated reference. While this doesn't matter when there is no such
  hook, we have seen severe performance regressions when doing a
  git-fetch(1) with reference-transaction hook when the remote
  repository has hundreds of thousands of references.

Similar to `git push --atomic`, this commit thus introduces atomic
fetches. Instead of allocating one reference transaction per updated
reference, it causes us to only allocate a single transaction and commit
it as soon as all updates were received. If locking of any reference
fails, then we abort the complete transaction and don't update any
reference, which gives us an all-or-nothing fetch.

Note that this may not completely fix the first of above downsides, as
the consistent view also depends on the server-side. If the server
doesn't have a consistent view of its own references during the
reference negotiation phase, then the client would get the same
inconsistent view the server has. This is a separate problem though and,
if it actually exists, can be fixed at a later point.

This commit also changes the way we write FETCH_HEAD in case `--atomic`
is passed. Instead of writing changes as we go, we need to accumulate
all changes first and only commit them at the end when we know that all
reference updates succeeded. Ideally, we'd just do so via a temporary
file so that we don't need to carry all updates in-memory. This isn't
trivially doable though considering the `--append` mode, where we do not
truncate the file but simply append to it. And given that we support
concurrent processes appending to FETCH_HEAD at the same time without
any loss of data, seeding the temporary file with current contents of
FETCH_HEAD initially and then doing a rename wouldn't work either. So
this commit implements the simple strategy of buffering all changes and
appending them to the file on commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:15 -08:00
Patrick Steinhardt
ce81b1da23 config: add new way to pass config via --config-env
While it's already possible to pass runtime configuration via `git -c
<key>=<value>`, it may be undesirable to use when the value contains
sensitive information. E.g. if one wants to set `http.extraHeader` to
contain an authentication token, doing so via `-c` would trivially leak
those credentials via e.g. ps(1), which typically also shows command
arguments.

To enable this usecase without leaking credentials, this commit
introduces a new switch `--config-env=<key>=<envvar>`. Instead of
directly passing a value for the given key, it instead allows the user
to specify the name of an environment variable. The value of that
variable will then be used as value of the key.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:03:18 -08:00
Jeff King
c9e3a4e76d patch-ids: handle duplicate hashmap entries
This fixes a bug introduced in dfb7a1b4d0 (patch-ids: stop using a
hand-rolled hashmap implementation, 2016-07-29) in which

  git rev-list --cherry-pick A...B

will fail to suppress commits reachable from A even if a commit with
matching patch-id appears in B.

Around the time of that commit, the algorithm for "--cherry-pick" looked
something like this:

  0. Traverse all of the commits, marking them as being on the left or
     right side of the symmetric difference.

  1. Iterate over the left-hand commits, inserting a patch-id struct for
     each into a hashmap, and pointing commit->util to the patch-id
     struct.

  2. Iterate over the right-hand commits, checking which are present in
     the hashmap. If so, we exclude the commit from the output _and_ we
     mark the patch-id as "seen".

  3. Iterate again over the left-hand commits, checking whether
     commit->util->seen is set; if so, exclude them from the output.

At the end, we'll have eliminated commits from both sides that have a
matching patch-id on the other side. But there's a subtle assumption
here: for any given patch-id, we must have exactly one struct
representing it. If two commits from A both have the same patch-id and
we allow duplicates in the hashmap, then we run into a problem:

  a. In step 1, we insert two patch-id structs into the hashmap.

  b. In step 2, our lookups will find only one of these structs, so only
     one "seen" flag is marked.

  c. In step 3, one of the commits in A will have its commit->util->seen
     set, but the other will not. We'll erroneously output the latter.

Prior to dfb7a1b4d0, our hashmap did not allow duplicates. Afterwards,
it used hashmap_add(), which explicitly does allow duplicates.

At that point, the solution would have been easy: when we are about to
add a duplicate, skip doing so and return the existing entry which
matches. But it gets more complicated.

In 683f17ec44 (patch-ids: replace the seen indicator with a commit
pointer, 2016-07-29), our step 3 goes away entirely. Instead, in step 2,
when the right-hand side finds a matching patch_id from the left-hand
side, we can directly mark the left-hand patch_id->commit to be omitted.
Solving that would be easy, too; there's a one-to-many relationship of
patch-ids to commits, so we just need to keep a list.

But there's more. Commit b3dfeebb92 (rebase: avoid computing unnecessary
patch IDs, 2016-07-29) built on that by lazily computing the full
patch-ids. So we don't even know when adding to the hashmap whether two
commits truly have the same id. We'd have to tentatively assign them a
list, and then possibly split them apart (possibly into N new structs)
at the moment we compute the real patch-ids. This could work, but it's
complicated and error-prone.

Instead, let's accept that we may store duplicates, and teach the lookup
side to be more clever. Rather than asking for a single matching
patch-id, it will need to iterate over all matching patch-ids. This does
mean examining every entry in a single hash bucket, but the worst-case
for a hash lookup was already doing that.

We'll keep the hashmap details out of the caller by providing a simple
iteration interface. We can retain the simple has_commit_patch_id()
interface for the other callers, but we'll simplify its return value
into an integer, rather than returning the patch_id struct. That way
they won't be tempted to look at the "commit" field of the return value
without iterating.

Reported-by: Arnaud Morin <arnaud.morin@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 11:13:32 -08:00
Jiang Xin
5bb0fd2cab bundle: arguments can be read from stdin
In order to create an incremental bundle, we need to pass many arguments
to let git-bundle ignore some already packed commits.  It will be more
convenient to pass args via stdin.  But the current implementation does
not allow us to do this.

This is because args are parsed twice when creating bundle.  The first
time for parsing args is in `compute_and_write_prerequisites()` by
running `git-rev-list` command to write prerequisites in bundle file,
and stdin is consumed in this step if "--stdin" option is provided for
`git-bundle`.  Later nothing can be read from stdin when running
`setup_revisions()` in `create_bundle()`.

The solution is to parse args once by removing the entire function
`compute_and_write_prerequisites()` and then calling function
`setup_revisions()`.  In order to write prerequisites for bundle, will
call `prepare_revision_walk()` and `traverse_commit_list()`.  But after
calling `prepare_revision_walk()`, the object array `revs.pending` is
left empty, and the following steps could not work properly with the
empty object array (`revs.pending`).  Therefore, make a copy of `revs`
to `revs_copy` for later use right after calling `setup_revisions()`.

The copy of `revs_copy` is not a deep copy, it shares the same objects
with `revs`. The object array of `revs` has been cleared, but objects
themselves are still kept.  Flags of objects may change after calling
`prepare_revision_walk()`, we can use these changed flags without
calling the `git rev-list` command and parsing its output like the
former implementation.

Also add testcases for git bundle in t6020, which read args from stdin.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-11 21:50:41 -08:00
Jiang Xin
ce1d6d9f16 bundle: lost objects when removing duplicate pendings
`git rev-list` will list one commit for the following command:

    $ git rev-list 'main^!'
    <tip-commit-of-main-branch>

But providing the same rev-list args to `git bundle`, fail to create
a bundle file.

    $ git bundle create - 'main^!'
    # v2 git bundle
    -<OID> <one-line-message>

    fatal: Refusing to create empty bundle.

This is because when removing duplicate objects in function
`object_array_remove_duplicates()`, one unique pending object which has
the same name is deleted by mistake.  The revision arg 'main^!' in the
above example is parsed by `handle_revision_arg()`, and at lease two
different objects will be appended to `revs.pending`, one points to the
parent commit of the "main" branch, and the other points to the tip
commit of the "main" branch.  These two objects have the same name
"main".  Only one object is left with the name "main" after calling the
function `object_array_remove_duplicates()`.

And what's worse, when adding boundary commits into pending list, we use
one-line commit message as names, and the arbitory names may surprise
git-bundle.

Only comparing objects themselves (".item") is also not good enough,
because user may want to create a bundle with two identical objects but
with different reference names, such as: "HEAD" and "refs/heads/main".

Add new function `contains_object()` which compare both the address and
the name of the object.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-11 21:50:41 -08:00
Jiang Xin
9901164d81 test: add helper functions for git-bundle
Move git-bundle related functions from t5510 to a library, and this lib
will be shared with a new testcase t6020 which finds a known breakage of
"git-bundle".

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-11 21:50:41 -08:00
Denton Liu
6436a20284 refs: allow @{n} to work with n-sized reflog
This sequence works

	$ git checkout -b newbranch
	$ git commit --allow-empty -m one
	$ git show -s newbranch@{1}

and shows the state that was immediately after the newbranch was
created.

But then if you do

	$ git reflog expire --expire=now refs/heads/newbranch
	$ git commit --allow=empty -m two
	$ git show -s newbranch@{1}

you'd be scolded with

	fatal: log for 'newbranch' only has 1 entries

While it is true that it has only 1 entry, we have enough
information in that single entry that records the transition between
the state in which the tip of the branch was pointing at commit
'one' to the new commit 'two' built on it, so we should be able to
answer "what object newbranch was pointing at?". But we refuse to
do so.

Make @{0} the special case where we use the new side to look up that
entry. Otherwise, look up @{n} using the old side of the (n-1)th entry
of the reflog.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-11 14:13:50 -08:00
Jeff King
acaabcf391 t5516: loosen "not our ref" error check
Commit 014ade7484 (upload-pack: send ERR packet for non-tip objects,
2019-04-13) added a test that greps the output of a failed fetch to make
sure that upload-pack sent us the ERR packet we expected. But checking
this is racy; despite the argument in that commit, the client may still
be sending a "done" line after the server exits, causing it to die() on
a failed write() and never see the ERR packet at all.

This fails quite rarely on Linux, but more often on macOS. However, it
can be triggered reliably with:

	diff --git a/fetch-pack.c b/fetch-pack.c
	index 876f90c759..cf40de9092 100644
	--- a/fetch-pack.c
	+++ b/fetch-pack.c
	@@ -489,6 +489,7 @@ static int find_common(struct fetch_negotiator *negotiator,
	 done:
	 	trace2_region_leave("fetch-pack", "negotiation_v0_v1", the_repository);
	 	if (!got_ready || !no_done) {
	+		sleep(1);
	 		packet_buf_write(&req_buf, "done\n");
	 		send_request(args, fd[1], &req_buf);
	 	}

This is a real user-visible race that it would be nice to fix, but it's
tricky to do so: the client would have to speculatively try to read an
ERR packet after hitting a write() error. And at least for this error,
it's specific to v0 (since v2 does not enforce reachability at all).

So let's loosen the test to avoid annoying racy failures. If we
eventually do the read-after-failed-write thing, we can tighten it. And
if not, v0 will grow increasingly obsolete as servers support v2, so the
utility of this test will decrease over time anyway.

Note that we can still check stderr to make sure upload-pack bailed for
the reason we expected. It writes a similar message to stderr, and
because the server side is just another process connected by pipes,
we'll reliably see it. This would not be the case for git://, or for
ssh servers that do not relay stderr (e.g., GitHub's custom endpoint
does not).

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-09 21:05:12 -08:00
Adam Dinwoodie
a1e03535db t4129: fix setfacl-related permissions failure
When running this test in Cygwin, it's necessary to remove the inherited
access control lists from the Git working directory in order for later
permissions tests to work as expected.

As such, fix an error in the test script so that the ACLs are set for
the working directory, not a nonexistent subdirectory.

Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-09 14:45:26 -08:00
SZEDER Gábor
e3f5da7e60 t7800-difftool: don't accidentally match tmp dirs
In a bunch of test cases in 't7800-difftool.sh' we 'grep' for specific
filenames in 'git difftool's output, and those test cases are prone to
occasional failures because those filenames might be part of the name
of difftool's temporary directory as well, e.g.:

  +git difftool --dir-diff --no-symlinks --extcmd ls v1
  +grep sub output
  +test_line_count = 2 sub-output
  test_line_count: line count for sub-output != 2
  /tmp/git-difftool.Ssubfq/left/:
  sub
  /tmp/git-difftool.Ssubfq/right/:
  sub
  error: last command exited with $?=1
  not ok 50 - difftool --dir-diff v1 from subdirectory --no-symlinks

Fix this by tightening the 'grep' patterns looking for those
interesting filenames to match only lines where a filename stands on
its own.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-09 13:40:32 -08:00
Derrick Stolee
6c62f01552 for-each-repo: do nothing on empty config
'git for-each-repo --config=X' should return success without calling any
subcommands when the config key 'X' has no value. The current
implementation instead segfaults.

A user could run into this issue if they used 'git maintenance start' to
initialize their cron schedule using 'git for-each-repo
--config=maintenance.repo ...' but then using 'git maintenance
unregister' to remove the config option. (Note: 'git maintenance stop'
would remove the config _and_ remove the cron schedule.)

Add a simple test to ensure this works. Use 'git help --no-such-option'
as the potential subcommand to ensure that we will hit a failure if the
subcommand is ever run.

Reported-by: Andreas Bühmann <dev@uuml.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 19:12:02 -08:00
Ævar Arnfjörð Bjarmason
4045f659bd branch: show "HEAD detached" first under reverse sort
Change the output of the likes of "git branch -l --sort=-objectsize"
to show the "(HEAD detached at <hash>)" message at the start of the
output. Before the compare_detached_head() function added in a
preceding commit we'd emit this output as an emergent effect.

It doesn't make any sense to consider the objectsize, type or other
non-attribute of the "(HEAD detached at <hash>)" message for the
purposes of sorting. Let's always emit it at the top instead. The only
reason it was sorted in the first place is because we're injecting it
into the ref-filter machinery so builtin/branch.c doesn't need to do
its own "am I detached?" detection.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:13:21 -08:00
Ævar Arnfjörð Bjarmason
2708ce62d2 branch: sort detached HEAD based on a flag
Change the ref-filter sorting of detached HEAD to check the
FILTER_REFS_DETACHED_HEAD flag, instead of relying on the ref
description filled-in by get_head_description() to start with "(",
which in turn we expect to ASCII-sort before any other reference.

For context, we'd like the detached line to appear first at the start
of "git branch -l", e.g.:

    $ git branch -l
    * (HEAD detached at <hash>)
      master

This doesn't change that, but improves on a fix made in
28438e84e0 (ref-filter: sort detached HEAD lines firstly, 2019-06-18)
and gives the Chinese translation the ability to use its preferred
punctuation marks again.

In Chinese the fullwidth versions of punctuation like "()" are
typically written as (U+FF08 fullwidth left parenthesis), (U+FF09
fullwidth right parenthesis) instead[1]. This form is used in both
po/zh_{CN,TW}.po in most cases where "()" is translated in a string.

Aside from that improvement to the Chinese translation, it also just
makes for cleaner code that we mark any special cases in the ref_array
we're sorting with flags and make the sort function aware of them,
instead of piggy-backing on the general-case of strcmp() doing the
right thing.

As seen in the amended tests this made reverse sorting a bit more
consistent. Before this we'd sometimes sort this message in the
middle, now it's consistently at the beginning or end, depending on
whether we're doing a normal or reverse sort. Having it at the end
doesn't make much sense either, but at least it behaves consistently
now. A follow-up commit will make this behavior under reverse sorting
even better.

I'm removing the "TRANSLATORS" comments that were in the old code
while I'm at it. Those were added in d4919bb288 (ref-filter: move
get_head_description() from branch.c, 2017-01-10). I think it's
obvious from context, string and translation memory in typical
translation tools that these are the same or similar string.

1. https://en.wikipedia.org/wiki/Chinese_punctuation#Marks_similar_to_European_punctuation

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:13:21 -08:00
Jeff King
6aed56736b fsck: reject .gitmodules git:// urls with newlines
The previous commit taught the clone/fetch client side to reject a
git:// URL with a newline in it. Let's also catch these when fscking a
.gitmodules file, which will give an earlier warning.

Note that it would be simpler to just complain about newline in _any_
URL, but an earlier tightening for http/ftp made sure we kept allowing
newlines for unknown protocols (and this is covered in the tests). So
we'll stick to that precedent.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 14:25:44 -08:00
Jeff King
a02ea57717 git_connect_git(): forbid newlines in host and path
When we connect to a git:// server, we send an initial request that
looks something like:

  002dgit-upload-pack repo.git\0host=example.com

If the repo path contains a newline, then it's included literally, and
we get:

  002egit-upload-pack repo
  .git\0host=example.com

This works fine if you really do have a newline in your repository name;
the server side uses the pktline framing to parse the string, not
newlines. However, there are many _other_ protocols in the wild that do
parse on newlines, such as HTTP. So a carefully constructed git:// URL
can actually turn into a valid HTTP request. For example:

  git://localhost:1234/%0d%0a%0d%0aGET%20/%20HTTP/1.1 %0d%0aHost:localhost%0d%0a%0d%0a

becomes:

  0050git-upload-pack /
  GET / HTTP/1.1
  Host:localhost

  host=localhost:1234

on the wire. Again, this isn't a problem for a real Git server, but it
does mean that feeding a malicious URL to Git (e.g., through a
submodule) can cause it to make unexpected cross-protocol requests.
Since repository names with newlines are presumably quite rare (and
indeed, we already disallow them in git-over-http), let's just disallow
them over this protocol.

Hostnames could likewise inject a newline, but this is unlikely a
problem in practice; we'd try resolving the hostname with a newline in
it, which wouldn't work. Still, it doesn't hurt to err on the side of
caution there, since we would not expect them to work in the first
place.

The ssh and local code paths are unaffected by this patch. In both cases
we're trying to run upload-pack via a shell, and will quote the newline
so that it makes it intact. An attacker can point an ssh url at an
arbitrary port, of course, but unless there's an actual ssh server
there, we'd never get as far as sending our shell command anyway.  We
_could_ similarly restrict newlines in those protocols out of caution,
but there seems little benefit to doing so.

The new test here is run alongside the git-daemon tests, which cover the
same protocol, but it shouldn't actually contact the daemon at all.  In
theory we could make the test more robust by setting up an actual
repository with a newline in it (so that our clone would succeed if our
new check didn't kick in). But a repo directory with newline in it is
likely not portable across all filesystems. Likewise, we could check
git-daemon's log that it was not contacted at all, but we do not
currently record the log (and anyway, it would make the test racy with
the daemon's log write). We'll just check the client-side stderr to make
sure we hit the expected code path.

Reported-by: Harold Kim <h.kim@flatt.tech>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 14:25:44 -08:00
Junio C Hamano
d3aff11c3e Merge branch 'es/perf-export-fix'
Tweak unneeded recursion from a test framework helper function.

* es/perf-export-fix:
  t/perf: avoid unnecessary test_export() recursion
2021-01-06 23:33:44 -08:00
Junio C Hamano
cf4b0714f7 Merge branch 'fc/t6030-bisect-reset-removes-auxiliary-files'
A 3-year old test that was not testing anything useful has been
corrected.

* fc/t6030-bisect-reset-removes-auxiliary-files:
  test: bisect-porcelain: fix location of files
2021-01-06 23:33:44 -08:00
Junio C Hamano
8664fcb83b Merge branch 'es/worktree-repair-both-moved'
"git worktree repair" learned to deal with the case where both the
repository and the worktree moved.

* es/worktree-repair-both-moved:
  worktree: teach `repair` to fix multi-directional breakage
2021-01-06 23:33:44 -08:00
Junio C Hamano
d3fa84d528 Merge branch 'fc/pull-merge-rebase'
When a user does not tell "git pull" to use rebase or merge, the
command gives a loud message telling a user to choose between
rebase or merge but creates a merge anyway, forcing users who would
want to rebase to redo the operation.  Fix an early part of this
problem by tightening the condition to give the message---there is
no reason to stop or force the user to choose between rebase or
merge if the history fast-forwards.

* fc/pull-merge-rebase:
  pull: display default warning only when non-ff
  pull: correct condition to trigger non-ff advice
  pull: get rid of unnecessary global variable
  pull: give the advice for choosing rebase/merge much later
  pull: refactor fast-forward check
2021-01-06 23:33:44 -08:00
Junio C Hamano
c256631065 Merge branch 'tb/pack-bitmap'
Various improvements to the codepath that writes out pack bitmaps.

* tb/pack-bitmap: (24 commits)
  pack-bitmap-write: better reuse bitmaps
  pack-bitmap-write: relax unique revwalk condition
  pack-bitmap-write: use existing bitmaps
  pack-bitmap: factor out 'add_commit_to_bitmap()'
  pack-bitmap: factor out 'bitmap_for_commit()'
  pack-bitmap-write: ignore BITMAP_FLAG_REUSE
  pack-bitmap-write: build fewer intermediate bitmaps
  pack-bitmap.c: check reads more aggressively when loading
  pack-bitmap-write: rename children to reverse_edges
  t5310: add branch-based checks
  commit: implement commit_list_contains()
  bitmap: implement bitmap_is_subset()
  pack-bitmap-write: fill bitmap with commit history
  pack-bitmap-write: pass ownership of intermediate bitmaps
  pack-bitmap-write: reimplement bitmap writing
  ewah: add bitmap_dup() function
  ewah: implement bitmap_or()
  ewah: make bitmap growth less aggressive
  ewah: factor out bitmap growth
  rev-list: die when --test-bitmap detects a mismatch
  ...
2021-01-06 23:33:43 -08:00
Junio C Hamano
b62bbd3580 Merge branch 'ab/trailers-extra-format'
The "--format=%(trailers)" mechanism gets enhanced to make it
easier to design output for machine consumption.

* ab/trailers-extra-format:
  pretty format %(trailers): add a "key_value_separator"
  pretty format %(trailers): add a "keyonly"
  pretty-format %(trailers): fix broken standalone "valueonly"
  pretty format %(trailers) doc: avoid repetition
  pretty format %(trailers) test: split a long line
2021-01-06 23:33:43 -08:00
Junio C Hamano
c977ff4407 Merge branch 'pk/subsub-fetch-fix-take-2'
"git fetch --recurse-submodules" fix (second attempt).

* pk/subsub-fetch-fix-take-2:
  submodules: fix of regression on fetching of non-init subsub-repo
2021-01-06 23:33:43 -08:00
Philippe Blain
80f5a16798 mergetool--lib: fix '--tool-help' to correctly show available tools
Commit 83bbf9b92e (mergetool--lib: improve support for vimdiff-style tool
variants, 2020-07-29) introduced a regression in the output of `git mergetool
--tool-help` and `git difftool --tool-help` [1].

In function 'show_tool_names' in git-mergetool--lib.sh, we loop over the
supported mergetools and their variants and accumulate them in the variable
'variants', separating them with a literal '\n'.

The code then uses 'echo $variants' to turn these '\n' into newlines, but this
behaviour is not portable, it just happens to work in some shells, like
dash(1)'s 'echo' builtin.

For shells in which 'echo' does not turn '\n' into newlines, the end
result is that the only tools that are shown are the existing variants
(except the last variant alphabetically), since the variants are
separated by actual newlines in '$variants' because of the several
'echo' calls in mergetools/{bc,vimdiff}::list_tool_variants.

Fix this bug by embedding an actual line feed into `variants` in
show_tool_names(). While at it, replace `sort | uniq` by `sort -u`.

To prevent future regressions, add a simple test that checks that a few
known tools are correctly shown (let's avoid counting the total number
of tools to lessen the maintenance burden when new tools are added or if
'--tool-help' learns additional logic, like hiding tools depending on
the current platform).

[1] https://lore.kernel.org/git/CADtb9DyozjgAsdFYL8fFBEWmq7iz4=prZYVUdH9W-J5CKVS4OA@mail.gmail.com/

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Based-on-patch-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 18:31:27 -08:00
Matheus Tavares
ea8bbf2a4e t4129: don't fail if setgid is set in the test directory
The last test of t4129 creates a directory and expects its setgid bit
(g+s) to be off. But this makes the test fail when the parent directory
has the bit set, as setgid's state is inherited by newly created
subdirectories.

One way to solve this problem is to allow the presence of this bit when
comparing the return of `test_modebits` with the expected value. But
then we may have the same problem in the future when other tests start
using `test_modebits` on directories (currently t4129 is the only one)
and forget about setgid. Instead, let's make the helper function more
robust with respect to the state of the setgid bit in the test directory
by removing this bit from the returning value. There should be no
problem with existing callers as no one currently expects this bit to be
on.

Note that the sticky bit (+t) and the setuid bit (u+s) are not
inherited, so we don't have to worry about those.

Reported-by: Kevin Daudt <me@ikke.info>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:59:17 -08:00
Ævar Arnfjörð Bjarmason
08bf6a8bc3 branch tests: add to --sort tests
Further stress the --sort callback in ref-filter.c. The implementation
uses certain short-circuiting logic, let's make sure it behaves the
same way on e.g. name & version sort. Improves a test added in
aedcb7dc75 (branch.c: use 'ref-filter' APIs, 2015-09-23).

I don't think all of this output makes sense, but let's test for the
behavior as-is, we can fix bugs in it in a later commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:16:56 -08:00
Ævar Arnfjörð Bjarmason
06ce79152b mktag: add a --[no-]strict option
Now that mktag has been migrated to use the fsck machinery to check
its input, it makes sense to teach it to run in the equivalent of "git
fsck"'s default mode.

For cases where mktag is used to (re)create a tag object using data
from an existing and malformed tag object, the validation may
optionally have to be loosened. Teach the command to take the
"--[no-]strict" option to do so.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 14:22:24 -08:00
Taylor Blau
cc2d43be2b p7519: allow running without watchman prereq
p7519 measures the performance of the fsmonitor code. To do this, it
uses the installed copy of Watchman. If Watchman isn't installed, a noop
integration script is installed in its place.

When in the latter mode, it is expected that the script should not write
a "last update token": in fact, it doesn't write anything at all since
the script is blank.

Commit 33226af42b (t/perf/fsmonitor: improve error message if typoing
hook name, 2020-10-26) made sure that running 'git update-index
--fsmonitor' did not write anything to stderr, but this is not the case
when using the empty Watchman script, since Git will complain that:

    $ which watchman
    watchman not found
    $ cat .git/hooks/fsmonitor-empty
    $ git -c core.fsmonitor=.git/hooks/fsmonitor-empty update-index --fsmonitor
    warning: Empty last update token.

Prior to 33226af42b, the output wasn't checked at all, which allowed
this noop mode to work. But, 33226af42b breaks p7519 when running it
without a 'watchman(1)' on your system.

Handle this by only checking that the stderr is empty only when running
with a real watchman executable. Otherwise, assert that the error
message is the expected one when running in the noop mode.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 13:48:25 -08:00
Ævar Arnfjörð Bjarmason
3f390a366c mktag: convert to parse-options
Convert the "mktag" command to use parse-options.h instead of its own
ad-hoc argc handling. This doesn't matter much in practice since it
doesn't support any options, but removes another special-case in our
codebase, and makes it easier to add options to it in the future.

It does marginally improve the situation for programs that want to
execute git commands in a consistent manner and e.g. always use
--end-of-options. E.g. "gitaly" does that, and has a blacklist of
built-ins that don't support --end-of-options. This is one less
special case for it and other similar programs to support.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
9a1a3a4d4c mktag: allow omitting the header/body \n separator
Change mktag's acceptance rules to accept an empty body without an
empty line after the header again. This fixes an ancient unintended
dregression in "mktag".

When "mktag" was introduced in ec4465adb3 (Add "tag" objects that can
be used to sign other objects., 2005-04-25) the input checks were much
looser. When it was documented it 6cfec03680 (mktag: minimally update
the description., 2007-06-10) it was clearly intended for this \n to
be optional:

    The message, when [it] exists, is separated by a blank line from
    the header.

But then in e0aaf781f6 (mktag.c: improve verification of tagger field
and tests, 2008-03-27) this was made an error, seemingly by
accident. It was just a result of the general header checks, and all
the tests after that patch have a trailing empty line (but did not
before).

Let's allow this again, and tweak the test semantics changed in
e0aaf781f6 to remove the redundant empty line. New tests added in
previous commits of mine already added an explicit test for allowing
the empty line between header and body.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
acfc01332b mktag: allow turning off fsck.extraHeaderEntry
In earlier commits mktag learned to use the fsck machinery, at which
point we needed to add fsck.extraHeaderEntry so it could be as strict
about extra headers as it's been ever since it was implemented.

But it's not nice to need to switch away from "mktag" to "hash-object"
+ manual "fsck" just because you'd like to have an extra header. So
let's support turning it off by getting "fsck.*" variables from the
config.

Pedantically speaking it's still not possible to make "mktag" behave
just like "hash-object -t tag" does, since we're unconditionally going
to check the referenced object in verify_object_in_tag(), which is our
own check, and not one that exists in fsck.c.

But the spirit of "this works like fsck" is preserved, in that if you
created such a tag with "hash-object" and did a full "fsck" on the
repository it would also error out about that invalid object, it just
wouldn't emit the same message as fsck does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
acf9de4c94 mktag: use fsck instead of custom verify_tag()
Change the validation logic in "mktag" to use fsck's fsck_tag()
instead of its own custom parser. Curiously the logic for both dates
back to the same commit[1]. Let's unify them so we're not maintaining
two sets functions to verify that a tag is OK.

The behavior of fsck_tag() and the old "mktag" code being removed here
is different in few aspects.

I think it makes sense to remove some of those checks, namely:

 A. fsck only cares that the timezone matches [-+][0-9]{4}. The mktag
    code disallowed values larger than 1400.

    Yes there's currently no timezone with a greater offset[2], but
    since we allow any number of non-offical timezones (e.g. +1234)
    passing this through seems fine. Git also won't break in the
    future if e.g. French Polynesia decides it needs to outdo the Line
    Islands when it comes to timezone extravagance.

 B. fsck allows missing author names such as "tagger <email>", mktag
    wouldn't, but would allow e.g. "tagger [2 spaces] <email>" (but
    not "tagger [1 space] <email>"). Now we allow all of these.

 C. Like B, but "mktag" disallowed spaces in the <email> part, fsck
    allows it.

In some ways fsck_tag() is stricter than "mktag" was, namely:

 D. fsck disallows zero-padded dates, but mktag didn't care. So
    e.g. the timestamp "0000000000 +0000" produces an error now. A
    test in "t1006-cat-file.sh" relied on this, it's been changed to
    use "hash-object" (without fsck) instead.

There was one check I deemed worth keeping by porting it over to
fsck_tag():

 E. "mktag" did not allow any custom headers, and by extension (as an
    empty commit is allowed) also forbade an extra stray trailing
    newline after the headers it knew about.

    Add a new check in the "ignore" category to fsck and use it. This
    somewhat abuses the facility added in efaba7cc77 (fsck:
    optionally ignore specific fsck issues completely, 2015-06-22).

    This is somewhat of hack, but probably the least invasive change
    we can make here. The fsck command will shuffle these categories
    around, e.g. under --strict the "info" becomes a "warn" and "warn"
    becomes "error". Existing users of fsck's (and others,
    e.g. index-pack) --strict option rely on this.

    So we need to put something into a category that'll be ignored by
    all existing users of the API. Pretending that
    fsck.extraHeaderEntry=error ("ignore" by default) was set serves
    to do this for us.

1. ec4465adb3 (Add "tag" objects that can be used to sign other
   objects., 2005-04-25)

2. https://en.wikipedia.org/wiki/List_of_UTC_time_offsets

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
692654dca0 mktag tests: test verify_object() with replaced objects
Add tests to demonstrate what "mktag" does in the face of replaced
objects.

There was an existing test for replaced objects fed to "mktag" added
in cc400f5011 (mktag: call "check_sha1_signature" with the
replacement sha1, 2009-01-23), but that one only tests a
commit->commit mapping. Not a mapping to a different type as like
we're also testing for here. We could remove the "mktag" test in
t6050-replace.sh now if the created tag wasn't being used by a
subsequent "fsck" test.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
30f882c16d mktag tests: improve verify_object() test coverage
The verify_object() function in "mktag.c" is tasked with ensuring that
our tag refers to a valid object.

The existing test for this might fail because it was also testing that
"type taggg" didn't refer to a valid object type (it should be "type
tag"), or because we referred to a valid object but got the type
wrong.

Let's split these tests up, so we're testing all combinations of a
non-existing object and in invalid/wrong "type" lines.

We need to provide GIT_TEST_GETTEXT_POISON=false here because the
"invalid object type" error is emitted by
parse_loose_header_extended(), which has that message already marked
for translation. Another option would be to use test_i18ngrep, but I
prefer always running the test, not skipping it under gettext poison
testing.

I'm not testing this in combination with "git replace". That'll be
done in a subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
ca9a1ed969 mktag tests: test "hash-object" compatibility
Change all the successful "mktag" tests to test that "hash-object"
produces the same hash for the input, and that fsck passes for
both.

This tests e.g. that "mktag" doesn't trim its input or otherwise munge
it in a way that "hash-object" doesn't.

Since we're doing an "fsck --strict" here at the end let's incorporate
the creation of the "mytag" name into this test, removing the
special-case at the end of the file.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
47c95e77d1 mktag tests: stress test whitespace handling
Add tests for a couple of whitespace edge cases around the header/body
boundary.

I consider the requirement for a blank line before the empty body a
bug, it's a long-standing regression which goes against the command's
documented behavior. This bug will be addressed in a follow-up change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
3b9e4dd3a3 mktag tests: run "fsck" after creating "mytag"
Change the last test in the file to run an "fsck --strict" after
creating the tag at the end.

We're just doing this for good measure to check that fsck behaves as
expected now that there's finally a reference for our valid tag. Other
tests going to be checking this elsewhere, but it's nice to cover all
the edge cases in this test to make it as self-contained as possible.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
5c2303e0c7 mktag tests: don't create "mytag" twice
Change a test added in e0aaf781f6 (mktag.c: improve verification of
tagger field and tests, 2008-03-27) to not create "mytag", which
should only be created and verified at the end in an earlier test
added in 446c6faec6 (New tests and en-passant modifications to mktag.,
2006-07-29).

While we're at it let's prevent a similar logic error from creeping
into the test by asserting that "mytag" doesn't exist before we create
it. Let's do this by moving the test to use "update-ref", instead of
our own homebrew ad-hoc refstore update.

We're not really testing for anything yet by creating the tag at the
end here. A subsequent commit will change that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
317c176279 mktag tests: don't redirect stderr to a file needlessly
Remove the redirection of stderr to "message" in the valid tag
test. This pattern seems to have been copy/pasted from the failure
case in 446c6faec6 (New tests and en-passant modifications to mktag.,
2006-07-29).

While I'm at it do the same for the "replace" tests. The tag creation
I'm changing here seems to have been copy/pasted from the "mktag"
tests to those tests in cc400f5011 (mktag: call
"check_sha1_signature" with the replacement sha1, 2009-01-23).

Nobody examines the contents of the resulting "message" file, so the
net result is that error messages cannot be seen in "sh t3800-mktag.sh
-v" output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason
0d35ccb5e0 mktag tests: remove needless SHA-1 hardcoding
Change the tests amended in acb49d1cc8 (t3800: make hash-size
independent, 2019-08-18) even more to make them independent of either
SHA-1 or SHA-256.

Some of these tests were failing for the wrong reasons. The first one
being modified here would fail because the line starts with "xxxxxx"
instead of "object", the rest of the line doesn't matter.

Let's just put a valid hash on the rest of the line anyway to narrow
the test down for just the s/object/xxxxxx/ case.

The second one being modified here would fail under
GIT_TEST_DEFAULT_HASH=sha256 because <some sha-1 length garbage> is an
invalid SHA-256, but we should really be testing <some sha-256 length
garbage> when under SHA-256.

This doesn't really matter since we should be able to trust other
parts of the code to validate things in the 0-9a-f range, but let's
keep it for good measure.

There's a later test which tests an invalid SHA which looks like a
valid one, to stress the "We refuse to tag something we can't
verify[...]" logic in mktag.c.

But here we're testing for a SHA-length string which contains
characters outside of the /[0-9a-f]/i set.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason
b5ca549c93 mktag tests: use "test_commit" helper
Replace ad-hoc setup of a single commit in the "mktag" tests with our
standard helper pattern. The old setup dated back to 446c6faec6 (New
tests and en-passant modifications to mktag., 2006-07-29) before the
helper existed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
Ævar Arnfjörð Bjarmason
aba5377f69 mktag tests: don't needlessly use a subshell
The use of a subshell dates back to e9b20943b7 (t/t3800: do not use a
temporary file to hold expected result., 2008-01-04). It's not needed
anymore, if it ever was.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:28 -08:00
Derrick Stolee
3797a0a7b7 maintenance: use Windows scheduled tasks
Git's background maintenance uses cron by default, but this is not
available on Windows. Instead, integrate with Task Scheduler.

Tasks can be scheduled using the 'schtasks' command. There are several
command-line options that can allow for some advanced scheduling, but
unfortunately these seem to all require authenticating using a password.

Instead, use the "/xml" option to pass an XML file that contains the
configuration for the necessary schedule. These XML files are based on
some that I exported after constructing a schedule in the Task Scheduler
GUI. These options only run background maintenance when the user is
logged in, and more fields are populated with the current username and
SID at run-time by 'schtasks'.

Since the GIT_TEST_MAINT_SCHEDULER environment variable allows us to
specify 'schtasks' as the scheduler, we can test the Windows-specific
logic on other platforms. Thus, add a check that the XML file written
by Git is valid when xmllint exists on the system.

Since we use a temporary file for the XML files sent to 'schtasks', we
prefix the random characters with the frequency so it is easier to
examine the proper file during tests. Instead of an exact match on the
'args' file, we 'grep' for the arguments other than the filename.

There is a deficiency in the current design. Windows has two kinds of
applications: GUI applications that start by "winmain()" and console
applications that start by "main()". Console applications are attached
to a new Console window if they are not already associated with a GUI
application. This means that every hour the scheudled task launches a
command window for the scheduled tasks. Not only is this visually
obtrusive, but it also takes focus from whatever else the user is
doing!

A simple fix would be to insert a GUI application that acts as a shim
between the scheduled task and Git. This is currently possible in Git
for Windows by setting the <Command> tag equal to

  C:\Program Files\Git\git-bash.exe

with options "--hide --no-needs-console --command=cmd\git.exe"
followed by the arguments currently used. Since git-bash.exe is not
included in Windows builds of core Git, I chose to leave out this
feature. My plan is to submit a small patch to Git for Windows that
converts the use of git.exe with this use of git-bash.exe in the
short term. In the long term, we can consider creating this GUI
shim application within core Git, perhaps in contrib/.

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:38:02 -08:00
Derrick Stolee
2afe7e3567 maintenance: use launchctl on macOS
The existing mechanism for scheduling background maintenance is done
through cron. The 'crontab -e' command allows updating the schedule
while cron itself runs those commands. While this is technically
supported by macOS, it has some significant deficiencies:

1. Every run of 'crontab -e' must request elevated privileges through
   the user interface. When running 'git maintenance start' from the
   Terminal app, it presents a dialog box saying "Terminal.app would
   like to administer your computer. Administration can include
   modifying passwords, networking, and system settings." This is more
   alarming than what we are hoping to achieve. If this alert had some
   information about how "git" is trying to run "crontab" then we would
   have some reason to believe that this dialog might be fine. However,
   it also doesn't help that some scenarios just leave Git waiting for
   a response without presenting anything to the user. I experienced
   this when executing the command from a Bash terminal view inside
   Visual Studio Code.

2. While cron initializes a user environment enough for "git config
   --global --show-origin" to show the correct config file information,
   it does not set up the environment enough for Git Credential Manager
   Core to load credentials during a 'prefetch' task. My prefetches
   against private repositories required re-authenticating through UI
   pop-ups in a way that should not be required.

The solution is to switch from cron to the Apple-recommended [1]
'launchd' tool.

[1] https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/ScheduledJobs.html

The basics of this tool is that we need to create XML-formatted
"plist" files inside "~/Library/LaunchAgents/" and then use the
'launchctl' tool to make launchd aware of them. The plist files
include all of the scheduling information, along with the command-line
arguments split across an array of <string> tags.

For example, here is my plist file for the weekly scheduled tasks:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0"><dict>
<key>Label</key><string>org.git-scm.git.weekly</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/libexec/git-core/git</string>
<string>--exec-path=/usr/local/libexec/git-core</string>
<string>for-each-repo</string>
<string>--config=maintenance.repo</string>
<string>maintenance</string>
<string>run</string>
<string>--schedule=weekly</string>
</array>
<key>StartCalendarInterval</key>
<array>
<dict>
<key>Day</key><integer>0</integer>
<key>Hour</key><integer>0</integer>
<key>Minute</key><integer>0</integer>
</dict>
</array>
</dict>
</plist>

The schedules for the daily and hourly tasks are more complicated
since we need to use an array for the StartCalendarInterval with
an entry for each of the six days other than the 0th day (to avoid
colliding with the weekly task), and each of the 23 hours other
than the 0th hour (to avoid colliding with the daily task).

The "Label" value is currently filled with "org.git-scm.git.X"
where X is the frequency. We need a different plist file for each
frequency.

The launchctl command needs to be aligned with a user id in order
to initialize the command environment. This must be done using
the 'launchctl bootstrap' subcommand. This subcommand is new as
of macOS 10.11, which was released in September 2015. Before that
release the 'launchctl load' subcommand was recommended. The best
source of information on this transition I have seen is available
at [2]. The current design does not preclude a future version that
detects the available fatures of 'launchctl' to use the older
commands. However, it is best to rely on the newest version since
Apple might completely remove the deprecated version on short
notice.

[2] https://babodee.wordpress.com/2016/04/09/launchctl-2-0-syntax/

To remove a schedule, we must run 'launchctl bootout' with a valid
plist file. We also need to 'bootout' a task before the 'bootstrap'
subcommand will succeed, if such a task already exists.

The need for a user id requires us to run 'id -u' which works on
POSIX systems but not Windows. Further, the need for fully-qualitifed
path names including $HOME behaves differently in the Git internals and
the external test suite. The $HOME variable starts with "C:\..." instead
of the "/c/..." that is provided by Git in these subcommands. The test
therefore has a prerequisite that we are not on Windows. The cross-
platform logic still allows us to test the macOS logic on a Linux
machine.

We can verify the commands that were run by 'git maintenance start'
and 'git maintenance stop' by injecting a script that writes the
command-line arguments into GIT_TEST_MAINT_SCHEDULER.

An earlier version of this patch accidentally had an opening
"<dict>" tag when it should have had a closing "</dict>" tag. This
was caught during manual testing with actual 'launchctl' commands,
but we do not want to update developers' tasks when running tests.
It appears that macOS includes the "xmllint" tool which can verify
the XML format. This is useful for any system that might contain
the tool, so use it whenever it is available.

We strive to make these tests work on all platforms, but Windows caused
some headaches. In particular, the value of getuid() called by the C
code is not guaranteed to be the same as `$(id -u)` invoked by a test.
This is because `git.exe` is a native Windows program, whereas the
utility programs run by the test script mostly utilize the MSYS2 runtime,
which emulates a POSIX-like environment. Since the purpose of the test
is to check that the input to the hook is well-formed, the actual user
ID is immaterial, thus we can work around the problem by making the the
test UID-agnostic. Another subtle issue is the $HOME environment
variable being a Windows-style path instead of a Unix-style path. We can
be more flexible here instead of expecting exact path matches.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:38:02 -08:00
Felipe Contreras
5a067ba9d0 completion: add proper public __git_complete
When __git_complete was introduced, it was meant to be temporarily, while
a proper guideline for public shell functions was established
(tentatively _GIT_complete), but since that never happened, people
in the wild started to use __git_complete, even though it was marked as
not public.

Eight years is more than enough wait, let's mark this function as
public, and make it a bit more user-friendly.

So that instead of doing:

  __git_complete gk __gitk_main

The user can do:

  __git_complete gk gitk

And instead of:

  __git_complete gf _git_fetch

Do:

  __git_complete gf git_fetch

Backwards compatibility is maintained.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:25:56 -08:00
Felipe Contreras
0e02bdc17a test: completion: add tests for __git_complete
Even though the function was marked as not public, it's already used in
the wild.

We should at least test basic functionality.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:25:56 -08:00
Martin Ågren
e5afd4449d object-file.c: rename from sha1-file.c
Drop the last remnant of "sha1" in this file and rename it to reflect
that we're not just able to handle SHA-1 these days.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
Martin Ågren
1e6771e504 object-name.c: rename from sha1-name.c
Generalize the last remnants of "sha" and "sha1" in this file and rename
it to reflect that we're not just able to handle SHA-1 these days.

We need to update one test to check for an updated error string.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
Elijah Newren
350410f6b1 diffcore-rename: remove unnecessary duplicate entry checks
Commit 25d5ea410f ("[PATCH] Redo rename/copy detection logic.",
2005-05-24) added a duplicate entry check on rename_src in order to
avoid segfaults; the code at the time was prone to double free()s and an
easy way to avoid it was just to turn off rename detection for any
duplicate entries.  Note that the form of the check was modified two
commits ago in this series.

Similarly, commit 4d6be03b95 ("diffcore-rename: avoid processing
duplicate destinations", 2015-02-26) added a duplicate entry check
on rename_dst for the exact same reason -- the code was prone to double
free()s, and an easy way to avoid it was just to turn off rename
detection entirely.  Note that the form of the check was modified in the
commit just before this one.

In the original code in both places, the code was dealing with
individual diff_filespecs and trying to match things up, instead of just
keeping the original diff_filepairs around as we do now.  The
intervening change in structure has fixed the accounting problems and
the associated double free()s that used to occur, and thus we already
have a better fix.  As such, we can remove the band-aid checks for
duplicate entries.

Due to the last two patches, the diffcore_rename() setup is no longer a
sizeable chunk of overall runtime.  Thus, in a large rebase of many
commits with lots of renames and several optimizations to inexact rename
detection, this patch only speeds up the overall code by about half a
percent or so and is pretty close to the run-to-run variability making
it hard to get an exact measurement.  However, with some trace2 regions
around the setup code in diffcore_rename() so that I can focus on just
it, I measure that this patch consistently saves almost a third of the
remaining time spent in diffcore_rename() setup.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 12:59:34 -08:00
Antonio Russo
c8302c6c00 t6016: move to lib-log-graph.sh framework
t6016 manually reconstructs git log --graph output by using the reported
commit hashes from `git rev-parse`.  Each tag is converted into an
environment variable manually, and then `echo`-ed to an expected output
file, which is in turn compared to the actual output.

The expected output is difficult to read and write, because, e.g.,
each line of output must be prefaced with echo, quoted, and properly
escaped.  Additionally, the test is sensitive to trailing whitespace,
which may potentially be removed from graph log output in the future.

In order to reduce duplication, ease troubleshooting of failed tests by
improving readability, and ease the addition of more tests to this file,
port the operations to `lib-log-graph.sh`, which is already used in
several other tests, e.g., t4215.  Give all merges a simple commit
message, and use a common `check_graph` macro taking a heredoc of the
expected output which does not required extensive escaping.

Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 12:20:49 -08:00
Martin Ågren
04f6b0a192 t1300: don't needlessly work with core.foo configs
We use various made-up config keys in the "core" section for no real
reason. Change them to work in the "section" section instead and be
careful to also change "cores" to "sections". Make sure to also catch
"Core", "CoReS" and similar.

There are a few instances that actually want to work with a real "core"
config such as `core.bare` or `core.editor`. After this, it's clearer
that they work with "core" for a reason.

Reported-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 11:31:25 -08:00
Martin Ågren
34479d7177 t1300: remove duplicate test for --file no-such-file
We test that we can handle `git config --file symlink` and the error
case of `git config --file symlink-to-missing-file`. For good measure,
we also throw in a test to check that we correctly handle referencing a
missing regular file. But we have such a test earlier in this script.
They both check that we fail to use `--file no-such-file --list`.

Drop the latter of these and keep the one that is in the general area
where we test `--file` and `GIT_CONFIG`. The one we're dropping also
checks that we can't even get a specific key from the missing file --
let's make sure we check that in the test we keep.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 11:31:25 -08:00
Martin Ågren
b832abb63d t1300: remove duplicate test for --file ../foo
We have two tests for checking that we can handle `git config --file
../other-config ...`. One, using `--file`, was introduced in 65807ee697
("builtin-config: Fix crash when using "-f <relative path>" from
non-root dir", 2010-01-26), then another, using `GIT_CONFIG`, came about
in 270a34438b ("config: stop using config_exclusive_filename",
2012-02-16).

The latter of these was then converted to use `--file` in f7e8714101
("t: prefer "git config --file" to GIT_CONFIG", 2014-03-20). Both where
then simplified in a5db0b77b9 ("t1300: extract and use
test_cmp_config()", 2018-10-21).

These two tests differ slightly in the order of the options used, but
other than that, they are identical. Let's drop one. As noted in
f7e8714101, we do still have a test for `GIT_CONFIG` and it shares the
implementation with `--file`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 11:31:24 -08:00
Junio C Hamano
73583204d9 Merge branch 'nk/refspecs-negative-fix'
Hotfix for recent regression.

* nk/refspecs-negative-fix:
  negative-refspec: improve comment on query_matches_negative_refspec
  negative-refspec: fix segfault on : refspec
2020-12-23 13:59:46 -08:00
Junio C Hamano
7a50265295 Merge branch 'ma/maintenance-crontab-fix'
Hotfix for a topic of this cycle.

* ma/maintenance-crontab-fix:
  t7900-maintenance: test for magic markers
  gc: fix handling of crontab magic markers
  git-maintenance.txt: add missing word
2020-12-23 13:59:46 -08:00
Junio C Hamano
04cd999638 Merge branch 'dl/checkout-p-merge-base'
Fix to a regression introduced during this cycle.

* dl/checkout-p-merge-base:
  checkout -p: handle tree arguments correctly again
2020-12-23 13:59:46 -08:00
Junio C Hamano
d076224363 Merge branch 'js/no-more-prepare-for-main-in-test'
Test coverage fix.

* js/no-more-prepare-for-main-in-test:
  tests: drop the `PREPARE_FOR_MAIN_BRANCH` prereq
  t9902: use `main` as initial branch name
  t6302: use `main` as initial branch name
  t5703: use `main` as initial branch name
  t5510: use `main` as initial branch name
  t5505: finalize transitioning to using the branch name `main`
  t3205: finalize transitioning to using the branch name `main`
  t3203: complete the transition to using the branch name `main`
  t3201: finalize transitioning to using the branch name `main`
  t3200: finish transitioning to the initial branch name `main`
  t1400: use `main` as initial branch name
2020-12-23 13:59:46 -08:00
Junio C Hamano
c46f849f8a Merge branch 'jx/pack-redundant-on-single-pack'
"git pack-redandant" when there is only one packfile used to crash,
which has been corrected.

* jx/pack-redundant-on-single-pack:
  pack-redundant: fix crash when one packfile in repo
2020-12-23 13:59:46 -08:00
Eric Wong
a9ecaa06a7 core.abbrev=no disables abbreviations
This allows users to write hash-agnostic scripts and configs by
disabling abbreviations.  Using "-c core.abbrev=40" will be
insufficient with SHA-256, and "-c core.abbrev=64" won't work with
SHA-1 repos today.

Signed-off-by: Eric Wong <e@80x24.org>
[jc: tweaked implementation, added doc and a test]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-23 13:40:09 -08:00
Eric Sunshine
5bc12c11cc t/perf: avoid unnecessary test_export() recursion
test_export() has been self-recursive since its inception even though a
simple for-loop would have served just as well to append its arguments
to the `test_export_` variable separated by the pipe character "|".
Recently `test_export_` was changed instead to a space-separated list of
tokens to be exported, an operation which can be accomplished via a
single simple assignment, with no need for looping or recursion.
Therefore, simplify the implementation.

While at it, take advantage of the fact that variable names to be
exported are shell identifiers, thus won't be composed of special
characters or whitespace, thus simple a `$*` can be used rather than
magical `"$@"`.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-22 13:45:36 -08:00
Nipunn Koorapati
18f9c98845 negative-refspec: fix segfault on : refspec
The logic added to check for negative pathspec match by c0192df630
(refspec: add support for negative refspecs, 2020-09-30) looks at
refspec->src assuming it is never NULL, however when
remote.origin.push is set to ":", then refspec->src is NULL,
causing a segfault within strcmp.

Tell git to handle matching refspec by adding the needle to the
set of positively matched refspecs, since matching ":" refspecs
match anything as src.

Add test for matching refspec pushes fetch-negative-refspec
both individually and in combination with a negative refspec.

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 22:49:36 -08:00
Martin Ågren
a52df25a54 t7900-maintenance: test for magic markers
When we insert our "BEGIN" and "END" markers into the cron table, it's
so that a Git version from many years into the future would be able to
identify this region in the cron table. Let's add a test to make sure
that these markers don't ever change.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 14:33:09 -08:00
Martin Ågren
66dc0a3625 gc: fix handling of crontab magic markers
On `git maintenance start`, we add a few entries to the user's cron
table. We wrap our entries using two magic markers, "# BEGIN GIT
MAINTENANCE SCHEDULE" and "# END GIT MAINTENANCE SCHEDULE". At a later
`git maintenance stop`, we will go through the table and remove these
lines. Or rather, we will remove the "BEGIN" marker, the "END" marker
and everything between them.

Alas, we have a bug in how we detect the "END" marker: we don't. As we
loop through all the lines of the crontab, if we are in the "old
region", i.e., the region we're aiming to remove, we make an early
`continue` and don't get as far as checking for the "END" marker. Thus,
once we've seen our "BEGIN", we remove everything until the end of the
file.

Rewrite the logic for identifying these markers. There are four cases
that are mutually exclusive: The current line starts a region or it ends
it, or it's firmly within the region, or it's outside of it (and should
be printed).

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 14:33:08 -08:00
Johannes Schindelin
5c29f19cda checkout -p: handle tree arguments correctly again
This fixes a segmentation fault.

The bug is caused by dereferencing `new_branch_info->commit` when it is
`NULL`, which is the case when the tree-ish argument is actually a tree,
not a commit-ish. This was introduced in 5602b500c3 (builtin/checkout:
fix `git checkout -p HEAD...` bug, 2020-10-07), where we tried to ensure
that the special tree-ish `HEAD...` is handled correctly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 14:06:09 -08:00
Sergey Organov
af04d8f1a5 t4013: add tests for --diff-merges=first-parent
This new option provides essential new functionality, changing diff
output to first parent only, without changing history traversal mode,
so it deserves its own test.

As we do it, add additional test that --diff-merges=first-parent by
itself doesn't imply -p and only outputs diffs for merge commits.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
Sergey Organov
6fc944d895 diff-merges: fix -m to properly override -c/--cc
Logically, -m, -c, --cc specify 3 different formats for representing
merge commits, yet -m doesn't in fact override -c or --cc, that makes
no sense.

Fix -m to properly override -c/--cc, and change the tests accordingly.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
Sergey Organov
ec315c66bb t4013: add tests for -m failing to override -c/--cc
Logically, -m, -c, --cc specify 3 different formats for representing
merge commits, yet -m doesn't in fact override -c or --cc, that makes
no sense.

Add 2 expected to fail tests that demonstrate the problem.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
Sergey Organov
14c14b44e4 t4013: support test_expect_failure through ':failure' magic
Add support to be able to specify expected failure, through :failure
magic, like this:

:failure cmd args

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
Eric Sunshine
cf76baea41 worktree: teach repair to fix multi-directional breakage
`git worktree repair` knows how to repair the two-way links between the
repository and a worktree as long as a link in one or the other
direction is sound. For instance, if a linked worktree is moved (without
using `git worktree move`), repair is possible because the worktree
still knows the location of the repository even though the repository no
longer knows where the worktree is. Similarly, if the repository is
moved, repair is possible since the repository still knows the locations
of the worktrees even though the worktrees no longer know where the
repository is.

However, if both the repository and the worktrees are moved, then links
are severed in both directions, and no repair is possible. This is the
case even when the new worktree locations are specified as arguments to
`git worktree repair`. The reason for this limitation is twofold. First,
when `repair` consults the worktree's gitfile (/path/to/worktree/.git)
to determine the corresponding <repo>/worktrees/<id>/gitdir file to fix,
<repo> is the old path to the repository, thus it is unable to fix the
`gitdir` file at its new location since it doesn't know where it is.
Second, when `repair` consults <repo>/worktrees/<id>/gitdir to find the
location of the worktree's gitfile (/path/to/worktree/.git), the path
recorded in `gitdir` is the old location of the worktree's gitfile, thus
it is unable to repair the gitfile since it doesn't know where it is.

Fix these shortcomings by teaching `repair` to attempt to infer the new
location of the <repo>/worktrees/<id>/gitdir file when the location
recorded in the worktree's gitfile has become stale but the file is
otherwise well-formed. The inference is intentionally simple-minded.
For each worktree path specified as an argument, `git worktree repair`
manually reads the ".git" gitfile at that location and, if it is
well-formed, extracts the <id>. It then searches for a corresponding
<id> in <repo>/worktrees/ and, if found, concludes that there is a
reasonable match and updates <repo>/worktrees/<id>/gitdir to point at
the specified worktree path. In order for <repo> to be known, `git
worktree repair` must be run in the main worktree or bare repository.

`git worktree repair` first attempts to repair each incoming
/path/to/worktree/.git gitfile to point at the repository, and then
attempts to repair outgoing <repo>/worktrees/<id>/gitdir files to point
at the worktrees. This sequence was chosen arbitrarily when originally
implemented since the order of fixes is immaterial as long as one side
of the two-way link between the repository and a worktree is sound.
However, for this new repair technique to work, the order must be
reversed. This is because the new inference mechanism, when it is
successful, allows the outgoing <repo>/worktrees/<id>/gitdir file to be
repaired, thus fixing one side of the two-way link. Once that side is
fixed, the other side can be fixed by the existing repair mechanism,
hence the order of repairs is now significant.

Two safeguards are employed to avoid hijacking a worktree from a
different repository if the user accidentally specifies a foreign
worktree as an argument. The first, as described above, is that it
requires an <id> match between the repository and the worktree. That
itself is not foolproof for preventing hijack, so the second safeguard
is that the inference will only kick in if the worktree's
/path/to/worktree/.git gitfile does not point at a repository.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:44:28 -08:00
Felipe Contreras
33fc56253b test: bisect-porcelain: fix location of files
Commit ba7eafe146 (t6030: explicitly test for bisection cleanup,
2017-09-29) introduced checks for files in the $GIT_DIR directory, but
that variable is not always defined, and in this test file it's not.

Therefore these checks always passed regardless of the presence of these
files (unless the user has some /BISECT_LOG file, for some reason).

Let's check the files in the correct location.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:08:39 -08:00
Junio C Hamano
59fcf746f5 Merge branch 'jc/diff-I-status-fix'
"git diff -I<pattern> -exit-code" should exit with 0 status when
all the changes match the ignored pattern, but it didn't.

* jc/diff-I-status-fix:
  diff: correct interaction between --exit-code and -I<pattern>
2020-12-18 15:15:18 -08:00
Junio C Hamano
d4187bd4d5 Merge branch 'es/perf-export-fix'
Dev-support fix for BSD.

* es/perf-export-fix:
  t/perf: fix test_export() failure with BSD `sed`
2020-12-18 15:15:18 -08:00
Junio C Hamano
772bdcd429 Merge branch 'js/init-defaultbranch-advice'
Our users are going to be trained to prepare for future change of
init.defaultBranch configuration variable.

* js/init-defaultbranch-advice:
  init: provide useful advice about init.defaultBranch
  get_default_branch_name(): prepare for showing some advice
  branch -m: allow renaming a yet-unborn branch
  init: document `init.defaultBranch` better
2020-12-18 15:15:17 -08:00
Junio C Hamano
689010ca3c Merge branch 'js/t7064-master-to-initial'
Test update.

* js/t7064-master-to-initial:
  t7064: avoid relying on a specific default branch name
2020-12-17 15:06:40 -08:00
Junio C Hamano
f4fb219a97 Merge branch 'js/t6300-hardcode-main'
Test update.

* js/t6300-hardcode-main:
  t6300: avoid using the default name of the initial branch
2020-12-17 15:06:40 -08:00
Junio C Hamano
e5ace7167a Merge branch 'jk/oid-array-cleanup'
Code clean-up.

* jk/oid-array-cleanup:
  commit-graph: use size_t for array allocation and indexing
  commit-graph: replace packed_oid_list with oid_array
  commit-graph: drop count_distinct_commits() function
  oid-array: provide a for-loop iterator
  oid-array: make sort function public
  cache.h: move hash/oid functions to hash.h
  t0064: make duplicate tests more robust
  t0064: drop sha1 mention from filename
  oid-array.h: drop sha1 mention from header guard
2020-12-17 15:06:40 -08:00
Junio C Hamano
21127fa982 Merge branch 'tb/partial-clone-filters-fix'
Fix potential server side resource deallocation issues when
responding to a partial clone request.

* tb/partial-clone-filters-fix:
  upload-pack.c: don't free allowed_filters util pointers
  builtin/clone.c: don't ignore transport_fetch_refs() errors
2020-12-17 15:06:40 -08:00
Junio C Hamano
9feed4e2a6 Merge branch 'js/t7900-protect-pwd-in-config-get'
Hotfix for test breakage.

* js/t7900-protect-pwd-in-config-get:
  t7900: use --fixed-value in git-maintenance tests
2020-12-17 15:06:39 -08:00
Jiang Xin
0696232390 pack-redundant: fix crash when one packfile in repo
Command `git pack-redundant --all` will crash if there is only one
packfile in the repository.  This is because, if there is only one
packfile in local_packs, `cmp_local_packs` will do nothing and will
leave `pl->unique_objects` as uninitialized.

Also add testcases for repository with no packfile and one packfile
in t5323.

Reported-by: Daniel C. Klauer <daniel.c.klauer@web.de>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 21:21:06 -08:00
Johannes Schindelin
f17c9da2cf tests: drop the PREPARE_FOR_MAIN_BRANCH prereq
We no longer use it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:41 -08:00
Johannes Schindelin
0007618107 t9902: use main as initial branch name
In 8164360fc8 (t9902: prepare a test for the upcoming default branch
name, 2020-10-23), we started adjusting this test script for the default
initial branch name changing to `main`.

However, there is no need to wait for that: let's adjust the test script
to stop relying on a specific initial branch name by setting it
explicitly. This allows us to drop the `PREPARE_FOR_MAIN_BRANCH` prereq
from one test case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:41 -08:00
Johannes Schindelin
2dbd00a7a1 t6302: use main as initial branch name
In 66713e84e7 (tests: prepare aligned mentions of the default branch
name, 2020-10-23), we started adjusting this test script for the default
initial branch name changing to `main`.

However, there is no need to wait for that: let's adjust the test script
to stop relying on a specific initial branch name by setting it
explicitly. This allows us to drop the `PREPARE_FOR_MAIN_BRANCH` prereq
from six test cases.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:41 -08:00
Johannes Schindelin
72dc172804 t5703: use main as initial branch name
In 97cf8d50b5 (t5703: adjust a test case for the upcoming default
branch name, 2020-10-23), we prepared this test script for a world when
the default initial branch name would be `main`.

However, there is no need to wait for that: let's adjust the test script
to stop relying on a specific initial branch name by setting it
explicitly. This allows us to drop the `PREPARE_FOR_MAIN_BRANCH` prereq
from one test case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:41 -08:00
Johannes Schindelin
83ecf26ee7 t5510: use main as initial branch name
In 66713e84e7 (tests: prepare aligned mentions of the default branch
name, 2020-10-23), we prepared this test script for a time when the
default initial branch name would be `main`.

However, there is no need to wait for that: let's adjust the test script
to stop relying on a specific initial branch name by setting it
explicitly. This allows us to drop the `PREPARE_FOR_MAIN_BRANCH` prereq
from two test cases.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:41 -08:00
Johannes Schindelin
97b913681b t5505: finalize transitioning to using the branch name main
In 66713e84e7 (tests: prepare aligned mentions of the default branch
name, 2020-10-23), we started that transition, trying to prepare for a
time when `git init` would use that name for the initial branch.

Even if that time has not arrived, we can complete the transition by
making the test script independent of the default branch name. This also
allows us to drop the `PREPARE_FOR_MAIN_BRANCH` prereq from four test
cases.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:41 -08:00
Johannes Schindelin
654bd7e8a9 t3205: finalize transitioning to using the branch name main
In 66713e84e7 (tests: prepare aligned mentions of the default branch
name, 2020-10-23), we started that transition, trying to prepare for a
time when `git init` would use that name for the initial branch.

Even if that time has not arrived, we can complete the transition by
making the test script independent of the default branch name. This also
allows us to drop the `PREPARE_FOR_MAIN_BRANCH` prereq from one test
case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:41 -08:00
Johannes Schindelin
1eee0a42f9 t3203: complete the transition to using the branch name main
In 66713e84e7 (tests: prepare aligned mentions of the default branch
name, 2020-10-23), we started that transition, trying to prepare for a
time when `git init` would use that name for the initial branch.

Even if that time has not arrived, we can complete the transition by
making the test script independent of the default branch name. This also
allows us to drop the `PREPARE_FOR_MAIN_BRANCH` prereq from one test
case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:41 -08:00
Johannes Schindelin
94287e788b t3201: finalize transitioning to using the branch name main
In 66713e84e7 (tests: prepare aligned mentions of the default branch
name, 2020-10-23), we started that transition, trying to prepare for a
time when `git init` would use that name for the initial branch.

Even if that time has not arrived, we can complete the transition by
making the test script independent of the default branch name. This also
allows us to drop the `PREPARE_FOR_MAIN_BRANCH` prereq from one test
case.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:40 -08:00
Johannes Schindelin
ec9779bcd8 t3200: finish transitioning to the initial branch name main
In 56300ff356 (t3200: prepare for `main` being shorter than `master`,
2020-10-23) and in 66713e84e7 (tests: prepare aligned mentions of the
default branch name, 2020-10-23), we started to prepare t3200 for a new
world where `git init` uses the branch name `main` for the initial
branch.

We do not even have to wait for that new world: we can easily ensure
that that branch name is used, independent of the exact name `git init`
will give the initial branch, so let's do that.

This also lets us remove the `PREPARE_FOR_MAIN_BRANCH` prereq from three
test cases in that script.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:40 -08:00
Johannes Schindelin
35a16dbe32 t1400: use main as initial branch name
In 3224b0f0bb (t1400: prepare for `main` being default branch name,
2020-10-23), we prepared t1400 for a time when the default initial
branch name would be `main`.

However, there is no need to wait that long: let's adjust the test
script to stop relying on a specific initial branch name by setting it
explicitly. This allows us to drop the `PREPARE_FOR_MAIN_BRANCH` prereq
from two test cases.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:41:40 -08:00
Junio C Hamano
50f0439490 diff: correct interaction between --exit-code and -I<pattern>
Just like "git diff -w --exit-code" should exit with 0 when ignoring
whitespace differences results in no changes shown, if ignoring
certain changes with "git diff -I<pattern> --exit-code" result in an
empty patch, we should exit with 0.

The test suite did not cover the interaction between "--exit-code"
and "-w"; add one while adding a new test for "--exit-code" + "-I".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 17:33:26 -08:00
Eric Sunshine
f4698738f9 t/perf: fix test_export() failure with BSD sed
test_perf() runs each test in its own subshell which makes it difficult
to persist variables between tests. test_export() addresses this
shortcoming by grabbing the values of specified variables after a test
runs but before the subshell exits, and writes those values to a file
which is loaded into the environment of subsequent tests.

To grab the values to be persisted, test_export() pipes the output of
the shell's builtin `set` command through `sed` which plucks them out
using a regular expression along the lines of `s/^(var1|var2)/.../p`.
Unfortunately, though, this use of alternation is not portable. For
instance, BSD-lineage `sed` (including macOS `sed`) does not support it
in the default "basic regular expression" mode (BRE). It may be possible
to enable "extended regular expression" mode (ERE) in some cases with
`sed -E`, however, `-E` is neither portable nor part of POSIX.

Fortunately, alternation is unnecessary in this case and can easily be
avoided, so replace it with a series of simple expressions such as
`s/^var1/.../p;s/^var2/.../p`.

While at it, tighten the expressions so they match the variable names
exactly rather than matching prefixes (i.e. use `s/^var1=/.../p`).

If the requirements of test_export() become more complex in the future,
then an alternative would be to replace `sed` with `perl` which supports
alternation on all platforms, however, the simple elimination of
alternation via multiple `sed` expressions suffices for the present.

Reported-by: Sangeeta <sangunb09@gmail.com>
Diagnosed-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 11:00:29 -08:00
Felipe Contreras
c525de335e pull: display default warning only when non-ff
There's no need to display the annoying warning on every pull... only
the ones that are not fast-forward.

The current warning tests still pass, but not because of the arguments
or the configuration, but because they are all fast-forward.

We need to test non-fast-forward situations now.

Suggestions-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:39:42 -08:00
Junio C Hamano
c3b58472be pack-redundant: gauge the usage before proposing its removal
The subcommand is unusably slow and the reason why nobody reports it
as a performance bug is suspected to be the absense of users.  Let's
show a big message that asks the user to tell us that they still
care about the command when an attempt is made to run the command,
with an escape hatch to override it with a command line option.

In a few releases, we may turn it into an error and keep it for a
few more releases before finally removing it (during the whole time,
the plan to remove it would be interrupted by end user raising hand).

Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 14:30:11 -08:00
Junio C Hamano
3fc7fc1c5f Merge branch 'js/t5526-with-no-particular-primary-branch-name'
Test update.

* js/t5526-with-no-particular-primary-branch-name:
  t5526: drop the prereq expecting the default branch name `main`
  t5526: avoid depending on a specific default branch name
2020-12-14 10:21:38 -08:00
Junio C Hamano
a5e74b4baa Merge branch 'jk/check-config-parsing-error-in-upload-pack'
Tighten error checking in the codepath that responds to "git fetch".

* jk/check-config-parsing-error-in-upload-pack:
  upload-pack: propagate return value from object filter config callback
2020-12-14 10:21:37 -08:00
Junio C Hamano
c59b73bef3 Merge branch 'fc/atmark-in-refspec'
"@" sometimes worked (e.g. "git push origin @:there") as a part of
a refspec element, but "git push origin @" did not work, which has
been corrected.

* fc/atmark-in-refspec:
  refspec: make @ a synonym of HEAD
  tests: push: trivial cleanup
  tests: push: improve cleanup of HEAD tests
2020-12-14 10:21:36 -08:00
Junio C Hamano
78abcff222 Merge branch 'dd/help-autocorrect-never'
"git $cmd $args", when $cmd is not a recognised subcommand, by
default tries to see if $cmd is a typo of an existing subcommand
and optionally executes the corrected command if there is only one
possibility, depending on the setting of help.autocorrect; the
users can now disable the whole thing, including the cycles spent
to find a likely typo, by setting the configuration variable to
'never'.

* dd/help-autocorrect-never:
  help.c: help.autocorrect=never means "do not compute suggestions"
2020-12-14 10:21:36 -08:00
Elijah Newren
ac14de13b2 t4058: explore duplicate tree entry handling in a bit more detail
While creating the last commit, I found a number of other cases where
git would segfault when faced with trees that have duplicate entries.
None of these segfaults are in the diffcore-rename code (they all occur
in cache-tree and unpack-trees).  Further, to my knowledge, no one has
ever been adversely affected by these bugs, and given that it has been
15 years and folks have fixed a few other issues with historical
duplicate entries (as noted in the last commit), I am not sure we will
ever run into anyone having problems with these.  So I am not sure these
are worth fixing, but it doesn't hurt to at least document these
failures in the same test file that is concerned with duplicate tree
entries.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
Elijah Newren
5c72261c66 t4058: add more tests and documentation for duplicate tree entry handling
Commit 4d6be03b95 ("diffcore-rename: avoid processing duplicate
destinations", 2015-02-26) added t4058 to demonstrate that a workaround
it added to avoid double frees (namely to just turn off rename detection
when trees had duplicate entries) would indeed avoid segfaults.  The
tests, though, give the impression that the expected diffs are "correct"
when in reality they are just "don't segfault, and do something
semi-reasonable under the circumstances".  Add some notes to make this
clearer.

Also, commit 25d5ea410f ("[PATCH] Redo rename/copy detection logic.",
2005-05-24) added a similar workaround to avoid segfaults, but for
rename_src rather than rename_dst.  I do not see any tests in the
testsuite to cover the collision detection of entries limited to the
source side, so add a couple.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:34:50 -08:00
Johannes Schindelin
675704c74d init: provide useful advice about init.defaultBranch
To give ample warning for users wishing to override Git's the fall-back
for an unconfigured `init.defaultBranch` (in case we decide to change it
in a future Git version), let's introduce some advice that is shown upon
`git init` when that value is not set.

Note: two test cases in Git's test suite want to verify that the
`stderr` output of `git init` is empty. It is now necessary to suppress
the advice, we now do that via the `init.defaultBranch` setting. While
not strictly necessary, we also set this to `false` in
`test_create_repo()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 15:53:51 -08:00
Johannes Schindelin
cfaff3aac8 branch -m: allow renaming a yet-unborn branch
In one of the next commits, we would like to give users some advice
regarding the initial branch name, and how to modify it.

To that end, it would be good if `git branch -m <name>` worked in a
freshly initialized repository without any commits. Let's make it so.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 15:53:50 -08:00
brian m. carlson
fac60b8925 rev-parse: add option for absolute or relative path formatting
git rev-parse has several options which print various paths.  Some of
these paths are printed relative to the current working directory, and
some are absolute.

Normally, this is not a problem, but there are times when one wants
paths entirely in one format or another.  This can be done trivially if
the paths are canonical, but canonicalizing paths is not possible on
some shell scripting environments which lack realpath(1) and also in Go,
which lacks functions that properly canonicalize paths on Windows.

To help out the scripter, let's provide an option which turns most of
the paths printed by git rev-parse to be either relative to the current
working directory or absolute and canonical.  Document which options are
affected and which are not so that users are not confused.

This approach is cleaner and tidier than providing duplicates of
existing options which are either relative or absolute.

Note that if the user needs both forms, it is possible to pass an
additional option in the middle of the command line which changes the
behavior of subsequent operations.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-12 23:35:51 -08:00
Josh Steadmon
610a3fc953 t7900: use --fixed-value in git-maintenance tests
Use --fixed-value in git-config calls in the git-maintenance tests, so
that the tests will continue to work even if the repo path contains
regexp metacharacters.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09 16:25:20 -08:00
Ævar Arnfjörð Bjarmason
058761f1c1 pretty format %(trailers): add a "key_value_separator"
Add a "key_value_separator" option to the "%(trailers)" pretty format,
to go along with the existing "separator" argument. In combination
these two options make it trivial to produce machine-readable (e.g. \0
and \0\0-delimited) format output.

As elaborated on in a previous commit which added "keyonly" it was
needlessly tedious to extract structured data from "%(trailers)"
before the addition of this "key_value_separator" option. As seen by
the test being added here extracting this data now becomes trivial.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09 14:16:42 -08:00
Ævar Arnfjörð Bjarmason
9d87d5ae02 pretty format %(trailers): add a "keyonly"
Add support for a "keyonly". This allows for easier parsing out of the
key and value. Before if you didn't want to make assumptions about how
the key was formatted. You'd need to parse it out as e.g.:

    --pretty=format:'%H%x00%(trailers:separator=%x00%x00)' \
                       '%x00%(trailers:separator=%x00%x00,valueonly)'

And then proceed to deduce keys by looking at those two and
subtracting the value plus the hardcoded ": " separator from the
non-valueonly %(trailers) line. Now it's possible to simply do:

    --pretty=format:'%H%x00%(trailers:separator=%x00%x00,keyonly)' \
                    '%x00%(trailers:separator=%x00%x00,valueonly)'

Which at least reduces it to a state machine where you get N keys and
correlate them with N values. Even better would be to have a way to
change the ": " delimiter to something easily machine-readable (a key
might contain ": " too). A follow-up change will add support for that.

I don't really have a use-case for just "keyonly" myself. I suppose it
would be useful in some cases as "key=*" matches case-insensitively,
so a plain "keyonly" will give you the variants of the keys you
matched. I'm mainly adding it to fix the inconsistency with
"valueonly".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09 14:16:42 -08:00
Ævar Arnfjörð Bjarmason
8b966a0506 pretty-format %(trailers): fix broken standalone "valueonly"
Fix %(trailers:valueonly) being a noop due to on overly eager
optimization in format_trailer_info() which skips custom formatting if
no custom options are given.

When "valueonly" was added in d9b936db52 (pretty: add support for
"valueonly" option in %(trailers), 2019-01-28) we forgot to add it to
the list of options that optimization checks for. See e.g. the
addition of "key" in 250bea0c16 (pretty: allow showing specific
trailers, 2019-01-28) for a similar change where this wasn't missed.

Thus the "valueonly" option in "%(trailers:valueonly)" was a noop and
the output was equivalent to that of a plain "%(trailers)". This
wasn't caught because the tests for it always combined it with other
options.

Fix the bug by adding !opts->value_only to the list. I initially
attempted to make this more future-proof by setting a flag if we got
to ":" in "%(trailers:" in format_commit_one() in pretty.c. However,
"%(trailers:" is also parsed in trailers_atom_parser() in
ref-filter.c.

There is an outstanding patch[1] unify those two, and such a fix, or
other future-proofing, such as changing "process_trailer_options"
flags into a bitfield, would conflict with that effort. Let's instead
do the bare minimum here as this aspect of trailers is being actively
worked on by another series.

Let's also test for a plain "valueonly" without any other options, as
well as "separator". All the other existing options on the pretty.c
path had tests where they were the only option provided. I'm also
keeping a sanity test for "%(trailers:)" being the same as
"%(trailers)". There's no reason to suspect it wouldn't be in the
current implementation, but let's keep it in the interest of black box
testing.

1. https://lore.kernel.org/git/pull.726.git.1599335291.gitgitgadget@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09 14:16:42 -08:00
Peter Kaestle
505a276596 submodules: fix of regression on fetching of non-init subsub-repo
A regression has been introduced by a62387b (submodule.c: fetch in
submodules git directory instead of in worktree, 2018-11-28).

The scenario in which it triggers is when one has a repository with a
submodule inside a submodule like this:
superproject/middle_repo/inner_repo

Person A and B have both a clone of it, while Person B is not working
with the inner_repo and thus does not have it initialized in his working
copy.

Now person A introduces a change to the inner_repo and propagates it
through the middle_repo and the superproject.

Once person A pushed the changes and person B wants to fetch them using
"git fetch" at the superproject level, B's git call will return with
error saying:

Could not access submodule 'inner_repo'
Errors during submodule fetch:
         middle_repo

Expectation is that in this case the inner submodule will be recognized
as uninitialized submodule and skipped by the git fetch command.

This used to work correctly before 'a62387b (submodule.c: fetch in
submodules git directory instead of in worktree, 2018-11-28)'.

Starting with a62387b the code wants to evaluate "is_empty_dir()" inside
.git/modules for a directory only existing in the worktree, delivering
then of course wrong return value.

This patch ensures is_empty_dir() is getting the correct path of the
uninitialized submodule by concatenation of the actual worktree and the
name of the uninitialized submodule.

The first attempt to fix this regression, in 1b7ac4e6d4 (submodules:
fix of regression on fetching of non-init subsub-repo, 2020-11-12), by
simply reverting a62387b, resulted in an infinite loop of submodule
fetches in the simpler case of a recursive fetch of a superproject with
uninitialized submodules, and so this commit was reverted in 7091499bc0
(Revert "submodules: fix of regression on fetching of non-init
subsub-repo", 2020-12-02).
To prevent future breakages, also add a regression test for this
scenario.

Signed-off-by: Peter Kaestle <peter.kaestle@nokia.com>
CC: Junio C Hamano <gitster@pobox.com>
CC: Philippe Blain <levraiphilippeblain@gmail.com>
CC: Ralf Thielow <ralf.thielow@gmail.com>
CC: Eric Sunshine <sunshine@sunshineco.us>
Reviewed-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09 12:32:07 -08:00
Junio C Hamano
75827da103 Merge branch 'fc/random-cleanup'
Random cleanup.

* fc/random-cleanup:
  gitignore: remove entry for git serve
  gitignore: drop duplicate entry for git-sh-i18n
  tests: lib-functions: trivial style cleanups
  test: completion: fix typos
  .gitignore: remove dangling file
  refspec: trivial cleanup
2020-12-08 15:11:21 -08:00
Junio C Hamano
f2a75cb312 Merge branch 'rs/maintenance-run-outside-repo'
"git maintenance run/start/stop" needed to be run in a repository
to hold the lockfile they use, but didn't make sure they are
actually in a repository, which has been corrected.

* rs/maintenance-run-outside-repo:
  t7900: fix typo: "test_execpt_success"
  maintenance: fix SEGFAULT when no repository
2020-12-08 15:11:21 -08:00
Junio C Hamano
8e2def76f7 Merge branch 'nk/perf-fsmonitor-cleanup'
Test clean-up.

* nk/perf-fsmonitor-cleanup:
  perf/fsmonitor: use test_must_be_empty helper
2020-12-08 15:11:20 -08:00
Junio C Hamano
01b8886a62 Merge branch 'js/trace2-session-id'
The transport layer was taught to optionally exchange the session
ID assigned by the trace2 subsystem during fetch/push transactions.

* js/trace2-session-id:
  receive-pack: log received client session ID
  send-pack: advertise session ID in capabilities
  upload-pack, serve: log received client session ID
  fetch-pack: advertise session ID in capabilities
  transport: log received server session ID
  serve: advertise session ID in v2 capabilities
  receive-pack: advertise session ID in v0 capabilities
  upload-pack: advertise session ID in v0 capabilities
  trace2: add a public function for getting the SID
  docs: new transfer.advertiseSID option
  docs: new capability to advertise session IDs
2020-12-08 15:11:20 -08:00
Junio C Hamano
9b3b4adb3f Merge branch 'mt/do-not-use-scld-in-working-tree'
"git apply" adjusted the permission bits of working-tree files and
directories according core.sharedRepository setting by mistake and
for a long time, which has been corrected.

* mt/do-not-use-scld-in-working-tree:
  apply: don't use core.sharedRepository to create working tree files
2020-12-08 15:11:20 -08:00
Junio C Hamano
d702cb9e89 Merge branch 'ds/maintenance-part-3'
"git maintenance" command had trouble working in a directory whose
pathname contained an ERE metacharacter like '+'.

* ds/maintenance-part-3:
  maintenance: use 'git config --fixed-value'
2020-12-08 15:11:19 -08:00
Junio C Hamano
945158016a Merge branch 'ds/maintenance-part-2'
Test fix.

* ds/maintenance-part-2:
  t7900: speed up expensive test
2020-12-08 15:11:19 -08:00
Junio C Hamano
a10e7842ab Merge branch 'ds/config-literal-value'
Various subcommands of "git config" that takes value_regex
learn the "--literal-value" option to take the value_regex option
as a literal string.

* ds/config-literal-value:
  config doc: value-pattern is not necessarily a regexp
  config: implement --fixed-value with --get*
  config: plumb --fixed-value into config API
  config: add --fixed-value option, un-implemented
  t1300: add test for --replace-all with value-pattern
  t1300: test "set all" mode with value-pattern
  config: replace 'value_regex' with 'value_pattern'
  config: convert multi_replace to flags
2020-12-08 15:11:19 -08:00
Junio C Hamano
6bac6a1ef9 Merge branch 'tb/idx-midx-race-fix'
Processes that access packdata while the .idx file gets removed
(e.g. while repacking) did not fail or fall back gracefully as they
could.

* tb/idx-midx-race-fix:
  midx.c: protect against disappearing packs
  packfile.c: protect against disappearing indexes
2020-12-08 15:11:18 -08:00
Junio C Hamano
1bc550effe Merge branch 'ps/update-ref-multi-transaction'
"git update-ref --stdin" learns to take multiple transactions in a
single session.

* ps/update-ref-multi-transaction:
  update-ref: disallow "start" for ongoing transactions
  p1400: use `git-update-ref --stdin` to test multiple transactions
  update-ref: allow creation of multiple transactions
  t1400: avoid touching refs on filesystem
2020-12-08 15:11:17 -08:00
Junio C Hamano
e0d25686e3 Merge branch 'js/add-i-color-fix'
"git add -i" failed to honor custom colors configured to show
patches, which has been corrected.

* js/add-i-color-fix:
  add -i: verify in the tests that colors can be overridden
  add -p: prefer color.diff.context over color.diff.plain
  add -i (Perl version): color header to match the C version
  add -i (built-in): use the same indentation as the Perl version
  add -p (built-in): do not color the progress indicator separately
  add -i (built-in): use correct names to load color.diff.* config
  add -i (built-in): prevent the `reset` "color" from being configured
  add -i: use `reset_color` consistently
  add -p (built-in): imitate `xdl_format_hunk_hdr()` generating hunk headers
  add -i (built-in): send error messages to stderr
  add -i (built-in): do show an error message for incorrect inputs
2020-12-08 15:11:17 -08:00
Derrick Stolee
45f4eeb291 pack-bitmap-write: relax unique revwalk condition
The previous commits improved the bitmap computation process for very
long, linear histories with many refs by removing quadratic growth in
how many objects were walked. The strategy of computing "intermediate
commits" using bitmasks for which refs can reach those commits
partitioned the poset of reachable objects so each part could be walked
exactly once. This was effective for linear histories.

However, there was a (significant) drawback: wide histories with many
refs had an explosion of memory costs to compute the commit bitmasks
during the exploration that discovers these intermediate commits. Since
these wide histories are unlikely to repeat walking objects, the benefit
of walking objects multiple times was not expensive before. But now, the
commit walk *before computing bitmaps* is incredibly expensive.

In an effort to discover a happy medium, this change reduces the walk
for intermediate commits to only the first-parent history. This focuses
the walk on how the histories converge, which still has significant
reduction in repeat object walks. It is still possible to create
quadratic behavior in this version, but it is probably less likely in
realistic data shapes.

Here is some data taken on a fresh clone of the kernel:

             |   runtime (sec)    |   peak heap (GB)   |
             |                    |                    |
             |   from  |   with   |   from  |   with   |
             | scratch | existing | scratch | existing |
  -----------+---------+----------+---------+-----------
    original |  64.044 |   83.241 |   2.088 |    2.194 |
  last patch |  45.049 |   37.624 |   2.267 |    2.334 |
  this patch |  88.478 |   53.218 |   2.157 |    2.224 |

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:49:07 -08:00
Derrick Stolee
089f751360 pack-bitmap-write: build fewer intermediate bitmaps
The bitmap_writer_build() method calls bitmap_builder_init() to
construct a list of commits reachable from the selected commits along
with a "reverse graph". This reverse graph has edges pointing from a
commit to other commits that can reach that commit. After computing a
reachability bitmap for a commit, the values in that bitmap are then
copied to the reachability bitmaps across the edges in the reverse
graph.

We can now relax the role of the reverse graph to greatly reduce the
number of intermediate reachability bitmaps we compute during this
reverse walk. The end result is that we walk objects the same number of
times as before when constructing the reachability bitmaps, but we also
spend much less time copying bits between bitmaps and have much lower
memory pressure in the process.

The core idea is to select a set of "important" commits based on
interactions among the sets of commits reachable from each selected commit.

The first technical concept is to create a new 'commit_mask' member in the
bb_commit struct. Note that the selected commits are provided in an
ordered array. The first thing to do is to mark the ith bit in the
commit_mask for the ith selected commit. As we walk the commit-graph, we
copy the bits in a commit's commit_mask to its parents. At the end of
the walk, the ith bit in the commit_mask for a commit C stores a boolean
representing "The ith selected commit can reach C."

As we walk, we will discover non-selected commits that are important. We
will get into this later, but those important commits must also receive
bit positions, growing the width of the bitmasks as we walk. At the true
end of the walk, the ith bit means "the ith _important_ commit can reach
C."

MAXIMAL COMMITS
---------------

We use a new 'maximal' bit in the bb_commit struct to represent whether
a commit is important or not. The term "maximal" comes from the
partially-ordered set of commits in the commit-graph where C >= P if P
is a parent of C, and then extending the relationship transitively.
Instead of taking the maximal commits across the entire commit-graph, we
instead focus on selecting each commit that is maximal among commits
with the same bits on in their commit_mask. This definition is
important, so let's consider an example.

Suppose we have three selected commits A, B, and C. These are assigned
bitmasks 100, 010, and 001 to start. Each of these can be marked as
maximal immediately because they each will be the uniquely maximal
commit that contains their own bit. Keep in mind that that these commits
may have different bitmasks after the walk; for example, if B can reach
C but A cannot, then the final bitmask for C is 011. Even in these
cases, C would still be a maximal commit among all commits with the
third bit on in their masks.

Now define sets X, Y, and Z to be the sets of commits reachable from A,
B, and C, respectively. The intersections of these sets correspond to
different bitmasks:

 * 100: X - (Y union Z)
 * 010: Y - (X union Z)
 * 001: Z - (X union Y)
 * 110: (X intersect Y) - Z
 * 101: (X intersect Z) - Y
 * 011: (Y intersect Z) - X
 * 111: X intersect Y intersect Z

This can be visualized with the following Hasse diagram:

	100    010    001
         | \  /   \  / |
         |  \/     \/  |
         |  /\     /\  |
         | /  \   /  \ |
        110    101    011
          \___  |  ___/
              \ | /
               111

Some of these bitmasks may not be represented, depending on the topology
of the commit-graph. In fact, we are counting on it, since the number of
possible bitmasks is exponential in the number of selected commits, but
is also limited by the total number of commits. In practice, very few
bitmasks are possible because most commits converge on a common "trunk"
in the commit history.

With this three-bit example, we wish to find commits that are maximal
for each bitmask. How can we identify this as we are walking?

As we walk, we visit a commit C. Since we are walking the commits in
topo-order, we know that C is visited after all of its children are
visited. Thus, when we get C from the revision walk we inspect the
'maximal' property of its bb_data and use that to determine if C is truly
important. Its commit_mask is also nearly final. If C is not one of the
originally-selected commits, then assign a bit position to C (by
incrementing num_maximal) and set that bit on in commit_mask. See
"MULTIPLE MAXIMAL COMMITS" below for more detail on this.

Now that the commit C is known to be maximal or not, consider each
parent P of C. Compute two new values:

 * c_not_p : true if and only if the commit_mask for C contains a bit
             that is not contained in the commit_mask for P.

 * p_not_c : true if and only if the commit_mask for P contains a bit
             that is not contained in the commit_mask for P.

If c_not_p is false, then P already has all of the bits that C would
provide to its commit_mask. In this case, move on to other parents as C
has nothing to contribute to P's state that was not already provided by
other children of P.

We continue with the case that c_not_p is true. This means there are
bits in C's commit_mask to copy to P's commit_mask, so use bitmap_or()
to add those bits.

If p_not_c is also true, then set the maximal bit for P to one. This means
that if no other commit has P as a parent, then P is definitely maximal.
This is because no child had the same bitmask. It is important to think
about the maximal bit for P at this point as a temporary state: "P is
maximal based on current information."

In contrast, if p_not_c is false, then set the maximal bit for P to
zero. Further, clear all reverse_edges for P since any edges that were
previously assigned to P are no longer important. P will gain all
reverse edges based on C.

The final thing we need to do is to update the reverse edges for P.
These reverse edges respresent "which closest maximal commits
contributed bits to my commit_mask?" Since C contributed bits to P's
commit_mask in this case, C must add to the reverse edges of P.

If C is maximal, then C is a 'closest' maximal commit that contributed
bits to P. Add C to P's reverse_edges list.

Otherwise, C has a list of maximal commits that contributed bits to its
bitmask (and this list is exactly one element). Add all of these items
to P's reverse_edges list. Be careful to ignore duplicates here.

After inspecting all parents P for a commit C, we can clear the
commit_mask for C. This reduces the memory load to be limited to the
"width" of the commit graph.

Consider our ABC/XYZ example from earlier and let's inspect the state of
the commits for an interesting bitmask, say 011. Suppose that D is the
only maximal commit with this bitmask (in the first three bits). All
other commits with bitmask 011 have D as the only entry in their
reverse_edges list. D's reverse_edges list contains B and C.

COMPUTING REACHABILITY BITMAPS
------------------------------

Now that we have our definition, let's zoom out and consider what
happens with our new reverse graph when computing reachability bitmaps.
We walk the reverse graph in reverse-topo-order, so we visit commits
with largest commit_masks first. After we compute the reachability
bitmap for a commit C, we push the bits in that bitmap to each commit D
in the reverse edge list for C. Then, when we finally visit D we already
have the bits for everything reachable from maximal commits that D can
reach and we only need to walk the objects in the set-difference.

In our ABC/XYZ example, when we finally walk for the commit A we only
need to walk commits with bitmask equal to A's bitmask. If that bitmask
is 100, then we are only walking commits in X - (Y union Z) because the
bitmap already contains the bits for objects reachable from (X intersect
Y) union (X intersect Z) (i.e. the bits from the reachability bitmaps
for the maximal commits with bitmasks 110 and 101).

The behavior is intended to walk each commit (and the trees that commit
introduces) at most once while allocating and copying fewer reachability
bitmaps. There is one caveat: what happens when there are multiple
maximal commits with the same bitmask, with respect to the initial set
of selected commits?

MULTIPLE MAXIMAL COMMITS
------------------------

Earlier, we mentioned that when we discover a new maximal commit, we
assign a new bit position to that commit and set that bit position to
one for that commit. This is absolutely important for interesting
commit-graphs such as git/git and torvalds/linux. The reason is due to
the existence of "butterflies" in the commit-graph partial order.

Here is an example of four commits forming a butterfly:

   I    J
   |\  /|
   | \/ |
   | /\ |
   |/  \|
   M    N
    \  /
     |/
     Q

Here, I and J both have parents M and N. In general, these do not need
to be exact parent relationships, but reachability relationships. The
most important part is that M and N cannot reach each other, so they are
independent in the partial order. If I had commit_mask 10 and J had
commit_mask 01, then M and N would both be assigned commit_mask 11 and
be maximal commits with the bitmask 11. Then, what happens when M and N
can both reach a commit Q? If Q is also assigned the bitmask 11, then it
is not maximal but is reachable from both M and N.

While this is not necessarily a deal-breaker for our abstract definition
of finding maximal commits according to a given bitmask, we have a few
issues that can come up in our larger picture of constructing
reachability bitmaps.

In particular, if we do not also consider Q to be a "maximal" commit,
then we will walk commits reachable from Q twice: once when computing
the reachability bitmap for M and another time when computing the
reachability bitmap for N. This becomes much worse if the topology
continues this pattern with multiple butterflies.

The solution has already been mentioned: each of M and N are assigned
their own bits to the bitmask and hence they become uniquely maximal for
their bitmasks. Finally, Q also becomes maximal and thus we do not need
to walk its commits multiple times. The final bitmasks for these commits
are as follows:

  I:10       J:01
   |\        /|
   | \ _____/ |
   | /\____   |
   |/      \  |
   M:111    N:1101
        \  /
       Q:1111

Further, Q's reverse edge list is { M, N }, while M and N both have
reverse edge list { I, J }.

PERFORMANCE MEASUREMENTS
------------------------

Now that we've spent a LOT of time on the theory of this algorithm,
let's show that this is actually worth all that effort.

To test the performance, use GIT_TRACE2_PERF=1 when running
'git repack -abd' in a repository with no existing reachability bitmaps.
This avoids any issues with keeping existing bitmaps to skew the
numbers.

Inspect the "building_bitmaps_total" region in the trace2 output to
focus on the portion of work that is affected by this change. Here are
the performance comparisons for a few repositories. The timings are for
the following versions of Git: "multi" is the timing from before any
reverse graph is constructed, where we might perform multiple
traversals. "reverse" is for the previous change where the reverse graph
has every reachable commit.  Finally "maximal" is the version introduced
here where the reverse graph only contains the maximal commits.

      Repository: git/git
           multi: 2.628 sec
         reverse: 2.344 sec
         maximal: 2.047 sec

      Repository: torvalds/linux
           multi: 64.7 sec
         reverse: 205.3 sec
         maximal: 44.7 sec

So in all cases we've not only recovered any time lost to switching to
the reverse-edge algorithm, but we come out ahead of "multi" in all
cases. Likewise, peak heap has gone back to something reasonable:

      Repository: torvalds/linux
           multi: 2.087 GB
         reverse: 3.141 GB
         maximal: 2.288 GB

While I do not have access to full fork networks on GitHub, Peff has run
this algorithm on the chromium/chromium fork network and reported a
change from 3 hours to ~233 seconds. That network is particularly
beneficial for this approach because it has a long, linear history along
with many tags. The "multi" approach was obviously quadratic and the new
approach is linear.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:17 -08:00
Derrick Stolee
1467b9572a t5310: add branch-based checks
The current rev-list tests that check the bitmap data only work on HEAD
instead of multiple branches. Expand the test cases to handle both
'master' and 'other' branches.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:17 -08:00
Jeff King
c5cd749076 t5310: drop size of truncated ewah bitmap
We truncate the .bitmap file to 512 bytes and expect to run into
problems reading an individual ewah file. But this length is somewhat
arbitrary, and just happened to work when the test was added in
9d2e330b17 (ewah_read_mmap: bounds-check mmap reads, 2018-06-14).

An upcoming commit will change the size of the history we create in the
test repo, which will cause this test to fail. We can future-proof it a
bit more by reducing the size of the truncated bitmap file.

Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:15 -08:00
Jeff King
ec6c7b4367 pack-bitmap: bounds-check size of cache extension
A .bitmap file may have a "name hash cache" extension, which puts a
sequence of uint32_t values (one per object) at the end of the file.
When we see a flag indicating this extension, we blindly subtract the
appropriate number of bytes from our available length. However, if the
.bitmap file is too short, we'll underflow our length variable and wrap
around, thinking we have a very large length. This can lead to reading
out-of-bounds bytes while loading individual ewah bitmaps.

We can fix this by checking the number of available bytes when we parse
the header. The existing "truncated bitmap" test is now split into two
tests: one where we don't have this extension at all (and hence actually
do try to read a truncated ewah bitmap) and one where we realize
up-front that we can't even fit in the cache structure. We'll check
stderr in each case to make sure we hit the error we're expecting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:15 -08:00
Johannes Schindelin
469f17d097 t7064: avoid relying on a specific default branch name
To allow us to consider a change in the default behavior of `git init`
where it uses a more inclusive name for the initial branch, we must
first teach the test suite not to rely on a specific default branch
name. In this patch, we teach t7064 that trick.

To that end, we set a specific name for the initial branch. Ideally, we
would simply start out by calling `git branch -M initial-branch`, but
there is a bug in `git branch -M` that does not allow renaming branches
unless they already have commits. This will be fixed in the
`js/init-defaultbranch-advice` topic, and until that time, we use the
equivalent (but less intuitive) `git checkout -f --orphan`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:44:02 -08:00
Sangeeta Jain
8ef9312464 diff: do not show submodule with untracked files as "-dirty"
Git diff reports a submodule directory as -dirty even when there are
only untracked files in the submodule directory. This is inconsistent
with what `git describe --dirty` says when run in the submodule
directory in that state.

Make `--ignore-submodules=untracked` the default for `git diff` when
there is no configuration variable or command line option, so that the
command would not give '-dirty' suffix to a submodule whose working
tree has untracked files, to make it consistent with `git
describe --dirty` that is run in the submodule working tree.

And also make `--ignore-submodules=none` the default for `git status`
so that the user doesn't end up deleting a submodule that has
uncommitted (untracked) files.

Signed-off-by: Sangeeta Jain <sangunb09@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:27:35 -08:00
Josh Steadmon
0a1f2d05d2 t7900: fix typo: "test_execpt_success"
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:17:54 -08:00
Johannes Schindelin
8f19c9fd43 t6300: avoid using the default name of the initial branch
Our test suite currently only passes when `git init` uses the name
`master` for the initial branch. This would stop us from changing the
default branch name.

Let's adjust t6300 so that it does not rely on any specific default
branch name. This trick is done by (force-)renaming the initial branch
to the name `main` in the `setup` and the `:remotename and :remoteref`
test cases, and then replacing all mentions of `master` and `MASTER`
with `main` and `MAIN`, respectively.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-07 10:29:25 -08:00
Ævar Arnfjörð Bjarmason
7c1f79fc16 pretty format %(trailers) test: split a long line
Split a very long line in a test introduced in 0b691d8685 (pretty:
add support for separator option in %(trailers), 2019-01-28). This
makes it easier to read, especially as follow-up commits will copy
this test as a template.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-07 10:23:11 -08:00
Jeff King
3ea922fc8b t0064: make duplicate tests more robust
Our tests for handling duplicates in oid-array provide only a single
duplicate for each number, so our sorted array looks like:

  44 44 55 55 88 88 aa aa

A slightly more interesting test is to have multiple duplicates, which
makes sure that we not only skip the duplicate, but keep skipping until
we are out of the set of matching duplicates.

Unsurprisingly this works just fine, but it's worth beefing up this test
since we're about to change the duplicate-detection code.

Note that we do need to adjust the results on the lookup test, since it
is returning the index of the found item (and now we have more items
before our range, and the range itself is slightly larger, since we'll
accept a match of any element).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-04 13:55:14 -08:00
Jeff King
d9ca6f8d90 t0064: drop sha1 mention from filename
The data type is an oid_array these days, and we are using "test-tool
oid-array", so let's name the test script appropriately.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-04 13:55:14 -08:00
Johannes Schindelin
71f4a9355a t5526: drop the prereq expecting the default branch name main
Initially, we started converting this test script in anticipation for
renaming the default branch name to `main`. To that end, we partially
converted it to accommodate for that default branch name, marking the
now-failing test cases with a prereq that was designed to be fulfilled
once the rename was complete.

However, the effort to move to the branch name `main` needs quite a bit
longer, as it was decided that we need a deprecation phase first.

To avoid keeping t5526 in limbo for such a long time, we just made it
independent of the actual default branch name used by Git. Therefore,
that prereq is no longer necessary, and we can drop it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-04 12:05:19 -08:00
Johannes Schindelin
b618a2d9df t5526: avoid depending on a specific default branch name
While at it, use different default branch names for the three different
repositories involved in the test script: this makes it easier to debug
failures, too (otherwise you have to wonder which `master` branch was
meant: the super project's? The submodule's? The nested submodule's?).

Note: this touches code that was originally modified to prepare for
renaming the default branch name to `main`. This patch side-steps that
effort completely by overriding the initial branch name explicitly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-04 12:05:18 -08:00
Taylor Blau
8d133f500a upload-pack.c: don't free allowed_filters util pointers
To keep track of which object filters are allowed or not, 'git
upload-pack' stores the name of each filter in a string_list, and sets
it ->util pointer to be either 0 or 1, indicating whether it is banned
or allowed.

Later on, we attempt to clear that list, but we incorrectly ask for the
util pointers to be free()'d, too. This behavior (introduced back in
6dd3456a8c (upload-pack.c: allow banning certain object filter(s),
2020-08-03)) leads to an invalid free, and causes us to crash.

In order to trigger this, one needs to fetch from a server that (a) has
at least one object filter allowed, and (b) issue a fetch that contains
a subset of the allowed filters (i.e., we cannot ask for a banned
filter, since this causes us to die() before we hit the bogus
string_list_clear()).

In that case, whatever banned filters exist will cause a noop free()
(since those ->util pointers are set to 0), but the first allowed filter
we try to free will crash us.

We never noticed this in the tests because we didn't have an example of
setting 'uploadPackFilter' configuration variables and then following up
with a valid fetch. The first new 'git clone' prevents further
regression here. For good measure on top, add a test which checks the
same behavior at a tree depth greater than 0.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-03 12:42:33 -08:00
Jeff King
d43a21bdbb upload-pack: propagate return value from object filter config callback
If we encounter an error in parse_filter_object_config(), we'll complain
to stderr but won't actually propagate the return value up the stack.
This is unlike most of our config callbacks, which return the error to
git_config() so it can die (this includes the call just below us to
parse_hide_refs_config(), which can also produce errors).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-03 10:25:13 -08:00
Junio C Hamano
f3e5dcd660 Merge branch 'pk/subsub-fetch-fix'
An earlier attempt to fix "git fetch --recurse-submodules" broke
another use case; revert it until a better fix is found.

* pk/subsub-fetch-fix:
  Revert "submodules: fix of regression on fetching of non-init subsub-repo"
2020-12-03 00:18:07 -08:00
Junio C Hamano
c692e1b673 Merge branch 'pb/pull-rebase-recurse-submodules'
"git pull --rebase --recurse-submodules" checked for local changes
in a wrong range and failed to run correctly when it should.

* pb/pull-rebase-recurse-submodules:
  pull: check for local submodule modifications with the right range
  t5572: describe '--rebase' tests a little more
  t5572: add notes on a peculiar test
  pull --rebase: compute rebase arguments in separate function
2020-12-03 00:18:06 -08:00
Junio C Hamano
7091499bc0 Revert "submodules: fix of regression on fetching of non-init subsub-repo"
This reverts commit 1b7ac4e6d4d490b224f5206af7418ed74e490608; in
<CAN0XMOLiS_8JZKF_wW70BvRRxkDHyUoa=Z3ODtB_Bd6f5Y=7JQ@mail.gmail.com>,
Ralf Thielow reports that "git fetch" with submodule.recurse set can
result in a bogus and infinitely recursive fetching of the same
submodule.
2020-12-02 15:07:14 -08:00
Matheus Tavares
eb3c027e17 apply: don't use core.sharedRepository to create working tree files
core.sharedRepository defines which permissions Git should set when
creating files in $GIT_DIR, so that the repository may be shared with
other users. But (in its current form) the setting shouldn't affect how
files are created in the working tree. This is not respected by apply
and am (which uses apply), when creating leading directories:

$ cat d.patch
 diff --git a/d/f b/d/f
 new file mode 100644
 index 0000000..e69de29

Apply without the setting:
$ umask 0077
$ git apply d.patch
$ ls -ld d
 drwx------

Apply with the setting:
$ umask 0077
$ git -c core.sharedRepository=0770 apply d.patch
$ ls -ld d
 drwxrws---

Only the leading directories are affected. That's because they are
created with safe_create_leading_directories(), which calls
adjust_shared_perm() to set the directories' permissions based on
core.sharedRepository. To fix that, let's introduce a variant of this
function that ignores the setting, and use it in apply. Also add a
regression test and a note in the function documentation about the use
of each variant according to the destination (working tree or git
dir).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-02 14:35:51 -08:00
Jeff King
a0c5ccc1c0 t7900: speed up expensive test
A test marked with EXPENSIVE creates two 2.5GB files and adds them to
the repository. This takes 194s to run on my machine, versus 2s when the
EXPENSIVE prereq isn't set. We can trim this down a bit by doing two
things:

  - use "git commit --quiet" to avoid spending time generating a diff
    summary (this actually only helps for the second commit, but I've
    added it here to both for consistency). This shaves off 8s.

  - set core.compression to 0. We know these files are full of random
    bytes, and so won't compress (that's the point of the test!).
    Spending cycles on zlib is pointless. This shaves off 122s.

After this, my total time to run the script is 64s. That won't help
normal runs without GIT_TEST_LONG set, of course, but it's easy enough
to do.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-02 14:27:08 -08:00
Elijah Newren
ba359fd507 stash: fix stash application in sparse-checkouts
sparse-checkouts are built on the patterns in the
$GIT_DIR/info/sparse-checkout file, where commands have modified
behavior for paths that do not match those patterns.  The differences in
behavior, as far as the bugs concerned here, fall into three different
categories (with git subcommands that fall into each category listed):

  * commands that only look at files matching the patterns:
      * status
      * diff
      * clean
      * update-index
  * commands that remove files from the working tree that do not match
    the patterns, and restore files that do match them:
      * read-tree
      * switch
      * checkout
      * reset (--hard)
  * commands that omit writing files to the working tree that do not
    match the patterns, unless those files are not clean:
      * merge
      * rebase
      * cherry-pick
      * revert

There are some caveats above, e.g. a plain `git diff` ignores files
outside the sparsity patterns but will show diffs for paths outside the
sparsity patterns when revision arguments are passed.  (Technically,
diff is treating the sparse paths as matching HEAD.)  So, there is some
internal inconsistency among these commands.  There are also additional
commands that should behave differently in the face of sparse-checkouts,
as the sparse-checkout documentation alludes to, but the above is
sufficient for me to explain how `git stash` is affected.

What is relevant here is that logically 'stash' should behave like a
merge; it three-way merges the changes the user had in progress at stash
creation time, the HEAD at the time the stash was created, and the
current HEAD, in order to get the stashed changes applied to the current
branch.  However, this simplistic view doesn't quite work in practice,
because stash tweaks it a bit due to two factors: (1) flags like
--keep-index and --include-untracked (why we used two different verbs,
'keep' and 'include', is a rant for another day) modify what should be
staged at the end and include more things that should be quasi-merged,
(2) stash generally wants changes to NOT be staged.  It only provides
exceptions when (a) some of the changes had conflicts and thus we want
to use stages to denote the clean merges and higher order stages to
mark the conflicts, or (b) if there is a brand new file we don't want
it to become untracked.

stash has traditionally gotten this special behavior by first doing a
merge, and then when it's clean, applying a pipeline of commands to
modify the result.  This series of commands for
unstaging-non-newly-added-files came from the following commands:

    git diff-index --cached --name-only --diff-filter=A $CTREE >"$a"
    git read-tree --reset $CTREE
    git update-index --add --stdin <"$a"
    rm -f "$a"

Looking back at the different types of special sparsity handling listed
at the beginning of this message, you may note that we have at least one
of each type covered here: merge, diff-index, and read-tree.  The weird
mix-and-match led to 3 different bugs:

(1) If a path merged cleanly and it didn't match the sparsity patterns,
the merge backend would know to avoid writing it to the working tree and
keep the SKIP_WORKTREE bit, simply only updating it in the index.
Unfortunately, the subsequent commands would essentially undo the
changes in the index and thus simply toss the changes altogether since
there was nothing left in the working tree.  This means the stash is
only partially applied.

(2) If a path existed in the worktree before `git stash apply` despite
having the SKIP_WORKTREE bit set, then the `git read-tree --reset` would
print an error message of the form
      error: Entry 'modified' not uptodate. Cannot merge.
and cause stash to abort early.

(3) If there was a brand new file added by the stash, then the
diff-index command would save that pathname to the temporary file, the
read-tree --reset would remove it from the index, and the update-index
command would barf due to no such file being present in the working
copy; it would print a message of the form:
      error: NEWFILE: does not exist and --remove not passed
      fatal: Unable to process path NEWFILE
and then cause stash to abort early.

Basically, the whole idea of unstage-unless-brand-new requires special
care when you are dealing with a sparse-checkout.  Fix these problems
by applying the following simple rule:

  When we unstage files, if they have the SKIP_WORKTREE bit set,
  clear that bit and write the file out to the working directory.

  (*) If there's already a file present in the way, rename it first.

This fixes all three problems in t7012.13 and allows us to mark it as
passing.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-01 14:39:04 -08:00
Elijah Newren
b34ab4a43b stash: remove unnecessary process forking
When stash was converted from shell to a builtin, it merely
transliterated the forking of various git commands from shell to a C
program that would fork the same commands.  Some of those were converted
over to actual library calls, but much of the pipeline-of-commands
design still remains.  Fix some of this by replacing the portion
corresponding to

    git diff-index --cached --name-only --diff-filter=A $CTREE >"$a"
    git read-tree --reset $CTREE
    git update-index --add --stdin <"$a"
    rm -f "$a"

into a library function that does the same thing.  (The read-tree
--reset was already partially converted over to a library call, but as
an independent piece.)  Note here that this came after a merge operation
was performed.  The merge machinery always stages anything that cleanly
merges, and the above code only runs if there are no conflicts.  Its
purpose is to make it so that when there are no conflicts, all the
changes from the stash are unstaged.  However, that causes brand new
files from the stash to become untracked, so the code above first saves
those files off and then re-adds them afterwards.

We replace the whole series of commands with a simple function that will
unstage files that are not newly added.  This doesn't fix any bugs in
the usage of these commands, it simply matches the existing behavior but
makes it into a single atomic operation that we can then operate on as a
whole.  A subsequent commit will take advantage of this to fix issues
with these commands in sparse-checkouts.

This conversion incidentally fixes t3906.1, because the separate
update-index process would die with the following error messages:
    error: uninitialized_sub: is a directory - add files inside instead
    fatal: Unable to process path uninitialized_sub
The unstaging of the directory as a submodule meant it was no longer
tracked, and thus as an uninitialized directory it could not be added
back using `git update-index --add`, thus resulting in this error and
early abort.  Most of the submodule tests in 3906 continue to fail after
this change, this change was just enough to push the first of those
tests to success.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-01 14:39:04 -08:00
Elijah Newren
a31e48d394 t7012: add a testcase demonstrating stash apply bugs in sparse checkouts
Applying stashes in sparse-checkouts, particularly when the patterns
used to define the sparseness have changed between when the stash was
created and when it is applied, has a number of bugs.  The primary
problem is that stashes are sometimes only partially applied.  In most
such cases, it does so silently without any warning or error being
displayed and with 0 exit status.

There are, however, a few cases when non-translated error messages are
shown and the stash application aborts early.  The first is when there
are files present despite the SKIP_WORKTREE bit being set, in which case
the error message shown is:

    error: Entry 'PATHNAME' not uptodate. Cannot merge.

The other situation is when a stash contains new files to add to the
working tree; in this case, the code aborts early but still has the
stash partially applied, and shows the following error message:

    error: NEWFILE: does not exist and --remove not passed
    fatal: Unable to process path NEWFILE

Add a test that can trigger all three of these problems.  Have it
carefully check that the working copy and SKIP_WORKTREE bits are as
expected after the stash application.  The test is currently marked as
expected to fail, but subsequent commits will implement the fixes and
toggle the expectation.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-01 14:39:04 -08:00
Felipe Contreras
1ab7e00e24 tests: lib-functions: trivial style cleanups
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-01 10:31:30 -08:00
Felipe Contreras
b64b43d2f2 test: completion: fix typos
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-01 10:31:30 -08:00
Junio C Hamano
9f41d09888 Merge branch 'js/t3404-master-to-primary'
A test script got cleaned up and then made not to depend on the
value of init.defaultBranch.

* js/t3404-master-to-primary:
  t3404: do not depend on any specific default branch name
2020-11-30 14:49:44 -08:00
Junio C Hamano
e082a85708 Merge branch 'na/notes-displayref-is-not-boolean'
Config parser fix for "git notes".

* na/notes-displayref-is-not-boolean:
  t3301: test proper exit response to no-value notes.displayRef.
  notes.c: fix a segfault in notes_display_config()
2020-11-30 14:49:44 -08:00
Junio C Hamano
124250108f Merge branch 'js/t1309-master-to-topic'
Test preparation.

* js/t1309-master-to-topic:
  t1309: use a neutral branch name in the `onbranch` test cases
2020-11-30 14:49:42 -08:00
Junio C Hamano
290c94085b Merge branch 'js/pull-rebase-use-advise'
UI improvement.

* js/pull-rebase-use-advise:
  pull: colorize the hint about setting `pull.rebase`
2020-11-30 14:49:42 -08:00
Junio C Hamano
376b4cc420 Merge branch 'js/t4015-wo-master'
A test script got cleaned up not to depend on the value of
init.defaultBranch.

* js/t4015-wo-master:
  t4015: let the test pass with any default branch name
2020-11-30 14:49:41 -08:00
Junio C Hamano
26d0286103 Merge branch 'js/t3040-cleanup'
Cleanup.

* js/t3040-cleanup:
  t3040: remove stale note
2020-11-30 14:49:41 -08:00
Junio C Hamano
39f95df236 Merge branch 'js/t2106-cleanup'
A test script got cleaned up and then made not to depend on the
value of init.defaultBranch.

* js/t2106-cleanup:
  t2106: ensure that the checkout fails for the expected reason
  t2106: make test independent of the current main branch name
  t2106: adjust style to the current conventions
2020-11-30 14:49:41 -08:00
Felipe Contreras
374fbaef3d refspec: make @ a synonym of HEAD
Since commit 9ba89f484e git learned how to push to a remote branch using
the source @, for example:

  git push origin @:master

However, if the right-hand side is missing, the push fails:

  git push origin @

It is obvious what is the desired behavior, and allowing the push makes
things more consistent.

Additionally, @:master now has the same semantics as HEAD:master.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-30 13:57:55 -08:00
Felipe Contreras
e7f80eafd1 tests: push: trivial cleanup
No need to do two checkouts.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-30 13:57:54 -08:00
Felipe Contreras
12a30a3ea6 tests: push: improve cleanup of HEAD tests
So that we are not left in an inconsistent state between them.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-30 13:57:54 -08:00
Nipunn Koorapati
36fa907d7a perf/fsmonitor: use test_must_be_empty helper
Simplify test and make error messages more clear here.
Per feedback from Junio in
33226af42b (t/perf/fsmonitor: improve error message if typoing hook
name, 2020-10-26)

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-30 13:51:43 -08:00
Rafael Silva
e72f7defc4 maintenance: fix SEGFAULT when no repository
The "git maintenance run" and "git maintenance start/stop" commands
holds a file-based lock at the .git/maintenance.lock and
.git/schedule.lock respectively. These locks are used to ensure only
one maintenance process is executed at the time as both operations
involves writing data into the git repository.

The path to the lock file is built using
"the_repository->objects->odb->path" that results in SEGFAULT when we
have no repository available as "the_repository->objects->odb" is
set to NULL.

Let's teach maintenance command to use RUN_SETUP option that will
provide the validation and fail when running outside of a repository.
Hence fixing the SEGFAULT for all three operations and making the
behaviour consistent across all subcommands.

Setting the RUN_SETUP also provides the same protection for all
subcommands given that the "register" and "unregister" also requires to
be executed inside a repository.

Furthermore let's remove the local validation implemented by the
"register" and "unregister" as this will not be required anymore with
the new option.

Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-30 13:44:15 -08:00
Junio C Hamano
7bd645e21d Merge branch 'sg/tests-prereq'
A lazily defined test prerequisite can now be defined in terms of
another lazily defined test prerequisite.

* sg/tests-prereq:
  tests: fix description of 'test_set_prereq'
  tests: make sure nested lazy prereqs work reliably
2020-11-25 15:24:54 -08:00
Junio C Hamano
d302170046 Merge branch 'sg/t5310-jgit-wants-sha1'
Since jgit does not yet work with SHA-256 repositories, mark the
tests that uses it not to run unless we are testing with ShA-1
repositories.

* sg/t5310-jgit-wants-sha1:
  t5310-pack-bitmaps: skip JGit tests with SHA256
2020-11-25 15:24:53 -08:00
Junio C Hamano
d627bf6039 Merge branch 'pk/subsub-fetch-fix'
"git fetch" did not work correctly with nested submodules where the
innermost submodule that is not of interest got updated in the
upstream, which has been corrected.

* pk/subsub-fetch-fix:
  submodules: fix of regression on fetching of non-init subsub-repo
2020-11-25 15:24:52 -08:00
Junio C Hamano
8f8f10ac09 Merge branch 'jx/t5411-flake-fix'
The exchange between receive-pack and proc-receive hook did not
carefully check for errors.

* jx/t5411-flake-fix:
  receive-pack: use default version 0 for proc-receive
  receive-pack: gently write messages to proc-receive
  t5411: new helper filter_out_user_friendly_and_stable_output
2020-11-25 15:24:52 -08:00
Junio C Hamano
fd6445a0b8 Merge branch 'fc/bash-completion-alias-of-alias'
The command line completion script (in contrib/) learned to expand
commands that are alias of alias.

* fc/bash-completion-alias-of-alias:
  completion: bash: improve alias loop detection
  completion: bash: check for alias loop
  completion: bash: support recursive aliases
2020-11-25 15:24:51 -08:00
Derrick Stolee
483a6d9b5d maintenance: use 'git config --fixed-value'
When a repository's leading directories contain regex metacharacters,
the config calls for 'git maintenance register' and 'git maintenance
unregister' are not careful enough. Use the new --fixed-value option
to direct the config machinery to use exact string matches. This is a
more robust option than escaping these arguments in a piecemeal fashion.

For the test, require that we are not running on Windows since the '+'
and '*' characters are not allowed on that filesystem.

Reported-by: Emily Shaffer <emilyshaffer@google.com>
Reported-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 15:04:55 -08:00
Derrick Stolee
3f1bae1dc3 config: implement --fixed-value with --get*
The config builtin does its own regex matching of values for the --get,
--get-all, and --get-regexp modes. Plumb the existing 'flags' parameter
to the get_value() method so we can initialize the value-pattern argument
as a fixed string instead of a regex pattern.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
c90702a1f6 config: plumb --fixed-value into config API
The git_config_set_multivar_in_file_gently() and related methods now
take a 'flags' bitfield, so add a new bit representing the --fixed-value
option from 'git config'. This alters the purpose of the value_pattern
parameter to be an exact string match. This requires some initialization
changes in git_config_set_multivar_in_file_gently() and a new strcmp()
call in the matches() method.

The new CONFIG_FLAGS_FIXED_VALUE flag is initialized in builtin/config.c
based on the --fixed-value option, and that needs to be updated in
several callers.

This patch only affects some of the modes of 'git config', and the rest
will be completed in the next change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
fda43942d7 config: add --fixed-value option, un-implemented
The 'git config' builtin takes a 'value-pattern' parameter for several
actions. This can cause confusion when expecting exact value matches
instead of regex matches, especially when the input string contains
metacharacters. While callers can escape the patterns themselves, it
would be more friendly to allow an argument to disable the pattern
matching in favor of an exact string match.

Add a new '--fixed-value' option that does not currently change the
behavior. The implementation will be filled in by later changes for
each appropriate action. For now, check and test that --fixed-value
will abort the command when included with an incompatible action or
without a 'value-pattern' argument.

The name '--fixed-value' was chosen over something simpler like
'--fixed' because some commands allow regular expressions on the
key in addition to the value.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
d15671943e t1300: add test for --replace-all with value-pattern
The --replace-all option was added in 4ddba79d (git-config-set: add more
options) but was not tested along with the 'value-pattern' parameter.
Since we will be updating this option to optionally treat 'value-pattern'
as a fixed string, let's add a test here that documents the current
behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
2076dba281 t1300: test "set all" mode with value-pattern
Without additional modifiers, 'git config <key> <value>' attempts
to set a single value in the .git/config file. When the
value-pattern parameter is supplied, this command behaves in a
non-trivial manner.

Consider 'git config <key> <value> <value-pattern>'. The expected
behavior is as follows:

1. If there are multiple existing values that match 'value-pattern',
   then the command fails. Users should use --replace-all instead.

2. If there is no existing values match 'value-pattern', then the
   'key=value' pair is appended, making this 'key' a multi-valued
   config setting.

3. If there is one existing value that matches 'value-pattern', then
   the new config has one entry where 'key=value'.

Add a test that demonstrates these options. Break from the existing
pattern in t1300-config.sh to use 'git config --file=<file>' instead of
modifying .git/config directly to prevent possibly incompatible repo
states. Also use 'git config --file=<file> --list' for config state
comparison instead of the config file format. This makes the tests
more readable.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Taylor Blau
506ec2fbda midx.c: protect against disappearing packs
When a packed object is stored in a multi-pack index, but that pack has
racily gone away, the MIDX code simply calls die(), when it could be
returning an error to the caller, which would in turn lead to
re-scanning the pack directory.

A pack can racily disappear, for example, due to a simultaneous 'git
repack -ad',

You can also reproduce this with two terminals, where one is running:

    git init
    while true; do
      git commit -q --allow-empty -m foo
      git repack -ad
      git multi-pack-index write
    done

(in effect, constantly writing new MIDXs), and the other is running:

    obj=$(git rev-parse HEAD)
    while true; do
      echo $obj | git cat-file --batch-check='%(objectsize:disk)' || break
    done

That will sometimes hit the error preparing packfile from
multi-pack-index message, which this patch fixes.

Right now, that path to discovering a missing pack looks something like
'find_pack_entry()' calling 'fill_midx_entry()' and eventually making
its way to call 'nth_midxed_pack_entry()'.

'nth_midxed_pack_entry()' already checks 'is_pack_valid()' and
propagates an error if the pack is invalid. So, this works if the pack
has gone away between calling 'prepare_midx_pack()' and before calling
'is_pack_valid()', but not if it disappears before then.

Catch the case where the pack has already disappeared before
'prepare_midx_pack()' by returning an error in that case, too.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 13:15:56 -08:00
Taylor Blau
c8a45eb66e packfile.c: protect against disappearing indexes
In 17c35c8969 (packfile: skip loading index if in multi-pack-index,
2018-07-12) we stopped loading the .idx file for packs that are
contained within a multi-pack index.

This saves us the effort of loading an .idx and doing some lightweight
validity checks by way of 'packfile.c:load_idx()', but introduces a race
between processes that need to load the index (e.g., to generate a
reverse index) and processes that can delete the index.

For example, running the following in your shell:

    $ git init repo && cd repo
    $ git commit --allow-empty -m 'base'
    $ git repack -ad && git multi-pack-index write

followed by:

    $ rm -f .git/objects/pack/pack-*.idx
    $ git rev-parse HEAD | git cat-file --batch-check='%(objectsize:disk)'

will result in a segfault prior to this patch. What's happening here is
that we notice that the pack is in the multi-pack index, and so don't
check that it still has a .idx. When we then try and load that index to
generate a reverse index, we don't have it, so the call to
'find_pack_revindex()' in 'packfile.c:packed_object_info()' returns
NULL, and then dereferencing it causes a segfault.

Of course, we don't ever expect someone to remove the index file by
hand, or to be in a state where we never wrote it to begin with (yet
find that pack in the multi-pack-index). But, this can happen in a
timing race with 'git repack -ad', which removes all existing packs
after writing a new pack containing all of their objects.

Avoid this by reverting the hunk of 17c35c8969 which stops loading the
index when the pack is contained in a MIDX. This makes the latter half
of 17c35c8969 useless, since we'll always have a non-NULL
'p->index_data', in which case that if statement isn't guarding
anything.

These two together effectively revert 17c35c8969, and avoid the race
explained above.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 13:15:49 -08:00
Drew DeVault
644bb953ce help.c: help.autocorrect=never means "do not compute suggestions"
While help.autocorrect can be set to 0 to decline auto-execution of
possibly mistyped commands, it still spends cycles to compute the
suggestions, and it wastes screen real estate.

Update help.autocorrect to accept the string "never" to just exit
with error upon mistyped commands to help users who prefer to never
see suggested corrections at all.

While at it, introduce "immediate" as a more readable way to
immediately execute the auto-corrected command, which can be done
with negative value.

Signed-off-by: Drew DeVault <sir@cmpwn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 13:02:15 -08:00
Johannes Schindelin
9c8509a4e3 t3404: do not depend on any specific default branch name
Now that we can override the default branch name in the tests via
`GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME`, we should avoid expecting a
particular hard-coded name.

So let's rename the initial branch immediately to `primary` and work
with that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-24 13:17:06 -08:00
Derrick Stolee
31345d5545 maintenance: extract platform-specific scheduling
The existing schedule mechanism using 'cron' is supported by POSIX
platforms, but not Windows. It also works slightly differently on
macOS to significant detriment of the user experience. To allow for
new implementations on these platforms, extract a method that
performs the platform-specific scheduling mechanism. This will be
swapped at compile time with new implementations on specialized
platforms.

As we add this generality, rename GIT_TEST_CRONTAB to
GIT_TEST_MAINT_SCHEDULER. Further, this variable is now parsed as
"<scheduler>:<command>" so we can test platform-specific scheduling
logic even when not on the correct platform. By specifying the
<scheduler> in this string, we will be able to test all three sets of
Git logic from a Linux machine.

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-24 13:02:29 -08:00
Nate Avers
45fef1599a t3301: test proper exit response to no-value notes.displayRef.
Signed-off-by: Nate Avers <nate@roosteregg.cc>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-23 10:42:25 -08:00
Junio C Hamano
bf0a430f70 Merge branch 'en/strmap'
A specialization of hashmap that uses a string as key has been
introduced.  Hopefully it will see wider use over time.

* en/strmap:
  shortlog: use strset from strmap.h
  Use new HASHMAP_INIT macro to simplify hashmap initialization
  strmap: take advantage of FLEXPTR_ALLOC_STR when relevant
  strmap: enable allocations to come from a mem_pool
  strmap: add a strset sub-type
  strmap: split create_entry() out of strmap_put()
  strmap: add functions facilitating use as a string->int map
  strmap: enable faster clearing and reusing of strmaps
  strmap: add more utility functions
  strmap: new utility functions
  hashmap: provide deallocation function names
  hashmap: introduce a new hashmap_partial_clear()
  hashmap: allow re-use after hashmap_free()
  hashmap: adjust spacing to fix argument alignment
  hashmap: add usage documentation explaining hashmap_free[_entries]()
2020-11-21 15:14:38 -08:00