Commit Graph

17927 Commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason
dcc0a86f2f show tests: add test for "git show <tree>"
Add missing tests for showing a tree with "git show". Let's test for
showing a tree, two trees, and that doing so doesn't recurse.

The only tests for this code added in 5d7eeee2ac (git-show: grok
blobs, trees and tags, too, 2006-12-14) were the tests in
t7701-repack-unpack-unreachable.sh added in ccc1297226 (repack:
modify behavior of -A option to leave unreferenced objects unpacked,
2008-05-09).

Let's add this common mode of operation to the "show" tests
themselves. It's more obvious, and the tests in
t7701-repack-unpack-unreachable.sh happily pass if we start buggily
emitting trees recursively.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:25 -07:00
Elijah Newren
f3b964a07e Add testing with merge-ort merge strategy
In preparation for switching from merge-recursive to merge-ort as the
default strategy, have the testsuite default to running with merge-ort.
Keep coverage of the recursive backend by having the linux-gcc job run
with it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
Elijah Newren
259490e572 t6423: mark remaining expected failure under merge-ort as such
When we started on merge-ort, thousands of tests failed when run with
the GIT_TEST_MERGE_ALGORITHM=ort flag; with so many, it didn't make
sense to flip all their test expectations.  The ones in t6409, t6418,
and the submodule tests are being handled by an independent in-flight
topic ("Complete merge-ort implemenation...almost").  The ones in
t6423 were left out of the other series because other ongoing series
that this commit depends upon were addressing those.  Now that we only
have one remaining test failure in t6423, let's mark it as such.

This remaining test will be fixed by a future optimization series, but
since merge-recursive doesn't pass this test either, passing it is not
necessary for declaring merge-ort ready for general use.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
Elijah Newren
aa2faac03a t: mark several submodule merging tests as fixed under merge-ort
merge-ort handles submodules (and directory/file conflicts in general)
differently than merge-recursive does; it basically puts all the special
handling for different filetypes into one place in the codebase instead
of needing special handling for different filetypes in many different
code paths.  This one code path in merge-ort could perhaps use some work
still (there are still test_expect_failure cases in the testsuite), but
it passes all the tests that merge-recursive does as well as 12
additional ones that merge-recursive fails.  Mark those 12 tests as
test_expect_success under merge-ort.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
Elijah Newren
66b209b86a merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
When merge conflicts occur in paths removed by a sparse-checkout, we
need to unsparsify those paths (clear the SKIP_WORKTREE bit), and write
out the conflicted file to the working copy.  In the very unlikely case
that someone manually put a file into the working copy at the location
of the SKIP_WORKTREE file, we need to avoid overwriting whatever edits
they have made and move that file to a different location first.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
Elijah Newren
8ddc20b896 t6428: new test for SKIP_WORKTREE handling and conflicts
If there is a conflict during a merge for a SKIP_WORKTREE entry, we
expect that file to be written to the working copy and have the
SKIP_WORKTREE bit cleared in the index.  If the user had manually
created a file in the working tree despite SKIP_WORKTREE being set, we
do not want to clobber their changes to that file, but want to move it
out of the way.  Add tests that check for these behaviors.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
Junio C Hamano
cc930b7472 Merge branch 'jt/clone-unborn-head'
Test fix.

* jt/clone-unborn-head:
  t5606: run clone branch name test with protocol v2
2021-03-19 15:25:38 -07:00
Junio C Hamano
35381b13da Merge branch 'jk/bisect-peel-tag-fix'
"git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well.  This regression
has been corrected.

* jk/bisect-peel-tag-fix:
  bisect: peel annotated tags to commits
2021-03-19 15:25:37 -07:00
Junio C Hamano
eabacfd9cb Merge branch 'jc/calloc-fix'
Code clean-up.

* jc/calloc-fix:
  xcalloc: use CALLOC_ARRAY() when applicable
2021-03-19 15:25:37 -07:00
Taylor Blau
14e7b8344f builtin/pack-objects.c: ignore missing links with --stdin-packs
When 'git pack-objects --stdin-packs' encounters a commit in a pack, it
marks it as a starting point of a best-effort reachability traversal
that is used to populate the name-hash of the objects listed in the
given packs.

The traversal expects that it should be able to walk the ancestors of
all commits in a pack without issue. Ordinarily this is the case, but it
is possible to having missing parents from an unreachable part of the
repository. In that case, we'd consider any missing objects in the
unreachable portion of the graph to be junk.

This should be handled gracefully: since the traversal is best-effort
(i.e., we don't strictly need to fill in all of the name-hash fields),
we should simply ignore any missing links.

This patch does that (by setting the 'ignore_missing_links' bit on the
rev_info struct), and ensures we don't regress in the future by adding a
test which demonstrates this case.

It is a little over-eager, since it will also ignore missing links in
reachable parts of the packs (which would indicate a corrupted
repository), but '--stdin-packs' is explicitly *not* about reachability.
So this step isn't making anything worse for a repository which contains
packs missing reachable objects (since we never drop objects with
'--stdin-packs').

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19 11:19:29 -07:00
Jeff King
27d578d904 t: annotate !PTHREADS tests with !FAIL_PREREQS
Some tests in t5300 and t7810 expect us to complain about a "--threads"
argument when Git is compiled without pthread support. Running these
under GIT_TEST_FAIL_PREREQS produces a confusing failure: we pretend to
the tests that there is no pthread support, so they expect the warning,
but of course the actual build is perfectly happy to respect the
--threads argument.

We never noticed before the recent a926c4b904 (tests: remove most uses
of C_LOCALE_OUTPUT, 2021-02-11), because the tests also were marked as
requiring the C_LOCALE_OUTPUT prerequisite. Which means they'd never
have run in FAIL_PREREQS mode, since it would always pretend that the
locale prereq was not satisfied.

These tests can't possibly work in this mode; it is a mismatch between
what the tests expect and what the build was told to do. So let's just
mark them to be skipped, using the special prereq introduced by
dfe1a17df9 (tests: add a special setup where prerequisites fail,
2019-05-13).

Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 14:17:30 -07:00
Nipunn Koorapati
7e5aa13d2c fsmonitor: add perf test for git diff HEAD
Update the xargs call so that if your large repo contains
symlinks, test-tool chmtime failure does not end the script.

On Linux
Test                                                          this tree           upstream/master
---------------------------------------------------------------------------------------------------------
7519.4: status (fsmonitor=fsmonitor-watchman)                 0.52(0.43+0.10)     0.53(0.49+0.05) +1.9%
7519.5: status -uno (fsmonitor=fsmonitor-watchman)            0.21(0.15+0.07)     0.22(0.13+0.09) +4.8%
7519.6: status -uall (fsmonitor=fsmonitor-watchman)           1.65(0.93+0.71)     1.69(1.03+0.65) +2.4%
7519.7: status (dirty) (fsmonitor=fsmonitor-watchman)         11.99(11.34+1.58)   11.95(11.02+1.79) -0.3%
7519.8: diff (fsmonitor=fsmonitor-watchman)                   0.25(0.17+0.26)     0.25(0.18+0.26) +0.0%
7519.9: diff HEAD (fsmonitor=fsmonitor-watchman)              0.39(0.25+0.34)     0.89(0.35+0.74) +128.2%
7519.10: diff -- 0_files (fsmonitor=fsmonitor-watchman)       0.16(0.13+0.04)     0.16(0.12+0.05) +0.0%
7519.11: diff -- 10_files (fsmonitor=fsmonitor-watchman)      0.16(0.12+0.05)     0.16(0.12+0.05) +0.0%
7519.12: diff -- 100_files (fsmonitor=fsmonitor-watchman)     0.16(0.12+0.05)     0.16(0.12+0.05) +0.0%
7519.13: diff -- 1000_files (fsmonitor=fsmonitor-watchman)    0.16(0.11+0.06)     0.16(0.12+0.05) +0.0%
7519.14: diff -- 10000_files (fsmonitor=fsmonitor-watchman)   0.18(0.13+0.06)     0.17(0.10+0.08) -5.6%
7519.15: add (fsmonitor=fsmonitor-watchman)                   2.25(1.53+0.68)     2.25(1.47+0.74) +0.0%
7519.18: status (fsmonitor=disabled)                          0.88(0.73+1.03)     0.89(0.67+1.08) +1.1%
7519.19: status -uno (fsmonitor=disabled)                     0.45(0.43+0.89)     0.45(0.34+0.98) +0.0%
7519.20: status -uall (fsmonitor=disabled)                    1.88(1.16+1.58)     1.88(1.22+1.51) +0.0%
7519.21: status (dirty) (fsmonitor=disabled)                  7.53(7.05+2.11)     7.53(6.98+2.04) +0.0%
7519.22: diff (fsmonitor=disabled)                            0.42(0.37+0.92)     0.42(0.38+0.91) +0.0%
7519.23: diff HEAD (fsmonitor=disabled)                       0.44(0.41+0.90)     0.44(0.40+0.91) +0.0%
7519.24: diff -- 0_files (fsmonitor=disabled)                 0.13(0.09+0.05)     0.13(0.09+0.05) +0.0%
7519.25: diff -- 10_files (fsmonitor=disabled)                0.13(0.10+0.04)     0.13(0.10+0.04) +0.0%
7519.26: diff -- 100_files (fsmonitor=disabled)               0.13(0.09+0.05)     0.13(0.10+0.04) +0.0%
7519.27: diff -- 1000_files (fsmonitor=disabled)              0.13(0.09+0.06)     0.13(0.09+0.05) +0.0%
7519.28: diff -- 10000_files (fsmonitor=disabled)             0.14(0.11+0.05)     0.14(0.10+0.05) +0.0%
7519.29: add (fsmonitor=disabled)                             2.43(1.61+1.64)     2.43(1.69+1.57) +0.0%

On linux (2.29.2 vs w/ this patch):
nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff 2>&1 | grep lstat
  0.04    0.000063           3        20         6 lstat
nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff HEAD 2>&1 | grep lstat
 94.98    5.242262          10    523783        13 lstat
nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff 2>&1 | grep lstat
  0.38    0.000032           5         7         3 lstat
nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep lstat
 99.44    0.741892           9     81634        10 lstat

On mac (2.29.2 vs w/ this patch):
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff 2>&1 | grep "^lstat64 "
lstat64                                         8
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff HEAD 2>&1 | grep "^lstat64 "
lstat64                                    120242
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff 2>&1 | grep "^lstat64 "
lstat64                                         4
nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff HEAD 2>&1 | grep "^lstat64 "
lstat64                                      4497

There are still a bunch of lstats - on directories, but not every file. Progress!

Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 13:31:14 -07:00
Matheus Tavares
fab78a0c3d checkout: don't follow symlinks when removing entries
At 1d718a5108 ("do not overwrite untracked symlinks", 2011-02-20),
symlink.c:check_leading_path() started returning different codes for
FL_ENOENT and FL_SYMLINK. But one of its callers, unlink_entry(), was
not adjusted for this change, so it started to follow symlinks on the
leading path of to-be-removed entries. Fix that and add a regression
test.

Note that since 1d718a5108 check_leading_path() no longer differentiates
the case where it found a symlink in the path's leading components from
the cases where it found a regular file or failed to lstat() the
component. So, a side effect of this current patch is that
unlink_entry() now returns early in all of these three cases. And
because we no longer try to unlink such paths, we also don't get the
warning from remove_or_warn().

For the regular file and symlink cases, it's questionable whether the
warning was useful in the first place: unlink_entry() removes tracked
paths that should no longer be present in the state we are checking out
to. If the path had its leading dir replaced by another file, it means
that the basename already doesn't exist, so there is no need for a
warning. Sure, we are leaving a regular file or symlink behind at the
path's dirname, but this file is either untracked now (so again, no
need to warn), or it will be replaced by a tracked file during the next
phase of this checkout operation.

As for failing to lstat() one of the leading components, the basename
might still exist only we cannot unlink it (e.g. due to the lack of the
required permissions). Since the user expect it to be removed
(especially with checkout's --no-overlay option), add back the warning
in this more relevant case.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-18 12:58:10 -07:00
Jeff King
7730f85594 bisect: peel annotated tags to commits
This patch fixes a bug where git-bisect doesn't handle receiving
annotated tags as "git bisect good <tag>", etc. It's a regression in
27257bc466 (bisect--helper: reimplement `bisect_state` & `bisect_head`
shell functions in C, 2020-10-15).

The original shell code called:

  sha=$(git rev-parse --verify "$rev^{commit}") ||
          die "$(eval_gettext "Bad rev input: \$rev")"

which will peel the input to a commit (or complain if that's not
possible). But the C code just calls get_oid(), which will yield the oid
of the tag.

The fix is to peel to a commit. The error message here is a little
non-idiomatic for Git (since it starts with a capital). I've mostly left
it, as it matches the other converted messages (like the "Bad rev input"
we print when get_oid() fails), though I did add an indication that it
was the peeling that was the problem. It might be worth taking a pass
through this converted code to modernize some of the error messages.

Note also that the test does a bare "grep" (not i18ngrep) on the
expected "X is the first bad commit" output message. This matches the
rest of the test script.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 11:24:08 -07:00
Jonathan Tan
5f70859c15 t5606: run clone branch name test with protocol v2
4f37d45706 ("clone: respect remote unborn HEAD", 2021-02-05) introduces
a new feature (if the remote has an unborn HEAD, e.g. when the remote
repository is empty, use it as the name of the branch) that only works
in protocol v2, but did not ensure that one of its tests always uses
protocol v2, and thus that test would fail if
GIT_TEST_PROTOCOL_VERSION=0 (or 1) is used. Therefore, add "-c
protocol.version=2" to the appropriate test.

(The rest of the tests from that commit have "-c protocol.version=2"
already added.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 11:19:36 -07:00
Junio C Hamano
486f4bd183 xcalloc: use CALLOC_ARRAY() when applicable
These are for codebase before Git 2.31

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 17:51:10 -07:00
Charvi Mendiratta
8bedae4599 t3437: use --fixup with options to create amend! commit
We taught `git commit --fixup` to create "amend!" commit. Let's also
update the tests and use it to setup the rebase tests.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:36 -07:00
Charvi Mendiratta
3d1bda6b5b t7500: add tests for --fixup=[amend|reword] options
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
René Scharfe
96099726dd archive: expand only a single %(describe) per archive
Every %(describe) placeholder in $Format:...$ strings in files with the
attribute export-subst is expanded by calling git describe.  This can
potentially result in a lot of such calls per archive.  That's OK for
local repositories under control of the user of git archive, but could
be a problem for hosted repositories.

Expand only a single %(describe) placeholder per archive for now to
avoid denial-of-service attacks.  We can make this limit configurable
later if needed, but let's start out simple.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-11 13:22:44 -08:00
Elijah Newren
32a56dfb99 merge-ort: precompute subset of sources for which we need rename detection
rename detection works by trying to pair all file deletions (or
"sources") with all file additions (or "destinations"), checking
similarity, and then marking the sufficiently similar ones as renames.
This can be expensive if there are many sources and destinations on a
given side of history as it results in an N x M comparison matrix.
However, there are many cases where we can compute in advance that
detecting renames for some of the sources provides no useful information
and thus that we can exclude those sources from the matrix.

To see why, first note that the merge machinery uses detected renames in
two ways:

   * directory rename detection: when one side of history renames a
       directory, and the other side of history adds new files to that
       directory, we want to be able to warn the user about the need to
       chose whether those new files stay in the old directory or move
       to the new one.

   * three-way content merging: in order to do three-way content merging
       of files, we need three different file versions.  If one side of
       history renamed a file, then some of the content for the file is
       found under a different path than in the merge base or on the
       other side of history.

Add a simple testcase showing the two kinds of reasons renames are
relevant; it's a testcase that will only pass if we detect both kinds of
needed renames.

Other than the testcase added above, this commit concentrates just on
the three-way content merging; it will punt and mark all sources as
needed for directory rename detection, and leave it to future commits to
narrow that down more.

The point of three-way content merging is to reconcile changes made on
*both* sides of history.  What if the file wasn't modified on both
sides?  There are two possibilities:

   * If it wasn't modified on the renamed side:
       -> then we get to do exact rename detection, which is cheap.

   * If it wasn't modified on the unrenamed side:
       -> then detection of a rename for that source file is irrelevant

That latter claim might be surprising at first, so let's walk through a
case to show why rename detection for that source file is irrelevant.
Let's use two filenames, old.c & new.c, with the following abbreviated
object ids (and where the value '000000' is used to denote that the file
is missing in that commit):

                 old.c     new.c
   MERGE_BASE:   01d01d    000000
   MERGE_SIDE1:  01d01d    000000
   MERGE_SIDE2:  000000    5e1ec7

If the rename *isn't* detected:
   then old.c looks like it was unmodified on one side and deleted on
   the other and should thus be removed.  new.c looks like a new file we
   should keep as-is.

If the rename *is* detected:
   then a three-way content merge is done.  Since the version of the
   file in MERGE_BASE and MERGE_SIDE1 are identical, the three-way merge
   will produce exactly the version of the file whose abbreviated
   object id is 5e1ec7.  It will record that file at the path new.c,
   while removing old.c from the directory.

Note that these two results are identical -- a single file named 'new.c'
with object id 5e1ec7.  In other words, it doesn't matter if the rename
is detected in the case where the file is unmodified on the unrenamed
side.

Use this information to compute whether we need rename detection for
each source created in add_pair().

It's probably worth noting that there used to be a few other edge or
corner cases besides three-way content merges and directory rename
detection where lack of rename detection could have affected the result,
but those cases actually highlighted where conflict resolution methods
were not consistent with each other.  Fixing those inconsistencies were
thus critically important to enabling this optimization.  That work
involved the following:

 * bringing consistency to add/add, rename/add, and rename/rename
    conflict types, as done back in the topic merged at commit
    ac193e0e0a ("Merge branch 'en/merge-path-collision'", 2019-01-04),
    and further extended in commits 2a7c16c980 ("t6422, t6426: be more
    flexible for add/add conflicts involving renames", 2020-08-10) and
    e8eb99d4a6 ("t642[23]: be more flexible for add/add conflicts
    involving pair renames", 2020-08-10)

  * making rename/delete more consistent with modify/delete
    as done in commits 1f3c9ba707 ("t6425: be more flexible with
    rename/delete conflict messages", 2020-08-10) and 727c75b23f
    ("t6404, t6423: expect improved rename/delete handling in ort
    backend", 2020-10-26)

Since the set of relevant_sources we compute has not yet been narrowed
down for directory rename detection, we do not pass it to
diffcore_rename_extended() yet.  That will be done after subsequent
commits narrow down the list of relevant_sources needed for directory
rename detection reasons.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 22:18:04 -08:00
brian m. carlson
75555676ad builtin/init-db: handle bare clones when core.bare set to false
In 552955ed7f ("clone: use more conventional config/option layering",
2020-10-01), clone learned to read configuration options earlier in its
execution, before creating the new repository.  However, that led to a
problem: if the core.bare setting is set to false in the global config,
cloning a bare repository segfaults.  This happens because the
repository is falsely thought to be non-bare, but clone has set the work
tree to NULL, which is then dereferenced.

The code to initialize the repository already considers the fact that a
user might want to override the --bare option for git init, but it
doesn't take into account clone, which uses a different option.  Let's
just check that the work tree is not NULL, since that's how clone
indicates that the repository is bare.  This is also the case for git
init, so we won't be regressing that case.

Reported-by: Joseph Vusich <jvusich@amazon.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 15:06:48 -08:00
Jeff King
6d875d19fd t7003: test ref rewriting explicitly
After it has rewritten all of the commits, filter-branch will then
rewrite each of the input refs based on the resulting map of old/new
commits. But we don't have any explicit test coverage of this code.
Let's make sure we are covering each of those cases:

  - deleting a ref when all of its commits were pruned

  - rewriting a ref based on the mapping (this happens throughout the
    script, but let's make sure we generate the correct messages)

  - rewriting a ref whose tip was excluded, in which case we rewrite to
    the nearest ancestor. Note in this case that we still insist that no
    "warning" line is present (even though it looks like we'd trigger
    the "... was rewritten into multiple commits" one). See the next
    commit for more details.

Note these all pass currently, but the latter two will fail when run
with GIT_TEST_DEFAULT_HASH=sha256.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 14:14:19 -08:00
Junio C Hamano
56a57652ef Sync with Git 2.30.2 for CVE-2021-21300
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-08 16:09:07 -08:00
Junio C Hamano
6c46f864e5 Merge branch 'jt/transfer-fsck-across-packs-fix'
The code to fsck objects received across multiple packs during a
single git fetch session has been broken when the packfile URI
feature was in use.  A workaround has been added by disabling the
codepath to avoid keeping a packfile that is too small.

* jt/transfer-fsck-across-packs-fix:
  fetch-pack: do not mix --pack_header and packfile uri
2021-03-08 16:04:47 -08:00
Jonathan Tan
2aec3bc4b6 fetch-pack: do not mix --pack_header and packfile uri
When fetching (as opposed to cloning) from a repository with packfile
URIs enabled, an error like this may occur:

 fatal: pack has bad object at offset 12: unknown object type 5
 fatal: finish_http_pack_request gave result -1
 fatal: fetch-pack: expected keep then TAB at start of http-fetch output

This bug was introduced in b664e9ffa1 ("fetch-pack: with packfile URIs,
use index-pack arg", 2021-02-22), when the index-pack args used when
processing the inline packfile of a fetch response and when processing
packfile URIs were unified.

This bug happens because fetch, by default, partially reads (and
consumes) the header of the inline packfile to determine if it should
store the downloaded objects as a packfile or loose objects, and thus
passes --pack_header=<...> to index-pack to inform it that some bytes
are missing. However, when it subsequently fetches the additional
packfiles linked by URIs, it reuses the same index-pack arguments, thus
wrongly passing --index-pack-arg=--pack_header=<...> when no bytes are
missing.

This does not happen when cloning because "git clone" always passes
do_keep, which instructs the fetch mechanism to always retain the
packfile, eliminating the need to read the header.

There are a few ways to fix this, including filtering out pack_header
arguments when downloading the additional packfiles, but I decided to
stick to always using index-pack throughout when packfile URIs are
present - thus, Git no longer needs to read the bytes, and no longer
needs --pack_header here.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 15:04:09 -08:00
Denton Liu
0af760e261 stash show: learn stash.showIncludeUntracked
The previous commit teaches `git stash show --include-untracked`. It
may be desirable for a user to be able to always enable the
--include-untracked behavior. Teach the stash.showIncludeUntracked
config option which allows users to do this in a similar manner to
stash.showPatch.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 14:31:27 -08:00
Denton Liu
d3c7bf73bd stash show: teach --include-untracked and --only-untracked
Stash entries can be made with untracked files via
`git stash push --include-untracked`. However, because the untracked
files are stored in the third parent of the stash entry and not the
stash entry itself, running `git stash show` does not include the
untracked files as part of the diff.

With --include-untracked, untracked paths, which are recorded in the
third-parent if it exists, are shown in addition to the paths that have
modifications between the stash base and the working tree in the stash.

It is possible to manually craft a malformed stash entry where duplicate
untracked files in the stash entry will mask tracked files. We detect
and error out in that case via a custom unpack_trees() callback:
stash_worktree_untracked_merge().

Also, teach stash the --only-untracked option which only shows the
untracked files of a stash entry. This is similar to `git show stash^3`
but it is nice to provide a convenient abstraction for it so that users
do not have to think about the underlying implementation.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 14:31:26 -08:00
Taylor Blau
dab3247734 t7703: test --geometric repack with loose objects
We don't currently have a test that demonstrates the non-idempotent
behavior of 'git repack --geometric' with loose objects, so add one here
to make sure we don't regress in this area.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
Taylor Blau
f25e33c156 builtin/repack.c: do not repack single packs with --geometric
In 0fabafd0b9 (builtin/repack.c: add '--geometric' option, 2021-02-22),
the 'git repack --geometric' code aborts early when there is zero or one
pack.

When there are no packs, this code does the right thing by placing the
split at "0". But when there is exactly one pack, the split is placed at
"1", which means that "git repack --geometric" (with any factor)
repacks all of the objects in a single pack.

This is wasteful, and the remaining code in split_pack_geometry() does
the right thing (not repacking the objects in a single pack) even when
only one pack is present.

Loosen the guard to only stop when there aren't any packs, and let the
rest of the code do the right thing. Add a test to ensure that this is
the case.

Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
Shubham Verma
12604a8d0c t9801: replace test -f with test_path_is_file
Although `test -f` has the same functionality as test_path_is_file(), in
the case where test_path_is_file() fails, we get much better debugging
information.

Replace `test -f` with test_path_is_file so that future developers
will have a better experience debugging these test cases.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 17:11:31 -08:00
Junio C Hamano
28714238c8 Merge branch 'hv/trailer-formatting'
The logic to handle "trailer" related placeholders in the
"--format=" mechanisms in the "log" family and "for-each-ref"
family is getting unified.

* hv/trailer-formatting:
  ref-filter: use pretty.c logic for trailers
  pretty.c: capture invalid trailer argument
  pretty.c: refactor trailer logic to `format_set_trailers_options()`
  t6300: use function to test trailer options
2021-03-01 14:02:58 -08:00
Junio C Hamano
fbad3505ee Merge branch 'sv/t7001-modernize'
Test script modernization.

* sv/t7001-modernize:
  t7001: use `test` rather than `[`
  t7001: use here-docs instead of echo
  t7001: put each command on a separate line
  t7001: use '>' rather than 'touch'
  t7001: avoid using `cd` outside of subshells
  t7001: remove whitespace after redirect operators
  t7001: modernize subshell formatting
  t7001: remove unnecessary blank lines
  t7001: indent with TABs instead of spaces
  t7001: modernize test formatting
2021-03-01 14:02:57 -08:00
Junio C Hamano
6ee353d42f Merge branch 'jt/transfer-fsck-across-packs'
The approach to "fsck" the incoming objects in "index-pack" is
attractive for performance reasons (we have them already in core,
inflated and ready to be inspected), but fundamentally cannot be
applied fully when we receive more than one pack stream, as a tree
object in one pack may refer to a blob object in another pack as
".gitmodules", when we want to inspect blobs that are used as
".gitmodules" file, for example.  Teach "index-pack" to emit
objects that must be inspected later and check them in the calling
"fetch-pack" process.

* jt/transfer-fsck-across-packs:
  fetch-pack: print and use dangling .gitmodules
  fetch-pack: with packfile URIs, use index-pack arg
  http-fetch: allow custom index-pack args
  http: allow custom index-pack args
2021-03-01 14:02:57 -08:00
Junio C Hamano
660dd97a62 Merge branch 'ds/chunked-file-api'
The common code to deal with "chunked file format" that is shared
by the multi-pack-index and commit-graph files have been factored
out, to help codepaths for both filetypes to become more robust.

* ds/chunked-file-api:
  commit-graph.c: display correct number of chunks when writing
  chunk-format: add technical docs
  chunk-format: restore duplicate chunk checks
  midx: use 64-bit multiplication for chunk sizes
  midx: use chunk-format read API
  commit-graph: use chunk-format read API
  chunk-format: create read chunk API
  midx: use chunk-format API in write_midx_internal()
  midx: drop chunk progress during write
  midx: return success/failure in chunk write methods
  midx: add num_large_offsets to write_midx_context
  midx: add pack_perm to write_midx_context
  midx: add entries to write_midx_context
  midx: use context in write_midx_pack_names()
  midx: rename pack_info to write_midx_context
  commit-graph: use chunk-format write API
  chunk-format: create chunk format write API
  commit-graph: anonymize data in chunk_write_fn
2021-03-01 14:02:57 -08:00
Junio C Hamano
12bd17521c Merge branch 'en/diffcore-rename'
Performance optimization work on the rename detection continues.

* en/diffcore-rename:
  merge-ort: call diffcore_rename() directly
  gitdiffcore doc: mention new preliminary step for rename detection
  diffcore-rename: guide inexact rename detection based on basenames
  diffcore-rename: complete find_basename_matches()
  diffcore-rename: compute basenames of source and dest candidates
  t4001: add a test comparing basename similarity and content similarity
  diffcore-rename: filter rename_src list when possible
  diffcore-rename: no point trying to find a match better than exact
2021-03-01 14:02:56 -08:00
Junio C Hamano
700696bcfc Merge branch 'jh/fsmonitor-prework'
Preliminary changes to fsmonitor integration.

* jh/fsmonitor-prework:
  fsmonitor: refactor initialization of fsmonitor_last_update token
  fsmonitor: allow all entries for a folder to be invalidated
  fsmonitor: log FSMN token when reading and writing the index
  fsmonitor: log invocation of FSMonitor hook to trace2
  read-cache: log the number of scanned files to trace2
  read-cache: log the number of lstat calls to trace2
  preload-index: log the number of lstat calls to trace2
  p7519: add trace logging during perf test
  p7519: move watchman cleanup earlier in the test
  p7519: fix watchman watch-list test on Windows
  p7519: do not rely on "xargs -d" in test
2021-03-01 14:02:56 -08:00
René Scharfe
09fe8ca92e t4205: assert %(describe) test coverage
Document that the test is covering both describable and
undescribable commits.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01 09:42:17 -08:00
Jeff King
36e834abc1 t/perf: avoid copying worktree files from test repo
When running the perf suite, we copy files from an existing $GIT_DIR to
a scratch repository to give us a realistic setup on which to operate.
Since the perf scripts themselves may modify the scratch repository, we
want to make sure we've scrubbed any references back to the original.

One existing example is that we avoid copying the file "commondir" at
the top-level of the repository. In a worktree git-dir (e.g.,
.git/worktrees/foo), that file contains the path to the parent
repository; copying it could mean ref updates in the scratch repository
affect the original.

But there are other files we should cover, too:

  - "gitdir" in a worktree git-dir contains the path to the actual .git
    file in the working tree. We _shouldn't_ end up looking at it at
    all, since the lack of a "commondir" file means Git won't consider
    this to be a worktree git-dir. But it's best to err on the safe
    side.

  - in a parent repository that contains worktrees, the
    "$GIT_DIR/worktrees" directory will contain the git dirs for the
    individual worktrees. Which will themselves contain commondir and
    gitdir files that may reference the original repository. We should
    likewise remove them.

    Note that this does mean that the perf suite's scratch repositories
    will never have any worktrees. That's OK; we don't have any perf tests
    that are influenced by their presence. If we add any, they'd
    probably want to create the worktrees themselves anyway.

This patch adds both paths to the set of omissions in
test_perf_copy_repo_contents(). Note that we won't get confused here by
matching arbitrary names like refs/heads/commondir. This list is always
matching top-level entries in $GIT_DIR (we rely on "cp -R" to do the
actual recursion).

Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:21:04 -08:00
Jeff King
85b87a5396 t/perf: handle worktrees as test repos
The perf suite gets confused when test_perf_default_repo is pointed at a
worktree (which includes when it is run from within a worktree at all,
since the default is to use the current repository).

Here's an example:

  $ git worktree add ~/foo
  Preparing worktree (new branch 'foo')
  HEAD is now at 328c109303 The eighth batch
  $ cd ~/foo
  $ make
  [...build output...]
  $ cd t/perf
  $ ./p0000-perf-lib-sanity.sh -v -i
  [...]
  perf 1 - test_perf_default_repo works:
  running:
  	foo=$(git rev-parse HEAD) &&
  	test_export foo

  fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

The problem is that we didn't copy all of the necessary files from the
source repository (in this case we got HEAD, but we have no refs!). We
discover the git-dir with "rev-parse --git-dir", but this points to the
worktree's partial repository in .../.git/worktrees/foo.

That partial repository has a "commondir" file which points to the main
repository, where the actual refs are stored, but we don't copy it. This
is the correct thing to do, though! If we did copy it, then our scratch
test repo would be pointing back to the original main repo, and any ref
updates we made in the tests would impact that original repo.

Instead, we need to either:

  1. Make a scratch copy of the original main repo (in addition to the
     worktree repo), and point the scratch worktree repo's commondir at
     it. This preserves the original relationship, but it's doubtful any
     script really cares (if they are testing worktree performance,
     they'd probably make their own worktrees). And it's trickier to get
     right.

  2. Collapse the main and worktree repos into a single scratch repo.
     This can be done by copying everything from both, preferring any
     files from the worktree repo.

This patch does the second one. With this applied, the example above
results in p0000 running successfully.

Reported-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 14:21:04 -08:00
Matheus Tavares
6fab35f748 convert: fail gracefully upon missing clean cmd on required filter
The gitattributes documentation mentions that either the clean cmd or
the smudge cmd can be left unspecified in a filter definition. However,
when the filter is marked as 'required', the absence of any one of these
two should be treated as an error. Git already fails under these
circumstances, but not always in a pleasant way: omitting a clean cmd in
a required filter triggers an assertion error which leaves the user with
a quite verbose message:

git: convert.c:1459: convert_to_git_filter_fd: Assertion "ca.drv->clean || ca.drv->process" failed.

This assertion is not really necessary, as the apply_filter() call below
it already performs the same check. And when this condition is not met,
the function returns 0, making the caller die() with a much nicer
message. (Also note that die()-ing here is the right behavior as
`would_convert_to_git_filter_fd() == true` is a precondition to use
convert_to_git_filter_fd(), and the former is only true when the filter
is required.) So remove the assertion and add two regression tests to
make sure that git fails nicely when either the smudge or clean command
is missing on a required filter.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-26 11:20:02 -08:00
Junio C Hamano
140045821a Merge branch 'jc/push-delete-nothing'
"git push $there --delete ''" should have been diagnosed as an
error, but instead turned into a matching push, which has been
corrected.

* jc/push-delete-nothing:
  push: do not turn --delete '' into a matching push
2021-02-25 16:43:33 -08:00
Junio C Hamano
1c8f5dfa42 Merge branch 'js/params-vs-args'
Messages update.

* js/params-vs-args:
  replace "parameters" by "arguments" in error messages
2021-02-25 16:43:32 -08:00
Junio C Hamano
d166e8c1d4 Merge branch 'es/maintenance-of-bare-repositories'
The "git maintenance register" command had trouble registering bare
repositories, which had been corrected.

* es/maintenance-of-bare-repositories:
  maintenance: fix incorrect `maintenance.repo` path with bare repository
2021-02-25 16:43:32 -08:00
Junio C Hamano
f277234860 Merge branch 'mt/add-chmod-fixes'
Various fixes on "git add --chmod".

* mt/add-chmod-fixes:
  add: propagate --chmod errors to exit status
  add: mark --chmod error string for translation
  add --chmod: don't update index when --dry-run is used
2021-02-25 16:43:31 -08:00
Junio C Hamano
682bbad64d Merge branch 'ah/rebase-no-fork-point-config'
"git rebase --[no-]fork-point" gained a configuration variable
rebase.forkPoint so that users do not have to keep specifying a
non-default setting.

* ah/rebase-no-fork-point-config:
  rebase: add a config option for --no-fork-point
2021-02-25 16:43:31 -08:00
Junio C Hamano
628c13ccee Merge branch 'mt/grep-sparse-checkout'
"git grep" has been tweaked to be limited to the sparse checkout
paths.

* mt/grep-sparse-checkout:
  grep: honor sparse-checkout on working tree searches
2021-02-25 16:43:31 -08:00
Junio C Hamano
6eea44cee1 Merge branch 'zh/difftool-skip-to'
"git difftool" learned "--skip-to=<path>" option to restart an
interrupted session from an arbitrary path.

* zh/difftool-skip-to:
  difftool.c: learn a new way start at specified file
2021-02-25 16:43:31 -08:00
Junio C Hamano
845d6030f8 Merge branch 'jc/diffcore-rotate'
"git {diff,log} --{skip,rotate}-to=<path>" allows the user to
discard diff output for early paths or move them to the end of the
output.

* jc/diffcore-rotate:
  diff: --{rotate,skip}-to=<path>
2021-02-25 16:43:30 -08:00
Junio C Hamano
3da165ca28 Merge branch 'mt/checkout-index-corner-cases'
The error codepath around the "--temp/--prefix" feature of "git
checkout-index" has been improved.

* mt/checkout-index-corner-cases:
  checkout-index: omit entries with no tempname from --temp output
  write_entry(): fix misuses of `path` in error messages
2021-02-25 16:43:30 -08:00
Junio C Hamano
608cc4f273 Merge branch 'ab/detox-gettext-tests'
Removal of GIT_TEST_GETTEXT_POISON continues.

* ab/detox-gettext-tests:
  tests: remove most uses of test_i18ncmp
  tests: remove last uses of C_LOCALE_OUTPUT
  tests: remove most uses of C_LOCALE_OUTPUT
  tests: remove last uses of GIT_TEST_GETTEXT_POISON=false
2021-02-25 16:43:29 -08:00
Junio C Hamano
6fe12b5215 Merge branch 'jk/rev-list-disk-usage'
"git rev-list" command learned "--disk-usage" option.

* jk/rev-list-disk-usage:
  docs/rev-list: add some examples of --disk-usage
  docs/rev-list: add an examples section
  rev-list: add --disk-usage option for calculating disk usage
  t: add --no-tag option to test_commit
2021-02-25 16:43:29 -08:00
Derrick Stolee
702110aac6 commit-graph: use config to specify generation type
We have two established generation number versions:

 1: topological levels
 2: corrected commit dates

The corrected commit dates are enabled by default, but they also write
extra data in the GDAT and GDOV chunks. Services that host Git data
might want to have more control over when this feature rolls out than
just updating the Git binaries.

Add a new "commitGraph.generationVersion" config option that specifies
the intended generation number version. If this value is less than 2,
then the GDAT chunk is never written _or read_ from an existing file.

This can replace our use of the GIT_TEST_COMMIT_GRAPH_NO_GDAT
environment variable in the test suite. Remove it.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25 15:10:41 -08:00
Ævar Arnfjörð Bjarmason
0f1da600e6 remote: write camel-cased *.pushRemote on rename
When a remote is renamed don't change the canonical "*.pushRemote"
form to "*.pushremote". Fixes and tests for a minor bug in
923d4a5ca4 (remote rename/remove: handle branch.<name>.pushRemote
config values, 2020-01-27). See the preceding commit for why this does
& doesn't matter.

While we're at it let's also test that we handle the "*.pushDefault"
key correctly. The code to handle that was added in
b3fd6cbf29 (remote rename/remove: gently handle remote.pushDefault
config, 2020-02-01) and does the right thing, but nothing tested that
we wrote out the canonical camel-cased form.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 19:03:00 -08:00
Ævar Arnfjörð Bjarmason
bfa9148ff7 remote: add camel-cased *.tagOpt key, like clone
Change "git remote add" so that it adds a *.tagOpt key, and not the
lower-cased *.tagopt on "git remote add --no-tags", just as "git clone
--no-tags" would do.

This doesn't matter for anything that reads the config. It's just
prettier if we write config keys in their documented camelCase form to
user-readable config files.

When I added support for "clone -no-tags" in 0dab2468ee (clone: add a
--no-tags option to clone without tags, 2017-04-26) I made it use
the *.tagOpt form, but the older "git remote add" added in
111fb85865 (remote add: add a --[no-]tags option, 2010-04-20) has
been using *.tagopt all this time.

It's easy enough to add a test for this, so let's do that. We can't
use "git config -l" there, because it'll normalize the keys to their
lower-cased form. Let's add the test for "git clone" too for good
measure, not just to the "git remote" codepath we're fixing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 19:02:58 -08:00
Junio C Hamano
11875561bf Merge branch 'ds/chunked-file-api' into tb/reverse-midx
* ds/chunked-file-api:
  commit-graph.c: display correct number of chunks when writing
  chunk-format: add technical docs
  chunk-format: restore duplicate chunk checks
  midx: use 64-bit multiplication for chunk sizes
  midx: use chunk-format read API
  commit-graph: use chunk-format read API
  chunk-format: create read chunk API
  midx: use chunk-format API in write_midx_internal()
  midx: drop chunk progress during write
  midx: return success/failure in chunk write methods
  midx: add num_large_offsets to write_midx_context
  midx: add pack_perm to write_midx_context
  midx: add entries to write_midx_context
  midx: use context in write_midx_pack_names()
  midx: rename pack_info to write_midx_context
  commit-graph: use chunk-format write API
  chunk-format: create chunk format write API
  commit-graph: anonymize data in chunk_write_fn
2021-02-24 15:26:14 -08:00
Matheus Tavares
9ebd7fe158 add: propagate --chmod errors to exit status
If `add` encounters an error while applying the --chmod changes, it
prints a message to stderr, but exits with a success code. This might
have been an oversight, as the command does exit with a non-zero code in
other situations where it cannot (or refuses to) update all of the
requested paths (e.g. when some of the given paths are ignored). So make
the exit behavior more consistent by also propagating --chmod errors to
the exit status.

Note: the test "all statuses changed in folder if . is given" uses paths
added by previous test cases, some of which might be symbolic links.
Because `git add --chmod` will now fail with such paths, this test would
depend on whether all the previous tests were executed, or only some
of them. Avoid that by running the test on a fresh repo with only
regular files.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:14:51 -08:00
Matheus Tavares
48960894f5 add: mark --chmod error string for translation
This error message is intended for humans, so mark it for translation.
Also use error() instead of fprintf(stderr, ...), to make the
corresponding line a bit cleaner, and to display the "error:" prefix,
which helps classifying the nature/severity of the message.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:14:51 -08:00
Matheus Tavares
c937d70bfb add --chmod: don't update index when --dry-run is used
`git add --chmod` applies the mode changes even when `--dry-run` is
used. Fix that and add some tests for this option combination.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:14:51 -08:00
Alex Henrie
2803d800d2 rebase: add a config option for --no-fork-point
Some users (myself included) would prefer to have this feature off by
default because it can silently drop commits.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 11:49:10 -08:00
Junio C Hamano
20e416409f push: do not turn --delete '' into a matching push
When we added a syntax sugar "git push remote --delete <ref>" to
"git push" as a synonym to the canonical "git push remote :<ref>"
syntax at f517f1f2 (builtin-push: add --delete as syntactic sugar
for :foo, 2009-12-30), we weren't careful enough to make sure that
<ref> is not empty.

Blindly rewriting "--delete <ref>" to ":<ref>" means that an empty
string <ref> results in refspec ":", which is the syntax to ask for
"matching" push that does not delete anything.

Worse yet, if there were matching refs that can be fast-forwarded,
they would have been published prematurely, even if the user feels
that they are not ready yet to be pushed out, which would be a real
disaster.

Noticed-by: Tilman Vogel <tilman.vogel@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 15:19:34 -08:00
Johannes Sixt
b865734760 replace "parameters" by "arguments" in error messages
When an error message informs the user about an incorrect command
invocation, it should refer to "arguments", not "parameters".

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 13:30:45 -08:00
Eric Sunshine
26c7974376 maintenance: fix incorrect maintenance.repo path with bare repository
The periodic maintenance tasks configured by `git maintenance start`
invoke `git for-each-repo` to run `git maintenance run` on each path
specified by the multi-value global configuration variable
`maintenance.repo`. Because `git for-each-repo` will likely be run
outside of the repositories which require periodic maintenance, it is
mandatory that the repository paths specified by `maintenance.repo` are
absolute.

Unfortunately, however, `git maintenance register` does nothing to
ensure that the paths it assigns to `maintenance.repo` are indeed
absolute, and may in fact -- especially in the case of a bare repository
-- assign a relative path to `maintenance.repo` instead. Fix this
problem by converting all paths to absolute before assigning them to
`maintenance.repo`.

While at it, also fix `git maintenance unregister` to convert paths to
absolute, as well, in order to ensure that it can correctly remove from
`maintenance.repo` a path assigned via `git maintenance register`.

Reported-by: Clement Moyroud <clement.moyroud@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 00:22:45 -08:00
Taylor Blau
0fabafd0b9 builtin/repack.c: add '--geometric' option
Often it is useful to both:

  - have relatively few packfiles in a repository, and

  - avoid having so few packfiles in a repository that we repack its
    entire contents regularly

This patch implements a '--geometric=<n>' option in 'git repack'. This
allows the caller to specify that they would like each pack to be at
least a factor times as large as the previous largest pack (by object
count).

Concretely, say that a repository has 'n' packfiles, labeled P1, P2,
..., up to Pn. Each packfile has an object count equal to 'objects(Pn)'.
With a geometric factor of 'r', it should be that:

  objects(Pi) > r*objects(P(i-1))

for all i in [1, n], where the packs are sorted by

  objects(P1) <= objects(P2) <= ... <= objects(Pn).

Since finding a true optimal repacking is NP-hard, we approximate it
along two directions:

  1. We assume that there is a cutoff of packs _before starting the
     repack_ where everything to the right of that cut-off already forms
     a geometric progression (or no cutoff exists and everything must be
     repacked).

  2. We assume that everything smaller than the cutoff count must be
     repacked. This forms our base assumption, but it can also cause
     even the "heavy" packs to get repacked, for e.g., if we have 6
     packs containing the following number of objects:

       1, 1, 1, 2, 4, 32

     then we would place the cutoff between '1, 1' and '1, 2, 4, 32',
     rolling up the first two packs into a pack with 2 objects. That
     breaks our progression and leaves us:

       2, 1, 2, 4, 32
         ^

     (where the '^' indicates the position of our split). To restore a
     progression, we move the split forward (towards larger packs)
     joining each pack into our new pack until a geometric progression
     is restored. Here, that looks like:

       2, 1, 2, 4, 32  ~>  3, 2, 4, 32  ~>  5, 4, 32  ~> ... ~> 9, 32
         ^                   ^                ^                   ^

This has the advantage of not repacking the heavy-side of packs too
often while also only creating one new pack at a time. Another wrinkle
is that we assume that loose, indexed, and reflog'd objects are
insignificant, and lump them into any new pack that we create. This can
lead to non-idempotent results.

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
Jeff King
fbf20aeeef p5303: measure time to repack with keep
Add two new tests to measure repack performance. Both tests split the
repository into synthetic "pushes", and then leave the remaining objects
in a big base pack.

The first new test marks an empty pack as "kept" and then passes
--honor-pack-keep to avoid including objects in it. That doesn't change
the resulting pack, but it does let us compare to the normal repack case
to see how much overhead we add to check whether objects are kept or
not.

The other test is of --stdin-packs, which gives us a sense of how that
number scales based on the number of packs we provide as input. In each
of those tests, the empty pack isn't considered, but the residual pack
(objects that were left over and not included in one of the synthetic
push packs) is marked as kept.

(Note that in the single-pack case of the --stdin-packs test, there is
nothing do since there are no non-excluded packs).

Here are some timings on a recent clone of the kernel:

  5303.5: repack (1)                          57.26(54.59+10.84)
  5303.6: repack with kept (1)                57.33(54.80+10.51)

in the 50-pack case, things start to slow down:

  5303.11: repack (50)                        71.54(88.57+4.84)
  5303.12: repack with kept (50)              85.12(102.05+4.94)

and by the time we hit 1,000 packs, things are substantially worse, even
though the resulting pack produced is the same:

  5303.17: repack (1000)                      216.87(490.79+14.57)
  5303.18: repack with kept (1000)            665.63(938.87+15.76)

That's because the code paths around handling .keep files are known to
scale badly; they look in every single pack file to find each object.
Our solution to that was to notice that most repos don't have keep
files, and to make that case a fast path. But as soon as you add a
single .keep, that part of pack-objects slows down again (even if we
have fewer objects total to look at).

Likewise, the scaling is pretty extreme on --stdin-packs (but each
subsequent test is also being asked to do more work):

  5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)
  5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)
  5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
Jeff King
60bb5f2f5d p5303: add missing &&-chains
These are in a helper function, so the usual chain-lint doesn't notice
them. This function is still not perfect, as it has some git invocations
on the left-hand-side of the pipe, but it's primary purpose is timing,
not finding bugs or correctness issues.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
Taylor Blau
339bce27f4 builtin/pack-objects.c: add '--stdin-packs' option
In an upcoming commit, 'git repack' will want to create a pack comprised
of all of the objects in some packs (the included packs) excluding any
objects in some other packs (the excluded packs).

This caller could iterate those packs themselves and feed the objects it
finds to 'git pack-objects' directly over stdin, but this approach has a
few downsides:

  - It requires every caller that wants to drive 'git pack-objects' in
    this way to implement pack iteration themselves. This forces the
    caller to think about details like what order objects are fed to
    pack-objects, which callers would likely rather not do.

  - If the set of objects in included packs is large, it requires
    sending a lot of data over a pipe, which is inefficient.

  - The caller is forced to keep track of the excluded objects, too, and
    make sure that it doesn't send any objects that appear in both
    included and excluded packs.

But the biggest downside is the lack of a reachability traversal.
Because the caller passes in a list of objects directly, those objects
don't get a namehash assigned to them, which can have a negative impact
on the delta selection process, causing 'git pack-objects' to fail to
find good deltas even when they exist.

The caller could formulate a reachability traversal themselves, but the
only way to drive 'git pack-objects' in this way is to do a full
traversal, and then remove objects in the excluded packs after the
traversal is complete. This can be detrimental to callers who care
about performance, especially in repositories with many objects.

Introduce 'git pack-objects --stdin-packs' which remedies these four
concerns.

'git pack-objects --stdin-packs' expects a list of pack names on stdin,
where 'pack-xyz.pack' denotes that pack as included, and
'^pack-xyz.pack' denotes it as excluded. The resulting pack includes all
objects that are present in at least one included pack, and aren't
present in any excluded pack.

To address the delta selection problem, 'git pack-objects --stdin-packs'
works as follows. First, it assembles a list of objects that it is going
to pack, as above. Then, a reachability traversal is started, whose tips
are any commits mentioned in included packs. Upon visiting an object, we
find its corresponding object_entry in the to_pack list, and set its
namehash parameter appropriately.

To avoid the traversal visiting more objects than it needs to, the
traversal is halted upon encountering an object which can be found in an
excluded pack (by marking the excluded packs as kept in-core, and
passing --no-kept-objects=in-core to the revision machinery).

This can cause the traversal to halt early, for example if an object in
an included pack is an ancestor of ones in excluded packs. But stopping
early is OK, since filling in the namehash fields of objects in the
to_pack list is only additive (i.e., having it helps the delta selection
process, but leaving it blank doesn't impact the correctness of the
resulting pack).

Even still, it is unlikely that this hurts us much in practice, since
the 'git repack --geometric' caller (which is introduced in a later
commit) marks small packs as included, and large ones as excluded.
During ordinary use, the small packs usually represent pushes after a
large repack, and so are unlikely to be ancestors of objects that
already exist in the repository.

(I found it convenient while developing this patch to have 'git
pack-objects' report the number of objects which were visited and got
their namehash fields filled in during traversal. This is also included
in the below patch via trace2 data lines).

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
Taylor Blau
c9fff00016 revision: learn '--no-kept-objects'
A future caller will want to be able to perform a reachability traversal
which terminates when visiting an object found in a kept pack. The
closest existing option is '--honor-pack-keep', but this isn't quite
what we want. Instead of halting the traversal midway through, a full
traversal is always performed, and the results are only trimmed
afterwords.

Besides needing to introduce a new flag (since culling results
post-facto can be different than halting the traversal as it's
happening), there is an additional wrinkle handling the distinction
in-core and on-disk kept packs. That is: what kinds of kept pack should
stop the traversal?

Introduce '--no-kept-objects[=<on-disk|in-core>]' to specify which kinds
of kept packs, if any, should stop a traversal. This can be useful for
callers that want to perform a reachability analysis, but want to leave
certain packs alone (for e.g., when doing a geometric repack that has
some "large" packs which are kept in-core that it wants to leave alone).

Note that this option is not guaranteed to produce exactly the set of
objects that aren't in kept packs, since it's possible the traversal
order may end up in a situation where a non-kept ancestor was "cut off"
by a kept object (at which point we would stop traversing). But, we
don't care about absolute correctness here, since this will eventually
be used as a purely additive guide in an upcoming new repack mode.

Explicitly avoid documenting this new flag, since it is only used
internally. In theory we could avoid even adding it rev-list, but being
able to spell this option out on the command-line makes some special
cases easier to test without promising to keep it behaving consistently
forever. Those tricky cases are exercised in t6114.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
Junio C Hamano
d68fccef86 Merge branch 'ab/test-lib'
Test framework clean-up.

* ab/test-lib:
  test-lib-functions: assert correct parameter count
  test-lib-functions: remove bug-inducing "diagnostics" helper param
  test libs: rename "diff-lib" to "lib-diff"
  t/.gitattributes: sort lines
  test-lib-functions: move function to lib-bitmap.sh
  test libs: rename gitweb-lib.sh to lib-gitweb.sh
  test libs: rename bundle helper to "lib-bundle.sh"
  test-lib-functions: remove generate_zero_bytes() wrapper
  test-lib-functions: move test_set_index_version() to its user
  test lib: change "error" to "BUG" as appropriate
  test-lib: remove check_var_migration
2021-02-22 16:12:43 -08:00
Junio C Hamano
dcb11fc622 Merge branch 'ab/pager-exit-log'
When a pager spawned by us exited, the trace log did not record its
exit status correctly, which has been corrected.

* ab/pager-exit-log:
  pager: properly log pager exit code when signalled
  run-command: add braces for "if" block in wait_or_whine()
  pager: test for exit code with and without SIGPIPE
  pager: refactor wait_for_pager() function
2021-02-22 16:12:43 -08:00
Junio C Hamano
dc24948be9 Merge branch 'ta/hash-function-transition-doc'
Update formatting and grammar of the hash transition plan
documentation, plus some updates.

* ta/hash-function-transition-doc:
  doc: use https links
  doc hash-function-transition: move rationale upwards
  doc hash-function-transition: fix incomplete sentence
  doc hash-function-transition: use upper case consistently
  doc hash-function-transition: use SHA-1 and SHA-256 consistently
  doc hash-function-transition: fix asciidoc output
2021-02-22 16:12:43 -08:00
Junio C Hamano
15af6e6fee Merge branch 'bc/signed-objects-with-both-hashes'
Signed commits and tags now allow verification of objects, whose
two object names (one in SHA-1, the other in SHA-256) are both
signed.

* bc/signed-objects-with-both-hashes:
  gpg-interface: remove other signature headers before verifying
  ref-filter: hoist signature parsing
  commit: allow parsing arbitrary buffers with headers
  gpg-interface: improve interface for parsing tags
  commit: ignore additional signatures when parsing signed commits
  ref-filter: switch some uses of unsigned long to size_t
2021-02-22 16:12:42 -08:00
Junio C Hamano
b9554c03a0 Merge branch 'dl/stash-cleanup'
Documentation, code and test clean-up around "git stash".

* dl/stash-cleanup:
  stash: declare ref_stash as an array
  t3905: use test_cmp() to check file contents
  t3905: replace test -s with test_file_not_empty
  t3905: remove nested git in command substitution
  t3905: move all commands into test cases
  t3905: remove spaces after redirect operators
  git-stash.txt: be explicit about subcommand options
2021-02-22 16:12:42 -08:00
ZheNing Hu
1c881026a1 difftool.c: learn a new way start at specified file
`git difftool` only allow us to select file to view in turn.
If there is a commit with many files and we exit in the middle,
we will have to traverse list again to get the file diff which
we want to see. Therefore,teach the command an option
`--skip-to=<path>` to allow the user to say that diffs for earlier
paths are not interesting (because they were already seen in an
earlier session) and start this session with the named path.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 13:35:49 -08:00
Jonathan Tan
5476e1efde fetch-pack: print and use dangling .gitmodules
Teach index-pack to print dangling .gitmodules links after its "keep" or
"pack" line instead of declaring an error, and teach fetch-pack to check
such lines printed.

This allows the tree side of the .gitmodules link to be in one packfile
and the blob side to be in another without failing the fsck check,
because it is now fetch-pack which checks such objects after all
packfiles have been downloaded and indexed (and not index-pack on an
individual packfile, as it is before this commit).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 12:07:40 -08:00
Jonathan Tan
27e35ba6c6 http-fetch: allow custom index-pack args
This is the next step in teaching fetch-pack to pass its index-pack
arguments when processing packfiles referenced by URIs.

The "--keep" in fetch-pack.c will be replaced with a full message in a
subsequent commit.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 12:07:40 -08:00
Derrick Stolee
6ab3b8b8b8 midx: use chunk-format read API
Instead of parsing the table of contents directly, use the chunk-format
API methods read_table_of_contents() and pair_chunk(). In particular, we
can use the return value of pair_chunk() to generate an error when a
required chunk is missing.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
Derrick Stolee
2692c2f6fd commit-graph: use chunk-format read API
Instead of parsing the table of contents directly, use the chunk-format
API methods read_table_of_contents() and pair_chunk(). While the current
implementation loses the duplicate-chunk detection, that will be added
in a future change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-18 13:38:16 -08:00
Junio C Hamano
483e09e810 Merge branch 'ak/config-bad-bool-error'
The error message given when a configuration variable that is
expected to have a boolean value has been improved.

* ak/config-bad-bool-error:
  config: improve error message for boolean config
2021-02-17 17:21:43 -08:00
Junio C Hamano
e68f62be8d Merge branch 'js/reflog-expire-stale-fix'
"git reflog expire --stale-fix" can be used to repair the reflog by
removing entries that refer to objects that have been pruned away,
but was not careful to tolerate missing objects.

* js/reflog-expire-stale-fix:
  reflog expire --stale-fix: be generous about missing objects
2021-02-17 17:21:43 -08:00
Junio C Hamano
e9b4c483c7 Merge branch 'ew/rev-parse-since-test'
Test to make sure "git rev-parse one-thing one-thing" gives
the same thing twice (when one-thing is --since=X).

* ew/rev-parse-since-test:
  t1500: ensure current --since= behavior remains
2021-02-17 17:21:42 -08:00
Junio C Hamano
d494433d26 Merge branch 'ds/maintenance-pack-refs'
"git maintenance" tool learned a new "pack-refs" maintenance task.

* ds/maintenance-pack-refs:
  maintenance: incremental strategy runs pack-refs weekly
  maintenance: add pack-refs task
2021-02-17 17:21:42 -08:00
Junio C Hamano
fdf3a27ca9 Merge branch 'jx/t5411-unique-filenames'
Avoid individual tests in t5411 from getting affected by each other
by forcing them to use separate output files during the test.

* jx/t5411-unique-filenames:
  t5411: refactor check of refs using test_cmp_refs
  t5411: use different out file to prevent overwriting
2021-02-17 17:21:42 -08:00
Junio C Hamano
9e634a91c8 Merge branch 'js/fsck-name-objects-fix'
Fix "git fsck --name-objects" which apparently has not been used by
anybody who is motivated enough to report breakage.

* js/fsck-name-objects-fix:
  fsck --name-objects: be more careful parsing generation numbers
  t1450: robustify `remove_object()`
2021-02-17 17:21:42 -08:00
Junio C Hamano
9bdccbcda7 Merge branch 'jk/mailmap-only-at-root'
The .mailmap is documented to be read only from the root level of a
working tree, but a stray file in a bare repository also was read
by accident, which has been corrected.

* jk/mailmap-only-at-root:
  mailmap: only look for .mailmap in work tree
2021-02-17 17:21:42 -08:00
Junio C Hamano
78a26cb720 Merge branch 'sh/mergetool-hideresolved'
"git mergetool" feeds three versions (base, local and remote) of
a conflicted path unmodified.  The command learned to optionally
prepare these files with unconflicted parts already resolved.

* sh/mergetool-hideresolved:
  mergetool: add per-tool support and overrides for the hideResolved flag
  mergetool: break setup_tool out into separate initialization function
  mergetool: add hideResolved configuration
2021-02-17 17:21:41 -08:00
Junio C Hamano
aa2d3dbdf5 Merge branch 'jt/trace2-BUG'
Even though invocations of "die()" were logged to the trace2
system, "BUG()"s were not, which has been corrected.

* jt/trace2-BUG:
  usage: trace2 BUG() invocations
2021-02-17 17:21:41 -08:00
Junio C Hamano
dadc91ff0c Merge branch 'js/range-diff-one-side-only'
The "git range-diff" command learned "--(left|right)-only" option
to show only one side of the compared range.

* js/range-diff-one-side-only:
  range-diff: offer --left-only/--right-only options
  range-diff: move the diffopt initialization down one layer
  range-diff: combine all options in a single data structure
  range-diff: simplify code spawning `git log`
  range-diff: libify the read_patches() function again
  range-diff: avoid leaking memory in two error code paths
2021-02-17 17:21:41 -08:00
Junio C Hamano
77348b0e6e Merge branch 'js/range-diff-wo-dotdot'
There are other ways than ".." for a single token to denote a
"commit range", namely "<rev>^!" and "<rev>^-<n>", but "git
range-diff" did not understand them.

* js/range-diff-wo-dotdot:
  range-diff(docs): explain how to specify commit ranges
  range-diff/format-patch: handle commit ranges other than A..B
  range-diff/format-patch: refactor check for commit range
2021-02-17 17:21:41 -08:00
Junio C Hamano
69571dfe21 Merge branch 'jt/clone-unborn-head'
"git clone" tries to locally check out the branch pointed at by
HEAD of the remote repository after it is done, but the protocol
did not convey the information necessary to do so when copying an
empty repository.  The protocol v2 learned how to do so.

* jt/clone-unborn-head:
  clone: respect remote unborn HEAD
  connect, transport: encapsulate arg in struct
  ls-refs: report unborn targets of symrefs
2021-02-17 17:21:40 -08:00
Junio C Hamano
5bd0b21bf7 Merge branch 'ds/commit-graph-genno-fix'
Fix incremental update of commit-graph file around corrected commit
date data.

* ds/commit-graph-genno-fix:
  commit-graph: prepare commit graph
  commit-graph: be extra careful about mixed generations
  commit-graph: compute generations separately
  commit-graph: validate layers for generation data
  commit-graph: always parse before commit_graph_data_at()
  commit-graph: use repo_parse_commit
2021-02-17 17:21:40 -08:00
Junio C Hamano
8b4701ae4f Merge branch 'ak/corrected-commit-date'
The commit-graph learned to use corrected commit dates instead of
the generation number to help topological revision traversal.

* ak/corrected-commit-date:
  doc: add corrected commit date info
  commit-reach: use corrected commit dates in paint_down_to_common()
  commit-graph: use generation v2 only if entire chain does
  commit-graph: implement generation data chunk
  commit-graph: implement corrected commit date
  commit-graph: return 64-bit generation number
  commit-graph: add a slab to store topological levels
  t6600-test-reach: generalize *_three_modes
  commit-graph: consolidate fill_commit_graph_info
  revision: parse parent in indegree_walk_step()
  commit-graph: fix regression when computing Bloom filters
2021-02-17 17:21:40 -08:00
René Scharfe
b081547ec1 pretty: add merge and exclude options to %(describe)
Allow restricting the tags used by the placeholder %(describe) with the
options match and exclude.  E.g. the following command describes the
current commit using official version tags, without those for release
candidates:

   $ git log -1 --format='%(describe:match=v[0-9]*,exclude=*rc*)'

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 09:54:33 -08:00
René Scharfe
15ae82d5d6 pretty: add %(describe)
Add a format placeholder for describe output.  Implement it by actually
calling git describe, which is simple and guarantees correctness.  It's
intended to be used with $Format:...$ in files with the attribute
export-subst and git archive.  It can also be used with git log etc.,
even though that's going to be slow due to the fork for each commit.

Suggested-by: Eli Schwartz <eschwartz@archlinux.org>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 09:54:31 -08:00
Jeff Hostetler
4f2009dce2 p7519: add trace logging during perf test
Add optional trace logging to allow us to better compare performance of
various fsmonitor providers and compare results with non-fsmonitor runs.

Currently, this includes Trace2 logging, but may be extended to include
other trace targets, such as GIT_TRACE_FSMONITOR if desired.

Using this logging helped me explain an odd behavior on MacOS where the
kernel was dropping events and causing the hook to Watchman to timeout.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
Jeff Hostetler
a7556c3bde p7519: move watchman cleanup earlier in the test
Shutdown Watchman after the Watchman-based tests and before the block of
"no fsmonitor" tests.

This helps ensure that Watchman cannot affect the test results for the
other.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
Jeff Hostetler
0917763d67 p7519: fix watchman watch-list test on Windows
Only use the final portion of the test trash directory file name
when verifying that Watchman was started.

On Windows and under the SDK, $GIT_WORKTREE is a cygwin-style
path with forward slashes and a "/c/" drive name.  However
`watchman watch-list` reports a proper Windows-style pathname
with drive letters and backslashes.  This causes the grep to
fail.  Since we don't really care about the full pathname (and
we really don't want to bother with normalizaing them), just see
if the test-name portion of the path is found.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
Jeff Hostetler
eb10e637cf p7519: do not rely on "xargs -d" in test
Convert the test to use a more portable method to update the mtime on a
large number of files under version control.

The Mac version of xargs does not support the "-d" option.
Likewise, the "-0" and "--null" options are not portable.

Furthermore, use `test-tool chmtime` rather than `touch` to update the
mtime to ensure that it is actually updated (especially on file systems
with only whole second resolution).

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 17:14:34 -08:00
Matheus Tavares
9334ea8e92 write_entry(): fix misuses of path in error messages
The variables `path` and `ce->name`, at write_entry(), usually have the
same contents, but that's not the case when using a checkout prefix or
writing to a tempfile. (In fact, `path` will be either empty or dirty
when writing to a tempfile.) Therefore, these variables cannot be used
interchangeably. In this sense, fix wrong uses of `path` in error
messages where it should really be `ce->name`, and add some regression
tests. (Note: there doesn't seem to be any misuse in the other way
around.)

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 11:27:17 -08:00
Jeff King
adcd9f5472 mailmap: do not respect symlinks for in-tree .mailmap
As with .gitattributes and .gitignore, we would like to make sure that
.mailmap files are handled consistently whether read from the a blob (as
is the default behavior in a bare repo) or from the filesystem.
Likewise, we would like to avoid reading out-of-tree files pointed to by
a symlink, which could have security implications in certain setups.

We can cover both by using open_nofollow() when opening the in-tree
files. We'll continue to follow links for mailmap.file, as well as when
reading .mailmap from the current directory when outside of a repository
entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
Jeff King
feb9b7792f exclude: do not respect symlinks for in-tree .gitignore
As with .gitattributes, we would like to make sure that .gitignore files
are handled consistently whether read from the index or from the
filesystem. Likewise, we would like to avoid reading out-of-tree files
pointed to by the symlinks, which could have security implications in
certain setups.

We can cover both by using open_nofollow() when opening the in-tree
files. We'll continue to follow links for core.excludesFile, as well as
$GIT_DIR/info/exclude.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
Jeff King
2ef579e261 attr: do not respect symlinks for in-tree .gitattributes
The attributes system may sometimes read in-tree files from the
filesystem, and sometimes from the index. In the latter case, we do not
resolve symbolic links (and are not likely to ever start doing so).
Let's open filesystem links with O_NOFOLLOW so that the two cases behave
consistently.

As a bonus, this means that git will not follow such symlinks to read
and parse out-of-tree paths. In some cases this could have security
implications, as a malicious repository can cause Git to open and read
arbitrary files. It could already feed arbitrary content to the parser,
but in certain setups it might be able to exfiltrate data from those
paths (e.g., if an automated service operating on the malicious repo
reveals its stderr to an attacker).

Note that O_NOFOLLOW only prevents following links for the path itself,
not intermediate directories in the path.  At first glance, it seems
like

  ln -s /some/path in-repo

might still look at "in-repo/.gitattributes", following the symlink to
"/some/path/.gitattributes". However, if "in-repo" is a symbolic link,
then we know that it has no git paths below it, and will never look at
its .gitattributes file.

We will continue to support out-of-tree symbolic links (e.g., in
$GIT_DIR/info/attributes); this just affects in-tree links. When a
symbolic link is encountered, the contents are ignored and a warning is
printed. POSIX specifies ELOOP in this case, so the user would generally
see something like:

  warning: unable to access '.gitattributes': Too many levels of symbolic links

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
Junio C Hamano
1eb4136ac2 diff: --{rotate,skip}-to=<path>
In the implementation of "git difftool", there is a case where the
user wants to start viewing the diffs at a specific path and
continue on to the rest, optionally wrapping around to the
beginning.  Since it is somewhat cumbersome to implement such a
feature as a post-processing step of "git diff" output, let's
support it internally with two new options.

 - "git diff --rotate-to=C", when the resulting patch would show
   paths A B C D E without the option, would "rotate" the paths to
   shows patch to C D E A B instead.  It is an error when there is
   no patch for C is shown.

 - "git diff --skip-to=C" would instead "skip" the paths before C,
   and shows patch to C D E.  Again, it is an error when there is no
   patch for C is shown.

 - "git log [-p]" also accepts these two options, but it is not an
   error if there is no change to the specified path.  Instead, the
   set of output paths are rotated or skipped to the specified path
   or the first path that sorts after the specified path.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:30:42 -08:00
Elijah Newren
bd24aa2f97 diffcore-rename: guide inexact rename detection based on basenames
Make use of the new find_basename_matches() function added in the last
two patches, to find renames more rapidly in cases where we can match up
files based on basenames.  As a quick reminder (see the last two commit
messages for more details), this means for example that
docs/extensions.txt and docs/config/extensions.txt are considered likely
renames if there are no remaining 'extensions.txt' files elsewhere among
the added and deleted files, and if a similarity check confirms they are
similar, then they are marked as a rename without looking for a better
similarity match among other files.  This is a behavioral change, as
covered in more detail in the previous commit message.

We do not use this heuristic together with either break or copy
detection.  The point of break detection is to say that filename
similarity does not imply file content similarity, and we only want to
know about file content similarity.  The point of copy detection is to
use more resources to check for additional similarities, while this is
an optimization that uses far less resources but which might also result
in finding slightly fewer similarities.  So the idea behind this
optimization goes against both of those features, and will be turned off
for both.

For the testcases mentioned in commit 557ac0350d ("merge-ort: begin
performance work; instrument with trace2_region_* calls", 2020-10-28),
this change improves the performance as follows:

                            Before                  After
    no-renames:       13.815 s ±  0.062 s    13.294 s ±  0.103 s
    mega-renames:   1799.937 s ±  0.493 s   187.248 s ±  0.882 s
    just-one-mega:    51.289 s ±  0.019 s     5.557 s ±  0.017 s

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 18:02:16 -08:00
Elijah Newren
f3845257a5 t4001: add a test comparing basename similarity and content similarity
Add a simple test where a removed file is similar to two different added
files; one of them has the same basename, and the other has a slightly
higher content similarity.  In the current test, content similarity is
weighted higher than filename similarity.

Subsequent commits will add a new rule that weighs a mixture of filename
similarity and content similarity in a manner that will change the
outcome of this testcase.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 18:02:16 -08:00
Hariom Verma
ee82a487f6 ref-filter: use pretty.c logic for trailers
Now, ref-filter is using pretty.c logic for setting trailer options.

New to ref-filter:
  :key=<K> - only show trailers with specified key.
  :valueonly[=val] - only show the value part.
  :separator=<SEP> - inserted between trailer lines.
  :key_value_separator=<SEP> - inserted between key and value in trailer lines

Enhancement to existing options(now can take value and its optional):
  :only[=val]
  :unfold[=val]

'val' can be: true, on, yes or false, off, no.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 16:48:38 -08:00
Hariom Verma
727331dce1 t6300: use function to test trailer options
Add a function to test trailer options. This will make tests look cleaner,
as well as will make it easier to add new tests for trailers in the future.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Heba Waly <heba.waly@gmail.com>
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-15 16:48:38 -08:00
Junio C Hamano
8b25dee615 Merge branch 'tb/precompose-prefix-too'
When commands are started from a subdirectory, they may have to
compare the path to the subdirectory (called prefix and found out
from $(pwd)) with the tracked paths.  On macOS, $(pwd) and
readdir() yield decomposed path, while the tracked paths are
usually normalized to the precomposed form, causing mismatch.  This
has been fixed by taking the same approach used to normalize the
command line arguments.

* tb/precompose-prefix-too:
  MacOS: precompose_argv_prefix()
2021-02-12 14:21:04 -08:00
Junio C Hamano
60f8121940 Merge branch 'jv/upload-pack-filter-spec-quotefix'
Fix in passing custom args from "git clone" to "upload-pack" on the
other side.

* jv/upload-pack-filter-spec-quotefix:
  t5544: clarify 'hook works with partial clone' test
  upload-pack.c: fix filter spec quoting bug
2021-02-12 14:21:04 -08:00
Junio C Hamano
3c12d0b885 Merge branch 'tb/pack-revindex-on-disk'
Introduce an on-disk file to record revindex for packdata, which
traditionally was always created on the fly and only in-core.

* tb/pack-revindex-on-disk:
  t5325: check both on-disk and in-memory reverse index
  pack-revindex: ensure that on-disk reverse indexes are given precedence
  t: support GIT_TEST_WRITE_REV_INDEX
  t: prepare for GIT_TEST_WRITE_REV_INDEX
  Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
  builtin/pack-objects.c: respect 'pack.writeReverseIndex'
  builtin/index-pack.c: write reverse indexes
  builtin/index-pack.c: allow stripping arbitrary extensions
  pack-write.c: prepare to write 'pack-*.rev' files
  packfile: prepare for the existence of '*.rev' files
2021-02-12 14:21:04 -08:00
Junio C Hamano
2c873f9791 Merge branch 'ab/tests-various-fixup'
Various test updates.

* ab/tests-various-fixup:
  rm tests: actually test for SIGPIPE in SIGPIPE test
  archive tests: use a cheaper "zipinfo -h" invocation to get header
  upload-pack tests: avoid a non-zero "grep" exit status
  git-svn tests: rewrite brittle tests to use "--[no-]merges".
  git svn mergeinfo tests: refactor "test -z" to use test_must_be_empty
  git svn mergeinfo tests: modernize redirection & quoting style
  cache-tree tests: explicitly test HEAD and index differences
  cache-tree tests: use a sub-shell with less indirection
  cache-tree tests: remove unused $2 parameter
  cache-tree tests: refactor for modern test style
2021-02-12 14:21:04 -08:00
Ævar Arnfjörð Bjarmason
e7884b353b test-lib-functions: assert correct parameter count
Add assertions of the correct parameter count of various functions, in
particularly the wrappers for the shell "test" built-in.

In an earlier commit we fixed a bug with an incorrect number of
arguments being passed to "test_path_is_{file,missing}". Let's also
guard other similar functions from the same sort of misuse.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-12 11:58:21 -08:00
Ævar Arnfjörð Bjarmason
45a2686441 test-lib-functions: remove bug-inducing "diagnostics" helper param
Remove the optional "diagnostics" parameter of the
test_path_is_{file,dir,missing} functions.

We have a lot of uses of these functions, but the only legitimate use
of the diagnostics parameter is from when the functions themselves
were introduced in 2caf20c52b (test-lib: user-friendly alternatives
to test [-d|-f|-e], 2010-08-10).

But as the the rest of this diff demonstrates its presence did more to
silently introduce bugs in our tests. Fix such bugs in the tests added
in ae4e89e549 (gc: add --keep-largest-pack option, 2018-04-15), and
c04ba51739 (t6046: testcases checking whether updates can be skipped
in a merge, 2018-04-19).

Let's also assert that those functions are called with exactly one
parameter, a follow-up commit will add similar asserts to other
functions in test-lib-functions.sh that we didn't have existing misuse
of.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-12 11:58:21 -08:00
Ævar Arnfjörð Bjarmason
ebd73f50c6 test libs: rename "diff-lib" to "lib-diff"
Rename the "diff-lib" to "lib-diff". With this rename and preceding
commits there is no remaining t/*lib* which doesn't follow the
convention of being called t/lib-*.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-12 11:58:21 -08:00
Johannes Schindelin
e4e68081bb Sync with 2.29.3
* maint-2.29:
  Git 2.29.3
  Git 2.28.1
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:51:12 +01:00
Johannes Schindelin
d7bdabe52f Sync with 2.28.1
* maint-2.28:
  Git 2.28.1
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:14 +01:00
Johannes Schindelin
3f01e56686 Sync with 2.27.1
* maint-2.27:
  Git 2.27.1
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:09 +01:00
Johannes Schindelin
2d1142a3e8 Sync with 2.26.3
* maint-2.26:
  Git 2.26.3
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:50:04 +01:00
Johannes Schindelin
8f80393c14 Sync with 2.25.5
* maint-2.25:
  Git 2.25.5
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:59 +01:00
Johannes Schindelin
97d1dcb1ef Sync with 2.24.4
* maint-2.24:
  Git 2.24.4
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:55 +01:00
Johannes Schindelin
92ac04b8ee Sync with 2.23.4
* maint-2.23:
  Git 2.23.4
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:50 +01:00
Johannes Schindelin
4bd06fd490 Sync with 2.22.5
* maint-2.22:
  Git 2.22.5
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:45 +01:00
Johannes Schindelin
bcf08f33d8 Sync with 2.21.4
* maint-2.21:
  Git 2.21.4
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:41 +01:00
Johannes Schindelin
b1726b1a38 Sync with 2.20.5
* maint-2.20:
  Git 2.20.5
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:35 +01:00
Johannes Schindelin
804963848e Sync with 2.19.6
* maint-2.19:
  Git 2.19.6
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:49:17 +01:00
Johannes Schindelin
fb049fd85b Sync with 2.18.5
* maint-2.18:
  Git 2.18.5
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:47:47 +01:00
Johannes Schindelin
9b77cec89b Sync with 2.17.6
* maint-2.17:
  Git 2.17.6
  unpack_trees(): start with a fresh lstat cache
  run-command: invalidate lstat cache after a command finished
  checkout: fix bug that makes checkout follow symlinks in leading path
2021-02-12 15:47:42 +01:00
Johannes Schindelin
0d58fef58a run-command: invalidate lstat cache after a command finished
In the previous commit, we intercepted calls to `rmdir()` to invalidate
the lstat cache in the successful case, so that the lstat cache could
not have the idea that a directory exists where there is none.

The same situation can arise, of course, when a separate process is
spawned (most notably, this is the case in `submodule_move_head()`).
Obviously, we cannot know whether a directory was removed in that
process, therefore we must invalidate the lstat cache afterwards.

Note: in contrast to `lstat_cache_aware_rmdir()`, we invalidate the
lstat cache even in case of an error: the process might have removed a
directory and still have failed afterwards.

Co-authored-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-02-12 15:47:02 +01:00
Matheus Tavares
684dd4c2b4 checkout: fix bug that makes checkout follow symlinks in leading path
Before checking out a file, we have to confirm that all of its leading
components are real existing directories. And to reduce the number of
lstat() calls in this process, we cache the last leading path known to
contain only directories. However, when a path collision occurs (e.g.
when checking out case-sensitive files in case-insensitive file
systems), a cached path might have its file type changed on disk,
leaving the cache on an invalid state. Normally, this doesn't bring
any bad consequences as we usually check out files in index order, and
therefore, by the time the cached path becomes outdated, we no longer
need it anyway (because all files in that directory would have already
been written).

But, there are some users of the checkout machinery that do not always
follow the index order. In particular: checkout-index writes the paths
in the same order that they appear on the CLI (or stdin); and the
delayed checkout feature -- used when a long-running filter process
replies with "status=delayed" -- postpones the checkout of some entries,
thus modifying the checkout order.

When we have to check out an out-of-order entry and the lstat() cache is
invalid (due to a previous path collision), checkout_entry() may end up
using the invalid data and thrusting that the leading components are
real directories when, in reality, they are not. In the best case
scenario, where the directory was replaced by a regular file, the user
will get an error: "fatal: unable to create file 'foo/bar': Not a
directory". But if the directory was replaced by a symlink, checkout
could actually end up following the symlink and writing the file at a
wrong place, even outside the repository. Since delayed checkout is
affected by this bug, it could be used by an attacker to write
arbitrary files during the clone of a maliciously crafted repository.

Some candidate solutions considered were to disable the lstat() cache
during unordered checkouts or sort the entries before passing them to
the checkout machinery. But both ideas include some performance penalty
and they don't future-proof the code against new unordered use cases.

Instead, we now manually reset the lstat cache whenever we successfully
remove a directory. Note: We are not even checking whether the directory
was the same as the lstat cache points to because we might face a
scenario where the paths refer to the same location but differ due to
case folding, precomposed UTF-8 issues, or the presence of `..`
components in the path. Two regression tests, with case-collisions and
utf8-collisions, are also added for both checkout-index and delayed
checkout.

Note: to make the previously mentioned clone attack unfeasible, it would
be sufficient to reset the lstat cache only after the remove_subtree()
call inside checkout_entry(). This is the place where we would remove a
directory whose path collides with the path of another entry that we are
currently trying to check out (possibly a symlink). However, in the
interest of a thorough fix that does not leave Git open to
similar-but-not-identical attack vectors, we decided to intercept
all `rmdir()` calls in one fell swoop.

This addresses CVE-2021-21300.

Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
2021-02-12 15:47:02 +01:00
Andrew Klotz
f276e2a469 config: improve error message for boolean config
Currently invalid boolean config values return messages about 'bad
numeric', which is slightly misleading when the error was due to a
boolean value. We can improve the developer experience by returning a
boolean error message when we know the value is neither a bool text or
int.

before with an invalid boolean value of `non-boolean`, its unclear what
numeric is referring to:
  fatal: bad numeric config value 'non-boolean' for 'commit.gpgsign': invalid unit

now the error message mentions `non-boolean` is a bad boolean value:
  fatal: bad boolean config value 'non-boolean' for 'commit.gpgsign'

Signed-off-by: Andrew Klotz <agc.klotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:44:55 -08:00
Shubham Verma
488acf15df t7001: use test rather than [
According to Documentation/CodingGuidelines, we should use "test"
rather than "[ ... ]" in shell scripts, so let's replace the
"[ ... ]" with "test" in the t7001 test script.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:17 -08:00
Shubham Verma
39252c833e t7001: use here-docs instead of echo
Change from old style to current style by taking advantage of
here-docs instead of echo commands.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
Shubham Verma
5d683c3f4b t7001: put each command on a separate line
Modern practice is to avoid multiple commands per line, and
instead place each command on its own line.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
Shubham Verma
d2ecddc981 t7001: use '>' rather than 'touch'
Use `>` rather than `touch` to create an empty file when the
timestamp isn't relevant to the test.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
Shubham Verma
368d278249 t7001: avoid using cd outside of subshells
Avoid using `cd` outside of subshells since, if the test fails,
there is no guarantee that the current working directory is the
expected one, which may cause subsequent tests to run in the wrong
directory.

While at it, make some other tests more concise by replacing
simple subshells with `git -C`.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
Shubham Verma
dd72154149 t7001: remove whitespace after redirect operators
According to Documentation/CodingGuidelines, there should be no
whitespace after redirect operators. So, we should remove these
whitespaces after redirect operators.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
Shubham Verma
9bcaeb71a6 t7001: modernize subshell formatting
Some test use an old style for formatting subshells:

        (command &&
            ...

Update them to the modern style:

        (
            command &&
            ...

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
Shubham Verma
9b46e9c9cc t7001: remove unnecessary blank lines
Some tests use a deprecated style in which there are unnecessary
blank lines after the opening quote of the test body and before the
closing quote. So we should remove these unnecessary blank lines.

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
Shubham Verma
a76d90670a t7001: indent with TABs instead of spaces
Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
Shubham Verma
5712d62ccf t7001: modernize test formatting
Some tests in this script are formatted using a very old style:

        test_expect_success \
            'title' \
            'body line 1 &&
            body line 2'

Update the formatting to the modern style:

        test_expect_success 'title' '
            body line 1 &&
            body line 2
        '

Signed-off-by: Shubham Verma <shubhunic@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:42:16 -08:00
Denton Liu
8c2462d1fe t3905: use test_cmp() to check file contents
Modernize the script by doing file content comparisons using test_cmp()
instead of `test x = "$(cat file)"`.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
Denton Liu
27e25a8cbf t3905: replace test -s with test_file_not_empty
In order to modernize the test script, replace `test -s` with
test_file_not_empty(), which provides better diagnostic output in the
case of failure.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
Denton Liu
389ece4022 t3905: remove nested git in command substitution
If a git command in a nested command substitution fails, it will be
silently ignored since only the return code of the outer command
substitutions is reported. Factor out nested command substitutions so
that the error codes of those commands are reported.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
Denton Liu
bbaa45c3aa t3905: move all commands into test cases
In order to modernize the tests, move commands that currently run
outside of test cases into a test case. Where possible, clean up files
that are produced using test_when_finished() but in the case where files
persist over multiple test cases, create a new test case to perform
cleanup.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
Denton Liu
32b7385e43 t3905: remove spaces after redirect operators
For shell scripts, the usual convention is for there to be no space
after redirection operators, (e.g. `>file`, not `> file`). Remove these
spaces wherever they appear.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
Jeff King
16950f8384 rev-list: add --disk-usage option for calculating disk usage
It can sometimes be useful to see which refs are contributing to the
overall repository size (e.g., does some branch have a bunch of objects
not found elsewhere in history, which indicates that deleting it would
shrink the size of a clone).

You can find that out by generating a list of objects, getting their
sizes from cat-file, and then summing them, like:

    git rev-list --objects --no-object-names main..branch
    git cat-file --batch-check='%(objectsize:disk)' |
    perl -lne '$total += $_; END { print $total }'

Though note that the caveats from git-cat-file(1) apply here. We "blame"
base objects more than their deltas, even though the relationship could
easily be flipped. Still, it can be a useful rough measure.

But one problem is that it's slow to run. Teaching rev-list to sum up
the sizes can be much faster for two reasons:

  1. It skips all of the piping of object names and sizes.

  2. If bitmaps are in use, for objects that are in the
     bitmapped packfile we can skip the oid_object_info()
     lookup entirely, and just ask the revindex for the
     on-disk size.

This patch implements a --disk-usage option which produces the same
answer in a fraction of the time. Here are some timings using a clone of
torvalds/linux:

  [rev-list piped to cat-file, no bitmaps]
  $ time git rev-list --objects --no-object-names --all |
    git cat-file --buffer --batch-check='%(objectsize:disk)' |
    perl -lne '$total += $_; END { print $total }'
  1459938510
  real	0m29.635s
  user	0m38.003s
  sys	0m1.093s

  [internal, no bitmaps]
  $ time git rev-list --disk-usage --objects --all
  1459938510
  real	0m31.262s
  user	0m30.885s
  sys	0m0.376s

Even though the wall-clock time is slightly worse due to parallelism,
notice the CPU savings between the two. We saved 21% of the CPU just by
avoiding the pipes.

But the real win is with bitmaps. If we use them without the new option:

  [rev-list piped to cat-file, bitmaps]
  $ time git rev-list --objects --no-object-names --all --use-bitmap-index |
    git cat-file --batch-check='%(objectsize:disk)' |
    perl -lne '$total += $_; END { print $total }'
  1459938510
  real	0m6.244s
  user	0m8.452s
  sys	0m0.311s

then we're faster to generate the list of objects, but we still spend a
lot of time piping and looking things up. But if we do both together:

  [internal, bitmaps]
  $ time git rev-list --disk-usage --objects --all --use-bitmap-index
  1459938510
  real	0m0.219s
  user	0m0.169s
  sys	0m0.049s

then we get the same answer much faster.

For "--all", that answer will correspond closely to "du objects/pack",
of course. But we're actually checking reachability here, so we're still
fast when we ask for more interesting things:

  $ time git rev-list --disk-usage --use-bitmap-index v5.0..v5.10
  374798628
  real	0m0.429s
  user	0m0.356s
  sys	0m0.072s

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:57:55 -08:00
Johannes Schindelin
c809798b2a reflog expire --stale-fix: be generous about missing objects
Whenever a user runs `git reflog expire --stale-fix`, the most likely
reason is that their repository is at least _somewhat_ corrupt. Which
means that it is more than just possible that some objects are missing.

If that is the case, that can currently let the command abort through
the phase where it tries to mark all reachable objects.

Instead of adding insult to injury, let's be gentle and continue as best
as we can in such a scenario, simply by ignoring the missing objects and
moving on.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:21:52 -08:00
Ævar Arnfjörð Bjarmason
1108cea7f8 tests: remove most uses of test_i18ncmp
As a follow-up to d162b25f95 (tests: remove support for
GIT_TEST_GETTEXT_POISON, 2021-01-20) remove most uses of test_i18ncmp
via a simple s/test_i18ncmp/test_cmp/g search-replacement.

I'm leaving t6300-for-each-ref.sh out due to a conflict with in-flight
changes between "master" and "seen", as well as the prerequisite
itself due to other changes between "master" and "next/seen" which add
new test_i18ncmp uses.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:48:27 -08:00
Ævar Arnfjörð Bjarmason
b1e079807b tests: remove last uses of C_LOCALE_OUTPUT
Remove the last uses of the C_LOCALE_OUTPUT prerequisite as well as
the prerequisite itself. This is a follow-up to d162b25f95 (tests:
remove support for GIT_TEST_GETTEXT_POISON, 2021-01-20), as well as
the preceding commit where we removed the simpler uses of
C_LOCALE_OUTPUT.

Here I'm slightly refactoring a test added in 21e5ad50fc (safecrlf:
Add mechanism to warn about irreversible crlf conversions,
2008-02-06), as well as getting rid of another "test_have_prereq
C_LOCALE_OUTPUT" use.

I'm not leaving the prerequisite itself in place for in-flight changes
as there currently are none that introduce new tests that rely on it,
and because C_LOCALE_OUTPUT is currently a noop on the master branch
we likely won't have any new submissions that use it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:48:27 -08:00
Ævar Arnfjörð Bjarmason
a926c4b904 tests: remove most uses of C_LOCALE_OUTPUT
As a follow-up to d162b25f95 (tests: remove support for
GIT_TEST_GETTEXT_POISON, 2021-01-20) remove those uses of the now
always true C_LOCALE_OUTPUT prerequisite from those tests which
declare it as an argument to test_expect_{success,failure}.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:48:26 -08:00
Ævar Arnfjörð Bjarmason
780aa0a21e tests: remove last uses of GIT_TEST_GETTEXT_POISON=false
Follow-up my 73c01d25fe (tests: remove uses of
GIT_TEST_GETTEXT_POISON=false, 2021-01-20) by removing the last uses
of GIT_TEST_GETTEXT_POISON=*.

These assignments were part of branch that was in-flight at the time
of the gettext poison removal. See 466f94ec45 (Merge branch
'ab/detox-gettext-tests', 2021-02-10) and c7d6d419b0 (Merge branch
'ab/mktag', 2021-01-25) for the merging of the two branches.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:48:26 -08:00