The code that reads the pack data using the offsets stored in the
pack idx file has been taught to check the validity of the data in
the idx more carefully.
* jk/pack-idx-corruption-safety:
sha1_file.c: mark strings for translation
use_pack: handle signed off_t overflow
nth_packed_object_offset: bounds-check extended offset
t5313: test bounds-checks of corrupted/malicious pack/idx files
"git config section.var value" to set a value in per-repository
configuration file failed when it was run outside any repository,
but didn't say the reason correctly.
* js/config-set-in-non-repository:
git config: report when trying to modify a non-existing repo config
A helper function "git submodule" uses since v2.7.0 to list the
modules that match the pathspec argument given to its subcommands
(e.g. "submodule add <repo> <path>") has been fixed.
* sb/submodule-module-list-fix:
submodule helper list: respect correct path prefix
Recent versions of GNU grep are pickier when their input contains
arbitrary binary data, which some of our tests use. Rewrite the
tests to sidestep the problem.
* jk/grep-binary-workaround-in-test:
t9200: avoid grep on non-ASCII data
t8005: avoid grep on non-ASCII data
"git rev-parse --git-common-dir" used in the worktree feature
misbehaved when run from a subdirectory.
* nd/git-common-dir-fix:
rev-parse: take prefix into account in --git-common-dir
"git show 'HEAD:Foo[BAR]Baz'" did not interpret the argument as a
rev, i.e. the object named by the pathname with wildcard
characters in a tree object.
* nd/dwim-wildcards-as-pathspecs:
get_sha1: don't die() on bogus search strings
check_filename: tighten dwim-wildcard ambiguity
checkout: reorder check_filename conditional
Handling of errors while writing into our internal asynchronous
process has been made more robust, which reduces flakiness in our
tests.
* jk/epipe-in-async:
t5504: handle expected output from SIGPIPE death
test_must_fail: report number of unexpected signal
fetch-pack: ignore SIGPIPE in sideband demuxer
write_or_die: handle EPIPE in async threads
Many codepaths forget to check the return value from git_config_set();
the function is made to die() to make sure we do not proceed when
setting a configuration variable failed.
* ps/config-error:
config: rename git_config_set_or_die to git_config_set
config: rename git_config_set to git_config_set_gently
compat: die when unable to set core.precomposeunicode
sequencer: die on config error when saving replay opts
init-db: die on config errors when initializing empty repo
clone: die on config error in cmd_clone
remote: die on config error when manipulating remotes
remote: die on config error when setting/adding branches
remote: die on config error when setting URL
submodule--helper: die on config error when cloning module
submodule: die on config error when linking modules
branch: die on config error when editing branch description
branch: die on config error when unsetting upstream
branch: report errors in tracking branch setup
config: introduce set_or_die wrappers
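The split between a "gently" variant that returns an error and a plain variant that dies can be sketched as follows. This is only an illustration: the `demo_` names, the `FILE *` destination and the message text are hypothetical and are not git's real config API; exit status 128 mirrors what git's die() conventionally uses.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical low-level setter: reports failure to the caller
 * instead of terminating.  Returns 0 on success, -1 on failure.
 */
static int demo_config_set_gently(FILE *out, const char *key, const char *value)
{
	if (!out || !key || !value)
		return -1;
	if (fprintf(out, "%s = %s\n", key, value) < 0)
		return -1;
	return 0;
}

/*
 * Hypothetical wrapper for callers that must not proceed when the
 * value could not be written: it never returns on failure.
 */
static void demo_config_set(FILE *out, const char *key, const char *value)
{
	if (demo_config_set_gently(out, key, value) < 0) {
		fprintf(stderr, "fatal: could not set '%s' to '%s'\n",
			key ? key : "(null)", value ? value : "(null)");
		exit(128);
	}
}
```

Callers that can recover (or that only want to warn) would use the "gently" form and inspect its return value; everybody else gets the safe behavior by default.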
Traditionally, the tests that try commands that work on the
contents in the working tree were named with "worktree" in their
filenames, but with the recent addition of "git worktree"
subcommand, whose tests are also named similarly, it has become
harder to tell them apart. The traditional tests have been renamed
to use "work-tree" instead in an attempt to differentiate them.
* mg/work-tree-tests:
tests: rename work-tree tests to *work-tree*
Commit 8bf4bec (add "ok=sigpipe" to test_must_fail and use
it to fix flaky tests, 2015-11-27) taught t5504 to handle
"git push" racily exiting with SIGPIPE rather than failing.
However, one of the tests checks the output of the command,
as well. In the SIGPIPE case, we will not have produced any
output. If we want the test to be truly non-flaky, we have
to accept either output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a command is marked as test_must_fail but dies with a
signal, we consider that a problem and report the error to
stderr. However, we don't say _which_ signal; knowing that
can make debugging easier. Let's share as much as we know.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A v2 pack index file can specify an offset within a packfile
of up to 2^64-1 bytes. On a system with a signed 64-bit
off_t, we can represent only up to 2^63-1. This means that a
corrupted .idx file can end up with a negative offset in the
pack code. Our bounds-checking use_pack function looks for
too-large offsets, but not for ones that have wrapped around
to negative. Let's do so, which fixes an out-of-bounds
access demonstrated in t5313.
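A minimal sketch of the two checks, assuming a signed 64-bit offset type. The helper names and the `tail` parameter (bytes at the end of the pack that never hold object data) are hypothetical; this is not the actual use_pack() code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 when `offset` may be used to read
 * from a pack of `pack_size` bytes, 0 otherwise.
 */
static int pack_offset_is_valid(int64_t offset, int64_t pack_size, int64_t tail)
{
	if (offset < 0)
		return 0;	/* wrapped around to negative: broken .idx? */
	if (offset > pack_size - tail)
		return 0;	/* beyond the end of the pack: truncated pack? */
	return 1;
}

/*
 * An on-disk 64-bit offset is unsigned; it is representable in a
 * signed 64-bit type only when its top bit is clear.
 */
static int fits_signed_64(uint64_t raw)
{
	return raw <= (uint64_t)INT64_MAX;
}
```

Checking only the upper bound would let a wrapped-around (negative) offset slip through, because a negative value always compares as smaller than the pack size.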
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a pack .idx file has a corrupted offset for an object, we
may try to access an offset in the .idx or .pack file that
is larger than the file's size. For the .pack case, we have
use_pack() to protect us, which realizes the access is out
of bounds. But if the corrupted value asks us to look in the
.idx file's secondary 64-bit offset table, we blindly add it
to the mmap'd index data and access arbitrary memory.
We can fix this with a simple bounds-check compared to the
size we found when we opened the .idx file.
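Such a bounds-check might look roughly like the sketch below. The function name and its parameters are hypothetical, and for simplicity it ignores any trailer that follows the table; it only shows how an untrusted entry index can be compared against the file size without overflowing.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper.  `idx_size` is the size recorded when the .idx
 * file was opened, `ext_table_start` the byte position of the 64-bit
 * offset table inside it, and `n` the (possibly corrupt) entry index.
 * Returns 1 when the whole 8-byte entry lies inside the file.
 */
static int ext_offset_in_bounds(size_t idx_size, size_t ext_table_start, uint32_t n)
{
	size_t room;

	if (ext_table_start > idx_size)
		return 0;
	/*
	 * Divide the remaining room instead of multiplying `n`, so the
	 * arithmetic cannot overflow size_t.
	 */
	room = idx_size - ext_table_start;
	return room / 8 > n;
}
```

A caller in the style described here would die when the helper returns 0, rather than adding the entry position to the mmap'd base pointer.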
Note that there's similar code in index-pack that is
triggered only during "index-pack --verify". To support
both, we pull the bounds-check into a separate function,
which dies when it sees a corrupted file.
It would be nice if we could return an error, so that the
pack code could try to find a good copy of the object
elsewhere. Currently nth_packed_object_offset doesn't have
any way to return an error, but it could probably use "0" as
a sentinel value (since no object can start there). This is
the minimal fix, and we can improve the resilience later on
top.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our on-disk .pack and .idx files may reference other data by
offset. We should make sure that we are not fooled by
corrupt data into accessing memory outside of our mmap'd
boundaries.
This patch adds a series of tests for offsets found in .pack
and .idx files. For the most part we get this right, but
there are two tests of .idx files marked as failures: we do
not bounds-check offsets in the v2 index's extended offset
table, nor do we handle .idx offsets that overflow a signed
off_t.
With these tests, we should have good coverage of all
offsets found in these files. Note that this doesn't cover
.bitmap files, which may have similar bugs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is a pilot error to call `git config section.key value` outside of
any Git worktree. The message
error: could not lock config file .git/config: No such file or
directory
is not very helpful in that situation, though. Let's print a helpful
message instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a regression introduced by 74703a1e4d (submodule: rewrite
`module_list` shell function in C, 2015-09-02).
Add a test to ensure we list the right submodule when giving a
specific pathspec.
Reported-By: Caleb Jorden <cjorden@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GNU grep 2.23 detects the input used in this test as binary data so it
does not work for extracting lines from a file. We could add the "-a"
option to force grep to treat the input as text, but not all
implementations support that. Instead, use sed to extract the desired
lines since it will always treat its input as text.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GNU grep 2.23 detects the input used in this test as binary data so it
does not work for extracting lines from a file. We could add the "-a"
option to force grep to treat the input as text, but not all
implementations support that. Instead, use sed to extract the desired
lines since it will always treat its input as text.
While touching these lines, modernize the test style to avoid hiding the
exit status of "git blame" and remove a space following a redirection
operator. Also swap the order of the expected and actual output
files given to test_cmp; we compare expect and actual to show how
actual output differs from what is expected.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When invoking `git-remote --set-url` we do not check the return
value when writing the actual new URL to the configuration file,
pretending to the user that the configuration has been set while
it was in fact not persisted.
Fix this problem by dying early when setting the config fails.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we try to unset upstream configurations we do not check
return codes for the `git_config_set` functions. As those may
indicate that we were unable to unset the respective
configuration we may exit successfully without any error message
while in fact the upstream configuration was not unset.
Fix this by dying with an error message when we cannot unset the
configuration.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When setting up a new tracking branch fails due to issues with
the configuration file we do not report any errors to the user
and pretend setting the tracking branch succeeded.
Setting up the tracking branch is handled by the
`install_branch_config` function. We do not want to simply die
there as the function is not only invoked when explicitly setting
upstream information with `git branch --set-upstream-to=`, but
also by `git push --set-upstream` and `git clone`. While it is
reasonable to die in the explicit first case, we would lose
information in the latter two cases, so we only print the error
message but continue the program as usual.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"Work tree" or "working tree" is the name of a checked out tree,
"worktree" the name of the command which manages several working trees.
The naming of tests mixes these two, currently:
$ ls t/*worktree*
t/t1501-worktree.sh
t/t1509-root-worktree.sh
t/t2025-worktree-add.sh
t/t2026-worktree-prune.sh
t/t2027-worktree-list.sh
t/t2104-update-index-skip-worktree.sh
t/t3320-notes-merge-worktrees.sh
t/t7011-skip-worktree-reading.sh
t/t7012-skip-worktree-writing.sh
t/t7409-submodule-detached-worktree.sh
$ grep -l "git worktree" t/*.sh
t/t0002-gitfile.sh
t/t1400-update-ref.sh
t/t2025-worktree-add.sh
t/t2026-worktree-prune.sh
t/t2027-worktree-list.sh
t/t3320-notes-merge-worktrees.sh
t/t7410-submodule-checkout-to.sh
Rename t1501, t1509 and t7409 to make it clear on first glance that they
test work tree related behavior, rather than the worktree command.
t2104, t7011 and t7012 are about the "skip-worktree" flag so that their
name should remain unchanged.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The emulated "yes" command used in our test scripts has been
tweaked not to spend too much time generating unnecessary output
that is not used, to help those who test on Windows where it would
not stop until it fills the pipe buffer due to lack of SIGPIPE.
* js/test-lib-windows-emulated-yes:
test-lib: limit the output of the yes utility
"git push --force-with-lease" has been taught to report if the push
needed to force (or fast-forwarded).
* aw/push-force-with-lease-reporting:
push: fix ref status reporting for --force-with-lease
The low-level merge machinery has been taught to use CRLF line
termination when inserting conflict markers to merged contents that
are themselves CRLF line-terminated.
* js/xmerge-marker-eol:
merge-file: ensure that conflict sections match eol style
merge-file: let conflict markers match end-of-line style of the context
Most of the time, get_git_common_dir() returns an absolute path so
prefix is irrelevant. If it returns a relative path (e.g. from the
main worktree) then prefixing is required.
Noticed-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_sha1() function generally returns an error code
rather than dying, and we sometimes speculatively call it
with something that may be a revision or a pathspec, in
order to see which one it might be.
If it sees a bogus ":/" search string, though, it complains,
without giving the caller the opportunity to recover. We can
demonstrate this in t6133 by looking for ":/*.t", which
should mean "*.t at the root of the tree", but instead dies
because of the invalid regex (the "*" has nothing to operate
on).
We can fix this by returning an error rather than calling
die(). Unfortunately, the tradeoff is that the error message
is slightly worse in cases where we _do_ know we have a rev.
E.g., running "git log ':/*.t' --" before yielded:
fatal: Invalid search pattern: *.t
and now we get only:
fatal: bad revision ':/*.t'
There's not a simple way to fix this short of passing a
"quiet" flag all the way through the get_sha1() stack.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When specifying both revisions and pathnames, we allow
"<rev> -- <pathspec>" to be spelled without the "--" as long
as it is not ambiguous. The original logic was something
like:
1. Resolve each item with get_sha1(). If successful,
we know it can be a <rev>. Verify that it _isn't_ a
filename, using verify_non_filename(), and complain of
ambiguity otherwise.
2. If get_sha1() didn't succeed, make sure that it _is_
a file, using verify_filename(). If not, complain
that it is neither a <rev> nor a <pathspec>.
Both verify_filename() and verify_non_filename() rely on
check_filename(), which definitely said "yes, this is a
file" or "no, it is not" using lstat().
Commit 28fcc0b (pathspec: avoid the need of "--" when
wildcard is used, 2015-05-02) introduced a convenience
feature: check_filename() will consider anything with
wildcard meta-characters as a possible filename, without
even checking the filesystem.
This works well for case 2. For such a wildcard, we would
previously have died and said "it is neither". Post-28fcc0b,
we assume it's a pathspec and proceed.
But it makes some instances of case 1 worse. We may have an
extended sha1 expression that contains meta-characters
(e.g., "HEAD^{/foo.*bar}"), and we now complain that it's
also a filename, due to the wildcard characters (even though
that wildcard would not match anything in the filesystem).
One solution would be to actually expand the pathname and
see if it matches anything on the filesystem. But that's
potentially expensive, and we do not have to be so rigorous
for this DWIM magic (if you want rigor, use "--").
Instead, we can just use different rules for cases 1 and 2.
When we know something is a rev, we will complain only if it
meets a much higher standard for "this is also a file";
namely that it actually exists in the filesystem. Case 2
remains the same: we use the looser "it could be a filename"
standard introduced by 28fcc0b.
We can accomplish this by pulling the wildcard logic out of
check_filename() and putting it into verify_filename(). Its
partner verify_non_filename() does not need a change, since
check_filename() goes back to implementing the "higher
standard".
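The resulting division of labor can be sketched like this. All `demo_` names are hypothetical stand-ins, not the real functions, and the set of wildcard characters is an assumption made for the illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for lstat() under strict compilers */
#endif
#include <string.h>
#include <sys/stat.h>

/* Strict test: does the argument name something on the filesystem? */
static int demo_check_filename(const char *arg)
{
	struct stat st;

	return lstat(arg, &st) == 0;
}

/* Loose test: does it contain glob metacharacters (assumed set)? */
static int demo_has_wildcard(const char *arg)
{
	return strpbrk(arg, "*?[\\") != NULL;
}

/*
 * Case 1: the argument already resolved as a rev.  It is ambiguous
 * only if it also exists on disk (the "higher standard").
 */
static int demo_rev_is_ambiguous(const char *arg)
{
	return demo_check_filename(arg);
}

/*
 * Case 2: the argument is not a rev.  Accept it as a pathspec when
 * it exists or merely looks like a wildcard (the looser standard).
 */
static int demo_could_be_pathspec(const char *arg)
{
	return demo_has_wildcard(arg) || demo_check_filename(arg);
}
```

With this split, "HEAD^{/foo.*bar}" no longer counts as ambiguous merely because it contains "*", while a bare "*.t" that is not a rev is still taken as a pathspec.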
Besides these two callers of check_filename(), there is one
other: git-checkout does a similar DWIM itself. It hits this
code path only after get_sha1() has returned failure, making
it case 2, which gets the special wildcard treatment.
Note that we drop the tests in t2019 in favor of a more
complete set in t6133. t2019 was not the right place for
them (it's about refname ambiguity, not dwim parsing
ambiguity), and the second test explicitly checked for the
opposite result of the case we are fixing here (which didn't
really make any sense; as shown by the test_must_fail in the
test, it would only serve to annoy people).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The underlying machinery used by "ls-files -o" and other commands
has been taught not to create an empty submodule ref cache for a
directory that is not a submodule. This removes a ton of wasted
CPU cycles.
* jk/ref-cache-non-repository-optim:
resolve_gitlink_ref: ignore non-repository paths
clean: make is_git_repository a public function
dirname() emulation has been added, as Msys2 lacks it.
* js/dirname-basename:
mingw: avoid linking to the C library's isalpha()
t0060: loosen overly strict expectations
t0060: verify that basename() and dirname() work as expected
compat/basename.c: provide a dirname() compatibility function
compat/basename: make basename() conform to POSIX
Refactor skipping DOS drive prefixes
A few options of "git diff" did not work well when the command was
run from a subdirectory.
* nd/diff-with-path-params:
diff: make -O and --output work in subdirectory
diff-no-index: do not take a redundant prefix argument
"git tag" started listing a tag "foo" as "tags/foo" when a branch
named "foo" exists in the same repository; remove this unnecessary
disambiguation, which is a regression introduced in v2.7.0.
* jk/list-tag-2.7-regression:
tag: do not show ambiguous tag names as "tags/foo"
t6300: use test_atom for some un-modern tests
The description for SANITY prerequisite the test suite uses has
been clarified both in the comment and in the implementation.
* jk/sanity:
test-lib: clarify and tighten SANITY
A recent optimization to filter-branch in v2.7.0 introduced a
regression when --prune-empty filter is used, which has been
corrected.
* jk/filter-branch-no-index:
filter-branch: resolve $commit^{tree} in no-index case
Many codepaths that run "gc --auto" before exiting kept packfiles
mapped and left the file descriptors to them open, which was not
friendly to systems that cannot remove files that are open. They
now close the packs before doing so.
* js/close-packs-before-gc:
receive-pack: release pack files before garbage-collecting
merge: release pack files before garbage-collecting
am: release pack files before garbage-collecting
fetch: release pack files before garbage-collecting
The ignore mechanism saw a few regressions around untracked file
listing and sparse checkout selection areas in 2.7.0; the change
that is responsible for the regression has been reverted.
* nd/exclusion-regression-fix:
Revert "dir.c: don't exclude whole dir prematurely if neg pattern may match"
"git reflog" incorrectly assumed that all objects that used to be
at the tip of a ref must be commits, which caused it to segfault.
* dk/reflog-walk-with-non-commit:
reflog-walk: don't segfault on non-commit sha1's in the reflog
"git send-email" was confused by escaped quotes stored in the alias
files saved by "mutt", which has been corrected.
* ew/send-email-mutt-alias-fix:
git-send-email: do not double-escape quotes from mutt
An earlier change in 2.5.x-era broke users' hooks and aliases by
exporting GIT_WORK_TREE to point at the root of the working tree,
interfering when they tried to use a different working tree without
setting GIT_WORK_TREE environment themselves.
* nd/stop-setenv-work-tree:
Revert "setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR"
On Windows, there is no SIGPIPE. A consequence of this is that the
upstream process of a pipe does not notice the death of the downstream
process until the pipe buffer is full and writing more data returns an
error. This behavior is the reason for an annoying delay during the
execution of t7610-mergetool.sh: There are a number of test cases where
'yes' is invoked upstream. Since the utility is basically an endless
loop it runs, on Windows, until the pipe buffer is full. This does take
a few seconds.
The test suite has its own implementation of 'yes'. Modify it to produce
only a limited amount of output that is sufficient for the test suite.
The amount chosen should be sufficiently high for any test case, assuming
that future test cases will not exaggerate their demands of input from
an upstream 'yes' invocation.
[j6t: commit message]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --force-with-lease push option leads to less
detailed status information than --force. In particular,
the output indicates that a reference was fast-forwarded,
even when it was force-updated.
Modify the --force-with-lease ref status logic to leverage
the --force ref status logic when the "lease" conditions
are met.
Also, enhance tests to validate output status reporting.
Signed-off-by: Andrew Wheeler <awheeler@motorola.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous patch, we made sure that the conflict markers themselves
match the end-of-line style of the input files. However, this still left
out the conflicting text itself: if it lacks a trailing newline, we
add one, and should add a carriage return when appropriate, too.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When merging files with CR/LF line endings, the conflict markers should
match those, lest the output file has mixed line endings.
This is particularly of interest on Windows, where some editors get
*really* confused by mixed line endings.
The original version of this patch by Beat Bolli respected core.eol, and
a subsequent improvement by this developer also respected gitattributes.
This approach was suboptimal, though: `git merge-file` was invented as a
drop-in replacement for GNU merge and as such has no problem operating
outside of any repository at all!
Another problem with the original approach was pointed out by Junio
Hamano: legacy repositories might have their text files committed using
CR/LF line endings (and core.eol and the gitattributes would give us a
false impression there). Therefore, the much superior approach is to
simply match the context's line endings, if any.
We actually do not have to look at the *entire* context at all: if the
files are all LF-only, or if they all have CR/LF line endings, it is
sufficient to look at just a *single* line to match that style. And if
the line endings are mixed anyway, it is *still* okay to imitate just a
single line's eol: we will just add to the pile of mixed line endings,
and there is nothing we can do about that.
So what we do is: we look at the line preceding the conflict, falling
back to the line preceding that in case it was the last line and had no
line ending, falling back to the first line, first in the first
post-image, then the second post-image, and finally the pre-image.
If we find consistent CR/LF (or undecided) end-of-line style, we match
that, otherwise we use LF-only line endings for the conflict markers.
Note that while it is true that there have to be at least two lines we
can look at (otherwise there would be no conflict), the same is not true
for line *endings*: the three files in question could all consist of a
single line without any line ending, each. In this case we fall back to
using LF-only.
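The decision rule can be illustrated with a simplified sketch. The enum and function names are hypothetical, and the search for which line to inspect in each file is left out; only the classification of one line and the final choice are shown.

```c
#include <stddef.h>

enum eol_style { EOL_UNDECIDED = -1, EOL_LF = 0, EOL_CRLF = 1 };

/*
 * Classify one line given as (pointer, length), where the length
 * includes the line terminator if there is one.
 */
static enum eol_style line_eol(const char *line, size_t len)
{
	if (len == 0 || line[len - 1] != '\n')
		return EOL_UNDECIDED;	/* last line without a line ending */
	if (len >= 2 && line[len - 2] == '\r')
		return EOL_CRLF;
	return EOL_LF;
}

/*
 * Choose the marker style from the styles found in the two
 * post-images and the pre-image: CR/LF only when at least one is
 * CR/LF and none is LF-only; LF-only in every other case.
 */
static enum eol_style marker_eol(enum eol_style post1, enum eol_style post2,
				 enum eol_style pre)
{
	if (post1 == EOL_LF || post2 == EOL_LF || pre == EOL_LF)
		return EOL_LF;
	if (post1 == EOL_CRLF || post2 == EOL_CRLF || pre == EOL_CRLF)
		return EOL_CRLF;
	return EOL_LF;	/* all undecided: fall back to LF-only */
}
```

Three single-line files without any line ending all classify as undecided, which is the case where the sketch, like the description above, falls back to LF-only.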
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since b7cc53e9 (tag.c: use 'ref-filter' APIs, 2015-07-11),
git-tag has started showing tags with ambiguous names (i.e.,
when both "heads/foo" and "tags/foo" exist) as "tags/foo"
instead of just "foo". This is both:
- pointless; the output of "git tag" includes only
refs/tags, so we know that "foo" means the one in
"refs/tags".
and
- ambiguous; in the original output, we know that the line
"foo" means that "refs/tags/foo" exists. In the new
output, it is unclear whether we mean "refs/tags/foo" or
"refs/tags/tags/foo".
The reason this happens is that commit b7cc53e9 switched
git-tag to use ref-filter's "%(refname:short)" output
formatting, which was adapted from for-each-ref. This more
general code does not know that we care only about tags, and
uses shorten_unambiguous_ref to get the short-name. We need
to tell it that we care only about "refs/tags/", and it
should shorten with respect to that value.
In theory, the ref-filter code could figure this out by us
passing FILTER_REFS_TAGS. But there are two complications
there:
1. The handling of refname:short is deep in formatting
code that does not even have our ref_filter struct, let
alone the arguments to the filter_ref struct.
2. In git v2.7.0, we expose the formatting language to the
user. If we follow this path, it will mean that
"%(refname:short)" behaves differently for "tag" versus
"for-each-ref" (including "for-each-ref refs/tags/"),
which can lead to confusion.
Instead, let's add a new modifier to the formatting
language, "strip", to remove a specific set of prefix
components. This fixes "git tag", and lets users invoke the
same behavior from their own custom formats (for "tag" or
"for-each-ref") while leaving ":short" with its same
consistent meaning in all places.
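What stripping a fixed number of prefix components means can be sketched as below. The helper name is hypothetical and this is not the ref-filter implementation; returning NULL when there are too few components is a choice made for the sketch.

```c
#include <string.h>

/*
 * Hypothetical helper: skip `n` leading slash-separated components
 * of `refname`.  Returns a pointer into `refname`, or NULL when the
 * name does not have more than `n` components.
 */
static const char *strip_ref_components(const char *refname, unsigned int n)
{
	const char *p = refname;
	unsigned int i;

	for (i = 0; i < n; i++) {
		const char *slash = strchr(p, '/');

		if (!slash)
			return NULL;	/* ran out of components */
		p = slash + 1;		/* continue after the slash */
	}
	return *p ? p : NULL;
}
```

Stripping two components turns "refs/tags/foo" into "foo" and "refs/tags/tags/foo" into "tags/foo", so the two refs stay distinguishable in the output regardless of whether a branch named "foo" exists.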
We introduce a test in t7004 for "git tag", which fails
without this patch. We also add a similar test in t3203 for
"git branch", which does not actually fail. But since it is
likely that "branch" will eventually use the same formatting
code, the test helps defend against future regressions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because this script has to test so many formatters, we have
the nice "test_atom" helper, but we don't use it
consistently. Let's do so. This is shorter, gets rid of some
tests that have their "expected" setup outside of a
test_expect_success block, and lets us organize the changes
better (e.g., putting "refname:short" near "refname").
We also expand the "%(push)" tests a little to match the
"%(upstream)" ones.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we want to look up a submodule ref, we use
get_ref_cache(path) to find or auto-create its ref cache.
But if we feed a path that isn't actually a git repository,
we blindly create the ref cache, and then may die deeper in
the code when we try to access it. This is a problem because
many callers speculatively feed us a path that looks vaguely
like a repository, and expect us to tell them when it is
not.
This patch teaches resolve_gitlink_ref to reject
non-repository paths without creating a ref_cache. This
avoids the die(), and also performs better if you have a
large number of these faux-submodule directories (because
the ref_cache lookup is linear, under the assumption that
there won't be a large number of submodules).
To accomplish this, we also break get_ref_cache into two
pieces: the lookup and auto-creation (the latter is lumped
into create_ref_cache). This lets us first cheaply ask our
cache "is it a submodule we know about?" If so, we can avoid
repeating our filesystem lookup. So lookups of real
submodules are not penalized; they examine the submodule's
.git directory only once.
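The lookup/create split can be sketched with a small linked list. Every name here is hypothetical, and `demo_is_repo()` is a placeholder predicate that stands in for the real filesystem check.

```c
#include <stdlib.h>
#include <string.h>

struct demo_cache {
	struct demo_cache *next;
	char *path;
};

static struct demo_cache *demo_caches;	/* searched linearly */

/* Cheap question: "is it a submodule we already know about?" */
static struct demo_cache *lookup_demo_cache(const char *path)
{
	struct demo_cache *c;

	for (c = demo_caches; c; c = c->next)
		if (!strcmp(c->path, path))
			return c;
	return NULL;
}

/* Unconditionally add an entry; callers decide whether they should. */
static struct demo_cache *create_demo_cache(const char *path)
{
	size_t len = strlen(path) + 1;
	struct demo_cache *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->path = malloc(len);
	if (!c->path) {
		free(c);
		return NULL;
	}
	memcpy(c->path, path, len);
	c->next = demo_caches;
	demo_caches = c;
	return c;
}

/* Placeholder predicate for the demo only: "sub*" paths are repositories. */
static int demo_is_repo(const char *path)
{
	return !strncmp(path, "sub", 3);
}

/* Return the cache for `path`, refusing to create one for a non-repository. */
static struct demo_cache *get_cache_if_repo(const char *path,
					    int (*is_repo)(const char *))
{
	struct demo_cache *c = lookup_demo_cache(path);

	if (c)
		return c;	/* known already: skip the repository check */
	if (!is_repo(path))
		return NULL;	/* rejected, and the list does not grow */
	return create_demo_cache(path);
}
```

Because rejected paths never enter the list, a large number of non-repository directories no longer slows down the linear lookup for the real submodules.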
The test in t3000 demonstrates a case where this improves
correctness (we used to just die). The new perf case in
p7300 shows off the speed improvement in an admittedly
pathological repository:
Test                  HEAD^               HEAD
----------------------------------------------------------------
7300.4: ls-files -o   66.97(66.15+0.87)   0.33(0.08+0.24) -99.5%
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 348d4f2 (filter-branch: skip index read/write when
possible, 2015-11-06) taught filter-branch to optimize out
the final "git write-tree" when we know we haven't touched
the tree with any of our filters. It does so by simply putting
the literal text "$commit^{tree}" into the "$tree" variable,
avoiding a useless rev-parse call.
However, when we pass this to git_commit_non_empty_tree(),
it gets confused; it resolves "$commit^{tree}" itself, and
compares our string to the 40-hex sha1, which obviously
doesn't match. As a result, "--prune-empty" (or any custom
filter using git_commit_non_empty_tree) will fail to drop
an empty commit (when filter-branch is used without a tree
or index filter).
Let's resolve $tree to the 40-hex ourselves, so that
git_commit_non_empty_tree can work. Unfortunately, this is a
bit slower due to the extra process overhead:
$ cd t/perf && ./run 348d4f2 HEAD p7000-filter-branch.sh
[...]
Test                  348d4f2           HEAD
--------------------------------------------------------------
7000.2: noop filter   3.76(0.24+0.26)   4.54(0.28+0.24) +20.7%
We could try to make git_commit_non_empty_tree more clever.
However, the value of $tree here is technically
user-visible. The user can provide arbitrary shell code at
this stage, which could itself have a similar assumption to
what is in git_commit_non_empty_tree. So the conservative
choice to fix this regression is to take the 20% hit and
give the pre-348d4f2 behavior. We still end up much faster
than before the optimization:
$ cd t/perf && ./run 348d4f2^ HEAD p7000-filter-branch.sh
[...]
Test                  348d4f2^          HEAD
--------------------------------------------------------------
7000.2: noop filter   9.51(4.32+0.40)   4.51(0.28+0.23) -52.6%
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f400e51c (test-lib.sh: set prerequisite SANITY by testing what we
really need, 2015-01-27) improved the way SANITY prerequisite was
determined, but made the resulting code (incorrectly) imply that
SANITY is all about the effects that the permission bits of the
containing directory have on the files contained in it, through the
comment it added, its log message and the actual tests.
State what SANITY is about more clearly in the comment, and test
that a file whose permission bits say it should be unreadable truly
cannot be read.
Signed-off-by: Junio C Hamano <gitster@pobox.com>