Change a "git diff-tree" command to be &&-chained so that we won't
ignore its exit code, see the ea05fd5fbf (Merge branch
'ab/keep-git-exit-codes-in-tests', 2022-03-16) topic for prior art.
This fixes code added in b45563a229 (rename: Break filepairs with
different types., 2007-11-30). Due to hiding the exit code we hid a
memory leak under SANITIZE=leak.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the verify_mergeheads() helper the check the exit code of "git
rev-parse".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the test added in [1] to check the exit code of the "git"
invocations. An in-flight change[2] introduced a memory leak in these
invocations, which went undetected unless we were running under
"GIT_TEST_SANITIZE_LEAK_LOG=true".
Note that the in-flight change made 8 test files fail, but as far as I
can tell only this one would have had its exit code hidden unless
under "GIT_TEST_SANITIZE_LEAK_LOG=true". The rest would be caught
without it.
We could pick other variable names here than "ln%d", e.g. "commit",
"dummy_blob" and "file_blob", but having the "rev-parse" invocations
aligned makes the difference between them more readable, so let's pick
"ln%d".
1. 4cf2143e02 (pack-objects: break delta cycles before delta-search
phase, 2016-08-11)
2. https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/
3. faececa53f (test-lib: have the "check" mode for SANITIZE=leak
consider leak logs, 2022-07-28)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only abort the individual check instead of exiting the whole test script
if git show fails. Noticed with GIT_TEST_PASSING_SANITIZE_LEAK=check.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git status` can be slow when there are a large number of
untracked files and directories since Git must search the entire
worktree to enumerate them. When it is too slow, Git prints
advice with the elapsed search time and a suggestion to disable
the search using the `-uno` option. This suggestion also carries
a warning that might scare off some users.
However, these days, `-uno` isn't the only option. Git can reduce
the time taken to enumerate untracked files by caching results from
previous `git status` invocations, when the `core.untrackedCache`
and `core.fsmonitor` features are enabled.
Update the `git status` man page to explain these configuration
options, and update the advice to provide more detail about the
current configuration and to refer to the updated documentation.
Signed-off-by: Rudy Rigot <rudy.rigot@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't
leak, 2022-03-28) --filter options given to git pack-objects overrule
earlier ones, letting only the leftmost win and leaking the memory
allocated for earlier ones. Fix that by only initializing the rev_info
struct once.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git pack-objects should accept multiple --filter options as documented
in Documentation/rev-list-options.txt, but currently the last one wins.
Show that using tests with multiple blob size limits
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fb2d0db502 (test-lib-functions: add parsing helpers for ls-files and
ls-tree, 2022-04-04) not only started to use helper functions, it also
started to pipe the output of git ls-files into them directly, without
using a temporary file. No explanation was given. This causes the
return code of that git command to be ignored.
Revert that part of the change, use temporary files and check the return
code of git ls-files again.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When GIT_COMPLETION_IGNORE_CASE is set, also allow lowercase completion
text like "head" to match uppercase HEAD and other pseudorefs.
Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If GIT_COMPLETION_IGNORE_CASE is set, --ignore-case will be added to
git for-each-ref calls so that refs can be matched case insensitively,
even when running on case sensitive filesystems.
Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The format of a line in /proc/cpuinfo that describes a CPU on s390x
looked different from everybody else, and the code in chainlint.pl
failed to parse it.
* ah/chainlint-cpuinfo-parse-fix:
chainlint.pl: fix /proc/cpuinfo regexp
Resolve symbolic links when processing the locations of alternate
object stores, since failing to do so can lead to confusing and buggy
behavior.
* gc/resolve-alternate-symlinks:
object-file: use real paths when adding alternates
Add one more candidate directory that may house httpd modules while
running tests.
* es/locate-httpd-module-location-in-test:
lib-httpd: extend module location auto-detection
"git prune" may try to iterate over .git/objects/pack for trash
files to remove in it, and loudly fail when the directory is
missing, which is not necessary. The command has been taught to
ignore such a failure.
* ew/prune-with-missing-objects-pack:
prune: quiet ENOENT on missing directories
Assorted fixes of parsing end-user input as integers.
* pw/config-int-parse-fixes:
git_parse_signed(): avoid integer overflow
config: require at least one digit when parsing numbers
git_parse_unsigned: reject negative values
`parse_object()` hardening when checking for the existence of a
suspected blob object.
* jk/parse-object-type-mismatch:
parse_object(): simplify blob conditional
parse_object(): check on-disk type of suspected blob
parse_object(): drop extra "has" check before checking object type
The handling to die early when there is no EDITOR is valuable when
used in normal code (i.e., editor.c). In git-var, where
null/empty-string is a perfectly valid value to return, it doesn't
make as much sense.
Remove this handling from `git var GIT_EDITOR` so that it does not
fail so noisily when there is no defined editor.
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When adding an alternate ODB, we check if the alternate has the same
path as the object dir, and if so, we do nothing. However, that
comparison does not resolve symlinks. This makes it possible to add the
object dir as an alternate, which may result in bad behavior. For
example, it can trick "git repack -a -l -d" (possibly run by "git gc")
into thinking that all packs come from an alternate and delete all
objects.
rm -rf test &&
git clone https://github.com/git/git test &&
(
cd test &&
ln -s objects .git/alt-objects &&
# -c repack.updateserverinfo=false silences a warning about not
# being able to update "info/refs", it isn't needed to show the
# bad behavior
GIT_ALTERNATE_OBJECT_DIRECTORIES=".git/alt-objects" git \
-c repack.updateserverinfo=false repack -a -l -d &&
# It's broken!
git status
# Because there are no more objects!
ls .git/objects/pack
)
Fix this by resolving symlinks and relative paths before comparing the
alternate and object dir. This lets us clean up a number of issues noted
in 37a95862c6 (alternates: re-allow relative paths from environment,
2016-11-07):
- Now that we compare the real paths, duplicate detection is no longer
foiled by relative paths.
- Using strbuf_realpath() allows us to "normalize" paths that
strbuf_normalize_path() can't, so we can stop silently ignoring errors
when "normalizing" paths from the environment.
- We now store an absolute path based on getcwd() (the "future
direction" named in 37a95862c6), so chdir()-ing in the process no
longer changes the directory pointed to by the alternate. This is a
change in behavior, but a desirable one.
Signed-off-by: Glen Choo <chooglen@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 81071626ba (trace2: add global counter mechanism, 2022-10-24)
these tests have been failing when git is compiled with NO_PTHREADS=Y,
which is always the case e.g. if 'uname -s' is "NONSTOP_KERNEL".
Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git receive-pack" used to use all the local refs as the boundary for
checking connectivity of the data "git push" sent, but now it uses
only the refs that it advertised to the pusher. In a repository with
the .hideRefs configuration, this reduces the resources needed to
perform the check.
cf. <221028.86bkpw805n.gmgdl@evledraar.gmail.com>
cf. <xmqqr0yrizqm.fsf@gitster.g>
* ps/receive-use-only-advertised:
receive-pack: only use visible refs for connectivity check
rev-parse: add `--exclude-hidden=` option
revision: add new parameter to exclude hidden refs
revision: introduce struct to handle exclusions
revision: move together exclusion-related functions
refs: get rid of global list of hidden refs
refs: fix memory leak when parsing hideRefs config
Teach chainlint.pl to show corresponding line numbers when printing
the source of a test.
* es/chainlint-lineno:
chainlint: prefix annotated test definition with line numbers
chainlint: latch line numbers at which each token starts and ends
chainlint: sidestep impoverished macOS "terminfo"
Fix a source of flakiness in CI when compiling with SANITIZE=leak.
* ab/t7610-timeout:
t7610: use "file:///dev/null", not "/dev/null", fixes MinGW
t7610: fix flaky timeout issue, don't clone from example.com
'git maintenance register' is taught to write configuration to an
arbitrary path, and 'git for-each-repo' is taught to expand tilde
characters in paths.
* rp/maintenance-qol:
builtin/gc.c: fix use-after-free in maintenance_unregister()
maintenance --unregister: fix uninit'd data use & -Wdeclaration-after-statement
maintenance: add option to register in a specific config
for-each-repo: interpolate repo path arguments
Correct an error where `git rebase` would mistakenly use a branch or
tag named "refs/rewritten/xyz" when missing a rebase label.
* pw/strict-label-lookups:
sequencer: tighten label lookups
sequencer: unify label lookup
Redact headers from cURL's h2h3 module in GIT_CURL_VERBOSE and
others.
* gc/redact-h2h3-headers:
http: redact curl h2h3 headers in info
t: run t5551 tests with both HTTP and HTTP/2
Teach chainlint.pl to annotate the original test definition instead
of the token stream.
* es/chainlint-output:
chainlint: annotate original test definition rather than token stream
chainlint: latch start/end position of each token
chainlint: tighten accuracy when consuming input stream
chainlint: add explanatory comments
'scalar reconfigure -a' is taught to automatically remove
scalar.repo entires which no longer exist.
* js/remove-stale-scalar-repos:
tests(scalar): tighten the stale `scalar.repo` test some
scalar reconfigure -a: remove stale `scalar.repo` entries
Fix a regression in the bisect-helper which mistakenly treats
arguments to the command given to 'git bisect run' as arguments to
the helper.
* dd/bisect-helper-subcommand:
bisect--helper: parse subcommand with OPT_SUBCOMMAND
bisect--helper: move all subcommands into their own functions
bisect--helper: remove unused options
Preparation to remove git-submodule.sh and replace it with a builtin.
* ab/submodule-helper-prep-only:
submodule--helper: use OPT_SUBCOMMAND() API
submodule--helper: drop "update --prefix <pfx>" for "-C <pfx> update"
submodule--helper: remove --prefix from "absorbgitdirs"
submodule API & "absorbgitdirs": remove "----recursive" option
submodule.c: refactor recursive block out of absorb function
submodule tests: test for a "foreach" blind-spot
submodule--helper: fix a memory leak in "status"
submodule tests: add tests for top-level flag output
submodule--helper: move "config" to a test-tool
29fb2ec3 (chainlint.pl: validate test scripts in parallel,
2022-09-01) introduced a function that gets the number of cores from
/proc/cpuinfo on some systems, notably linux.
The regexp it uses (^processor\s*:) fails to match the desired lines in
the s390x architecture, where they look like this:
processor 0: version = FF, identification = 148F67, machine = 2964
As a result, on s390x that function returns 0 as the number of cores,
and the chainlint.pl script exits without doing anything.
Signed-off-by: Andreas Hasenack <andreas.hasenack@canonical.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although it is possible to manually set LIB_HTTPD_PATH and
LIB_HTTPD_MODULE_PATH to point at the location of `httpd` and its
modules, doing so is cumbersome and easily forgotten. To address this,
0d344738dc (t/lib-http.sh: Restructure finding of default httpd
location, 2010-01-02) enhanced lib-httpd.sh to automatically detect the
location of `httpd` and its modules in order to facilitate out-of-the-
box testing on a wider range of platforms. Follow that lead by further
enhancing it to automatically detect the `httpd` modules on Void Linux,
as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test case "push with config push.useBitmap" of t5516 was introduced
in commit 82f67ee13f (send-pack.c: add config push.useBitmaps,
2022-06-17). It won't work in verbose mode, e.g.:
$ sh t5516-fetch-push.sh --run='1,115' -v
This is because "git-push" will run in a tty in this case, and the
subcommand "git pack-objects" will contain an argument "--progress"
instead of "-q". Adding a specific option "--quiet" to "git push" will
get a stable result for t5516.
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
$GIT_DIR/objects/pack may be removed to save inodes in shared
repositories. Quiet down prune in cases where either
$GIT_DIR/objects or $GIT_DIR/objects/pack is non-existent,
but emit the system error in other cases to help users diagnose
permissions problems or resource constraints.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Free memory from parse_options_concat(), which comes from code
originally added (then extended) in [1].
At this point we could get several more tests leak-free by free()-ing
the xstrdup() just above the line being changed, but that one's
trickier than it seems. The sequencer_remove_state() function
supposedly owns it, but sometimes we don't call it. I have a fix for
it, but it's non-trivial, so let's fix the easy one first.
1. c62f6ec341 (revert: add --ff option to allow fast forward when
cherry-picking, 2010-03-06)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix a leak in the recent 6159e7add4 (rebase --abort: improve reflog
message, 2022-10-12). Before that commit we'd strbuf_release() the
reflog message we were formatting, but when that code was refactored
to use "ropts.head_msg" the strbuf_release() was omitted.
Ideally the three users of "ropts" in cmd_rebase() should use
different "ropts" variables, in practice they're completely separate,
as this and the other user in the "switch" statement will "goto
cleanup", which won't touch "ropts".
The third caller after the "switch" is then unreachable if we take
these two branches, so all of them are getting a "{ 0 }" init'd
"ropts".
So it's OK that we're leaving a stale pointer in "ropts.head_msg",
cleaning it up was our responsibility, and it won't be used again.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The "new_pack" we allocate in check_connected() wasn't being
free'd. Let's do that before we return from the function. This has
leaked ever since "new_pack" was added to this function in
c6807a40dc (clone: open a shortcut for connectivity check,
2013-05-26).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix a memory leak in overlay_tree_on_index(), we need to
clear_pathspec() at some point, which might as well be after the last
time we use it in the function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Call graph_clear() in release_revisions(), this will free memory
allocated by e.g. this command, which will now run without memory
leaks:
git -P log -1 --graph --no-graph --graph
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix a leak that's been with us since 3407bb4940 (Add "unpack-file"
helper that unpacks a sha1 blob into a tmpfile., 2005-04-18). See
00c8fd493a (cat-file: use streaming API to print blobs, 2012-03-07)
for prior art which shows the same API pattern, i.e. free()-ing the
result of read_object_file() after it's used.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix various leaks in built-ins, libraries and a test helper here we
were missing a call to strbuf_release(), string_list_clear() etc, or
were calling them after a potential "return".
Comments on individual changes:
- builtin/checkout.c: Fix a memory leak that was introduced in [1]. A
sibling leak introduced in [2] was recently fixed in [3]. As with [3]
we should be using the wt_status_state_free_buffers() API introduced
in [4].
- builtin/repack.c: Fix a leak that's been here since this use of
"strbuf_release()" was added in a1bbc6c017 (repack: rewrite the shell
script in C, 2013-09-15). We don't use the variable for anything
except this loop, so we can instead free it right afterwards.
- builtin/rev-parse: Fix a leak that's been here since this code was
added in 21d4783538 (Add a parseopt mode to git-rev-parse to bring
parse-options to shell scripts., 2007-11-04).
- builtin/stash.c: Fix a couple of leaks that have been here since
this code was added in d4788af875 (stash: convert create to builtin,
2019-02-25), we strbuf_release()'d only some of the "struct strbuf" we
allocated earlier in the function, let's release all of them.
- ref-filter.c: Fix a leak in 482c119186 (gpg-interface: improve
interface for parsing tags, 2021-02-11), we don't use the "payload"
variable that we ask parse_signature() to populate for us, so let's
free it.
- t/helper/test-fake-ssh.c: Fix a leak that's been here since this
code was added in 3064d5a38c (mingw: fix t5601-clone.sh,
2016-01-27). Let's free the "struct strbuf" as soon as we don't need
it anymore.
1. c45f0f525d (switch: reject if some operation is in progress,
2019-03-29)
2. 2708ce62d2 (branch: sort detached HEAD based on a flag,
2021-01-07)
3. abcac2e19f (ref-filter.c: fix a leak in get_head_description,
2022-09-25)
4. 962dd7ebc3 (wt-status: introduce wt_status_state_free_buffers(),
2020-09-27).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The "sparse_checkout_patterns" member was added to the "struct
index_state" in 836e25c51b (sparse-checkout: hold pattern list in
index, 2021-03-30), but wasn't added to discard_index(). Let's do
that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The read_cache() in prepare_to_commit() would end up clobbering the
pointer we had for a previously populated "the_index.cache_tree" in
the very common case of "git commit" stressed by e.g. the tests being
changed here.
We'd populate "the_index.cache_tree" by calling
"update_main_cache_tree" in prepare_index(), but would not end up with
a "fully prepared" index. What constitutes an existing index is
clearly overly fuzzy, here we'll check "active_nr" (aka
"the_index.cache_nr"), but our "the_index.cache_tree" might have been
malloc()'d already.
Thus the code added in 11c8a74a64 (commit: write cache-tree data when
writing index anyway, 2011-12-06) would end up allocating the
"cache_tree", and would interact here with code added in
7168624c35 (Do not generate full commit log message if it is not
going to be used, 2007-11-28). The result was a very common memory
leak.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
This marks tests that have been leak-free since various recent
commits, but which were not marked us such when the memory leak was
fixed. These were mostly discovered with the "check" mode added in
faececa53f (test-lib: have the "check" mode for SANITIZE=leak
consider leak logs, 2022-07-28).
Commits that fixed the last memory leak in these tests. Per narrowing
down when they started to pass under SANITIZE=leak with "bisect":
- t1022-read-tree-partial-clone.sh:
7e2619d8ff (list_objects_filter_options: plug leak of filter_spec
strings, 2022-09-08)
- t4053-diff-no-index.sh: 07a6f94a6d (diff-no-index: release prefixed
filenames, 2022-09-07)
- t6415-merge-dir-to-symlink.sh: bac92b1f39 (Merge branch
'js/ort-clean-up-after-failed-merge', 2022-08-08).
- t5554-noop-fetch-negotiator.sh:
66eede4a37 (prepare_repo_settings(): plug leak of config values,
2022-09-08)
- t2012-checkout-last.sh, t7504-commit-msg-hook.sh,
t91{15,46,60}-git-svn-*.sh: The in-flight "pw/rebase-no-reflog-action"
series, upon which this is based:
https://lore.kernel.org/git/pull.1405.git.1667575142.gitgitgadget@gmail.com/
Let's mark all of these as passing with
"TEST_PASSES_SANITIZE_LEAK=true", to have it regression tested,
including as part of the "linux-leaks" CI job.
Additionally, let's remove the "!SANITIZE_LEAK" prerequisite from
tests that now pass, these were marked as failing in:
- 77e56d55ba (diff.c: fix a double-free regression in a18d66cefb,
2022-03-17)
- c4d1d52631 (tests: change some 'test $(git) = "x"' to test_cmp,
2022-03-07)
These were not spotted with the new "check" mode, but manually, it
doesn't cover these sort of prerequisites. There's few enough that we
shouldn't bother to automate it. They'll be going away sooner than
later.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In a preceding commit we fully applied the
"index-compatibility.pending.cocci" rule to "t/helper/*".
Let's now stop defining "USE_THE_INDEX_COMPATIBILITY_MACROS" in
test-tool.h itself, and instead instead define
"USE_THE_INDEX_VARIABLE" in the individual test helpers that need
it. This mirrors how we do the same thing in the "builtin/" directory.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the "index-compatibility.pending.cocci" rule to the "t/helper/*"
directory, a subsequent commit will extend cache.h to further narrow
down the use of "USE_THE_INDEX_COMPATIBILITY_MACROS" in this area.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mostly apply the part of "index-compatibility.pending.cocci" that
renames the global variables like "active_nr", which are a shorthand
to referencing (in that case) a struct member as "the_index.cache_nr".
In doing so move more of "index-compatibility.pending.cocci" to
"index-compatibility.cocci".
In the case of "active_nr" we'd have a textual conflict with
"ab/various-leak-fixes" in "next"[1]. Let's exclude that specific case
while moving the rule over from "pending".
1. 407b94280f8 (commit: discard partial cache before (re-)reading it,
2022-11-08)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid calling 'cache_tree_update()' when doing so would be redundant.
* vd/skip-cache-tree-update:
rebase: use 'skip_cache_tree_update' option
read-tree: use 'skip_cache_tree_update' option
reset: use 'skip_cache_tree_update' option
unpack-trees: add 'skip_cache_tree_update' option
cache-tree: add perf test comparing update and prime
`git rebase --update-refs` would delete references when all `update-ref`
commands in the sequencer were removed, which has been corrected.
* vd/update-refs-delete:
rebase --update-refs: avoid unintended ref deletion
"git repack" learns to send cruft objects out of the way into
packfiles outside the repository.
* tb/repack-expire-to:
builtin/repack.c: implement `--expire-to` for storing pruned objects
builtin/repack.c: write cruft packs to arbitrary locations
builtin/repack.c: pass "cruft_expiration" to `write_cruft_pack`
builtin/repack.c: pass "out" to `prepare_pack_objects`
Makefile comments updates and reordering to clarify knobs used to
choose SHA implementations.
* ab/sha-makefile-doc:
Makefile: discuss SHAttered in *_SHA{1,256} discussion
Makefile: document default SHA-1 backend on OSX
Makefile & test-tool: replace "DC_SHA1" variable with a "define"
Makefile: document SHA-1 and SHA-256 default and selection order
Makefile: document default SHA-256 backend
Makefile: rephrase the discussion of *_SHA1 knobs
Makefile: create and use sections for "define" flag listing
Makefile: correct DC_SHA1 documentation
INSTALL: remove discussion of SHA-1 backends
Makefile: always (re)set DC_SHA1 on fallback
Various test updates.
* ab/misc-hook-submodule-run-command:
run-command tests: test stdout of run_command_parallel()
submodule tests: reset "trace.out" between "grep" invocations
hook tests: fix redirection logic error in 96e7225b31
In parse_object(), we try to handle blobs by streaming rather than
loading them entirely into memory. The most common case here will be
that we haven't seen the object yet and check oid_object_info(), which
tells us we have a blob.
But we trigger this code on one other case: when we have an in-memory
object struct with type OBJ_BLOB (and without its "parsed" flag set,
since otherwise we'd return early from the function). This indicates
that some other part of the code suspected we have a blob (e.g., it was
mentioned by a tree or tag) but we haven't yet looked at the on-disk
copy.
In this case before hitting the streaming path, we check if we have the
object on-disk at all. This is mostly pointless extra work, as the
streaming path would complain if it couldn't open the object (albeit
with the message "hash mismatch", which is a little misleading).
But it's also insufficient to catch all problems. The streaming code
will only tell us "yes, the on-disk object matches the oid". But it
doesn't actually confirm that what we found was indeed a blob, and
neither does repo_has_object_file().
One way to improve this would be to teach stream_object_signature() to
check the type (either by returning it to us to check, or taking an
"expected" type). But there's an even simpler fix here: if we suspect
the object is a blob, just call oid_object_info() to confirm that we
have it on-disk, and that it really is a blob.
This is slightly less efficient than teaching stream_object_signature()
to do it (since it has to open the object already). But this case very
rarely comes up. In practice, we usually don't have any clue what the
type is, in which case we already call oid_object_info(). This
"suspected" case happens only when some other code created an object
struct but didn't actually parse the blob, which is actually tricky to
trigger at all (see the discussion of the test below).
I reworked the conditional a bit so that instead of:
if ((suspected_blob && oid_object_info() == OBJ_BLOB)
(no_clue && oid_object_info() == OBJ_BLOB)
we have the simpler:
if ((suspected_blob || no_clue) && oid_object_info() == OBJ_BLOB)
This is shorter, but also reflects what we really want say, which is
"have we ruled out this being a blob; if not, check it on-disk".
In either case, if oid_object_info() fails to tell us it's a blob, we'll
skip the streaming code path and call repo_read_object_file(), just as
before. And if we really do have a mismatch with the existing object
struct, we'll eventually call lookup_commit(), etc, via
parse_object_buffer(), which will complain that it doesn't match our
existing obj->type.
So this fixes one of the lingering expect_failure cases from 0616617c7e
(t: introduce tests for unexpected object types, 2019-04-09). That test
works by peeling a tag that claims to point to a blob (triggering us to
create the struct), but really points to something else, which we later
discover when we call parse_object() as part of the actual traversal).
Prior to this commit, we'd quietly check the sha1 and mark the blob as
"parsed". Now we correctly complain about the mismatch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since 52d59cc645 (branch: add a --copy (-c) option to go with --move
(-m), 2017-06-18) we can copy a branch to make a new branch with the
'-c' (copy) option or to overwrite an existing branch using the '-C'
(force copy) option. A no-op possibility is considered when we are
asked to copy a branch to itself, to follow the same no-op introduced
for the rename (-M) operation in 3f59481e33 (branch: allow a no-op
"branch -M <current-branch> HEAD", 2011-11-25). To check for this, in
52d59cc645 we compared the branch names provided by the user, source
(HEAD if omitted) and destination, and a match is considered as this
no-op.
Since ae5a6c3684 (checkout: implement "@{-N}" shortcut name for N-th
last branch, 2009-01-17) a branch can be specified using shortcuts like
@{-1}. This allows this usage:
$ git checkout -b test
$ git checkout -
$ git branch -C test test # no-op
$ git branch -C test @{-1} # oops
$ git branch -C @{-1} test # oops
As we are using the branch name provided by the user to do the
comparison, if one of the branches is provided using a shortcut we are
not going to have a match and a call to git_config_copy_section() will
happen. This will make a duplicate of the configuration for that
branch, and with this progression the second call will produce four
copies of the configuration, and so on.
Let's use the interpreted branch name instead for this comparison.
The rename operation is not affected.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add a new `--exclude-hidden=` option that is similar to the one we just
added to git-rev-list(1). Given a section name `uploadpack` or `receive`
as argument, it causes us to exclude all references that would be hidden
by the respective `$section.hideRefs` configuration.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs`, but there is
not an easy way to obtain the list of all visible or hidden refs right
now. We'll require just that though for a performance improvement in our
connectivity check.
Add a new option `--exclude-hidden=` that excludes any hidden refs from
the next pseudo-ref like `--all` or `--branches`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
On MinGW the "/dev/null" is translated to "nul" on command-lines, even
though as in this case it'll never end up referring to an actual file.
So on Windows the fix for the previous "example.com" timeout issue in
8354cf752e (t7610: fix flaky timeout issue, don't clone from
example.com, 2022-11-05) would yield:
fatal: repo URL: 'nul' must be absolute or begin with ./|../
Let's evade this yet again by prefixing this with "file://", which
makes this pass in the Windows CI.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
maintenance register currently records the maintenance repo exclusively
within the user's global configuration, but other configuration files
may be relevant when running maintenance if they are included from the
global config. This option allows the user to choose where maintenance
repos are recorded.
Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
This is a quality of life change for git-maintenance, so repos can be
recorded with the tilde syntax. The register subcommand will not record
repos in this format by default.
Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
"Authorization" and "Cookie" get redacted. However, since [1], curl's
h2h3 module (invoked when using HTTP/2) also prints headers in its
"info", which don't get redacted. For example,
echo 'github.com TRUE / FALSE 1698960413304 o foo=bar' >cookiefile &&
GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
-c 'http.cookiefile=cookiefile' \
-c 'http.version=' \
ls-remote https://github.com/git/git refs/heads/main 2>output &&
grep 'cookie' output
produces output like:
23:04:16.920495 http.c:678 == Info: h2h3 [cookie: o=foo=bar]
23:04:16.920562 http.c:637 => Send header: cookie: o=<redacted>
Teach http.c to check for h2h3 headers in info and redact them using the
existing header redaction logic. This fixes the broken redaction logic
that we noted in the previous commit, so mark the redaction tests as
passing under HTTP2.
[1] f8c3724aa9
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We have occasionally seen bugs that affect Git running only against an
HTTP/2 web server, not an HTTP one. For instance, b66c77a64e (http:
match headers case-insensitively when redacting, 2021-09-22). But since
we have no test coverage using HTTP/2, we only uncover these bugs in the
wild.
That commit gives a recipe for converting our Apache setup to support
HTTP/2, but:
- it's not necessarily portable
- we don't want to just test HTTP/2; we really want to do a variety of
basic tests for _both_ protocols
This patch handles both problems by running a duplicate of t5551
(labeled as t5559 here) with an alternate-universe setup that enables
HTTP/2. So we'll continue to run t5551 as before, but run the same
battery of tests again with HTTP/2. If HTTP/2 isn't supported on a given
platform, then t5559 should bail during the webserver setup, and
gracefully skip all tests (unless GIT_TEST_HTTPD has been changed from
"auto" to "yes", where the point is to complain when webserver setup
fails).
In theory other http-related test scripts could benefit from the same
duplication, but doing t5551 should give us a reasonable check of basic
functionality, and would have caught both bugs we've seen in the wild
with HTTP/2.
A few notes on the implementation:
- a script enables the server side config by calling enable_http2
before starting the webserver. This avoids even trying to load any
HTTP/2 config for t5551 (which is what lets it keep working with
regular HTTP even on systems that don't support it). This also sets
a prereq which can be used by individual tests.
- As discussed in b66c77a64e, the http2 module isn't compatible with
the "prefork" mpm, so we need to pick something else. I chose
"event" here, which works on my Debian system, but it's possible
there are platforms which would prefer something else. We can adjust
that later if somebody finds such a platform.
- The test "large fetch-pack requests can be sent using chunked
encoding" makes sure we use a chunked transfer-encoding by looking
for that header in the trace. But since HTTP/2 has its own streaming
mechanisms, we won't find such a header. We could skip the test
entirely by marking it with !HTTP2. But there's some value in making
sure that the fetch itself succeeded. So instead, we'll confirm that
either we're using HTTP2 _or_ we saw the expected chunked header.
- the redaction tests fail under HTTP/2 with recent versions of curl.
This is a bug! I've marked them with !HTTP2 here to skip them under
t5559 for the moment. Using test_expect_failure would be more
appropriate, but would require a bunch of boilerplate. Since we'll
be fixing them momentarily, let's just skip them for now to keep the
test suite bisectable, and we can re-enable them in the commit that
fixes the bug.
- one alternative layout would be to push most of t5551 into a
lib-t5551.sh script, then source it from both t5551 and t5559.
Keeping t5551 intact seemed a little simpler, as its one less level
of indirection for people fixing bugs/regressions in the non-HTTP/2
tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In "open_midx_bitmap_1()" and "open_pack_bitmap_1()", when we find that
there are multiple bitmaps, we will only open the first one and then
leave warnings about the remaining pack information, the information
will contain the absolute path of the repository, for example in a
alternates usage scenario. So let's hide this kind of potentially
sensitive information in this commit.
Found-by: XingXin <moweng.xx@antgroup.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Git learned pushing submodules without pushing the superproject by
the user specifying --recurse-submodules=only through 6c656c3fe4
("submodules: add RECURSE_SUBMODULES_ONLY value", 2016-12-20) and
225e8bf778 ("push: add option to push only submodules", 2016-12-20).
For users who use this feature regularly, it is desirable to have an
equivalent configuration.
It turns out that such a configuration (push.recurseSubmodules=only) is
already supported, even though it is neither documented nor mentioned
in the commit messages, due to the way the --recurse-submodules=only
feature was implemented (a function used to parse --recurse-submodules
was updated to support "only", but that same function is used to parse
push.recurseSubmodules too). What is left is to document it and test it,
which is what this commit does.
There is a possible point of confusion when recursing into a submodule
that itself has the push.recurseSubmodules=only configuration, because
if a repository has only its submodules pushed and not itself, its
superproject can never be pushed. Therefore, treat such configurations
as being "on-demand", and print a warning message.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The previous commit added a `--merge-base` option in order to allow
using a specified merge-base for the merge. Extend the input accepted
by `--stdin` to also allow a specified merge-base with each merge
requested. For example:
printf "<b3> -- <b1> <b2>" | git merge-tree --stdin
does a merge of b1 and b2, and uses b3 as the merge-base.
Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
This patch will give our callers more flexibility to use `git merge-tree`,
such as:
git merge-tree --write-tree --merge-base=branch^ HEAD branch
This does a merge of HEAD and branch, but uses branch^ as the merge-base.
And the reason why using an option flag instead of a positional argument
is to allow additional commits passed to merge-tree to be handled via an
octopus merge in the future.
Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
As pointed out by Stolee, the previous incarnation of this test case was
not stringent enough: we want to verify that _only_ the stale entries
are removed (previously, the test case would have succeeded even if all
entries had been removed).
Let's rectify this and verify that the other entries are left intact.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Address a test blindspot, the "log" command is the odd one out because
"git-bisect.sh" ignores any arguments it receives. Let's test both the
exit codes we expect, and the stderr and stdout we're emitting.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Some system never reports negative exit code at all, they reports them
as bigger-than-128 instead. We take extra care for those systems in the
later check for normal 'do_bisect_run' loop.
Let's check it here, too.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Preceding commits fixed output and behavior regressions in
d1bbbe45df (bisect--helper: reimplement `bisect_run` shell function
in C, 2021-09-13), which did not claim to be changing the output of
"git bisect run".
But some of the output it emitted was subjectively better, so once
we've asserted that we're back on v2.29.0 behavior, let's change some
of it back:
- We now quote the arguments again, but omit the first " " when
printing the "running" line.
- Ditto for other cases where we emitted the argument
- We say "found first bad commit" again, not just "run success"
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Based-on-patch-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13) reimplemented parts of "git bisect run" in
C it changed the output we emitted so that:
- The "running ..." line was now quoted
- We lost the \n after our output
- We started saying "bisect found ..." instead of "bisect run success"
Arguably some of this is better now, but as d1bbbe45df did not
advocate for changing the output, let's revert this for now. It'll be
easy to change it back if that's what we'd prefer.
This does not change the one remaining use of "command.buf" to emit
the quoted argument, as that's new in d1bbbe45df.
Some of these cases were not tested for in the tests added in the
preceding commit, I didn't have time to fleshen those out, but a look
at f1de981e8b will show that the other output being adjusted here is
now equivalent to what it was before d1bbbe45df.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add three failing tests which succeed on v2.29.0, but due to the topic
merged at [1] (specifically [2]) have been failing since then. We'll
address those regressions in subsequent commits.
There was also a "regression" where:
git bisect run ./missing-script.sh
Would count a non-existing script as "good", as the shell would exit
with 127. That edge case is a bit too insane to preserve, so let's not
add it to these regression tests.
There was another regression that 'git bisect' consumed some options
that was meant to passed down to program run with 'git bisect run'.
Since that regression is breaking user's expectation, it has been fixed
earlier without this patch queued.
1. 0a4cb1f1f2 (Merge branch 'mr/bisect-in-c-4', 2021-09-23)
2. d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
* dd/bisect-helper-subcommand:
bisect--helper: parse subcommand with OPT_SUBCOMMAND
bisect--helper: move all subcommands into their own functions
bisect--helper: remove unused options
As of it is, we're parsing subcommand with OPT_CMDMODE, which will
continue to parse more options even if the command has been found.
When we're running "git bisect run" with a command that expecting
a "--log" or "--no-log" arguments, or one of those "--bisect-..."
arguments, bisect--helper may mistakenly think those options are
bisect--helper's option.
We may fix those problems by passing "--" when calling from
git-bisect.sh, and skip that "--" in bisect--helper. However, it may
interfere with user's "--".
Let's parse subcommand with OPT_SUBCOMMAND since that API was born for
this specific use-case.
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.
To further assist the test author, display line numbers along with the
annotated test definition, thus allowing the author to jump directly to
each problematic line.
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.
To further assist the test author, an upcoming change will display line
numbers along with the annotated test definition, thus allowing the
author to jump directly to each problematic line. As preparation,
upgrade Lexer to latch the line numbers at which each token starts and
ends, and return that information with the token itself.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Although the macOS Terminal.app is "xterm"-compatible, its corresponding
"terminfo" entries -- such as "xterm", "xterm-256color", and
"xterm-new"[1] -- neglect to mention capabilities which Terminal.app
actually supports (such as "dim text"). This oversight on Apple's part
ends up penalizing users of "good citizen" console programs which
consult "terminfo" to tailor their output based upon reported terminal
capabilities (as opposed to programs which assume that the terminal
supports ANSI codes). The same problem is present in other Apple
"terminfo" entries, such as "nsterm"[2], with which macOS Terminal.app
may be configured.
Sidestep this Apple problem by imbuing get_colors() with specific
knowledge of capabilities common to "xterm" and "nsterm", rather than
trusting "terminfo" to report them correctly. Although hard-coding such
knowledge is ugly, "xterm" support is nearly ubiquitous these days, and
Git itself sets precedence by assuming support for ANSI color codes. For
other terminal types, fall back to querying "terminfo" via `tput` as
usual.
FOOTNOTES
[1] iTerm2 FAQ suggests "xterm-new": https://iterm2.com/faq.html
[2] Neovim documentation recommends terminal type "nsterm" with
Terminal.app: https://neovim.io/doc/user/term.html#terminfo
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The `label` command creates a ref refs/rewritten/<label> that the
`reset` and `merge` commands resolve by calling lookup_label(). That
uses lookup_commit_reference_by_name() to look up the label ref. As
lookup_commit_reference_by_name() uses the dwim rules when looking up
the label it will look for a branch named
refs/heads/refs/rewritten/<label> and return that instead of an error if
the branch exists and the label does not. Fix this by using read_ref()
followed by lookup_commit_object() when looking up labels.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The arguments to the `reset` and `merge` commands may be a label created
with a `label` command or an arbitrary commit name. The `merge` command
uses the lookup_label() function to lookup its arguments but `reset` has
a slightly different version of that function in do_reset(). Reduce this
code duplication by calling lookup_label() from do_reset() as well.
This change improves the behavior of `reset` when the argument is a
tree. Previously `reset` would accept a tree only for the rebase to
fail with
update_ref failed for ref 'HEAD': cannot update ref 'HEAD': trying to write non-commit object da5497437fd67ca928333aab79c4b4b55036ea66 to branch 'HEAD'
Using lookup_label() means do_reset() will now error out straight away
if its argument is not a commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When running 'read-tree' with a single tree and no prefix,
'prime_cache_tree()' is called after the tree is unpacked. In that
situation, skip a redundant call to 'cache_tree_update()' in
'unpack_trees()' by enabling the 'skip_cache_tree_update' unpack option.
Removing the redundant cache tree update provides a substantial performance
improvement to 'git read-tree <tree-ish>', as shown by a test added to
'p0006-read-tree-checkout.sh':
Test before after
----------------------------------------------------------------------
read-tree br_ballast_plus_1 3.94(1.80+1.57) 3.00(1.14+1.28) -23.9%
Note that the 'read-tree' in 't1022-read-tree-partial-clone.sh' is updated
to read two trees, rather than one. The test was first introduced in
d3da223f22 (cache-tree: prefetch in partial clone read-tree, 2021-07-23) to
exercise the 'cache_tree_update()' code path, as used in 'git merge'. Since
this patch drops the call to 'cache_tree_update()' in single-tree 'git
read-tree', change the test to use the two-tree variant so that
'cache_tree_update()' is called as intended.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Enable the 'skip_cache_tree_update' option in the variants that call
'prime_cache_tree()' after 'unpack_trees()' (specifically, 'git reset
--mixed' and 'git reset --hard'). This avoids redundantly rebuilding the
cache tree in both 'cache_tree_update()' at the end of 'unpack_trees()' and
in 'prime_cache_tree()', resulting in a small (but consistent) performance
improvement. From the newly-added 'p7102-reset.sh' test:
Test before after
--------------------------------------------------------------------
7102.1: reset --hard (...) 2.11(0.40+1.54) 1.97(0.38+1.47) -6.6%
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add a performance test comparing the execution times of 'prime_cache_tree()'
and 'cache_tree_update(_, WRITE_TREE_SILENT | WRITE_TREE_REPAIR)'. The goal
of comparing these two is to identify which is the faster method for
rebuilding an invalid cache tree, ultimately to remove one when both are
(reundantly) called in immediate succession.
Both methods are fast, so the new tests in 'p0090-cache-tree.sh' must call
each tested function multiple times to ensure the reported times (to 0.01s
resolution) convey the differences between them.
The tests compare the timing of a 'test-tool cache-tree' run as a no-op (to
capture a baseline for the overhead associated with running the tool),
'cache_tree_update()', and 'prime_cache_tree()' on four scenarios:
- A completely valid cache tree
- A cache tree with 2 invalid paths
- A cache tree with 50 invalid paths
- A completely empty cache tree
Example results:
Test this tree
-----------------------------------------------------------
0090.2: no-op, clean 1.27(0.48+0.52)
0090.3: prime_cache_tree, clean 2.02(0.83+0.85)
0090.4: cache_tree_update, clean 1.30(0.49+0.54)
0090.5: no-op, invalidate 2 1.29(0.48+0.54)
0090.6: prime_cache_tree, invalidate 2 1.98(0.81+0.83)
0090.7: cache_tree_update, invalidate 2 2.12(0.94+0.86)
0090.8: no-op, invalidate 50 1.32(0.50+0.55)
0090.9: prime_cache_tree, invalidate 50 2.10(0.86+0.89)
0090.10: cache_tree_update, invalidate 50 2.35(1.14+0.90)
0090.11: no-op, empty 1.33(0.50+0.54)
0090.12: prime_cache_tree, empty 2.04(0.84+0.87)
0090.13: cache_tree_update, empty 2.51(1.27+0.92)
These timings show that, while 'cache_tree_update()' is faster when the
cache tree is completely valid, it is equal to or slower than
'prime_cache_tree()' when there are any invalid paths. Since the redundant
calls are mostly in scenarios where the cache tree will be at least
partially invalid (e.g., 'git reset --hard'), 'prime_cache_tree()' will
likely perform better than 'cache_tree_update()' in typical cases.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When deleting a branch, "git branch -d" has a safety check that ensures
the branch is merged to its upstream (if any), or to HEAD. To do that,
naturally we try to resolve HEAD to a commit object. If we're on an
orphan branch (i.e., HEAD points to a branch that does not yet exist),
that will fail, and we'll bail with an error:
$ git branch -d to-delete
fatal: Couldn't look up commit object for HEAD
This usually isn't that big of a deal. The deletion would fail anyway,
since the branch isn't merged to HEAD, and you'd need to use "-D" (or
"-f"). And doing so skips the HEAD resolution, courtesy of 67affd5173
(git-branch -D: make it work even when on a yet-to-be-born branch,
2006-11-24).
But there are still two problems:
1. The error message isn't very helpful. We should give the usual "not
fully merged" message, which points the user at "branch -D". That
was a problem even back in 67affd5173.
2. Even without a HEAD, these days it's still possible for the
deletion to succeed. After 67affd5173, commit 99c419c915 (branch
-d: base the "already-merged" safety on the branch it merges with,
2009-12-29) made it OK to delete a branch if it is merged to its
upstream.
We can fix both by removing the die() in delete_branches() completely,
leaving head_rev NULL in this case. It's tempting to stop there, as it
appears at first glance that the rest of the code does the right thing
with a NULL. But sadly, it's not quite true.
We end up feeding the NULL to repo_is_descendant_of(). In the
traditional code path there, we call repo_in_merge_bases_many(). It
feeds the NULL to repo_parse_commit(), which is smart enough to return
an error, and we immediately return "no, it's not a descendant".
But there's an alternate code path: if we have a commit graph with
generation numbers, we end up in can_all_from_reach(), which does
eventually try to set a flag on the NULL commit and segfaults.
So instead, we'll teach the local branch_merged() helper to treat a NULL
as "not merged". This would be a little more elegant in in_merge_bases()
itself, but that function is called in a lot of places, and it's not
clear that quietly returning "not merged" is the right thing everywhere
(I'd expect in many cases, feeding a NULL is a sign of a bug).
There are four tests here:
a. The first one confirms that deletion succeeds with an orphaned HEAD
when the branch is merged to its upstream. This is case (2) above.
b. Same, but with commit graphs enabled. Even if it is merged to
upstream, we still check head_rev so that we can say "deleting
because it's merged to upstream, even though it's not merged to
HEAD". Without the second hunk in branch_merged(), this test would
segfault in can_all_from_reach().
c. The third one confirms that we correctly say "not merged to HEAD"
when we can't resolve HEAD, and reject the deletion.
d. Same, but with commit graphs enabled. Without the first hunk in
branch_merged(), this one would segfault.
Reported-by: Martin von Zweigbergk <martinvonz@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
If the input to strtoimax() or strtoumax() does not contain any digits
then they return zero and set `end` to point to the start of the input
string. git_parse_[un]signed() do not check `end` and so fail to return
an error and instead return a value of zero if the input string is a
valid units factor without any digits (e.g "k").
Tests are added to check that 'git config --int' and OPT_MAGNITUDE()
reject a units specifier without a leading digit.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
git_parse_unsigned() relies on strtoumax() which unfortunately parses
negative values as large positive integers. Fix this by rejecting any
string that contains '-' as we do in strtoul_ui(). I've chosen to treat
negative numbers as invalid input and set errno to EINVAL rather than
ERANGE one the basis that they are never acceptable if we're looking for
a unsigned integer. This is also consistent with the existing behavior
of rejecting "1–2" with EINVAL.
As we do not have unit tests for this function it is tested indirectly
by checking that negative values of reject for core.bigFileThreshold are
rejected. As this function is also used by OPT_MAGNITUDE() a test is
added to check that rejects negative values too.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When t7610-mergetool.sh runs without failures the git://example.com
submodule URLs will never be used. That's because we "git submodule
add" it, but then manually populate them so that subsequent "git
submodule update -N" won't attempt to clone it, only update it without
fetching.
But if we fail in an earlier test it'll have the knock-on effect of
having later tests hang on that "git submodule update -N" as we
attempt to clone this repository from example.com.
This can be reproduced on "master" by running the test with
SANITIZE=leak without "--immediate". With
"GIT_TEST_PASSING_SANITIZE_LEAK=true" (which the linux-leaks job uses)
we'll skip the test entirely. So we'll only run into this when running
it manually, or with the "GIT_TEST_PASSING_SANITIZE_LEAK=check" mode.
That's not because the failure has anything to do with leak detection
per-se. It just so happens that we have a leak that'll fail before
we've managed to fully set these up, and therefore "git submodule
update -N" ends up spawning "git clone".
Let's instead continue lying about the origin of this submodule by
providing a URL for it that doesn't work, but now one that *really*
doesn't work: /dev/null. If the test is passing we won't ever use
this, and if we have knock-on failures we'll fail early, instead of
waiting for a timeout.
The behavior of "-N" here might be surprising to some, since it's
explained as "[if you use -N we] don’t fetch new objects from the
remote site". But (perhaps counter-intuitively) it's only talking
about if it needs to do so via "git fetch". In this case we'll end up
spawning a "git clone", as we have no submodule set up.
See ff7f089ed1 (mergetool: Teach about submodules, 2011-04-13) for
the commit that implemented these "example.com" tests.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
* es/chainlint-output:
chainlint: annotate original test definition rather than token stream
chainlint: latch start/end position of each token
chainlint: tighten accuracy when consuming input stream
chainlint: add explanatory comments
Simplify the run-command API.
* rs/no-more-run-command-v:
replace and remove run_command_v_opt()
replace and remove run_command_v_opt_cd_env_tr2()
replace and remove run_command_v_opt_tr2()
replace and remove run_command_v_opt_cd_env()
use child_process members "args" and "env" directly
use child_process member "args" instead of string array variable
sequencer: simplify building argument list in do_exec()
bisect--helper: factor out do_bisect_run()
bisect: simplify building "checkout" argument list
am: simplify building "show" argument list
run-command: fix return value comment
merge: remove always-the-same "verbose" arguments
"git archive" mistakenly complained twice about a missing executable,
which has been corrected.
* rs/archive-filter-error-once:
archive-tar: report filter start error only once
A redundant diagnostic message is dropped from test_path_is_missing().
* ma/drop-redundant-diagnostic:
test-lib-functions: drop redundant diagnostic print
Various tests exercising the transfer.credentialsInUrl configuration
are taught to avoid making requests which require resolving localhost
to reduce CI-flakiness.
* jk/ref-filter-parsing-bugs:
ref-filter: fix parsing of signatures with CRLF and no body
ref-filter: fix parsing of signatures without blank lines
Enable gc.cruftpacks by default for those who opt into
feature.experimental setting.
* es/mark-gc-cruft-as-experimental:
config: let feature.experimental imply gc.cruftPacks=true
gc: add tests for --cruft and friends
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.
However, now that each parsed token carries positional information, the
location of a detected problem can be pinpointed precisely in the
original test definition. Therefore, take advantage of this information
to annotate the test definition itself rather than annotating the parsed
token stream, thus making it easier for a test author to relate a
problem back to the source.
Maintaining the positional meta-information associated with each
detected problem requires a slight change in how the problems are
managed internally. In particular, shell syntax such as:
msg="total: $(cd data; wc -w *.txt) words"
requires the lexical analyzer to recursively invoke the parser in order
to detect problems within the $(...) expression inside the double-quoted
string. In this case, the recursive parse context will detect the broken
&&-chain between the `cd` and `wc` commands, returning the token stream:
cd data ; ?!AMP?! wc -w *.txt
However, the parent parse context will see everything inside the
double-quotes as a single string token:
"total: $(cd data ; ?!AMP?! wc -w *.txt) words"
losing whatever positional information was attached to the ";" token
where the problem was detected.
One way to preserve the positional information of a detected problem in
a recursive parse context within a string would be to attach the
positional information to the annotation textually; for instance:
"total: $(cd data ; ?!AMP:21:22?! wc -w *.txt) words"
and then extract the positional information when annotating the original
test definition.
However, a cleaner and much simpler approach is to maintain the list of
detected problems separately rather than embedding the problems as
annotations directly in the parsed token stream. Not only does this
ensure that positional information within recursive parse contexts is
not lost, but it keeps the token stream free from non-token pollution,
which may simplify implementation of validations added in the future
since they won't have to handle non-token "?!FOO!?" items specially.
Finally, the chainlint self-test "expect" files need a few mechanical
adjustments now that the original test definitions are emitted rather
than the parsed token stream. In particular, the following items missing
from the historic parsed-token output are now preserved verbatim:
* indentation (and whitespace, in general)
* comments
* here-doc bodies
* here-doc tag quoting (i.e. "\EOF")
* line-splices (i.e. "\" at the end of a line)
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.
To address this shortcoming, an upcoming change will make it print out
an annotated copy of the original test definition rather than the
tokenized representation. In order to do so, it will need to know the
start and end positions of each token in the original test definition.
As preparation, upgrade TestParser::scan_token() to latch the start and
end position of the token being scanned, and return that information
along with the token itself. A subsequent change will take advantage of
this positional information.
In terms of implementation, TestParser::scan_token() is retrofitted to
return a tuple consisting of the token's lexeme and its start and end
positions, rather than returning just the lexeme. However, an
alternative would be to define a class which represents a token:
package Token;
sub new {
my ($class, $lexeme, $start, $end) = @_;
bless [$lexeme, $start, $end] => $class;
}
sub as_string {
my $self = shift @_;
return $self->[0];
}
sub compare {
my ($x, $y) = @_;
if (UNIVERSAL::isa($y, 'Token')) {
return $x->[0] cmp $y->[0];
}
return $x->[0] cmp $y;
}
use overload (
'""' => 'as_string',
'cmp' => 'compare'
);
The major benefit of the class-based approach is that it is entirely
non-invasive; it requires no additional changes to the rest of the
script since a Token converts automatically to a string, which is what
scan_token() historically returned.
The big downside to the Token approach, however, is that it is _slow_;
on this developer's (old) machine, it increases user-time by an
unacceptable seven seconds when scanning all test scripts in the
project. Hence, the simple tuple approach is employed instead since it
adds only a fraction of a second user-time.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To extract the next token in the input stream, Lexer::scan_token() finds
the start of the token by skipping whitespace, then consumes characters
belonging to the token until it encounters a non-token character, such
as an operator, punctuation, or whitespace. In the case of an operator
or punctuation which ends a token, before returning the just-scanned
token, it pushes that operator or punctuation character back onto the
input stream to ensure that it will be the first character consumed by
the next call to scan_token().
However, scan_token() is intentionally lax when whitespace ends a token;
it doesn't bother pushing the whitespace character back onto the token
stream since it knows that the next call to scan_token() will, as its
first step, skip over whitespace anyhow when looking for the start of
the token.
Although such laxity is harmless for the proper functioning of the
lexical analyzer, it does make it difficult to precisely identify the
token's end position in the input stream. Accurate token position
information may be desirable, for instance, to annotate problems or
highlight other interesting facets of the input found during the parsing
phase. To accommodate such possibilities, tighten scan_token() by making
it push the token-ending whitespace character back onto the input
stream, just as it does for other token-ending characters.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The logic in TestParser::accumulate() for detecting broken &&-chains is
mostly well-commented, but a couple branches which were deemed obvious
and straightforward lack comments. In retrospect, though, these cases
may give future readers pause, so comment them, as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>