In Git 2.36 we revamped the way how hooks are invoked. One change
that is end-user visible is that the output of a hook is no longer
directly connected to the standard output of "git" that spawns the
hook, which was noticed post release. This is getting corrected.
* ab/hooks-regression-fix:
hook API: fix v2.36.0 regression: hooks should be connected to a TTY
run-command: add an "ungroup" option to run_process_parallel()
"git -c diff.submodule=log range-diff" did not show anything for
submodules that changed in the ranges being compared, and
"git -c diff.submodule=diff range-diff" did not work correctly.
Fix this by including the "--submodule=short" output
unconditionally to be compared.
* pb/range-diff-with-submodule:
range-diff: show submodule changes irrespective of diff.submodule
A new bug() and BUG_if_bug() API is introduced to make it easier to
uniformly log "detect multiple bugs and abort in the end" pattern.
* ab/bug-if-bug:
cache-tree.c: use bug() and BUG_if_bug()
receive-pack: use bug() and BUG_if_bug()
parse-options.c: use optbug() instead of BUG() "opts" check
parse-options.c: use new bug() API for optbug()
usage.c: add a non-fatal bug() function to go with BUG()
common-main.c: move non-trace2 exit() behavior out of trace2.c
More fsmonitor--daemon.
* jh/builtin-fsmonitor-part3: (30 commits)
t7527: improve implicit shutdown testing in fsmonitor--daemon
fsmonitor--daemon: allow --super-prefix argument
t7527: test Unicode NFC/NFD handling on MacOS
t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd
t/helper/hexdump: add helper to print hexdump of stdin
fsmonitor: on macOS also emit NFC spelling for NFD pathname
t7527: test FSMonitor on case insensitive+preserving file system
fsmonitor: never set CE_FSMONITOR_VALID on submodules
t/perf/p7527: add perf test for builtin FSMonitor
t7527: FSMonitor tests for directory moves
fsmonitor: optimize processing of directory events
fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed
fsm-health-win32: force shutdown daemon if worktree root moves
fsm-health-win32: add polling framework to monitor daemon health
fsmonitor--daemon: stub in health thread
fsmonitor--daemon: rename listener thread related variables
fsmonitor--daemon: prepare for adding health thread
fsmonitor--daemon: cd out of worktree root
fsm-listen-darwin: ignore FSEvents caused by xattr changes on macOS
unpack-trees: initialize fsmonitor_has_run_once in o->result
...
A misconfigured 'branch..remote' led to a bug in configuration
parsing.
* gc/zero-length-branch-config-fix:
remote.c: reject 0-length branch names
remote.c: don't BUG() on 0-length branch names
Rename .env_array member to .env in the child_process structure.
* ab/env-array:
run-command API users: use "env" not "env_array" in comments & names
run-command API: rename "env_array" to "env"
With a more targetted workaround in http.c in another topic, we may
be able to lift this blanket "GCC12 dangling-pointer warning is
broken and unsalvageable" workaround.
* cb/buggy-gcc-12-workaround:
Revert -Wno-error=dangling-pointer
"git clone --origin X" leaked piece of memory that held value read
from the clone.defaultRemoteName configuration variable, which has
been plugged.
source: <xmqqlevl4ysk.fsf@gitster.g>
* jc/clone-remote-name-leak-fix:
clone: plug a miniscule leak
The path taken by "git multi-pack-index" command from the end user
was compared with path internally prepared by the tool withut first
normalizing, which lead to duplicated paths not being noticed,
which has been corrected.
source: <pull.1221.v2.git.1650911234.gitgitgadget@gmail.com>
* ds/midx-normalize-pathname-before-comparison:
cache: use const char * for get_object_directory()
multi-pack-index: use --object-dir real path
midx: use real paths in lookup_multi_pack_index()
"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
source: <20220421044233.894255-1-alexhenrie24@gmail.com>
* ah/rebase-keep-base-fix:
rebase: use correct base for --keep-base when a branch is given
Avoid problems from interaction between malloc_check and address
sanitizer.
source: <pull.1210.git.1649507317350.gitgitgadget@gmail.com>
* pw/test-malloc-with-sanitize-address:
tests: make SANITIZE=address imply TEST_NO_MALLOC_CHECK
The commit summary shown after making a commit is matched to what
is given in "git status" not to use the break-rewrite heuristics.
source: <c35bd0aa-2e46-e710-2b39-89f18bad0097@web.de>
* rs/commit-summary-wo-break-rewrite:
commit, sequencer: turn off break_opt for commit summary
macOS CI jobs have been occasionally flaky due to tentative version
skew between perforce and the homebrew packager. Instead of
failing the whole CI job, just let it skip the p4 tests when this
happens.
source: <20220512223940.238367-1-gitster@pobox.com>
* cb/ci-make-p4-optional:
ci: use https, not http to download binaries from perforce.com
ci: reintroduce prevention from perforce being quarantined in macOS
ci: avoid brew for installing perforce
ci: make failure to find perforce more user friendly
A bit of test framework fixes with a few fixes to issues found by
valgrind.
source: <20220512223218.237544-1-gitster@pobox.com>
* ab/valgrind-fixes:
commit-graph.c: don't assume that stat() succeeds
object-file: fix a unpack_loose_header() regression in 3b6a8db3b0
log test: skip a failing mkstemp() test under valgrind
tests: using custom GIT_EXEC_PATH breaks --valgrind tests
"git archive --add-file=<path>" picked up the raw permission bits
from the path and propagated to zip output in some cases, without
normalization, which has been corrected (tar output did not have
this issue).
source: <xmqqmtfme8v6.fsf@gitster.g>
* jc/archive-add-file-normalize-mode:
archive: do not let on-disk mode leak to zip archives
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
source: <xmqqh76mf7s4.fsf_-_@gitster.g>
* jc/show-branch-g-current:
show-branch: -g and --current are incompatible
Meant to go with js/ci-gcc-12-fixes.
source: <xmqq7d68ytj8.fsf_-_@gitster.g>
* jc/http-clear-finished-pointer:
http.c: clear the 'finished' member once we are done with it
Fixes real problems noticed by gcc 12 and works around false
positives.
source: <pull.1238.git.1653351786.gitgitgadget@gmail.com>
* js/ci-gcc-12-fixes:
dir.c: avoid "exceeds maximum object size" error with GCC v12.x
nedmalloc: avoid new compile error
compat/win32/syslog: fix use-after-realloc
A git subcommand like "git add -p" spawns a separate git process
while relaying its command line arguments. A pathspec with only
negative elements was mistakenly passed with an empty string, which
has been corrected.
* jc/all-negative-pathspec:
pathspec: correct an empty string used as a pathspec element
Implementation of "scalar diagnose" subcommand.
* js/scalar-diagnose:
scalar: teach `diagnose` to gather loose objects information
scalar: teach `diagnose` to gather packfile info
scalar diagnose: include disk space information
scalar: implement `scalar diagnose`
scalar: validate the optional enlistment argument
archive --add-virtual-file: allow paths containing colons
archive: optionally add "virtual" files
The documentation on the interaction between "--add-file" and
"--prefix" options of "git archive" has been improved.
* rs/document-archive-prefix:
archive: improve documentation of --prefix
Leakfix.
* fh/transport-push-leakfix:
transport: free local and remote refs in transport_push()
transport: unify return values and exit point from transport_push()
transport: remove unnecessary indenting in transport_push()
Update the GitHub workflow support to make it quicker to get to the
failing test.
* js/ci-github-workflow-markup:
ci: call `finalize_test_case_output` a little later
ci(github): mention where the full logs can be found
ci: use `--github-workflow-markup` in the GitHub workflow
ci(github): avoid printing test case preamble twice
ci(github): skip the logs of the successful test cases
ci: optionally mark up output in the GitHub workflow
ci/run-build-and-tests: add some structure to the GitHub workflow output
ci: make it easier to find failed tests' logs in the GitHub workflow
ci/run-build-and-tests: take a more high-level view
test(junit): avoid line feeds in XML attributes
tests: refactor --write-junit-xml code
ci: fix code style
Plug the memory leaks from the trickiest API of all, the revision
walker.
* ab/plug-leak-in-revisions: (27 commits)
revisions API: add a TODO for diff_free(&revs->diffopt)
revisions API: have release_revisions() release "topo_walk_info"
revisions API: have release_revisions() release "date_mode"
revisions API: call diff_free(&revs->pruning) in revisions_release()
revisions API: release "reflog_info" in release revisions()
revisions API: clear "boundary_commits" in release_revisions()
revisions API: have release_revisions() release "prune_data"
revisions API: have release_revisions() release "grep_filter"
revisions API: have release_revisions() release "filter"
revisions API: have release_revisions() release "cmdline"
revisions API: have release_revisions() release "mailmap"
revisions API: have release_revisions() release "commits"
revisions API users: use release_revisions() for "prune_data" users
revisions API users: use release_revisions() with UNLEAK()
revisions API users: use release_revisions() in builtin/log.c
revisions API users: use release_revisions() in http-push.c
revisions API users: add "goto cleanup" for release_revisions()
stash: always have the owner of "stash_info" free it
revisions API users: use release_revisions() needing REV_INFO_INIT
revision.[ch]: document and move code declared around "init"
...
CMake updates.
* yw/cmake-updates:
cmake: remove (_)UNICODE def on Windows in CMakeLists.txt
cmake: add pcre2 support
cmake: fix CMakeLists.txt on Linux
Fix a regression reported[1] against f443246b9f (commit: convert
{pre-commit,prepare-commit-msg} hook to hook.h, 2021-12-22): Due to
using the run_process_parallel() API in the earlier 96e7225b31 (hook:
add 'run' subcommand, 2021-12-22) we'd capture the hook's stderr and
stdout, and thus lose the connection to the TTY in the case of
e.g. the "pre-commit" hook.
As a preceding commit notes GNU parallel's similar --ungroup option
also has it emit output faster. While we're unlikely to have hooks
that emit truly massive amounts of output (or where the performance
thereof matters) it's still informative to measure the overhead. In a
similar "seq" test we're now ~30% faster:
$ cat .git/hooks/seq-hook; git hyperfine -L rev origin/master,HEAD~0 -s 'make CFLAGS=-O3' './git hook run seq-hook'
#!/bin/sh
seq 100000000
Benchmark 1: ./git hook run seq-hook' in 'origin/master
Time (mean ± σ): 787.1 ms ± 13.6 ms [User: 701.6 ms, System: 534.4 ms]
Range (min … max): 773.2 ms … 806.3 ms 10 runs
Benchmark 2: ./git hook run seq-hook' in 'HEAD~0
Time (mean ± σ): 603.4 ms ± 1.6 ms [User: 573.1 ms, System: 30.3 ms]
Range (min … max): 601.0 ms … 606.2 ms 10 runs
Summary
'./git hook run seq-hook' in 'HEAD~0' ran
1.30 ± 0.02 times faster than './git hook run seq-hook' in 'origin/master'
1. https://lore.kernel.org/git/CA+dzEBn108QoMA28f0nC8K21XT+Afua0V2Qv8XkR8rAeqUCCZw@mail.gmail.com/
Reported-by: Anthony Sottile <asottile@umich.edu>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
[jc: minor fix-up to tests for consistency]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the parallel execution API added in c553c72eed (run-command:
add an asynchronous parallel child processor, 2015-12-15) to support a
mode where the stdout and stderr of the processes isn't captured and
output in a deterministic order, instead we'll leave it to the kernel
and stdio to sort it out.
This gives the API same functionality as GNU parallel's --ungroup
option. As we'll see in a subsequent commit the main reason to want
this is to support stdout and stderr being connected to the TTY in the
case of jobs=1, demonstrated here with GNU parallel:
$ parallel --ungroup 'test -t {} && echo TTY || echo NTTY' ::: 1 2
TTY
TTY
$ parallel 'test -t {} && echo TTY || echo NTTY' ::: 1 2
NTTY
NTTY
Another is as GNU parallel's documentation notes a potential for
optimization. As demonstrated in next commit our results with "git
hook run" will be similar, but generally speaking this shows that if
you want to run processes in parallel where the exact order isn't
important this can be a lot faster:
$ hyperfine -r 3 -L o ,--ungroup 'parallel {o} seq ::: 10000000 >/dev/null '
Benchmark 1: parallel seq ::: 10000000 >/dev/null
Time (mean ± σ): 220.2 ms ± 9.3 ms [User: 124.9 ms, System: 96.1 ms]
Range (min … max): 212.3 ms … 230.5 ms 3 runs
Benchmark 2: parallel --ungroup seq ::: 10000000 >/dev/null
Time (mean ± σ): 154.7 ms ± 0.9 ms [User: 136.2 ms, System: 25.1 ms]
Range (min … max): 153.9 ms … 155.7 ms 3 runs
Summary
'parallel --ungroup seq ::: 10000000 >/dev/null ' ran
1.42 ± 0.06 times faster than 'parallel seq ::: 10000000 >/dev/null '
A large part of the juggling in the API is to make the API safer for
its maintenance and consumers alike.
For the maintenance of the API we e.g. avoid malloc()-ing the
"pp->pfd", ensuring that SANITIZE=address and other similar tools will
catch any unexpected misuse.
For API consumers we take pains to never pass the non-NULL "out"
buffer to an API user that provided the "ungroup" option. The
resulting code in t/helper/test-run-command.c isn't typical of such a
user, i.e. they'd typically use one mode or the other, and would know
whether they'd provided "ungroup" or not.
We could also avoid the strbuf_init() for "buffered_output" by having
"struct parallel_processes" use a static PARALLEL_PROCESSES_INIT
initializer, but let's leave that cleanup for later.
Using a global "run_processes_parallel_ungroup" variable to enable
this option is rather nasty, but is being done here to produce as
minimal of a change as possible for a subsequent regression fix. This
change is extracted from a larger initial version[1] which ends up
with a better end-state for the API, but in doing so needed to modify
all existing callers of the API. Let's defer that for now, and
narrowly focus on what we need for fixing the regression in the
subsequent commit.
It's safe to do this with a global variable because:
A) hook.c is the only user of it that sets it to non-zero, and before
we'll get any other API users we'll refactor away this method of
passing in the option, i.e. re-roll [1].
B) Even if hook.c wasn't the only user we don't have callers of this
API that concurrently invoke this parallel process starting API
itself in parallel.
As noted above "A" && "B" are rather nasty, and we don't want to live
with those caveats long-term, but for now they should be an acceptable
compromise.
1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After generating diffs for each range to be compared using a 'git log'
invocation, range-diff.c::read_patches looks for the "diff --git" header
in those diffs to recognize the beginning of a new change.
In a project with submodules, and with 'diff.submodule=log' set in the
config, this header is missing for the diff of a changed submodule, so
any submodule changes are quietly ignored in the range-diff.
When 'diff.submodule=diff' is set in the config, the "diff --git" header
is also missing for the submodule itself, but is shown for submodule
content changes, which can easily confuse 'git range-diff' and lead to
errors such as:
error: git apply: bad git-diff - inconsistent old filename on line 1
error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
'
error: could not parse log for '@{u}..@{1}'
Force the submodule diff format to its default ("short") when invoking
'git log' to generate the patches for each range, such that submodule
changes are always detected.
Add a test, including an invocation with '--creation-factor=100' to
force the second commit in the range not to be considered a complete
rewrite, in order to verify we do indeed get the "short" format.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A mechanism to pack unreachable objects into a "cruft pack",
instead of ejecting them into loose form to be reclaimed later, has
been introduced.
* tb/cruft-packs:
sha1-file.c: don't freshen cruft packs
builtin/gc.c: conditionally avoid pruning objects via loose
builtin/repack.c: add cruft packs to MIDX during geometric repack
builtin/repack.c: use named flags for existing_packs
builtin/repack.c: allow configuring cruft pack generation
builtin/repack.c: support generating a cruft pack
builtin/pack-objects.c: --cruft with expiration
reachable: report precise timestamps from objects in cruft packs
reachable: add options to add_unseen_recent_objects_to_traversal
builtin/pack-objects.c: --cruft without expiration
builtin/pack-objects.c: return from create_object_entry()
t/helper: add 'pack-mtimes' test-tool
pack-mtimes: support writing pack .mtimes files
chunk-format.h: extract oid_version()
pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles'
pack-mtimes: support reading .mtimes files
Documentation/technical: add cruft-packs.txt
Disable the "do not remove the directory the user started Git in"
logic when Git cannot tell where that directory is. Earlier we
refused to run in such a case.
* kl/setup-in-unreadable-worktree:
setup: don't die if realpath(3) fails on getcwd(3)
A workflow change for translators are being proposed.
* jx/l10n-workflow-change:
l10n: Document the new l10n workflow
Makefile: add "po-init" rule to initialize po/XX.po
Makefile: add "po-update" rule to update po/XX.po
po/git.pot: don't check in result of "make pot"
po/git.pot: this is now a generated file
Makefile: remove duplicate and unwanted files in FOUND_SOURCE_FILES
i18n CI: stop allowing non-ASCII source messages in po/git.pot
Makefile: have "make pot" not "reset --hard"
Makefile: generate "po/git.pot" from stable LOCALIZED_C
Makefile: sort source files before feeding to xgettext
Teach "git repack --geometric" work better with "--keep-pack" and
avoid corrupting the repository when packsize limit is used.
* tb/geom-repack-with-keep-and-max:
builtin/repack.c: ensure that `names` is sorted
t7703: demonstrate object corruption with pack.packSizeLimit
repack: respect --keep-pack with geometric repack
"sparse-checkout" learns to work well with the sparse-index
feature.
* ds/sparse-sparse-checkout:
sparse-checkout: integrate with sparse index
p2000: add test for 'git sparse-checkout [add|set]'
sparse-index: complete partial expansion
sparse-index: partially expand directories
sparse-checkout: --no-sparse-index needs a full index
cache-tree: implement cache_tree_find_path()
sparse-index: introduce partially-sparse indexes
sparse-index: create expand_index()
t1092: stress test 'git sparse-checkout set'
t1092: refactor 'sparse-index contents' test
The multi-pack-index code did not protect the packfile it is going
to depend on from getting removed while in use, which has been
corrected.
* tb/midx-race-in-pack-objects:
builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects
builtin/pack-objects.c: ensure included `--stdin-packs` exist
builtin/pack-objects.c: avoid redundant NULL check
pack-bitmap.c: check preferred pack validity when opening MIDX bitmap
Preliminary code refactoring around transport and bundle code.
* ds/bundle-uri:
bundle.h: make "fd" version of read_bundle_header() public
remote: allow relative_url() to return an absolute url
remote: move relative_url()
http: make http_get_file() external
fetch-pack: move --keep=* option filling to a function
fetch-pack: add a deref_without_lazy_fetch_extended()
dir API: add a generalized path_match_flags() function
connect.c: refactor sending of agent & object-format
Introduce a filesystem-dependent mechanism to optimize the way the
bits for many loose object files are ensured to hit the disk
platter.
* ns/batch-fsync:
core.fsyncmethod: performance tests for batch mode
t/perf: add iteration setup mechanism to perf-lib
core.fsyncmethod: tests for batch mode
test-lib-functions: add parsing helpers for ls-files and ls-tree
core.fsync: use batch mode and sync loose objects by default on Windows
unpack-objects: use the bulk-checkin infrastructure
update-index: use the bulk-checkin infrastructure
builtin/add: add ODB transaction around add_files_to_cache
cache-tree: use ODB transaction around writing a tree
core.fsyncmethod: batched disk flushes for loose-objects
bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
bulk-checkin: rename 'state' variable and separate 'plugged' boolean
Deprecate non-cone mode of the sparse-checkout feature.
* en/sparse-cone-becomes-default:
Documentation: some sparsity wording clarifications
git-sparse-checkout.txt: mark non-cone mode as deprecated
git-sparse-checkout.txt: flesh out pattern set sections a bit
git-sparse-checkout.txt: add a new EXAMPLES section
git-sparse-checkout.txt: shuffle some sections and mark as internal
git-sparse-checkout.txt: update docs for deprecation of 'init'
git-sparse-checkout.txt: wording updates for the cone mode default
sparse-checkout: make --cone the default
tests: stop assuming --no-cone is the default mode for sparse-checkout
Add a test for the regression introduced in my 9c4d58ff2c (ls-tree:
split up "fast path" callbacks, 2022-03-23) and fixed in
350296cc78 (ls-tree: `-l` should not imply recursive listing,
2022-04-04), and test for the test of ls-tree option/mode combinations
to make sure we don't have other blind spots.
The setup for these tests can be shared with those added in the
1041d58b4d (Merge branch 'tl/ls-tree-oid-only', 2022-04-04) topic, so
let's create a new t/lib-t3100.sh to help them share data.
The existing tests in "t3104-ls-tree-format.sh" didn't deal with a
submodule, which they'll now encounter with as the
setup_basic_ls_tree_data() sets one up.
This extensive testing should give us confidence that there were no
further regressions in this area. The lack of testing was noted back
in [1], but unfortunately we didn't cover that blind-spot before
9c4d58ff2c.
1. https://lore.kernel.org/git/211115.86o86lqe3c.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow-up on a preceding commit which changed all references to the
"env_array" when referring to the "struct child_process" member. These
changes are all unnecessary for the compiler, but help the code's
human readers.
All the comments that referred to "env_array" have now been updated,
as well as function names and variables that had "env_array" in their
name, they now refer to "env".
In addition the "out" name for the submodule.h prototype was
inconsistent with the function definition's use of "env_array" in
submodule.c. Both of them use "env" now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>