Commit Graph

67061 Commits

Author SHA1 Message Date
Junio C Hamano
2246937e41 Merge branch 'tb/show-ref-optim'
"git show-ref --heads" (and "--tags") still iterated over all the
refs only to discard refs outside the specified area, which has
been corrected.

* tb/show-ref-optim:
  builtin/show-ref.c: avoid over-iterating with --heads, --tags
2022-06-13 15:53:42 -07:00
Junio C Hamano
11698e551c Merge branch 'ds/credentials-in-url'
The "fetch.credentialsInUrl" configuration variable controls what
happens when a URL with embedded login credential is used.

* ds/credentials-in-url:
  remote: create fetch.credentialsInUrl config
2022-06-13 15:53:42 -07:00
Junio C Hamano
eef985e17a Merge branch 'jt/unparse-commit-upon-graft-change'
Updating the graft information invalidates the list of parents of
in-core commit objects that used to be in the graft file.

* jt/unparse-commit-upon-graft-change:
  commit,shallow: unparse commits if grafts changed
2022-06-13 15:53:42 -07:00
Junio C Hamano
1a7f6be5b1 Merge branch 'ab/hooks-regression-fix'
In Git 2.36 we revamped the way how hooks are invoked.  One change
that is end-user visible is that the output of a hook is no longer
directly connected to the standard output of "git" that spawns the
hook, which was noticed post release.  This is getting corrected.

* ab/hooks-regression-fix:
  hook API: fix v2.36.0 regression: hooks should be connected to a TTY
  run-command: add an "ungroup" option to run_process_parallel()
2022-06-13 15:53:41 -07:00
Junio C Hamano
66c2948ffd Merge branch 'tl/ls-tree-oid-only'
Add tests for a regression fixed earlier.

* tl/ls-tree-oid-only:
  ls-tree: test for the regression in 9c4d58ff2c
2022-06-13 15:53:41 -07:00
Junio C Hamano
ecbd60ae99 Merge branch 'pb/range-diff-with-submodule'
"git -c diff.submodule=log range-diff" did not show anything for
submodules that changed in the ranges being compared, and
"git -c diff.submodule=diff range-diff" did not work correctly.
Fix this by including the "--submodule=short" output
unconditionally to be compared.

* pb/range-diff-with-submodule:
  range-diff: show submodule changes irrespective of diff.submodule
2022-06-13 15:53:41 -07:00
Junio C Hamano
5699ec1b0a Ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-10 15:05:15 -07:00
Junio C Hamano
4da14b574f Merge branch 'ab/bug-if-bug'
A new bug() and BUG_if_bug() API is introduced to make it easier to
uniformly log "detect multiple bugs and abort in the end" pattern.

* ab/bug-if-bug:
  cache-tree.c: use bug() and BUG_if_bug()
  receive-pack: use bug() and BUG_if_bug()
  parse-options.c: use optbug() instead of BUG() "opts" check
  parse-options.c: use new bug() API for optbug()
  usage.c: add a non-fatal bug() function to go with BUG()
  common-main.c: move non-trace2 exit() behavior out of trace2.c
2022-06-10 15:04:15 -07:00
Junio C Hamano
28c2a35997 Merge branch 'jy/gitweb-xhtml5'
Update the doctype written in gitweb output to xhtml5.

* jy/gitweb-xhtml5:
  gitweb: switch to an XHTML5 DOCTYPE
2022-06-10 15:04:15 -07:00
Junio C Hamano
9e496fffc8 Merge branch 'jh/builtin-fsmonitor-part3'
More fsmonitor--daemon.

* jh/builtin-fsmonitor-part3: (30 commits)
  t7527: improve implicit shutdown testing in fsmonitor--daemon
  fsmonitor--daemon: allow --super-prefix argument
  t7527: test Unicode NFC/NFD handling on MacOS
  t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd
  t/helper/hexdump: add helper to print hexdump of stdin
  fsmonitor: on macOS also emit NFC spelling for NFD pathname
  t7527: test FSMonitor on case insensitive+preserving file system
  fsmonitor: never set CE_FSMONITOR_VALID on submodules
  t/perf/p7527: add perf test for builtin FSMonitor
  t7527: FSMonitor tests for directory moves
  fsmonitor: optimize processing of directory events
  fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed
  fsm-health-win32: force shutdown daemon if worktree root moves
  fsm-health-win32: add polling framework to monitor daemon health
  fsmonitor--daemon: stub in health thread
  fsmonitor--daemon: rename listener thread related variables
  fsmonitor--daemon: prepare for adding health thread
  fsmonitor--daemon: cd out of worktree root
  fsm-listen-darwin: ignore FSEvents caused by xattr changes on macOS
  unpack-trees: initialize fsmonitor_has_run_once in o->result
  ...
2022-06-10 15:04:15 -07:00
Junio C Hamano
0b91d563d8 Merge branch 'gc/zero-length-branch-config-fix'
A misconfigured 'branch..remote' led to a bug in configuration
parsing.

* gc/zero-length-branch-config-fix:
  remote.c: reject 0-length branch names
  remote.c: don't BUG() on 0-length branch names
2022-06-10 15:04:14 -07:00
Junio C Hamano
c21fa3bb54 Merge branch 'ab/env-array'
Rename .env_array member to .env in the child_process structure.

* ab/env-array:
  run-command API users: use "env" not "env_array" in comments & names
  run-command API: rename "env_array" to "env"
2022-06-10 15:04:13 -07:00
Junio C Hamano
597553e42e Merge branch 'cb/buggy-gcc-12-workaround'
With a more targetted workaround in http.c in another topic, we may
be able to lift this blanket "GCC12 dangling-pointer warning is
broken and unsalvageable" workaround.

* cb/buggy-gcc-12-workaround:
  Revert -Wno-error=dangling-pointer
2022-06-10 15:04:12 -07:00
Junio C Hamano
1e59178e3f Sync with 'maint' 2022-06-08 14:29:30 -07:00
Junio C Hamano
dc8c8deaa6 Prepare for 2.36.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-08 14:27:53 -07:00
Junio C Hamano
d2b11e05e0 Merge branch 'jc/clone-remote-name-leak-fix' into maint
"git clone --origin X" leaked piece of memory that held value read
from the clone.defaultRemoteName configuration variable, which has
been plugged.
source: <xmqqlevl4ysk.fsf@gitster.g>

* jc/clone-remote-name-leak-fix:
  clone: plug a miniscule leak
2022-06-08 14:27:53 -07:00
Junio C Hamano
67c305f722 Merge branch 'ds/midx-normalize-pathname-before-comparison' into maint
The path taken by "git multi-pack-index" command from the end user
was compared with path internally prepared by the tool withut first
normalizing, which lead to duplicated paths not being noticed,
which has been corrected.
source: <pull.1221.v2.git.1650911234.gitgitgadget@gmail.com>

* ds/midx-normalize-pathname-before-comparison:
  cache: use const char * for get_object_directory()
  multi-pack-index: use --object-dir real path
  midx: use real paths in lookup_multi_pack_index()
2022-06-08 14:27:53 -07:00
Junio C Hamano
363d54ff80 Merge branch 'ah/rebase-keep-base-fix' into maint
"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
source: <20220421044233.894255-1-alexhenrie24@gmail.com>

* ah/rebase-keep-base-fix:
  rebase: use correct base for --keep-base when a branch is given
2022-06-08 14:27:52 -07:00
Junio C Hamano
d777ef9bef Merge branch 'pw/test-malloc-with-sanitize-address' into maint
Avoid problems from interaction between malloc_check and address
sanitizer.
source: <pull.1210.git.1649507317350.gitgitgadget@gmail.com>

* pw/test-malloc-with-sanitize-address:
  tests: make SANITIZE=address imply TEST_NO_MALLOC_CHECK
2022-06-08 14:27:52 -07:00
Junio C Hamano
ac8f6b6608 Merge branch 'rs/commit-summary-wo-break-rewrite' into maint
The commit summary shown after making a commit is matched to what
is given in "git status" not to use the break-rewrite heuristics.
source: <c35bd0aa-2e46-e710-2b39-89f18bad0097@web.de>

* rs/commit-summary-wo-break-rewrite:
  commit, sequencer: turn off break_opt for commit summary
2022-06-08 14:27:52 -07:00
Junio C Hamano
a5a52739e9 Merge branch 'mg/detect-compiler-in-c-locale' into maint
Build procedure fixup.
source: <f306f43f375bc9b9c98e85260587442e5d9ef0ba.1652094958.git.git@grubix.eu>

* mg/detect-compiler-in-c-locale:
  detect-compiler: make detection independent of locale
2022-06-08 14:27:52 -07:00
Junio C Hamano
080b062071 Merge branch 'cb/ci-make-p4-optional' into maint
macOS CI jobs have been occasionally flaky due to tentative version
skew between perforce and the homebrew packager.  Instead of
failing the whole CI job, just let it skip the p4 tests when this
happens.
source: <20220512223940.238367-1-gitster@pobox.com>

* cb/ci-make-p4-optional:
  ci: use https, not http to download binaries from perforce.com
  ci: reintroduce prevention from perforce being quarantined in macOS
  ci: avoid brew for installing perforce
  ci: make failure to find perforce more user friendly
2022-06-08 14:27:51 -07:00
Junio C Hamano
f02e23405f Merge branch 'ab/valgrind-fixes' into maint
A bit of test framework fixes with a few fixes to issues found by
valgrind.
source: <20220512223218.237544-1-gitster@pobox.com>

* ab/valgrind-fixes:
  commit-graph.c: don't assume that stat() succeeds
  object-file: fix a unpack_loose_header() regression in 3b6a8db3b0
  log test: skip a failing mkstemp() test under valgrind
  tests: using custom GIT_EXEC_PATH breaks --valgrind tests
2022-06-08 14:27:51 -07:00
Junio C Hamano
9d1304155b Merge branch 'jc/archive-add-file-normalize-mode' into maint
"git archive --add-file=<path>" picked up the raw permission bits
from the path and propagated to zip output in some cases, without
normalization, which has been corrected (tar output did not have
this issue).
source: <xmqqmtfme8v6.fsf@gitster.g>

* jc/archive-add-file-normalize-mode:
  archive: do not let on-disk mode leak to zip archives
2022-06-08 14:27:51 -07:00
Junio C Hamano
c47b89cde6 Merge branch 'jc/show-branch-g-current' into maint
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
source: <xmqqh76mf7s4.fsf_-_@gitster.g>

* jc/show-branch-g-current:
  show-branch: -g and --current are incompatible
2022-06-08 14:27:51 -07:00
Junio C Hamano
b8117d2c08 Merge branch 'jc/update-ozlabs-url' into maint
Update URL to the gitk repository.

* jc/update-ozlabs-url:
  SubmittingPatches: use more stable git.ozlabs.org URL
2022-06-08 14:27:51 -07:00
Junio C Hamano
79d1e6d407 Merge branch 'jc/http-clear-finished-pointer' into maint
Meant to go with js/ci-gcc-12-fixes.
source: <xmqq7d68ytj8.fsf_-_@gitster.g>

* jc/http-clear-finished-pointer:
  http.c: clear the 'finished' member once we are done with it
2022-06-08 14:27:50 -07:00
Junio C Hamano
596838d2c5 Merge branch 'js/ci-gcc-12-fixes' into maint
Fixes real problems noticed by gcc 12 and works around false
positives.
source: <pull.1238.git.1653351786.gitgitgadget@gmail.com>

* js/ci-gcc-12-fixes:
  dir.c: avoid "exceeds maximum object size" error with GCC v12.x
  nedmalloc: avoid new compile error
  compat/win32/syslog: fix use-after-realloc
2022-06-08 14:27:50 -07:00
Junio C Hamano
9c897eef06 Eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-07 14:11:05 -07:00
Junio C Hamano
f00809500f Merge branch 'jc/all-negative-pathspec'
A git subcommand like "git add -p" spawns a separate git process
while relaying its command line arguments.  A pathspec with only
negative elements was mistakenly passed with an empty string, which
has been corrected.

* jc/all-negative-pathspec:
  pathspec: correct an empty string used as a pathspec element
2022-06-07 14:10:59 -07:00
Junio C Hamano
08baf19fa3 Merge branch 'js/scalar-diagnose'
Implementation of "scalar diagnose" subcommand.

* js/scalar-diagnose:
  scalar: teach `diagnose` to gather loose objects information
  scalar: teach `diagnose` to gather packfile info
  scalar diagnose: include disk space information
  scalar: implement `scalar diagnose`
  scalar: validate the optional enlistment argument
  archive --add-virtual-file: allow paths containing colons
  archive: optionally add "virtual" files
2022-06-07 14:10:58 -07:00
Junio C Hamano
006fd83e03 Merge branch 'rs/document-archive-prefix'
The documentation on the interaction between "--add-file" and
"--prefix" options of "git archive" has been improved.

* rs/document-archive-prefix:
  archive: improve documentation of --prefix
2022-06-07 14:10:57 -07:00
Junio C Hamano
07a454027b Merge branch 'fh/transport-push-leakfix'
Leakfix.

* fh/transport-push-leakfix:
  transport: free local and remote refs in transport_push()
  transport: unify return values and exit point from transport_push()
  transport: remove unnecessary indenting in transport_push()
2022-06-07 14:10:57 -07:00
Junio C Hamano
fc5a070f59 Merge branch 'js/ci-github-workflow-markup'
Update the GitHub workflow support to make it quicker to get to the
failing test.

* js/ci-github-workflow-markup:
  ci: call `finalize_test_case_output` a little later
  ci(github): mention where the full logs can be found
  ci: use `--github-workflow-markup` in the GitHub workflow
  ci(github): avoid printing test case preamble twice
  ci(github): skip the logs of the successful test cases
  ci: optionally mark up output in the GitHub workflow
  ci/run-build-and-tests: add some structure to the GitHub workflow output
  ci: make it easier to find failed tests' logs in the GitHub workflow
  ci/run-build-and-tests: take a more high-level view
  test(junit): avoid line feeds in XML attributes
  tests: refactor --write-junit-xml code
  ci: fix code style
2022-06-07 14:10:57 -07:00
Junio C Hamano
2da81d1efb Merge branch 'ab/plug-leak-in-revisions'
Plug the memory leaks from the trickiest API of all, the revision
walker.

* ab/plug-leak-in-revisions: (27 commits)
  revisions API: add a TODO for diff_free(&revs->diffopt)
  revisions API: have release_revisions() release "topo_walk_info"
  revisions API: have release_revisions() release "date_mode"
  revisions API: call diff_free(&revs->pruning) in revisions_release()
  revisions API: release "reflog_info" in release revisions()
  revisions API: clear "boundary_commits" in release_revisions()
  revisions API: have release_revisions() release "prune_data"
  revisions API: have release_revisions() release "grep_filter"
  revisions API: have release_revisions() release "filter"
  revisions API: have release_revisions() release "cmdline"
  revisions API: have release_revisions() release "mailmap"
  revisions API: have release_revisions() release "commits"
  revisions API users: use release_revisions() for "prune_data" users
  revisions API users: use release_revisions() with UNLEAK()
  revisions API users: use release_revisions() in builtin/log.c
  revisions API users: use release_revisions() in http-push.c
  revisions API users: add "goto cleanup" for release_revisions()
  stash: always have the owner of "stash_info" free it
  revisions API users: use release_revisions() needing REV_INFO_INIT
  revision.[ch]: document and move code declared around "init"
  ...
2022-06-07 14:10:56 -07:00
Junio C Hamano
f31b624495 Merge branch 'yw/cmake-updates'
CMake updates.

* yw/cmake-updates:
  cmake: remove (_)UNICODE def on Windows in CMakeLists.txt
  cmake: add pcre2 support
  cmake: fix CMakeLists.txt on Linux
2022-06-07 14:10:56 -07:00
Ævar Arnfjörð Bjarmason
a082345372 hook API: fix v2.36.0 regression: hooks should be connected to a TTY
Fix a regression reported[1] against f443246b9f (commit: convert
{pre-commit,prepare-commit-msg} hook to hook.h, 2021-12-22): Due to
using the run_process_parallel() API in the earlier 96e7225b31 (hook:
add 'run' subcommand, 2021-12-22) we'd capture the hook's stderr and
stdout, and thus lose the connection to the TTY in the case of
e.g. the "pre-commit" hook.

As a preceding commit notes GNU parallel's similar --ungroup option
also has it emit output faster. While we're unlikely to have hooks
that emit truly massive amounts of output (or where the performance
thereof matters) it's still informative to measure the overhead. In a
similar "seq" test we're now ~30% faster:

	$ cat .git/hooks/seq-hook; git hyperfine -L rev origin/master,HEAD~0 -s 'make CFLAGS=-O3' './git hook run seq-hook'
	#!/bin/sh

	seq 100000000
	Benchmark 1: ./git hook run seq-hook' in 'origin/master
	  Time (mean ± σ):     787.1 ms ±  13.6 ms    [User: 701.6 ms, System: 534.4 ms]
	  Range (min … max):   773.2 ms … 806.3 ms    10 runs

	Benchmark 2: ./git hook run seq-hook' in 'HEAD~0
	  Time (mean ± σ):     603.4 ms ±   1.6 ms    [User: 573.1 ms, System: 30.3 ms]
	  Range (min … max):   601.0 ms … 606.2 ms    10 runs

	Summary
	  './git hook run seq-hook' in 'HEAD~0' ran
	    1.30 ± 0.02 times faster than './git hook run seq-hook' in 'origin/master'

1. https://lore.kernel.org/git/CA+dzEBn108QoMA28f0nC8K21XT+Afua0V2Qv8XkR8rAeqUCCZw@mail.gmail.com/

Reported-by: Anthony Sottile <asottile@umich.edu>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
[jc: minor fix-up to tests for consistency]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-07 11:13:20 -07:00
Ævar Arnfjörð Bjarmason
fd3aaf53f7 run-command: add an "ungroup" option to run_process_parallel()
Extend the parallel execution API added in c553c72eed (run-command:
add an asynchronous parallel child processor, 2015-12-15) to support a
mode where the stdout and stderr of the processes isn't captured and
output in a deterministic order, instead we'll leave it to the kernel
and stdio to sort it out.

This gives the API same functionality as GNU parallel's --ungroup
option. As we'll see in a subsequent commit the main reason to want
this is to support stdout and stderr being connected to the TTY in the
case of jobs=1, demonstrated here with GNU parallel:

	$ parallel --ungroup 'test -t {} && echo TTY || echo NTTY' ::: 1 2
	TTY
	TTY
	$ parallel 'test -t {} && echo TTY || echo NTTY' ::: 1 2
	NTTY
	NTTY

Another is as GNU parallel's documentation notes a potential for
optimization. As demonstrated in next commit our results with "git
hook run" will be similar, but generally speaking this shows that if
you want to run processes in parallel where the exact order isn't
important this can be a lot faster:

	$ hyperfine -r 3 -L o ,--ungroup 'parallel {o} seq ::: 10000000 >/dev/null '
	Benchmark 1: parallel  seq ::: 10000000 >/dev/null
	  Time (mean ± σ):     220.2 ms ±   9.3 ms    [User: 124.9 ms, System: 96.1 ms]
	  Range (min … max):   212.3 ms … 230.5 ms    3 runs

	Benchmark 2: parallel --ungroup seq ::: 10000000 >/dev/null
	  Time (mean ± σ):     154.7 ms ±   0.9 ms    [User: 136.2 ms, System: 25.1 ms]
	  Range (min … max):   153.9 ms … 155.7 ms    3 runs

	Summary
	  'parallel --ungroup seq ::: 10000000 >/dev/null ' ran
	    1.42 ± 0.06 times faster than 'parallel  seq ::: 10000000 >/dev/null '

A large part of the juggling in the API is to make the API safer for
its maintenance and consumers alike.

For the maintenance of the API we e.g. avoid malloc()-ing the
"pp->pfd", ensuring that SANITIZE=address and other similar tools will
catch any unexpected misuse.

For API consumers we take pains to never pass the non-NULL "out"
buffer to an API user that provided the "ungroup" option. The
resulting code in t/helper/test-run-command.c isn't typical of such a
user, i.e. they'd typically use one mode or the other, and would know
whether they'd provided "ungroup" or not.

We could also avoid the strbuf_init() for "buffered_output" by having
"struct parallel_processes" use a static PARALLEL_PROCESSES_INIT
initializer, but let's leave that cleanup for later.

Using a global "run_processes_parallel_ungroup" variable to enable
this option is rather nasty, but is being done here to produce as
minimal of a change as possible for a subsequent regression fix. This
change is extracted from a larger initial version[1] which ends up
with a better end-state for the API, but in doing so needed to modify
all existing callers of the API. Let's defer that for now, and
narrowly focus on what we need for fixing the regression in the
subsequent commit.

It's safe to do this with a global variable because:

 A) hook.c is the only user of it that sets it to non-zero, and before
    we'll get any other API users we'll refactor away this method of
    passing in the option, i.e. re-roll [1].

 B) Even if hook.c wasn't the only user we don't have callers of this
    API that concurrently invoke this parallel process starting API
    itself in parallel.

As noted above "A" && "B" are rather nasty, and we don't want to live
with those caveats long-term, but for now they should be an acceptable
compromise.

1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-07 10:01:41 -07:00
Philippe Blain
04b1f1fd9d range-diff: show submodule changes irrespective of diff.submodule
After generating diffs for each range to be compared using a 'git log'
invocation, range-diff.c::read_patches looks for the "diff --git" header
in those diffs to recognize the beginning of a new change.

In a project with submodules, and with 'diff.submodule=log' set in the
config, this header is missing for the diff of a changed submodule, so
any submodule changes are quietly ignored in the range-diff.

When 'diff.submodule=diff' is set in the config, the "diff --git" header
is also missing for the submodule itself, but is shown for submodule
content changes, which can easily confuse 'git range-diff' and lead to
errors such as:

    error: git apply: bad git-diff - inconsistent old filename on line 1
    error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
    '
    error: could not parse log for '@{u}..@{1}'

Force the submodule diff format to its default ("short") when invoking
'git log' to generate the patches for each range, such that submodule
changes are always detected.

Add a test, including an invocation with '--creation-factor=100' to
force the second commit in the range not to be considered a complete
rewrite, in order to verify we do indeed get the "short" format.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 15:47:01 -07:00
Jonathan Tan
4d4e49fff1 commit,shallow: unparse commits if grafts changed
When a commit is parsed, it pretends to have a different (possibly
empty) list of parents if there is graft information for that commit.
But there is a bug that could occur when a commit is parsed, the graft
information is updated (for example, when a shallow file is rewritten),
and the same commit is subsequently used: the parents of the commit do
not conform to the updated graft information, but the information at the
time of parsing.

This is usually not an issue, as a commit is usually introduced into the
repository at the same time as its graft information. That means that
when we try to parse that commit, we already have its graft information.

But it is an issue when fetching a shallow point directly into a
repository with submodules. The function
assign_shallow_commits_to_refs() parses all sought objects (including
the shallow point, which we are directly fetching). In update_shallow()
in fetch-pack.c, assign_shallow_commits_to_refs() is called before
commit_shallow_file(), which means that the shallow point would have
been parsed before graft information is updated. Once a commit is
parsed, it is no longer sensitive to any graft information updates. This
parsed commit is subsequently used when we do a revision walk to search
for submodules to fetch, meaning that the commit is considered to have
parents even though it is a shallow point (and therefore should be
treated as having no parents).

Therefore, whenever graft information is updated, mark the commits that
were previously grafts and the commits that are newly grafts as
unparsed.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 11:50:34 -07:00
Taylor Blau
c0c9d35e27 builtin/show-ref.c: avoid over-iterating with --heads, --tags
When `show-ref` is combined with the `--heads` or `--tags` options, it
can avoid iterating parts of a repository's references that it doesn't
care about.

But it doesn't take advantage of this potential optimization. When this
command was introduced back in 358ddb62cf (Add "git show-ref" builtin
command, 2006-09-15), `for_each_ref_in()` did exist. But since most
repositories don't have many (any?) references that aren't branches or
tags already, this makes little difference in practice.

Though for repositories with a large imbalance of branches and tags (or,
more likely in the case of server operators, many hidden references),
this can make quite a difference. Take, for example, a repository with
500,000 "hidden" references (all of the form "refs/__hidden__/N"), and
a single branch:

    git commit --allow-empty -m "base" &&
    seq 1 500000 | sed 's,\(.*\),create refs/__hidden__/\1 HEAD,' |
      git update-ref --stdin &&
    git pack-refs --all

Outputting the existence of that single branch currently takes on the
order of ~50ms on my machine. The vast majority of this time is wasted
iterating through references that we know we're going to discard.

Instead, teach `show-ref` that it can iterate just "refs/heads" and/or
"refs/tags" when given `--heads` and/or `--tags`, respectively. A few
small interesting things to note:

  - When given either option, we can avoid the general-purpose
    for_each_ref() call altogether, since we know that it won't give us
    any references that we wouldn't filter out already.

  - We can make two separate calls to `for_each_fullref_in()` (and
    avoid, say, the more specialized `for_each_fullref_in_prefixes()`,
    since we know that the set of references enumerated by each is
    disjoint, so we'll never see the same reference appear in both
    calls.

  - We have to use the "fullref" variant (instead of just
    `for_each_branch_ref()` and `for_each_tag_ref()`), since we expect
    fully-qualified reference names to appear in `show-ref`'s output.

When either of `heads_only` or `tags_only` is set, we can eliminate the
strcmp() calls in `builtin/show-ref.c::show_ref()` altogether, since we
know that `show_ref()` will never see a non-branch or tag reference.

Unfortunately, we can't use `for_each_fullref_in_prefixes()` to enhance
`show-ref`'s pattern matching, since `show-ref` patterns match on the
_suffix_ (e.g., the pattern "foo" shows "refs/heads/foo",
"refs/tags/foo", and etc, not "foo/*").

Nonetheless, in our synthetic example above, this provides a significant
speed-up ("git" is roughly v2.36, "git.compile" is this patch):

    $ hyperfine -N 'git show-ref --heads' 'git.compile show-ref --heads'
    Benchmark 1: git show-ref --heads
      Time (mean ± σ):      49.9 ms ±   6.2 ms    [User: 45.6 ms, System: 4.1 ms]
      Range (min … max):    46.1 ms …  73.6 ms    43 runs

    Benchmark 2: git.compile show-ref --heads
      Time (mean ± σ):       2.8 ms ±   0.4 ms    [User: 1.4 ms, System: 1.2 ms]
      Range (min … max):     1.3 ms …   5.6 ms    957 runs

    Summary
      'git.compile show-ref --heads' ran
       18.03 ± 3.38 times faster than 'git show-ref --heads'

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 09:56:42 -07:00
Derrick Stolee
6dcbdc0d66 remote: create fetch.credentialsInUrl config
Users sometimes provide a "username:password" combination in their
plaintext URLs. Since Git stores these URLs in plaintext in the
.git/config file, this is a very insecure way of storing these
credentials. Credential managers are a more secure way of storing this
information.

System administrators might want to prevent this kind of use by users on
their machines.

Create a new "fetch.credentialsInUrl" config option and teach Git to
warn or die when seeing a URL with this kind of information. The warning
anonymizes the sensitive information of the URL to be clear about the
issue.

This change currently defaults the behavior to "allow" which does
nothing with these URLs. We can consider changing this behavior to
"warn" by default if we wish. At that time, we may want to add some
advice about setting fetch.credentialsInUrl=ignore for users who still
want to follow this pattern (and not receive the warning).

An earlier version of this change injected the logic into
url_normalize() in urlmatch.c. While most code paths that parse URLs
eventually normalize the URL, that normalization does not happen early
enough in the stack to avoid attempting connections to the URL first. By
inserting a check into the remote validation, we identify the issue
before making a connection. In the old code path, this was revealed by
testing the new t5601-clone.sh test under --stress, resulting in an
instance where the return code was 13 (SIGPIPE) instead of 128 from the
die().

However, we can reuse the parsing information from url_normalize() in
order to benefit from its well-worn parsing logic. We can use the struct
url_info that is created in that method to replace the password with
"<redacted>" in our error messages. This comes with a slight downside
that the normalized URL might look slightly different from the input URL
(for instance, the normalized version adds a closing slash). This should
not hinder users figuring out what the problem is and being able to fix
the issue.

As an attempt to ensure the parsing logic did not catch any
unintentional cases, I modified this change locally to to use the "die"
option by default. Running the test suite succeeds except for the
explicit username:password URLs used in t5550-http-fetch-dumb.sh and
t5541-http-push-smart.sh. This means that all other tested URLs did not
trigger this logic.

The tests show that the proper error messages appear (or do not
appear), but also count the number of error messages. When only warning,
each process validates the remote URL and outputs a warning. This
happens twice for clone, three times for fetch, and once for push.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 09:32:32 -07:00
Junio C Hamano
ab336e8f1c Seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-03 14:30:45 -07:00
Junio C Hamano
a50036da1a Merge branch 'tb/cruft-packs'
A mechanism to pack unreachable objects into a "cruft pack",
instead of ejecting them into loose form to be reclaimed later, has
been introduced.

* tb/cruft-packs:
  sha1-file.c: don't freshen cruft packs
  builtin/gc.c: conditionally avoid pruning objects via loose
  builtin/repack.c: add cruft packs to MIDX during geometric repack
  builtin/repack.c: use named flags for existing_packs
  builtin/repack.c: allow configuring cruft pack generation
  builtin/repack.c: support generating a cruft pack
  builtin/pack-objects.c: --cruft with expiration
  reachable: report precise timestamps from objects in cruft packs
  reachable: add options to add_unseen_recent_objects_to_traversal
  builtin/pack-objects.c: --cruft without expiration
  builtin/pack-objects.c: return from create_object_entry()
  t/helper: add 'pack-mtimes' test-tool
  pack-mtimes: support writing pack .mtimes files
  chunk-format.h: extract oid_version()
  pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles'
  pack-mtimes: support reading .mtimes files
  Documentation/technical: add cruft-packs.txt
2022-06-03 14:30:37 -07:00
Junio C Hamano
37d4ae58ef Merge branch 'kl/setup-in-unreadable-worktree'
Disable the "do not remove the directory the user started Git in"
logic when Git cannot tell where that directory is.  Earlier we
refused to run in such a case.

* kl/setup-in-unreadable-worktree:
  setup: don't die if realpath(3) fails on getcwd(3)
2022-06-03 14:30:36 -07:00
Junio C Hamano
28db3b7b71 Merge branch 'jx/l10n-workflow-change'
A workflow change for translators are being proposed.

* jx/l10n-workflow-change:
  l10n: Document the new l10n workflow
  Makefile: add "po-init" rule to initialize po/XX.po
  Makefile: add "po-update" rule to update po/XX.po
  po/git.pot: don't check in result of "make pot"
  po/git.pot: this is now a generated file
  Makefile: remove duplicate and unwanted files in FOUND_SOURCE_FILES
  i18n CI: stop allowing non-ASCII source messages in po/git.pot
  Makefile: have "make pot" not "reset --hard"
  Makefile: generate "po/git.pot" from stable LOCALIZED_C
  Makefile: sort source files before feeding to xgettext
2022-06-03 14:30:36 -07:00
Junio C Hamano
16a0e92ddc Merge branch 'tb/geom-repack-with-keep-and-max'
Teach "git repack --geometric" work better with "--keep-pack" and
avoid corrupting the repository when packsize limit is used.

* tb/geom-repack-with-keep-and-max:
  builtin/repack.c: ensure that `names` is sorted
  t7703: demonstrate object corruption with pack.packSizeLimit
  repack: respect --keep-pack with geometric repack
2022-06-03 14:30:36 -07:00
Junio C Hamano
c276c21da6 Merge branch 'ds/sparse-sparse-checkout'
"sparse-checkout" learns to work well with the sparse-index
feature.

* ds/sparse-sparse-checkout:
  sparse-checkout: integrate with sparse index
  p2000: add test for 'git sparse-checkout [add|set]'
  sparse-index: complete partial expansion
  sparse-index: partially expand directories
  sparse-checkout: --no-sparse-index needs a full index
  cache-tree: implement cache_tree_find_path()
  sparse-index: introduce partially-sparse indexes
  sparse-index: create expand_index()
  t1092: stress test 'git sparse-checkout set'
  t1092: refactor 'sparse-index contents' test
2022-06-03 14:30:35 -07:00
Junio C Hamano
091680472d Merge branch 'tb/midx-race-in-pack-objects'
The multi-pack-index code did not protect the packfile it is going
to depend on from getting removed while in use, which has been
corrected.

* tb/midx-race-in-pack-objects:
  builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects
  builtin/pack-objects.c: ensure included `--stdin-packs` exist
  builtin/pack-objects.c: avoid redundant NULL check
  pack-bitmap.c: check preferred pack validity when opening MIDX bitmap
2022-06-03 14:30:35 -07:00
Junio C Hamano
d8c8dccbaa Merge branch 'ds/object-file-unpack-loose-header-fix'
Coding style fix.

* ds/object-file-unpack-loose-header-fix:
  object-file: convert 'switch' back to 'if'
2022-06-03 14:30:35 -07:00