Commit Graph

20561 Commits

Author SHA1 Message Date
Derrick Stolee
876094ac16 clone: unbundle the advertised bundles
A previous change introduced the transport methods to acquire a bundle
list from the 'bundle-uri' protocol v2 command, when advertised _and_
when the client has chosen to enable the feature.

Teach Git to download and unbundle the data advertised by those bundles
during 'git clone'. This takes place between the ref advertisement and
the object data download, and stateful connections will linger while
the client downloads bundles. In the future, we should consider closing
the remote connection during this process.

Also, since the --bundle-uri option exists, we do not want to mix the
advertised bundles with the user-specified bundles.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
Derrick Stolee
ebc3947955 bundle-uri: allow relative URLs in bundle lists
Bundle providers may want to distribute that data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.

Allow a bundle list to specify a relative URI for the bundles. This URI
is based on where the client received the bundle list. For a list
provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
the base URI. Otherwise, the bundle list was provided from an HTTP URI
not using the Git protocol, and that URI is the base URI. This allows
easier distribution of bundle data.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
Derrick Stolee
738dc7d4a5 bundle-uri: serve bundle.* keys from config
Implement the "bundle-uri" protocol v2 capability by populating the
key=value packet lines from the local Git config. The list of bundles is
provided from the keys beginning with "bundle.".

In the future, we may want to filter this list to be more specific to
the exact known keys that the server intends to share, but for
flexibility at the moment we will assume that the config values are
well-formed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
Ævar Arnfjörð Bjarmason
70b9c10373 bundle-uri client: add helper for testing server
Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
for issuing protocol v2 "bundle-uri" commands to a server, and to the
parsing routines in bundle-uri.c.

In the "git clone" case we'll have already done the handshake(),
but not here. Add an extra case to check for this handshake in
get_bundle_uri() for ease of use for future callers.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
Ævar Arnfjörð Bjarmason
7cce9074a7 bundle-uri client: add boolean transfer.bundleURI setting
The yet-to-be introduced client support for bundle-uri will always
fall back on a full clone, but we'd still like to be able to ignore a
server's bundle-uri advertisement entirely.

The new transfer.bundleURI config option defaults to 'false', but a user
can set it to 'true' to enable checking for bundle URIs from the origin
Git server using protocol v2.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00
Ævar Arnfjörð Bjarmason
0cfde740f0 clone: request the 'bundle-uri' command when available
Set up all the needed client parts of the 'bundle-uri' protocol v2
command, without actually doing anything with the bundle URIs.

If the server says it supports 'bundle-uri' teach Git to issue the
'bundle-uri' command after the 'ls-refs' during 'git clone'. The
returned key=value pairs are passed to the bundle list code which is
tested using a different ingest mechanism in t5750-bundle-uri-parse.sh.

At this point, Git does nothing with that bundle list. It will not
download any of the bundles. That will come in a later change after
these protocol bits are finalized.

The no-op client is initially used only by 'git clone' to test the basic
functionality, and eventually will bootstrap the initial download of Git
objects during a fresh clone. The bundle URI client will not be
integrated into other fetches until a mechanism is created to select a
subset of bundles for download.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00
Ævar Arnfjörð Bjarmason
8f788eb8b7 t: create test harness for 'bundle-uri' command
The previous change allowed for a Git server to advertise the
'bundle-uri' command as a capability based on the
uploadPack.advertiseBundleURIs config option. Create a set of tests that
check that this capability is advertised using 'git ls-remote'.

In order to test this functionality across three protocols (file, git,
and http), create lib-bundle-uri-protocol.sh to generalize the tests,
allowing the other test scripts to set an environment variable and
otherwise inherit the setup and tests from this script.

The tests currently only test that the 'bundle-uri' command is
advertised or not. Other actions will be tested as the Git client learns
to request the 'bundle-uri' command and parse its response.

To help with URI escaping, specifically for file paths with a space in
them, extract a 'sed' invocation from t9199-git-svn-info.sh into a
helper function for use here, too.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00
Ævar Arnfjörð Bjarmason
8b8d9a2298 protocol v2: add server-side "bundle-uri" skeleton
Add a skeleton server-side implementation of a new "bundle-uri" command
to protocol v2. This will allow conforming clients to optionally seed
their initial clones or incremental fetches from URLs containing
"*.bundle" files created with "git bundle create".

This change only performs the basic boilerplate of advertising a new
protocol v2 capability. The new 'bundle-uri' capability allows a client
to request a list of bundles. Right now, the server only returns a flush
packet, which corresponds to an empty advertisement. The bundle.* config
namespace describes which key-value pairs will be communicated across
this interface in future updates.

The critical bit right now is that the new boolean
uploadPack.adverstiseBundleURIs config value signals whether or not this
capability should be advertised at all.

An earlier version of this patch [1] used a different transfer format
than the "key=value" pairs in the current implementation. The change was
made to unify the protocol v2 command with the bundle lists provided by
independent bundle servers. Further, the standard allows for the server
to advertise a URI that contains a bundle list. This allows users
automatically discovering bundle providers that are loosely associated
with the origin server, but without the origin server knowing exactly
which bundles are currently available.

[1] https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/

The very-deep headings needed to be modified to stop at level 4 due to
documentation build issues. These were not recognized in earlier builds
since the file was previously in the Documentation/technical/ directory
and was built in a different way. With its current location, the
heavily-nested details were causing build issues and they are now
replaced with a bulletted list of details.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00
Ævar Arnfjörð Bjarmason
891cb09db6 bundle: don't segfault on "git bundle <subcmd>"
Since aef7d75e58 (builtin/bundle.c: let parse-options parse
subcommands, 2022-08-19) we've been segfaulting if no argument was
provided.

The fix is easy, as all of the "git bundle" subcommands require a
non-option argument we can check that we have arguments left after
calling parse-options().

This makes use of code added in 73c3253d75 (bundle: framework for
options before bundle file, 2019-11-10), before this change that code
has always been unreachable. In 73c3253d75 we'd never reach it as we
already checked "argc < 2" in cmd_bundle() itself.

Then when aef7d75e58 (whose segfault we're fixing here) migrated this
code to the subcommand API it removed that "argc < 2" check, but we
were still checking the wrong "argc" in parse_options_cmd_bundle(), we
need to check the "newargc". The "argc" will always be >= 1, as it
will necessarily contain at least the subcommand name
itself (e.g. "create").

As an aside, this could be safely squashed into this, but let's not do
that for this minimal segfault fix, as it's an unrelated refactoring:

	--- a/builtin/bundle.c
	+++ b/builtin/bundle.c
	@@ -55,13 +55,12 @@ static int parse_options_cmd_bundle(int argc,
	 		const char * const usagestr[],
	 		const struct option options[],
	 		char **bundle_file) {
	-	int newargc;
	-	newargc = parse_options(argc, argv, NULL, options, usagestr,
	+	argc = parse_options(argc, argv, NULL, options, usagestr,
	 			     PARSE_OPT_STOP_AT_NON_OPTION);
	-	if (!newargc)
	+	if (!argc)
	 		usage_with_options(usagestr, options);
	 	*bundle_file = prefix_filename(prefix, argv[0]);
	-	return newargc;
	+	return argc;
	 }

	 static int cmd_bundle_create(int argc, const char **argv, const char *prefix) {

Reported-by: Hubert Jasudowicz <hubertj@stmcyber.pl>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Tested-by: Hubert Jasudowicz <hubertj@stmcyber.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:01:09 +09:00
Siddharth Asthana
a797c0ea04 cat-file: add mailmap support to --batch-check option
Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.

In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.

There were two ways to do it:

1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
   option is given, the first call will get us the type of the object
   and second call will only be made if the object type is either a
   commit or tag to get the contents of the object.

2. Make one call to `oid_object_info_extended()` to get the type of the
   object. Then, if the object type is either of commit or tag, make a
   call to `repo_read_object_file()` to read the contents of the object.

I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:

`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`

The results can be summarized as follows:
                       Time (mean ± σ)
default               827.7 ms ± 104.8 ms
first approach        6.197 s ± 0.093 s
second approach       1.975 s ± 0.217 s

Since, the second approach is faster than the first one, I implemented
it in this patch.

The command git cat-file can now use the mailmap mechanism to replace
idents with canonical versions for commit and tag objects. There are
several options like `--batch`, `--batch-check` and `--batch-command`
that can be combined with `--use-mailmap`. But the documentation for
`--batch`, `--batch-check` and `--batch-command` doesn't say so. This
patch fixes that documentation.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 15:20:45 +09:00
Siddharth Asthana
49050a043b cat-file: add mailmap support to -s option
Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.

In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 15:20:45 +09:00
Lars Kellogg-Stedman
4e57c88e02 line-range: fix infinite loop bug with '$' regex
When the -L argument to "git log" is passed the zero-width regular
expression "$" (as in "-L :$:line-range.c"), this results in an
infinite loop in find_funcname_matching_regexp().

Modify find_funcname_matching_regexp to correctly match the entire line
instead of the zero-width match at eol and update the loop condition to
prevent an infinite loop in the event of other undiscovered corner cases.

The primary change is that we pre-decrement the beginning-of-line marker
('bol') before comparing it to '\n'. In the case of '$', where we match the
'\n' at the end of the line and start the loop with bol == eol, this
ensures that bol will find the beginning of the line on which the match
occurred.

Originally reported in <https://stackoverflow.com/q/74690545/147356>.

Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 10:00:43 +09:00
Junio C Hamano
963f8d3b63 Merge branch 'rj/branch-copy-and-rename'
Fix a pair of bugs in 'git branch'.

* rj/branch-copy-and-rename:
  branch: force-copy a branch to itself via @{-1} is a no-op
2022-12-19 11:46:18 +09:00
Junio C Hamano
f3d9bc801a Merge branch 'rr/status-untracked-advice'
The advice message given by "git status" when it takes long time to
enumerate untracked paths has been updated.

* rr/status-untracked-advice:
  status: modernize git-status "slow untracked files" advice
2022-12-19 11:46:18 +09:00
Junio C Hamano
053650ddad Merge branch 'aw/complete-case-insensitive'
Introduce a case insensitive mode to the Bash completion helpers.

* aw/complete-case-insensitive:
  completion: add case-insensitive match of pseudorefs
  completion: add optional ignore-case when matching refs
2022-12-19 11:46:18 +09:00
Junio C Hamano
3c0a988672 Merge branch 'rs/t3920-crlf-eating-grep-fix'
Test fix.

* rs/t3920-crlf-eating-grep-fix:
  t3920: support CR-eating grep
2022-12-19 11:46:14 +09:00
Junio C Hamano
b7bb8828cf Merge branch 'js/t3920-shell-and-or-fix'
Test fix.

* js/t3920-shell-and-or-fix:
  t3920: don't ignore errors of more than one command with `|| true`
2022-12-19 11:46:14 +09:00
Junio C Hamano
314a0af909 Merge branch 'ab/t4023-avoid-losing-exit-status-of-diff'
Test fix.

* ab/t4023-avoid-losing-exit-status-of-diff:
  t4023: fix ignored exit codes of git
2022-12-19 11:46:13 +09:00
Junio C Hamano
4eec47c1cd Merge branch 'ab/t7600-avoid-losing-exit-status-of-git'
Test fix.

* ab/t7600-avoid-losing-exit-status-of-git:
  t7600: don't ignore "rev-parse" exit code in helper
2022-12-19 11:46:13 +09:00
Junio C Hamano
d2caf09d00 Merge branch 'ab/t5314-avoid-losing-exit-status'
Test fix.

* ab/t5314-avoid-losing-exit-status:
  t5314: check exit code of "git"
2022-12-19 11:46:13 +09:00
Junio C Hamano
907951c88b Merge branch 'rs/t4205-do-not-exit-in-test-script'
Test fix.

* rs/t4205-do-not-exit-in-test-script:
  t4205: don't exit test script on failure
2022-12-19 11:46:12 +09:00
SZEDER Gábor
a48a88019b tests: make 'test_oid' print trailing newline
Unlike other test helper functions, 'test_oid' doesn't terminate its
output with a LF, but, alas, the reason for this, if any, is not
mentioned in 2c02b110da (t: add test functions to translate
hash-related values, 2018-09-13)).

Now, in the vast majority of cases 'test_oid' is invoked in a command
substitution that is part of a heredoc or supplies an argument to a
command or the value to a variable, and the command substitution would
chop off any trailing LFs, so in these cases the lack or presence of a
trailing LF in its output doesn't matter.  However:

  - There appear to be only three cases where 'test_oid' is not
    invoked in a command substitution:

      $ git grep '\stest_oid ' -- ':/t/*.sh'
      t0000-basic.sh:  test_oid zero >actual &&
      t0000-basic.sh:  test_oid zero >actual &&
      t0000-basic.sh:  test_oid zero >actual &&

    These are all in test cases checking that 'test_oid' actually
    works, and that the size of its output matches the size of the
    corresponding hash function with conditions like

      test $(wc -c <actual) -eq 40

    In these cases the lack of trailing LF does actually matter,
    though they could be trivially updated to account for the presence
    of a trailing LF.

  - There are also a few cases where the lack of trailing LF in
    'test_oid's output actually hurts, because tests need to compare
    its output with LF terminated file contents, forcing developers to
    invoke it as 'echo $(test_oid ...)' to append the missing LF:

      $ git grep 'echo "\?$(test_oid ' -- ':/t/*.sh'
      t1302-repo-version.sh:  echo $(test_oid version) >expect &&
      t1500-rev-parse.sh:     echo "$(test_oid algo)" >expect &&
      t4044-diff-index-unique-abbrev.sh:      echo "$(test_oid val1)" > foo &&
      t4044-diff-index-unique-abbrev.sh:      echo "$(test_oid val2)" > foo &&
      t5313-pack-bounds-checks.sh:    echo $(test_oid oidfff) >file &&

    And there is yet another similar case in an in-flight topic at:

      https://public-inbox.org/git/813e81a058227bd373cec802e443fcd677042fb4.1670862677.git.gitgitgadget@gmail.com/

Arguably we would be better off if 'test_oid' terminated its output
with a LF.  So let's update 'test_oid' accordingly, update its tests
in t0000 to account for the extra character in those size tests, and
remove the now unnecessary 'echo $(...)' command substitutions around
'test_oid' invocations as well.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-19 09:49:11 +09:00
Sean Allred
4c3dd9304e var: add GIT_SEQUENCE_EDITOR variable
The editor program used by Git when editing the sequencer "todo" file
is determined by examining a few environment variables and also
affected by configuration variables. Introduce "git var
GIT_SEQUENCE_EDITOR" to give users access to the final result of the
logic without having to know the exact details.

This is very similar in spirit to 44fcb497 (Teach git var about
GIT_EDITOR, 2009-11-11) that introduced "git var GIT_EDITOR".

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-18 11:48:26 +09:00
Jeff King
1955ef10ed ref-filter: truncate atom names in error messages
If you pass a bogus argument to %(refname), you may end up with a
message like this:

  $ git for-each-ref --format='%(refname:foo)'
  fatal: unrecognized %(refname:foo) argument: foo

which is confusing. It should just say:

  fatal: unrecognized %(refname) argument: foo

which is clearer, and is consistent with most other atom parsers. Those
other parsers do not have the same problem because they pass the atom
name from a string literal in the parser function. But because the
parser for %(refname) also handles %(upstream) and %(push), it instead
uses atom->name, which includes the arguments. The oid atom parser which
handles %(tree), %(parent), etc suffers from the same problem.

It seems like the cleanest fix would be for atom->name to be _just_ the
name, since there's already a separate "args" field. But since that
field is also used for other things, we can't change it easily (e.g.,
it's how we find things in the used_atoms array, and clearly %(refname)
and %(refname:short) are not the same thing).

Instead, we'll teach our error_bad_arg() function to stop at the first
":". This is a little hacky, as we're effectively re-parsing the name,
but the format is simple enough to do this as a one-liner, and this
localizes the change to the error-reporting code.

We'll give the same treatment to err_no_arg(). None of its callers use
this atom->name trick, but it's worth future-proofing it while we're
here.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:14:04 +09:00
Jeff King
dda4fc1a84 ref-filter: factor out "unrecognized %(foo) arg" errors
Atom parsers that take arguments generally have a catch-all for "this
arg is not recognized". Most of them use the same printf template, which
is good, because it makes life easier for translators. Let's pull this
template into a helper function, which makes the code in the parsers
shorter and avoids any possibility of differences.

As with the previous commit, we'll pick an arbitrary atom to make sure
the test suite covers this code.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:14:00 +09:00
Jeff King
a33d0fae76 ref-filter: factor out "%(foo) does not take arguments" errors
Many atom parsers give the same error message, differing only in the
name of the atom. If we use "%s does not take arguments", that should
make life easier for translators, as they only need to translate one
string. And in doing so, we can easily pull it into a helper function to
make sure they are all using the exact same string.

I've added a basic test here for %(HEAD), just to make sure this code is
exercised at all in the test suite. We could cover each such atom, but
the effort-to-reward ratio of trying to maintain an exhaustive list
doesn't seem worth it.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:13:56 +09:00
Peter Grayson
209d9cb011 diff: fix regression with --stat and unmerged file
A regression was introduced in

  12fc4ad89e (diff.c: use utf8_strwidth() to count display width, 2022-09-14)

that causes missing newlines after "Unmerged" entries in `git diff
--cached --stat` output.

This problem affects v2.39.0-rc0 through v2.39.0.

Add the missing newline along with a new test to cover this
behavior.

Signed-off-by: Peter Grayson <pete@jpgrayson.net>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-15 09:12:04 +09:00
Junio C Hamano
26f81233ab Merge branch 'js/t0021-windows-pwd'
Test fix.

* js/t0021-windows-pwd:
  t0021: use Windows-friendly `pwd`
2022-12-14 17:42:18 +09:00
Junio C Hamano
d818458088 Merge branch 'sa/git-var-empty'
"git var UNKNOWN_VARIABLE" and "git var VARIABLE" with the variable
given an empty value used to behave identically.  Now the latter
just gives an empty output, while the former still gives an error
message.

* sa/git-var-empty:
  var: allow GIT_EDITOR to return null
  var: do not print usage() with a correct invocation
2022-12-14 15:55:47 +09:00
Junio C Hamano
cb3d2e535a Merge branch 'rs/multi-filter-args'
Fix a bug where `pack-objects` would not respect multiple `--filter`
arguments when invoked directly.

* rs/multi-filter-args:
  list-objects-filter: remove OPT_PARSE_LIST_OBJECTS_FILTER_INIT()
  pack-objects: simplify --filter handling
  pack-objects: fix handling of multiple --filter options
  t5317: demonstrate failure to handle multiple --filter options
  t5317: stop losing return codes of git ls-files
2022-12-14 15:55:47 +09:00
Junio C Hamano
a1b8e5ec28 Merge branch 'tl/pack-bitmap-absolute-paths'
The pack-bitmap machinery is taught to log the paths of redundant
bitmap(s) to trace2 instead of stderr.

* tl/pack-bitmap-absolute-paths:
  pack-bitmap.c: trace bitmap ignore logs when midx-bitmap is found
  pack-bitmap.c: break out of the bitmap loop early if not tracing
  pack-bitmap.c: avoid exposing absolute paths
  pack-bitmap.c: remove unnecessary "open_pack_index()" calls
2022-12-14 15:55:46 +09:00
Junio C Hamano
9ea1378d04 Merge branch 'ab/various-leak-fixes'
Various leak fixes.

* ab/various-leak-fixes:
  built-ins: use free() not UNLEAK() if trivial, rm dead code
  revert: fix parse_options_concat() leak
  cherry-pick: free "struct replay_opts" members
  rebase: don't leak on "--abort"
  connected.c: free the "struct packed_git"
  sequencer.c: fix "opts->strategy" leak in read_strategy_opts()
  ls-files: fix a --with-tree memory leak
  revision API: call graph_clear() in release_revisions()
  unpack-file: fix ancient leak in create_temp_file()
  built-ins & libs & helpers: add/move destructors, fix leaks
  dir.c: free "ident" and "exclude_per_dir" in "struct untracked_cache"
  read-cache.c: clear and free "sparse_checkout_patterns"
  commit: discard partial cache before (re-)reading it
  {reset,merge}: call discard_index() before returning
  tests: mark tests as passing with SANITIZE=leak
2022-12-14 15:55:46 +09:00
Junio C Hamano
7576e512ce Merge branch 'kz/merge-tree-merge-base'
"merge-tree" learns a new `--merge-base` option.

* kz/merge-tree-merge-base:
  docs: fix description of the `--merge-base` option
  merge-tree.c: allow specifying the merge-base when --stdin is passed
  merge-tree.c: add --merge-base=<commit> option
2022-12-14 15:55:46 +09:00
Junio C Hamano
bee6e7a8f9 Merge branch 'dd/git-bisect-builtin'
`git bisect` becomes a builtin.

* dd/git-bisect-builtin:
  bisect; remove unused "git-bisect.sh" and ".gitignore" entry
  Turn `git bisect` into a full built-in
  bisect--helper: log: allow arbitrary number of arguments
  bisect--helper: handle states directly
  bisect--helper: emit usage for "git bisect"
  bisect test: test exit codes on bad usage
  bisect--helper: identify as bisect when report error
  bisect-run: verify_good: account for non-negative exit status
  bisect run: keep some of the post-v2.30.0 output
  bisect: fix output regressions in v2.30.0
  bisect: refactor bisect_run() to match CodingGuidelines
  bisect tests: test for v2.30.0 "bisect run" regressions
2022-12-14 15:55:45 +09:00
Junio C Hamano
96738bb0e1 Sync with 2.38.3 2022-12-13 21:25:15 +09:00
Junio C Hamano
fea9f607a8 Sync with Git 2.37.5 2022-12-13 21:23:36 +09:00
Junio C Hamano
431f6e67e6 Merge branch 'maint-2.36' into maint-2.37 2022-12-13 21:20:35 +09:00
Junio C Hamano
8253c00421 Merge branch 'maint-2.35' into maint-2.36 2022-12-13 21:19:11 +09:00
Junio C Hamano
fbabbc30e7 Merge branch 'maint-2.34' into maint-2.35 2022-12-13 21:17:10 +09:00
Junio C Hamano
3748b5b7f5 Merge branch 'maint-2.33' into maint-2.34 2022-12-13 21:15:22 +09:00
Junio C Hamano
5f22dcc02d Sync with Git 2.32.5 2022-12-13 21:13:11 +09:00
Junio C Hamano
32e357b6df Merge branch 'ps/attr-limits-with-fsck' into maint-2.32 2022-12-13 21:09:56 +09:00
Junio C Hamano
8a755eddf5 Sync with Git 2.31.6 2022-12-13 21:09:40 +09:00
Junio C Hamano
16128765d7 Git 2.30.7
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmOYaKAACgkQsLXohpav
 5svHThAAhjTaBBBYDM6FbHcFUHv515fSo04AmyXl6QKdjiBroaGj+WpJPrYM3B5G
 eR0MYfwDfp7rAvxPQwq1LzSQLz2G2Swue1a4X0t3Bomjpqf48OeAUleGHQlBUTm7
 wfZIHpgCbMBIHJtVAVPiEOo43ZJ1OareCwVpPOpAXLVgTU2Bbx59K0oUqGszjgE3
 anQ0kon6hELZ9aBTx80hUJaYWaxiUqENtRFs6vyOV/MKvW2KR+MJgvu/SQqbRJPy
 ndBJ5r0gcSbes0OLxKCAFFNVt2p6BeVb4IxyPogJveGwJsNU88DQnarSos7hvPYG
 DkhTzfpPmFJkP0WiRHr87jWCXNJraq1SmK65ac1CGV/NTrDfX9ZNoGIRFsHfLmw2
 1poTxhB/h0F4wCucZu7Wavvgd2NI2V+GK5dx8Mx5NovrC67smBny2W7kQgXJCdZX
 e6vNuKVK7pz3cVYvo5GbUo2ivY2igm9Xbj3Na1/Ie8wTFaZ0ZX+oRnxxAdwKbL/1
 X0VRUTQMgtrrLd24JCApo8r5+Ssg0HvIOpXcUZFpvaYl9kMltatwV1Y01lNAhAgF
 VFBvUWdFy5tGzPzSCd3w2NyZOJBng2GdKw9YUt/WVWCKeiLXLI3wh10pC+m1qJus
 HJwQbRRSzC4mhXlkKZ5IG+Xz7x+HrHFnLpQXhtjeSc5WwGQkE2w=
 =syKo
 -----END PGP SIGNATURE-----

Sync with Git 2.30.7
2022-12-13 21:02:20 +09:00
Simon Gerber
0918d08887 help.c: fix autocorrect in work tree for bare repository
Currently, auto correction doesn't work reliably for commands which must
run in a work tree (e.g. `git status`) in Git work trees which are
created from a bare repository.

As far as I'm able to determine, this has been broken since commit
659fef199f (help: use early config when autocorrecting aliases,
2017-06-14), where the call to `git_config()` in `help_unknown_cmd()`
was replaced with a call to `read_early_config()`. From what I can tell,
the actual cause for the unexpected error is that we call
`git_default_config()` in the `git_unknown_cmd_config` callback instead
of simply returning `0` for config entries which we aren't interested
in.

Calling `git_default_config()` in this callback to `read_early_config()`
seems like a bad idea since those calls will initialize a bunch of state
in `environment.c` (among other things `is_bare_repository_cfg`) before
we've properly detected that we're running in a work tree.

All other callbacks provided to `read_early_config()` appear to only
extract their configurations while simply returning `0` for all other
config keys.

This commit changes the `git_unknown_cmd_config` callback to not call
`git_default_config()`. Instead we also simply return `0` for config
keys which we're not interested in.

Additionally the commit adds a new test case covering `help.autocorrect`
in a work tree created from a bare clone.

Signed-off-by: Simon Gerber <gesimu@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 10:01:53 +09:00
Johannes Schindelin
a3795bf0e6 tests(mingw): avoid very slow mingw_test_cmp
When Git's test suite uses `test_cmp`, it is not actually trying to
compare binary files as the name `cmp` would suggest to users familiar
with Unix' tools, but the tests instead verify that actual output
matches the expected text.

On Unix, `cmp` works well enough for Git's purposes because only Line
Feed characters are used as line endings. However, on Windows, while
most tools accept Line Feeds as line endings, many tools produce
Carriage Return + Line Feed line endings, including some of the tools
used by the test suite (which are therefore provided via Git for Windows
SDK). Therefore, `cmp` would frequently fail merely due to different
line endings.

To accommodate for that, the `mingw_test_cmp` function was introduced
into Git's test suite to perform a line-by-line comparison that ignores
line endings. This function is a Bash function that is only used on
Windows, everywhere else `cmp` is used.

This is a double whammy because `cmp` is fast, and `mingw_test_cmp` is
slow, even more so on Windows because it is a Bash script function, and
Bash scripts are known to run particularly slowly on Windows due to
Bash's need for the POSIX emulation layer provided by the MSYS2 runtime.

The commit message of 32ed3314c1 (t5351: avoid using `test_cmp` for
binary data, 2022-07-29) provides an illuminating account of the
consequences: On Windows, the platform on which Git could really use all
the help it can get to improve its performance, the time spent on one
entire test script was reduced from half an hour to less than half a
minute merely by avoiding a single call to `mingw_test_cmp` in but a
single test case.

Learning the lesson to avoid shell scripting wherever possible, the Git
for Windows project implemented a minimal replacement for
`mingw_test_cmp` in the form of a `test-tool` subcommand that parses the
input files line by line, ignoring line endings, and compares them.
Essentially the same thing as `mingw_test_cmp`, but implemented in
C instead of Bash. This solution served the Git for Windows project
well, over years.

However, when this solution was finally upstreamed, the conclusion was
reached that a change to use `git diff --no-index` instead of
`mingw_test_cmp` was more easily reviewed and hence should be used
instead.

The reason why this approach was not even considered in Git for Windows
is that in 2007, there was already a motion on the table to use Git's
own diff machinery to perform comparisons in Git's test suite, but it
was dismissed in https://lore.kernel.org/git/xmqqbkrpo9or.fsf@gitster.g/
as undesirable because tests might potentially succeed due to bugs in
the diff machinery when they should not succeed, and those bugs could
therefore hide regressions that the tests try to prevent.

By the time Git for Windows' `mingw-test-cmp` in C was finally
contributed to the Git mailing list, reviewers agreed that the diff
machinery had matured enough and should be used instead.

When the concern was raised that the diff machinery, due to its
complexity, would perform substantially worse than the test helper
originally implemented in the Git for Windows project, a test
demonstrated that these performance differences are well lost within the
100+ minutes it takes to run Git's test suite on Windows.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-13 07:18:06 +09:00
Victoria Dye
93a7bc8b28 rebase --update-refs: avoid unintended ref deletion
In b3b1a21d1a (sequencer: rewrite update-refs as user edits todo list,
2022-07-19), the 'todo_list_filter_update_refs()' step was added to handle
the removal of 'update-ref' lines from a 'rebase-todo'. Specifically, it
removes potential ref updates from the "update refs state" if a ref does not
have a corresponding 'update-ref' line.

However, because 'write_update_refs_state()' will not update the state if
the 'refs_to_oids' list was empty, removing *all* 'update-ref' lines will
result in the state remaining unchanged from how it was initialized (with
all refs' "after" OID being null). Then, when the ref update is applied, all
refs will be updated to null and consequently deleted.

To fix this, delete the 'update-refs' state file when 'refs_to_oids' is
empty. Additionally, add a tests covering "all update-ref lines removed"
cases.

Reported-by: herr.kaste <herr.kaste@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-12-09 19:31:45 +09:00
Patrick Steinhardt
27ab4784d5 fsck: implement checks for gitattributes
Recently, a vulnerability was reported that can lead to an out-of-bounds
write when reading an unreasonably large gitattributes file. The root
cause of this error are multiple integer overflows in different parts of
the code when there are either too many lines, when paths are too long,
when attribute names are too long, or when there are too many attributes
declared for a pattern.

As all of these are related to size, it seems reasonable to restrict the
size of the gitattributes file via git-fsck(1). This allows us to both
stop distributing known-vulnerable objects via common hosting platforms
that have fsck enabled, and users to protect themselves by enabling the
`fetch.fsckObjects` config.

There are basically two checks:

    1. We verify that size of the gitattributes file is smaller than
       100MB.

    2. We verify that the maximum line length does not exceed 2048
       bytes.

With the preceding commits, both of these conditions would cause us to
either ignore the complete gitattributes file or blob in the first case,
or the specific line in the second case. Now with these consistency
checks added, we also grow the ability to stop distributing such files
in the first place when `receive.fsckObjects` is enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 17:07:04 +09:00
Junio C Hamano
e0bfc0b3b9 Merge branch 'ps/attr-limits' into maint-2.32 2022-12-09 17:03:49 +09:00
Junio C Hamano
6662a836eb Merge branch 'ps/attr-limits' into maint-2.30 2022-12-09 16:05:52 +09:00
Patrick Steinhardt
304a50adff pretty: restrict input lengths for padding and wrapping formats
Both the padding and wrapping formatting directives allow the caller to
specify an integer that ultimately leads to us adding this many chars to
the result buffer. As a consequence, it is trivial to e.g. allocate 2GB
of RAM via a single formatting directive and cause resource exhaustion
on the machine executing this logic. Furthermore, it is debatable
whether there are any sane usecases that require the user to pad data to
2GB boundaries or to indent wrapped data by 2GB.

Restrict the input sizes to 16 kilobytes at a maximum to limit the
amount of bytes that can be requested by the user. This is not meant
as a fix because there are ways to trivially amplify the amount of
data we generate via formatting directives; the real protection is
achieved by the changes in previous steps to catch and avoid integer
wraparound that causes us to under-allocate and access beyond the
end of allocated memory reagions. But having such a limit
significantly helps fuzzing the pretty format, because the fuzzer is
otherwise quite fast to run out-of-memory as it discovers these
formatters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
81c2d4c3a5 utf8: fix checking for glyph width in strbuf_utf8_replace()
In `strbuf_utf8_replace()`, we call `utf8_width()` to compute the width
of the current glyph. If the glyph is a control character though it can
be that `utf8_width()` returns `-1`, but because we assign this value to
a `size_t` the conversion will cause us to underflow. This bug can
easily be triggered with the following command:

    $ git log --pretty='format:xxx%<|(1,trunc)%x10'

>From all I can see though this seems to be a benign underflow that has
no security-related consequences.

Fix the bug by using an `int` instead. When we see a control character,
we now copy it into the target buffer but don't advance the current
width of the string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
937b71cc8b utf8: fix overflow when returning string width
The return type of both `utf8_strwidth()` and `utf8_strnwidth()` is
`int`, but we operate on string lengths which are typically of type
`size_t`. This means that when the string is longer than `INT_MAX`, we
will overflow and thus return a negative result.

This can lead to an out-of-bounds write with `--pretty=format:%<1)%B`
and a commit message that is 2^31+1 bytes long:

    =================================================================
    ==26009==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000001168 at pc 0x7f95c4e5f427 bp 0x7ffd8541c900 sp 0x7ffd8541c0a8
    WRITE of size 2147483649 at 0x603000001168 thread T0
        #0 0x7f95c4e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x5612bbb1068c in format_and_pad_commit pretty.c:1763
        #2 0x5612bbb1087a in format_commit_item pretty.c:1801
        #3 0x5612bbc33bab in strbuf_expand strbuf.c:429
        #4 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        #5 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        #6 0x5612bba0a4d5 in show_log log-tree.c:781
        #7 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        #8 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        #9 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        #10 0x5612bb6951a2 in cmd_log builtin/log.c:883
        #11 0x5612bb56c993 in run_builtin git.c:466
        #12 0x5612bb56d397 in handle_builtin git.c:721
        #13 0x5612bb56db07 in run_argv git.c:788
        #14 0x5612bb56e8a7 in cmd_main git.c:923
        #15 0x5612bb803682 in main common-main.c:57
        #16 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #17 0x7f95c4c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #18 0x5612bb5680e4 in _start ../sysdeps/x86_64/start.S:115

    0x603000001168 is located 0 bytes to the right of 24-byte region [0x603000001150,0x603000001168)
    allocated by thread T0 here:
        #0 0x7f95c4ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x5612bbcdd556 in xrealloc wrapper.c:136
        #2 0x5612bbc310a3 in strbuf_grow strbuf.c:99
        #3 0x5612bbc32acd in strbuf_add strbuf.c:298
        #4 0x5612bbc33aec in strbuf_expand strbuf.c:418
        #5 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        #6 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        #7 0x5612bba0a4d5 in show_log log-tree.c:781
        #8 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        #9 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        #10 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        #11 0x5612bb6951a2 in cmd_log builtin/log.c:883
        #12 0x5612bb56c993 in run_builtin git.c:466
        #13 0x5612bb56d397 in handle_builtin git.c:721
        #14 0x5612bb56db07 in run_argv git.c:788
        #15 0x5612bb56e8a7 in cmd_main git.c:923
        #16 0x5612bb803682 in main common-main.c:57
        #17 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
    Shadow bytes around the buggy address:
      0x0c067fff81d0: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
      0x0c067fff81e0: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
      0x0c067fff81f0: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
      0x0c067fff8200: fd fd fd fa fa fa fd fd fd fd fa fa 00 00 00 fa
      0x0c067fff8210: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
    =>0x0c067fff8220: fd fa fa fa fd fd fd fa fa fa 00 00 00[fa]fa fa
      0x0c067fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8260: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8270: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==26009==ABORTING

Now the proper fix for this would be to convert both functions to return
an `size_t` instead of an `int`. But given that this commit may be part
of a security release, let's instead do the minimal viable fix and die
in case we see an overflow.

Add a test that would have previously caused us to crash.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
17d23e8a38 utf8: fix returning negative string width
The `utf8_strnwidth()` function calls `utf8_width()` in a loop and adds
its returned width to the end result. `utf8_width()` can return `-1`
though in case it reads a control character, which means that the
computed string width is going to be wrong. In the worst case where
there are more control characters than non-control characters, we may
even return a negative string width.

Fix this bug by treating control characters as having zero width.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
48050c42c7 pretty: fix integer overflow in wrapping format
The `%w(width,indent1,indent2)` formatting directive can be used to
rewrap text to a specific width and is designed after git-shortlog(1)'s
`-w` parameter. While the three parameters are all stored as `size_t`
internally, `strbuf_add_wrapped_text()` accepts integers as input. As a
result, the casted integers may overflow. As these now-negative integers
are later on passed to `strbuf_addchars()`, we will ultimately run into
implementation-defined behaviour due to casting a negative number back
to `size_t` again. On my platform, this results in trying to allocate
9000 petabyte of memory.

Fix this overflow by using `cast_size_t_to_int()` so that we reject
inputs that cannot be represented as an integer.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
1de69c0cdd pretty: fix adding linefeed when placeholder is not expanded
When a formatting directive has a `+` or ` ` after the `%`, then we add
either a line feed or space if the placeholder expands to a non-empty
string. In specific cases though this logic doesn't work as expected,
and we try to add the character even in the case where the formatting
directive is empty.

One such pattern is `%w(1)%+d%+w(2)`. `%+d` expands to reference names
pointing to a certain commit, like in `git log --decorate`. For a tagged
commit this would for example expand to `\n (tag: v1.0.0)`, which has a
leading newline due to the `+` modifier and a space added by `%d`. Now
the second wrapping directive will cause us to rewrap the text to
`\n(tag:\nv1.0.0)`, which is one byte shorter due to the missing leading
space. The code that handles the `+` magic now notices that the length
has changed and will thus try to insert a leading line feed at the
original posititon. But as the string was shortened, the original
position is past the buffer's boundary and thus we die with an error.

Now there are two issues here:

    1. We check whether the buffer length has changed, not whether it
       has been extended. This causes us to try and add the character
       past the string boundary.

    2. The current logic does not make any sense whatsoever. When the
       string got expanded due to the rewrap, putting the separator into
       the original position is likely to put it somewhere into the
       middle of the rewrapped contents.

It is debatable whether `%+w()` makes any sense in the first place.
Strictly speaking, the placeholder never expands to a non-empty string,
and consequentially we shouldn't ever accept this combination. We thus
fix the bug by simply refusing `%+w()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
f6e0b9f389 pretty: fix out-of-bounds read when parsing invalid padding format
An out-of-bounds read can be triggered when parsing an incomplete
padding format string passed via `--pretty=format` or in Git archives
when files are marked with the `export-subst` gitattribute.

This bug exists since we have introduced support for truncating output
via the `trunc` keyword a7f01c6b4d (pretty: support truncating in %>, %<
and %><, 2013-04-19). Before this commit, we used to find the end of the
formatting string by using strchr(3P). This function returns a `NULL`
pointer in case the character in question wasn't found. The subsequent
check whether any character was found thus simply checked the returned
pointer. After the commit we switched to strcspn(3P) though, which only
returns the offset to the first found character or to the trailing NUL
byte. As the end pointer is now computed by adding the offset to the
start pointer it won't be `NULL` anymore, and as a consequence the check
doesn't do anything anymore.

The out-of-bounds data that is being read can in fact end up in the
formatted string. As a consequence, it is possible to leak memory
contents either by calling git-log(1) or via git-archive(1) when any of
the archived files is marked with the `export-subst` gitattribute.

    ==10888==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000398 at pc 0x7f0356047cb2 bp 0x7fff3ffb95d0 sp 0x7fff3ffb8d78
    READ of size 1 at 0x602000000398 thread T0
        #0 0x7f0356047cb1 in __interceptor_strchrnul /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725
        #1 0x563b7cec9a43 in strbuf_expand strbuf.c:417
        #2 0x563b7cda7060 in repo_format_commit_message pretty.c:1869
        #3 0x563b7cda8d0f in pretty_print_commit pretty.c:2161
        #4 0x563b7cca04c8 in show_log log-tree.c:781
        #5 0x563b7cca36ba in log_tree_commit log-tree.c:1117
        #6 0x563b7c927ed5 in cmd_log_walk_no_free builtin/log.c:508
        #7 0x563b7c92835b in cmd_log_walk builtin/log.c:549
        #8 0x563b7c92b1a2 in cmd_log builtin/log.c:883
        #9 0x563b7c802993 in run_builtin git.c:466
        #10 0x563b7c803397 in handle_builtin git.c:721
        #11 0x563b7c803b07 in run_argv git.c:788
        #12 0x563b7c8048a7 in cmd_main git.c:923
        #13 0x563b7ca99682 in main common-main.c:57
        #14 0x7f0355e3c28f  (/usr/lib/libc.so.6+0x2328f)
        #15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115

    0x602000000398 is located 0 bytes to the right of 8-byte region [0x602000000390,0x602000000398)
    allocated by thread T0 here:
        #0 0x7f0356072faa in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:439
        #1 0x563b7cf7317c in xstrdup wrapper.c:39
        #2 0x563b7cd9a06a in save_user_format pretty.c:40
        #3 0x563b7cd9b3e5 in get_commit_format pretty.c:173
        #4 0x563b7ce54ea0 in handle_revision_opt revision.c:2456
        #5 0x563b7ce597c9 in setup_revisions revision.c:2850
        #6 0x563b7c9269e0 in cmd_log_init_finish builtin/log.c:269
        #7 0x563b7c927362 in cmd_log_init builtin/log.c:348
        #8 0x563b7c92b193 in cmd_log builtin/log.c:882
        #9 0x563b7c802993 in run_builtin git.c:466
        #10 0x563b7c803397 in handle_builtin git.c:721
        #11 0x563b7c803b07 in run_argv git.c:788
        #12 0x563b7c8048a7 in cmd_main git.c:923
        #13 0x563b7ca99682 in main common-main.c:57
        #14 0x7f0355e3c28f  (/usr/lib/libc.so.6+0x2328f)
        #15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725 in __interceptor_strchrnul
    Shadow bytes around the buggy address:
      0x0c047fff8020: fa fa fd fd fa fa 00 06 fa fa 05 fa fa fa fd fd
      0x0c047fff8030: fa fa 00 02 fa fa 06 fa fa fa 05 fa fa fa fd fd
      0x0c047fff8040: fa fa 00 07 fa fa 03 fa fa fa fd fd fa fa 00 00
      0x0c047fff8050: fa fa 00 01 fa fa fd fd fa fa 00 00 fa fa 00 01
      0x0c047fff8060: fa fa 00 06 fa fa 00 06 fa fa 05 fa fa fa 05 fa
    =>0x0c047fff8070: fa fa 00[fa]fa fa fd fa fa fa fd fd fa fa fd fd
      0x0c047fff8080: fa fa fd fd fa fa 00 00 fa fa 00 fa fa fa fd fa
      0x0c047fff8090: fa fa fd fd fa fa 00 00 fa fa fa fa fa fa fa fa
      0x0c047fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff80b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff80c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==10888==ABORTING

Fix this bug by checking whether `end` points at the trailing NUL byte.
Add a test which catches this out-of-bounds read and which demonstrates
that we used to write out-of-bounds data into the formatted message.

Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Original-patch-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
b49f309aa1 pretty: fix out-of-bounds read when left-flushing with stealing
With the `%>>(<N>)` pretty formatter, you can ask git-log(1) et al to
steal spaces. To do so we need to look ahead of the next token to see
whether there are spaces there. This loop takes into account ANSI
sequences that end with an `m`, and if it finds any it will skip them
until it finds the first space. While doing so it does not take into
account the buffer's limits though and easily does an out-of-bounds
read.

Add a test that hits this behaviour. While we don't have an easy way to
verify this, the test causes the following failure when run with
`SANITIZE=address`:

    ==37941==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000000baf at pc 0x55ba6f88e0d0 bp 0x7ffc84c50d20 sp 0x7ffc84c50d10
    READ of size 1 at 0x603000000baf thread T0
        #0 0x55ba6f88e0cf in format_and_pad_commit pretty.c:1712
        #1 0x55ba6f88e7b4 in format_commit_item pretty.c:1801
        #2 0x55ba6f9b1ae4 in strbuf_expand strbuf.c:429
        #3 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
        #4 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
        #5 0x55ba6f7884c8 in show_log log-tree.c:781
        #6 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
        #7 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
        #8 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
        #9 0x55ba6f4131a2 in cmd_log builtin/log.c:883
        #10 0x55ba6f2ea993 in run_builtin git.c:466
        #11 0x55ba6f2eb397 in handle_builtin git.c:721
        #12 0x55ba6f2ebb07 in run_argv git.c:788
        #13 0x55ba6f2ec8a7 in cmd_main git.c:923
        #14 0x55ba6f581682 in main common-main.c:57
        #15 0x7f2d08c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #16 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #17 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115

    0x603000000baf is located 1 bytes to the left of 24-byte region [0x603000000bb0,0x603000000bc8)
    allocated by thread T0 here:
        #0 0x7f2d08ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x55ba6fa5b494 in xrealloc wrapper.c:136
        #2 0x55ba6f9aefdc in strbuf_grow strbuf.c:99
        #3 0x55ba6f9b0a06 in strbuf_add strbuf.c:298
        #4 0x55ba6f9b1a25 in strbuf_expand strbuf.c:418
        #5 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
        #6 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
        #7 0x55ba6f7884c8 in show_log log-tree.c:781
        #8 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
        #9 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
        #10 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
        #11 0x55ba6f4131a2 in cmd_log builtin/log.c:883
        #12 0x55ba6f2ea993 in run_builtin git.c:466
        #13 0x55ba6f2eb397 in handle_builtin git.c:721
        #14 0x55ba6f2ebb07 in run_argv git.c:788
        #15 0x55ba6f2ec8a7 in cmd_main git.c:923
        #16 0x55ba6f581682 in main common-main.c:57
        #17 0x7f2d08c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #18 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #19 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow pretty.c:1712 in format_and_pad_commit
    Shadow bytes around the buggy address:
      0x0c067fff8120: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
      0x0c067fff8130: fd fd fa fa fd fd fd fd fa fa fd fd fd fa fa fa
      0x0c067fff8140: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
      0x0c067fff8150: fa fa fd fd fd fd fa fa 00 00 00 fa fa fa fd fd
      0x0c067fff8160: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
    =>0x0c067fff8170: fd fd fd fa fa[fa]00 00 00 fa fa fa 00 00 00 fa
      0x0c067fff8180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb

Luckily enough, this would only cause us to copy the out-of-bounds data
into the formatted commit in case we really had an ANSI sequence
preceding our buffer. So this bug likely has no security consequences.

Fix it regardless by not traversing past the buffer's start.

Reported-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Eric Sesterhenn <eric.sesterhenn@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
81dc898df9 pretty: fix out-of-bounds write caused by integer overflow
When using a padding specifier in the pretty format passed to git-log(1)
we need to calculate the string length in several places. These string
lengths are stored in `int`s though, which means that these can easily
overflow when the input lengths exceeds 2GB. This can ultimately lead to
an out-of-bounds write when these are used in a call to memcpy(3P):

        ==8340==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1ec62f97fe at pc 0x7f2127e5f427 bp 0x7ffd3bd63de0 sp 0x7ffd3bd63588
    WRITE of size 1 at 0x7f1ec62f97fe thread T0
        #0 0x7f2127e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x5628e96aa605 in format_and_pad_commit pretty.c:1762
        #2 0x5628e96aa7f4 in format_commit_item pretty.c:1801
        #3 0x5628e97cdb24 in strbuf_expand strbuf.c:429
        #4 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
        #5 0x5628e96acd0f in pretty_print_commit pretty.c:2161
        #6 0x5628e95a44c8 in show_log log-tree.c:781
        #7 0x5628e95a76ba in log_tree_commit log-tree.c:1117
        #8 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
        #9 0x5628e922c35b in cmd_log_walk builtin/log.c:549
        #10 0x5628e922f1a2 in cmd_log builtin/log.c:883
        #11 0x5628e9106993 in run_builtin git.c:466
        #12 0x5628e9107397 in handle_builtin git.c:721
        #13 0x5628e9107b07 in run_argv git.c:788
        #14 0x5628e91088a7 in cmd_main git.c:923
        #15 0x5628e939d682 in main common-main.c:57
        #16 0x7f2127c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #17 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #18 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115

    0x7f1ec62f97fe is located 2 bytes to the left of 4831838265-byte region [0x7f1ec62f9800,0x7f1fe62f9839)
    allocated by thread T0 here:
        #0 0x7f2127ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x5628e98774d4 in xrealloc wrapper.c:136
        #2 0x5628e97cb01c in strbuf_grow strbuf.c:99
        #3 0x5628e97ccd42 in strbuf_addchars strbuf.c:327
        #4 0x5628e96aa55c in format_and_pad_commit pretty.c:1761
        #5 0x5628e96aa7f4 in format_commit_item pretty.c:1801
        #6 0x5628e97cdb24 in strbuf_expand strbuf.c:429
        #7 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
        #8 0x5628e96acd0f in pretty_print_commit pretty.c:2161
        #9 0x5628e95a44c8 in show_log log-tree.c:781
        #10 0x5628e95a76ba in log_tree_commit log-tree.c:1117
        #11 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
        #12 0x5628e922c35b in cmd_log_walk builtin/log.c:549
        #13 0x5628e922f1a2 in cmd_log builtin/log.c:883
        #14 0x5628e9106993 in run_builtin git.c:466
        #15 0x5628e9107397 in handle_builtin git.c:721
        #16 0x5628e9107b07 in run_argv git.c:788
        #17 0x5628e91088a7 in cmd_main git.c:923
        #18 0x5628e939d682 in main common-main.c:57
        #19 0x7f2127c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #20 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #21 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
    Shadow bytes around the buggy address:
      0x0fe458c572a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    =>0x0fe458c572f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]
      0x0fe458c57300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==8340==ABORTING

The pretty format can also be used in `git archive` operations via the
`export-subst` attribute. So this is what in our opinion makes this a
critical issue in the context of Git forges which allow to download an
archive of user supplied Git repositories.

Fix this vulnerability by using `size_t` instead of `int` to track the
string lengths. Add tests which detect this vulnerability when Git is
compiled with the address sanitizer.

Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Original-patch-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Modified-by: Taylor  Blau <me@ttalorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Carlo Marcelo Arenas Belón
a244dc5b0a test-lib: add prerequisite for 64-bit platforms
Allow tests that assume a 64-bit `size_t` to be skipped in 32-bit
platforms and regardless of the size of `long`.

This imitates the `LONG_IS_64BIT` prerequisite.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:04 +09:00
Eric Sunshine
35c194dc57 t1509: facilitate repeated script invocations
t1509-root-work-tree.sh, which tests behavior of a Git repository
located at the root `/` directory, refuses to run if it detects the
presence of an existing repository at `/`. This safeguard ensures that
it won't clobber a legitimate repository at that location. However,
because t1509 does a poor job of cleaning up after itself, it runs afoul
of its own safety check on subsequent runs, which makes it painful to
run the script repeatedly since each run requires manual cleanup of
detritus from the previous run.

Address this shortcoming by making t1509 clean up after itself as its
last action. This is safe since the script can only make it to this
cleanup action if it did not find a legitimate repository at `/` in the
first place, so the resources cleaned up here can only have been created
by the script itself.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
2022-12-09 10:41:59 +09:00
Eric Sunshine
ce153b8d4d t1509: make "setup" test more robust
One of the t1509 setup tests is very particular about the output it
expects from `git init`, and fails if the output differs even slightly
which can happen easily if the script is run multiple times since it
doesn't do a good job of cleaning up after itself (i.e. it leaves
detritus in the root directory `/`). One bit of cruft in particular
(`/HEAD`) makes the test fail since its presence causes `git init` to
alter its output; rather than reporting "Initialized empty Git
repository", it instead reports "Reinitialized existing Git repository"
when `/HEAD` is present. Address this problem by making the test do a
more careful job of crafting its intended initial state.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
2022-12-09 10:41:58 +09:00
Eric Sunshine
7790b8c6b5 t1509: fix failing "root work tree" test due to owner-check
When 8959555cee (setup_git_directory(): add an owner check for the
top-level directory, 2022-03-02) tightened security surrounding
directory ownership, it neglected to adjust t1509-root-work-tree.sh to
take the new restriction into account. As a result, since the root
directory `/` is typically not owned by the user running the test
(indeed, t1509 refuses to run as `root`), the ownership check added
by 8959555cee kicks in and causes the test to fail:

    fatal: detected dubious ownership in repository at '/'
    To add an exception for this directory, call:

        git config --global --add safe.directory /

This problem went unnoticed for so long because t1509 is rarely run
since it requires setting up a `chroot` environment or a sacrificial
virtual machine in which `/` can be made writable and polluted by any
user.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
2022-12-09 10:41:58 +09:00
René Scharfe
86325d36e6 t3920: support CR-eating grep
grep(1) converts CRLF line endings to LF on current MinGW:

   $ uname -sr
   MINGW64_NT-10.0-22621 3.3.6-341.x86_64

   $ printf 'a\r\n' | hexdump.exe -C
   00000000  61 0d 0a                                          |a..|
   00000003

   $ printf 'a\r\n' | grep . | hexdump.exe -C
   00000000  61 0a                                             |a.|
   00000002

Create the intended test file by grepping the original file with LF
line endings and adding CRs explicitly.

The missing CRs went unnoticed because test_cmp on MinGW ignores line
endings since 4d715ac05c (Windows: a test_cmp that is agnostic to random
LF <> CRLF conversions, 2013-10-26).  Fix this test anyway to avoid
depending on that special test_cmp behavior, especially since this is
the only test that needs it.

Piping the output of grep(1) through append_cr has the side-effect of
ignoring its return value.  That means we no longer need the explicit
"|| true" to support commit messages without a body.

Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-07 13:33:18 +09:00
Johannes Schindelin
95494c6f61 t0021: use Windows-friendly pwd
In Git for Windows, when passing paths from shell scripts to regular
Win32 executables, thanks to the MSYS2 runtime a somewhat magic path
conversion happens that lets the shell script think that there is a file
at `/git/Makefile` and the Win32 process it spawned thinks that the
shell script said `C:/git-sdk-64/git/Makefile` instead.

This conversion is documented in detail over here:
https://www.msys2.org/docs/filesystem-paths/#automatic-unix-windows-path-conversion

As all automatic conversions, there are gaps. For example, to avoid
mistaking command-line options like `/LOG=log.txt` (which are quite
common in the Windows world) from being mistaken for a Unix-style
absolute path, the MSYS2 runtime specifically exempts arguments
containing a `=` character from that conversion.

We are about to change `test_cmp` to use `git diff --no-index`, which
involves spawning precisely such a Win32 process.

In combination, this would cause a failure in `t0021-conversion.sh`
where we pass an absolute path containing an equal character to the
`test_cmp` function.

Seeing as the Unix tools like `cp` and `diff` that are used by Git's
test suite in the Git for Windows SDK (thanks to the MSYS2 project)
understand both Unix-style as well as Windows-style paths, we can stave
off this problem by simply switching to Windows-style paths and
side-stepping the need for any automatic path conversion.

Note: The `PATH` variable is obviously special, as it is colon-separated
in the MSYS2 Bash used by Git for Windows, and therefore _cannot_
contain absolute Windows-style paths, lest the colon after the drive
letter is mistaken for a path separator. Therefore, we need to be
careful to keep the Unix-style when modifying the `PATH` variable.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-07 13:22:58 +09:00
Patrick Steinhardt
3c50032ff5 attr: ignore overly large gitattributes files
Similar as with the preceding commit, start ignoring gitattributes files
that are overly large to protect us against out-of-bounds reads and
writes caused by integer overflows. Unfortunately, we cannot just define
"overly large" in terms of any preexisting limits in the codebase.

Instead, we choose a very conservative limit of 100MB. This is plenty of
room for specifying gitattributes, and incidentally it is also the limit
for blob sizes for GitHub. While we don't want GitHub to dictate limits
here, it is still sensible to use this fact for an informed decision
given that it is hosting a huge set of repositories. Furthermore, over
at GitLab we scanned a subset of repositories for their root-level
attribute files. We found that 80% of them have a gitattributes file
smaller than 100kB, 99.99% have one smaller than 1MB, and only a single
repository had one that was almost 3MB in size. So enforcing a limit of
100MB seems to give us ample of headroom.

With this limit in place we can be reasonably sure that there is no easy
way to exploit the gitattributes file via integer overflows anymore.
Furthermore, it protects us against resource exhaustion caused by
allocating the in-memory data structures required to represent the
parsed attributes.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:50:03 +09:00
Patrick Steinhardt
dfa6b32b5e attr: ignore attribute lines exceeding 2048 bytes
There are two different code paths to read gitattributes: once via a
file, and once via the index. These two paths used to behave differently
because when reading attributes from a file, we used fgets(3P) with a
buffer size of 2kB. Consequentially, we silently truncate line lengths
when lines are longer than that and will then parse the remainder of the
line as a new pattern. It goes without saying that this is entirely
unexpected, but it's even worse that the behaviour depends on how the
gitattributes are parsed.

While this is simply wrong, the silent truncation saves us with the
recently discovered vulnerabilities that can cause out-of-bound writes
or reads with unreasonably long lines due to integer overflows. As the
common path is to read gitattributes via the worktree file instead of
via the index, we can assume that any gitattributes file that had lines
longer than that is already broken anyway. So instead of lifting the
limit here, we can double down on it to fix the vulnerabilities.

Introduce an explicit line length limit of 2kB that is shared across all
paths that read attributes and ignore any line that hits this limit
while printing a warning.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:33:07 +09:00
Patrick Steinhardt
d74b1fd54f attr: fix silently splitting up lines longer than 2048 bytes
When reading attributes from a file we use fgets(3P) with a buffer size
of 2048 bytes. This means that as soon as a line exceeds the buffer size
we split it up into multiple parts and parse each of them as a separate
pattern line. This is of course not what the user intended, and even
worse the behaviour is inconsistent with how we read attributes from the
index.

Fix this bug by converting the code to use `strbuf_getline()` instead.
This will indeed read in the whole line, which may theoretically lead to
an out-of-memory situation when the gitattributes file is huge. We're
about to reject any gitattributes files larger than 100MB in the next
commit though, which makes this less of a concern.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 15:29:30 +09:00
Johannes Sixt
500317ae03 t3920: don't ignore errors of more than one command with || true
It is customary to write `A || true` to ignore a potential error exit of
command A. But when we have a sequence `A && B && C || true && D`, then
a failure of any of A, B, or C skips to D right away. This is not
intended here. Turn the command whose failure is to be ignored into a
compound command to ensure it is the only one that is allowed to fail.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 10:02:34 +09:00
Ævar Arnfjörð Bjarmason
5f3bfdc4f3 t4023: fix ignored exit codes of git
Change a "git diff-tree" command to be &&-chained so that we won't
ignore its exit code, see the ea05fd5fbf (Merge branch
'ab/keep-git-exit-codes-in-tests', 2022-03-16) topic for prior art.

This fixes code added in b45563a229 (rename: Break filepairs with
different types., 2007-11-30). Due to hiding the exit code we hid a
memory leak under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 09:28:04 +09:00
Ævar Arnfjörð Bjarmason
4d81ce1b99 t7600: don't ignore "rev-parse" exit code in helper
Change the verify_mergeheads() helper the check the exit code of "git
rev-parse".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-05 09:27:32 +09:00
Ævar Arnfjörð Bjarmason
243caa8982 t5314: check exit code of "git"
Amend the test added in [1] to check the exit code of the "git"
invocations. An in-flight change[2] introduced a memory leak in these
invocations, which went undetected unless we were running under
"GIT_TEST_SANITIZE_LEAK_LOG=true".

Note that the in-flight change made 8 test files fail, but as far as I
can tell only this one would have had its exit code hidden unless
under "GIT_TEST_SANITIZE_LEAK_LOG=true". The rest would be caught
without it.

We could pick other variable names here than "ln%d", e.g. "commit",
"dummy_blob" and "file_blob", but having the "rev-parse" invocations
aligned makes the difference between them more readable, so let's pick
"ln%d".

1. 4cf2143e02 (pack-objects: break delta cycles before delta-search
   phase, 2016-08-11)
2. https://lore.kernel.org/git/221128.868rjvmi3l.gmgdl@evledraar.gmail.com/
3. faececa53f (test-lib: have the "check" mode for SANITIZE=leak
   consider leak logs, 2022-07-28)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02 16:38:12 +09:00
René Scharfe
77e04b2ed4 t4205: don't exit test script on failure
Only abort the individual check instead of exiting the whole test script
if git show fails.  Noticed with GIT_TEST_PASSING_SANITIZE_LEAK=check.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-02 08:25:02 +09:00
Rudy Rigot
ecbc23e4c5 status: modernize git-status "slow untracked files" advice
`git status` can be slow when there are a large number of
untracked files and directories since Git must search the entire
worktree to enumerate them.  When it is too slow, Git prints
advice with the elapsed search time and a suggestion to disable
the search using the `-uno` option.  This suggestion also carries
a warning that might scare off some users.

However, these days, `-uno` isn't the only option.  Git can reduce
the time taken to enumerate untracked files by caching results from
previous `git status` invocations, when the `core.untrackedCache`
and `core.fsmonitor` features are enabled.

Update the `git status` man page to explain these configuration
options, and update the advice to provide more detail about the
current configuration and to refer to the updated documentation.

Signed-off-by: Rudy Rigot <rudy.rigot@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-01 15:27:41 +09:00
Jiang Xin
bcb71d45bf t1301: do not change $CWD in "shared=all" test case
In test case "shared=all", the working directory is permanently changed
to the "sub" directory. This leads to a strange behavior that the
temporary repositories created by subsequent test cases are all in this
"sub" directory, such as "sub/new", "sub/child.git". If we bypass this
test case, all subsequent test cases will have different working
directory.

Besides, all subsequent test cases assuming they are in the "sub"
directory do not run any destructive operations in their parent
directory (".."), and will not make damage out side of $TRASH_DIRECTORY.

So it is a safe change for us to run the test case "shared=all" in
current repository instead of creating and changing to "sub".

For the next test case, the path ".git/info" is assumed to be missing,
but we no longer run the test case in the "sub" repository which is
initialized from an empty template. In order for the test case to run
properly, we can set "TEST_CREATE_REPO_NO_TEMPLATE=1" to initialize the
default repository without a template.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:21:51 +09:00
Jiang Xin
5d64229ef5 t1301: use test_when_finished for cleanup
Refactor several test cases to use "test_when_finished" for cleanup.

1. For first of these, we used to clean-up outside the test, but instead
   let's use test_when_finished for that.

2. For the second, we used to leave "new" after we are done, but not use
   it at all later. Now we do clean up.

3. For the rest, these child.git test repositories used to follow
   "initialize what we are going to use to a known state before we use"
   pattern, which is not wrong per-se, but now we use "clean up the
   cruft we made after we are done" pattern, which may arguably be
   better simply because the test that makes cruft should know what
   cruft it created better than whatever comes later that may not know.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:21:51 +09:00
Jiang Xin
a0883a2440 t1301: fix wrong template dir for git-init
The template dir prepared in test case "forced modes" is not used as
expected because a wrong template dir is provided to "git init". This is
because the $CWD for "git-init" command is a sibling directory alongside
the template directory. Change it to the right template directory and
add a protection test using "test_path_is_file".

The wrong template directory was introduced by mistake in commit
e1df7fe43f (init: make --template path relative to $CWD, 2019-05-10).

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:21:50 +09:00
René Scharfe
825babe5d5 pack-objects: fix handling of multiple --filter options
Since 5cb28270a1 (pack-objects: lazily set up "struct rev_info", don't
leak, 2022-03-28) --filter options given to git pack-objects overrule
earlier ones, letting only the leftmost win and leaking the memory
allocated for earlier ones.  Fix that by only initializing the rev_info
struct once.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:00:33 +09:00
René Scharfe
f00d811533 t5317: demonstrate failure to handle multiple --filter options
git pack-objects should accept multiple --filter options as documented
in Documentation/rev-list-options.txt, but currently the last one wins.
Show that using tests with multiple blob size limits

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:00:32 +09:00
René Scharfe
3f75a6e5b4 t5317: stop losing return codes of git ls-files
fb2d0db502 (test-lib-functions: add parsing helpers for ls-files and
ls-tree, 2022-04-04) not only started to use helper functions, it also
started to pipe the output of git ls-files into them directly, without
using a temporary file.  No explanation was given.  This causes the
return code of that git command to be ignored.

Revert that part of the change, use temporary files and check the return
code of git ls-files again.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 10:00:32 +09:00
Alison Winters
9de31f7bd2 completion: add case-insensitive match of pseudorefs
When GIT_COMPLETION_IGNORE_CASE is set, also allow lowercase completion
text like "head" to match uppercase HEAD and other pseudorefs.

Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 09:58:06 +09:00
Alison Winters
9bab766fb2 completion: add optional ignore-case when matching refs
If GIT_COMPLETION_IGNORE_CASE is set, --ignore-case will be added to
git for-each-ref calls so that refs can be matched case insensitively,
even when running on case sensitive filesystems.

Signed-off-by: Alison Winters <alisonatwork@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-30 09:58:06 +09:00
Junio C Hamano
8165c6af11 Merge branch 'jh/trace2-timers-and-counters'
Test fix.

* jh/trace2-timers-and-counters:
  trace2 tests: guard pthread test with "PTHREAD"
2022-11-29 10:41:05 +09:00
Junio C Hamano
8a40cb1e5a Merge branch 'ah/chainlint-cpuinfo-parse-fix'
The format of a line in /proc/cpuinfo that describes a CPU on s390x
looked different from everybody else, and the code in chainlint.pl
failed to parse it.

* ah/chainlint-cpuinfo-parse-fix:
  chainlint.pl: fix /proc/cpuinfo regexp
2022-11-29 10:41:05 +09:00
Junio C Hamano
f32996d99a Merge branch 'gc/resolve-alternate-symlinks'
Resolve symbolic links when processing the locations of alternate
object stores, since failing to do so can lead to confusing and buggy
behavior.

* gc/resolve-alternate-symlinks:
  object-file: use real paths when adding alternates
2022-11-29 10:41:05 +09:00
Junio C Hamano
041df69edd Merge branch 'ab/fewer-the-index-macros'
Progress on removing 'the_index' convenience wrappers.

* ab/fewer-the-index-macros:
  cocci: apply "pending" index-compatibility to some "builtin/*.c"
  cache.h & test-tool.h: add & use "USE_THE_INDEX_VARIABLE"
  {builtin/*,repository}.c: add & use "USE_THE_INDEX_VARIABLE"
  cocci: apply "pending" index-compatibility to "t/helper/*.c"
  cocci & cache.h: apply variable section of "pending" index-compatibility
  cocci & cache.h: apply a selection of "pending" index-compatibility
  cocci: add a index-compatibility.pending.cocci
  read-cache API & users: make discard_index() return void
  cocci & cache.h: remove rarely used "the_index" compat macros
  builtin/{grep,log}.: don't define "USE_THE_INDEX_COMPATIBILITY_MACROS"
  cache.h: remove unused "the_index" compat macros
2022-11-28 12:13:46 +09:00
Junio C Hamano
91c43cde25 Merge branch 'es/locate-httpd-module-location-in-test'
Add one more candidate directory that may house httpd modules while
running tests.

* es/locate-httpd-module-location-in-test:
  lib-httpd: extend module location auto-detection
2022-11-28 12:13:45 +09:00
Junio C Hamano
399a9f31f7 Merge branch 'zk/push-use-bitmaps'
Test fix.

* zk/push-use-bitmaps:
  t5516: fail to run in verbose mode
2022-11-28 12:13:45 +09:00
Junio C Hamano
7d7ed48dd5 Merge branch 'ew/prune-with-missing-objects-pack'
"git prune" may try to iterate over .git/objects/pack for trash
files to remove in it, and loudly fail when the directory is
missing, which is not necessary.  The command has been taught to
ignore such a failure.

* ew/prune-with-missing-objects-pack:
  prune: quiet ENOENT on missing directories
2022-11-28 12:13:44 +09:00
Junio C Hamano
6accbe3ce7 Merge branch 'pw/config-int-parse-fixes'
Assorted fixes of parsing end-user input as integers.

* pw/config-int-parse-fixes:
  git_parse_signed(): avoid integer overflow
  config: require at least one digit when parsing numbers
  git_parse_unsigned: reject negative values
2022-11-28 12:13:43 +09:00
Junio C Hamano
ba88f8c81d Merge branch 'jk/parse-object-type-mismatch'
`parse_object()` hardening when checking for the existence of a
suspected blob object.

* jk/parse-object-type-mismatch:
  parse_object(): simplify blob conditional
  parse_object(): check on-disk type of suspected blob
  parse_object(): drop extra "has" check before checking object type
2022-11-28 12:13:42 +09:00
Kyle Meyer
8774aa56ad send-email: relay '-v N' to format-patch
send-email relays unrecognized arguments to its format-patch call.
Passing '-v N' leads to an error because -v is consumed as
send-email's --validate.  For example,

  git send-email -v 3 @{u}

fails with

  fatal: ambiguous argument '3': unknown revision or path not in the
  working tree.  [...]

To prevent this, add the short --reroll-count option to send-email's
main option list and explicitly provide it to the format-patch call.

There other format-patch options that send-email doesn't relay
properly, including at least -n, -N, and the diff option -D.  Punt on
these because dealing with them is more complicated:

 * they would require configuring send-email to not ignore option case

 * send-email makes three GetOptions() calls with different sets of
   options, the last being the main set of options.  Unlike -v, which
   is consumed by the last GetOptions call, the -n, -N, and -D options
   are consumed as abbreviations by the earlier calls.

Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 10:21:43 +09:00
Sean Allred
2ad150e35e var: allow GIT_EDITOR to return null
The handling to die early when there is no EDITOR is valuable when
used in normal code (i.e., editor.c). In git-var, where
null/empty-string is a perfectly valid value to return, it doesn't
make as much sense.

Remove this handling from `git var GIT_EDITOR` so that it does not
fail so noisily when there is no defined editor.

Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-27 09:35:55 +09:00
Glen Choo
199337d6ec object-file: use real paths when adding alternates
When adding an alternate ODB, we check if the alternate has the same
path as the object dir, and if so, we do nothing. However, that
comparison does not resolve symlinks. This makes it possible to add the
object dir as an alternate, which may result in bad behavior. For
example, it can trick "git repack -a -l -d" (possibly run by "git gc")
into thinking that all packs come from an alternate and delete all
objects.

	rm -rf test &&
	git clone https://github.com/git/git test &&
	(
	cd test &&
	ln -s objects .git/alt-objects &&
	# -c repack.updateserverinfo=false silences a warning about not
	# being able to update "info/refs", it isn't needed to show the
	# bad behavior
	GIT_ALTERNATE_OBJECT_DIRECTORIES=".git/alt-objects" git \
		-c repack.updateserverinfo=false repack -a -l -d  &&
	# It's broken!
	git status
	# Because there are no more objects!
	ls .git/objects/pack
	)

Fix this by resolving symlinks and relative paths before comparing the
alternate and object dir. This lets us clean up a number of issues noted
in 37a95862c6 (alternates: re-allow relative paths from environment,
2016-11-07):

- Now that we compare the real paths, duplicate detection is no longer
  foiled by relative paths.
- Using strbuf_realpath() allows us to "normalize" paths that
  strbuf_normalize_path() can't, so we can stop silently ignoring errors
  when "normalizing" paths from the environment.
- We now store an absolute path based on getcwd() (the "future
  direction" named in 37a95862c6), so chdir()-ing in the process no
  longer changes the directory pointed to by the alternate. This is a
  change in behavior, but a desirable one.

Signed-off-by: Glen Choo <chooglen@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-25 09:44:08 +09:00
Ævar Arnfjörð Bjarmason
14903c8e92 trace2 tests: guard pthread test with "PTHREAD"
Since 81071626ba (trace2: add global counter mechanism, 2022-10-24)
these tests have been failing when git is compiled with NO_PTHREADS=Y,
which is always the case e.g. if 'uname -s' is "NONSTOP_KERNEL".

Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-25 09:36:26 +09:00
Junio C Hamano
f8828f9125 Merge branch 'ps/receive-use-only-advertised'
"git receive-pack" used to use all the local refs as the boundary for
checking connectivity of the data "git push" sent, but now it uses
only the refs that it advertised to the pusher. In a repository with
the .hideRefs configuration, this reduces the resources needed to
perform the check.
cf. <221028.86bkpw805n.gmgdl@evledraar.gmail.com>
cf. <xmqqr0yrizqm.fsf@gitster.g>

* ps/receive-use-only-advertised:
  receive-pack: only use visible refs for connectivity check
  rev-parse: add `--exclude-hidden=` option
  revision: add new parameter to exclude hidden refs
  revision: introduce struct to handle exclusions
  revision: move together exclusion-related functions
  refs: get rid of global list of hidden refs
  refs: fix memory leak when parsing hideRefs config
2022-11-23 11:22:25 +09:00
Junio C Hamano
173fc54b00 Merge branch 'jt/submodule-on-demand'
Push all submodules recursively with
'--recurse-submodules=on-demand'.

* jt/submodule-on-demand:
  Doc: document push.recurseSubmodules=only
2022-11-23 11:22:25 +09:00
Junio C Hamano
cf9721cc46 Merge branch 'es/chainlint-lineno'
Teach chainlint.pl to show corresponding line numbers when printing
the source of a test.

* es/chainlint-lineno:
  chainlint: prefix annotated test definition with line numbers
  chainlint: latch line numbers at which each token starts and ends
  chainlint: sidestep impoverished macOS "terminfo"
2022-11-23 11:22:24 +09:00
Junio C Hamano
4a04f718c0 Merge branch 'ab/t7610-timeout'
Fix a source of flakiness in CI when compiling with SANITIZE=leak.

* ab/t7610-timeout:
  t7610: use "file:///dev/null", not "/dev/null", fixes MinGW
  t7610: fix flaky timeout issue, don't clone from example.com
2022-11-23 11:22:24 +09:00
Junio C Hamano
56a64fcdc3 Merge branch 'rp/maintenance-qol'
'git maintenance register' is taught to write configuration to an
arbitrary path, and 'git for-each-repo' is taught to expand tilde
characters in paths.

* rp/maintenance-qol:
  builtin/gc.c: fix use-after-free in maintenance_unregister()
  maintenance --unregister: fix uninit'd data use & -Wdeclaration-after-statement
  maintenance: add option to register in a specific config
  for-each-repo: interpolate repo path arguments
2022-11-23 11:22:24 +09:00
Junio C Hamano
3b041ea5f7 Merge branch 'pw/strict-label-lookups'
Correct an error where `git rebase` would mistakenly use a branch or
tag named "refs/rewritten/xyz" when missing a rebase label.

* pw/strict-label-lookups:
  sequencer: tighten label lookups
  sequencer: unify label lookup
2022-11-23 11:22:23 +09:00
Junio C Hamano
6adf17050b Merge branch 'gc/redact-h2h3-headers'
Redact headers from cURL's h2h3 module in GIT_CURL_VERBOSE and
others.

* gc/redact-h2h3-headers:
  http: redact curl h2h3 headers in info
  t: run t5551 tests with both HTTP and HTTP/2
2022-11-23 11:22:23 +09:00
Junio C Hamano
613fb30a49 Merge branch 'es/chainlint-output'
Teach chainlint.pl to annotate the original test definition instead
of the token stream.

* es/chainlint-output:
  chainlint: annotate original test definition rather than token stream
  chainlint: latch start/end position of each token
  chainlint: tighten accuracy when consuming input stream
  chainlint: add explanatory comments
2022-11-23 11:22:23 +09:00
Junio C Hamano
58d80df6a3 Merge branch 'js/remove-stale-scalar-repos'
'scalar reconfigure -a' is taught to automatically remove
scalar.repo entires which no longer exist.

* js/remove-stale-scalar-repos:
  tests(scalar): tighten the stale `scalar.repo` test some
  scalar reconfigure -a: remove stale `scalar.repo` entries
2022-11-23 11:22:23 +09:00
Junio C Hamano
e3d40fb240 Merge branch 'dd/bisect-helper-subcommand'
Fix a regression in the bisect-helper which mistakenly treats
arguments to the command given to 'git bisect run' as arguments to
the helper.

* dd/bisect-helper-subcommand:
  bisect--helper: parse subcommand with OPT_SUBCOMMAND
  bisect--helper: move all subcommands into their own functions
  bisect--helper: remove unused options
2022-11-23 11:22:22 +09:00
Junio C Hamano
1107a3963b Merge branch 'ab/submodule-helper-prep-only'
Preparation to remove git-submodule.sh and replace it with a builtin.

* ab/submodule-helper-prep-only:
  submodule--helper: use OPT_SUBCOMMAND() API
  submodule--helper: drop "update --prefix <pfx>" for "-C <pfx> update"
  submodule--helper: remove --prefix from "absorbgitdirs"
  submodule API & "absorbgitdirs": remove "----recursive" option
  submodule.c: refactor recursive block out of absorb function
  submodule tests: test for a "foreach" blind-spot
  submodule--helper: fix a memory leak in "status"
  submodule tests: add tests for top-level flag output
  submodule--helper: move "config" to a test-tool
2022-11-23 11:22:22 +09:00
Andreas Hasenack
1f51b77f4f chainlint.pl: fix /proc/cpuinfo regexp
29fb2ec3 (chainlint.pl: validate test scripts in parallel,
2022-09-01) introduced a function that gets the number of cores from
/proc/cpuinfo on some systems, notably linux.

The regexp it uses (^processor\s*:) fails to match the desired lines in
the s390x architecture, where they look like this:

    processor 0: version = FF, identification = 148F67, machine = 2964

As a result, on s390x that function returns 0 as the number of cores,
and the chainlint.pl script exits without doing anything.

Signed-off-by: Andreas Hasenack <andreas.hasenack@canonical.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-23 10:20:19 +09:00
Eric Sunshine
1c7dc23d41 lib-httpd: extend module location auto-detection
Although it is possible to manually set LIB_HTTPD_PATH and
LIB_HTTPD_MODULE_PATH to point at the location of `httpd` and its
modules, doing so is cumbersome and easily forgotten. To address this,
0d344738dc (t/lib-http.sh: Restructure finding of default httpd
location, 2010-01-02) enhanced lib-httpd.sh to automatically detect the
location of `httpd` and its modules in order to facilitate out-of-the-
box testing on a wider range of platforms. Follow that lead by further
enhancing it to automatically detect the `httpd` modules on Void Linux,
as well.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-22 09:57:53 +09:00
Jiang Xin
288fcb1c94 t5516: fail to run in verbose mode
The test case "push with config push.useBitmap" of t5516 was introduced
in commit 82f67ee13f (send-pack.c: add config push.useBitmaps,
2022-06-17). It won't work in verbose mode, e.g.:

    $ sh t5516-fetch-push.sh --run='1,115' -v

This is because "git-push" will run in a tty in this case, and the
subcommand "git pack-objects" will contain an argument "--progress"
instead of "-q". Adding a specific option "--quiet" to "git push" will
get a stable result for t5516.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-22 09:16:30 +09:00
Eric Wong
6974765352 prune: quiet ENOENT on missing directories
$GIT_DIR/objects/pack may be removed to save inodes in shared
repositories.  Quiet down prune in cases where either
$GIT_DIR/objects or $GIT_DIR/objects/pack is non-existent,
but emit the system error in other cases to help users diagnose
permissions problems or resource constraints.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 15:58:54 +09:00
Ævar Arnfjörð Bjarmason
603f2f5719 revert: fix parse_options_concat() leak
Free memory from parse_options_concat(), which comes from code
originally added (then extended) in [1].

At this point we could get several more tests leak-free by free()-ing
the xstrdup() just above the line being changed, but that one's
trickier than it seems. The sequencer_remove_state() function
supposedly owns it, but sometimes we don't call it. I have a fix for
it, but it's non-trivial, so let's fix the easy one first.

1. c62f6ec341 (revert: add --ff option to allow fast forward when
   cherry-picking, 2010-03-06)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
5ff6e8afac rebase: don't leak on "--abort"
Fix a leak in the recent 6159e7add4 (rebase --abort: improve reflog
message, 2022-10-12). Before that commit we'd strbuf_release() the
reflog message we were formatting, but when that code was refactored
to use "ropts.head_msg" the strbuf_release() was omitted.

Ideally the three users of "ropts" in cmd_rebase() should use
different "ropts" variables, in practice they're completely separate,
as this and the other user in the "switch" statement will "goto
cleanup", which won't touch "ropts".

The third caller after the "switch" is then unreachable if we take
these two branches, so all of them are getting a "{ 0 }" init'd
"ropts".

So it's OK that we're leaving a stale pointer in "ropts.head_msg",
cleaning it up was our responsibility, and it won't be used again.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
dd4143e7bf connected.c: free the "struct packed_git"
The "new_pack" we allocate in check_connected() wasn't being
free'd. Let's do that before we return from the function. This has
leaked ever since "new_pack" was added to this function in
c6807a40dc (clone: open a shortcut for connectivity check,
2013-05-26).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
c07ce0602a ls-files: fix a --with-tree memory leak
Fix a memory leak in overlay_tree_on_index(), we need to
clear_pathspec() at some point, which might as well be after the last
time we use it in the function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
fc47252d5b revision API: call graph_clear() in release_revisions()
Call graph_clear() in release_revisions(), this will free memory
allocated by e.g. this command, which will now run without memory
leaks:

	git -P log -1 --graph --no-graph --graph

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
e84a26e32f unpack-file: fix ancient leak in create_temp_file()
Fix a leak that's been with us since 3407bb4940 (Add "unpack-file"
helper that unpacks a sha1 blob into a tmpfile., 2005-04-18). See
00c8fd493a (cat-file: use streaming API to print blobs, 2012-03-07)
for prior art which shows the same API pattern, i.e. free()-ing the
result of read_object_file() after it's used.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
b6046abc0c built-ins & libs & helpers: add/move destructors, fix leaks
Fix various leaks in built-ins, libraries and a test helper here we
were missing a call to strbuf_release(), string_list_clear() etc, or
were calling them after a potential "return".

Comments on individual changes:

- builtin/checkout.c: Fix a memory leak that was introduced in [1]. A
  sibling leak introduced in [2] was recently fixed in [3]. As with [3]
  we should be using the wt_status_state_free_buffers() API introduced
  in [4].

- builtin/repack.c: Fix a leak that's been here since this use of
  "strbuf_release()" was added in a1bbc6c017 (repack: rewrite the shell
  script in C, 2013-09-15). We don't use the variable for anything
  except this loop, so we can instead free it right afterwards.

- builtin/rev-parse: Fix a leak that's been here since this code was
  added in 21d4783538 (Add a parseopt mode to git-rev-parse to bring
  parse-options to shell scripts., 2007-11-04).

- builtin/stash.c: Fix a couple of leaks that have been here since
  this code was added in d4788af875 (stash: convert create to builtin,
  2019-02-25), we strbuf_release()'d only some of the "struct strbuf" we
  allocated earlier in the function, let's release all of them.

- ref-filter.c: Fix a leak in 482c119186 (gpg-interface: improve
  interface for parsing tags, 2021-02-11), we don't use the "payload"
  variable that we ask parse_signature() to populate for us, so let's
  free it.

- t/helper/test-fake-ssh.c: Fix a leak that's been here since this
  code was added in 3064d5a38c (mingw: fix t5601-clone.sh,
  2016-01-27). Let's free the "struct strbuf" as soon as we don't need
  it anymore.

1. c45f0f525d (switch: reject if some operation is in progress,
   2019-03-29)
2. 2708ce62d2 (branch: sort detached HEAD based on a flag,
   2021-01-07)
3. abcac2e19f (ref-filter.c: fix a leak in get_head_description,
   2022-09-25)
4. 962dd7ebc3 (wt-status: introduce wt_status_state_free_buffers(),
   2020-09-27).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
b5fcb1c006 read-cache.c: clear and free "sparse_checkout_patterns"
The "sparse_checkout_patterns" member was added to the "struct
index_state" in 836e25c51b (sparse-checkout: hold pattern list in
index, 2021-03-30), but wasn't added to discard_index(). Let's do
that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
03267e8656 commit: discard partial cache before (re-)reading it
The read_cache() in prepare_to_commit() would end up clobbering the
pointer we had for a previously populated "the_index.cache_tree" in
the very common case of "git commit" stressed by e.g. the tests being
changed here.

We'd populate "the_index.cache_tree" by calling
"update_main_cache_tree" in prepare_index(), but would not end up with
a "fully prepared" index. What constitutes an existing index is
clearly overly fuzzy, here we'll check "active_nr" (aka
"the_index.cache_nr"), but our "the_index.cache_tree" might have been
malloc()'d already.

Thus the code added in 11c8a74a64 (commit: write cache-tree data when
writing index anyway, 2011-12-06) would end up allocating the
"cache_tree", and would interact here with code added in
7168624c35 (Do not generate full commit log message if it is not
going to be used, 2007-11-28). The result was a very common memory
leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
e5e37517dd tests: mark tests as passing with SANITIZE=leak
This marks tests that have been leak-free since various recent
commits, but which were not marked us such when the memory leak was
fixed. These were mostly discovered with the "check" mode added in
faececa53f (test-lib: have the "check" mode for SANITIZE=leak
consider leak logs, 2022-07-28).

Commits that fixed the last memory leak in these tests. Per narrowing
down when they started to pass under SANITIZE=leak with "bisect":

- t1022-read-tree-partial-clone.sh:
  7e2619d8ff (list_objects_filter_options: plug leak of filter_spec
  strings, 2022-09-08)

- t4053-diff-no-index.sh: 07a6f94a6d (diff-no-index: release prefixed
  filenames, 2022-09-07)

- t6415-merge-dir-to-symlink.sh: bac92b1f39 (Merge branch
  'js/ort-clean-up-after-failed-merge', 2022-08-08).

- t5554-noop-fetch-negotiator.sh:
  66eede4a37 (prepare_repo_settings(): plug leak of config values,
  2022-09-08)

- t2012-checkout-last.sh, t7504-commit-msg-hook.sh,
  t91{15,46,60}-git-svn-*.sh: The in-flight "pw/rebase-no-reflog-action"
  series, upon which this is based:
  https://lore.kernel.org/git/pull.1405.git.1667575142.gitgitgadget@gmail.com/

Let's mark all of these as passing with
"TEST_PASSES_SANITIZE_LEAK=true", to have it regression tested,
including as part of the "linux-leaks" CI job.

Additionally, let's remove the "!SANITIZE_LEAK" prerequisite from
tests that now pass, these were marked as failing in:

- 77e56d55ba (diff.c: fix a double-free regression in a18d66cefb,
  2022-03-17)
- c4d1d52631 (tests: change some 'test $(git) = "x"' to test_cmp,
  2022-03-07)

These were not spotted with the new "check" mode, but manually, it
doesn't cover these sort of prerequisites. There's few enough that we
shouldn't bother to automate it. They'll be going away sooner than
later.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-21 12:32:48 +09:00
Ævar Arnfjörð Bjarmason
bdafeae0b9 cache.h & test-tool.h: add & use "USE_THE_INDEX_VARIABLE"
In a preceding commit we fully applied the
"index-compatibility.pending.cocci" rule to "t/helper/*".

Let's now stop defining "USE_THE_INDEX_COMPATIBILITY_MACROS" in
test-tool.h itself, and instead instead define
"USE_THE_INDEX_VARIABLE" in the individual test helpers that need
it. This mirrors how we do the same thing in the "builtin/" directory.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
Ævar Arnfjörð Bjarmason
0ea414a14d cocci: apply "pending" index-compatibility to "t/helper/*.c"
Apply the "index-compatibility.pending.cocci" rule to the "t/helper/*"
directory, a subsequent commit will extend cache.h to further narrow
down the use of "USE_THE_INDEX_COMPATIBILITY_MACROS" in this area.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
Ævar Arnfjörð Bjarmason
dc594180d9 cocci & cache.h: apply variable section of "pending" index-compatibility
Mostly apply the part of "index-compatibility.pending.cocci" that
renames the global variables like "active_nr", which are a shorthand
to referencing (in that case) a struct member as "the_index.cache_nr".

In doing so move more of "index-compatibility.pending.cocci" to
"index-compatibility.cocci".

In the case of "active_nr" we'd have a textual conflict with
"ab/various-leak-fixes" in "next"[1]. Let's exclude that specific case
while moving the rule over from "pending".

1. 407b94280f8 (commit: discard partial cache before (re-)reading it,
   2022-11-08)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-11-21 12:06:15 +09:00
Taylor Blau
26734da056 Merge branch 'jk/branch-delete-detached'
Fix a bug where `git branch -d` did not work on an orphaned HEAD.

* jk/branch-delete-detached:
  branch: gracefully handle '-d' on orphan HEAD
2022-11-18 18:44:00 -05:00
Taylor Blau
a92fce4c50 Merge branch 'vd/skip-cache-tree-update'
Avoid calling 'cache_tree_update()' when doing so would be redundant.

* vd/skip-cache-tree-update:
  rebase: use 'skip_cache_tree_update' option
  read-tree: use 'skip_cache_tree_update' option
  reset: use 'skip_cache_tree_update' option
  unpack-trees: add 'skip_cache_tree_update' option
  cache-tree: add perf test comparing update and prime
2022-11-18 18:43:56 -05:00
Taylor Blau
35dc2cf03f Merge branch 'vd/update-refs-delete'
`git rebase --update-refs` would delete references when all `update-ref`
commands in the sequencer were removed, which has been corrected.

* vd/update-refs-delete:
  rebase --update-refs: avoid unintended ref deletion
2022-11-18 18:43:11 -05:00
Taylor Blau
ad9096881d Merge branch 'tb/repack-expire-to'
"git repack" learns to send cruft objects out of the way into
packfiles outside the repository.

* tb/repack-expire-to:
  builtin/repack.c: implement `--expire-to` for storing pruned objects
  builtin/repack.c: write cruft packs to arbitrary locations
  builtin/repack.c: pass "cruft_expiration" to `write_cruft_pack`
  builtin/repack.c: pass "out" to `prepare_pack_objects`
2022-11-18 18:43:09 -05:00
Taylor Blau
e53598a5ab Merge branch 'ab/sha-makefile-doc'
Makefile comments updates and reordering to clarify knobs used to
choose SHA implementations.

* ab/sha-makefile-doc:
  Makefile: discuss SHAttered in *_SHA{1,256} discussion
  Makefile: document default SHA-1 backend on OSX
  Makefile & test-tool: replace "DC_SHA1" variable with a "define"
  Makefile: document SHA-1 and SHA-256 default and selection order
  Makefile: document default SHA-256 backend
  Makefile: rephrase the discussion of *_SHA1 knobs
  Makefile: create and use sections for "define" flag listing
  Makefile: correct DC_SHA1 documentation
  INSTALL: remove discussion of SHA-1 backends
  Makefile: always (re)set DC_SHA1 on fallback
2022-11-18 18:43:07 -05:00
Taylor Blau
69c1d609ba Merge branch 'ab/misc-hook-submodule-run-command'
Various test updates.

* ab/misc-hook-submodule-run-command:
  run-command tests: test stdout of run_command_parallel()
  submodule tests: reset "trace.out" between "grep" invocations
  hook tests: fix redirection logic error in 96e7225b31
2022-11-18 18:43:04 -05:00
Jeff King
8db2dad7a0 parse_object(): check on-disk type of suspected blob
In parse_object(), we try to handle blobs by streaming rather than
loading them entirely into memory. The most common case here will be
that we haven't seen the object yet and check oid_object_info(), which
tells us we have a blob.

But we trigger this code on one other case: when we have an in-memory
object struct with type OBJ_BLOB (and without its "parsed" flag set,
since otherwise we'd return early from the function). This indicates
that some other part of the code suspected we have a blob (e.g., it was
mentioned by a tree or tag) but we haven't yet looked at the on-disk
copy.

In this case before hitting the streaming path, we check if we have the
object on-disk at all. This is mostly pointless extra work, as the
streaming path would complain if it couldn't open the object (albeit
with the message "hash mismatch", which is a little misleading).

But it's also insufficient to catch all problems. The streaming code
will only tell us "yes, the on-disk object matches the oid". But it
doesn't actually confirm that what we found was indeed a blob, and
neither does repo_has_object_file().

One way to improve this would be to teach stream_object_signature() to
check the type (either by returning it to us to check, or taking an
"expected" type). But there's an even simpler fix here: if we suspect
the object is a blob, just call oid_object_info() to confirm that we
have it on-disk, and that it really is a blob.

This is slightly less efficient than teaching stream_object_signature()
to do it (since it has to open the object already). But this case very
rarely comes up. In practice, we usually don't have any clue what the
type is, in which case we already call oid_object_info(). This
"suspected" case happens only when some other code created an object
struct but didn't actually parse the blob, which is actually tricky to
trigger at all (see the discussion of the test below).

I reworked the conditional a bit so that instead of:

  if ((suspected_blob && oid_object_info() == OBJ_BLOB)
      (no_clue && oid_object_info() == OBJ_BLOB)

we have the simpler:

  if ((suspected_blob || no_clue) && oid_object_info() == OBJ_BLOB)

This is shorter, but also reflects what we really want say, which is
"have we ruled out this being a blob; if not, check it on-disk".

In either case, if oid_object_info() fails to tell us it's a blob, we'll
skip the streaming code path and call repo_read_object_file(), just as
before. And if we really do have a mismatch with the existing object
struct, we'll eventually call lookup_commit(), etc, via
parse_object_buffer(), which will complain that it doesn't match our
existing obj->type.

So this fixes one of the lingering expect_failure cases from 0616617c7e
(t: introduce tests for unexpected object types, 2019-04-09).  That test
works by peeling a tag that claims to point to a blob (triggering us to
create the struct), but really points to something else, which we later
discover when we call parse_object() as part of the actual traversal).
Prior to this commit, we'd quietly check the sha1 and mark the blob as
"parsed". Now we correctly complain about the mismatch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-18 13:59:31 -05:00
Rubén Justo
cfbd173ccb branch: force-copy a branch to itself via @{-1} is a no-op
Since 52d59cc645 (branch: add a --copy (-c) option to go with --move
(-m), 2017-06-18) we can copy a branch to make a new branch with the
'-c' (copy) option or to overwrite an existing branch using the '-C'
(force copy) option.  A no-op possibility is considered when we are
asked to copy a branch to itself, to follow the same no-op introduced
for the rename (-M) operation in 3f59481e33 (branch: allow a no-op
"branch -M <current-branch> HEAD", 2011-11-25).  To check for this, in
52d59cc645 we compared the branch names provided by the user, source
(HEAD if omitted) and destination, and a match is considered as this
no-op.

Since ae5a6c3684 (checkout: implement "@{-N}" shortcut name for N-th
last branch, 2009-01-17) a branch can be specified using shortcuts like
@{-1}.  This allows this usage:

	$ git checkout -b test
	$ git checkout -
	$ git branch -C test test  # no-op
	$ git branch -C test @{-1} # oops
	$ git branch -C @{-1} test # oops

As we are using the branch name provided by the user to do the
comparison, if one of the branches is provided using a shortcut we are
not going to have a match and a call to git_config_copy_section() will
happen.  This will make a duplicate of the configuration for that
branch, and with this progression the second call will produce four
copies of the configuration, and so on.

Let's use the interpreted branch name instead for this comparison.

The rename operation is not affected.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 17:16:21 -05:00
Patrick Steinhardt
5ff36c9b6b rev-parse: add --exclude-hidden= option
Add a new `--exclude-hidden=` option that is similar to the one we just
added to git-rev-list(1). Given a section name `uploadpack` or `receive`
as argument, it causes us to exclude all references that would be hidden
by the respective `$section.hideRefs` configuration.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:52 -05:00
Patrick Steinhardt
8c1bc2a71a revision: add new parameter to exclude hidden refs
Users can optionally hide refs from remote users in git-upload-pack(1),
git-receive-pack(1) and others via the `transfer.hideRefs`, but there is
not an easy way to obtain the list of all visible or hidden refs right
now. We'll require just that though for a performance improvement in our
connectivity check.

Add a new option `--exclude-hidden=` that excludes any hidden refs from
the next pseudo-ref like `--all` or `--branches`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-17 16:22:52 -05:00
Ævar Arnfjörð Bjarmason
23fb328c8d t7610: use "file:///dev/null", not "/dev/null", fixes MinGW
On MinGW the "/dev/null" is translated to "nul" on command-lines, even
though as in this case it'll never end up referring to an actual file.

So on Windows the fix for the previous "example.com" timeout issue in
8354cf752e (t7610: fix flaky timeout issue, don't clone from
example.com, 2022-11-05) would yield:

  fatal: repo URL: 'nul' must be absolute or begin with ./|../

Let's evade this yet again by prefixing this with "file://", which
makes this pass in the Windows CI.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-15 20:05:02 -05:00
Ronan Pigott
1f80129d61 maintenance: add option to register in a specific config
maintenance register currently records the maintenance repo exclusively
within the user's global configuration, but other configuration files
may be relevant when running maintenance if they are included from the
global config. This option allows the user to choose where maintenance
repos are recorded.

Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 22:39:25 -05:00
Ronan Pigott
13d5bbdf72 for-each-repo: interpolate repo path arguments
This is a quality of life change for git-maintenance, so repos can be
recorded with the tilde syntax. The register subcommand will not record
repos in this format by default.

Signed-off-by: Ronan Pigott <ronan@rjp.ie>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 22:39:25 -05:00
Taylor Blau
561f3948a5 Merge branch 'do/modernize-t7001'
Modernize test script to avoid "test -f" and friends.

* do/modernize-t7001:
  t7001-mv.sh: modernizing test script using functions
2022-11-14 19:53:31 -05:00
Glen Choo
b637a41ebe http: redact curl h2h3 headers in info
With GIT_TRACE_CURL=1 or GIT_CURL_VERBOSE=1, sensitive headers like
"Authorization" and "Cookie" get redacted. However, since [1], curl's
h2h3 module (invoked when using HTTP/2) also prints headers in its
"info", which don't get redacted. For example,

  echo 'github.com	TRUE	/	FALSE	1698960413304	o	foo=bar' >cookiefile &&
  GIT_TRACE_CURL=1 GIT_TRACE_CURL_NO_DATA=1 git \
    -c 'http.cookiefile=cookiefile' \
    -c 'http.version=' \
    ls-remote https://github.com/git/git refs/heads/main 2>output &&
  grep 'cookie' output

produces output like:

  23:04:16.920495 http.c:678              == Info: h2h3 [cookie: o=foo=bar]
  23:04:16.920562 http.c:637              => Send header: cookie: o=<redacted>

Teach http.c to check for h2h3 headers in info and redact them using the
existing header redaction logic. This fixes the broken redaction logic
that we noted in the previous commit, so mark the redaction tests as
passing under HTTP2.

[1] f8c3724aa9

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 17:42:46 -05:00
Jeff King
73c49a4474 t: run t5551 tests with both HTTP and HTTP/2
We have occasionally seen bugs that affect Git running only against an
HTTP/2 web server, not an HTTP one. For instance, b66c77a64e (http:
match headers case-insensitively when redacting, 2021-09-22). But since
we have no test coverage using HTTP/2, we only uncover these bugs in the
wild.

That commit gives a recipe for converting our Apache setup to support
HTTP/2, but:

  - it's not necessarily portable

  - we don't want to just test HTTP/2; we really want to do a variety of
    basic tests for _both_ protocols

This patch handles both problems by running a duplicate of t5551
(labeled as t5559 here) with an alternate-universe setup that enables
HTTP/2. So we'll continue to run t5551 as before, but run the same
battery of tests again with HTTP/2. If HTTP/2 isn't supported on a given
platform, then t5559 should bail during the webserver setup, and
gracefully skip all tests (unless GIT_TEST_HTTPD has been changed from
"auto" to "yes", where the point is to complain when webserver setup
fails).

In theory other http-related test scripts could benefit from the same
duplication, but doing t5551 should give us a reasonable check of basic
functionality, and would have caught both bugs we've seen in the wild
with HTTP/2.

A few notes on the implementation:

  - a script enables the server side config by calling enable_http2
    before starting the webserver. This avoids even trying to load any
    HTTP/2 config for t5551 (which is what lets it keep working with
    regular HTTP even on systems that don't support it). This also sets
    a prereq which can be used by individual tests.

  - As discussed in b66c77a64e, the http2 module isn't compatible with
    the "prefork" mpm, so we need to pick something else. I chose
    "event" here, which works on my Debian system, but it's possible
    there are platforms which would prefer something else. We can adjust
    that later if somebody finds such a platform.

  - The test "large fetch-pack requests can be sent using chunked
    encoding" makes sure we use a chunked transfer-encoding by looking
    for that header in the trace. But since HTTP/2 has its own streaming
    mechanisms, we won't find such a header. We could skip the test
    entirely by marking it with !HTTP2. But there's some value in making
    sure that the fetch itself succeeded. So instead, we'll confirm that
    either we're using HTTP2 _or_ we saw the expected chunked header.

  - the redaction tests fail under HTTP/2 with recent versions of curl.
    This is a bug! I've marked them with !HTTP2 here to skip them under
    t5559 for the moment. Using test_expect_failure would be more
    appropriate, but would require a bunch of boilerplate. Since we'll
    be fixing them momentarily, let's just skip them for now to keep the
    test suite bisectable, and we can re-enable them in the commit that
    fixes the bug.

  - one alternative layout would be to push most of t5551 into a
    lib-t5551.sh script, then source it from both t5551 and t5559.
    Keeping t5551 intact seemed a little simpler, as its one less level
    of indirection for people fixing bugs/regressions in the non-HTTP/2
    tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 17:42:46 -05:00
Teng Long
8ddc06631b pack-bitmap.c: avoid exposing absolute paths
In "open_midx_bitmap_1()" and "open_pack_bitmap_1()", when we find that
there are multiple bitmaps, we will only open the first one and then
leave warnings about the remaining pack information, the information
will contain the absolute path of the repository, for example in a
alternates usage scenario. So let's hide this kind of potentially
sensitive information in this commit.

Found-by: XingXin <moweng.xx@antgroup.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 17:21:16 -05:00
Jonathan Tan
e62f779ae6 Doc: document push.recurseSubmodules=only
Git learned pushing submodules without pushing the superproject by
the user specifying --recurse-submodules=only through 6c656c3fe4
("submodules: add RECURSE_SUBMODULES_ONLY value", 2016-12-20) and
225e8bf778 ("push: add option to push only submodules", 2016-12-20).
For users who use this feature regularly, it is desirable to have an
equivalent configuration.

It turns out that such a configuration (push.recurseSubmodules=only) is
already supported, even though it is neither documented nor mentioned
in the commit messages, due to the way the --recurse-submodules=only
feature was implemented (a function used to parse --recurse-submodules
was updated to support "only", but that same function is used to parse
push.recurseSubmodules too). What is left is to document it and test it,
which is what this commit does.

There is a possible point of confusion when recursing into a submodule
that itself has the push.recurseSubmodules=only configuration, because
if a repository has only its submodules pushed and not itself, its
superproject can never be pushed. Therefore, treat such configurations
as being "on-demand", and print a warning message.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-14 16:55:50 -05:00
Kyle Zhao
501e3bab99 merge-tree.c: allow specifying the merge-base when --stdin is passed
The previous commit added a `--merge-base` option in order to allow
using a specified merge-base for the merge.  Extend the input accepted
by `--stdin` to also allow a specified merge-base with each merge
requested.  For example:

    printf "<b3> -- <b1> <b2>" | git merge-tree --stdin

does a merge of b1 and b2, and uses b3 as the merge-base.

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-12 23:53:04 -05:00
Kyle Zhao
66265a693e merge-tree.c: add --merge-base=<commit> option
This patch will give our callers more flexibility to use `git merge-tree`,
such as:

    git merge-tree --write-tree --merge-base=branch^ HEAD branch

This does a merge of HEAD and branch, but uses branch^ as the merge-base.

And the reason why using an option flag instead of a positional argument
is to allow additional commits passed to merge-tree to be handled via an
octopus merge in the future.

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-12 23:53:04 -05:00
Johannes Schindelin
a90085b68c tests(scalar): tighten the stale scalar.repo test some
As pointed out by Stolee, the previous incarnation of this test case was
not stringent enough: we want to verify that _only_ the stale entries
are removed (previously, the test case would have succeeded even if all
entries had been removed).

Let's rectify this and verify that the other entries are left intact.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:24:36 -05:00
Ævar Arnfjörð Bjarmason
929bf9db28 bisect test: test exit codes on bad usage
Address a test blindspot, the "log" command is the odd one out because
"git-bisect.sh" ignores any arguments it receives. Let's test both the
exit codes we expect, and the stderr and stdout we're emitting.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:57 -05:00
Đoàn Trần Công Danh
8962f8f888 bisect-run: verify_good: account for non-negative exit status
Some system never reports negative exit code at all, they reports them
as bigger-than-128 instead.  We take extra care for those systems in the
later check for normal 'do_bisect_run' loop.

Let's check it here, too.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:53 -05:00
Đoàn Trần Công Danh
461fec41fa bisect run: keep some of the post-v2.30.0 output
Preceding commits fixed output and behavior regressions in
d1bbbe45df (bisect--helper: reimplement `bisect_run` shell function
in C, 2021-09-13), which did not claim to be changing the output of
"git bisect run".

But some of the output it emitted was subjectively better, so once
we've asserted that we're back on v2.29.0 behavior, let's change some
of it back:

- We now quote the arguments again, but omit the first " " when
  printing the "running" line.
- Ditto for other cases where we emitted the argument
- We say "found first bad commit" again, not just "run success"

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Based-on-patch-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:52 -05:00
Đoàn Trần Công Danh
f37d0bdd42 bisect: fix output regressions in v2.30.0
When d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
function in C, 2021-09-13) reimplemented parts of "git bisect run" in
C it changed the output we emitted so that:

 - The "running ..." line was now quoted
 - We lost the \n after our output
 - We started saying "bisect found ..." instead of "bisect run success"

Arguably some of this is better now, but as d1bbbe45df did not
advocate for changing the output, let's revert this for now. It'll be
easy to change it back if that's what we'd prefer.

This does not change the one remaining use of "command.buf" to emit
the quoted argument, as that's new in d1bbbe45df.

Some of these cases were not tested for in the tests added in the
preceding commit, I didn't have time to fleshen those out, but a look
at f1de981e8b will show that the other output being adjusted here is
now equivalent to what it was before d1bbbe45df.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:51 -05:00
Ævar Arnfjörð Bjarmason
982fecf7c1 bisect tests: test for v2.30.0 "bisect run" regressions
Add three failing tests which succeed on v2.29.0, but due to the topic
merged at [1] (specifically [2]) have been failing since then. We'll
address those regressions in subsequent commits.

There was also a "regression" where:

	git bisect run ./missing-script.sh

Would count a non-existing script as "good", as the shell would exit
with 127. That edge case is a bit too insane to preserve, so let's not
add it to these regression tests.

There was another regression that 'git bisect' consumed some options
that was meant to passed down to program run with 'git bisect run'.
Since that regression is breaking user's expectation, it has been fixed
earlier without this patch queued.

1. 0a4cb1f1f2 (Merge branch 'mr/bisect-in-c-4', 2021-09-23)
2. d1bbbe45df (bisect--helper: reimplement `bisect_run` shell
   function in C, 2021-09-13)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:05:48 -05:00
Taylor Blau
2445d34fb9 Merge branch 'dd/bisect-helper-subcommand' into dd/git-bisect-builtin
* dd/bisect-helper-subcommand:
  bisect--helper: parse subcommand with OPT_SUBCOMMAND
  bisect--helper: move all subcommands into their own functions
  bisect--helper: remove unused options
2022-11-11 17:05:43 -05:00
Đoàn Trần Công Danh
e9011b6092 bisect--helper: parse subcommand with OPT_SUBCOMMAND
As of it is, we're parsing subcommand with OPT_CMDMODE, which will
continue to parse more options even if the command has been found.

When we're running "git bisect run" with a command that expecting
a "--log" or "--no-log" arguments, or one of those "--bisect-..."
arguments, bisect--helper may mistakenly think those options are
bisect--helper's option.

We may fix those problems by passing "--" when calling from
git-bisect.sh, and skip that "--" in bisect--helper. However, it may
interfere with user's "--".

Let's parse subcommand with OPT_SUBCOMMAND since that API was born for
this specific use-case.

Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 17:04:57 -05:00
Eric Sunshine
48d69d8f2f chainlint: prefix annotated test definition with line numbers
When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.

To further assist the test author, display line numbers along with the
annotated test definition, thus allowing the author to jump directly to
each problematic line.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 16:56:21 -05:00
Eric Sunshine
bf42f0a030 chainlint: latch line numbers at which each token starts and ends
When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.

To further assist the test author, an upcoming change will display line
numbers along with the annotated test definition, thus allowing the
author to jump directly to each problematic line. As preparation,
upgrade Lexer to latch the line numbers at which each token starts and
ends, and return that information with the token itself.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 16:56:21 -05:00
Eric Sunshine
5451877f87 chainlint: sidestep impoverished macOS "terminfo"
Although the macOS Terminal.app is "xterm"-compatible, its corresponding
"terminfo" entries -- such as "xterm", "xterm-256color", and
"xterm-new"[1] -- neglect to mention capabilities which Terminal.app
actually supports (such as "dim text"). This oversight on Apple's part
ends up penalizing users of "good citizen" console programs which
consult "terminfo" to tailor their output based upon reported terminal
capabilities (as opposed to programs which assume that the terminal
supports ANSI codes). The same problem is present in other Apple
"terminfo" entries, such as "nsterm"[2], with which macOS Terminal.app
may be configured.

Sidestep this Apple problem by imbuing get_colors() with specific
knowledge of capabilities common to "xterm" and "nsterm", rather than
trusting "terminfo" to report them correctly. Although hard-coding such
knowledge is ugly, "xterm" support is nearly ubiquitous these days, and
Git itself sets precedence by assuming support for ANSI color codes. For
other terminal types, fall back to querying "terminfo" via `tput` as
usual.

FOOTNOTES

[1] iTerm2 FAQ suggests "xterm-new": https://iterm2.com/faq.html

[2] Neovim documentation recommends terminal type "nsterm" with
    Terminal.app: https://neovim.io/doc/user/term.html#terminfo

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-11 16:56:21 -05:00
Phillip Wood
688d82f254 sequencer: tighten label lookups
The `label` command creates a ref refs/rewritten/<label> that the
`reset` and `merge` commands resolve by calling lookup_label(). That
uses lookup_commit_reference_by_name() to look up the label ref. As
lookup_commit_reference_by_name() uses the dwim rules when looking up
the label it will look for a branch named
refs/heads/refs/rewritten/<label> and return that instead of an error if
the branch exists and the label does not. Fix this by using read_ref()
followed by lookup_commit_object() when looking up labels.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 23:36:24 -05:00
Phillip Wood
82766b2961 sequencer: unify label lookup
The arguments to the `reset` and `merge` commands may be a label created
with a `label` command or an arbitrary commit name. The `merge` command
uses the lookup_label() function to lookup its arguments but `reset` has
a slightly different version of that function in do_reset(). Reduce this
code duplication by calling lookup_label() from do_reset() as well.

This change improves the behavior of `reset` when the argument is a
tree.  Previously `reset` would accept a tree only for the rebase to
fail with

       update_ref failed for ref 'HEAD': cannot update ref 'HEAD': trying to write non-commit object da5497437fd67ca928333aab79c4b4b55036ea66 to branch 'HEAD'

Using lookup_label() means do_reset() will now error out straight away
if its argument is not a commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 23:36:24 -05:00
Victoria Dye
dc5d40f5bc read-tree: use 'skip_cache_tree_update' option
When running 'read-tree' with a single tree and no prefix,
'prime_cache_tree()' is called after the tree is unpacked. In that
situation, skip a redundant call to 'cache_tree_update()' in
'unpack_trees()' by enabling the 'skip_cache_tree_update' unpack option.

Removing the redundant cache tree update provides a substantial performance
improvement to 'git read-tree <tree-ish>', as shown by a test added to
'p0006-read-tree-checkout.sh':

Test                          before            after
----------------------------------------------------------------------
read-tree br_ballast_plus_1   3.94(1.80+1.57)   3.00(1.14+1.28) -23.9%

Note that the 'read-tree' in 't1022-read-tree-partial-clone.sh' is updated
to read two trees, rather than one. The test was first introduced in
d3da223f22 (cache-tree: prefetch in partial clone read-tree, 2021-07-23) to
exercise the 'cache_tree_update()' code path, as used in 'git merge'. Since
this patch drops the call to 'cache_tree_update()' in single-tree 'git
read-tree', change the test to use the two-tree variant so that
'cache_tree_update()' is called as intended.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:49:34 -05:00
Victoria Dye
0e47bca0f7 reset: use 'skip_cache_tree_update' option
Enable the 'skip_cache_tree_update' option in the variants that call
'prime_cache_tree()' after 'unpack_trees()' (specifically, 'git reset
--mixed' and 'git reset --hard'). This avoids redundantly rebuilding the
cache tree in both 'cache_tree_update()' at the end of 'unpack_trees()' and
in 'prime_cache_tree()', resulting in a small (but consistent) performance
improvement. From the newly-added 'p7102-reset.sh' test:

Test                         before            after
--------------------------------------------------------------------
7102.1: reset --hard (...)   2.11(0.40+1.54)   1.97(0.38+1.47) -6.6%

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:49:34 -05:00
Victoria Dye
94fcf0e852 cache-tree: add perf test comparing update and prime
Add a performance test comparing the execution times of 'prime_cache_tree()'
and 'cache_tree_update(_, WRITE_TREE_SILENT | WRITE_TREE_REPAIR)'. The goal
of comparing these two is to identify which is the faster method for
rebuilding an invalid cache tree, ultimately to remove one when both are
(reundantly) called in immediate succession.

Both methods are fast, so the new tests in 'p0090-cache-tree.sh' must call
each tested function multiple times to ensure the reported times (to 0.01s
resolution) convey the differences between them.

The tests compare the timing of a 'test-tool cache-tree' run as a no-op (to
capture a baseline for the overhead associated with running the tool),
'cache_tree_update()', and 'prime_cache_tree()' on four scenarios:

- A completely valid cache tree
- A cache tree with 2 invalid paths
- A cache tree with 50 invalid paths
- A completely empty cache tree

Example results:

Test                                        this tree
-----------------------------------------------------------
0090.2: no-op, clean                        1.27(0.48+0.52)
0090.3: prime_cache_tree, clean             2.02(0.83+0.85)
0090.4: cache_tree_update, clean            1.30(0.49+0.54)
0090.5: no-op, invalidate 2                 1.29(0.48+0.54)
0090.6: prime_cache_tree, invalidate 2      1.98(0.81+0.83)
0090.7: cache_tree_update, invalidate 2     2.12(0.94+0.86)
0090.8: no-op, invalidate 50                1.32(0.50+0.55)
0090.9: prime_cache_tree, invalidate 50     2.10(0.86+0.89)
0090.10: cache_tree_update, invalidate 50   2.35(1.14+0.90)
0090.11: no-op, empty                       1.33(0.50+0.54)
0090.12: prime_cache_tree, empty            2.04(0.84+0.87)
0090.13: cache_tree_update, empty           2.51(1.27+0.92)

These timings show that, while 'cache_tree_update()' is faster when the
cache tree is completely valid, it is equal to or slower than
'prime_cache_tree()' when there are any invalid paths. Since the redundant
calls are mostly in scenarios where the cache tree will be at least
partially invalid (e.g., 'git reset --hard'), 'prime_cache_tree()' will
likely perform better than 'cache_tree_update()' in typical cases.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:49:33 -05:00
Jeff King
eb20e63f5a branch: gracefully handle '-d' on orphan HEAD
When deleting a branch, "git branch -d" has a safety check that ensures
the branch is merged to its upstream (if any), or to HEAD. To do that,
naturally we try to resolve HEAD to a commit object. If we're on an
orphan branch (i.e., HEAD points to a branch that does not yet exist),
that will fail, and we'll bail with an error:

  $ git branch -d to-delete
  fatal: Couldn't look up commit object for HEAD

This usually isn't that big of a deal. The deletion would fail anyway,
since the branch isn't merged to HEAD, and you'd need to use "-D" (or
"-f"). And doing so skips the HEAD resolution, courtesy of 67affd5173
(git-branch -D: make it work even when on a yet-to-be-born branch,
2006-11-24).

But there are still two problems:

  1. The error message isn't very helpful. We should give the usual "not
     fully merged" message, which points the user at "branch -D". That
     was a problem even back in 67affd5173.

  2. Even without a HEAD, these days it's still possible for the
     deletion to succeed. After 67affd5173, commit 99c419c915 (branch
     -d: base the "already-merged" safety on the branch it merges with,
     2009-12-29) made it OK to delete a branch if it is merged to its
     upstream.

We can fix both by removing the die() in delete_branches() completely,
leaving head_rev NULL in this case. It's tempting to stop there, as it
appears at first glance that the rest of the code does the right thing
with a NULL. But sadly, it's not quite true.

We end up feeding the NULL to repo_is_descendant_of(). In the
traditional code path there, we call repo_in_merge_bases_many(). It
feeds the NULL to repo_parse_commit(), which is smart enough to return
an error, and we immediately return "no, it's not a descendant".

But there's an alternate code path: if we have a commit graph with
generation numbers, we end up in can_all_from_reach(), which does
eventually try to set a flag on the NULL commit and segfaults.

So instead, we'll teach the local branch_merged() helper to treat a NULL
as "not merged". This would be a little more elegant in in_merge_bases()
itself, but that function is called in a lot of places, and it's not
clear that quietly returning "not merged" is the right thing everywhere
(I'd expect in many cases, feeding a NULL is a sign of a bug).

There are four tests here:

  a. The first one confirms that deletion succeeds with an orphaned HEAD
     when the branch is merged to its upstream. This is case (2) above.

  b. Same, but with commit graphs enabled. Even if it is merged to
     upstream, we still check head_rev so that we can say "deleting
     because it's merged to upstream, even though it's not merged to
     HEAD". Without the second hunk in branch_merged(), this test would
     segfault in can_all_from_reach().

  c. The third one confirms that we correctly say "not merged to HEAD"
     when we can't resolve HEAD, and reject the deletion.

  d. Same, but with commit graphs enabled. Without the first hunk in
     branch_merged(), this one would segfault.

Reported-by: Martin von Zweigbergk <martinvonz@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-10 21:42:45 -05:00
Phillip Wood
7595c0ece1 config: require at least one digit when parsing numbers
If the input to strtoimax() or strtoumax() does not contain any digits
then they return zero and set `end` to point to the start of the input
string.  git_parse_[un]signed() do not check `end` and so fail to return
an error and instead return a value of zero if the input string is a
valid units factor without any digits (e.g "k").

Tests are added to check that 'git config --int' and OPT_MAGNITUDE()
reject a units specifier without a leading digit.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 21:30:39 -05:00
Phillip Wood
84356ff770 git_parse_unsigned: reject negative values
git_parse_unsigned() relies on strtoumax() which unfortunately parses
negative values as large positive integers. Fix this by rejecting any
string that contains '-' as we do in strtoul_ui(). I've chosen to treat
negative numbers as invalid input and set errno to EINVAL rather than
ERANGE one the basis that they are never acceptable if we're looking for
a unsigned integer. This is also consistent with the existing behavior
of rejecting "1–2" with EINVAL.

As we do not have unit tests for this function it is tested indirectly
by checking that negative values of reject for core.bigFileThreshold are
rejected. As this function is also used by OPT_MAGNITUDE() a test is
added to check that rejects negative values too.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 21:30:38 -05:00
Ævar Arnfjörð Bjarmason
8354cf752e t7610: fix flaky timeout issue, don't clone from example.com
When t7610-mergetool.sh runs without failures the git://example.com
submodule URLs will never be used. That's because we "git submodule
add" it, but then manually populate them so that subsequent "git
submodule update -N" won't attempt to clone it, only update it without
fetching.

But if we fail in an earlier test it'll have the knock-on effect of
having later tests hang on that "git submodule update -N" as we
attempt to clone this repository from example.com.

This can be reproduced on "master" by running the test with
SANITIZE=leak without "--immediate". With
"GIT_TEST_PASSING_SANITIZE_LEAK=true" (which the linux-leaks job uses)
we'll skip the test entirely. So we'll only run into this when running
it manually, or with the "GIT_TEST_PASSING_SANITIZE_LEAK=check" mode.

That's not because the failure has anything to do with leak detection
per-se. It just so happens that we have a leak that'll fail before
we've managed to fully set these up, and therefore "git submodule
update -N" ends up spawning "git clone".

Let's instead continue lying about the origin of this submodule by
providing a URL for it that doesn't work, but now one that *really*
doesn't work: /dev/null. If the test is passing we won't ever use
this, and if we have knock-on failures we'll fail early, instead of
waiting for a timeout.

The behavior of "-N" here might be surprising to some, since it's
explained as "[if you use -N we] don’t fetch new objects from the
remote site". But (perhaps counter-intuitively) it's only talking
about if it needs to do so via "git fetch". In this case we'll end up
spawning a "git clone", as we have no submodule set up.

See ff7f089ed1 (mergetool: Teach about submodules, 2011-04-13) for
the commit that implemented these "example.com" tests.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-09 17:29:31 -05:00
Taylor Blau
3a79a8085b Merge branch 'es/chainlint-output' into es/chainlint-lineno
* es/chainlint-output:
  chainlint: annotate original test definition rather than token stream
  chainlint: latch start/end position of each token
  chainlint: tighten accuracy when consuming input stream
  chainlint: add explanatory comments
2022-11-09 16:41:35 -05:00
Taylor Blau
be4ac3b197 Merge branch 'rs/no-more-run-command-v'
Simplify the run-command API.

* rs/no-more-run-command-v:
  replace and remove run_command_v_opt()
  replace and remove run_command_v_opt_cd_env_tr2()
  replace and remove run_command_v_opt_tr2()
  replace and remove run_command_v_opt_cd_env()
  use child_process members "args" and "env" directly
  use child_process member "args" instead of string array variable
  sequencer: simplify building argument list in do_exec()
  bisect--helper: factor out do_bisect_run()
  bisect: simplify building "checkout" argument list
  am: simplify building "show" argument list
  run-command: fix return value comment
  merge: remove always-the-same "verbose" arguments
2022-11-08 17:15:12 -05:00
Taylor Blau
3e9303dc8e Merge branch 'rs/archive-filter-error-once'
"git archive" mistakenly complained twice about a missing executable,
which has been corrected.

* rs/archive-filter-error-once:
  archive-tar: report filter start error only once
2022-11-08 17:15:09 -05:00
Taylor Blau
ec9a46af4f Merge branch 'ma/drop-redundant-diagnostic'
A redundant diagnostic message is dropped from test_path_is_missing().

* ma/drop-redundant-diagnostic:
  test-lib-functions: drop redundant diagnostic print
2022-11-08 17:15:06 -05:00
Taylor Blau
15df8418a5 Merge branch 'jk/ref-filter-parsing-bugs'
Various tests exercising the transfer.credentialsInUrl configuration
are taught to avoid making requests which require resolving localhost
to reduce CI-flakiness.

* jk/ref-filter-parsing-bugs:
  ref-filter: fix parsing of signatures with CRLF and no body
  ref-filter: fix parsing of signatures without blank lines
2022-11-08 17:14:52 -05:00
Taylor Blau
bdd42e34e3 Merge branch 'es/mark-gc-cruft-as-experimental'
Enable gc.cruftpacks by default for those who opt into
feature.experimental setting.

* es/mark-gc-cruft-as-experimental:
  config: let feature.experimental imply gc.cruftPacks=true
  gc: add tests for --cruft and friends
2022-11-08 17:14:48 -05:00
Eric Sunshine
73c768dae9 chainlint: annotate original test definition rather than token stream
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.

However, now that each parsed token carries positional information, the
location of a detected problem can be pinpointed precisely in the
original test definition. Therefore, take advantage of this information
to annotate the test definition itself rather than annotating the parsed
token stream, thus making it easier for a test author to relate a
problem back to the source.

Maintaining the positional meta-information associated with each
detected problem requires a slight change in how the problems are
managed internally. In particular, shell syntax such as:

    msg="total: $(cd data; wc -w *.txt) words"

requires the lexical analyzer to recursively invoke the parser in order
to detect problems within the $(...) expression inside the double-quoted
string. In this case, the recursive parse context will detect the broken
&&-chain between the `cd` and `wc` commands, returning the token stream:

    cd data ; ?!AMP?! wc -w *.txt

However, the parent parse context will see everything inside the
double-quotes as a single string token:

    "total: $(cd data ; ?!AMP?! wc -w *.txt) words"

losing whatever positional information was attached to the ";" token
where the problem was detected.

One way to preserve the positional information of a detected problem in
a recursive parse context within a string would be to attach the
positional information to the annotation textually; for instance:

    "total: $(cd data ; ?!AMP:21:22?! wc -w *.txt) words"

and then extract the positional information when annotating the original
test definition.

However, a cleaner and much simpler approach is to maintain the list of
detected problems separately rather than embedding the problems as
annotations directly in the parsed token stream. Not only does this
ensure that positional information within recursive parse contexts is
not lost, but it keeps the token stream free from non-token pollution,
which may simplify implementation of validations added in the future
since they won't have to handle non-token "?!FOO!?" items specially.

Finally, the chainlint self-test "expect" files need a few mechanical
adjustments now that the original test definitions are emitted rather
than the parsed token stream. In particular, the following items missing
from the historic parsed-token output are now preserved verbatim:

    * indentation (and whitespace, in general)

    * comments

    * here-doc bodies

    * here-doc tag quoting (i.e. "\EOF")

    * line-splices (i.e. "\" at the end of a line)

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 15:10:49 -05:00
Eric Sunshine
5f0321a9f2 chainlint: latch start/end position of each token
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.

To address this shortcoming, an upcoming change will make it print out
an annotated copy of the original test definition rather than the
tokenized representation. In order to do so, it will need to know the
start and end positions of each token in the original test definition.
As preparation, upgrade TestParser::scan_token() to latch the start and
end position of the token being scanned, and return that information
along with the token itself. A subsequent change will take advantage of
this positional information.

In terms of implementation, TestParser::scan_token() is retrofitted to
return a tuple consisting of the token's lexeme and its start and end
positions, rather than returning just the lexeme. However, an
alternative would be to define a class which represents a token:

    package Token;

    sub new {
        my ($class, $lexeme, $start, $end) = @_;
        bless [$lexeme, $start, $end] => $class;
    }

    sub as_string {
        my $self = shift @_;
        return $self->[0];
    }

    sub compare {
        my ($x, $y) = @_;
        if (UNIVERSAL::isa($y, 'Token')) {
            return $x->[0] cmp $y->[0];
        }
        return $x->[0] cmp $y;
    }

    use overload (
        '""' => 'as_string',
        'cmp' => 'compare'
    );

The major benefit of the class-based approach is that it is entirely
non-invasive; it requires no additional changes to the rest of the
script since a Token converts automatically to a string, which is what
scan_token() historically returned.

The big downside to the Token approach, however, is that it is _slow_;
on this developer's (old) machine, it increases user-time by an
unacceptable seven seconds when scanning all test scripts in the
project. Hence, the simple tuple approach is employed instead since it
adds only a fraction of a second user-time.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 15:10:49 -05:00
Eric Sunshine
ca748f5183 chainlint: tighten accuracy when consuming input stream
To extract the next token in the input stream, Lexer::scan_token() finds
the start of the token by skipping whitespace, then consumes characters
belonging to the token until it encounters a non-token character, such
as an operator, punctuation, or whitespace. In the case of an operator
or punctuation which ends a token, before returning the just-scanned
token, it pushes that operator or punctuation character back onto the
input stream to ensure that it will be the first character consumed by
the next call to scan_token().

However, scan_token() is intentionally lax when whitespace ends a token;
it doesn't bother pushing the whitespace character back onto the token
stream since it knows that the next call to scan_token() will, as its
first step, skip over whitespace anyhow when looking for the start of
the token.

Although such laxity is harmless for the proper functioning of the
lexical analyzer, it does make it difficult to precisely identify the
token's end position in the input stream. Accurate token position
information may be desirable, for instance, to annotate problems or
highlight other interesting facets of the input found during the parsing
phase. To accommodate such possibilities, tighten scan_token() by making
it push the token-ending whitespace character back onto the input
stream, just as it does for other token-ending characters.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 15:10:49 -05:00
Eric Sunshine
c90d81f8bb chainlint: add explanatory comments
The logic in TestParser::accumulate() for detecting broken &&-chains is
mostly well-commented, but a couple branches which were deemed obvious
and straightforward lack comments. In retrospect, though, these cases
may give future readers pause, so comment them, as well.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 15:10:49 -05:00
Ævar Arnfjörð Bjarmason
d50d8485ef submodule tests: test for a "foreach" blind-spot
We tested for "--" followed by command names, but not for "--"
followed by an argument that looks like an option, let's do that.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
Ævar Arnfjörð Bjarmason
435285bd82 submodule--helper: fix a memory leak in "status"
The "status" sub-command was leaking the "struct strvec" it was
setting up for the reasons explained in f92dbdbc6a (revisions API:
don't leak memory on argv elements that need free()-ing, 2022-08-02),
so let's use the "free_removed_argv_elements" option to
setup_revisions() to fix the leak.

Even if we did that, clobbering the "diff_files_args.nr" with the
return value of setup_revisions() would leave leaks in place, but we
can just stop clobbering it.

Ever since that code was added in a9f8a37584 (submodule: port
submodule subcommand 'status' from shell to C, 2017-10-06) we've had
no reason to modify the "nr" member ("argc" at the time): The next use
of "diff_files_args" after this is the "strvec_clear()" at the end of
the function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
Ævar Arnfjörð Bjarmason
44874cbd19 submodule tests: add tests for top-level flag output
Exhaustively test for how combining various "mixed-level" "git
submodule" option works. "Mixed-level" here means options that are
accepted by a mixture of the top-level "submodule" command, and
e.g. the "status" sub-command.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
Ævar Arnfjörð Bjarmason
cc74a4ac72 submodule--helper: move "config" to a test-tool
As with other moves to "test-tool" in f322e9f51b (Merge branch
'ab/submodule-helper-prep', 2022-09-13) the "config" sub-command was
only used by our own tests.

It was last used by "git submodule" itself in code that went away with
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10).

Let's move it over, and while doing so make it easier to reason about
by splitting up the various uses for it into separate sub-commands, so
that we don't need to count arguments to see what it does.

This also has the advantage that we stop wasting future translator
time on this command, currently the usage information for this
internal-only tool has been translated into several languages. The use
of the "_" function has also been removed from the "please make
sure..." message.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-08 14:55:30 -05:00
Ævar Arnfjörð Bjarmason
dc1cf3580e Makefile & test-tool: replace "DC_SHA1" variable with a "define"
Address the root cause of technical debt we've been carrying since
sha1collisiondetection was made the default in [1]. In a preceding
commit we narrowly fixed a bug where the "DC_SHA1" variable would be
unset (in combination with "NO_APPLE_COMMON_CRYPTO=" on OSX), even
though we had the sha1collisiondetection library enabled.

But the only reason we needed to have such a user-exposed knob went
away with [1], and it's been doing nothing useful since then. We don't
care if you define DC_SHA1=*, we only care that you don't ask for any
other SHA-1 implementation. If it turns out that you didn't, we'll use
sha1collisiondetection, whether you had "DC_SHA1" set or not.

As a result of this being confusing we had e.g. [2] for cmake and the
recent [3] for ci/lib.sh setting "DC_SHA1" explicitly, even though
this was always a NOOP.

A much simpler way to do this is to stop having the Makefile and
CMakeLists.txt set "DC_SHA1" to be picked up by the test-lib.sh, let's
instead add a trivial "test-tool sha1-is-sha1dc". It returns zero if
we're using sha1collisiondetection, non-zero otherwise.

1. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
2. c4b2f41b5f (cmake: support for testing git with ctest, 2020-06-26)
3. 1ad5c3df35 (ci: use DC_SHA1=YesPlease on osx-clang job for CI,
   2022-10-20)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 22:11:51 -05:00
Victoria Dye
44da9e0841 rebase --update-refs: avoid unintended ref deletion
In b3b1a21d1a (sequencer: rewrite update-refs as user edits todo list,
2022-07-19), the 'todo_list_filter_update_refs()' step was added to handle
the removal of 'update-ref' lines from a 'rebase-todo'. Specifically, it
removes potential ref updates from the "update refs state" if a ref does not
have a corresponding 'update-ref' line.

However, because 'write_update_refs_state()' will not update the state if
the 'refs_to_oids' list was empty, removing *all* 'update-ref' lines will
result in the state remaining unchanged from how it was initialized (with
all refs' "after" OID being null). Then, when the ref update is applied, all
refs will be updated to null and consequently deleted.

To fix this, delete the 'update-refs' state file when 'refs_to_oids' is
empty. Additionally, add a tests covering "all update-ref lines removed"
cases.

Reported-by: herr.kaste <herr.kaste@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 14:16:45 -05:00
Johannes Schindelin
c90db53d20 scalar reconfigure -a: remove stale scalar.repo entries
Every once in a while, a Git for Windows installation fails because the
attempt to reconfigure a Scalar enlistment failed because it was deleted
manually without removing the corresponding entries in the global Git
config.

In f5f0842d0b (scalar: let 'unregister' handle a deleted enlistment
directory gracefully, 2021-12-03), we already taught `scalar delete` to
handle the case of a manually deleted enlistment gracefully. This patch
adds the same graceful handling to `scalar reconfigure --all`.

This patch is best viewed with `--color-moved`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-07 13:57:13 -05:00
Debra Obondo
7cccf5b6c9 t7001-mv.sh: modernizing test script using functions
Test script to verify the presence/absence of files, paths, directories,
symlinks and other features in 'git mv' command are using the command
format:

'test (-e|f|d|h|...)'

Replace them with helper functions of format:

'test_path_is_*'

Replacing idiomatic helper functions:

'! test_path_is_*'

with

'test_path_is_missing'

This uses values of 'test_path_bar' in place of '! test_path_foo' to
bring in the helpful factor of indicating the failure of tests after the
mv command has been used, that is, it echoes if the feature/test_path
exists.

Signed-off-by: Debra Obondo <debraobondo@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-04 17:58:23 -04:00
Jeff King
8e1c5fcf28 ref-filter: fix parsing of signatures with CRLF and no body
This commit fixes a bug when parsing tags that have CRLF line endings, a
signature, and no body, like this (the "^M" are marking the CRs):

  this is the subject^M
  -----BEGIN PGP SIGNATURE-----^M
  ^M
  ...some stuff...^M
  -----END PGP SIGNATURE-----^M

When trying to find the start of the body, we look for a blank line
separating the subject and body. In this case, there isn't one. But we
search for it using strstr(), which will find the blank line in the
signature.

In the non-CRLF code path, we check whether the line we found is past
the start of the signature, and if so, put the body pointer at the start
of the signature (effectively making the body empty). But the CRLF code
path doesn't catch the same case, and we end up with the body pointer in
the middle of the signature field. This has two visible problems:

  - printing %(contents:subject) will show part of the signature, too,
    since the subject length is computed as (body - subject)

  - the length of the body is (sig - body), which makes it negative.
    Asking for %(contents:body) causes us to cast this to a very large
    size_t when we feed it to xmemdupz(), which then complains about
    trying to allocate too much memory.

These are essentially the same bugs fixed in the previous commit, except
that they happen when there is a CRLF blank line in the signature,
rather than no blank line at all. Both are caused by the refactoring in
9f75ce3d8f (ref-filter: handle CRLF at end-of-line more gracefully,
2020-10-29).

We can fix this by doing the same "sigstart" check that we do in the
non-CRLF case. And rather than repeat ourselves, we can just use
short-circuiting OR to collapse both cases into a single conditional.
I.e., rather than:

  if (strstr("\n\n"))
    ...found blank, check if it's in signature...
  else if (strstr("\r\n\r\n"))
    ...found blank, check if it's in signature...
  else
    ...no blank line found...

we can collapse this to:

  if (strstr("\n\n")) ||
      strstr("\r\n\r\n")))
    ...found blank, check if it's in signature...
  else
    ...no blank line found...

The tests show the problem and the fix. Though it wasn't broken, I
included contents:signature here to make sure it still behaves as
expected, but note the shell hackery needed to make it work. A
less-clever option would be to skip using test_atom and just "append_cr
>expected" ourselves.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:36:04 -04:00
Jeff King
b01e1c7ef0 ref-filter: fix parsing of signatures without blank lines
When ref-filter is asked to show %(content:subject), etc, we end up in
find_subpos() to parse out the three major parts: the subject, the body,
and the signature (if any).

When searching for the blank line between the subject and body, if we
don't find anything, we try to treat the whole message as the subject,
with no body. But our idea of "the whole message" needs to take into
account the signature, too. Since 9f75ce3d8f (ref-filter: handle CRLF at
end-of-line more gracefully, 2020-10-29), the code instead goes all the
way to the end of the buffer, which produces confusing output.

Here's an example. If we have a tag message like this:

  this is the subject
  -----BEGIN SSH SIGNATURE-----
  ...some stuff...
  -----END SSH SIGNATURE-----

then the current parser will put the start of the body at the end of the
whole buffer. This produces two buggy outcomes:

  - since the subject length is computed as (body - subject), showing
    %(contents:subject) will print both the subject and the signature,
    rather than just the single line

  - since the body length is computed as (sig - body), and the body now
    starts _after_ the signature, we end up with a negative length!
    Fortunately we never access out-of-bounds memory, because the
    negative length is fed to xmemdupz(), which casts it to a size_t,
    and xmalloc() bails trying to allocate an absurdly large value.

    In theory it would be possible for somebody making a malicious tag
    to wrap it around to a more reasonable value, but it would require a
    tag on the order of 2^63 bytes. And even if they did, all they get
    is an out of bounds string read. So the security implications are
    probably not interesting.

We can fix both by correctly putting the start of the body at the same
index as the start of the signature (effectively making the body empty).

Note that this is a real issue with signatures generated with gpg.format
set to "ssh", which would look like the example above. In the new tests
here I use a hard-coded tag message, for a few reasons:

  - regardless of what the ssh-signing code produces now or in the
    future, we should be testing this particular case

  - skipping the actual signature makes the tests simpler to write (and
    allows them to run on more systems)

  - t6300 has helpers for working with gpg signatures; for the purposes
    of this bug, "BEGIN PGP" is just as good a demonstration, and this
    simplifies the tests

Curiously, the same issue doesn't happen with real gpg signatures (and
there are even existing tests in t6300 with cover this). Those have a
blank line between the header and the content, like:

  this is the subject
  -----BEGIN PGP SIGNATURE-----

  ...some stuff...
  -----END PGP SIGNATURE-----

Because we search for the subject/body separator line with a strstr(),
we find the blank line in the signature, even though it's outside of
what we'd consider the body. But that puts us unto a separate code path,
which realizes that we're now in the signature and adjusts the line back
to "sigstart". So this patch is basically just making the "no line found
at all" case match that. And note that "sigstart" is always defined (if
there is no signature, it points to the end of the buffer as you'd
expect).

Reported-by: Martin Englund <martin@englund.nu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-02 21:36:04 -04:00
Johannes Schindelin
db8016b43f t5516/t5601: be less strict about the number of credential warnings
It is unclear as to _why_, but under certain circumstances the warning
about credentials being passed as part of the URL seems to be swallowed
by the `git remote-https` helper in the Windows jobs of Git's CI builds.

Since it is not actually important how many times Git prints the
warning/error message, as long as it prints it at least once, let's just
make the test a bit more lenient and test for the latter instead of the
former, which works around these CI issues.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-01 16:35:05 -04:00
Jeff King
762521e8a5 t5516: move plaintext-password tests from t5601 and t5516
Commit 6dcbdc0d66 (remote: create fetch.credentialsInUrl config,
2022-06-06) added tests for our handling of passwords in URLs. Since the
obvious URL to be affected is git-over-http, the tests use http. However
they don't set up a test server; they just try to access
https://localhost, assuming it will fail (because the nothing is
listening there).

This causes some possible problems:

  - There might be a web server running on localhost, and we do not
    actually want to connect to that.

  - The DNS resolver, or the local firewall, might take a substantial
    amount of time (or forever, whichever comes first) to fail to
    connect, slowing down the tests cases unnecessarily.

  - Since there's no server, our tests for "allow" and "warn" still
    expect the clone/fetch/push operations to fail, even though in the
    real world we'd expect these to succeed. We scrape stderr to see
    what happened, but it's not as robust as a more realistic test.

Let's instead move these to t5551, which is all about testing http and
where we have a real server. That eliminates any issues with contacting
a strange URL, and lets the "allow" and "warn" tests confirm that the
operation actually succeeds.

It's not quite a verbatim move for a few reasons:

  - we can drop the LIBCURL dependency; it's already part of
    lib-httpd.sh

  - we'll use HTTPD_URL_USER_PASS, etc, instead of our fake URL. To
    avoid repetition, we'll add a few extra variables.

  - the "https://username:@localhost" test uses a funny URL that
    lib-httpd.sh doesn't provide. We'll similarly construct it in a
    variable. Note that we're hard-coding the lib-httpd username here,
    but t5551 already does that everywhere.

  - for the "domain:port" test, the URL provided by lib-httpd is fine,
    since our test server will always be on an exotic port. But we'll
    confirm in the test that this is so.

  - since our message-matching is done via grep, I simplified it to use
    a regex, rather than trying to massage lib-httpd's variables.
    Arguably this makes it more readable, too, while retaining the bits
    we care about: the fatal/warning distinction, the "uses plaintext"
    message, and the fact that the password was redacted.

  - we'll use the /auth/ path for the repo, which shows that we are
    indeed making use of the auth information when needed.

  - we'll also use /smart/; most of these tests could be done via /dumb/
    in t5550, but setting up pushes there requires extra effort and
    dependencies. The smart protocol is what most everyone is using
    these days anyway.

This patch is my own, but I stole the analysis and a few bits of the
commit message from a patch by Johannes Schindelin.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-11-01 16:35:05 -04:00
Martin Ågren
cc8f95c042 test-lib-functions: drop redundant diagnostic print
`test_path_is_missing` was introduced back in 2caf20c52b ("test-lib:
user-friendly alternatives to test [-d|-f|-e]", 2010-08-10). It took the
path that was supposed to be missing, as well as an optional "diagnosis"
that would be echoed if the path was found to be alive.

Commit 45a2686441 ("test-lib-functions: remove bug-inducing
"diagnostics" helper param", 2021-02-12) dropped this diagnostic
functionality from several `test_path_is_foo` helpers, but note how it
tweaked the README entry on `test_path_is_missing` without actually
adjusting its implementation.

Commit e7884b353b ("test-lib-functions: assert correct parameter count",
2021-02-12) then followed up by asserting that we get just a single
argument.

This history leaves us in a state where we assert that we have exactly
one argument, then go on to anyway check for arguments, echoing them
all. It's clear that we can simplify this code. We should also note that
we run `ls -ld "$1"`, so printing the filename a second time doesn't
really buy us anything. Thus, we can drop the whole `if` block as
redundant.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 21:12:09 -04:00
Ævar Arnfjörð Bjarmason
fe004a4333 run-command tests: test stdout of run_command_parallel()
Extend the tests added in c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) to test stdout in
addition to stderr.

When the "ungroup" feature was added in fd3aaf53f7 (run-command: add
an "ungroup" option to run_process_parallel(), 2022-06-07) its tests
were made to test both the stdout and stderr, but these existing tests
were left alone. Let's also exhaustively test our expected output
here.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 00:16:37 -04:00
Ævar Arnfjörð Bjarmason
ac48da5a92 submodule tests: reset "trace.out" between "grep" invocations
Fix test patterns added in 62104ba14a (submodules: allow parallel
fetching, add tests and documentation, 2015-12-15) and
a028a1930c (fetching submodules: respect `submodule.fetchJobs` config
option, 2016-02-29).

In the former case we were leaving a trace.out file at the top-level
for any subsequent tests (there are none, currently). Let's clean the
file up instead.

In the latter case we were testing that a given configuration would
result in "N tasks" in the log, but we were grepping through the log
for all previous such tests, when we really meant to clear the logs
between the "grep" invocations.

In practice this resulted in no logic error, as e.g. "--fetch 7" would
not print out a "9 tasks" line, but let's be paranoid and stop
implicitly assuming that that's the case.

This change was originally left out of 51243f9f0f (run-command API:
don't fall back on online_cpus(), 2022-10-12), which added the
">trace.out" seen at the end of the context.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 00:16:37 -04:00
Ævar Arnfjörð Bjarmason
035cccf46e hook tests: fix redirection logic error in 96e7225b31
The tests added in 96e7225b31 (hook: add 'run' subcommand,
2021-12-22) were redirecting to "actual" both in the body of the hook
itself and in the testing code below.

The net result was that the "2>>actual" redirection later in the test
wasn't doing anything. Let's have those redirection do what it looks
like they're doing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-31 00:16:37 -04:00
Taylor Blau
b1e3dd68ee Merge branch 'en/merge-tree-sequence'
"git merge-tree --stdin" is a new way to request a series of merges
and report the merge results.

* en/merge-tree-sequence:
  merge-tree: support multiple batched merges with --stdin
  merge-tree: update documentation for differences in -z output
2022-10-30 21:04:44 -04:00
Taylor Blau
d32dd8add5 Merge branch 'ds/bundle-uri-3'
Define the logical elements of a "bundle list", data structure to
store them in-core, format to transfer them, and code to parse
them.

* ds/bundle-uri-3:
  bundle-uri: suppress stderr from remote-https
  bundle-uri: quiet failed unbundlings
  bundle: add flags to verify_bundle()
  bundle-uri: fetch a list of bundles
  bundle: properly clear all revision flags
  bundle-uri: limit recursion depth for bundle lists
  bundle-uri: parse bundle list in config format
  bundle-uri: unit test "key=value" parsing
  bundle-uri: create "key=value" line parsing
  bundle-uri: create base key-value pair parsing
  bundle-uri: create bundle_list struct and helpers
  bundle-uri: use plain string in find_temp_filename()
2022-10-30 21:04:44 -04:00
Taylor Blau
bf0d9d0d34 Merge branch 'rj/branch-do-not-exit-with-minus-one-status'
"git branch --edit-description" can exit with status -1 which is
not a good practice; it learned to use 1 as everybody else instead.

* rj/branch-do-not-exit-with-minus-one-status:
  branch: error code with --edit-description
2022-10-30 21:04:43 -04:00
Taylor Blau
0c025612d4 Merge branch 'rj/branch-copy-rename-error-codepath-cleanup'
Code simplification.

* rj/branch-copy-rename-error-codepath-cleanup:
  branch: error copying or renaming a detached HEAD
2022-10-30 21:04:43 -04:00
Taylor Blau
c41ec63ef5 Merge branch 'tb/cap-patch-at-1gb'
"git apply" limits its input to a bit less than 1 GiB.

* tb/cap-patch-at-1gb:
  apply: reject patches larger than ~1 GiB
2022-10-30 21:04:43 -04:00
Taylor Blau
969230b64f Merge branch 'en/ort-dir-rename-and-symlink-fix'
Merging a branch with directory renames into a branch that changes
the directory to a symlink was mishandled by the ort merge
strategy, which has been corrected.

* en/ort-dir-rename-and-symlink-fix:
  merge-ort: fix bug with dir rename vs change dir to symlink
2022-10-30 21:04:43 -04:00
Taylor Blau
a23e0b69e2 Merge branch 'pb/subtree-split-and-merge-after-squashing-tag-fix'
A bugfix to "git subtree" in its split and merge features.

* pb/subtree-split-and-merge-after-squashing-tag-fix:
  subtree: fix split after annotated tag was squashed merged
  subtree: fix squash merging after annotated tag was squashed merged
  subtree: process 'git-subtree-split' trailer in separate function
  subtree: use named variables instead of "$@" in cmd_pull
  subtree: define a variable before its first use in 'find_latest_squash'
  subtree: prefix die messages with 'fatal'
  subtree: add 'die_incompatible_opt' function to reduce duplication
  subtree: use 'git rev-parse --verify [--quiet]' for better error messages
  test-lib-functions: mark 'test_commit' variables as 'local'
2022-10-30 21:04:43 -04:00
Taylor Blau
8851c4b065 Merge branch 'pw/rebase-reflog-fixes'
Fix some bugs in the reflog messages when rebasing and changes the
reflog messages of "rebase --apply" to match "rebase --merge" with
the aim of making the reflog easier to parse.

* pw/rebase-reflog-fixes:
  rebase: cleanup action handling
  rebase --abort: improve reflog message
  rebase --apply: make reflog messages match rebase --merge
  rebase --apply: respect GIT_REFLOG_ACTION
  rebase --merge: fix reflog message after skipping
  rebase --merge: fix reflog when continuing
  t3406: rework rebase reflog tests
  rebase --apply: remove duplicated code
2022-10-30 21:04:43 -04:00
Taylor Blau
003f815dd9 Merge branch 'pw/rebase-keep-base-fixes'
"git rebase --keep-base" used to discard the commits that are
already cherry-picked to the upstream, even when "keep-base" meant
that the base, on top of which the history is being rebuilt, does
not yet include these cherry-picked commits.  The --keep-base
option now implies --reapply-cherry-picks and --no-fork-point
options.

* pw/rebase-keep-base-fixes:
  rebase --keep-base: imply --no-fork-point
  rebase --keep-base: imply --reapply-cherry-picks
  rebase: factor out branch_base calculation
  rebase: rename merge_base to branch_base
  rebase: store orig_head as a commit
  rebase: be stricter when reading state files containing oids
  t3416: set $EDITOR in subshell
  t3416: tighten two tests
2022-10-30 21:04:42 -04:00
Taylor Blau
e5be3c632a Merge branch 'jh/trace2-timers-and-counters'
Two new facilities, "timer" and "counter", are introduced to the
trace2 API.

* jh/trace2-timers-and-counters:
  trace2: add global counter mechanism
  trace2: add stopwatch timers
  trace2: convert ctx.thread_name from strbuf to pointer
  trace2: improve thread-name documentation in the thread-context
  trace2: rename the thread_name argument to trace2_thread_start
  api-trace2.txt: elminate section describing the public trace2 API
  tr2tls: clarify TLS terminology
  trace2: use size_t alloc,nr_open_regions in tr2tls_thread_ctx
2022-10-30 21:04:42 -04:00
Taylor Blau
c112d8d9c2 Merge branch 'tb/shortlog-group'
"git shortlog" learned to group by the "format" string.

* tb/shortlog-group:
  shortlog: implement `--group=committer` in terms of `--group=<format>`
  shortlog: implement `--group=author` in terms of `--group=<format>`
  shortlog: extract `shortlog_finish_setup()`
  shortlog: support arbitrary commit format `--group`s
  shortlog: extract `--group` fragment for translation
  shortlog: make trailer insertion a noop when appropriate
  shortlog: accept `--date`-related options
2022-10-30 21:04:42 -04:00