Commit Graph

20093 Commits

Author SHA1 Message Date
Johannes Altmanninger
3e367a5f2f sequencer: avoid dropping fixup commit that targets self via commit-ish
Commit 68d5d03bc4 (rebase: teach --autosquash to match on sha1 in
addition to message, 2010-11-04) taught autosquash to recognize
subjects like "fixup! 7a235b" where 7a235b is an OID-prefix. It
actually did more than advertised: 7a235b can be an arbitrary
commit-ish (as long as it's not trailed by spaces).

Accidental(?) use of this secret feature revealed a bug where we
would silently drop a fixup commit. The bug can also be triggered
when using an OID-prefix but that's unlikely in practice.

Let the commit with subject "fixup! main" be the tip of the "main"
branch. When computing the fixup target for this commit, we find
the commit itself. This is wrong because, by definition, a fixup
target must be an earlier commit in the todo list. We wrongly find
the current commit because we added it to the todo list prematurely.
Avoid these fixup-cycles by only adding the current commit to the
todo list after we have finished looking for the fixup target.

Reported-by: Erik Cervin Edin <erik@cervined.in>
Signed-off-by: Johannes Altmanninger <aclopte@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-26 10:11:57 -07:00
Jeff King
5a97b38109 remote: handle rename of remote without fetch refspec
We return an error when trying to rename a remote that has no fetch
refspec:

  $ git config --unset-all remote.origin.fetch
  $ git remote rename origin foo
  fatal: could not unset 'remote.foo.fetch'

To make things even more confusing, we actually _do_ complete the config
modification, via git_config_rename_section(). After that we try to
rewrite the fetch refspec (to say refs/remotes/foo instead of origin).
But our call to git_config_set_multivar() to remove the existing entries
fails, since there aren't any, and it calls die().

We could fix this by using the "gently" form of the config call, and
checking the error code. But there is an even simpler fix: if we know
that there are no refspecs to rewrite, then we can skip that part
entirely.

Reported-by: John A. Leuenhagen <john@zlima12.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 12:59:52 -07:00
Jeff King
3b910d6e29 clone: allow "--bare" with "-o"
We explicitly forbid the combination of "--bare" with "-o", but there
doesn't seem to be any good reason to do so. The original logic came as
part of e6489a1bdf (clone: do not accept more than one -o option.,
2006-01-22), but that commit does not give any reason.

Furthermore, the equivalent combination via config is allowed:

  git -c clone.defaultRemoteName=foo clone ...

and works as expected. It may be that this combination was considered
useless, because a bare clone does not set remote.origin.fetch (and
hence there is no refs/remotes/origin hierarchy). But it does set
remote.origin.url, and that name is visible to the user via "git fetch
origin", etc.

Let's allow the options to be used together, and switch the "forbid"
test in t5606 to check that we use the requested name. That test came
much later in 349cff76de (clone: add tests for --template and some
disallowed option pairs, 2020-09-29), and does not offer any logic
beyond "let's test what the code currently does".

Reported-by: John A. Leuenhagen <john@zlima12.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-22 12:57:03 -07:00
Junio C Hamano
17df9d3849 Merge branch 'sg/clean-test-results'
"make clean" stopped cleaning the test results directory as a side
effect of a topic that has nothing to do with "make clean", which
has been corrected.

* sg/clean-test-results:
  t/Makefile: remove 'test-results' on 'make clean'
2022-09-21 15:27:02 -07:00
Junio C Hamano
86c108a8a2 Merge branch 'vd/scalar-generalize-diagnose'
Portability fix.

* vd/scalar-generalize-diagnose:
  builtin/diagnose.c: don't translate the two mode values
  diagnose.c: refactor to safely use 'd_type'
2022-09-21 15:27:01 -07:00
Junio C Hamano
dd37e5607f Merge branch 'fz/help-doublofix'
Typofix for topic already in -rc0.

* fz/help-doublofix:
  help: fix doubled words in explanation for developer interfaces
2022-09-21 14:23:14 -07:00
SZEDER Gábor
d11b875197 t/Makefile: remove 'test-results' on 'make clean'
The 't/test-results' directory and its contents are by-products of the
test process, so 'make clean' should remove them, but, alas, this has
been broken since fee65b194d (t/Makefile: don't remove test-results in
"clean-except-prove-cache", 2022-07-28).

The 'clean' target in 't/Makefile' was not directly responsible for
removing the 'test-results' directory, but relied on its dependency
'clean-except-prove-cache' to do that [1].  ee65b194d broke this,
because it only removed the 'rm -r test-results' command from the
'clean-except-prove-cache' target instead of moving it to the 'clean'
target, resulting in stray 't/test-results' directories.

Add that missing cleanup command to 't/Makefile', and to all
sub-Makefiles touched by that commit as well.

[1] 60f26f6348 (t/Makefile: retain cache t/.prove across prove runs,
                2012-05-02)

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-21 11:32:13 -07:00
Junio C Hamano
279ebd4761 Merge branch 'ad/t1800-cygwin'
Test fix.

* ad/t1800-cygwin:
  t1800: correct test to handle Cygwin
2022-09-19 14:35:25 -07:00
Junio C Hamano
42bf77c7d0 Merge branch 'vd/scalar-to-main'
Hoist the remainder of "scalar" out of contrib/ to the main part of
the codebase.

* vd/scalar-to-main:
  Documentation/technical: include Scalar technical doc
  t/perf: add 'GIT_PERF_USE_SCALAR' run option
  t/perf: add Scalar performance tests
  scalar-clone: add test coverage
  scalar: add to 'git help -a' command list
  scalar: implement the `help` subcommand
  git help: special-case `scalar`
  scalar: include in standard Git build & installation
  scalar: fix command documentation section header
2022-09-19 14:35:25 -07:00
Junio C Hamano
9d58241ee4 Merge branch 'es/chainlint'
Revamp chainlint script for our tests.

* es/chainlint:
  chainlint: colorize problem annotations and test delimiters
  t: retire unused chainlint.sed
  t/Makefile: teach `make test` and `make prove` to run chainlint.pl
  test-lib: replace chainlint.sed with chainlint.pl
  test-lib: retire "lint harder" optimization hack
  t/chainlint: add more chainlint.pl self-tests
  chainlint.pl: allow `|| echo` to signal failure upstream of a pipe
  chainlint.pl: complain about loops lacking explicit failure handling
  chainlint.pl: don't flag broken &&-chain if failure indicated explicitly
  chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly
  chainlint.pl: don't require `&` background command to end with `&&`
  t/Makefile: apply chainlint.pl to existing self-tests
  chainlint.pl: don't require `return|exit|continue` to end with `&&`
  chainlint.pl: validate test scripts in parallel
  chainlint.pl: add parser to identify test definitions
  chainlint.pl: add parser to validate tests
  chainlint.pl: add POSIX shell parser
  chainlint.pl: add POSIX shell lexical analyzer
  t: add skeleton chainlint.pl
2022-09-19 14:35:24 -07:00
Junio C Hamano
339517b035 Merge branch 'sy/mv-out-of-cone'
"git mv A B" in a sparsely populated working tree can be asked to
move a path from a directory that is "in cone" to another directory
that is "out of cone".  Handling of such a case has been improved.

* sy/mv-out-of-cone:
  builtin/mv.c: fix possible segfault in add_slash()
  mv: check overwrite for in-to-out move
  advice.h: add advise_on_moving_dirty_path()
  mv: cleanup empty WORKING_DIRECTORY
  mv: from in-cone to out-of-cone
  mv: remove BOTH from enum update_mode
  mv: check if <destination> is a SKIP_WORKTREE_DIR
  mv: free the with_slash in check_dir_in_index()
  mv: rename check_dir_in_index() to empty_dir_has_sparse_contents()
  t7002: add tests for moving from in-cone to out-of-cone
2022-09-19 14:35:23 -07:00
Victoria Dye
cb98e1d50a diagnose.c: refactor to safely use 'd_type'
Refactor usage of the 'd_type' property of 'struct dirent' in 'diagnose.c'
to instead utilize the compatibility macro 'DTYPE()'. On systems where
'd_type' is not present in 'struct dirent', this macro will always return
'DT_UNKNOWN'. In that case, instead fall back on using the 'stat.st_mode' to
determine whether the dirent points to a dir, file, or link.

Additionally, add a test to 't0092-diagnose.sh' to verify that files (e.g.,
loose objects) are counted properly.

Note that the new function 'get_dtype()' is based on 'resolve_dtype()' in
'dir.c' (which itself was refactored from a prior 'get_dtype()' in
ad6f2157f9 (dir: restructure in a way to avoid passing around a struct
dirent, 2020-01-16)), but differs in that it is meant for use on arbitrary
files, such as those inside the '.git' dir. Because of this, it does not
search the index for a matching entry to derive the 'd_type'.

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-19 10:25:01 -07:00
Fangyi Zhou
225e815ef2 help: fix doubled words in explanation for developer interfaces
Signed-off-by: Fangyi Zhou <me@fangyi.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-16 09:20:11 -07:00
Junio C Hamano
ca20a44bc5 Merge branch 'jk/proto-v2-ref-prefix-fix'
"git fetch" over protocol v2 sent an incorrect ref prefix request
to the server and made "git pull" with configured fetch refspec
that does not cover the remote branch to merge with fail, which has
been corrected.

* jk/proto-v2-ref-prefix-fix:
  fetch: add branch.*.merge to default ref-prefix extension
  fetch: stop checking for NULL transport->remote in do_fetch()
2022-09-15 16:09:47 -07:00
Junio C Hamano
d878d83ff0 Merge branch 'en/remerge-diff-fixes'
Fix a few "git log --remerge-diff" bugs.

* en/remerge-diff-fixes:
  diff: fix filtering of merge commits under --remerge-diff
  diff: fix filtering of additional headers under --remerge-diff
  diff: have submodule_format logic avoid additional diff headers
2022-09-15 16:09:46 -07:00
Adam Dinwoodie
255a6f91ae t1800: correct test to handle Cygwin
On Cygwin, when failing to spawn a process using start_command, Git
outputs the same error as on Linux systems, rather than using the
GIT_WINDOWS_NATIVE-specific error output.  The WINDOWS test prerequisite
is set in both Cygwin and native Windows environments, which means it's
not appropriate to use to anticipate the error output from
start_command.  Instead, use the MINGW test prerequisite, which is only
set for Git in native Windows environments, and not for Cygwin.

Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-15 10:29:51 -07:00
Junio C Hamano
b563638d2c Merge branch 'ab/submodule-helper-leakfix'
Plugging leaks in submodule--helper.

* ab/submodule-helper-leakfix:
  submodule--helper: fix a configure_added_submodule() leak
  submodule--helper: free rest of "displaypath" in "struct update_data"
  submodule--helper: free some "displaypath" in "struct update_data"
  submodule--helper: fix a memory leak in print_status()
  submodule--helper: fix a leak in module_add()
  submodule--helper: fix obscure leak in module_add()
  submodule--helper: fix "reference" leak
  submodule--helper: fix a memory leak in get_default_remote_submodule()
  submodule--helper: fix a leak with repo_clear()
  submodule--helper: fix "sm_path" and other "module_cb_list" leaks
  submodule--helper: fix "errmsg_str" memory leak
  submodule--helper: add and use *_release() functions
  submodule--helper: don't leak {run,capture}_command() cp.dir argument
  submodule--helper: "struct pathspec" memory leak in module_update()
  submodule--helper: fix most "struct pathspec" memory leaks
  submodule--helper: fix trivial get_default_remote_submodule() leak
  submodule--helper: fix a leak in "clone_submodule"
2022-09-14 12:56:40 -07:00
Junio C Hamano
dd407f1c7c Merge branch 'ab/unused-annotation'
Undoes 'jk/unused-annotation' topic and redoes it to work around
Coccinelle rules misfiring false positives in unrelated codepaths.

* ab/unused-annotation:
  git-compat-util.h: use "deprecated" for UNUSED variables
  git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14 12:56:39 -07:00
Junio C Hamano
a6b42ec0c6 Merge branch 'jk/unused-annotation'
Annotate function parameters that are not used (but cannot be
removed for structural reasons), to prepare us to later compile
with -Wunused warning turned on.

* jk/unused-annotation:
  is_path_owned_by_current_uid(): mark "report" parameter as unused
  run-command: mark unused async callback parameters
  mark unused read_tree_recursive() callback parameters
  hashmap: mark unused callback parameters
  config: mark unused callback parameters
  streaming: mark unused virtual method parameters
  transport: mark bundle transport_options as unused
  refs: mark unused virtual method parameters
  refs: mark unused reflog callback parameters
  refs: mark unused each_ref_fn parameters
  git-compat-util: add UNUSED macro
2022-09-14 12:56:39 -07:00
Junio C Hamano
2c75b3255b Merge branch 'en/merge-unstash-only-on-clean-merge' into maint
The auto-stashed local changes created by "git merge --autostash"
was mixed into a conflicted state left in the working tree, which
has been corrected.

* en/merge-unstash-only-on-clean-merge:
  merge: only apply autostash when appropriate
2022-09-13 12:21:11 -07:00
Junio C Hamano
aa31cb8974 Merge branch 'jk/pipe-command-nonblock' into maint
Fix deadlocks between main Git process and subprocess spawned via
the pipe_command() API, that can kill "git add -p" that was
reimplemented in C recently.

* jk/pipe-command-nonblock:
  pipe_command(): mark stdin descriptor as non-blocking
  pipe_command(): handle ENOSPC when writing to a pipe
  pipe_command(): avoid xwrite() for writing to pipe
  git-compat-util: make MAX_IO_SIZE define globally available
  nonblock: support Windows
  compat: add function to enable nonblocking pipes
2022-09-13 12:21:08 -07:00
Junio C Hamano
72869e750b Merge branch 'jk/is-promisor-object-keep-tree-in-use' into maint
An earlier optimization discarded a tree-object buffer that is
still in use, which has been corrected.

* jk/is-promisor-object-keep-tree-in-use:
  is_promisor_object(): fix use-after-free of tree buffer
2022-09-13 12:21:07 -07:00
Junio C Hamano
de1fee2f1e Merge branch 'ow/rev-parse-parseopt-fix'
The parser in the script interface to parse-options in "git
rev-parse" has been updated to diagnose a bogus input correctly.

* ow/rev-parse-parseopt-fix:
  rev-parse --parseopt: detect missing opt-spec
2022-09-13 11:38:25 -07:00
Junio C Hamano
e4ffba458f Merge branch 'js/builtin-add-p-portability-fix'
More fixes to "add -p"

* js/builtin-add-p-portability-fix:
  t6132(NO_PERL): do not run the scripted `add -p`
  t3701: test the built-in `add -i` regardless of NO_PERL
  add -p: avoid ambiguous signed/unsigned comparison
2022-09-13 11:38:24 -07:00
Junio C Hamano
76ffa818c7 Merge branch 'sg/parse-options-subcommand'
The codepath for the OPT_SUBCOMMAND facility has been cleaned up.

* sg/parse-options-subcommand:
  notes, remote: show unknown subcommands between `'
  notes: simplify default operation mode arguments check
  test-parse-options.c: fix style of comparison with zero
  test-parse-options.c: don't use for loop initial declaration
  t0040-parse-options: remove leftover debugging
2022-09-13 11:38:24 -07:00
Junio C Hamano
655e494047 Merge branch 'jk/rev-list-verify-objects-fix'
"git rev-list --verify-objects" ought to inspect the contents of
objects and notice corrupted ones, but it didn't when the commit
graph is in use, which has been corrected.

* jk/rev-list-verify-objects-fix:
  rev-list: disable commit graph with --verify-objects
  lookup_commit_in_graph(): use prepare_commit_graph() to check for graph
2022-09-13 11:38:24 -07:00
Junio C Hamano
8b2f027e20 Merge branch 'jk/upload-pack-skip-hash-check'
The server side that responds to "git fetch" and "git clone"
request has been optimized by allowing it to send objects in its
object store without recomputing and validating the object names.

* jk/upload-pack-skip-hash-check:
  t1060: check partial clone of misnamed blob
  parse_object(): check commit-graph when skip_hash set
  upload-pack: skip parse-object re-hashing of "want" objects
  parse_object(): allow skipping hash check
2022-09-13 11:38:23 -07:00
Junio C Hamano
f322e9f51b Merge branch 'ab/submodule-helper-prep'
Code clean-up of "git submodule--helper".

* ab/submodule-helper-prep: (33 commits)
  submodule--helper: fix bad config API usage
  submodule--helper: libify even more "die" paths for module_update()
  submodule--helper: libify more "die" paths for module_update()
  submodule--helper: check repo{_submodule,}_init() return values
  submodule--helper: libify "must_die_on_failure" code paths (for die)
  submodule--helper update: don't override 'checkout' exit code
  submodule--helper: libify "must_die_on_failure" code paths
  submodule--helper: libify determine_submodule_update_strategy()
  submodule--helper: don't exit() on failure, return
  submodule--helper: use "code" in run_update_command()
  submodule API: don't handle SM_..{UNSPECIFIED,COMMAND} in to_string()
  submodule--helper: don't call submodule_strategy_to_string() in BUG()
  submodule--helper: add missing braces to "else" arm
  submodule--helper: return "ret", not "1" from update_submodule()
  submodule--helper: rename "int res" to "int ret"
  submodule--helper: don't redundantly check "else if (res)"
  submodule--helper: refactor "errmsg_str" to be a "struct strbuf"
  submodule--helper: add "const" to passed "struct update_data"
  submodule--helper: add "const" to copy of "update_data"
  submodule--helper: add "const" to passed "module_clone_data"
  ...
2022-09-13 11:38:23 -07:00
Eric Sunshine
7c04aa7390 chainlint: colorize problem annotations and test delimiters
When `chainlint.pl` detects problems in a test definition, it emits the
test definition with "?!FOO?!" annotations highlighting the problems it
discovered. For instance, given this problematic test:

    test_expect_success 'discombobulate frobnitz' '
        git frob babble &&
        (echo balderdash; echo gnabgib) >expect &&
        for i in three two one
        do
            git nitfol $i
        done >actual
        test_cmp expect actual
    '

chainlint.pl will output:

    # chainlint: t1234-confusing.sh
    # chainlint: discombobulate frobnitz
    git frob babble &&
    (echo balderdash ; ?!AMP?! echo gnabgib) >expect &&
    for i in three two one
    do
    git nitfol $i ?!LOOP?!
    done >actual ?!AMP?!
    test_cmp expect actual

in which it may be difficult to spot the "?!FOO?!" annotations. The
problem is compounded when multiple tests, possibly in multiple
scripts, fail "linting", in which case it may be difficult to spot the
"# chainlint:" lines which delimit one problematic test from another.

To ameliorate this potential problem, colorize the "?!FOO?!" annotations
in order to quickly draw the test author's attention to the problem
spots, and colorize the "# chainlint:" lines to help the author identify
the name of each script and each problematic test.

Colorization is disabled automatically if output is not directed to a
terminal or if NO_COLOR environment variable is set. The implementation
is specific to Unix (it employs `tput` if available) but works equally
well in the Git for Windows development environment which emulates Unix
sufficiently.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-12 21:33:58 -07:00
Junio C Hamano
fe3939bc2a Merge branch 'vd/sparse-reset-checkout-fixes'
Segfault fix-up to an earlier fix to the topic to teach "git reset"
and "git checkout" work better in a sparse checkout.

* vd/sparse-reset-checkout-fixes:
  unpack-trees: fix sparse directory recursion check
2022-09-09 12:02:26 -07:00
Junio C Hamano
0e2a4764ed Merge branch 'jc/format-patch-force-in-body-from'
"git format-patch --from=<ident>" can be told to add an in-body
"From:" line even for commits that are authored by the given
<ident> with "--force-in-body-from"option.

* jc/format-patch-force-in-body-from:
  format-patch: learn format.forceInBodyFrom configuration variable
  format-patch: allow forcing the use of in-body From: header
  pretty: separate out the logic to decide the use of in-body from
2022-09-09 12:02:25 -07:00
Junio C Hamano
428dce9f4d Merge branch 'js/range-diff-with-pathspec'
Allow passing a pathspec to "git range-diff".

* js/range-diff-with-pathspec:
  range-diff: optionally accept pathspecs
  range-diff: consistently validate the arguments
  range-diff: reorder argument handling
2022-09-09 12:02:25 -07:00
Junio C Hamano
fb094cb583 Merge branch 'js/add-p-diff-parsing-fix'
Those who use diff-so-fancy as the diff-filter noticed a regression
or two in the code that parses the diff output in the built-in
version of "add -p", which has been corrected.

* js/add-p-diff-parsing-fix:
  add -p: ignore dirty submodules
  add -p: gracefully handle unparseable hunk headers in colored diffs
  add -p: detect more mismatches between plain vs colored diffs
2022-09-09 12:02:24 -07:00
Øystein Walle
f20b9c36d0 rev-parse --parseopt: detect missing opt-spec
After 2d893dff4c (rev-parse --parseopt: allow [*=?!] in argument hints,
2015-07-14) updated the parser, a line in parseopts's input can start
with one of the flag characters and be erroneously parsed as a opt-spec
where the short name of the option is the flag character itself and the
long name is after the end of the string. This makes Git want to
allocate SIZE_MAX bytes of memory at this line:

    o->long_name = xmemdupz(sb.buf + 2, s - sb.buf - 2);

Since s and sb.buf are equal the second argument is -2 (except unsigned)
and xmemdupz allocates len + 1 bytes, ie. -1 meaning SIZE_MAX.

Avoid this by checking whether a flag character was found in the zeroth
position.

Reported-by: Ingy dot Net <ingy@ingy.net>
Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 14:55:07 -07:00
Jeff King
49ca2fba39 fetch: add branch.*.merge to default ref-prefix extension
When running "git pull" with no arguments, we'll do a default "git
fetch" and then try to merge the branch specified by the branch.*.merge
config. There's code in get_ref_map() to treat that "merge" branch as
something we want to fetch, even if it is not otherwise covered by the
default refspec.

This works fine with the v0 protocol, as the server tells us about all
of the refs, and get_ref_map() is the ultimate decider of what we fetch.

But in the v2 protocol, we send the ref-prefix extension to the server,
asking it to limit the ref advertisement. And we only tell it about the
default refspec for the remote; we don't mention the branch.*.merge
config at all.

This usually doesn't matter, because the default refspec matches
"refs/heads/*", which covers all branches. But if you explicitly use a
narrow refspec, then "git pull" on some branches may fail. The server
doesn't advertise the branch, so we don't fetch it, and "git pull"
thinks that it went away upstream.

We can fix this by including any branch.*.merge entries for the current
branch in the list of ref-prefixes we pass to the server. This only
needs to happen when using the default configured refspec (since
command-line refspecs are already added, and take precedence in deciding
what we fetch). We don't otherwise need to replicate any of the "what to
fetch" logic in get_ref_map(). These ref-prefixes are an optimization,
so it's OK if we tell the server to advertise the branch.*.merge ref,
even if we're not going to pull it. We'll just choose not to fetch it.

The test here is based on one constructed by Johannes. I modified the
branch names to trigger the ref-prefix issue (and be more descriptive),
and to confirm that "git pull" actually updated the local ref, which
should be more robust than just checking stderr.

Reported-by: Lana Deere <lana.deere@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-08 13:10:37 -07:00
Jeff King
945ed00957 t1060: check partial clone of misnamed blob
A recent commit (upload-pack: skip parse-object re-hashing of "want"
objects, 2022-09-06) loosened the behavior of upload-pack so that it
does not verify the sha1 of objects it receives directly via "want"
requests. The existing corruption tests in t1060 aren't affected by
this: the corruptions are blobs reachable from commits, and the client
requests the commits.

The more interesting case here is a partial clone, where the client will
directly ask for the corrupted blob when it does an on-demand fetch of
the filtered object. And that is not covered at all, so let's add a
test.

It's important here that we use the "misnamed" corruption and not
"bit-error". The latter is sufficiently corrupted that upload-pack
cannot even figure out the type of the object, so it bails identically
both before and after the recent change. But with "misnamed", with the
hash-checks enabled it sees the problem (though the error messages are a
bit confusing because of the inability to create a "struct object" to
store the flags):

  error: hash mismatch d95f3ad14dee633a758d2e331151e950dd13e4ed
  fatal: git upload-pack: not our ref d95f3ad14dee633a758d2e331151e950dd13e4ed
  fatal: remote error: upload-pack: not our ref d95f3ad14dee633a758d2e331151e950dd13e4ed

After the change to skip the hash check, the server side happily sends
the bogus object, but the client correctly realizes that it did not get
the necessary data:

  remote: Enumerating objects: 1, done.
  remote: Counting objects: 100% (1/1), done.
  remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 0
  Receiving objects: 100% (1/1), 49 bytes | 49.00 KiB/s, done.
  fatal: bad revision 'd95f3ad14dee633a758d2e331151e950dd13e4ed'
  error: [...]/misnamed did not send all necessary objects

which is exactly what we expect to happen.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 15:08:51 -07:00
Jeff King
0bc2557951 upload-pack: skip parse-object re-hashing of "want" objects
Imagine we have a history with commit C pointing to a large blob B.
If a client asks us for C, we can generally serve both objects to them
without accessing the uncompressed contents of B. In upload-pack, we
figure out which commits we have and what the client has, and feed those
tips to pack-objects. In pack-objects, we traverse the commits and trees
(or use bitmaps!) to find the set of objects needed, but we never open
up B. When we serve it to the client, we can often pass the compressed
bytes directly from the on-disk packfile over the wire.

But if a client asks us directly for B, perhaps because they are doing
an on-demand fetch to fill in the missing blob of a partial clone, we
end up much slower. Upload-pack calls parse_object() on the oid we
receive, which opens up the object and re-checks its hash (even though
if it were a commit, we might skip this parse entirely in favor of the
commit graph!). And then we feed the oid directly to pack-objects, which
again calls parse_object() and opens the object. And then finally, when
we write out the result, we may send bytes straight from disk, but only
after having unnecessarily uncompressed and computed the sha1 of the
object twice!

This patch teaches both code paths to use the new SKIP_HASH_CHECK flag
for parse_object(). You can see the speed-up in p5600, which does a
blob:none clone followed by a checkout. The savings for git.git are
modest:

  Test                          HEAD^             HEAD
  ----------------------------------------------------------------------
  5600.3: checkout of result    2.23(4.19+0.24)   1.72(3.79+0.18) -22.9%

But the savings scale with the number of bytes. So on a repository like
linux.git with more files, we see more improvement (in both absolute and
relative numbers):

  Test                          HEAD^                HEAD
  ----------------------------------------------------------------------------
  5600.3: checkout of result    51.62(77.26+2.76)    34.86(61.41+2.63) -32.5%

And here's an even more extreme case. This is the android gradle-plugin
repository, whose tip checkout has ~3.7GB of files:

  Test                          HEAD^               HEAD
  --------------------------------------------------------------------------
  5600.3: checkout of result    79.51(90.84+5.55)   40.28(51.88+5.67) -49.3%

Keep in mind that these timings are of the whole checkout operation. So
they count the client indexing the pack and actually writing out the
files. If we want to see just the server's view, we can hack up the
GIT_TRACE_PACKET output from those operations and replay it via
upload-pack. For the gradle example, that gives me:

  Benchmark 1: GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input
    Time (mean ± σ):     50.884 s ±  0.239 s    [User: 51.450 s, System: 1.726 s]
    Range (min … max):   50.608 s … 51.025 s    3 runs

  Benchmark 2: GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input
    Time (mean ± σ):      9.728 s ±  0.112 s    [User: 10.466 s, System: 1.535 s]
    Range (min … max):    9.618 s …  9.842 s    3 runs

  Summary
    'GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input' ran
      5.23 ± 0.07 times faster than 'GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input'

So a server would see an 80% reduction in CPU serving the initial
checkout of a partial clone for this repository. Or possibly even more
depending on the packing; most of the time spent in the faster one were
objects we had to open during the write phase.

In both cases skipping the extra hashing on the server should be pretty
safe. The client doesn't trust the server anyway, so it will re-hash all
of the objects via index-pack. There is one thing to note, though: the
change in get_reference() affects not just pack-objects, but rev-list,
git-log, etc. We could use a flag to limit to index-pack here, but we
may already skip hash checks in this instance. For commits, we'd skip
anything we load via the commit-graph. And while before this commit we
would check a blob fed directly to rev-list on the command-line, we'd
skip checking that same blob if we found it by traversing a tree.

The exception for both is if --verify-objects is used. In that case,
we'll skip this optimization, and the new test makes sure we do this
correctly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:20:02 -07:00
SZEDER Gábor
45bec2ead2 test-parse-options.c: fix style of comparison with zero
The preferred style is '!argc' instead of 'argc == 0'.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:06:12 -07:00
SZEDER Gábor
6983f4e3b2 test-parse-options.c: don't use for loop initial declaration
We would like to eventually use for loop initial declarations in our
codebase, but we are not there yet.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:06:12 -07:00
SZEDER Gábor
9a22b4d907 t0040-parse-options: remove leftover debugging
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 12:06:12 -07:00
Jeff King
b27ccae34b rev-list: disable commit graph with --verify-objects
Since the point of --verify-objects is to actually load and checksum the
bytes of each object, optimizing out reads using the commit graph runs
contrary to our goal.

The most targeted way to implement this would be for the revision
traversal code to check revs->verify_objects and avoid using the commit
graph. But it's difficult to be sure we've hit all of the correct spots.
For instance, I started this patch by writing the first of the included
test cases, where the corrupted commit is directly on rev-list's command
line. And that is easy to fix by teaching get_reference() to check
revs->verify_objects before calling lookup_commit_in_graph().

But that doesn't cover the second test case: when we traverse to a
corrupted commit, we'd parse the parent in process_parents(). So we'd
need to check there, too. And it keeps going. In handle_commit() we
sometimes parses commits, too, though I couldn't figure out a way to
trigger it that did not already parse via get_reference() or tag
peeling. And try_to_simplify_commit() has its own parse call, and so on.

So it seems like the safest thing is to just disable the commit graph
for the whole process when we see the --verify-objects option. We can do
that either in builtin/rev-list.c, where we use the option, or in
revision.c, where we parse it. There are some subtleties:

  - putting it in rev-list.c is less surprising in some ways, because
    there we know we are just doing a single traversal. In a command
    which does multiple traversals in a single process, it's rather
    unexpected to globally disable the commit graph.

  - putting it in revision.c is less surprising in some ways, because
    the caller does not have to remember to disable the graph
    themselves. But this is already tricky! The verify_objects flag in
    rev_info doesn't do anything by itself. The caller has to provide an
    object callback which does the right thing.

  - for that reason, in practice nobody but rev-list uses this option in
    the first place. So the distinction is probably not important either
    way. Arguably it should just be an option of rev-list, and not the
    general revision machinery; right now you can run "git log
    --verify-objects", but it does not actually do anything useful.

  - checking for a parsed revs.verify_objects flag in rev-list.c is too
    late. By that time we've already passed the arguments to
    setup_revisions(), which will have parsed the commits using the
    graph.

So this commit disables the graph as soon as we see the option in
revision.c. That's a pretty broad hammer, but it does what we want, and
in practice nobody but rev-list is using this flag anyway.

The tests cover both the "tip" and "parent" cases. Obviously our hammer
hits them both in this case, but it's good to check both in case
somebody later tries the more focused approach.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-07 09:44:30 -07:00
Junio C Hamano
27fb520ef2 Merge branch 'jk/test-crontab-fixes'
Test helper fix.

* jk/test-crontab-fixes:
  test-crontab: minor memory and error handling fixes
2022-09-05 18:33:41 -07:00
Junio C Hamano
fcbc8743ef Merge branch 'en/test-without-test-create-repo'
Test clean-up.

* en/test-without-test-create-repo:
  t64xx: convert 'test_create_repo' to 'git init'
2022-09-05 18:33:41 -07:00
Junio C Hamano
56785a3fad Merge branch 'bc/gc-crontab-fix'
FreeBSD portability fix for "git maintenance" that spawns "crontab"
to schedule tasks.

* bc/gc-crontab-fix:
  gc: use temporary file for editing crontab
2022-09-05 18:33:41 -07:00
Junio C Hamano
2d88021919 Merge branch 'es/t4301-sed-portability-fix'
Test clean-up.

* es/t4301-sed-portability-fix:
  t4301: emit blank line in more idiomatic fashion
  t4301: fix broken &&-chains and add missing loop termination
  t4301: account for behavior differences between sed implementations
2022-09-05 18:33:40 -07:00
Junio C Hamano
5784d201da Merge branch 'rs/test-mergesort'
Optimization of a test-helper command.

* rs/test-mergesort:
  test-mergesort: use mem_pool for sort input
  test-mergesort: read sort input all at once
2022-09-05 18:33:40 -07:00
Junio C Hamano
3fe0121479 Merge branch 'ac/bitmap-lookup-table'
The pack bitmap file gained a bitmap-lookup table to speed up
locating the necessary bitmap for a given commit.

* ac/bitmap-lookup-table:
  pack-bitmap-write: drop unused pack_idx_entry parameters
  bitmap-lookup-table: add performance tests for lookup table
  pack-bitmap: prepare to read lookup table extension
  pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests
  pack-bitmap-write.c: write lookup table extension
  bitmap: move `get commit positions` code to `bitmap_writer_finish`
  Documentation/technical: describe bitmap lookup table extension
2022-09-05 18:33:39 -07:00
Junio C Hamano
cf98b69053 Merge branch 'tb/midx-with-changing-preferred-pack-fix'
Multi-pack index got corrupted when preferred pack changed from one
pack to another in a certain way, which has been corrected.

* tb/midx-with-changing-preferred-pack-fix:
  midx.c: avoid adding preferred objects twice
  midx.c: include preferred pack correctly with existing MIDX
  midx.c: extract `midx_fanout_add_pack_fanout()`
  midx.c: extract `midx_fanout_add_midx_fanout()`
  midx.c: extract `struct midx_fanout`
  t/lib-bitmap.sh: avoid silencing stderr
  t5326: demonstrate potential bitmap corruption
2022-09-05 18:33:39 -07:00
Victoria Dye
ba1b117eec t/perf: add 'GIT_PERF_USE_SCALAR' run option
Add a 'GIT_PERF_USE_SCALAR' environment variable (and corresponding perf
config 'useScalar') to register a repository created with any of:

* test_perf_fresh_repo
* test_perf_default_repo
* test_perf_large_repo

as a Scalar enlistment. This is intended to allow a developer to test the
impact of Scalar on already-defined performance scenarios.

Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
Victoria Dye
e2809233d1 t/perf: add Scalar performance tests
Create 'p9210-scalar.sh' for testing Scalar performance and comparing
performance of Git operations in Scalar registrations and standard
repositories. Example results:

Test                                                   this tree
------------------------------------------------------------------------
9210.2: scalar clone                                   14.82(18.00+3.63)
9210.3: git clone                                      26.15(36.67+6.90)
9210.4: git status (scalar)                            0.04(0.01+0.01)
9210.5: git status (non-scalar)                        0.10(0.02+0.11)
9210.6: test_commit --append --no-tag A (scalar)       0.08(0.02+0.03)
9210.7: test_commit --append --no-tag A (non-scalar)   0.13(0.03+0.11)

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
Victoria Dye
14b4e7e5a4 scalar-clone: add test coverage
Create a new test file ('t9211-scalar-clone.sh') to exercise the options and
behavior of the 'scalar clone' command. Each test clones to a unique target
location and cleans up the cloned repo only when the test passes. This
ensures that failed tests' artifacts are captured in CI artifacts for
further debugging.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:56 -07:00
Victoria Dye
7b5c93c6c6 scalar: include in standard Git build & installation
Move 'scalar' out of 'contrib/' and into the root of the Git tree. The goal
of this change is to build 'scalar' as part of the standard Git build &
install processes.

This patch includes both the physical move of Scalar's files out of
'contrib/' ('scalar.c', 'scalar.txt', and 't9xxx-scalar.sh'), and the
changes to the build definitions in 'Makefile' and 'CMakelists.txt' to
accommodate the new program.

At a high level, Scalar is built so that:
- there is a 'scalar-objs' target (similar to those created in 029bac01a8
  (Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets,
  2021-02-23)) for debugging purposes.
- it appears in the root of the install directory (rather than the
  gitexecdir).
- it is included in the 'bin-wrappers/' directory for use in tests.
- it receives a platform-specific executable suffix (e.g., '.exe'), if
  applicable.
- 'scalar.txt' is installed as 'man1' documentation.
- the 'clean' target removes the 'scalar' executable.

Additionally, update the root level '.gitignore' file to ignore the Scalar
executable.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 10:02:55 -07:00
Victoria Dye
037f8ea6d9 unpack-trees: fix sparse directory recursion check
Ensure 'is_sparse_directory_entry()' receives a valid 'name_entry *' if one
exists in the list of tree(s) being unpacked in 'unpack_callback()'.

Currently, 'is_sparse_directory_entry()' is called with the first
'name_entry' in the 'names' list of entries on 'unpack_callback()'. However,
this entry may be empty even when other elements of 'names' are not (such as
when switching from an orphan branch back to a "normal" branch). As a
result, 'is_sparse_directory_entry()' could incorrectly indicate that a
sparse directory is *not* actually sparse because the name of the index
entry does not match the (empty) 'name_entry' path.

Fix the issue by using the existing 'name_entry *p' value in
'unpack_callback()', which points to the first non-empty entry in 'names'.
Because 'p' is 'const', also update 'is_sparse_directory_entry()'s
'name_entry *' argument to be 'const'.

Finally, add a regression test case.

Reported-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:43:09 -07:00
Elijah Newren
67360b75c6 diff: fix filtering of merge commits under --remerge-diff
Commit 95433eeed9 ("diff: add ability to insert additional headers for
paths", 2022-02-02) introduced the possibility of additional headers.
Because there could be conflicts with no content differences (e.g. a
modify/delete conflict resolved in favor of taking the modified file
as-is), that commit also modified the diff_queue_is_empty() and
diff_flush_patch() logic to ensure these headers were included even if
there was no associated content diff.

However, the added logic was a bit inconsistent between these two
functions.  diff_queue_is_empty() overlooked the fact that the
additional headers strmap could be non-NULL and empty, which would cause
it to display commits that should have been filtered out.

Fix the diff_queue_is_empty() logic to also account for
additional_path_headers being empty.

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:22:25 -07:00
Elijah Newren
71a146dc70 diff: fix filtering of additional headers under --remerge-diff
Commit 95433eeed9 ("diff: add ability to insert additional headers for
paths", 2022-02-02) introduced the possibility of additional headers.
Because there could be conflicts with no content differences (e.g. a
modify/delete conflict resolved in favor of taking the modified file
as-is), that commit also modified the diff_queue_is_empty() and
diff_flush_patch() logic to ensure these headers were included even if
there was no associated content diff.

However, when the pickaxe is active, we really only want the remerge
conflict headers to be shown when there is an associated content diff.
Adjust the logic in these two functions accordingly.

This also removes the TEST_PASSES_SANITIZE_LEAK=true declaration from
t4069, as there is apparently some kind of memory leak with the pickaxe
code.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:22:25 -07:00
Elijah Newren
9b08091cb7 diff: have submodule_format logic avoid additional diff headers
Commit 95433eeed9 ("diff: add ability to insert additional headers for
paths", 2022-02-02) introduced the possibility of additional headers,
created in create_filepairs_for_header_only_notifications().  These are
represented by inserting additional pairs in diff_queued_diff which
always have a mode of 0 and a null_oid.  When these were added, one
code path was noted to assume that at least one of the diff_filespecs
in the pair were valid, and that codepath was corrected.

The submodule_format handling is another codepath with the same issue;
it would operate on these additional headers and attempt to display them
as submodule changes.  Prevent that by explicitly checking for "phoney"
filepairs (i.e. filepairs with both modes being 0).

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:22:25 -07:00
Ævar Arnfjörð Bjarmason
fe4c750fb1 submodule--helper: fix a configure_added_submodule() leak
Fix config API a memory leak added in a452128a36 (submodule--helper:
introduce add-config subcommand, 2021-08-06) by using the *_tmp()
variant of git_config_get_string().

In this case we're only checking whether
the (repo|git)_config_get_string() call is telling us that the
"submodule.active" key exists.

As with the preceding commit we'll find many other such patterns in
the codebase if we go fishing. E.g. "git gc" leaks in the code added
in 61f7a383d3 (maintenance: use 'incremental' strategy by default,
2020-10-15). Similar code in "git gc" added in
b08ff1fee0 (maintenance: add --schedule option and config,
2020-09-11) doesn't leak, but we could avoid the malloc() & free() in
that case.

A coccinelle rule to find those would find and fix some leaks, and
cases where we're doing needless malloc() + free()'s but only care
about the key existence, or are copying
the (repo|git)_config_get_string() return value right away.

But as with the preceding commit let's punt on all of that for now,
and just narrowly fix this specific case in submodule--helper.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
Ævar Arnfjörð Bjarmason
d40c42e06b submodule--helper: free some "displaypath" in "struct update_data"
Make the update_data_release() function free "displaypath" member when
appropriate. The "displaypath" member is always ours, the "const" on
the "char *" was wrong to begin with.

This leaves a leak of "displaypath" in update_submodule(), which as
we'll see in subsequent commits is harder to deal with than this
trivial fix.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:13 -07:00
Ævar Arnfjörð Bjarmason
980416e469 submodule--helper: fix "sm_path" and other "module_cb_list" leaks
Fix leaks in "struct module_cb_list" and the "struct module_cb" which
it contains, these fix leaks in e83e3333b5 (submodule: port submodule
subcommand 'summary' from shell to C, 2020-08-13).

The "sm_path" should always have been a "char *", not a "const
char *", we always create it with xstrdup().

We can't mark any tests passing passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true" as a result of this change, but
"t7401-submodule-summary.sh" gets closer to passing as a result of
this change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
Ævar Arnfjörð Bjarmason
87a683482a submodule--helper: add and use *_release() functions
Add release functions for "struct module_list", "struct
submodule_update_clone" and "struct update_data".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
Ævar Arnfjörð Bjarmason
e77b3da6bb submodule--helper: fix a leak in "clone_submodule"
Fix a memory leak of the "clone_data_path" variable that we copy or
derive from the "struct module_clone_data" in clone_submodule(). This
code was refactored in preceding commits, but the leak has been with
us since f8eaa0ba98 (submodule--helper, module_clone: always operate
on absolute paths, 2016-03-31).

For the "else" case we don't need to xstrdup() the "clone_data->path",
and we don't need to free our own "clone_data_path". We can therefore
assign the "clone_data->path" to our own "clone_data_path" right away,
and only override it (and remember to free it!) if we need to
xstrfmt() a replacement.

In the case of the module_clone() caller it's from "argv", and doesn't
need to be free'd, and in the case of the add_submodule() caller we
get a pointer to "sm_path", which doesn't need to be directly free'd
either.

Fixing this leak makes several tests pass, so let's mark them as
passing with TEST_PASSES_SANITIZE_LEAK=true.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:18:12 -07:00
Ævar Arnfjörð Bjarmason
0b917a9f5c submodule--helper: return "ret", not "1" from update_submodule()
Amend the update_submodule() function to return the failing "ret" on
error, instead of overriding it with "1".

This code was added in b3c5f5cb04 (submodule: move core cmd_update()
logic to C, 2022-03-15), and this change ends up not making a
difference as this function is only called in update_submodules(). If
we return non-zero here we'll always in turn return "1" in
module_update().

But if we didn't do that and returned any other non-zero exit code in
update_submodules() we'd fail the test that's being amended
here. We're still testing the status quo here.

This change makes subsequent refactoring of update_submodule() easier,
as we'll no longer need to worry about clobbering the "ret" we get
from the run_command().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:24 -07:00
Ævar Arnfjörð Bjarmason
96a28a9bc6 submodule--helper: move "resolve-relative-url-test" to a test-tool
As its name suggests the "resolve-relative-url-test" has never been
used outside of the test suite, see 63e95beb08 (submodule: port
resolve_relative_url from shell to C, 2016-04-15) for its original
addition.

Perhaps it would make sense to drop this code entirely, as we feel
that we've got enough indirect test coverage, but let's leave that
question to a possible follow-up change. For now let's keep the test
coverage this gives us.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason
85321a346b submodule--helper: move "check-name" to a test-tool
Move the "check-name" helper to a test-tool, since
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10) it has only been used by this test, not git-submodule.sh.

As noted with its introduction in 0383bbb901 (submodule-config:
verify submodule names as paths, 2018-04-30) the intent of
t7450-bad-git-dotfiles.sh has always been to unit test the
check_submodule_name() function.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason
9fb2a970e9 submodule--helper: move "is-active" to a test-tool
Create a new "test-tool submodule" and move the "is-active" subcommand
over to it. It was added in 5c2bd8b77a (submodule--helper: add
is-active subcommand, 2017-03-16), since
a452128a36 (submodule--helper: introduce add-config subcommand,
2021-08-06) it hasn't been used by git-submodule.sh.

Since we're creating a command dispatch similar to test-tool.c itself
let's split out the "struct test_cmd" into a new test-tool-utils.h,
which both this new code and test-tool.c itself can use.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason
255a1ae5da test-tool submodule-config: remove unused "--url" handling
No test has used this "--url" parameter since the test code that made
use of it was removed in 32bc548329 (submodule-config: remove support
for overlaying repository config, 2017-08-03).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason
31955475d1 submodule--helper: remove unused "list" helper
Remove the "submodule--helper list" sub-command, which hasn't been
used by git-submodule.sh since 2964d6e5e1 (submodule: port subcommand
'set-branch' from shell to C, 2020-06-02).

There was a test added in 2b56bb7a87 (submodule helper list: respect
correct path prefix, 2016-02-24) which relied on it, but the right
thing to do here is to delete that test as well.

That test was regression testing the "list" subcommand itself. We're
not getting anything useful from the "list | cut -f2" invocation that
we couldn't get from "foreach 'echo $sm_path'".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason
59378e3355 submodule tests: test for "add <repository> <abs-path>"
Add a missing test for ""add <repository> <path>" where "<path>" is an
absolute path. This tests code added in [1] and later turned into an
"else" branch in clone_submodule() in [2] that's never been tested.

This needs to be skipped on WINDOWS because all of $PWD, $(pwd) and
the "$(pwd -P)" we get via "$submodurl" would fail in CI with e.g.:

	fatal: could not create directory 'D:/a/git/git/t/trash
	directory.t7400-submodule-basic/.git/modules/D:/a/git/git/t/trash
	directory.t7400-submodule-basic/add-abs'

I.e. we can't handle these sorts of paths in this context on that
platform.

I'm not sure where we run into the edges of "$PWD" behavior on
Windows (see [1] for a previous loose end on the topic), but for the
purposes of this test it's sufficient that we test this on other
platforms.

1. ee8838d157 (submodule: rewrite `module_clone` shell function in C,
   2015-09-08)
2. f8eaa0ba98 (submodule--helper, module_clone: always operate on
   absolute paths, 2016-03-31)

1. https://lore.kernel.org/git/220630.86edz6c75c.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:22 -07:00
Ævar Arnfjörð Bjarmason
89bc7b5c01 submodule tests: test usage behavior
Test what exit code and output we emit on "git submodule -h", how we
handle "--" when no subcommand is specified, and how the top-level
"--recursive" option is handled.

For "-h" this doesn't make sense, but let's test for it so that any
subsequent eventual behavior change will become clear.

For "--" this follows up on 68cabbfda3 (submodule: document default
behavior, 2019-02-15) and tests that "status" doesn't support
the "--" delimiter. There's no intrinsically good reason not to
support that. We behave this way due to edge cases in
git-submodule.sh's implementation, but as with "-h" let's assert our
current long-standing behavior for now.

For "--recursive" the exclusion of it from the top-level appears to
have been an omission in 15fc56a853 (git submodule foreach: Add
--recursive to recurse into nested submodules, 2009-08-19), there
doesn't seem to be a reason not to support it alongside "--quiet" and
"--cached", but let's likewise assert our existing behavior for now.

I.e. as long as "status" is optional it would make sense to support
all of its options when it's omitted, but we only do that with
"--quiet" and "--cached", and curiously omit "--recursive".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-02 09:16:22 -07:00
Junio C Hamano
014a9ea207 Merge branch 'en/t4301-more-merge-tree-tests'
More tests to protect the current behaviour of "merge-tree" before
it gets further updated.

* en/t4301-more-merge-tree-tests:
  t4301: add more interesting merge-tree testcases
2022-09-01 13:40:19 -07:00
Junio C Hamano
3a4779086d Merge branch 'en/merge-unstash-only-on-clean-merge'
The auto-stashed local changes created by "git merge --autostash"
was mixed into a conflicted state left in the working tree, which
has been corrected.

* en/merge-unstash-only-on-clean-merge:
  merge: only apply autostash when appropriate
2022-09-01 13:40:18 -07:00
Junio C Hamano
d528044c83 Merge branch 'sg/parse-options-subcommand'
Introduce the "subcommand" mode to parse-options API and update the
command line parser of Git commands with subcommands.

* sg/parse-options-subcommand: (23 commits)
  remote: run "remote rm" argv through parse_options()
  maintenance: add parse-options boilerplate for subcommands
  pass subcommand "prefix" arguments to parse_options()
  builtin/worktree.c: let parse-options parse subcommands
  builtin/stash.c: let parse-options parse subcommands
  builtin/sparse-checkout.c: let parse-options parse subcommands
  builtin/remote.c: let parse-options parse subcommands
  builtin/reflog.c: let parse-options parse subcommands
  builtin/notes.c: let parse-options parse subcommands
  builtin/multi-pack-index.c: let parse-options parse subcommands
  builtin/hook.c: let parse-options parse subcommands
  builtin/gc.c: let parse-options parse 'git maintenance's subcommands
  builtin/commit-graph.c: let parse-options parse subcommands
  builtin/bundle.c: let parse-options parse subcommands
  parse-options: add support for parsing subcommands
  parse-options: drop leading space from '--git-completion-helper' output
  parse-options: clarify the limitations of PARSE_OPT_NODASH
  parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options
  api-parse-options.txt: fix description of OPT_CMDMODE
  t0040-parse-options: test parse_options() with various 'parse_opt_flags'
  ...
2022-09-01 13:40:18 -07:00
Junio C Hamano
68ef0425d9 Merge branch 'ds/bundle-uri-clone'
Implement "git clone --bundle-uri".

* ds/bundle-uri-clone:
  clone: warn on failure to repo_init()
  clone: --bundle-uri cannot be combined with --depth
  bundle-uri: add support for http(s):// and file://
  clone: add --bundle-uri option
  bundle-uri: create basic file-copy logic
  remote-curl: add 'get' capability
2022-09-01 13:40:17 -07:00
Ævar Arnfjörð Bjarmason
5cf88fd8b0 git-compat-util.h: use "UNUSED", not "UNUSED(var)"
As reported in [1] the "UNUSED(var)" macro introduced in
2174b8c75de (Merge branch 'jk/unused-annotation' into next,
2022-08-24) breaks coccinelle's parsing of our sources in files where
it occurs.

Let's instead partially go with the approach suggested in [2] of
making this not take an argument. As noted in [1] "coccinelle" will
ignore such tokens in argument lists that it doesn't know about, and
it's less of a surprise to syntax highlighters.

This undoes the "help us notice when a parameter marked as unused is
actually use" part of 9b24034754 (git-compat-util: add UNUSED macro,
2022-08-19), a subsequent commit will further tweak the macro to
implement a replacement for that functionality.

1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:49:48 -07:00
Eric Sunshine
fb41727b7e t: retire unused chainlint.sed
Retire chainlint.sed since it has been replaced by a more accurate and
functional &&-chain "linter", thus is no longer used.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
Eric Sunshine
69b9924b87 t/Makefile: teach make test and make prove to run chainlint.pl
Unlike chainlint.sed which "lints" a single test body at a time, thus is
invoked once per test, chainlint.pl can check all test bodies in all
test scripts with a single invocation. As such, it is akin to other bulk
"linters" run by the Makefile, such as `test-lint-shell-syntax`,
`test-lint-duplicates`, etc.

Therefore, teach `make test` and `make prove` to invoke chainlint.pl
along with the other bulk linters. Also, since the single chainlint.pl
invocation by `make test` or `make prove` has already checked all tests
in all scripts, instruct the individual test scripts not to run
chainlint.pl on themselves unnecessarily.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
Eric Sunshine
23a14f3016 test-lib: replace chainlint.sed with chainlint.pl
By automatically invoking chainlint.sed upon each test it runs,
`test_run_` in test-lib.sh ensures that broken &&-chains will be
detected early as tests are modified or new are tests created since it
is typical to run a test script manually (i.e. `./t1234-test-script.sh`)
during test development. Now that the implementation of chainlint.pl is
complete, modify test-lib.sh to invoke it automatically instead of
chainlint.sed each time a test script is run.

This change reduces the number of "linter" invocations from 26800+ (once
per test run) down to 1050+ (once per test script), however, a
subsequent change will drop the number of invocations to 1 per `make
test`, thus fully realizing the benefit of the new linter.

Note that the "magic exit code 117" &&-chain checker added by bb79af9d09
(t/test-lib: introduce --chain-lint option, 2015-03-20) which is built
into t/test-lib.sh is retained since it has near zero-cost and
(theoretically) may catch a broken &&-chain not caught by chainlint.pl.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
Eric Sunshine
9fd911237f test-lib: retire "lint harder" optimization hack
`test_run_` in test-lib.sh "lints" the body of a test by sending it down
a `sed chainlint.sed | grep` pipeline; this happens once for each test
run by a test script. Although this pipeline may seem relatively cheap
in isolation, it can become expensive when invoked 26800+ times by `make
test`, once for each test run, despite the existence of only 16500+ test
definitions across all tests scripts.

This difference in the number of tests defined in the scripts (16500+)
and the number of tests actually run by `make test` (26800+) is
explained by the fact that some test scripts run a very large number of
small tests, all driven by a series of functions/loops which fill in the
test bodies. This means that certain test definitions are being linted
repeatedly (tens or hundreds of times) unnecessarily. To avoid such
unnecessary work, 2d86a96220 (t: avoid sed-based chain-linting in some
expensive cases, 2021-05-13) added an optimization hack which allows
individual scripts to manually suppress the unnecessary repeated linting
of the same test definition.

However, unlike chainlint.sed which checks a test body as the test is
run, chainlint.pl checks each test definition just once, no matter how
many times the test is run, thus the sort of optimization hack
introduced by 2d86a96220 is no longer needed and can be retired.
Therefore, revert 2d86a96220.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
Eric Sunshine
56066523ed t/chainlint: add more chainlint.pl self-tests
During the development of chainlint.pl, numerous new self-tests were
created to verify correct functioning beyond the checks already
represented by the existing self-tests. The new checks fall into several
categories:

* behavior of the lexical analyzer for complex cases, such as line
  splicing, token pasting, entering and exiting string contexts inside
  and outside of test script bodies; for instance:

    test_expect_success 'title' '
      x=$(echo "something" |
        sed -e '\''s/\\/\\\\/g'\'' -e '\''s/[[/.*^$]/\\&/g'\''
    '

* behavior of the parser for all compound grammatical constructs, such
  as `if...fi`, `case...esac`, `while...done`, `{...}`, etc., and for
  other legal shell grammatical constructs not covered by existing
  chainlint.sed self-tests, as well as complex cases, such as:

    OUT=$( ((large_git 1>&3) | :) 3>&1 ) &&

* detection of problems, such as &&-chain breakage, from top-level to
  any depth since the existing self-tests do not cover any top-level
  context and only cover subshells one level deep due to limitations of
  chainlint.sed

* address blind spots in chainlint.sed (such as not detecting a broken
  &&-chain on a one-line for-loop in a subshell[1]) which chainlint.pl
  correctly detects

* real-world cases which tripped up chainlint.pl during its development

[1]: https://lore.kernel.org/git/dce35a47012fecc6edc11c68e91dbb485c5bc36f.1661663880.git.gitgitgadget@gmail.com/

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
Eric Sunshine
ae0c55abf8 chainlint.pl: allow || echo to signal failure upstream of a pipe
The use of `|| return` (or `|| exit`) to signal failure within a loop
isn't effective when the loop is upstream of a pipe since the pipe
swallows all upstream exit codes and returns only the exit code of the
final command in the pipeline.

To work around this limitation, tests may adopt an alternative strategy
of signaling failure by emitting text which would never be emitted in
the non-failing case. For instance:

    while condition
    do
        command1 &&
        command2 ||
        echo "impossible text"
    done |
    sort >actual &&

Such usage indicates deliberate thought about failure cases by the test
author, thus flagging them as missing `|| return` (or `|| exit`) is not
helpful. Therefore, take this case into consideration when checking for
explicit loop termination.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
Eric Sunshine
fd4094c3ca chainlint.pl: complain about loops lacking explicit failure handling
Shell `for` and `while` loops do not terminate automatically just
because a command fails within the loop body. Instead, the loop
continues to iterate and eventually returns the exit status of the final
command of the final iteration, which may not be the command which
failed, thus it is possible for failures to go undetected. Consequently,
it is important for test authors to explicitly handle failure within the
loop body by terminating the loop manually upon failure. This can be
done by returning a non-zero exit code from within the loop body
(i.e. `|| return 1`) or exiting (i.e. `|| exit 1`) if the loop is within
a subshell, or by manually checking `$?` and taking some appropriate
action. Therefore, add logic to detect and complain about loops which
lack explicit `return` or `exit`, or `$?` check.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
Eric Sunshine
832c68b3c2 chainlint.pl: don't flag broken &&-chain if failure indicated explicitly
There are quite a few tests which print an error messages and then
explicitly signal failure with `false`, `return 1`, or `exit 1` as the
final command in an `if` branch. In these cases, the tests don't bother
maintaining the &&-chain between `echo` and the explicit "test failed"
indicator. Since such constructs are manually signaling failure, their
&&-chain breakage is legitimate and safe -- both for the command
immediately preceding `false`, `return`, or `exit`, as well as for all
preceding commands in the `if` branch. Therefore, stop flagging &&-chain
breakage in these sorts of cases.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
Eric Sunshine
a8f30ee050 chainlint.pl: don't flag broken &&-chain if $? handled explicitly
There are cases in which tests capture and check a command's exit code
explicitly without employing test_expect_code(). They do so by
intentionally breaking the &&-chain since it would be impossible to
capture "$?" in the failing case if the `status=$?` assignment was part
of the &&-chain. Since such constructs are manually checking the exit
code, their &&-chain breakage is legitimate and safe, thus should not be
flagged. Therefore, stop flagging &&-chain breakage in such cases.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:41 -07:00
Eric Sunshine
aabc3258a1 chainlint.pl: don't require & background command to end with &&
The exit status of the `&` asynchronous operator which starts a command
in the background is unconditionally zero, and the few places in the
test scripts which launch commands asynchronously are not interested in
the exit status of the `&` operator (though they often capture the
background command's PID). As such, there is little value in complaining
about broken &&-chain for a command launched in the background, and
doing so would only make busy-work for test authors. Therefore, take
this special case into account when checking for &&-chain breakage.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
Eric Sunshine
d00113ec34 t/Makefile: apply chainlint.pl to existing self-tests
Now that chainlint.pl is functional, take advantage of the existing
chainlint self-tests to validate its operation. (While at it, stop
validating chainlint.sed against the self-tests since it will soon be
retired.)

Due to chainlint.sed implementation limitations leaking into the
self-test "expect" files, a few of them require minor adjustment to make
them compatible with chainlint.pl which does not share those
limitations.

First, because `sed` does not provide any sort of real recursion,
chainlint.sed only emulates recursion into subshells, and each level of
recursion leads to a multiplicative increase in complexity of the `sed`
rules. To avoid substantial complexity, chainlint.sed, therefore, only
emulates subshell recursion one level deep. Any subshell deeper than
that is passed through as-is, which means that &&-chains are not checked
in deeper subshells. chainlint.pl, on the other hand, employs a proper
recursive descent parser, thus checks subshells to any depth and
correctly flags broken &&-chains in deep subshells.

Second, due to sed's line-oriented nature, chainlint.sed, by necessity,
folds multi-line quoted strings into a single line. chainlint.pl, on the
other hand, employs a proper lexical analyzer which preserves quoted
strings as-is, including embedded newlines.

Furthermore, the output of chainlint.sed and chainlint.pl do not match
precisely in terms of whitespace. However, since the purpose of the
self-checks is to verify that the ?!AMP?! annotations are being
correctly added, minor whitespace differences are immaterial. For this
reason, rather than adjusting whitespace in all existing self-test
"expect" files to match the new linter's output, the `check-chainlint`
target ignores whitespace differences. Since `diff -w` is not POSIX,
`check-chainlint` attempts to employ `git diff -w`, and only falls back
to non-POSIX `diff -w` (and `-u`) if `git diff` is not available.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
Eric Sunshine
35ebb1e37b chainlint.pl: don't require return|exit|continue to end with &&
In order to check for &&-chain breakage, each time TestParser encounters
a new command, it checks whether the previous command ends with `&&`,
and -- with a couple exceptions -- signals breakage if it does not. The
first exception is that a command may validly end with `||`, which is
commonly employed as `command || return 1` at the very end of a loop
body to terminate the loop early. The second is that piping one
command's output with `|` to another command does not constitute a
&&-chain break (the exit status of the pipe is the exit status of the
final command in the pipe).

However, it turns out that there are a few additional cases found in the
wild in which it is likely safe for `&&` to be missing even when other
commands follow. For instance:

    while {condition-1}
    do
        test {condition-2} || return 1 # or `exit 1` within a subshell
        more-commands
    done

    while {condition-1}
    do
        test {condition-2} || continue
        more-commands
    done

Such cases indicate deliberate thought about failure modes by the test
author, thus flagging them as breaking the &&-chain is not helpful.
Therefore, take these special cases into consideration when checking for
&&-chain breakage.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
Eric Sunshine
29fb2ec384 chainlint.pl: validate test scripts in parallel
Although chainlint.pl has undergone a good deal of optimization during
its development -- increasing in speed significantly -- parsing and
validating 1050+ scripts and 16500+ tests via Perl is not exactly
instantaneous. However, perceived performance can be improved by taking
advantage of the fact that there is no interdependence between test
scripts or test definitions, thus parsing and validating can be done in
parallel. The number of available cores is determined automatically but
can be overridden via the --jobs option.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
Eric Sunshine
d99ebd6d2e chainlint.pl: add parser to identify test definitions
Finish fleshing out chainlint.pl by adding ScriptParser, a parser which
scans shell scripts for tests defined by test_expect_success() and
test_expect_failure(), plucks the test body from each definition, and
passes it to TestParser for validation. It recognizes test definitions
not only at the top-level of test scripts but also tests synthesized
within compound commands such as loops and function.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
Eric Sunshine
6d932e92fc chainlint.pl: add parser to validate tests
Continue fleshing out chainlint.pl by adding TestParser, a parser with
special knowledge about how Git tests should be written; for instance,
it knows that commands within a test body should be chained together
with `&&`. An upcoming parser which plucks test definitions from test
scripts will invoke TestParser for each test body it encounters.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
Eric Sunshine
6594554119 chainlint.pl: add POSIX shell parser
Continue fleshing out chainlint.pl by adding a general purpose recursive
descent parser for the POSIX shell command language. Although never
invoked directly, upcoming parser subclasses will extend its
functionality for specific purposes, such as plucking test definitions
from input scripts and applying domain-specific knowledge to perform
test validation.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
Eric Sunshine
7d4804731e chainlint.pl: add POSIX shell lexical analyzer
Begin fleshing out chainlint.pl by adding a lexical analyzer for the
POSIX shell command language. The sole entry point Lexer::scan_token()
returns the next token from the input. It will be called by the upcoming
shell language parser.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
Eric Sunshine
b4f25b07c7 t: add skeleton chainlint.pl
Although chainlint.sed usefully identifies broken &&-chains in tests, it
has several shortcomings which include:

  * only detects &&-chain breakage in subshells (one-level deep)

  * does not check for broken top-level &&-chains; that task is left to
    the "magic exit code 117" checker built into test-lib.sh, however,
    that detection does not extend to `{...}` blocks, `$(...)`
    expressions, or compound statements such as `if...fi`,
    `while...done`, `case...esac`

  * uses heuristics, which makes it (potentially) fallible and difficult
    to tweak to handle additional real-world cases

  * written in `sed` and employs advanced `sed` operators which are
    probably not well-known to many programmers, thus the pool of people
    who can maintain it is likely small

  * manually simulates recursion into subshells which makes it much more
    difficult to reason about than, say, a traditional top-down parser

  * checks each test as the test is run, which can get expensive for
    tests which are run repeatedly by functions or loops since their
    bodies will be checked over and over (tens or hundreds of times)
    unnecessarily

To address these shortcomings, begin implementing a more functional and
precise test linter which understands shell syntax and semantics rather
than employing heuristics, thus is able to recognize structural problems
with tests beyond broken &&-chains.

The new linter is written in Perl, thus should be more accessible to a
wider audience, and is structured as a traditional top-down parser which
makes it much easier to reason about, and allows it to inspect compound
statements within test bodies to any depth.

Furthermore, it can check all test definitions in the entire project in
a single invocation rather than having to be invoked once per test, and
each test definition is checked only once no matter how many times the
test is actually run.

At this stage, the new linter is just a skeleton containing boilerplate
which handles command-line options, collects and reports statistics, and
feeds its arguments -- paths of test scripts -- to a (presently)
do-nothing script parser for validation. Subsequent changes will flesh
out the functionality.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:07:40 -07:00
Johannes Schindelin
0a101676e5 add -p: ignore dirty submodules
Thanks to always running `diff-index` and `diff-files` with the
`--numstat` option (the latter with `--ignore-submodules=dirty`) before
even generating any real diff to parse, the Perl version of `git add -p`
simply ignored dirty submodules and does not even offer them up for
staging.

However, the built-in variant did not use that flag because it tries to
run only one `diff` command, skipping the unneeded
`diff-index`/`diff-files` invocation of the Perl variant and therefore
only faithfully recapitulates what the Perl code does once it _does_
generate and parse the real diff.

This causes a problem when running the built-in `add -p` with
`diff-so-fancy` because that diff colorizer always inserts an empty line
before the diff header to ensure that it produces 4 lines as expected by
`git add -p` (the equivalent of the non-colorized `diff`, `index`, `---`
and `+++` lines). But `git diff-files` does not produce any `index` line
for dirty submodules.

The underlying problem is not even the discrepancy in lines, but that
`git add -p` presents diffs for dirty submodules: there is nothing that
_can_ be staged for those.

Let's fix that bug, and teach the built-in `add -p` to ignore dirty
submodules, too. This _incidentally_ also fixes the `diff-so-fancy`
problem ;-)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 09:55:28 -07:00
Johannes Schindelin
fd3f7f619a add -p: gracefully handle unparseable hunk headers in colored diffs
In
https://lore.kernel.org/git/ecf6f5be-22ca-299f-a8f1-bda38e5ca246@gmail.com,
Phillipe Blain reported that the built-in `git add -p` command fails
when asked to use [`diff-so-fancy`][diff-so-fancy] to colorize the diff.

The reason is that this tool produces colored diffs with a hunk header
that does not contain any parseable `@@ ... @@` line range information,
and therefore we cannot detect any part in that header that comes after
the line range.

As proposed by Phillip Wood, let's take that for a clear indicator that
we should show the hunk headers verbatim. This is what the Perl version
of the interactive `add` command did, too.

[diff-so-fancy]: https://github.com/so-fancy/diff-so-fancy

Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 09:55:21 -07:00
Johannes Schindelin
b6633a0053 add -p: detect more mismatches between plain vs colored diffs
When parsing the colored version of a diff, the interactive `add`
command really relies on the colored version having the same number of
lines as the plain (uncolored) version. That is an invariant.

We already have code to verify correctly when the colored diff has less
lines than the plain diff. Modulo an off-by-one bug: If the last diff
line has no matching colored one, the code pretends to succeed, still.

To make matters worse, when we adjusted the test in 1e4ffc765d (t3701:
adjust difffilter test, 2020-01-14), we did not catch this because `add
-p` fails for a _different_ reason: it does not find any colored hunk
header that contains a parseable line range.

If we change the test case so that the line range _can_ be parsed, the
bug is exposed.

Let's address all of the above by

- fixing the off-by-one,

- adjusting the test case to allow `add -p` to parse the line range

- making the test case more stringent by verifying that the expected
  error message is shown

Also adjust a misleading code comment about the now-fixed code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 09:49:45 -07:00
Jeff King
0682bc43f5 test-crontab: minor memory and error handling fixes
Since ee69e7884e (gc: use temporary file for editing crontab,
2022-08-28), we now insist that "argc == 3" (and otherwise return an
error). Coverity notes that this causes some dead code:

    if (argc == 3)
          fclose(from);
    else
          fclose(to);

as we will never trigger the else. This also causes a memory leak, since
we'll never close "to".

Now that all paths require 2 arguments, we can just reorganize the
function to check argc up front, and tweak the cleanup to do the right
thing for all cases.

While we're here, we can also notice some minor problems:

  - we return a negative int via error() from what is essentially a
    main() function; we should return a positive non-zero value for
    error. Or better yet, we can just use usage(), which gives a better
    message.

  - while writing the usage message, we can note the one in the comment
    was made out of date by ee69e7884e. But it also had a typo already,
    calling the subcommand "cron" and not "crontab"

  - we didn't check for an error from fopen(), meaning we would segfault
    if the to-be-read file was missing. We can use xfopen() to catch
    this.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 14:31:37 -07:00
Johannes Schindelin
64ec8efb83 t6132(NO_PERL): do not run the scripted add -p
When using the non-built-in version of `git add -p` in a `NO_PERL`
build, we expect that invocation to fail.

However, when b02fdbc80a (pathspec: correct an empty string used as a
pathspec element, 2022-05-29) added a test case to t6132 to exercise
`git add -p`, it did not add appropriate prereqs (which admittedly did
not exist back then).

Let's specify the appropriate prereqs.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 10:40:48 -07:00
Johannes Schindelin
7524780255 t3701: test the built-in add -i regardless of NO_PERL
The built-in `git add --interactive` does not require Perl, therefore we
can safely run these tests even when building with `NO_PERL=LetsDoThat`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 10:40:46 -07:00
Junio C Hamano
3658170b92 Merge branch 'es/fix-chained-tests'
Fix broken "&&-" chains and failures in early iterations of a loop.

* es/fix-chained-tests:
  t5329: notice a failure within a loop
  t: detect and signal failure within loop
  t1092: fix buggy sparse "blame" test
  t2407: fix broken &&-chains in compound statement
2022-08-29 14:55:15 -07:00
Junio C Hamano
25402204fe Merge branch 'vd/fix-perf-tests'
Rather trivial perf-test code fixes.

* vd/fix-perf-tests:
  p0006: fix 'read-tree' argument ordering
  p0004: fix prereq declaration
2022-08-29 14:55:13 -07:00