git-commit-vandalism

Author	SHA1	Message	Date
Derrick Stolee	8628a842bd	bundle-uri: suppress stderr from remote-https When downloading bundles from a git-remote-https subprocess, the bundle URI logic wants to be opportunistic and download as much as possible and work with what did succeed. This is particularly important in the "any" mode, where any single bundle success will work. If the URI is not available, the git-remote-https process will die() with a "fatal:" error message, even though that error is not actually fatal to the super process. Since stderr is passed through, it looks like a fatal error to the user. Suppress stderr to avoid these errors from bubbling to the surface. The bundle URI API adds its own warning() messages on these failures. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-12 09:13:25 -07:00
Derrick Stolee	70334fc3eb	bundle-uri: quiet failed unbundlings When downloading a list of bundles in "all" mode, Git has no understanding of the dependencies between the bundles. Git attempts to unbundle the bundles in some order, but some may not pass the verify_bundle() step because of missing prerequisites. This is passed as error messages to the user, even when they eventually succeed in later attempts after their dependent bundles are unbundled. Add a new VERIFY_BUNDLE_QUIET flag to verify_bundle() that avoids the error messages from the missing prerequisite commits. The method still returns the number of missing prerequisit commits, allowing callers to unbundle() to notice that the bundle failed to apply. Use this flag in bundle-uri.c and test that the messages go away for 'git clone --bundle-uri' commands. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-12 09:13:25 -07:00
Derrick Stolee	c23f592117	bundle-uri: fetch a list of bundles When the content at a given bundle URI is not understood as a bundle (based on inspecting the initial content), then Git currently gives up and ignores that content. Independent bundle providers may want to split up the bundle content into multiple bundles, but still make them available from a single URI. Teach Git to attempt parsing the bundle URI content as a Git config file providing the key=value pairs for a bundle list. Git then looks at the mode of the list to see if ANY single bundle is sufficient or if ALL bundles are required. The content at the selected URIs are downloaded and the content is inspected again, creating a recursive process. To guard the recursion against malformed or malicious content, limit the recursion depth to a reasonable four for now. This can be converted to a configured value in the future if necessary. The value of four is twice as high as expected to be useful (a bundle list is unlikely to point to more bundle lists). To test this scenario, create an interesting bundle topology where three incremental bundles are built on top of a single full bundle. By using a merge commit, the two middle bundles are "independent" in that they do not require each other in order to unbundle themselves. They each only need the base bundle. The bundle containing the merge commit requires both of the middle bundles, though. This leads to some interesting decisions when unbundling, especially when we later implement heuristics that promote downloading bundles until the prerequisite commits are satisfied. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-12 09:13:25 -07:00
Derrick Stolee	738e5245fa	bundle-uri: parse bundle list in config format When a bundle provider wants to operate independently from a Git remote, they want to provide a single, consistent URI that users can use in their 'git clone --bundle-uri' commands. At this point, the Git client expects that URI to be a single bundle that can be unbundled and used to bootstrap the rest of the clone from the Git server. This single bundle cannot be re-used to assist with future incremental fetches. To allow for the incremental fetch case, teach Git to understand a bundle list that could be advertised at an independent bundle URI. Such a bundle list is likely to be inspected by human readers, even if only by the bundle provider creating the list. For this reason, we can take our expected "key=value" pairs and instead format them using Git config format. Create bundle_uri_parse_config_format() to parse a file in config format and convert that into a 'struct bundle_list' filled with its understanding of the contents. Be careful to use error_action CONFIG_ERROR_ERROR when calling git_config_from_file_with_options() because the default action for git_config_from_file() is to die() on a parsing error. The current warning isn't particularly helpful if it arises to a user, but it will be made more verbose at a higher layer later. Update 'test-tool bundle-uri' to take this config file format as input. It uses a filename instead of stdin because there is no existing way to parse a FILE pointer in the config machinery. Using git_config_from_mem() is overly complicated and more likely to introduce bugs than this simpler version. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-12 09:13:24 -07:00
Ævar Arnfjörð Bjarmason	d796cedbe8	bundle-uri: unit test "key=value" parsing Create a new 'test-tool bundle-uri' test helper. This helper will assist in testing logic deep in the bundle URI feature. This change introduces the 'parse-key-values' subcommand, which parses an input file as a list of lines. These are fed into bundle_uri_parse_line() to test how we construct a 'struct bundle_list' from that data. The list is then output to stdout as if the key-value pairs were a Git config file. We use an input file instead of stdin because of a future change to parse in config-file format that works better as an input file. Co-authored-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-12 09:13:24 -07:00
Junio C Hamano	b0416d8f4a	Merge branch 'jk/sequencer-missing-author-name-check' Typofix in code. * jk/sequencer-missing-author-name-check: sequencer: detect author name errors in read_author_script()	2022-10-11 10:36:12 -07:00
Junio C Hamano	601bb23876	Merge branch 'pw/mailinfo-b-fix' Fix a logic in "mailinfo -b" that miscomputed the length of a substring, which lead to an out-of-bounds access. * pw/mailinfo-b-fix: mailinfo -b: fix an out of bounds access	2022-10-11 10:36:11 -07:00
Junio C Hamano	654f5cedbc	Merge branch 'rs/test-httpd-in-C-locale' Force C locale while running tests around httpd to make sure we can find expected error messages in the log. * rs/test-httpd-in-C-locale: t/lib-httpd: pass LANG and LC_ALL to Apache	2022-10-11 10:36:11 -07:00
Ævar Arnfjörð Bjarmason	db84376f98	grep.c: remove "extended" in favor of "pattern_expression", fix segfault Since `79d3696cfb` (git-grep: boolean expression on pattern matching., 2006-06-30) the "pattern_expression" member has been used for complex queries (AND/OR...), with "pattern_list" being used for the simple OR queries. Since then we've used both "pattern_expression" and its associated boolean "extended" member to see if we have a complex expression. Since `f41fb662f5` (revisions API: have release_revisions() release "grep_filter", 2022-04-13) we've had a subtle bug relating to that: If we supplied options that were only used for "complex queries", but didn't supply the query itself we'd set "opt->extended", but would have a NULL "pattern_expression". As a result these would segfault as we tried to call "free_grep_patterns()" from "release_revisions()": git -P log -1 --invert-grep git -P log -1 --all-match The root cause of this is that we were conflating the state management we needed in "compile_grep_patterns()" itself with whether or not we had an "opt->pattern_expression" later on. In this cases as we're going through "compile_grep_patterns()" we have no "opt->pattern_list" but have "opt->no_body_match" or "opt->all_match". So we'd set "opt->extended = 1", but not "return" on "opt->extended" as that's an "else if" in the same "if" statement. That behavior is intentional and required, as the common case is that we have an "opt->pattern_list" that we're about to parse into the "opt->pattern_expression". But we don't need to keep track of this "extended" flag beyond the state management in compile_grep_patterns() itself. It needs it, but once we're out of that function we can rely on "opt->pattern_expression" being non-NULL instead for using these extended patterns. As `79d3696cfb` itself shows we've assumed that there's a one-to-one mapping between the two since the very beginning. I.e. "match_line()" would check "opt->extended" to see if it should call "match_expr()", and the first thing we do in that function is assume that we have a "opt->pattern_expression". We'd then call "match_expr_eval()", which would have died if that "opt->pattern_expression" was NULL. The "die" was added in `c922b01f54` (grep: fix segfault when "git grep '('" is given, 2009-04-27), and can now be removed as it's now clearly unreachable. We still do the right thing in the case that prompted that fix: git grep '(' fatal: unmatched parenthesis Arguably neither the "--invert-grep" option added in [1] nor the earlier "--all-match" option added in [2] were intended to be used stand-alone, and another approach[3] would be to error out in those cases. But since we've been treating them as a NOOP when given without --grep for a long time let's keep doing that. We could also return in "free_pattern_expr()" if the argument is non-NULL, as an alternative fix for this segfault does [4]. That would be more elegant in making the "free_*()" function behave like "free()", but it would also remove a sanity check: The "free_pattern_expr()" function calls itself recursively, and only the top-level is allowed to be NULL, let's not conflate those two conditions. 1. `22dfa8a23d` (log: teach --invert-grep option, 2015-01-12) 2. `0ab7befa31` (grep --all-match, 2006-09-27) 3. https://lore.kernel.org/git/patch-1.1-f4b90799fce-20221010T165711Z-avarab@gmail.com/ 4. http://lore.kernel.org/git/7e094882c2a71894416089f894557a9eae07e8f8.1665423686.git.me@ttaylorr.com Reported-by: orygaw <orygaw@protonmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-11 08:48:54 -07:00
Rubén Justo	0dc4e5c574	branch: support for shortcuts like @{-1}, completed branch command with options "edit-description", "set-upstream-to" and "unset-upstream" expects a branch name. Since `ae5a6c3684` (checkout: implement "@{-N}" shortcut name for N-th last branch, 2009-01-17) a branch can be specified using shortcuts like @{-1}. Those shortcuts need to be resolved when considering the arguments. We can modify the description of the previously checked out branch with: $ git branch --edit--description @{-1} We can modify the upstream of the previously checked out branch with: $ git branch --set-upstream-to upstream @{-1} $ git branch --unset-upstream @{-1} Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-10 16:28:59 -07:00
Phillip Wood	a677d3c416	t3435: remove redundant test case rebase --preserve-merges no longer exists so there is no point in carrying this failing test case. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-10 11:18:18 -07:00
Junio C Hamano	19118cb857	Merge branch 'js/merge-ort-in-read-only-repo' In read-only repositories, "git merge-tree" tried to come up with a merge result tree object, which it failed (which is not wrong) and led to a segfault (which is bad), which has been corrected. * js/merge-ort-in-read-only-repo: merge-ort: return early when failing to write a blob merge-ort: fix segmentation fault in read-only repositories	2022-10-10 10:08:43 -07:00
Junio C Hamano	a215853545	Merge branch 'tb/midx-repack-ignore-cruft-packs' "git multi-pack-index repack/expire" used to repack unreachable cruft into a new pack, which have been corrected. * tb/midx-repack-ignore-cruft-packs: midx.c: avoid cruft packs with non-zero `repack --batch-size` midx.c: remove unnecessary loop condition midx.c: replace `xcalloc()` with `CALLOC_ARRAY()` midx.c: avoid cruft packs with `repack --batch-size=0` midx.c: prevent `expire` from removing the cruft pack Documentation/git-multi-pack-index.txt: clarify expire behavior Documentation/git-multi-pack-index.txt: fix typo	2022-10-10 10:08:43 -07:00
Junio C Hamano	dc154c39f7	Merge branch 'ja/rebase-i-avoid-amending-self' "git rebase -i" can mistakenly attempt to apply a fixup to a commit itself, which has been corrected. * ja/rebase-i-avoid-amending-self: sequencer: avoid dropping fixup commit that targets self via commit-ish	2022-10-10 10:08:43 -07:00
Junio C Hamano	82d5a8483e	Merge branch 'ab/test-malloc-with-sanitize-leak' Test fix. * ab/test-malloc-with-sanitize-leak: test-lib: have SANITIZE=leak imply TEST_NO_MALLOC_CHECK	2022-10-10 10:08:40 -07:00
Junio C Hamano	67bf4a83e9	Merge branch 'sy/sparse-grep' "git grep" learned to expand the sparse-index more lazily and on demand in a sparse checkout. * sy/sparse-grep: builtin/grep.c: integrate with sparse index	2022-10-10 10:08:40 -07:00
Junio C Hamano	4b4d97cfda	Merge branch 'ds/scalar-unregister-idempotent' "scalar unregister" in a repository that is already been unregistered reported an error. * ds/scalar-unregister-idempotent: string-list: document iterator behavior on NULL input gc: replace config subprocesses with API calls scalar: make 'unregister' idempotent maintenance: add 'unregister --force'	2022-10-10 10:08:40 -07:00
Junio C Hamano	20a5dd670c	Merge branch 'jk/remote-rename-without-fetch-refspec' "git remote rename" failed to rename a remote without fetch refspec, which has been corrected. * jk/remote-rename-without-fetch-refspec: remote: handle rename of remote without fetch refspec	2022-10-10 10:08:39 -07:00
Junio C Hamano	7aeb0d4c47	Merge branch 'jk/clone-allow-bare-and-o-together' "git clone" did not like to see the "--bare" and the "--origin" options used together without a good reason. * jk/clone-allow-bare-and-o-together: clone: allow "--bare" with "-o"	2022-10-10 10:08:39 -07:00
Junio C Hamano	b77e3bdd97	symbolic-ref: teach "--[no-]recurse" option Suppose you are managing many maintenance tracks in your project, and some of the more recent ones are maint-2.36 and maint-2.37. Further imagine that your project recently tagged the official 2.38 release, which means you would need to start maint-2.38 track soon, by doing: $ git checkout -b maint-2.38 v2.38.0^0 $ git branch --list 'maint-2.3[6-9]' * maint-2.38 maint-2.36 maint-2.37 So far, so good. But it also is reasonable to want not to have to worry about which maintenance track is the latest, by pointing a more generic-sounding 'maint' branch at it, by doing: $ git symbolic-ref refs/heads/maint refs/heads/maint-2.38 which would allow you to say "whichever it is, check out the latest maintenance track", by doing: $ git checkout maint $ git branch --show-current maint-2.38 It is arguably better to say that we are on 'maint-2.38' rather than on 'maint', and "git merge/pull" would record "into maint-2.38" and not "into maint", so I think what we have is a good behaviour. One thing that is slightly irritating, however, is that I do not think there is a good way (other than "cat .git/HEAD") to learn that you checked out 'maint' to get into that state. Just like the output of "git branch --show-current" shows above, "git symbolic-ref HEAD" would report 'refs/heads/maint-2.38', bypassing the intermediate symbolic ref at 'refs/heads/maint' that is pointed at by HEAD. The internal resolve_ref() API already has the necessary support for stopping after resolving a single level of a symbolic-ref, and we can expose it by adding a "--[no-]recurse" option to the command. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-09 12:31:24 -07:00
Rubén Justo	bcfc82bd48	branch: description for non-existent branch errors When the repository does not yet have commits, some errors describe that there is no branch: $ git init -b first $ git branch --edit-description first error: No branch named 'first'. $ git branch --set-upstream-to=upstream fatal: branch 'first' does not exist $ git branch -c second error: refname refs/heads/first not found fatal: Branch copy failed That "first" branch is unborn but to say it doesn't exists is confusing. Options "-c" (copy) and "-m" (rename) show the same error when the origin branch doesn't exists: $ git branch -c non-existent-branch second error: refname refs/heads/non-existent-branch not found fatal: Branch copy failed $ git branch -m non-existent-branch second error: refname refs/heads/non-existent-branch not found fatal: Branch rename failed Note that "--edit-description" without an explicit argument is already considering the _empty repository_ circumstance in its error. Also note that "-m" on the initial branch it is an allowed operation. Make the error descriptions for those branch operations with unborn or non-existent branches, more informative. This is the result of the change: $ git init -b first $ git branch --edit-description first error: No commit on branch 'first' yet. $ git branch --set-upstream-to=upstream fatal: No commit on branch 'first' yet. $ git branch -c second fatal: No commit on branch 'first' yet. $ git branch [-c/-m] non-existent-branch second fatal: No branch named 'non-existent-branch'. Signed-off-by: Rubén Justo <rjusto@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-07 20:59:41 -07:00
Junio C Hamano	1f1f375cfe	Merge branch 'es/retire-efgrep' Prepare for GNU [ef]grep that throw warning of their uses. * es/retire-efgrep: check-non-portable-shell: detect obsolescent egrep/fgrep	2022-10-07 17:19:59 -07:00
Junio C Hamano	de73968e52	Merge branch 'dd/retire-efgrep' Prepare for GNU [ef]grep that throw warning of their uses. * dd/retire-efgrep: t: convert fgrep usage to "grep -F" t: convert egrep usage to "grep -E" t: remove \{m,n\} from BRE grep usage CodingGuidelines: allow grep -E	2022-10-07 17:19:59 -07:00
Junio C Hamano	410a0e520d	Merge branch 'ds/use-platform-regex-on-macos' With a bit of header twiddling, use the native regexp library on macOS instead of the compat/ one. * ds/use-platform-regex-on-macos: grep: fix multibyte regex handling under macOS	2022-10-07 17:19:59 -07:00
Taylor Blau	f64d4ca8d6	Sync with 2.37.4 Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-06 20:00:04 -04:00
Taylor Blau	f2798aa404	Sync with 2.36.3 Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-06 19:58:16 -04:00
Taylor Blau	9a167cb786	t7527: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t7527 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-06 19:57:52 -04:00
Taylor Blau	58612f82b6	Sync with 2.35.5 Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-06 17:44:44 -04:00
Taylor Blau	ac8a1db867	Sync with 2.34.5 Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-06 17:43:37 -04:00
Taylor Blau	478a426f14	Sync with 2.33.5 Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-06 17:42:55 -04:00
Taylor Blau	3957f3c84e	Sync with 2.32.4 Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-06 17:42:02 -04:00
Taylor Blau	9cbd2827c5	Sync with 2.31.5 Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-06 17:40:44 -04:00
Taylor Blau	122512967e	Sync with 2.30.6 Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-06 17:39:15 -04:00
Jeff King	6823c19888	test-submodule: inline resolve_relative_url() function The resolve_relative_url() function takes argc and argv parameters; it then reads up to 3 elements of argv without looking at argc at all. At first glance, this seems like a bug. But it has only one caller, cmd__submodule_resolve_relative_url(), which does confirm that argc is 3. The main reason this is a separate function is that it was moved from library code in `96a28a9bc6` (submodule--helper: move "resolve-relative-url-test" to a test-tool, 2022-09-01). We can make this code simpler and more obviously safe by just inlining the function in its caller. As a bonus, this silences a -Wunused-parameter warning. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-06 09:56:28 -07:00
René Scharfe	7a2d8ea47e	t/lib-httpd: pass LANG and LC_ALL to Apache t5411 starts a web server with no explicit language setting, so it uses the system default. Ten of its tests expect it to return error messages containing the prefix "fatal: ", emitted by die(). This prefix can be localized since `a1fd2cf8cd` (i18n: mark message helpers prefix for translation, 2022-06-21), however. As a result these ten tests break for me on a system with LANG="de_DE.UTF-8" because the web server sends localized messages with "Schwerwiegend: " instead of "fatal: ". Fix these tests by passing LANG and LC_ALL to the web server, which are set to "C" by t/test-lib.sh, to get untranslated messages on both sides. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-06 09:16:26 -07:00
Taylor Blau	d9fcaeece2	t5537: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t5537 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-05 20:19:15 -04:00
Taylor Blau	541607d934	t3206: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t3206 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-05 20:19:08 -04:00
Jonathan Tan	301f1e3ac1	promisor-remote: die upon failing fetch In a partial clone, an attempt to read a missing object results in an attempt to fetch that single object. In order to avoid multiple sequential fetches, which would occur when multiple objects are missing (which is the typical case), some commands have been taught to prefetch in a batch: such a command would, in a partial clone, notice that several objects that it will eventually need are missing, and call promisor_remote_get_direct() with all such objects at once. When this batch prefetch fails, these commands fall back to the sequential fetches. But at $DAYJOB we have noticed that this results in a bad user experience: a command would take unexpectedly long to finish (and possibly use up a lot of bandwidth) if the batch prefetch would fail for some intermittent reason, but all subsequent fetches would work. It would be a better user experience for such a command would just fail. Therefore, make it a fatal error if the prefetch fails and at least one object being fetched is known to be a promisor object. (The latter criterion is to make sure that we are not misleading the user that such an object would be present from the promisor remote. For example, a missing object may be a result of repository corruption and not because it is expectedly missing due to the repository being a partial clone.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-05 11:06:53 -07:00
Jeff King	45350aeb11	sequencer: detect author name errors in read_author_script() As we parse the author-script file, we check for missing or duplicate lines for GIT_AUTHOR_NAME, etc. But after reading the whole file, our final error conditional checks "date_i" twice and "name_i" not at all. This not only leads to us failing to abort, but we may do an out-of-bounds read on the string_list array. The bug goes back to `442c36bd08` (am: improve author-script error reporting, 2018-10-31), though the code was soon after moved to this spot by `bcd33ec25f` (add read_author_script() to libgit, 2018-10-31). It was presumably just a typo in `442c36bd08`. We'll add test coverage for all the error cases here, though only the GIT_AUTHOR_NAME ones fail (even in a vanilla build they segfault consistently, but certainly with SANITIZE=address). Reported-by: Michael V. Scovetta <michael.scovetta@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-03 11:05:53 -07:00
Phillip Wood	3ef1494685	mailinfo -b: fix an out of bounds access To remove bracketed strings containing "PATCH" from the subject line cleanup_subject() scans the subject for the opening bracket using an offset from the beginning of the line. It then searches for the closing bracket with strchr(). To calculate the length of the bracketed string it unfortunately adds rather than subtracts the offset from the result of strchr(). This leads to an out of bounds access in memmem() when looking to see if the brackets contain "PATCH". We have tests that trigger this bug that were added in `ae52d57f0b` (t5100: add some more mailinfo tests, 2017-05-31). The commit message mentions that they are marked test_expect_failure as they trigger an assertion in strbuf_splice(). While it is reassuring that strbuf_splice() detects the problem and dies in retrospect that should perhaps have warranted a little more investigation. The bug was introduced by `17635fc900` (mailinfo: -b option keeps [bracketed] strings that is not a [PATCH] marker, 2009-07-15). I think the reason it has survived so long is that '-b' is not a popular option and without it the offset is always zero. This was found by the address sanitizer while I was cleaning up the test_todo idea in [1]. [1] https://lore.kernel.org/git/db558292-2783-3270-4824-43757822a389@gmail.com/ Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-03 09:05:07 -07:00
René Scharfe	a79c6b6081	diff: support ^! for merges revision.c::handle_revision_arg_1() resolves <rev>^! by first adding the negated parents and then <rev> itself. builtin_diff_combined() expects the first tree to be the merge and the remaining ones to be the parents, though. This mismatch results in bogus diff output. Remember the first tree that doesn't belong to a parent and use it instead of blindly picking the first one. This makes "git diff <rev>^!" consistent with "git show <rev>^!". Reported-by: Tim Jaacks <tim.jaacks@garz-fricke.com> Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-01 15:58:38 -07:00
Taylor Blau	8a7bfa0fd3	t7814: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t7814 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:31:40 -04:00
Taylor Blau	59f2f80280	t5537: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t5537 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:31:36 -04:00
Taylor Blau	c193e6bbee	t5516: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t5516 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:31:34 -04:00
Taylor Blau	e175fb5767	t3207: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t3207 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:31:31 -04:00
Taylor Blau	ef374dd9b8	t2080: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t1092 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:30:45 -04:00
Taylor Blau	092d3a2bf9	t1092: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t1092 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:30:43 -04:00
Taylor Blau	067aa8fb41	t2080: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t1092 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:27:18 -04:00
Taylor Blau	4a7dab5ce4	t1092: prepare for changing protocol.file.allow Explicitly cloning over the "file://" protocol in t1092 in preparation for merging a security release which will change the default value of this configuration to be "user". Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:27:14 -04:00
Jeff King	71ad7fe1bc	shell: limit size of interactive commands When git-shell is run in interactive mode (which must be enabled by creating $HOME/git-shell-commands), it reads commands from stdin, one per line, and executes them. We read the commands with git_read_line_interactively(), which uses a strbuf under the hood. That means we'll accept an input of arbitrary size (limited only by how much heap we can allocate). That creates two problems: - the rest of the code is not prepared to handle large inputs. The most serious issue here is that split_cmdline() uses "int" for most of its types, which can lead to integer overflow and out-of-bounds array reads and writes. But even with that fixed, we assume that we can feed the command name to snprintf() (via xstrfmt()), which is stuck for historical reasons using "int", and causes it to fail (and even trigger a BUG() call). - since the point of git-shell is to take input from untrusted or semi-trusted clients, it's a mild denial-of-service. We'll allocate as many bytes as the client sends us (actually twice as many, since we immediately duplicate the buffer). We can fix both by just limiting the amount of per-command input we're willing to receive. We should also fix split_cmdline(), of course, which is an accident waiting to happen, but that can come on top. Most calls to split_cmdline(), including the other one in git-shell, are OK because they are reading from an OS-provided argv, which is limited in practice. This patch should eliminate the immediate vulnerabilities. I picked 4MB as an arbitrary limit. It's big enough that nobody should ever run into it in practice (since the point is to run the commands via exec, we're subject to OS limits which are typically much lower). But it's small enough that allocating it isn't that big a deal. The code is mostly just swapping out fgets() for the strbuf call, but we have to add a few niceties like flushing and trimming line endings. We could simplify things further by putting the buffer on the stack, but 4MB is probably a bit much there. Note that we'll _always_ allocate 4MB, which for normal, non-malicious requests is more than we would before this patch. But on the other hand, other git programs are happy to use 96MB for a delta cache. And since we'd never touch most of those pages, on a lazy-allocating OS like Linux they won't even get allocated to actual RAM. The ideal would be a version of strbuf_getline() that accepted a maximum value. But for a minimal vulnerability fix, let's keep things localized and simple. We can always refactor further on top. The included test fails in an obvious way with ASan or UBSan (which notice the integer overflow and out-of-bounds reads). Without them, it fails in a less obvious way: we may segfault, or we may try to xstrfmt() a long string, leading to a BUG(). Either way, it fails reliably before this patch, and passes with it. Note that we don't need an EXPENSIVE prereq on it. It does take 10-15s to fail before this patch, but with the new limit, we fail almost immediately (and the perl process generating 2GB of data exits via SIGPIPE). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Jeff King	32696a4cbe	shell: add basic tests We have no tests of even basic functionality of git-shell. Let's add a couple of obvious ones. This will serve as a framework for adding tests for new things we fix, as well as making sure we don't screw anything up too badly while doing so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	f4a32a550f	t/t9NNN: allow local submodules To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that interact with submodules a handful of times use `test_config_global`. Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	0d3beb71da	t/t7NNN: allow local submodules To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that only need to interact with submodules in a limited capacity have individual Git commands annotated with the appropriate configuration via `-c`. Tests that interact with submodules a handful of times use `test_config_global` instead. Test scripts that rely on submodules throughout use a `git config --global` during a setup test towards the beginning of the script. Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	0f21b8f468	t/t6NNN: allow local submodules To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that only need to interact with submodules in a limited capacity have individual Git commands annotated with the appropriate configuration via `-c`. Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	225d2d50cc	t/t5NNN: allow local submodules To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that only need to interact with submodules in a limited capacity have individual Git commands annotated with the appropriate configuration via `-c`. Tests that interact with submodules a handful of times use `test_config_global` instead. Test scripts that rely on submodules throughout use a `git config --global` during a setup test towards the beginning of the script. Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	ac7e57fa28	t/t4NNN: allow local submodules To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that only need to interact with submodules in a limited capacity have individual Git commands annotated with the appropriate configuration via `-c`. Tests that interact with submodules a handful of times use `test_config_global` instead. Test scripts that rely on submodules throughout use a `git config --global` during a setup test towards the beginning of the script. Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	f8d510ed0b	t/t3NNN: allow local submodules To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that only need to interact with submodules in a limited capacity have individual Git commands annotated with the appropriate configuration via `-c`. Tests that interact with submodules a handful of times use `test_config_global` instead. Test scripts that rely on submodules throughout use a `git config --global` during a setup test towards the beginning of the script. Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	99f4abb8da	t/2NNNN: allow local submodules To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that only need to interact with submodules in a limited capacity have individual Git commands annotated with the appropriate configuration via `-c`. Tests that interact with submodules a handful of times use `test_config_global` instead. Test scripts that rely on submodules throughout use a `git config --global` during a setup test towards the beginning of the script. Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	8a96dbcb33	t/t1NNN: allow local submodules To prepare for the default value of `protocol.file.allow` to change to "user", ensure tests that rely on local submodules can initialize them over the file protocol. Tests that only need to interact with submodules in a limited capacity have individual Git commands annotated with the appropriate configuration via `-c`. Tests that interact with submodules a handful of times use `test_config_global` instead. Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	7de0c306f7	t/lib-submodule-update.sh: allow local submodules To prepare for changing the default value of `protocol.file.allow` to "user", update the `prolog()` function in lib-submodule-update to allow submodules to be cloned over the file protocol. This is used by a handful of submodule-related test scripts, which themselves will have to tweak the value of `protocol.file.allow` in certain locations. Those will be done in subsequent commits. Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Taylor Blau	6f054f9fb3	builtin/clone.c: disallow `--local` clones with symlinks When cloning a repository with `--local`, Git relies on either making a hardlink or copy to every file in the "objects" directory of the source repository. This is done through the callpath `cmd_clone()` -> `clone_local()` -> `copy_or_link_directory()`. The way this optimization works is by enumerating every file and directory recursively in the source repository's `$GIT_DIR/objects` directory, and then either making a copy or hardlink of each file. The only exception to this rule is when copying the "alternates" file, in which case paths are rewritten to be absolute before writing a new "alternates" file in the destination repo. One quirk of this implementation is that it dereferences symlinks when cloning. This behavior was most recently modified in `36596fd2df` (clone: better handle symlinked files at .git/objects/, 2019-07-10), which attempted to support `--local` clones of repositories with symlinks in their objects directory in a platform-independent way. Unfortunately, this behavior of dereferencing symlinks (that is, creating a hardlink or copy of the source's link target in the destination repository) can be used as a component in attacking a victim by inadvertently exposing the contents of file stored outside of the repository. Take, for example, a repository that stores a Dockerfile and is used to build Docker images. When building an image, Docker copies the directory contents into the VM, and then instructs the VM to execute the Dockerfile at the root of the copied directory. This protects against directory traversal attacks by copying symbolic links as-is without dereferencing them. That is, if a user has a symlink pointing at their private key material (where the symlink is present in the same directory as the Dockerfile, but the key itself is present outside of that directory), the key is unreadable to a Docker image, since the link will appear broken from the container's point of view. This behavior enables an attack whereby a victim is convinced to clone a repository containing an embedded submodule (with a URL like "file:///proc/self/cwd/path/to/submodule") which has a symlink pointing at a path containing sensitive information on the victim's machine. If a user is tricked into doing this, the contents at the destination of those symbolic links are exposed to the Docker image at runtime. One approach to preventing this behavior is to recreate symlinks in the destination repository. But this is problematic, since symlinking the objects directory are not well-supported. (One potential problem is that when sharing, e.g. a "pack" directory via symlinks, different writers performing garbage collection may consider different sets of objects to be reachable, enabling a situation whereby garbage collecting one repository may remove reachable objects in another repository). Instead, prohibit the local clone optimization when any symlinks are present in the `$GIT_DIR/objects` directory of the source repository. Users may clone the repository again by prepending the "file://" scheme to their clone URL, or by adding the `--no-local` option to their `git clone` invocation. The directory iterator used by `copy_or_link_directory()` must no longer dereference symlinks (i.e., it must call `lstat()` instead of `stat()` in order to discover whether or not there are symlinks present). This has no bearing on the overall behavior, since we will immediately `die()` on encounter a symlink. Note that t5604.33 suggests that we do support local clones with symbolic links in the source repository's objects directory, but this was likely unintentional, or at least did not take into consideration the problem with sharing parts of the objects directory with symbolic links at the time. Update this test to reflect which options are and aren't supported. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-10-01 00:23:38 -04:00
Junio C Hamano	e288b3de35	branch: do not fail a no-op --edit-desc Imagine running "git branch --edit-description" while on a branch without the branch description, and then exit the editor after emptying the edit buffer, which is the way to tell the command that you changed your mind and you do not want the description after all. The command should just happily oblige, adding no branch description for the current branch, and exit successfully. But it fails to do so: $ git init -b main $ git commit --allow-empty -m commit $ GIT_EDITOR=: git branch --edit-description fatal: could not unset 'branch.main.description' The end result is OK in that the configuration variable does not exist in the resulting repository, but we should do better. If we know we didn't have a description, and if we are asked not to have a description by the editor, we can just return doing nothing. This of course introduces TOCTOU. If you add a branch description to the same branch from another window, while you had the editor open to edit the description, and then exit the editor without writing anything there, we'd end up not removing the description you added in the other window. But you are fooling yourself in your own repository at that point, and if it hurts, you'd be better off not doing so ;-). Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-30 11:13:51 -07:00
Ævar Arnfjörð Bjarmason	5e7c8b75e7	test-lib: have SANITIZE=leak imply TEST_NO_MALLOC_CHECK Since `131b94a10a` (test-lib.sh: Use GLIBC_TUNABLES instead of MALLOC_CHECK_ on glibc >= 2.34, 2022-03-04) compiling with SANITIZE=leak has missed reporting some leaks. The old MALLOC_CHECK method used before glibc 2.34 seems to have been (mostly?) compatible with it, but after `131b94a10a` e.g. running: TEST_NO_MALLOC_CHECK=1 make SANITIZE=leak test T=t6437-submodule-merge.sh Would report a leak in builtin/commit.c, but this would not: TEST_NO_MALLOC_CHECK= make SANITIZE=leak test T=t6437-submodule-merge.sh Since the interaction is clearly breaking the SANITIZE=leak mode, let's mark them as explicitly incompatible. A related regression for SANITIZE=address was fixed in `067109a5e7` (tests: make SANITIZE=address imply TEST_NO_MALLOC_CHECK, 2022-04-09). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-29 08:37:45 -07:00
Johannes Schindelin	0b55d930a6	merge-ort: fix segmentation fault in read-only repositories If the blob/tree objects cannot be written, we really need the merge operations to fail, and not to continue (and then try to access the tree object which is however still set to `NULL`). Let's stop ignoring the return value of `write_object_file()` and `write_tree()` and set `clean = -1` in the error case. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-28 08:49:27 -07:00
Derrick Stolee	d871b6c6c6	scalar: make 'unregister' idempotent The 'scalar unregister' command removes a repository from the list of registered Scalar repositories and removes it from the list of repositories registered for background maintenance. If the repository was not already registered for background maintenance, then the command fails, even if the repository was still registered as a Scalar repository. After using 'scalar clone' or 'scalar register', the repository would be enrolled in background maintenance since those commands run 'git maintenance start'. If the user runs 'git maintenance unregister' on that repository, then it is still in the list of repositories which get new config updates from 'scalar reconfigure'. The 'scalar unregister' command would fail since 'git maintenance unregister' would fail. Further, the add_or_remove_enlistment() method in scalar.c already has this idempotent nature built in as an expectation since it returns zero when the scalar.repo list already has the proper containment of the repository. The previous change added the 'git maintenance unregister --force' option, so use it within 'scalar unregister' to make it idempotent. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-27 09:32:26 -07:00
Derrick Stolee	1ebe6b0297	maintenance: add 'unregister --force' The 'git maintenance unregister' subcommand has a step that removes the current repository from the multi-valued maitenance.repo config key. This fails if the repository is not listed in that key. This makes running 'git maintenance unregister' twice result in a failure in the second instance. This failure exit code is helpful, but its message is not. Add a new die() message that explicitly calls out the failure due to the repository not being registered. In some cases, users may want to run 'git maintenance unregister' just to make sure that background jobs will not start on this repository, but they do not want to check to see if it is registered first. Add a new '--force' option that will siltently succeed if the repository is not already registered. Also add an extra test of 'git maintenance unregister' at a point where there are no registered repositories. This should fail without --force. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-27 09:32:25 -07:00
Junio C Hamano	2a7d63a245	Merge branch 'ds/bitmap-lookup-remove-tracing' Perf-fix. * ds/bitmap-lookup-remove-tracing: pack-bitmap: remove trace2 region from hot path	2022-09-26 21:46:51 -07:00
Derrick Stolee	89a1ab8fb5	pack-bitmap: remove trace2 region from hot path The trace2 region around the call to lazy_bitmap_for_commit() in bitmap_for_commit() was added in `28cd730680` (pack-bitmap: prepare to read lookup table extension, 2022-08-14). While adding trace2 regions is typically helpful for tracking performance, this method is called possibly thousands of times as a commit walk explores commit history looking for a matching bitmap. When trace2 output is enabled, this region is emitted many times and performance is throttled by that output. For now, remove these regions entirely. This is a critical path, and it would be valuable to measure that the time spent in bitmap_for_commit() does not increase when using the commit lookup table. The best way to do that would be to use a mechanism that sums the time spent in a region and reports a single value at the end of the process. This technique was introduced but not merged by [1] so maybe this example presents some justification to revisit that approach. [1] https://lore.kernel.org/git/pull.1099.v2.git.1640720202.gitgitgadget@gmail.com/ To help with the 'git blame' output in this region, add a comment that warns against adding a trace2 region. Delete a test from t5310 that used that trace output to check that this lookup optimization was activated. To create this kind of test again in the future, the stopwatch traces mentioned earlier could be used as a signal that we activated this code path. Helpedy-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-26 12:09:18 -07:00
Johannes Altmanninger	3e367a5f2f	sequencer: avoid dropping fixup commit that targets self via commit-ish Commit `68d5d03bc4` (rebase: teach --autosquash to match on sha1 in addition to message, 2010-11-04) taught autosquash to recognize subjects like "fixup! 7a235b" where 7a235b is an OID-prefix. It actually did more than advertised: 7a235b can be an arbitrary commit-ish (as long as it's not trailed by spaces). Accidental(?) use of this secret feature revealed a bug where we would silently drop a fixup commit. The bug can also be triggered when using an OID-prefix but that's unlikely in practice. Let the commit with subject "fixup! main" be the tip of the "main" branch. When computing the fixup target for this commit, we find the commit itself. This is wrong because, by definition, a fixup target must be an earlier commit in the todo list. We wrongly find the current commit because we added it to the todo list prematurely. Avoid these fixup-cycles by only adding the current commit to the todo list after we have finished looking for the fixup target. Reported-by: Erik Cervin Edin <erik@cervined.in> Signed-off-by: Johannes Altmanninger <aclopte@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-26 10:11:57 -07:00
Shaoxuan Yuan	7cae7627c4	builtin/grep.c: integrate with sparse index Turn on sparse index and remove ensure_full_index(). Before this patch, `git-grep` utilizes the ensure_full_index() method to expand the index and search all the entries. Because this method requires walking all the trees and constructing the index, it is the slow part within the whole command. To achieve better performance, this patch uses grep_tree() to search the sparse directory entries and get rid of the ensure_full_index() method. Why grep_tree() is a better choice over ensure_full_index()? 1) grep_tree() is as correct as ensure_full_index(). grep_tree() looks into every sparse-directory entry (represented by a tree) recursively when looping over the index, and the result of doing so matches the result of expanding the index. 2) grep_tree() utilizes pathspecs to limit the scope of searching. ensure_full_index() always expands the index, which means it will always walk all the trees and blobs in the repo without caring if the user only wants a subset of the content, i.e. using a pathspec. On the other hand, grep_tree() will only search the contents that match the pathspec, and thus possibly walking fewer trees. 3) grep_tree() does not construct and copy back a new index, while ensure_full_index() does. This also saves some time. ---------------- Performance test - Summary: p2000 tests demonstrate a ~71% execution time reduction for `git grep --cached bogus -- "f2/f1/f1/"` using tree-walking logic. However, notice that this result varies depending on the pathspec given. See below "Command used for testing" for more details. Test HEAD~ HEAD ------------------------------------------------------- 2000.78: git grep ... (full-v3) 0.35 0.39 (≈) 2000.79: git grep ... (full-v4) 0.36 0.30 (≈) 2000.80: git grep ... (sparse-v3) 0.88 0.23 (-73.8%) 2000.81: git grep ... (sparse-v4) 0.83 0.26 (-68.6%) - Command used for testing: git grep --cached bogus -- "f2/f1/f1/" The reason for specifying a pathspec is that, if we don't specify a pathspec, then grep_tree() will walk all the trees and blobs to find the pattern, and the time consumed doing so is not too different from using the original ensure_full_index() method, which also spends most of the time walking trees. However, when a pathspec is specified, this latest logic will only walk the area of trees enclosed by the pathspec, and the time consumed is reasonably a lot less. Generally speaking, because the performance gain is acheived by walking less trees, which are specified by the pathspec, the HEAD time v.s. HEAD~ time in sparse-v[3\|4], should be proportional to "pathspec enclosed area" v.s. "all area", respectively. Namely, the wider the <pathspec> is encompassing, the less the performance difference between HEAD~ and HEAD, and vice versa. That is, if we don't specify a pathspec, the performance difference [1] is indistinguishable: both methods walk all the trees and take generally same amount of time (even with the index construction time included for ensure_full_index()). [1] Performance test result without pathspec (hence walking all trees): Command used: git grep --cached bogus Test HEAD~ HEAD --------------------------------------------------- 2000.78: git grep ... (full-v3) 6.17 5.19 (≈) 2000.79: git grep ... (full-v4) 6.19 5.46 (≈) 2000.80: git grep ... (sparse-v3) 6.57 6.44 (≈) 2000.81: git grep ... (sparse-v4) 6.65 6.28 (≈) -------------------------- NEEDSWORK about submodules There are a few NEEDSWORKs that belong to improvements beyond this topic. See the NEEDSWORK in builtin/grep.c::grep_submodule() for more context. The other two NEEDSWORKs in t1092 are also relative. Suggested-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Victoria Dye <vdye@github.com> Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-23 09:41:27 -07:00
Eric Sunshine	2b521630f9	check-non-portable-shell: detect obsolescent egrep/fgrep GNU grep deprecated `egrep` and `fgrep` with release 2.5.3 in 2007. As of release 3.8 in 2022, those commands warn[1] that they are obsolescent. Now that all the Git test scripts have been scrubbed of uses of `egrep` and `fgrep`, make `check-non-portable-shell` complain about them to prevent new instances from creeping back into the project. [1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-23 08:31:13 -07:00
Junio C Hamano	75fc96d57e	Merge branch 'dd/retire-efgrep' into es/retire-efgrep * dd/retire-efgrep: t: convert fgrep usage to "grep -F" t: convert egrep usage to "grep -E" t: remove \{m,n\} from BRE grep usage CodingGuidelines: allow grep -E	2022-09-23 08:31:04 -07:00
Jeff King	5a97b38109	remote: handle rename of remote without fetch refspec We return an error when trying to rename a remote that has no fetch refspec: $ git config --unset-all remote.origin.fetch $ git remote rename origin foo fatal: could not unset 'remote.foo.fetch' To make things even more confusing, we actually _do_ complete the config modification, via git_config_rename_section(). After that we try to rewrite the fetch refspec (to say refs/remotes/foo instead of origin). But our call to git_config_set_multivar() to remove the existing entries fails, since there aren't any, and it calls die(). We could fix this by using the "gently" form of the config call, and checking the error code. But there is an even simpler fix: if we know that there are no refspecs to rewrite, then we can skip that part entirely. Reported-by: John A. Leuenhagen <john@zlima12.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-22 12:59:52 -07:00
Jeff King	3b910d6e29	clone: allow "--bare" with "-o" We explicitly forbid the combination of "--bare" with "-o", but there doesn't seem to be any good reason to do so. The original logic came as part of `e6489a1bdf` (clone: do not accept more than one -o option., 2006-01-22), but that commit does not give any reason. Furthermore, the equivalent combination via config is allowed: git -c clone.defaultRemoteName=foo clone ... and works as expected. It may be that this combination was considered useless, because a bare clone does not set remote.origin.fetch (and hence there is no refs/remotes/origin hierarchy). But it does set remote.origin.url, and that name is visible to the user via "git fetch origin", etc. Let's allow the options to be used together, and switch the "forbid" test in t5606 to check that we use the requested name. That test came much later in `349cff76de` (clone: add tests for --template and some disallowed option pairs, 2020-09-29), and does not offer any logic beyond "let's test what the code currently does". Reported-by: John A. Leuenhagen <john@zlima12.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-22 12:57:03 -07:00
Junio C Hamano	17df9d3849	Merge branch 'sg/clean-test-results' "make clean" stopped cleaning the test results directory as a side effect of a topic that has nothing to do with "make clean", which has been corrected. * sg/clean-test-results: t/Makefile: remove 'test-results' on 'make clean'	2022-09-21 15:27:02 -07:00
Junio C Hamano	86c108a8a2	Merge branch 'vd/scalar-generalize-diagnose' Portability fix. * vd/scalar-generalize-diagnose: builtin/diagnose.c: don't translate the two mode values diagnose.c: refactor to safely use 'd_type'	2022-09-21 15:27:01 -07:00
Junio C Hamano	dd37e5607f	Merge branch 'fz/help-doublofix' Typofix for topic already in -rc0. * fz/help-doublofix: help: fix doubled words in explanation for developer interfaces	2022-09-21 14:23:14 -07:00
SZEDER Gábor	d11b875197	t/Makefile: remove 'test-results' on 'make clean' The 't/test-results' directory and its contents are by-products of the test process, so 'make clean' should remove them, but, alas, this has been broken since `fee65b194d` (t/Makefile: don't remove test-results in "clean-except-prove-cache", 2022-07-28). The 'clean' target in 't/Makefile' was not directly responsible for removing the 'test-results' directory, but relied on its dependency 'clean-except-prove-cache' to do that [1]. ee65b194d broke this, because it only removed the 'rm -r test-results' command from the 'clean-except-prove-cache' target instead of moving it to the 'clean' target, resulting in stray 't/test-results' directories. Add that missing cleanup command to 't/Makefile', and to all sub-Makefiles touched by that commit as well. [1] `60f26f6348` (t/Makefile: retain cache t/.prove across prove runs, 2012-05-02) Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-21 11:32:13 -07:00
Đoàn Trần Công Danh	37eb90f79a	t: convert fgrep usage to "grep -F" Despite POSIX states that: > The old egrep and fgrep commands are likely to be supported for many > years to come as implementation extensions, allowing historical > applications to operate unmodified. GNU grep 3.8 started to warn[1]: > The egrep and fgrep commands, which have been deprecated since > release 2.5.3 (2007), now warn that they are obsolescent and should > be replaced by grep -E and grep -F. Prepare for their removal in the future. [1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-21 11:00:19 -07:00
Đoàn Trần Công Danh	81580fa06d	t: convert egrep usage to "grep -E" Despite POSIX states that: > The old egrep and fgrep commands are likely to be supported for many > years to come as implementation extensions, allowing historical > applications to operate unmodified. GNU grep 3.8 started to warn[1]: > The egrep and fgrep commands, which have been deprecated since > release 2.5.3 (2007), now warn that they are obsolescent and should > be replaced by grep -E and grep -F. Prepare for their removal in the future. [1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-21 11:00:18 -07:00
Đoàn Trần Công Danh	a764c37bad	t: remove \{m,n\} from BRE grep usage The CodingGuidelines says we should avoid \{m,n\} in BRE usage. And their usages in our code base is limited, and subjectively hard to read. Replace them with ERE. Except for "0\{40\}" which would be changed to "$ZERO_OID", which is a better value for testing with: GIT_TEST_DEFAULT_HASH=sha256 Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-21 11:00:18 -07:00
Taylor Blau	b62ad5681f	midx.c: avoid cruft packs with non-zero `repack --batch-size` Apply similar treatment with respect to cruft packs as in a few commits ago to `repack` with a non-zero `--batch-size`. Since the case of a non-zero `--batch-size` is handled separately (in `fill_included_packs_batch()` instead of `fill_included_packs_all()`), a separate fix must be applied for this case. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-21 10:21:47 -07:00
Taylor Blau	d9f7721450	midx.c: avoid cruft packs with `repack --batch-size=0` The `repack` sub-command of the `git multi-pack-index` builtin creates a new pack aggregating smaller packs contained in the MIDX up to some given `--batch-size`. When `--batch-size=0`, this instructs the MIDX builtin to repack everything contained in the MIDX into a single pack. In similar spirit as a previous commit, it is undesirable to repack the contents of a cruft pack in this step. Teach `repack` to ignore any cruft pack(s) when `--batch-size=0` for the same reason(s). (The case of a non-zero `--batch-size` will be handled in a subsequent commit). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-21 10:21:46 -07:00
Taylor Blau	757d457907	midx.c: prevent `expire` from removing the cruft pack The `expire` sub-command unlinks any packs that are (a) contained in the MIDX, but (b) have no objects referenced by the MIDX. This sub-command ignores `.keep` packs, which remain on-disk even if they have no objects referenced by the MIDX. Cruft packs, however, aren't given the same treatment: if none of the objects contained in the cruft pack are selected from the cruft pack by the MIDX, then the cruft pack is eligible to be expired. This is less than desireable, since the cruft pack has important metadata about the individual object mtimes, which is useful to determine how quickly an object should age out of the repository when pruning. Ordinarily, we wouldn't expect the contents of a cruft pack to duplicated across non-cruft packs (and we'd expect to see the MIDX select all cruft objects from other sources even less often). But nonetheless, it is still possible to trick the `expire` sub-command into removing the `.mtimes` file in this circumstance. Teach the `expire` sub-command to ignore cruft packs in the same manner as it does `.keep` packs, in order to keep their metadata around, even when they are unreferenced by the MIDX. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-21 10:21:46 -07:00
Junio C Hamano	279ebd4761	Merge branch 'ad/t1800-cygwin' Test fix. * ad/t1800-cygwin: t1800: correct test to handle Cygwin	2022-09-19 14:35:25 -07:00
Junio C Hamano	42bf77c7d0	Merge branch 'vd/scalar-to-main' Hoist the remainder of "scalar" out of contrib/ to the main part of the codebase. * vd/scalar-to-main: Documentation/technical: include Scalar technical doc t/perf: add 'GIT_PERF_USE_SCALAR' run option t/perf: add Scalar performance tests scalar-clone: add test coverage scalar: add to 'git help -a' command list scalar: implement the `help` subcommand git help: special-case `scalar` scalar: include in standard Git build & installation scalar: fix command documentation section header	2022-09-19 14:35:25 -07:00
Junio C Hamano	9d58241ee4	Merge branch 'es/chainlint' Revamp chainlint script for our tests. * es/chainlint: chainlint: colorize problem annotations and test delimiters t: retire unused chainlint.sed t/Makefile: teach `make test` and `make prove` to run chainlint.pl test-lib: replace chainlint.sed with chainlint.pl test-lib: retire "lint harder" optimization hack t/chainlint: add more chainlint.pl self-tests chainlint.pl: allow `\|\| echo` to signal failure upstream of a pipe chainlint.pl: complain about loops lacking explicit failure handling chainlint.pl: don't flag broken &&-chain if failure indicated explicitly chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly chainlint.pl: don't require `&` background command to end with `&&` t/Makefile: apply chainlint.pl to existing self-tests chainlint.pl: don't require `return\|exit\|continue` to end with `&&` chainlint.pl: validate test scripts in parallel chainlint.pl: add parser to identify test definitions chainlint.pl: add parser to validate tests chainlint.pl: add POSIX shell parser chainlint.pl: add POSIX shell lexical analyzer t: add skeleton chainlint.pl	2022-09-19 14:35:24 -07:00
Junio C Hamano	339517b035	Merge branch 'sy/mv-out-of-cone' "git mv A B" in a sparsely populated working tree can be asked to move a path from a directory that is "in cone" to another directory that is "out of cone". Handling of such a case has been improved. * sy/mv-out-of-cone: builtin/mv.c: fix possible segfault in add_slash() mv: check overwrite for in-to-out move advice.h: add advise_on_moving_dirty_path() mv: cleanup empty WORKING_DIRECTORY mv: from in-cone to out-of-cone mv: remove BOTH from enum update_mode mv: check if <destination> is a SKIP_WORKTREE_DIR mv: free the with_slash in check_dir_in_index() mv: rename check_dir_in_index() to empty_dir_has_sparse_contents() t7002: add tests for moving from in-cone to out-of-cone	2022-09-19 14:35:23 -07:00
Victoria Dye	cb98e1d50a	diagnose.c: refactor to safely use 'd_type' Refactor usage of the 'd_type' property of 'struct dirent' in 'diagnose.c' to instead utilize the compatibility macro 'DTYPE()'. On systems where 'd_type' is not present in 'struct dirent', this macro will always return 'DT_UNKNOWN'. In that case, instead fall back on using the 'stat.st_mode' to determine whether the dirent points to a dir, file, or link. Additionally, add a test to 't0092-diagnose.sh' to verify that files (e.g., loose objects) are counted properly. Note that the new function 'get_dtype()' is based on 'resolve_dtype()' in 'dir.c' (which itself was refactored from a prior 'get_dtype()' in `ad6f2157f9` (dir: restructure in a way to avoid passing around a struct dirent, 2020-01-16)), but differs in that it is meant for use on arbitrary files, such as those inside the '.git' dir. Because of this, it does not search the index for a matching entry to derive the 'd_type'. Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-19 10:25:01 -07:00
Fangyi Zhou	225e815ef2	help: fix doubled words in explanation for developer interfaces Signed-off-by: Fangyi Zhou <me@fangyi.io> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-16 09:20:11 -07:00
Junio C Hamano	ca20a44bc5	Merge branch 'jk/proto-v2-ref-prefix-fix' "git fetch" over protocol v2 sent an incorrect ref prefix request to the server and made "git pull" with configured fetch refspec that does not cover the remote branch to merge with fail, which has been corrected. * jk/proto-v2-ref-prefix-fix: fetch: add branch.*.merge to default ref-prefix extension fetch: stop checking for NULL transport->remote in do_fetch()	2022-09-15 16:09:47 -07:00
Junio C Hamano	d878d83ff0	Merge branch 'en/remerge-diff-fixes' Fix a few "git log --remerge-diff" bugs. * en/remerge-diff-fixes: diff: fix filtering of merge commits under --remerge-diff diff: fix filtering of additional headers under --remerge-diff diff: have submodule_format logic avoid additional diff headers	2022-09-15 16:09:46 -07:00
Adam Dinwoodie	255a6f91ae	t1800: correct test to handle Cygwin On Cygwin, when failing to spawn a process using start_command, Git outputs the same error as on Linux systems, rather than using the GIT_WINDOWS_NATIVE-specific error output. The WINDOWS test prerequisite is set in both Cygwin and native Windows environments, which means it's not appropriate to use to anticipate the error output from start_command. Instead, use the MINGW test prerequisite, which is only set for Git in native Windows environments, and not for Cygwin. Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org> Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-15 10:29:51 -07:00
Torsten Bögershausen	12fc4ad89e	diff.c: use utf8_strwidth() to count display width When unicode filenames (encoded in UTF-8) are used, the visible width on the screen is not the same as strlen(). For example, `git log --stat` may produce an output like this: [snip the header] Arger.txt \| 1 + Ärger.txt \| 1 + 2 files changed, 2 insertions(+) A side note: the original report was about cyrillic filenames. After some investigations it turned out that a) This is not a problem with "ambiguous characters" in unicode b) The same problem exists for all unicode code points (so we can use Latin based Umlauts for demonstrations below) The 'Ä' takes the same space on the screen as the 'A'. But needs one more byte in memory, so the the `git log --stat` output for "Arger.txt" (!) gets mis-aligned: The maximum length is derived from "Ärger.txt", 10 bytes in memory, 9 positions on the screen. That is why "Arger.txt" gets one extra ' ' for aligment, it needs 9 bytes in memory. If there was a file "Ö", it would be correctly aligned by chance, but "Öhö" would not. The solution is of course, to use utf8_strwidth() instead of strlen() when dealing with the width on screen. And then there is another problem, code like this: strbuf_addf(&out, "%-s", len, name); (or using the underlying snprintf() function) does not align the buffer to a minimum of len measured in screen-width, but uses the memory count. One could be tempted to wish that snprintf() was UTF-8 aware. That doesn't seem to be the case anywhere (tested on Linux and Mac), probably snprintf() uses the "bytes in memory"/strlen() approach to be compatible with older versions and this will never change. The basic idea is to change code in diff.c like this strbuf_addf(&out, "%-s", len, name); into something like this: int padding = len - utf8_strwidth(name); if (padding < 0) padding = 0; strbuf_addf(&out, " %s%*s", name, padding, ""); The real change is slighty bigger, as it, as well, integrates two calls of strbuf_addf() into one. Tests: Two things need to be tested: - The calculation of the maximum width - The calculation of padding The name "textfile" is changed into "tëxtfilë", both have a width of 8. If strlen() was used, to get the maximum width, the shorter "binfile" would have been mis-aligned: binfile \| [snip] tëxtfilë \| [snip] If only "binfile" would be renamed into "binfilë": binfilë \| [snip] textfile \| [snip] In order to verify that the width is calculated correctly everywhere, "binfile" is renamed into "binfilë", giving 1 bytes more in strlen() "tëxtfile" is renamed into "tëxtfilë", 2 byte more in strlen(). The updated t4012-diff-binary.sh checks the correct aligment: binfilë \| [snip] tëxtfilë \| [snip] Reported-by: Alexander Meshcheryakov <alexander.s.m@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-14 13:48:18 -07:00
Junio C Hamano	b563638d2c	Merge branch 'ab/submodule-helper-leakfix' Plugging leaks in submodule--helper. * ab/submodule-helper-leakfix: submodule--helper: fix a configure_added_submodule() leak submodule--helper: free rest of "displaypath" in "struct update_data" submodule--helper: free some "displaypath" in "struct update_data" submodule--helper: fix a memory leak in print_status() submodule--helper: fix a leak in module_add() submodule--helper: fix obscure leak in module_add() submodule--helper: fix "reference" leak submodule--helper: fix a memory leak in get_default_remote_submodule() submodule--helper: fix a leak with repo_clear() submodule--helper: fix "sm_path" and other "module_cb_list" leaks submodule--helper: fix "errmsg_str" memory leak submodule--helper: add and use *_release() functions submodule--helper: don't leak {run,capture}_command() cp.dir argument submodule--helper: "struct pathspec" memory leak in module_update() submodule--helper: fix most "struct pathspec" memory leaks submodule--helper: fix trivial get_default_remote_submodule() leak submodule--helper: fix a leak in "clone_submodule"	2022-09-14 12:56:40 -07:00
Junio C Hamano	dd407f1c7c	Merge branch 'ab/unused-annotation' Undoes 'jk/unused-annotation' topic and redoes it to work around Coccinelle rules misfiring false positives in unrelated codepaths. * ab/unused-annotation: git-compat-util.h: use "deprecated" for UNUSED variables git-compat-util.h: use "UNUSED", not "UNUSED(var)"	2022-09-14 12:56:39 -07:00
Junio C Hamano	a6b42ec0c6	Merge branch 'jk/unused-annotation' Annotate function parameters that are not used (but cannot be removed for structural reasons), to prepare us to later compile with -Wunused warning turned on. * jk/unused-annotation: is_path_owned_by_current_uid(): mark "report" parameter as unused run-command: mark unused async callback parameters mark unused read_tree_recursive() callback parameters hashmap: mark unused callback parameters config: mark unused callback parameters streaming: mark unused virtual method parameters transport: mark bundle transport_options as unused refs: mark unused virtual method parameters refs: mark unused reflog callback parameters refs: mark unused each_ref_fn parameters git-compat-util: add UNUSED macro	2022-09-14 12:56:39 -07:00
Junio C Hamano	2c75b3255b	Merge branch 'en/merge-unstash-only-on-clean-merge' into maint The auto-stashed local changes created by "git merge --autostash" was mixed into a conflicted state left in the working tree, which has been corrected. * en/merge-unstash-only-on-clean-merge: merge: only apply autostash when appropriate	2022-09-13 12:21:11 -07:00
Junio C Hamano	aa31cb8974	Merge branch 'jk/pipe-command-nonblock' into maint Fix deadlocks between main Git process and subprocess spawned via the pipe_command() API, that can kill "git add -p" that was reimplemented in C recently. * jk/pipe-command-nonblock: pipe_command(): mark stdin descriptor as non-blocking pipe_command(): handle ENOSPC when writing to a pipe pipe_command(): avoid xwrite() for writing to pipe git-compat-util: make MAX_IO_SIZE define globally available nonblock: support Windows compat: add function to enable nonblocking pipes	2022-09-13 12:21:08 -07:00
Junio C Hamano	72869e750b	Merge branch 'jk/is-promisor-object-keep-tree-in-use' into maint An earlier optimization discarded a tree-object buffer that is still in use, which has been corrected. * jk/is-promisor-object-keep-tree-in-use: is_promisor_object(): fix use-after-free of tree buffer	2022-09-13 12:21:07 -07:00
Junio C Hamano	de1fee2f1e	Merge branch 'ow/rev-parse-parseopt-fix' The parser in the script interface to parse-options in "git rev-parse" has been updated to diagnose a bogus input correctly. * ow/rev-parse-parseopt-fix: rev-parse --parseopt: detect missing opt-spec	2022-09-13 11:38:25 -07:00
Junio C Hamano	e4ffba458f	Merge branch 'js/builtin-add-p-portability-fix' More fixes to "add -p" * js/builtin-add-p-portability-fix: t6132(NO_PERL): do not run the scripted `add -p` t3701: test the built-in `add -i` regardless of NO_PERL add -p: avoid ambiguous signed/unsigned comparison	2022-09-13 11:38:24 -07:00
Junio C Hamano	76ffa818c7	Merge branch 'sg/parse-options-subcommand' The codepath for the OPT_SUBCOMMAND facility has been cleaned up. * sg/parse-options-subcommand: notes, remote: show unknown subcommands between `' notes: simplify default operation mode arguments check test-parse-options.c: fix style of comparison with zero test-parse-options.c: don't use for loop initial declaration t0040-parse-options: remove leftover debugging	2022-09-13 11:38:24 -07:00
Junio C Hamano	655e494047	Merge branch 'jk/rev-list-verify-objects-fix' "git rev-list --verify-objects" ought to inspect the contents of objects and notice corrupted ones, but it didn't when the commit graph is in use, which has been corrected. * jk/rev-list-verify-objects-fix: rev-list: disable commit graph with --verify-objects lookup_commit_in_graph(): use prepare_commit_graph() to check for graph	2022-09-13 11:38:24 -07:00
Junio C Hamano	8b2f027e20	Merge branch 'jk/upload-pack-skip-hash-check' The server side that responds to "git fetch" and "git clone" request has been optimized by allowing it to send objects in its object store without recomputing and validating the object names. * jk/upload-pack-skip-hash-check: t1060: check partial clone of misnamed blob parse_object(): check commit-graph when skip_hash set upload-pack: skip parse-object re-hashing of "want" objects parse_object(): allow skipping hash check	2022-09-13 11:38:23 -07:00
Junio C Hamano	f322e9f51b	Merge branch 'ab/submodule-helper-prep' Code clean-up of "git submodule--helper". * ab/submodule-helper-prep: (33 commits) submodule--helper: fix bad config API usage submodule--helper: libify even more "die" paths for module_update() submodule--helper: libify more "die" paths for module_update() submodule--helper: check repo{_submodule,}_init() return values submodule--helper: libify "must_die_on_failure" code paths (for die) submodule--helper update: don't override 'checkout' exit code submodule--helper: libify "must_die_on_failure" code paths submodule--helper: libify determine_submodule_update_strategy() submodule--helper: don't exit() on failure, return submodule--helper: use "code" in run_update_command() submodule API: don't handle SM_..{UNSPECIFIED,COMMAND} in to_string() submodule--helper: don't call submodule_strategy_to_string() in BUG() submodule--helper: add missing braces to "else" arm submodule--helper: return "ret", not "1" from update_submodule() submodule--helper: rename "int res" to "int ret" submodule--helper: don't redundantly check "else if (res)" submodule--helper: refactor "errmsg_str" to be a "struct strbuf" submodule--helper: add "const" to passed "struct update_data" submodule--helper: add "const" to copy of "update_data" submodule--helper: add "const" to passed "module_clone_data" ...	2022-09-13 11:38:23 -07:00
Eric Sunshine	7c04aa7390	chainlint: colorize problem annotations and test delimiters When `chainlint.pl` detects problems in a test definition, it emits the test definition with "?!FOO?!" annotations highlighting the problems it discovered. For instance, given this problematic test: test_expect_success 'discombobulate frobnitz' ' git frob babble && (echo balderdash; echo gnabgib) >expect && for i in three two one do git nitfol $i done >actual test_cmp expect actual ' chainlint.pl will output: # chainlint: t1234-confusing.sh # chainlint: discombobulate frobnitz git frob babble && (echo balderdash ; ?!AMP?! echo gnabgib) >expect && for i in three two one do git nitfol $i ?!LOOP?! done >actual ?!AMP?! test_cmp expect actual in which it may be difficult to spot the "?!FOO?!" annotations. The problem is compounded when multiple tests, possibly in multiple scripts, fail "linting", in which case it may be difficult to spot the "# chainlint:" lines which delimit one problematic test from another. To ameliorate this potential problem, colorize the "?!FOO?!" annotations in order to quickly draw the test author's attention to the problem spots, and colorize the "# chainlint:" lines to help the author identify the name of each script and each problematic test. Colorization is disabled automatically if output is not directed to a terminal or if NO_COLOR environment variable is set. The implementation is specific to Unix (it employs `tput` if available) but works equally well in the Git for Windows development environment which emulates Unix sufficiently. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-12 21:33:58 -07:00
Junio C Hamano	fe3939bc2a	Merge branch 'vd/sparse-reset-checkout-fixes' Segfault fix-up to an earlier fix to the topic to teach "git reset" and "git checkout" work better in a sparse checkout. * vd/sparse-reset-checkout-fixes: unpack-trees: fix sparse directory recursion check	2022-09-09 12:02:26 -07:00
Junio C Hamano	0e2a4764ed	Merge branch 'jc/format-patch-force-in-body-from' "git format-patch --from=<ident>" can be told to add an in-body "From:" line even for commits that are authored by the given <ident> with "--force-in-body-from"option. * jc/format-patch-force-in-body-from: format-patch: learn format.forceInBodyFrom configuration variable format-patch: allow forcing the use of in-body From: header pretty: separate out the logic to decide the use of in-body from	2022-09-09 12:02:25 -07:00
Junio C Hamano	428dce9f4d	Merge branch 'js/range-diff-with-pathspec' Allow passing a pathspec to "git range-diff". * js/range-diff-with-pathspec: range-diff: optionally accept pathspecs range-diff: consistently validate the arguments range-diff: reorder argument handling	2022-09-09 12:02:25 -07:00
Junio C Hamano	fb094cb583	Merge branch 'js/add-p-diff-parsing-fix' Those who use diff-so-fancy as the diff-filter noticed a regression or two in the code that parses the diff output in the built-in version of "add -p", which has been corrected. * js/add-p-diff-parsing-fix: add -p: ignore dirty submodules add -p: gracefully handle unparseable hunk headers in colored diffs add -p: detect more mismatches between plain vs colored diffs	2022-09-09 12:02:24 -07:00
Øystein Walle	f20b9c36d0	rev-parse --parseopt: detect missing opt-spec After `2d893dff4c` (rev-parse --parseopt: allow [*=?!] in argument hints, 2015-07-14) updated the parser, a line in parseopts's input can start with one of the flag characters and be erroneously parsed as a opt-spec where the short name of the option is the flag character itself and the long name is after the end of the string. This makes Git want to allocate SIZE_MAX bytes of memory at this line: o->long_name = xmemdupz(sb.buf + 2, s - sb.buf - 2); Since s and sb.buf are equal the second argument is -2 (except unsigned) and xmemdupz allocates len + 1 bytes, ie. -1 meaning SIZE_MAX. Avoid this by checking whether a flag character was found in the zeroth position. Reported-by: Ingy dot Net <ingy@ingy.net> Signed-off-by: Øystein Walle <oystwa@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-08 14:55:07 -07:00
Jeff King	49ca2fba39	fetch: add branch..merge to default ref-prefix extension When running "git pull" with no arguments, we'll do a default "git fetch" and then try to merge the branch specified by the branch..merge config. There's code in get_ref_map() to treat that "merge" branch as something we want to fetch, even if it is not otherwise covered by the default refspec. This works fine with the v0 protocol, as the server tells us about all of the refs, and get_ref_map() is the ultimate decider of what we fetch. But in the v2 protocol, we send the ref-prefix extension to the server, asking it to limit the ref advertisement. And we only tell it about the default refspec for the remote; we don't mention the branch..merge config at all. This usually doesn't matter, because the default refspec matches "refs/heads/", which covers all branches. But if you explicitly use a narrow refspec, then "git pull" on some branches may fail. The server doesn't advertise the branch, so we don't fetch it, and "git pull" thinks that it went away upstream. We can fix this by including any branch..merge entries for the current branch in the list of ref-prefixes we pass to the server. This only needs to happen when using the default configured refspec (since command-line refspecs are already added, and take precedence in deciding what we fetch). We don't otherwise need to replicate any of the "what to fetch" logic in get_ref_map(). These ref-prefixes are an optimization, so it's OK if we tell the server to advertise the branch..merge ref, even if we're not going to pull it. We'll just choose not to fetch it. The test here is based on one constructed by Johannes. I modified the branch names to trigger the ref-prefix issue (and be more descriptive), and to confirm that "git pull" actually updated the local ref, which should be more robust than just checking stderr. Reported-by: Lana Deere <lana.deere@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-08 13:10:37 -07:00
Jeff King	945ed00957	t1060: check partial clone of misnamed blob A recent commit (upload-pack: skip parse-object re-hashing of "want" objects, 2022-09-06) loosened the behavior of upload-pack so that it does not verify the sha1 of objects it receives directly via "want" requests. The existing corruption tests in t1060 aren't affected by this: the corruptions are blobs reachable from commits, and the client requests the commits. The more interesting case here is a partial clone, where the client will directly ask for the corrupted blob when it does an on-demand fetch of the filtered object. And that is not covered at all, so let's add a test. It's important here that we use the "misnamed" corruption and not "bit-error". The latter is sufficiently corrupted that upload-pack cannot even figure out the type of the object, so it bails identically both before and after the recent change. But with "misnamed", with the hash-checks enabled it sees the problem (though the error messages are a bit confusing because of the inability to create a "struct object" to store the flags): error: hash mismatch d95f3ad14dee633a758d2e331151e950dd13e4ed fatal: git upload-pack: not our ref d95f3ad14dee633a758d2e331151e950dd13e4ed fatal: remote error: upload-pack: not our ref d95f3ad14dee633a758d2e331151e950dd13e4ed After the change to skip the hash check, the server side happily sends the bogus object, but the client correctly realizes that it did not get the necessary data: remote: Enumerating objects: 1, done. remote: Counting objects: 100% (1/1), done. remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 0 Receiving objects: 100% (1/1), 49 bytes \| 49.00 KiB/s, done. fatal: bad revision 'd95f3ad14dee633a758d2e331151e950dd13e4ed' error: [...]/misnamed did not send all necessary objects which is exactly what we expect to happen. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-07 15:08:51 -07:00
Jeff King	0bc2557951	upload-pack: skip parse-object re-hashing of "want" objects Imagine we have a history with commit C pointing to a large blob B. If a client asks us for C, we can generally serve both objects to them without accessing the uncompressed contents of B. In upload-pack, we figure out which commits we have and what the client has, and feed those tips to pack-objects. In pack-objects, we traverse the commits and trees (or use bitmaps!) to find the set of objects needed, but we never open up B. When we serve it to the client, we can often pass the compressed bytes directly from the on-disk packfile over the wire. But if a client asks us directly for B, perhaps because they are doing an on-demand fetch to fill in the missing blob of a partial clone, we end up much slower. Upload-pack calls parse_object() on the oid we receive, which opens up the object and re-checks its hash (even though if it were a commit, we might skip this parse entirely in favor of the commit graph!). And then we feed the oid directly to pack-objects, which again calls parse_object() and opens the object. And then finally, when we write out the result, we may send bytes straight from disk, but only after having unnecessarily uncompressed and computed the sha1 of the object twice! This patch teaches both code paths to use the new SKIP_HASH_CHECK flag for parse_object(). You can see the speed-up in p5600, which does a blob:none clone followed by a checkout. The savings for git.git are modest: Test HEAD^ HEAD ---------------------------------------------------------------------- 5600.3: checkout of result 2.23(4.19+0.24) 1.72(3.79+0.18) -22.9% But the savings scale with the number of bytes. So on a repository like linux.git with more files, we see more improvement (in both absolute and relative numbers): Test HEAD^ HEAD ---------------------------------------------------------------------------- 5600.3: checkout of result 51.62(77.26+2.76) 34.86(61.41+2.63) -32.5% And here's an even more extreme case. This is the android gradle-plugin repository, whose tip checkout has ~3.7GB of files: Test HEAD^ HEAD -------------------------------------------------------------------------- 5600.3: checkout of result 79.51(90.84+5.55) 40.28(51.88+5.67) -49.3% Keep in mind that these timings are of the whole checkout operation. So they count the client indexing the pack and actually writing out the files. If we want to see just the server's view, we can hack up the GIT_TRACE_PACKET output from those operations and replay it via upload-pack. For the gradle example, that gives me: Benchmark 1: GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input Time (mean ± σ): 50.884 s ± 0.239 s [User: 51.450 s, System: 1.726 s] Range (min … max): 50.608 s … 51.025 s 3 runs Benchmark 2: GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input Time (mean ± σ): 9.728 s ± 0.112 s [User: 10.466 s, System: 1.535 s] Range (min … max): 9.618 s … 9.842 s 3 runs Summary 'GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input' ran 5.23 ± 0.07 times faster than 'GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input' So a server would see an 80% reduction in CPU serving the initial checkout of a partial clone for this repository. Or possibly even more depending on the packing; most of the time spent in the faster one were objects we had to open during the write phase. In both cases skipping the extra hashing on the server should be pretty safe. The client doesn't trust the server anyway, so it will re-hash all of the objects via index-pack. There is one thing to note, though: the change in get_reference() affects not just pack-objects, but rev-list, git-log, etc. We could use a flag to limit to index-pack here, but we may already skip hash checks in this instance. For commits, we'd skip anything we load via the commit-graph. And while before this commit we would check a blob fed directly to rev-list on the command-line, we'd skip checking that same blob if we found it by traversing a tree. The exception for both is if --verify-objects is used. In that case, we'll skip this optimization, and the new test makes sure we do this correctly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-07 12:20:02 -07:00
SZEDER Gábor	45bec2ead2	test-parse-options.c: fix style of comparison with zero The preferred style is '!argc' instead of 'argc == 0'. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-07 12:06:12 -07:00
SZEDER Gábor	6983f4e3b2	test-parse-options.c: don't use for loop initial declaration We would like to eventually use for loop initial declarations in our codebase, but we are not there yet. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-07 12:06:12 -07:00
SZEDER Gábor	9a22b4d907	t0040-parse-options: remove leftover debugging Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-07 12:06:12 -07:00
Jeff King	b27ccae34b	rev-list: disable commit graph with --verify-objects Since the point of --verify-objects is to actually load and checksum the bytes of each object, optimizing out reads using the commit graph runs contrary to our goal. The most targeted way to implement this would be for the revision traversal code to check revs->verify_objects and avoid using the commit graph. But it's difficult to be sure we've hit all of the correct spots. For instance, I started this patch by writing the first of the included test cases, where the corrupted commit is directly on rev-list's command line. And that is easy to fix by teaching get_reference() to check revs->verify_objects before calling lookup_commit_in_graph(). But that doesn't cover the second test case: when we traverse to a corrupted commit, we'd parse the parent in process_parents(). So we'd need to check there, too. And it keeps going. In handle_commit() we sometimes parses commits, too, though I couldn't figure out a way to trigger it that did not already parse via get_reference() or tag peeling. And try_to_simplify_commit() has its own parse call, and so on. So it seems like the safest thing is to just disable the commit graph for the whole process when we see the --verify-objects option. We can do that either in builtin/rev-list.c, where we use the option, or in revision.c, where we parse it. There are some subtleties: - putting it in rev-list.c is less surprising in some ways, because there we know we are just doing a single traversal. In a command which does multiple traversals in a single process, it's rather unexpected to globally disable the commit graph. - putting it in revision.c is less surprising in some ways, because the caller does not have to remember to disable the graph themselves. But this is already tricky! The verify_objects flag in rev_info doesn't do anything by itself. The caller has to provide an object callback which does the right thing. - for that reason, in practice nobody but rev-list uses this option in the first place. So the distinction is probably not important either way. Arguably it should just be an option of rev-list, and not the general revision machinery; right now you can run "git log --verify-objects", but it does not actually do anything useful. - checking for a parsed revs.verify_objects flag in rev-list.c is too late. By that time we've already passed the arguments to setup_revisions(), which will have parsed the commits using the graph. So this commit disables the graph as soon as we see the option in revision.c. That's a pretty broad hammer, but it does what we want, and in practice nobody but rev-list is using this flag anyway. The tests cover both the "tip" and "parent" cases. Obviously our hammer hits them both in this case, but it's good to check both in case somebody later tries the more focused approach. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-07 09:44:30 -07:00
Junio C Hamano	27fb520ef2	Merge branch 'jk/test-crontab-fixes' Test helper fix. * jk/test-crontab-fixes: test-crontab: minor memory and error handling fixes	2022-09-05 18:33:41 -07:00
Junio C Hamano	fcbc8743ef	Merge branch 'en/test-without-test-create-repo' Test clean-up. * en/test-without-test-create-repo: t64xx: convert 'test_create_repo' to 'git init'	2022-09-05 18:33:41 -07:00
Junio C Hamano	56785a3fad	Merge branch 'bc/gc-crontab-fix' FreeBSD portability fix for "git maintenance" that spawns "crontab" to schedule tasks. * bc/gc-crontab-fix: gc: use temporary file for editing crontab	2022-09-05 18:33:41 -07:00
Junio C Hamano	2d88021919	Merge branch 'es/t4301-sed-portability-fix' Test clean-up. * es/t4301-sed-portability-fix: t4301: emit blank line in more idiomatic fashion t4301: fix broken &&-chains and add missing loop termination t4301: account for behavior differences between sed implementations	2022-09-05 18:33:40 -07:00
Junio C Hamano	5784d201da	Merge branch 'rs/test-mergesort' Optimization of a test-helper command. * rs/test-mergesort: test-mergesort: use mem_pool for sort input test-mergesort: read sort input all at once	2022-09-05 18:33:40 -07:00
Junio C Hamano	3fe0121479	Merge branch 'ac/bitmap-lookup-table' The pack bitmap file gained a bitmap-lookup table to speed up locating the necessary bitmap for a given commit. * ac/bitmap-lookup-table: pack-bitmap-write: drop unused pack_idx_entry parameters bitmap-lookup-table: add performance tests for lookup table pack-bitmap: prepare to read lookup table extension pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests pack-bitmap-write.c: write lookup table extension bitmap: move `get commit positions` code to `bitmap_writer_finish` Documentation/technical: describe bitmap lookup table extension	2022-09-05 18:33:39 -07:00
Junio C Hamano	cf98b69053	Merge branch 'tb/midx-with-changing-preferred-pack-fix' Multi-pack index got corrupted when preferred pack changed from one pack to another in a certain way, which has been corrected. * tb/midx-with-changing-preferred-pack-fix: midx.c: avoid adding preferred objects twice midx.c: include preferred pack correctly with existing MIDX midx.c: extract `midx_fanout_add_pack_fanout()` midx.c: extract `midx_fanout_add_midx_fanout()` midx.c: extract `struct midx_fanout` t/lib-bitmap.sh: avoid silencing stderr t5326: demonstrate potential bitmap corruption	2022-09-05 18:33:39 -07:00
Victoria Dye	ba1b117eec	t/perf: add 'GIT_PERF_USE_SCALAR' run option Add a 'GIT_PERF_USE_SCALAR' environment variable (and corresponding perf config 'useScalar') to register a repository created with any of: * test_perf_fresh_repo * test_perf_default_repo * test_perf_large_repo as a Scalar enlistment. This is intended to allow a developer to test the impact of Scalar on already-defined performance scenarios. Suggested-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 10:02:56 -07:00
Victoria Dye	e2809233d1	t/perf: add Scalar performance tests Create 'p9210-scalar.sh' for testing Scalar performance and comparing performance of Git operations in Scalar registrations and standard repositories. Example results: Test this tree ------------------------------------------------------------------------ 9210.2: scalar clone 14.82(18.00+3.63) 9210.3: git clone 26.15(36.67+6.90) 9210.4: git status (scalar) 0.04(0.01+0.01) 9210.5: git status (non-scalar) 0.10(0.02+0.11) 9210.6: test_commit --append --no-tag A (scalar) 0.08(0.02+0.03) 9210.7: test_commit --append --no-tag A (non-scalar) 0.13(0.03+0.11) Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 10:02:56 -07:00
Victoria Dye	14b4e7e5a4	scalar-clone: add test coverage Create a new test file ('t9211-scalar-clone.sh') to exercise the options and behavior of the 'scalar clone' command. Each test clones to a unique target location and cleans up the cloned repo only when the test passes. This ensures that failed tests' artifacts are captured in CI artifacts for further debugging. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 10:02:56 -07:00
Victoria Dye	7b5c93c6c6	scalar: include in standard Git build & installation Move 'scalar' out of 'contrib/' and into the root of the Git tree. The goal of this change is to build 'scalar' as part of the standard Git build & install processes. This patch includes both the physical move of Scalar's files out of 'contrib/' ('scalar.c', 'scalar.txt', and 't9xxx-scalar.sh'), and the changes to the build definitions in 'Makefile' and 'CMakelists.txt' to accommodate the new program. At a high level, Scalar is built so that: - there is a 'scalar-objs' target (similar to those created in `029bac01a8` (Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets, 2021-02-23)) for debugging purposes. - it appears in the root of the install directory (rather than the gitexecdir). - it is included in the 'bin-wrappers/' directory for use in tests. - it receives a platform-specific executable suffix (e.g., '.exe'), if applicable. - 'scalar.txt' is installed as 'man1' documentation. - the 'clean' target removes the 'scalar' executable. Additionally, update the root level '.gitignore' file to ignore the Scalar executable. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 10:02:55 -07:00
Victoria Dye	037f8ea6d9	unpack-trees: fix sparse directory recursion check Ensure 'is_sparse_directory_entry()' receives a valid 'name_entry ' if one exists in the list of tree(s) being unpacked in 'unpack_callback()'. Currently, 'is_sparse_directory_entry()' is called with the first 'name_entry' in the 'names' list of entries on 'unpack_callback()'. However, this entry may be empty even when other elements of 'names' are not (such as when switching from an orphan branch back to a "normal" branch). As a result, 'is_sparse_directory_entry()' could incorrectly indicate that a sparse directory is not* actually sparse because the name of the index entry does not match the (empty) 'name_entry' path. Fix the issue by using the existing 'name_entry p' value in 'unpack_callback()', which points to the first non-empty entry in 'names'. Because 'p' is 'const', also update 'is_sparse_directory_entry()'s 'name_entry ' argument to be 'const'. Finally, add a regression test case. Reported-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:43:09 -07:00
Elijah Newren	67360b75c6	diff: fix filtering of merge commits under --remerge-diff Commit `95433eeed9` ("diff: add ability to insert additional headers for paths", 2022-02-02) introduced the possibility of additional headers. Because there could be conflicts with no content differences (e.g. a modify/delete conflict resolved in favor of taking the modified file as-is), that commit also modified the diff_queue_is_empty() and diff_flush_patch() logic to ensure these headers were included even if there was no associated content diff. However, the added logic was a bit inconsistent between these two functions. diff_queue_is_empty() overlooked the fact that the additional headers strmap could be non-NULL and empty, which would cause it to display commits that should have been filtered out. Fix the diff_queue_is_empty() logic to also account for additional_path_headers being empty. Reported-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:22:25 -07:00
Elijah Newren	71a146dc70	diff: fix filtering of additional headers under --remerge-diff Commit `95433eeed9` ("diff: add ability to insert additional headers for paths", 2022-02-02) introduced the possibility of additional headers. Because there could be conflicts with no content differences (e.g. a modify/delete conflict resolved in favor of taking the modified file as-is), that commit also modified the diff_queue_is_empty() and diff_flush_patch() logic to ensure these headers were included even if there was no associated content diff. However, when the pickaxe is active, we really only want the remerge conflict headers to be shown when there is an associated content diff. Adjust the logic in these two functions accordingly. This also removes the TEST_PASSES_SANITIZE_LEAK=true declaration from t4069, as there is apparently some kind of memory leak with the pickaxe code. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:22:25 -07:00
Elijah Newren	9b08091cb7	diff: have submodule_format logic avoid additional diff headers Commit `95433eeed9` ("diff: add ability to insert additional headers for paths", 2022-02-02) introduced the possibility of additional headers, created in create_filepairs_for_header_only_notifications(). These are represented by inserting additional pairs in diff_queued_diff which always have a mode of 0 and a null_oid. When these were added, one code path was noted to assume that at least one of the diff_filespecs in the pair were valid, and that codepath was corrected. The submodule_format handling is another codepath with the same issue; it would operate on these additional headers and attempt to display them as submodule changes. Prevent that by explicitly checking for "phoney" filepairs (i.e. filepairs with both modes being 0). Reported-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:22:25 -07:00
Ævar Arnfjörð Bjarmason	fe4c750fb1	submodule--helper: fix a configure_added_submodule() leak Fix config API a memory leak added in `a452128a36` (submodule--helper: introduce add-config subcommand, 2021-08-06) by using the *_tmp() variant of git_config_get_string(). In this case we're only checking whether the (repo\|git)_config_get_string() call is telling us that the "submodule.active" key exists. As with the preceding commit we'll find many other such patterns in the codebase if we go fishing. E.g. "git gc" leaks in the code added in `61f7a383d3` (maintenance: use 'incremental' strategy by default, 2020-10-15). Similar code in "git gc" added in `b08ff1fee0` (maintenance: add --schedule option and config, 2020-09-11) doesn't leak, but we could avoid the malloc() & free() in that case. A coccinelle rule to find those would find and fix some leaks, and cases where we're doing needless malloc() + free()'s but only care about the key existence, or are copying the (repo\|git)_config_get_string() return value right away. But as with the preceding commit let's punt on all of that for now, and just narrowly fix this specific case in submodule--helper. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:18:13 -07:00
Ævar Arnfjörð Bjarmason	d40c42e06b	submodule--helper: free some "displaypath" in "struct update_data" Make the update_data_release() function free "displaypath" member when appropriate. The "displaypath" member is always ours, the "const" on the "char *" was wrong to begin with. This leaves a leak of "displaypath" in update_submodule(), which as we'll see in subsequent commits is harder to deal with than this trivial fix. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:18:13 -07:00
Ævar Arnfjörð Bjarmason	980416e469	submodule--helper: fix "sm_path" and other "module_cb_list" leaks Fix leaks in "struct module_cb_list" and the "struct module_cb" which it contains, these fix leaks in `e83e3333b5` (submodule: port submodule subcommand 'summary' from shell to C, 2020-08-13). The "sm_path" should always have been a "char ", not a "const char ", we always create it with xstrdup(). We can't mark any tests passing passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true" as a result of this change, but "t7401-submodule-summary.sh" gets closer to passing as a result of this change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:18:12 -07:00
Ævar Arnfjörð Bjarmason	87a683482a	submodule--helper: add and use *_release() functions Add release functions for "struct module_list", "struct submodule_update_clone" and "struct update_data". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:18:12 -07:00
Ævar Arnfjörð Bjarmason	e77b3da6bb	submodule--helper: fix a leak in "clone_submodule" Fix a memory leak of the "clone_data_path" variable that we copy or derive from the "struct module_clone_data" in clone_submodule(). This code was refactored in preceding commits, but the leak has been with us since `f8eaa0ba98` (submodule--helper, module_clone: always operate on absolute paths, 2016-03-31). For the "else" case we don't need to xstrdup() the "clone_data->path", and we don't need to free our own "clone_data_path". We can therefore assign the "clone_data->path" to our own "clone_data_path" right away, and only override it (and remember to free it!) if we need to xstrfmt() a replacement. In the case of the module_clone() caller it's from "argv", and doesn't need to be free'd, and in the case of the add_submodule() caller we get a pointer to "sm_path", which doesn't need to be directly free'd either. Fixing this leak makes several tests pass, so let's mark them as passing with TEST_PASSES_SANITIZE_LEAK=true. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:18:12 -07:00
Ævar Arnfjörð Bjarmason	0b917a9f5c	submodule--helper: return "ret", not "1" from update_submodule() Amend the update_submodule() function to return the failing "ret" on error, instead of overriding it with "1". This code was added in `b3c5f5cb04` (submodule: move core cmd_update() logic to C, 2022-03-15), and this change ends up not making a difference as this function is only called in update_submodules(). If we return non-zero here we'll always in turn return "1" in module_update(). But if we didn't do that and returned any other non-zero exit code in update_submodules() we'd fail the test that's being amended here. We're still testing the status quo here. This change makes subsequent refactoring of update_submodule() easier, as we'll no longer need to worry about clobbering the "ret" we get from the run_command(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:16:24 -07:00
Ævar Arnfjörð Bjarmason	96a28a9bc6	submodule--helper: move "resolve-relative-url-test" to a test-tool As its name suggests the "resolve-relative-url-test" has never been used outside of the test suite, see `63e95beb08` (submodule: port resolve_relative_url from shell to C, 2016-04-15) for its original addition. Perhaps it would make sense to drop this code entirely, as we feel that we've got enough indirect test coverage, but let's leave that question to a possible follow-up change. For now let's keep the test coverage this gives us. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason	85321a346b	submodule--helper: move "check-name" to a test-tool Move the "check-name" helper to a test-tool, since `a6226fd772` (submodule--helper: convert the bulk of cmd_add() to C, 2021-08-10) it has only been used by this test, not git-submodule.sh. As noted with its introduction in `0383bbb901` (submodule-config: verify submodule names as paths, 2018-04-30) the intent of t7450-bad-git-dotfiles.sh has always been to unit test the check_submodule_name() function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason	9fb2a970e9	submodule--helper: move "is-active" to a test-tool Create a new "test-tool submodule" and move the "is-active" subcommand over to it. It was added in `5c2bd8b77a` (submodule--helper: add is-active subcommand, 2017-03-16), since `a452128a36` (submodule--helper: introduce add-config subcommand, 2021-08-06) it hasn't been used by git-submodule.sh. Since we're creating a command dispatch similar to test-tool.c itself let's split out the "struct test_cmd" into a new test-tool-utils.h, which both this new code and test-tool.c itself can use. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason	255a1ae5da	test-tool submodule-config: remove unused "--url" handling No test has used this "--url" parameter since the test code that made use of it was removed in `32bc548329` (submodule-config: remove support for overlaying repository config, 2017-08-03). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason	31955475d1	submodule--helper: remove unused "list" helper Remove the "submodule--helper list" sub-command, which hasn't been used by git-submodule.sh since `2964d6e5e1` (submodule: port subcommand 'set-branch' from shell to C, 2020-06-02). There was a test added in `2b56bb7a87` (submodule helper list: respect correct path prefix, 2016-02-24) which relied on it, but the right thing to do here is to delete that test as well. That test was regression testing the "list" subcommand itself. We're not getting anything useful from the "list \| cut -f2" invocation that we couldn't get from "foreach 'echo $sm_path'". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:16:23 -07:00
Ævar Arnfjörð Bjarmason	59378e3355	submodule tests: test for "add <repository> <abs-path>" Add a missing test for ""add <repository> <path>" where "<path>" is an absolute path. This tests code added in [1] and later turned into an "else" branch in clone_submodule() in [2] that's never been tested. This needs to be skipped on WINDOWS because all of $PWD, $(pwd) and the "$(pwd -P)" we get via "$submodurl" would fail in CI with e.g.: fatal: could not create directory 'D:/a/git/git/t/trash directory.t7400-submodule-basic/.git/modules/D:/a/git/git/t/trash directory.t7400-submodule-basic/add-abs' I.e. we can't handle these sorts of paths in this context on that platform. I'm not sure where we run into the edges of "$PWD" behavior on Windows (see [1] for a previous loose end on the topic), but for the purposes of this test it's sufficient that we test this on other platforms. 1. `ee8838d157` (submodule: rewrite `module_clone` shell function in C, 2015-09-08) 2. `f8eaa0ba98` (submodule--helper, module_clone: always operate on absolute paths, 2016-03-31) 1. https://lore.kernel.org/git/220630.86edz6c75c.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:16:22 -07:00
Ævar Arnfjörð Bjarmason	89bc7b5c01	submodule tests: test usage behavior Test what exit code and output we emit on "git submodule -h", how we handle "--" when no subcommand is specified, and how the top-level "--recursive" option is handled. For "-h" this doesn't make sense, but let's test for it so that any subsequent eventual behavior change will become clear. For "--" this follows up on `68cabbfda3` (submodule: document default behavior, 2019-02-15) and tests that "status" doesn't support the "--" delimiter. There's no intrinsically good reason not to support that. We behave this way due to edge cases in git-submodule.sh's implementation, but as with "-h" let's assert our current long-standing behavior for now. For "--recursive" the exclusion of it from the top-level appears to have been an omission in `15fc56a853` (git submodule foreach: Add --recursive to recurse into nested submodules, 2009-08-19), there doesn't seem to be a reason not to support it alongside "--quiet" and "--cached", but let's likewise assert our existing behavior for now. I.e. as long as "status" is optional it would make sense to support all of its options when it's omitted, but we only do that with "--quiet" and "--cached", and curiously omit "--recursive". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 09:16:22 -07:00
Junio C Hamano	014a9ea207	Merge branch 'en/t4301-more-merge-tree-tests' More tests to protect the current behaviour of "merge-tree" before it gets further updated. * en/t4301-more-merge-tree-tests: t4301: add more interesting merge-tree testcases	2022-09-01 13:40:19 -07:00
Junio C Hamano	3a4779086d	Merge branch 'en/merge-unstash-only-on-clean-merge' The auto-stashed local changes created by "git merge --autostash" was mixed into a conflicted state left in the working tree, which has been corrected. * en/merge-unstash-only-on-clean-merge: merge: only apply autostash when appropriate	2022-09-01 13:40:18 -07:00
Junio C Hamano	d528044c83	Merge branch 'sg/parse-options-subcommand' Introduce the "subcommand" mode to parse-options API and update the command line parser of Git commands with subcommands. * sg/parse-options-subcommand: (23 commits) remote: run "remote rm" argv through parse_options() maintenance: add parse-options boilerplate for subcommands pass subcommand "prefix" arguments to parse_options() builtin/worktree.c: let parse-options parse subcommands builtin/stash.c: let parse-options parse subcommands builtin/sparse-checkout.c: let parse-options parse subcommands builtin/remote.c: let parse-options parse subcommands builtin/reflog.c: let parse-options parse subcommands builtin/notes.c: let parse-options parse subcommands builtin/multi-pack-index.c: let parse-options parse subcommands builtin/hook.c: let parse-options parse subcommands builtin/gc.c: let parse-options parse 'git maintenance's subcommands builtin/commit-graph.c: let parse-options parse subcommands builtin/bundle.c: let parse-options parse subcommands parse-options: add support for parsing subcommands parse-options: drop leading space from '--git-completion-helper' output parse-options: clarify the limitations of PARSE_OPT_NODASH parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options api-parse-options.txt: fix description of OPT_CMDMODE t0040-parse-options: test parse_options() with various 'parse_opt_flags' ...	2022-09-01 13:40:18 -07:00
Junio C Hamano	68ef0425d9	Merge branch 'ds/bundle-uri-clone' Implement "git clone --bundle-uri". * ds/bundle-uri-clone: clone: warn on failure to repo_init() clone: --bundle-uri cannot be combined with --depth bundle-uri: add support for http(s):// and file:// clone: add --bundle-uri option bundle-uri: create basic file-copy logic remote-curl: add 'get' capability	2022-09-01 13:40:17 -07:00
Ævar Arnfjörð Bjarmason	5cf88fd8b0	git-compat-util.h: use "UNUSED", not "UNUSED(var)" As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of `9b24034754` (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:49:48 -07:00
Eric Sunshine	fb41727b7e	t: retire unused chainlint.sed Retire chainlint.sed since it has been replaced by a more accurate and functional &&-chain "linter", thus is no longer used. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	69b9924b87	t/Makefile: teach `make test` and `make prove` to run chainlint.pl Unlike chainlint.sed which "lints" a single test body at a time, thus is invoked once per test, chainlint.pl can check all test bodies in all test scripts with a single invocation. As such, it is akin to other bulk "linters" run by the Makefile, such as `test-lint-shell-syntax`, `test-lint-duplicates`, etc. Therefore, teach `make test` and `make prove` to invoke chainlint.pl along with the other bulk linters. Also, since the single chainlint.pl invocation by `make test` or `make prove` has already checked all tests in all scripts, instruct the individual test scripts not to run chainlint.pl on themselves unnecessarily. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	23a14f3016	test-lib: replace chainlint.sed with chainlint.pl By automatically invoking chainlint.sed upon each test it runs, `test_run_` in test-lib.sh ensures that broken &&-chains will be detected early as tests are modified or new are tests created since it is typical to run a test script manually (i.e. `./t1234-test-script.sh`) during test development. Now that the implementation of chainlint.pl is complete, modify test-lib.sh to invoke it automatically instead of chainlint.sed each time a test script is run. This change reduces the number of "linter" invocations from 26800+ (once per test run) down to 1050+ (once per test script), however, a subsequent change will drop the number of invocations to 1 per `make test`, thus fully realizing the benefit of the new linter. Note that the "magic exit code 117" &&-chain checker added by `bb79af9d09` (t/test-lib: introduce --chain-lint option, 2015-03-20) which is built into t/test-lib.sh is retained since it has near zero-cost and (theoretically) may catch a broken &&-chain not caught by chainlint.pl. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	9fd911237f	test-lib: retire "lint harder" optimization hack `test_run_` in test-lib.sh "lints" the body of a test by sending it down a `sed chainlint.sed \| grep` pipeline; this happens once for each test run by a test script. Although this pipeline may seem relatively cheap in isolation, it can become expensive when invoked 26800+ times by `make test`, once for each test run, despite the existence of only 16500+ test definitions across all tests scripts. This difference in the number of tests defined in the scripts (16500+) and the number of tests actually run by `make test` (26800+) is explained by the fact that some test scripts run a very large number of small tests, all driven by a series of functions/loops which fill in the test bodies. This means that certain test definitions are being linted repeatedly (tens or hundreds of times) unnecessarily. To avoid such unnecessary work, `2d86a96220` (t: avoid sed-based chain-linting in some expensive cases, 2021-05-13) added an optimization hack which allows individual scripts to manually suppress the unnecessary repeated linting of the same test definition. However, unlike chainlint.sed which checks a test body as the test is run, chainlint.pl checks each test definition just once, no matter how many times the test is run, thus the sort of optimization hack introduced by `2d86a96220` is no longer needed and can be retired. Therefore, revert `2d86a96220`. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	56066523ed	t/chainlint: add more chainlint.pl self-tests During the development of chainlint.pl, numerous new self-tests were created to verify correct functioning beyond the checks already represented by the existing self-tests. The new checks fall into several categories: * behavior of the lexical analyzer for complex cases, such as line splicing, token pasting, entering and exiting string contexts inside and outside of test script bodies; for instance: test_expect_success 'title' ' x=$(echo "something" \| sed -e '\''s/\\/\\\\/g'\'' -e '\''s/[[/.^$]/\\&/g'\'' ' behavior of the parser for all compound grammatical constructs, such as `if...fi`, `case...esac`, `while...done`, `{...}`, etc., and for other legal shell grammatical constructs not covered by existing chainlint.sed self-tests, as well as complex cases, such as: OUT=$( ((large_git 1>&3) \| :) 3>&1 ) && * detection of problems, such as &&-chain breakage, from top-level to any depth since the existing self-tests do not cover any top-level context and only cover subshells one level deep due to limitations of chainlint.sed * address blind spots in chainlint.sed (such as not detecting a broken &&-chain on a one-line for-loop in a subshell[1]) which chainlint.pl correctly detects * real-world cases which tripped up chainlint.pl during its development [1]: https://lore.kernel.org/git/dce35a47012fecc6edc11c68e91dbb485c5bc36f.1661663880.git.gitgitgadget@gmail.com/ Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	ae0c55abf8	chainlint.pl: allow `\|\| echo` to signal failure upstream of a pipe The use of `\|\| return` (or `\|\| exit`) to signal failure within a loop isn't effective when the loop is upstream of a pipe since the pipe swallows all upstream exit codes and returns only the exit code of the final command in the pipeline. To work around this limitation, tests may adopt an alternative strategy of signaling failure by emitting text which would never be emitted in the non-failing case. For instance: while condition do command1 && command2 \|\| echo "impossible text" done \| sort >actual && Such usage indicates deliberate thought about failure cases by the test author, thus flagging them as missing `\|\| return` (or `\|\| exit`) is not helpful. Therefore, take this case into consideration when checking for explicit loop termination. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	fd4094c3ca	chainlint.pl: complain about loops lacking explicit failure handling Shell `for` and `while` loops do not terminate automatically just because a command fails within the loop body. Instead, the loop continues to iterate and eventually returns the exit status of the final command of the final iteration, which may not be the command which failed, thus it is possible for failures to go undetected. Consequently, it is important for test authors to explicitly handle failure within the loop body by terminating the loop manually upon failure. This can be done by returning a non-zero exit code from within the loop body (i.e. `\|\| return 1`) or exiting (i.e. `\|\| exit 1`) if the loop is within a subshell, or by manually checking `$?` and taking some appropriate action. Therefore, add logic to detect and complain about loops which lack explicit `return` or `exit`, or `$?` check. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	832c68b3c2	chainlint.pl: don't flag broken &&-chain if failure indicated explicitly There are quite a few tests which print an error messages and then explicitly signal failure with `false`, `return 1`, or `exit 1` as the final command in an `if` branch. In these cases, the tests don't bother maintaining the &&-chain between `echo` and the explicit "test failed" indicator. Since such constructs are manually signaling failure, their &&-chain breakage is legitimate and safe -- both for the command immediately preceding `false`, `return`, or `exit`, as well as for all preceding commands in the `if` branch. Therefore, stop flagging &&-chain breakage in these sorts of cases. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	a8f30ee050	chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly There are cases in which tests capture and check a command's exit code explicitly without employing test_expect_code(). They do so by intentionally breaking the &&-chain since it would be impossible to capture "$?" in the failing case if the `status=$?` assignment was part of the &&-chain. Since such constructs are manually checking the exit code, their &&-chain breakage is legitimate and safe, thus should not be flagged. Therefore, stop flagging &&-chain breakage in such cases. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	aabc3258a1	chainlint.pl: don't require `&` background command to end with `&&` The exit status of the `&` asynchronous operator which starts a command in the background is unconditionally zero, and the few places in the test scripts which launch commands asynchronously are not interested in the exit status of the `&` operator (though they often capture the background command's PID). As such, there is little value in complaining about broken &&-chain for a command launched in the background, and doing so would only make busy-work for test authors. Therefore, take this special case into account when checking for &&-chain breakage. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:40 -07:00
Eric Sunshine	d00113ec34	t/Makefile: apply chainlint.pl to existing self-tests Now that chainlint.pl is functional, take advantage of the existing chainlint self-tests to validate its operation. (While at it, stop validating chainlint.sed against the self-tests since it will soon be retired.) Due to chainlint.sed implementation limitations leaking into the self-test "expect" files, a few of them require minor adjustment to make them compatible with chainlint.pl which does not share those limitations. First, because `sed` does not provide any sort of real recursion, chainlint.sed only emulates recursion into subshells, and each level of recursion leads to a multiplicative increase in complexity of the `sed` rules. To avoid substantial complexity, chainlint.sed, therefore, only emulates subshell recursion one level deep. Any subshell deeper than that is passed through as-is, which means that &&-chains are not checked in deeper subshells. chainlint.pl, on the other hand, employs a proper recursive descent parser, thus checks subshells to any depth and correctly flags broken &&-chains in deep subshells. Second, due to sed's line-oriented nature, chainlint.sed, by necessity, folds multi-line quoted strings into a single line. chainlint.pl, on the other hand, employs a proper lexical analyzer which preserves quoted strings as-is, including embedded newlines. Furthermore, the output of chainlint.sed and chainlint.pl do not match precisely in terms of whitespace. However, since the purpose of the self-checks is to verify that the ?!AMP?! annotations are being correctly added, minor whitespace differences are immaterial. For this reason, rather than adjusting whitespace in all existing self-test "expect" files to match the new linter's output, the `check-chainlint` target ignores whitespace differences. Since `diff -w` is not POSIX, `check-chainlint` attempts to employ `git diff -w`, and only falls back to non-POSIX `diff -w` (and `-u`) if `git diff` is not available. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:40 -07:00
Eric Sunshine	35ebb1e37b	chainlint.pl: don't require `return\|exit\|continue` to end with `&&` In order to check for &&-chain breakage, each time TestParser encounters a new command, it checks whether the previous command ends with `&&`, and -- with a couple exceptions -- signals breakage if it does not. The first exception is that a command may validly end with `\|\|`, which is commonly employed as `command \|\| return 1` at the very end of a loop body to terminate the loop early. The second is that piping one command's output with `\|` to another command does not constitute a &&-chain break (the exit status of the pipe is the exit status of the final command in the pipe). However, it turns out that there are a few additional cases found in the wild in which it is likely safe for `&&` to be missing even when other commands follow. For instance: while {condition-1} do test {condition-2} \|\| return 1 # or `exit 1` within a subshell more-commands done while {condition-1} do test {condition-2} \|\| continue more-commands done Such cases indicate deliberate thought about failure modes by the test author, thus flagging them as breaking the &&-chain is not helpful. Therefore, take these special cases into consideration when checking for &&-chain breakage. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:40 -07:00
Eric Sunshine	29fb2ec384	chainlint.pl: validate test scripts in parallel Although chainlint.pl has undergone a good deal of optimization during its development -- increasing in speed significantly -- parsing and validating 1050+ scripts and 16500+ tests via Perl is not exactly instantaneous. However, perceived performance can be improved by taking advantage of the fact that there is no interdependence between test scripts or test definitions, thus parsing and validating can be done in parallel. The number of available cores is determined automatically but can be overridden via the --jobs option. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:40 -07:00
Eric Sunshine	d99ebd6d2e	chainlint.pl: add parser to identify test definitions Finish fleshing out chainlint.pl by adding ScriptParser, a parser which scans shell scripts for tests defined by test_expect_success() and test_expect_failure(), plucks the test body from each definition, and passes it to TestParser for validation. It recognizes test definitions not only at the top-level of test scripts but also tests synthesized within compound commands such as loops and function. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:40 -07:00
Eric Sunshine	6d932e92fc	chainlint.pl: add parser to validate tests Continue fleshing out chainlint.pl by adding TestParser, a parser with special knowledge about how Git tests should be written; for instance, it knows that commands within a test body should be chained together with `&&`. An upcoming parser which plucks test definitions from test scripts will invoke TestParser for each test body it encounters. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:40 -07:00
Eric Sunshine	6594554119	chainlint.pl: add POSIX shell parser Continue fleshing out chainlint.pl by adding a general purpose recursive descent parser for the POSIX shell command language. Although never invoked directly, upcoming parser subclasses will extend its functionality for specific purposes, such as plucking test definitions from input scripts and applying domain-specific knowledge to perform test validation. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:40 -07:00
Eric Sunshine	7d4804731e	chainlint.pl: add POSIX shell lexical analyzer Begin fleshing out chainlint.pl by adding a lexical analyzer for the POSIX shell command language. The sole entry point Lexer::scan_token() returns the next token from the input. It will be called by the upcoming shell language parser. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:40 -07:00
Eric Sunshine	b4f25b07c7	t: add skeleton chainlint.pl Although chainlint.sed usefully identifies broken &&-chains in tests, it has several shortcomings which include: * only detects &&-chain breakage in subshells (one-level deep) * does not check for broken top-level &&-chains; that task is left to the "magic exit code 117" checker built into test-lib.sh, however, that detection does not extend to `{...}` blocks, `$(...)` expressions, or compound statements such as `if...fi`, `while...done`, `case...esac` * uses heuristics, which makes it (potentially) fallible and difficult to tweak to handle additional real-world cases * written in `sed` and employs advanced `sed` operators which are probably not well-known to many programmers, thus the pool of people who can maintain it is likely small * manually simulates recursion into subshells which makes it much more difficult to reason about than, say, a traditional top-down parser * checks each test as the test is run, which can get expensive for tests which are run repeatedly by functions or loops since their bodies will be checked over and over (tens or hundreds of times) unnecessarily To address these shortcomings, begin implementing a more functional and precise test linter which understands shell syntax and semantics rather than employing heuristics, thus is able to recognize structural problems with tests beyond broken &&-chains. The new linter is written in Perl, thus should be more accessible to a wider audience, and is structured as a traditional top-down parser which makes it much easier to reason about, and allows it to inspect compound statements within test bodies to any depth. Furthermore, it can check all test definitions in the entire project in a single invocation rather than having to be invoked once per test, and each test definition is checked only once no matter how many times the test is actually run. At this stage, the new linter is just a skeleton containing boilerplate which handles command-line options, collects and reports statistics, and feeds its arguments -- paths of test scripts -- to a (presently) do-nothing script parser for validation. Subsequent changes will flesh out the functionality. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:40 -07:00
Johannes Schindelin	0a101676e5	add -p: ignore dirty submodules Thanks to always running `diff-index` and `diff-files` with the `--numstat` option (the latter with `--ignore-submodules=dirty`) before even generating any real diff to parse, the Perl version of `git add -p` simply ignored dirty submodules and does not even offer them up for staging. However, the built-in variant did not use that flag because it tries to run only one `diff` command, skipping the unneeded `diff-index`/`diff-files` invocation of the Perl variant and therefore only faithfully recapitulates what the Perl code does once it _does_ generate and parse the real diff. This causes a problem when running the built-in `add -p` with `diff-so-fancy` because that diff colorizer always inserts an empty line before the diff header to ensure that it produces 4 lines as expected by `git add -p` (the equivalent of the non-colorized `diff`, `index`, `---` and `+++` lines). But `git diff-files` does not produce any `index` line for dirty submodules. The underlying problem is not even the discrepancy in lines, but that `git add -p` presents diffs for dirty submodules: there is nothing that _can_ be staged for those. Let's fix that bug, and teach the built-in `add -p` to ignore dirty submodules, too. This _incidentally_ also fixes the `diff-so-fancy` problem ;-) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 09:55:28 -07:00
Johannes Schindelin	fd3f7f619a	add -p: gracefully handle unparseable hunk headers in colored diffs In https://lore.kernel.org/git/ecf6f5be-22ca-299f-a8f1-bda38e5ca246@gmail.com, Phillipe Blain reported that the built-in `git add -p` command fails when asked to use [`diff-so-fancy`][diff-so-fancy] to colorize the diff. The reason is that this tool produces colored diffs with a hunk header that does not contain any parseable `@@ ... @@` line range information, and therefore we cannot detect any part in that header that comes after the line range. As proposed by Phillip Wood, let's take that for a clear indicator that we should show the hunk headers verbatim. This is what the Perl version of the interactive `add` command did, too. [diff-so-fancy]: https://github.com/so-fancy/diff-so-fancy Reported-by: Philippe Blain <levraiphilippeblain@gmail.com> Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 09:55:21 -07:00
Johannes Schindelin	b6633a0053	add -p: detect more mismatches between plain vs colored diffs When parsing the colored version of a diff, the interactive `add` command really relies on the colored version having the same number of lines as the plain (uncolored) version. That is an invariant. We already have code to verify correctly when the colored diff has less lines than the plain diff. Modulo an off-by-one bug: If the last diff line has no matching colored one, the code pretends to succeed, still. To make matters worse, when we adjusted the test in `1e4ffc765d` (t3701: adjust difffilter test, 2020-01-14), we did not catch this because `add -p` fails for a _different_ reason: it does not find any colored hunk header that contains a parseable line range. If we change the test case so that the line range _can_ be parsed, the bug is exposed. Let's address all of the above by - fixing the off-by-one, - adjusting the test case to allow `add -p` to parse the line range - making the test case more stringent by verifying that the expected error message is shown Also adjust a misleading code comment about the now-fixed code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 09:49:45 -07:00
Jeff King	0682bc43f5	test-crontab: minor memory and error handling fixes Since `ee69e7884e` (gc: use temporary file for editing crontab, 2022-08-28), we now insist that "argc == 3" (and otherwise return an error). Coverity notes that this causes some dead code: if (argc == 3) fclose(from); else fclose(to); as we will never trigger the else. This also causes a memory leak, since we'll never close "to". Now that all paths require 2 arguments, we can just reorganize the function to check argc up front, and tweak the cleanup to do the right thing for all cases. While we're here, we can also notice some minor problems: - we return a negative int via error() from what is essentially a main() function; we should return a positive non-zero value for error. Or better yet, we can just use usage(), which gives a better message. - while writing the usage message, we can note the one in the comment was made out of date by `ee69e7884e`. But it also had a typo already, calling the subcommand "cron" and not "crontab" - we didn't check for an error from fopen(), meaning we would segfault if the to-be-read file was missing. We can use xfopen() to catch this. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-30 14:31:37 -07:00
Johannes Schindelin	64ec8efb83	t6132(NO_PERL): do not run the scripted `add -p` When using the non-built-in version of `git add -p` in a `NO_PERL` build, we expect that invocation to fail. However, when `b02fdbc80a` (pathspec: correct an empty string used as a pathspec element, 2022-05-29) added a test case to t6132 to exercise `git add -p`, it did not add appropriate prereqs (which admittedly did not exist back then). Let's specify the appropriate prereqs. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-30 10:40:48 -07:00
Johannes Schindelin	7524780255	t3701: test the built-in `add -i` regardless of NO_PERL The built-in `git add --interactive` does not require Perl, therefore we can safely run these tests even when building with `NO_PERL=LetsDoThat`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-30 10:40:46 -07:00
Junio C Hamano	3658170b92	Merge branch 'es/fix-chained-tests' Fix broken "&&-" chains and failures in early iterations of a loop. * es/fix-chained-tests: t5329: notice a failure within a loop t: detect and signal failure within loop t1092: fix buggy sparse "blame" test t2407: fix broken &&-chains in compound statement	2022-08-29 14:55:15 -07:00
Junio C Hamano	25402204fe	Merge branch 'vd/fix-perf-tests' Rather trivial perf-test code fixes. * vd/fix-perf-tests: p0006: fix 'read-tree' argument ordering p0004: fix prereq declaration	2022-08-29 14:55:13 -07:00
Junio C Hamano	a572a5d4c1	Merge branch 'jd/prompt-show-conflict' The bash prompt (in contrib/) learned to optionally indicate when the index is unmerged. * jd/prompt-show-conflict: git-prompt: show presence of unresolved conflicts at command prompt	2022-08-29 14:55:12 -07:00
Junio C Hamano	0b08ba7eb6	Merge branch 'en/ancestry-path-in-a-range' "git rev-list --ancestry-path=C A..B" is a natural extension of "git rev-list A..B"; instead of choosing a subset of A..B to those that have ancestry relationship with A, it lets a subset with ancestry relationship with C. * en/ancestry-path-in-a-range: revision: allow --ancestry-path to take an argument t6019: modernize tests with helper rev-list-options.txt: fix simple typo	2022-08-29 14:55:11 -07:00
Junio C Hamano	64cb4c34d1	Merge branch 'mt/rot13-in-c' Test portability improvements. * mt/rot13-in-c: tests: use the new C rot13-filter helper to avoid PERL prereq t0021: implementation the rot13-filter.pl script in C t0021: avoid grepping for a Perl-specific string at filter output	2022-08-29 14:55:11 -07:00
Junio C Hamano	c068a3b8ee	Merge branch 'ds/decorate-filter-tweak' The namespaces used by "log --decorate" from "refs/" hierarchy by default has been tightened. * ds/decorate-filter-tweak: fetch: use ref_namespaces during prefetch maintenance: stop writing log.excludeDecoration log: create log.initialDecorationSet=all log: add --clear-decorations option log: add default decoration filter log-tree: use ref_namespaces instead of if/else-if refs: use ref_namespaces for replace refs base refs: add array of ref namespaces t4207: test coloring of grafted decorations t4207: modernize test refs: allow "HEAD" as decoration filter	2022-08-29 14:55:11 -07:00
Junio C Hamano	d5fc07df68	format-patch: learn format.forceInBodyFrom configuration variable As the need to use the "--force-in-body-from" option primarily is tied to which mailing list the mails go to (and get their From: address mangled), it is likely that a user who needs to use this option once to interact with their upstream project needs to use it for all patches they send out. Add a configuration variable, suitable for setting in the local configuration file per repository, for this. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-29 14:39:13 -07:00
Junio C Hamano	34bc1b1045	format-patch: allow forcing the use of in-body From: header Users may be authoring and committing their commits under the same e-mail address they use to send their patches from, in which case they shouldn't need to use the in-body From: line in their outgoing e-mails. At the receiving end, "git am" will use the address on the "From:" header of the incoming e-mail and all should be well. Some mailing lists, however, mangle the From: address from what the original sender had; in such a situation, the user may want to add the in-body "From:" header even for their own patches. "git format-patch --[no-]force-in-body-from" was invented for such users. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-29 14:39:13 -07:00
Eric Sunshine	3871bdb7e4	t4301: emit blank line in more idiomatic fashion The unusual use of: printf "\\n" >>file && may give readers pause, making them wonder why this form was chosen over the more typical: printf "\n" >>file && However, even that may give pause since it is a somewhat unusual and long-winded way of saying: echo >>file && Therefore, replace `printf` with the more idiomatic `echo`, with the hope of eliminating a possible stumbling block for those reading the code. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-29 09:28:40 -07:00
Eric Sunshine	87ed97167a	t4301: fix broken &&-chains and add missing loop termination Fix &&-chain breaks in a couple tests which went unnoticed due to blind spots in the &&-chain linters. In particular, the "magic exit code 117" &&-chain checker built into test-lib.sh only recognizes broken &&-chains at the top-level; it does not work within `{...}` groups, `(...)` subshells, `$(...)` substitutions, or within bodies of compound statements, such as `if`, `for`, `while`, `case`, etc. Furthermore, `chainlint.sed`, which detects broken &&-chains only in `(...)` subshells, missed these cases (which are in subshells) because it (surprisingly) neglects to check for intact &&-chain on single-line `for` loops. While at it, explicitly signal failure of commands within the `for` loops (which might arise due to the filesystem being full or "inode" exhaustion). This is important since failures within `for` and `while` loops can go unnoticed if not detected and signaled manually since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes since the loop returns the exit code of the last command it ran on the final iteration, which may not be the command which failed. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-29 09:28:31 -07:00
brian m. carlson	ee69e7884e	gc: use temporary file for editing crontab While cron is specified by POSIX, there are a wide variety of implementations in use. "git maintenance" assumes that the "crontab" command can be fed from its standard input the new contents and the syntax to do so is not to have any filename argument, as POSIX describes. However, on FreeBSD, the cron implementation requires a file name argument: if the user wants to edit standard input, they must specify "-". Unfortunately, POSIX systems do not have to interpret "-" on the command line of crontab as a request to read from the standard input. Blindly adding "-" on the command line would not work as a general solution. Since POSIX tells us that cron must accept a file name argument, let's solve this problem by specifying a temporary file instead. This will ensure that we work with the vast majority of implementations. Note that because delete_tempfile closes the file for us, we should not call fclose here on the handle, since doing so will introduce a double free. Reported-by: Renato Botelho <garga@FreeBSD.org> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-28 15:47:00 -07:00
René Scharfe	c333c2ce65	test-mergesort: use mem_pool for sort input The previous patch almost halved the number of heap allocations for the sort subcommand. Reduce it further by using a mem_pool for the line objects. Note that t/perf/run can't be used directly to compare two versions of test-mergesort because it always runs the helpers from the checked-out version. So I hand-merged the results of separate runs before and with this patch: macOS 12.5.1 on M1: 0071.12: DEFINE_LIST_SORT unsorted 0.22(0.20+0.01) 0.21(0.19+0.01) 0071.14: DEFINE_LIST_SORT sorted 0.10(0.08+0.01) 0.10(0.08+0.01) 0071.16: DEFINE_LIST_SORT reversed 0.10(0.08+0.01) 0.10(0.08+0.01) Git SDK 64-bit on Windows 11 21H2 on Ryzen 7 5800H: 0071.12: DEFINE_LIST_SORT unsorted 0.54(0.00+0.06) 0.44(0.01+0.06) 0071.14: DEFINE_LIST_SORT sorted 0.21(0.03+0.03) 0.19(0.04+0.01) 0071.16: DEFINE_LIST_SORT reversed 0.21(0.01+0.04) 0.19(0.04+0.04) Debian bullseye on WSL2 on the same system: 0071.12: DEFINE_LIST_SORT unsorted 0.29(0.27+0.01) 0.22(0.19+0.02) 0071.14: DEFINE_LIST_SORT sorted 0.07(0.06+0.01) 0.06(0.04+0.02) 0071.16: DEFINE_LIST_SORT reversed 0.07(0.04+0.03) 0.06(0.04+0.02) Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-28 13:10:22 -07:00
René Scharfe	f3e8ba2e64	test-mergesort: read sort input all at once The sort subcommand of test-mergesort is used to test the performance of sorting linked lists. It reads lines from stdin, sorts them and prints the result to stdout. Two heap allocations are done per line: One for the linked list item and one for the actual line string. That imposes a significant amount of allocation overhead. Reduce it by doing the same as the sort subcommand of test-string-list, namely to read the whole input file into a single buffer and then split it in-place. Note that t/perf/run can't be used directly to compare two versions of test-mergesort because it always runs the helpers from the checked-out version. So I hand-merged the results of separate runs before and with this patch: macOS 12.5.1 on M1: 0071.12: DEFINE_LIST_SORT unsorted 0.23(0.20+0.01) 0.22(0.20+0.01) 0071.14: DEFINE_LIST_SORT sorted 0.12(0.10+0.01) 0.10(0.08+0.01) 0071.16: DEFINE_LIST_SORT reversed 0.12(0.10+0.01) 0.10(0.08+0.01) Git SDK 64-bit on Windows 11 21H2 on Ryzen 7 5800H: 0071.12: DEFINE_LIST_SORT unsorted 0.71(0.00+0.03) 0.54(0.00+0.06) 0071.14: DEFINE_LIST_SORT sorted 0.42(0.00+0.04) 0.21(0.03+0.03) 0071.16: DEFINE_LIST_SORT reversed 0.42(0.06+0.01) 0.21(0.01+0.04) Debian bullseye on WSL2 on the same system: 0071.12: DEFINE_LIST_SORT unsorted 0.41(0.39+0.02) 0.29(0.27+0.01) 0071.14: DEFINE_LIST_SORT sorted 0.11(0.08+0.02) 0.07(0.06+0.01) 0071.16: DEFINE_LIST_SORT reversed 0.11(0.08+0.02) 0.07(0.04+0.03) Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-28 13:10:20 -07:00
Eric Sunshine	bcf325ae77	t4301: account for behavior differences between sed implementations It is a common pattern in this script to write the result of `merge-tree -z` (NUL-termination mode) to an "actual" file and then manually append a newline to that file so that it can be diff'd easily with a hand-crafted "expect" file which itself ends with a newline since it has been created by standard Unix tools which terminate lines by default. For instance: git merge-tree --write-tree -z ... >out && printf "\\n" >>out anonymize_hash out >actual && q_to_nul <<-EOF >expect && ... EOF test_cmp expect actual However, one test gets this backward: git merge-tree --write-tree -z ... >out && anonymize_hash out >actual && printf "\\n" >>actual which means that, unlike all other cases, when anonymize_hash() is called, the file being anonymized does not end with a newline. As a result, this test fails on some platforms. anonymize_hash() is implemented like this: anonymize_hash() { sed -e "s/[0-9a-f]\{40,\}/HASH/g" "$@" } The problem arises due to differences in behavior of various `sed` implementations when fed an incomplete line (lacking a newline). Although most modern `sed` implementations output such a line unmolested (i.e. without a newline), some older `sed` implementations forcibly add a newline to the incomplete line (giving the output an extra unexpected newline), while other very old implementations simply swallow an incomplete line and don't emit it at all (making the output shorter than expected). Fix this test by manually adding the newline before passing it through `sed`, thus ensuring identical behavior with all `sed` implementation, and bringing the test in line with other tests in this script. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-28 13:09:46 -07:00
Junio C Hamano	2987ce743d	Merge branch 'en/t4301-more-merge-tree-tests' into es/t4301-sed-portability-fix * en/t4301-more-merge-tree-tests: t4301: add more interesting merge-tree testcases	2022-08-28 13:08:54 -07:00
Diomidis Spinellis	1819ad327b	grep: fix multibyte regex handling under macOS The commit `29de20504e` (Makefile: fix default regex settings on Darwin, 2013-05-11) fixed t0070-fundamental.sh under Darwin (macOS) by adopting Git's regex library. However, this library is compiled with NO_MBSUPPORT, which causes git-grep to work incorrectly on multibyte (e.g. UTF-8) files. Current macOS versions pass t0070-fundamental.sh with the native macOS regex library, which also supports multibyte characters. Adjust the Makefile to use the native regex library, and call setlocale(3) to set CTYPE according to the user's preference. The setlocale call is required on all platforms, but in platforms supporting gettext(3), setlocale was called as a side-effect of initializing gettext. Therefore, move the CTYPE setlocale call from gettext.c to common-main.c and the corresponding locale.h include into git-compat-util.h. Thanks to the global initialization of CTYPE setlocale, the test-tool regex command now works correctly with supported multibyte regexes, and is used to set the MB_REGEX test prerequisite by assessing a platform's support for them. Signed-off-by: Diomidis Spinellis <dds@aueb.gr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-26 11:45:52 -07:00
Junio C Hamano	7be9f3f335	Merge branch 'vd/sparse-reset-checkout-fixes' into maint Fixes to sparse index compatibility work for "reset" and "checkout" commands. source: <pull.1312.v3.git.1659985672.gitgitgadget@gmail.com> * vd/sparse-reset-checkout-fixes: unpack-trees: unpack new trees as sparse directories cache.h: create 'index_name_pos_sparse()' oneway_diff: handle removed sparse directories checkout: fix nested sparse directory diff in sparse index	2022-08-26 11:13:13 -07:00
Junio C Hamano	e5cb51d3aa	Merge branch 'jk/fsck-tree-mode-bits-fix' into maint "git fsck" reads mode from tree objects but canonicalizes the mode before passing it to the logic to check object sanity, which has hid broken tree objects from the checking logic. This has been corrected, but to help exiting projects with broken tree objects that they cannot fix retroactively, the severity of anomalies this code detects has been demoted to "info" for now. source: <YvQcNpizy9uOZiAz@coredump.intra.peff.net> * jk/fsck-tree-mode-bits-fix: fsck: downgrade tree badFilemode to "info" fsck: actually detect bad file modes in trees tree-walk: add a mechanism for getting non-canonicalized modes	2022-08-26 11:13:12 -07:00
Junio C Hamano	6283c1e6ad	Merge branch 'pw/use-glibc-tunable-for-malloc-optim' into maint Avoid repeatedly running getconf to ask libc version in the test suite, and instead just as it once per script. source: <pull.1311.git.1659620305757.gitgitgadget@gmail.com> * pw/use-glibc-tunable-for-malloc-optim: tests: cache glibc version check	2022-08-26 11:13:12 -07:00
Junio C Hamano	5825304328	Merge branch 'ab/hooks-regression-fix' into maint A follow-up fix to a fix for a regression in 2.36. source: <patch-1.1-2450e3e65cf-20220805T141402Z-avarab@gmail.com> * ab/hooks-regression-fix: hook API: don't segfault on strbuf_addf() to NULL "out"	2022-08-26 11:13:12 -07:00
Abhradeep Chakraborty	761416ef91	bitmap-lookup-table: add performance tests for lookup table Add performance tests to verify the performance of lookup table. `p5310-pack-bitmaps.sh` contain tests with and without lookup table. `p5312-pack-bitmaps-revs.sh` contain same tests with and without lookup table but with `pack.writeReverseIndex` enabled. Lookup table makes Git run faster in most of the cases. Below is the result of `t/perf/p5310-pack-bitmaps.sh`.`perf/p5326-multi-pack-bitmaps.sh` gives similar result. The repository used in the test is linux kernel. Test this tree ----------------------------------------------------------------------- 5310.4: enable lookup table: false 0.01(0.00+0.00) 5310.5: repack to disk 320.89(230.20+23.45) 5310.6: simulated clone 14.04(5.78+1.79) 5310.7: simulated fetch 1.95(3.05+0.20) 5310.8: pack to file (bitmap) 44.73(20.55+7.45) 5310.9: rev-list (commits) 0.78(0.46+0.10) 5310.10: rev-list (objects) 4.07(3.97+0.08) 5310.11: rev-list with tag negated via --not 0.06(0.02+0.03) --all (objects) 5310.12: rev-list with negative tag (objects) 0.21(0.15+0.05) 5310.13: rev-list count with blob:none 0.24(0.17+0.06) 5310.14: rev-list count with blob:limit=1k 7.07(5.92+0.48) 5310.15: rev-list count with tree:0 0.25(0.17+0.07) 5310.16: simulated partial clone 5.67(3.28+0.64) 5310.18: clone (partial bitmap) 16.05(8.34+1.86) 5310.19: pack to file (partial bitmap) 59.76(27.22+7.43) 5310.20: rev-list with tree filter (partial bitmap) 0.90(0.18+0.16) 5310.24: enable lookup table: true 0.01(0.00+0.00) 5310.25: repack to disk 319.73(229.30+23.01) 5310.26: simulated clone 13.69(5.72+1.78) 5310.27: simulated fetch 1.84(3.02+0.16) 5310.28: pack to file (bitmap) 45.63(20.67+7.50) 5310.29: rev-list (commits) 0.56(0.39+0.8) 5310.30: rev-list (objects) 3.77(3.74+0.08) 5310.31: rev-list with tag negated via --not 0.05(0.02+0.03) --all (objects) 5310.32: rev-list with negative tag (objects) 0.21(0.15+0.05) 5310.33: rev-list count with blob:none 0.23(0.17+0.05) 5310.34: rev-list count with blob:limit=1k 6.65(5.72+0.40) 5310.35: rev-list count with tree:0 0.23(0.16+0.06) 5310.36: simulated partial clone 5.57(3.26+0.59) 5310.38: clone (partial bitmap) 15.89(8.39+1.84) 5310.39: pack to file (partial bitmap) 58.32(27.55+7.47) 5310.40: rev-list with tree filter (partial bitmap) 0.73(0.18+0.15) Test 4-15 are tested without using lookup table. Same tests are repeated in 16-30 (using lookup table). Mentored-by: Taylor Blau <me@ttaylorr.com> Co-Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-26 10:14:02 -07:00
Abhradeep Chakraborty	28cd730680	pack-bitmap: prepare to read lookup table extension Earlier change teaches Git to write bitmap lookup table. But Git does not know how to parse them. Teach Git to parse the existing bitmap lookup table. The older versions of Git are not affected by it. Those versions ignore the lookup table. Mentored-by: Taylor Blau <me@ttaylorr.com> Co-Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-26 10:13:58 -07:00
Abhradeep Chakraborty	76f14b777c	pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests Teach Git to provide a way for users to enable/disable bitmap lookup table extension by providing a config option named 'writeBitmapLookupTable'. Default is false. Also add test to verify writting of lookup table. Mentored-by: Taylor Blau <me@ttaylorr.com> Co-Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Co-Authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-26 10:13:54 -07:00
Johannes Schindelin	b75747829f	range-diff: optionally accept pathspecs The `git range-diff` command can be quite expensive, which is not a surprise given that the underlying algorithm to match up pairs of commits between the provided two commit ranges has a cubic runtime. Therefore it makes sense to restrict the commit ranges as much as possible, to reduce the amount of input to that O(N^3) algorithm. In chatty repositories with wide trees, this is not necessarily possible merely by choosing commit ranges wisely. Let's give users another option to restrict the commit ranges: by providing a pathspec. That helps in repositories with wide trees because it is likely that the user has a good idea which subset of the tree they are actually interested in. Example: git range-diff upstream/main upstream/seen HEAD -- range-diff.c This shows commits that are either in the local branch or in `seen`, but not in `main`, skipping all commits that do not touch `range-diff.c`. Note: Since we piggy-back the pathspecs onto the `other_arg` mechanism that was introduced to be able to pass through the `--notes` option to the revision machinery, we must now ensure that the `other_arg` array is appended at the end (the revision range must come before the pathspecs, if any). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-26 09:49:26 -07:00
Elijah Newren	6693fb3f01	t64xx: convert 'test_create_repo' to 'git init' Convert the merge-specific tests (those in the t64xx range) over to using 'git init' instead of 'test_create_repo'. Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-26 09:23:03 -07:00
Junio C Hamano	f00ddc9f48	Merge branch 'vd/scalar-generalize-diagnose' The "diagnose" feature to create a zip archive for diagnostic material has been lifted from "scalar" and made into a feature of "git bugreport". * vd/scalar-generalize-diagnose: scalar: update technical doc roadmap scalar-diagnose: use 'git diagnose --mode=all' builtin/bugreport.c: create '--diagnose' option builtin/diagnose.c: add '--mode' option builtin/diagnose.c: create 'git diagnose' builtin diagnose.c: add option to configure archive contents scalar-diagnose: move functionality to common location scalar-diagnose: move 'get_disk_info()' to 'compat/' scalar-diagnose: add directory to archiver more gently scalar-diagnose: avoid 32-bit overflow of size_t scalar-diagnose: use "$GIT_UNZIP" in test	2022-08-25 14:42:32 -07:00
Junio C Hamano	a103ad6f3d	Merge branch 'jk/pipe-command-nonblock' Fix deadlocks between main Git process and subprocess spawned via the pipe_command() API, that can kill "git add -p" that was reimplemented in C recently. * jk/pipe-command-nonblock: pipe_command(): mark stdin descriptor as non-blocking pipe_command(): handle ENOSPC when writing to a pipe pipe_command(): avoid xwrite() for writing to pipe git-compat-util: make MAX_IO_SIZE define globally available nonblock: support Windows compat: add function to enable nonblocking pipes	2022-08-25 14:42:32 -07:00
Junio C Hamano	098b7bfaa6	Merge branch 'js/fetch-negotiation-trace' The common ancestor negotiation exchange during a "git fetch" session now leaves trace log. * js/fetch-negotiation-trace: fetch-pack: add tracing for negotiation rounds	2022-08-25 14:42:31 -07:00
Junio C Hamano	01a30a5a58	Merge branch 'jk/is-promisor-object-keep-tree-in-use' An earlier optimization discarded a tree-object buffer that is still in use, which has been corrected. * jk/is-promisor-object-keep-tree-in-use: is_promisor_object(): fix use-after-free of tree buffer	2022-08-25 14:42:31 -07:00
Junio C Hamano	df3c129e24	Merge branch 'en/submodule-merge-messages-fixes' Further update the help messages given while merging submodules. * en/submodule-merge-messages-fixes: merge-ort: provide helpful submodule update message when possible merge-ort: avoid surprise with new sub_flag variable merge-ort: remove translator lego in new "submodule conflict suggestion" submodule merge: update conflict error message	2022-08-25 14:42:29 -07:00
Junio C Hamano	f677f62970	Merge branch 'ds/bundle-uri-clone' into ds/bundle-uri-3 * ds/bundle-uri-clone: clone: warn on failure to repo_init() clone: --bundle-uri cannot be combined with --depth bundle-uri: add support for http(s):// and file:// clone: add --bundle-uri option bundle-uri: create basic file-copy logic remote-curl: add 'get' capability	2022-08-24 16:05:16 -07:00
Junio C Hamano	9905e80b0f	t5329: notice a failure within a loop We try to write "\|\| return 1" (or "\|\| exit 1" in a subshell) at the end of a sequence of &&-chained command in a loop of our tests, so that a failure of any step during the earlier iteration of the loop can properly be caught. There is one loop in this test script that is used to compute the expected result, that will be later compared with an actual output produced by the "test-tool pack-mtimes" command. This particular loop, however, is placed on the upstream side of a pipe, whose non-zero exit code does not get noticed. Emit a line that will never be produced by the "test-tool pack-mtimes" to cause the later comparison to fail. As we use test_cmp to compare this "expected output" file with the "actual output", the "error message" we are emitting into the expected output stream will stand out and shown to the tester. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-24 14:24:17 -07:00
Elijah Newren	3c4dbf556f	t4301: add more interesting merge-tree testcases This adds several tests of `merge-tree -z` extended conflict output behavior to the testsuite, including some tests adapted from t6422. These tests mark current behavior, not necessarily optimal behavior. In particular, some path_msg() calls might want to include additional paths. These testcases also make something clear about the <Conflicted file> info section of the output. That section consists of a sequence of lines of the form <mode> <object> <stage> <filename> where <stage> is always greater than 0 (since each line comes from a conflicted file). The lines correspond to conflicts that would be placed in the index if we were doing a merge in a working tree. It is perhaps natural to assume that for any given line, the <object> and <filename> correspond to a single <revision>:<filename> pair from one of the commits being merged (or from the merge base). This is true for simple conflicts. However, these testcases make it clear that this is not the case in general. For example, <object> may be the hash of a three-way content merge of three different files (and with different filenames). The tests no longer pass under TEST_PASSES_SANITIZE_LEAK; it appears that doing a directory rename with "git mv", among other possible problems, triggers issues. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-24 09:42:36 -07:00
Elijah Newren	d3a9295ada	merge: only apply autostash when appropriate If a merge failed and we are leaving conflicts in the working directory for the user to resolve, we should not attempt to apply any autostash. Further, if we fail to apply the autostash (because either the merge failed, or the user requested --no-commit), then we should instruct the user how to apply it later. Add a testcase verifying we have corrected this behavior. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-24 09:08:32 -07:00
Taylor Blau	cdf517be06	midx.c: include preferred pack correctly with existing MIDX This patch resolves an issue where the object order used to generate a MIDX bitmap would violate an invariant that all of the preferred pack's objects are represented by that pack in the MIDX. The problem arises when reusing an existing MIDX while generating a new one, and occurs specifically when the identity of the preferred pack changes from one MIDX to another, along with a few other conditions: - the new preferred pack must also be present in the existing MIDX - the new preferred pack must not have been the preferred pack in the existing MIDX - most importantly, there must be at least one object present in the physical preferred pack (ie., it shows up in that pack's index) but was selected from a different pack when the previous MIDX was generated When the above conditions are all met, we end up (incorrectly) discarding copies of some objects in the pack selected as the preferred pack. This is because `get_sorted_entries()` adds objects to its list by doing the following at each fanout level: - first, adding all objects from that fanout level from an existing MIDX - then, adding all objects from that fanout level in each pack not included in the existing MIDX So if some object was not selected from the to-be-preferred pack when writing the previous MIDX, then we will never consider it as a candidate when generating the new MIDX. This means that it's possible for the preferred pack to not include all of its objects in the MIDX's pseudo-pack object order, which is an invariant violation of that order. Resolve this by adding all objects from the preferred pack separately when it appears in the existing MIDX (if one was present). This will duplicate objects from that pack that did appear in the MIDX, but this is fine, since get_sorted_entries() already handles duplicates. (A future optimization in this area could avoid adding copies of objects that we know already existing in the MIDX.) Note that we no longer need to compute the preferred-ness of objects added from the MIDX, since we only want to select the preferred objects from a single source. (We could still mark these preferred bits, but doing so is redundant and unnecessary). This resolves the bug demonstrated by t5326.174 ("preferred pack change with existing MIDX bitmap"). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-22 13:04:22 -07:00
Taylor Blau	0b6203c4ef	t/lib-bitmap.sh: avoid silencing stderr The midx_bitmap_partial_tests() function is responsible for setting up a state where some (but not all) packs in the repository are covered by a MIDX (and bitmap). This function has redirected the `git multi-pack-index write --bitmap`'s stderr to a file "err" since its introduction back in `c51f5a6437` (t5326: test multi-pack bitmap behavior, 2021-08-31). This was likely a stray change left over from a slightly different version of this test, since the file "err" is never read after being written. This leads to confusingly-missing output, especially when the contents of stderr are important. Resolve this confusion by avoiding silencing stderr in this case. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-22 13:04:22 -07:00
Taylor Blau	65168c42df	t5326: demonstrate potential bitmap corruption It is possible to generate a corrupt MIDX bitmap when certain conditions are met. This happens when the preferred pack "P" changes to one (say, "Q") that: - "Q" has objects included in an existing MIDX, - but "Q" is different than "P", - and "Q" and "P" have some objects in common When this is the case, not all objects from "Q" will be selected from "Q" (ie., the generated MIDX will represent them as coming from a different pack), despite "Q" being preferred. This is an invariant violation, since all objects contained in the MIDX's preferred pack are supposed to originate from the preferred pack. In other words, all duplicate objects are resolved in favor of the copy that comes from the MIDX's preferred pack, if any. This violation results in a corrupt object order, which cannot be interpreted by the pack-bitmap code, leading to broken clones and other defects. This test demonstrates the above problem by constructing a minimal reproduction, and showing that the final `git clone` invocation fails. The reproduction is mostly straightforward, except that the new pack generated between MIDX writes (which is necessary in order to prevent that operation from being a noop) must sort ahead of all existing packs in order to prevent a different pack (neither "P" nor "Q") from appearing as preferred (meaning all its objects appear in order at the beginning of the pseudo-pack order). Subsequent commits will first refactor the midx.c::get_sorted_entries() function, and then fix this bug. Reported-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-22 13:04:21 -07:00
Eric Sunshine	0e66bc1b21	t: detect and signal failure within loop Failures within `for` and `while` loops can go unnoticed if not detected and signaled manually since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes since the loop returns the exit code of the last command it ran on the final iteration, which may not be the command which failed. Therefore, detect and signal failures manually within loops using the idiom `\|\| return 1` (or `\|\| exit 1` within subshells). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-22 12:53:02 -07:00
Eric Sunshine	625ff5c320	t1092: fix buggy sparse "blame" test This test wants to verify that `git blame` errors out when asked to blame a file _not_ in the sparse checkout. However, the very first file it asks to blame _is_ present in the checkout, thus `test_must_fail git blame $file` gives an unexpected result (the "blame" succeeds). This problem went unnoticed because the test invokes `test_must_fail git blame $file` in loop but forgets to break out of the loop early upon failure, thus the failure gets swallowed. Fix the test by having it not ask to blame a file present in the sparse checkout, and instead only blame files not present, as intended. While at it, also add the missing `\|\| return 1` which allowed this bug to go unnoticed. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-22 12:53:02 -07:00
Eric Sunshine	308cbaa082	t2407: fix broken &&-chains in compound statement The breaks in the &&-chain in this test went unnoticed because the "magic exit code 117" &&-chain checker built into test-lib.sh only recognizes broken &&-chains at the top-level; it does not work within `{...}` groups, `(...)` subshells, `$(...)` substitutions, or within bodies of compound statements, such as `if`, `for`, `while`, `case`, etc. Furthermore, `chainlint.sed` detects broken &&-chains only in `(...)` subshells. Thus, the &&-chain breaks in this test fall into the blind spots of the &&-chain linters. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-22 12:53:02 -07:00
Victoria Dye	77b9e85c0f	p0006: fix 'read-tree' argument ordering In the 'p0006' test "read-tree br_base br_ballast", move the '-n' flag used in 'git read-tree' ahead of its positional arguments. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 14:35:30 -07:00
Victoria Dye	59c72303dd	p0004: fix prereq declaration Fix multi-threaded 'p0004' test's use of the 'REPO_BIG_ENOUGH_FOR_MULTI' prerequisite. Unlike normal 't/' tests, 't/perf/' tests need to have their prerequisites declared with the '--prereq' flag. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 14:35:28 -07:00
Jeff King	783a86c142	config: mark unused callback parameters The callback passed to git_config() must conform to a particular interface. But most callbacks don't actually look at the extra "void *data" parameter. Let's mark the unused parameters to make -Wunused-parameter happy. Note there's one unusual case here in get_remote_default() where we actually ignore the "value" parameter. That's because it's only checking whether the option is found at all, and not parsing its value. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:55 -07:00
Jeff King	c006e9fa59	refs: mark unused reflog callback parameters Functions used with for_each_reflog_ent() need to conform to a particular interface, but not every function needs all of the parameters. Mark the unused ones to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:54 -07:00
Jeff King	63e14ee2d6	refs: mark unused each_ref_fn parameters Functions used with for_each_ref(), etc, need to conform to the each_ref_fn interface. But most of them don't need every parameter; let's annotate the unused ones to quiet -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:54 -07:00
SZEDER Gábor	2b057d97d7	builtin/stash.c: let parse-options parse subcommands 'git stash' parses its subcommands with a long list of if-else if statements. parse-options has just learned to parse subcommands, so let's use that facility instead, with the benefits of shorter code, and listing subcommands for Bash completion. Note that the push_stash() function implementing the 'push' subcommand accepts an extra flag parameter to indicate whether push was assumed, so add a wrapper function with the standard subcommand function signature. Note also that this change "hides" the '-h' option in 'git stash push -h' from the parse_option() call in cmd_stash(), as it comes after the subcommand. Consequently, from now on it will emit the usage of the 'push' subcommand instead of the usage of 'git stash'. We had a failing test for this case, which can now be flipped to expect success. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:16 -07:00
SZEDER Gábor	b26a412f1e	builtin/remote.c: let parse-options parse subcommands 'git remote' parses its subcommands with a long list of if-else if statements. parse-options has just learned to parse subcommands, so let's use that facility instead, with the benefits of shorter code, handling unknown subcommands, and listing subcommands for Bash completion. Make sure that the default operation mode doesn't accept any arguments; and while at it remove the capitalization of the error message and adjust the test checking it accordingly. Note that 'git remote' has both 'remove' and 'rm' subcommands, and the former is preferred [1], so hide the latter for completion. Note also that the functions implementing each subcommand only accept the 'argc' and '*argv' parameters, so add a (unused) 'prefix' parameter to make them match the type expected by parse-options, and thus avoid casting a bunch of function pointers. [1] `e17dba8fe1` (remote: prefer subcommand name 'remove' to 'rm', 2012-09-06) Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:16 -07:00
SZEDER Gábor	0350954482	builtin/gc.c: let parse-options parse 'git maintenance's subcommands 'git maintenanze' parses its subcommands with a couple of if statements. parse-options has just learned to parse subcommands, so let's use that facility instead, with the benefits of shorter code, handling missing or unknown subcommands, and listing subcommands for Bash completion. This change makes 'git maintenance' consistent with other commands in that the help text shown for '-h' goes to standard output, not error, in the exit code and error message on unknown subcommand, and the error message on missing subcommand. There is a test checking these, which is now updated accordingly. Note that some of the functions implementing each subcommand don't accept any parameters, so add the (unused) 'argc', '*argv' and 'prefix' parameters to make them match the type expected by parse-options, and thus avoid casting function pointers. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:15 -07:00
SZEDER Gábor	1c3b05170a	builtin/commit-graph.c: let parse-options parse subcommands 'git commit-graph' parses its subcommands with an if-else if statement. parse-options has just learned to parse subcommands, so let's use that facility instead, with the benefits of shorter code, handling missing or unknown subcommands, and listing subcommands for Bash completion. Note that the functions implementing each subcommand only accept the 'argc' and '*argv' parameters, so add a (unused) 'prefix' parameter to make them match the type expected by parse-options, and thus avoid casting function pointers. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:15 -07:00
SZEDER Gábor	fa83cc834d	parse-options: add support for parsing subcommands Several Git commands have subcommands to implement mutually exclusive "operation modes", and they usually parse their subcommand argument with a bunch of if-else if statements. Teach parse-options to handle subcommands as well, which will result in shorter and simpler code with consistent error handling and error messages on unknown or missing subcommand, and it will also make possible for our Bash completion script to handle subcommands programmatically. The approach is guided by the following observations: - Most subcommands [1] are implemented in dedicated functions, and most of those functions [2] either have a signature matching the 'int cmd_foo(int argc, const char *argc, const char prefix)' signature of builtin commands or can be trivially converted to that signature, because they miss only that last prefix parameter or have no parameters at all. - Subcommand arguments only have long form, and they have no double dash prefix, no negated form, and no description, and they don't take any arguments, and can't be abbreviated. - There must be exactly one subcommand among the arguments, or zero if the command has a default operation mode. - All arguments following the subcommand are considered to be arguments of the subcommand, and, conversely, arguments meant for the subcommand may not preceed the subcommand. So in the end subcommand declaration and parsing would look something like this: parse_opt_subcommand_fn *fn = NULL; struct option builtin_commit_graph_options[] = { OPT_STRING(0, "object-dir", &opts.obj_dir, N_("dir"), N_("the object directory to store the graph")), OPT_SUBCOMMAND("verify", &fn, graph_verify), OPT_SUBCOMMAND("write", &fn, graph_write), OPT_END(), }; argc = parse_options(argc, argv, prefix, options, builtin_commit_graph_usage, 0); return fn(argc, argv, prefix); Here each OPT_SUBCOMMAND specifies the name of the subcommand and the function implementing it, and the address of the same 'fn' subcommand function pointer. parse_options() then processes the arguments until it finds the first argument matching one of the subcommands, sets 'fn' to the function associated with that subcommand, and returns, leaving the rest of the arguments unprocessed. If none of the listed subcommands is found among the arguments, parse_options() will show usage and abort. If a command has a default operation mode, 'fn' should be initialized to the function implementing that mode, and parse_options() should be invoked with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag. In this case parse_options() won't error out when not finding any subcommands, but will return leaving 'fn' unchanged. Note that if that default operation mode has any --options, then the PARSE_OPT_KEEP_UNKNOWN_OPT flag is necessary as well (otherwise parse_options() would error out upon seeing the unknown option meant to the default operation mode). Some thoughts about the implementation: - The same pointer to 'fn' must be specified as 'value' for each OPT_SUBCOMMAND, because there can be only one set of mutually exclusive subcommands; parse_options() will BUG() otherwise. There are other ways to tell parse_options() where to put the function associated with the subcommand given on the command line, but I didn't like them: - Change parse_options()'s signature by adding a pointer to subcommand function to be set to the function associated with the given subcommand, affecting all callsites, even those that don't have subcommands. - Introduce a specific parse_options_and_subcommand() variant with that extra funcion parameter. - I decided against automatically calling the subcommand function from within parse_options(), because: - There are commands that have to perform additional actions after option parsing but before calling the function implementing the specified subcommand. - The return code of the subcommand is usually the return code of the git command, but preserving the return code of the automatically called subcommand function would have made the API awkward. - Also add a OPT_SUBCOMMAND_F() variant to allow specifying an option flag: we have two subcommands that are purposefully excluded from completion ('git remote rm' and 'git stash save'), so they'll have to be specified with the PARSE_OPT_NOCOMPLETE flag. - Some of the 'parse_opt_flags' don't make sense with subcommands, and using them is probably just an oversight or misunderstanding. Therefore parse_options() will BUG() when invoked with any of the following flags while the options array contains at least one OPT_SUBCOMMAND: - PARSE_OPT_KEEP_DASHDASH: parse_options() stops parsing arguments when encountering a "--" argument, so it doesn't make sense to expect and keep one before a subcommand, because it would prevent the parsing of the subcommand. However, this flag is allowed in combination with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag, because the double dash might be meaningful for the command's default operation mode, e.g. to disambiguate refs and pathspecs. - PARSE_OPT_STOP_AT_NON_OPTION: As its name suggests, this flag tells parse_options() to stop as soon as it encouners a non-option argument, but subcommands are by definition not options... so how could they be parsed, then?! - PARSE_OPT_KEEP_UNKNOWN: This flag can be used to collect any unknown --options and then pass them to a different command or subsystem. Surely if a command has subcommands, then this functionality should rather be delegated to one of those subcommands, and not performed by the command itself. However, this flag is allowed in combination with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag, making possible to pass --options to the default operation mode. - If the command with subcommands has a default operation mode, then all arguments to the command must preceed the arguments of the subcommand. AFAICT we don't have any commands where this makes a difference, because in those commands either only the command accepts any arguments ('notes' and 'remote'), or only the default subcommand ('reflog' and 'stash'), but never both. - The 'argv' array passed to subcommand functions currently starts with the name of the subcommand. Keep this behavior. AFAICT no subcommand functions depend on the actual content of 'argv[0]', but the parse_options() call handling their options expects that the options start at argv[1]. - To support handling subcommands programmatically in our Bash completion script, 'git cmd --git-completion-helper' will now list both subcommands and regular --options, if any. This means that the completion script will have to separate subcommands (i.e. words without a double dash prefix) from --options on its own, but that's rather easy to do, and it's not much work either, because the number of subcommands a command might have is rather low, and those commands accept only a single --option or none at all. An alternative would be to introduce a separate option that lists only subcommands, but then the completion script would need not one but two git invocations and command substitutions for commands with subcommands. Note that this change doesn't affect the behavior of our Bash completion script, because when completing the --option of a command with subcommands, e.g. for 'git notes --<TAB>', then all subcommands will be filtered out anyway, as none of them will match the word to be completed starting with that double dash prefix. [1] Except 'git rerere', because many of its subcommands are implemented in the bodies of the if-else if statements parsing the command's subcommand argument. [2] Except 'credential', 'credential-store' and 'fsmonitor--daemon', because some of the functions implementing their subcommands take special parameters. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:14 -07:00
SZEDER Gábor	99d86d60e5	parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options The description of 'PARSE_OPT_KEEP_UNKNOWN' starts with "Keep unknown arguments instead of erroring out". This is a bit misleading, as this flag only applies to unknown --options, while non-option arguments are kept even without this flag. Update the description to clarify this, and rename the flag to PARSE_OPTIONS_KEEP_UNKNOWN_OPT to make this obvious just by looking at the flag name. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:14 -07:00
SZEDER Gábor	c1b117d31c	t0040-parse-options: test parse_options() with various 'parse_opt_flags' In 't0040-parse-options.sh' we thoroughly test the parsing of all types and forms of options, but in all those tests parse_options() is always invoked with a 0 flags parameter. Add a few tests to demonstrate how various 'enum parse_opt_flags' values are supposed to influence option parsing. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:14 -07:00
SZEDER Gábor	31a66c1964	t5505-remote.sh: check the behavior without a subcommand 'git remote' without a subcommand defaults to listing all remotes and doesn't accept any arguments except the '-v\|--verbose' option. We are about to teach parse-options to handle subcommands, and update 'git remote' to make use of that new feature. So let's add some tests to make sure that the upcoming changes don't inadvertently change the behavior in these cases. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:13 -07:00
SZEDER Gábor	9e4658d5c6	t3301-notes.sh: check that default operation mode doesn't take arguments 'git notes' without a subcommand defaults to listing all notes and doesn't accept any arguments. We are about to teach parse-options to handle subcommands, and update 'git notes' to make use of that new feature. So let's add a test to make sure that the upcoming changes don't inadvertenly change the behavior in this corner case. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:13 -07:00
Justin Donnelly	e03acd0d4a	git-prompt: show presence of unresolved conflicts at command prompt If GIT_PS1_SHOWCONFLICTSTATE is set to "yes", show the word "CONFLICT" on the command prompt when there are unresolved conflicts. Example prompt: (main\|CONFLICT) Signed-off-by: Justin Donnelly <justinrdonnelly@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 10:58:40 -07:00
Elijah Newren	257418c590	revision: allow --ancestry-path to take an argument We have long allowed users to run e.g. git log --ancestry-path master..seen which shows all commits which satisfy all three of these criteria: * are an ancestor of seen * are not an ancestor of master * have master as an ancestor This commit allows another variant: git log --ancestry-path=$TOPIC master..seen which shows all commits which satisfy all of these criteria: * are an ancestor of seen * are not an ancestor of master * have $TOPIC in their ancestry-path that last bullet can be defined as commits meeting any of these criteria: * are an ancestor of $TOPIC * have $TOPIC as an ancestor * are $TOPIC This also allows multiple --ancestry-path arguments, which can be used to find commits with any of the given topics in their ancestry path. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 10:45:08 -07:00
Derrick Stolee	1838e21cff	t6019: modernize tests with helper The tests in t6019 are repetitive, so create a helper that greatly simplifies the test script. In addition, update the common pattern that places 'git rev-list' on the left side of a pipe, which can hide some exit codes. Send the output to a 'raw' file that is then consumed by other tools so the Git exit code is verified as zero. And since we're using --format anyway, switch to `git log`, so that we get the desired format and can avoid using sed. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 10:45:07 -07:00
Junio C Hamano	fddd8b4801	Merge branch 'll/disk-usage-humanise' "git rev-list --disk-usage" learned to take an optional value "human" to show the reported value in human-readable format, like "3.40MiB". * ll/disk-usage-humanise: rev-list: support human-readable output for `--disk-usage`	2022-08-18 13:07:05 -07:00
Junio C Hamano	9b9445cfde	Merge branch 'sy/sparse-rm' "git rm" has become more aware of the sparse-index feature. * sy/sparse-rm: rm: integrate with sparse-index rm: expand the index only when necessary pathspec.h: move pathspec_needs_expanded_index() from reset.c to here t1092: add tests for `git-rm`	2022-08-18 13:07:05 -07:00
Junio C Hamano	80ffc849bd	Merge branch 'vd/sparse-reset-checkout-fixes' Fixes to sparse index compatibility work for "reset" and "checkout" commands. * vd/sparse-reset-checkout-fixes: unpack-trees: unpack new trees as sparse directories cache.h: create 'index_name_pos_sparse()' oneway_diff: handle removed sparse directories checkout: fix nested sparse directory diff in sparse index	2022-08-18 13:07:04 -07:00
Junio C Hamano	363a193c3a	Merge branch 'jk/fsck-tree-mode-bits-fix' "git fsck" reads mode from tree objects but canonicalizes the mode before passing it to the logic to check object sanity, which has hid broken tree objects from the checking logic. This has been corrected, but to help exiting projects with broken tree objects that they cannot fix retroactively, the severity of anomalies this code detects has been demoted to "info" for now. * jk/fsck-tree-mode-bits-fix: fsck: downgrade tree badFilemode to "info" fsck: actually detect bad file modes in trees tree-walk: add a mechanism for getting non-canonicalized modes	2022-08-18 13:07:04 -07:00
Jeff King	716c1f649e	pipe_command(): mark stdin descriptor as non-blocking Our pipe_command() helper lets you both write to and read from a child process on its stdin/stdout. It's supposed to work without deadlocks because we use poll() to check when descriptors are ready for reading or writing. But there's a bug: if both the data to be written and the data to be read back exceed the pipe buffer, we'll deadlock. The issue is that the code assumes that if you have, say, a 2MB buffer to write and poll() tells you that the pipe descriptor is ready for writing, that calling: write(cmd->in, buf, 210241024); will do a partial write, filling the pipe buffer and then returning what it did write. And that is what it would do on a socket, but not for a pipe. When writing to a pipe, at least on Linux, it will block waiting for the child process to read() more. And now we have a potential deadlock, because the child may be writing back to us, waiting for us to read() ourselves. An easy way to trigger this is: git -c add.interactive.useBuiltin=true \ -c interactive.diffFilter=cat \ checkout -p HEAD~200 The diff against HEAD~200 will be big, and the filter wants to write all of it back to us (obviously this is a dummy filter, but in the real world something like diff-highlight would similarly stream back a big output). If you set add.interactive.useBuiltin to false, the problem goes away, because now we're not using pipe_command() anymore (instead, that part happens in perl). But this isn't a bug in the interactive code at all. It's the underlying pipe_command() code which is broken, and has been all along. We presumably didn't notice because most calls only do input _or_ output, not both. And the few that do both, like gpg calls, may have large inputs or outputs, but never both at the same time (e.g., consider signing, which has a large payload but a small signature comes back). The obvious fix is to put the descriptor into non-blocking mode, and indeed, that makes the problem go away. Callers shouldn't need to care, because they never see the descriptor (they hand us a buffer to feed into it). The included test fails reliably on Linux without this patch. Curiously, it doesn't fail in our Windows CI environment, but has been reported to do so for individual developers. It should pass in any environment after this patch (courtesy of the compat/ layers added in the last few commits). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-17 09:21:41 -07:00
Josh Steadmon	a29263cf5f	fetch-pack: add tracing for negotiation rounds Currently, negotiation for V0/V1/V2 fetch have trace2 regions covering the entire negotiation process. However, we'd like additional data, such as timing for each round of negotiation or the number of "haves" in each round. Additionally, "independent negotiation" (AKA push negotiation) has no tracing at all. Having this data would allow us to compare the performance of the various negotation implementations, and to debug unexpectedly slow fetch & push sessions. Add per-round trace2 regions for all negotiation implementations (V0+V1, V2, and independent negotiation), as well as an overall region for independent negotiation. Add trace2 data logging for the number of haves and "in vain" objects for each round, and for the total number of rounds once negotiation completes. Finally, add a few checks into various tests to verify that the number of rounds is logged as expected. Signed-off-by: Josh Steadmon <steadmon@google.com> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-15 09:17:03 -07:00
Junio C Hamano	7d0a1c8895	Merge branch 'pw/use-glibc-tunable-for-malloc-optim' Avoid repeatedly running getconf to ask libc version in the test suite, and instead just as it once per script. * pw/use-glibc-tunable-for-malloc-optim: tests: cache glibc version check	2022-08-14 23:19:28 -07:00
Junio C Hamano	c0f6dd49f1	Merge branch 'ab/tech-docs-to-help' Expose a lot of "tech docs" via "git help" interface. * ab/tech-docs-to-help: docs: move http-protocol docs to man section 5 docs: move cruft pack docs to gitformat-pack docs: move pack format docs to man section 5 docs: move signature docs to man section 5 docs: move index format docs to man section 5 docs: move protocol-related docs to man section 5 docs: move commit-graph format docs to man section 5 git docs: add a category for file formats, protocols and interfaces git docs: add a category for user-facing file, repo and command UX git help doc: use "<doc>" instead of "<guide>" help.c: remove common category behavior from drop_prefix() behavior help.c: refactor drop_prefix() to use a "switch" statement"	2022-08-14 23:19:28 -07:00
Junio C Hamano	d86ac14dd7	Merge branch 'ab/hooks-regression-fix' A follow-up fix to a fix for a regression in 2.36. * ab/hooks-regression-fix: hook API: don't segfault on strbuf_addf() to NULL "out"	2022-08-14 23:19:27 -07:00
Matheus Tavares	4d1d843be7	tests: use the new C rot13-filter helper to avoid PERL prereq The previous commit implemented a C version of the t0021/rot13-filter.pl script. Let's use this new C helper to eliminate the PERL prereq from various tests, and also remove the superseded Perl script. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-14 22:57:12 -07:00
Matheus Tavares	52917a998e	t0021: implementation the rot13-filter.pl script in C This script is currently used by three test files: t0021-conversion.sh, t2080-parallel-checkout-basics.sh, and t2082-parallel-checkout-attributes.sh. To avoid the need for the PERL dependency at these tests, let's convert the script to a C test-tool command. The following commit will take care of actually modifying the said tests to use the new C helper and removing the Perl script. The Perl script flushes the log file handler after each write. As commented in [1], this seems to be an early design decision that was later reconsidered, but possibly ended up being left in the code by accident: >> +$debug->flush(); > > Isn't $debug flushed automatically? Maybe, but autoflush is not explicitly enabled. I will enable it again (I disabled it because of Eric's comment but I re-read the comment and he is only talking about pipes). Anyways, this behavior is not really needed for the tests and the flush() calls make the code slightly larger, so let's avoid them altogether in the new C version. [1]: https://lore.kernel.org/git/7F1F1A0E-8FC3-4FBD-81AA-37786DE0EF50@gmail.com/ Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-14 22:57:12 -07:00
Matheus Tavares	bed8947751	t0021: avoid grepping for a Perl-specific string at filter output This test sets the t0021/rot13-filter.pl script as a long-running process filter for a git checkout command. It then expects the filter to fail producing a specific error message at stderr. In the following commits we are going to replace the script with a C test-tool helper, but the test currently expects the error message in a Perl-specific format. That is, when you call `die <msg>` in Perl, it emits "<msg> at - line 1." In preparation for the conversion, let's avoid the Perl-specific part and only grep for <msg> itself. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-14 22:57:11 -07:00
Jeff King	1490d7d82d	is_promisor_object(): fix use-after-free of tree buffer Since commit `fcc07e980b` (is_promisor_object(): free tree buffer after parsing, 2021-04-13), we'll always free the buffers attached to a "struct tree" after searching them for promisor links. But there's an important case where we don't want to do so: if somebody else is already using the tree! This can happen during a "rev-list --missing=allow-promisor" traversal in a partial clone that is missing one or more trees or blobs. The backtrace for the free looks like this: #1 free_tree_buffer tree.c:147 #2 add_promisor_object packfile.c:2250 #3 for_each_object_in_pack packfile.c:2190 #4 for_each_packed_object packfile.c:2215 #5 is_promisor_object packfile.c:2272 #6 finish_object__ma builtin/rev-list.c:245 #7 finish_object builtin/rev-list.c:261 #8 show_object builtin/rev-list.c:274 #9 process_blob list-objects.c:63 #10 process_tree_contents list-objects.c:145 #11 process_tree list-objects.c:201 #12 traverse_trees_and_blobs list-objects.c:344 [...] We're in the middle of walking through the entries of a tree object via process_tree_contents(). We see a blob (or it could even be another tree entry) that we don't have, so we call is_promisor_object() to check it. That function loops over all of the objects in the promisor packfile, including the tree we're currently walking. When we're done with it there, we free the tree buffer. But as we return to the walk in process_tree_contents(), it's still holding on to a pointer to that buffer, via its tree_desc iterator, and it accesses the freed memory. Even a trivial use of "--missing=allow-promisor" triggers this problem, as the included test demonstrates (it's just a vanilla --blob:none clone). We can detect this case by only freeing the tree buffer if it was allocated on our behalf. This is a little tricky since that happens inside parse_object(), and it doesn't tell us whether the object was already parsed, or whether it allocated the buffer itself. But by checking for an already-parsed tree beforehand, we can distinguish the two cases. That feels a little hacky, and does incur an extra lookup in the object-hash table. But that cost is fairly minimal compared to actually loading objects (and since we're iterating the whole pack here, we're likely to be loading most objects, rather than reusing cached results). It may also be a good direction for this function in general, as there are other possible optimizations that rely on doing some analysis before parsing: - we could detect blobs and avoid reading their contents; they can't link to other objects, but parse_object() doesn't know that we don't care about checking their hashes. - we could avoid allocating object structs entirely for most objects (since we really only need them in the oidset), which would save some memory. - promisor commits could use the commit-graph rather than loading the object from disk This commit doesn't do any of those optimizations, but I think it argues that this direction is reasonable, rather than relying on parse_object() and trying to teach it to give us more information about whether it parsed. The included test fails reliably under SANITIZE=address just when running "rev-list --missing=allow-promisor". Checking the output isn't strictly necessary to detect the bug, but it seems like a reasonable addition given the general lack of coverage for "allow-promisor" in the test suite. Reported-by: Andrew Olsen <andrew.olsen@koordinates.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-14 18:03:36 -07:00
Victoria Dye	aac0e8ffee	builtin/bugreport.c: create '--diagnose' option Create a '--diagnose' option for 'git bugreport' to collect additional information about the repository and write it to a zipped archive. The '--diagnose' option behaves effectively as an alias for simultaneously running 'git bugreport' and 'git diagnose'. In the documentation, users are explicitly recommended to attach the diagnostics alongside a bug report to provide additional context to readers, ideally reducing some back-and-forth between reporters and those debugging the issue. Note that '--diagnose' may take an optional string arg (either 'stats' or 'all'). If specified without the arg, the behavior corresponds to running 'git diagnose' without '--mode'. As with 'git diagnose', this default is intended to help reduce unintentional leaking of sensitive information). Users can also explicitly specify '--diagnose=(stats\|all)' to generate the respective archive created by 'git diagnose --mode=(stats\|all)'. Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-12 13:20:02 -07:00
Victoria Dye	7ecf193f7d	builtin/diagnose.c: add '--mode' option Create '--mode=<mode>' option in 'git diagnose' to allow users to optionally select non-default diagnostic information to include in the output archive. Additionally, document the currently-available modes, emphasizing the importance of not sharing a '--mode=all' archive publicly due to the presence of sensitive information. Note that the option parsing callback - 'option_parse_diagnose()' - is added to 'diagnose.c' rather than 'builtin/diagnose.c' so that it may be reused in future callers configuring a diagnostics archive. Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-12 13:20:02 -07:00
Victoria Dye	6783fd3cef	builtin/diagnose.c: create 'git diagnose' builtin Create a 'git diagnose' builtin to generate a standalone zip archive of repository diagnostics. The "diagnose" functionality was originally implemented for Scalar in `aa5c79a331` (scalar: implement `scalar diagnose`, 2022-05-28). However, the diagnostics gathered are not specific to Scalar-cloned repositories and can be useful when diagnosing issues in any Git repository. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-12 13:20:02 -07:00
Junio C Hamano	83489a5b20	Merge branch 'ab/plug-revisions-leak' Plug a bit more leaks in the revisions API. * ab/plug-revisions-leak: revisions API: don't leak memory on argv elements that need free()-ing bisect.c: partially fix bisect_rev_setup() memory leak log: refactor "rev.pending" code in cmd_show() log: fix a memory leak in "git show <revision>..." test-fast-rebase helper: use release_revisions() (again) bisect.c: add missing "goto" for release_revisions()	2022-08-12 13:19:08 -07:00
Junio C Hamano	657c7403a3	Merge branch 'ab/leak-check' Extend SANITIZE=leak checking and declare more tests "currently leak-free". * ab/leak-check: CI: use "GIT_TEST_SANITIZE_LEAK_LOG=true" in linux-leaks upload-pack: fix a memory leak in create_pack_file() leak tests: mark passing SANITIZE=leak tests as leak-free leak tests: don't skip some tests under SANITIZE=leak test-lib: have the "check" mode for SANITIZE=leak consider leak logs test-lib: add a GIT_TEST_PASSING_SANITIZE_LEAK=check mode test-lib: simplify by removing test_external tests: move copy/pasted PERL + Test::More checks to a lib-perl.sh t/Makefile: don't remove test-results in "clean-except-prove-cache" test-lib: add a SANITIZE=leak logging mode t/README: reword the "GIT_TEST_PASSING_SANITIZE_LEAK" description test-lib: add a --invert-exit-code switch test-lib: fix GIT_EXIT_OK logic errors, use BAIL_OUT test-lib: don't set GIT_EXIT_OK before calling test_atexit_handler test-lib: use $1, not $@ in test_known_broken_{ok,failure}_	2022-08-12 13:19:08 -07:00
Junio C Hamano	8faaf690f7	Merge branch 'lt/symbolic-ref-sanity' "git symbolic-ref symref non..sen..se" is now diagnosed as an error. * lt/symbolic-ref-sanity: symbolic-ref: refuse to set syntactically invalid target	2022-08-12 13:19:08 -07:00
Li Linchao	9096451acd	rev-list: support human-readable output for `--disk-usage` The '--disk-usage' option for git-rev-list was introduced in `16950f8384` (rev-list: add --disk-usage option for calculating disk usage, 2021-02-09). This is very useful for people inspect their git repo's objects usage infomation, but the resulting number is quit hard for a human to read. Teach git rev-list to output a human readable result when using '--disk-usage'. Signed-off-by: Li Linchao <lilinchao@oschina.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-11 13:45:23 -07:00
Junio C Hamano	5856cb98c0	Merge branch 'ma/t4200-update' into maint Test fix. source: <20220718154322.2177166-1-martin.agren@gmail.com> * ma/t4200-update: t4200: drop irrelevant code	2022-08-10 21:52:35 -07:00
Junio C Hamano	042159a509	Merge branch 'tb/commit-graph-genv2-upgrade-fix' into maint There was a bug in the codepath to upgrade generation information in commit-graph from v1 to v2 format, which has been corrected. source: <cover.1657667404.git.me@ttaylorr.com> * tb/commit-graph-genv2-upgrade-fix: commit-graph: fix corrupt upgrade from generation v1 to v2 commit-graph: introduce `repo_find_commit_pos_in_graph()` t5318: demonstrate commit-graph generation v2 corruption	2022-08-10 21:52:35 -07:00
Junio C Hamano	4f049a16bf	Merge branch 'tk/untracked-cache-with-uall' into maint Fix for a bug that makes write-tree to fail to write out a non-existent index as a tree, introduced in 2.37. source: <20220722212232.833188-1-martin.agren@gmail.com> * tk/untracked-cache-with-uall: read-cache: make `do_read_index()` always set up `istate->repo`	2022-08-10 21:52:34 -07:00
Junio C Hamano	340a6120e5	Merge branch 'mt/checkout-count-fix' into maint "git checkout" miscounted the paths it updated, which has been corrected. source: <cover.1657799213.git.matheus.bernardino@usp.br> * mt/checkout-count-fix: checkout: fix two bugs on the final count of updated entries checkout: show bug about failed entries being included in final report checkout: document bug where delayed checkout counts entries twice	2022-08-10 21:52:34 -07:00
Junio C Hamano	312d5b7429	Merge branch 'hx/lookup-commit-in-graph-fix' into maint A corner case bug where lazily fetching objects from a promisor remote resulted in infinite recursion has been corrected. source: <cover.1656593279.git.hanxin.hx@bytedance.com> * hx/lookup-commit-in-graph-fix: t5330: remove run_with_limited_processses() commit-graph.c: no lazy fetch in lookup_commit_in_graph()	2022-08-10 21:52:32 -07:00
Junio C Hamano	a6aeb2fef9	Merge branch 'jc/resolve-undo' into maint The resolve-undo information in the index was not protected against GC, which has been corrected. source: <xmqq35f7kzad.fsf@gitster.g> * jc/resolve-undo: fsck: do not dereference NULL while checking resolve-undo data revision: mark blobs needed for resolve-undo as reachable	2022-08-10 21:52:32 -07:00
Jeff King	4dd3b045f5	fsck: downgrade tree badFilemode to "info" The previous commit un-broke the "badFileMode" check; before then it was literally testing nothing. And as far as I can tell, it has been so since the very initial version of fsck. The current severity of "badFileMode" is just "warning". But in the --strict mode used by transfer.fsckObjects, that is elevated to an error. This will potentially cause hassle for users, because historical objects with bad modes will suddenly start causing pushes to many server operators to be rejected. At the same time, these bogus modes aren't actually a big risk. Because we canonicalize them everywhere besides fsck, they can't cause too much mischief in the real world. The worst thing you can do is end up with two almost-identical trees that have different hashes but are interpreted the same. That will generally cause things to be inefficient rather than wrong, and is a bug somebody working on a Git implementation would want to fix, but probably not worth inconveniencing users by refusing to push or fetch. So let's downgrade this to "info" by default, which is our setting for "mention this when fscking, but don't ever reject, even under strict mode". If somebody really wants to be paranoid, they can still adjust the level using config. Suggested-by: Xavier Morel <xavier.morel@masklinn.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 14:26:29 -07:00
Jeff King	53602a937d	fsck: actually detect bad file modes in trees We use the normal tree_desc code to iterate over trees in fsck, meaning we only see the canonicalized modes it returns. And hence we'd never see anything unexpected, since it will coerce literally any garbage into one of our normal and accepted modes. We can use the new RAW_MODES flag to see the real modes, and then use the existing code to actually analyze them. The existing code is written as allow-known-good, so there's not much point in testing a variety of breakages. The one tested here should be S_IFREG but with nonsense permissions. Do note that the error-reporting here isn't great. We don't mention the specific bad mode, but just that the tree has one or more broken modes. But when you go to look at it with "git ls-tree", we'll report the canonicalized mode! This isn't ideal, but given that this should come up rarely, and that any number of other tree corruptions might force you into looking at the binary bytes via "cat-file", it's not the end of the world. And it's something we can improve on top later if we choose. Reported-by: Xavier Morel <xavier.morel@masklinn.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 14:26:27 -07:00
Derrick Stolee	e21e663cd1	clone: --bundle-uri cannot be combined with --depth A previous change added the '--bundle-uri' option, but did not check if the --depth parameter was included. Since bundles are not compatible with shallow clones, provide an error message to the user who is attempting this combination. I am leaving this as its own change, separate from the one that implements '--bundle-uri', because this is more of an advisory for the user. There is nothing wrong with bootstrapping with bundles and then fetching a shallow clone. However, that is likely going to involve too much work for the client _and_ the server. The client will download all of this bundle information containing the full history of the repository only to ignore most of it. The server will get a shallow fetch request, but with a list of haves that might cause a more painful computation of that shallow pack-file. Reviewed-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 14:07:37 -07:00
Derrick Stolee	59c1752ab6	bundle-uri: add support for http(s):// and file:// The previous change created the 'git clone --bundle-uri=<uri>' option. Currently, <uri> must be a filename. Update copy_uri_to_file() to first inspect the URI for an HTTP(S) prefix and use git-remote-https as the way to download the data at that URI. Otherwise, check to see if file:// is present and modify the prefix accordingly. Reviewed-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 14:07:37 -07:00
Derrick Stolee	5556891961	clone: add --bundle-uri option Cloning a remote repository is one of the most expensive operations in Git. The server can spend a lot of CPU time generating a pack-file for the client's request. The amount of data can clog the network for a long time, and the Git protocol is not resumable. For users with poor network connections or are located far away from the origin server, this can be especially painful. Add a new '--bundle-uri' option to 'git clone' to bootstrap a clone from a bundle. If the user is aware of a bundle server, then they can tell Git to bootstrap the new repository with these bundles before fetching the remaining objects from the origin server. Reviewed-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 14:07:37 -07:00
Derrick Stolee	53a50892be	bundle-uri: create basic file-copy logic Before implementing a way to fetch bundles into a repository, create the basic logic. Assume that the URI is actually a file path. Future logic will make this more careful to other protocols. For now, we also only succeed if the content at the URI is a bundle file, not a bundle list. Bundle lists will be implemented in a future change. Note that the discovery of a temporary filename is slightly racy because the odb_mkstemp() relies on the temporary file not existing. With the current implementation being limited to file copies, we could replace the copy_file() with copy_fd(). The tricky part comes in future changes that send the filename to 'git remote-https' and its 'get' capability. At that point, we need the file descriptor closed _and_ the file unlinked. If we were to keep the file descriptor open for the sake of normal file copies, then we would pollute the rest of the code for little benefit. This is especially the case because we expect that most bundle URI use will be based on HTTPS instead of file copies. Reviewed-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 14:07:37 -07:00
Derrick Stolee	b5624a4474	remote-curl: add 'get' capability A future change will want a way to download a file over HTTP(S) using the simplest of download mechanisms. We do not want to assume that the server on the other side understands anything about the Git protocol but could be a simple static web server. Create the new 'get' capability for the remote helpers which advertises that the 'get' command is avalable. A caller can send a line containing 'get <url> <path>' to download the file at <url> into the file at <path>. Reviewed-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 14:07:37 -07:00
Shaoxuan Yuan	da6fe05b3d	mv: check overwrite for in-to-out move Add checking logic for overwriting when moving from in-cone to out-of-cone. It is the index version of the original overwrite logic. Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 13:57:50 -07:00
Shaoxuan Yuan	5efd533ed8	advice.h: add advise_on_moving_dirty_path() Add an advice. When the user use `git mv --sparse <dirty-path> <destination>`, Git will warn the user to use `git add --sparse <paths>` then use `git sparse-checkout reapply` to apply the sparsity rules. Add a few lines to previous "move dirty path" tests so we can test this new advice is working. Suggested-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 13:57:50 -07:00
Shaoxuan Yuan	b6f51e3db9	mv: cleanup empty WORKING_DIRECTORY Originally, moving from-in-to-out may leave an empty <source> directory on-disk (this kind of directory is marked as WORKING_DIRECTORY). Cleanup such directories if they are empty (don't have any entries under them). Modify two tests that take <source> as WORKING_DIRECTORY to test this behavior. Suggested-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 13:57:49 -07:00
Shaoxuan Yuan	5784db1b22	mv: from in-cone to out-of-cone Originally, moving an in-cone <source> to an out-of-cone <destination> was not possible, mainly because such <destination> is a directory that is not present in the working tree. Change the behavior so that we can move an in-cone <source> to out-of-cone <destination> when --sparse is supplied. Notice that <destination> can also be an out-of-cone file path, rather than a directory. Such <source> can be either clean or dirty, and moving it results in different behaviors: A clean move should move <source> to <destination> in the index (do not create <destination> in the worktree), then delete <source> from the worktree. A dirty move should move the <source> to the <destination>, both in the working tree and the index, but should not remove the resulted path from the working tree and should not turn on its CE_SKIP_WORKTREE bit. Optional reading ================ We are strict about cone mode when <destination> is a file path. The reason is that some of the previous tests that use no-cone mode in t7002 are keep breaking, mainly because the `dst_mode = SPARSE;` line added in this patch. Most features developed in both "from-out-to-in" and "from-in-to-out" only care about cone mode situation, as no-cone mode is becoming irrelevant. And because assigning `SPARSE` to `dst_mode` when the repo is in no-cone mode causes miscellaneous bugs, we should just leave this new functionality to be exclusive cone mode and save some time. Helped-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 13:57:49 -07:00
Shaoxuan Yuan	5506683dea	t7002: add tests for moving from in-cone to out-of-cone Add corresponding tests to test that user can move an in-cone <source> to out-of-cone <destination> when --sparse is supplied. Such <source> can be either clean or dirty, and moving it results in different behaviors: A clean move should move <source> to <destination> in the index (do not create <destination> in the worktree), then delete <source> from the worktree. A dirty move should move the <source> to the <destination>, both in the working tree and the index, but should not remove the resulted path from the working tree and should not turn on its CE_SKIP_WORKTREE bit. Also make sure that if <destination> exists in the index (existing check for if <destination> is in the worktree is not enough in in-to-out moves), warn user against the overwrite. And Git should force the overwrite when supplied with -f or --force. Helped-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 13:57:48 -07:00
Shaoxuan Yuan	ede241c715	rm: integrate with sparse-index Enable the sparse index within the `git-rm` command. The `p2000` tests demonstrate a ~92% execution time reduction for 'git rm' using a sparse index. Test HEAD~1 HEAD -------------------------------------------------------------------------- 2000.74: git rm ... (full-v3) 0.41(0.37+0.05) 0.43(0.36+0.07) +4.9% 2000.75: git rm ... (full-v4) 0.38(0.34+0.05) 0.39(0.35+0.05) +2.6% 2000.76: git rm ... (sparse-v3) 0.57(0.56+0.01) 0.05(0.05+0.00) -91.2% 2000.77: git rm ... (sparse-v4) 0.57(0.55+0.02) 0.03(0.03+0.00) -94.7% ---- Also, normalize a behavioral difference of `git-rm` under sparse-index. See related discussion [1]. `git-rm` a sparse-directory entry within a sparse-index enabled repo behaves differently from a sparse directory within a sparse-checkout enabled repo. For example, in a sparse-index repo, where 'folder1' is a sparse-directory entry, `git rm -r --sparse folder1` provides this: rm 'folder1/' Whereas in a sparse-checkout repo without sparse-index, doing so provides this: rm 'folder1/0/0/0' rm 'folder1/0/1' rm 'folder1/a' Because `git rm` a sparse-directory entry does not need to expand the index, therefore we should accept the current behavior, which is faster than "expand the sparse-directory entry to match the sparse-checkout situation". Modify a previous test so such difference is not considered as an error. [1] https://github.com/ffyuanda/git/pull/6#discussion_r934861398 Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:23:26 -07:00
Shaoxuan Yuan	bcf96cfca6	rm: expand the index only when necessary Remove the `ensure_full_index()` method so `git-rm` does not always expand the index when the expansion is unnecessary, i.e. when <pathspec> does not have any possibilities to match anything outside of sparse-checkout definition. Expand the index when the <pathspec> needs an expanded index, i.e. the <pathspec> contains wildcard that may need a full-index or the <pathspec> is simply outside of sparse-checkout definition. Notice that the test 'rm pathspec expands index when necessary' in t1092 is testing this code change behavior, though it will be marked as 'test_expect_success' only in the next patch, where we officially mark `command_requires_full_index = 0`, so the index does not expand unless we tell it to do so. Notice that because we also want `ensure_full_index` to record the stdout and stderr from Git command, a corresponding modification is also included in this patch. The reason we want the "sparse-index-out" and "sparse-index-err", is that we need to make sure there is no error from Git command itself, so we can rely on the `test_region` result and determine if the index is expanded or not. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:23:26 -07:00
Shaoxuan Yuan	ba808251aa	t1092: add tests for `git-rm` Add tests for `git-rm`, make sure it behaves as expected when <pathspec> is both inside or outside of sparse-checkout definition. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:23:26 -07:00
Junio C Hamano	3f61790678	Merge branch 'vd/sparse-reset-checkout-fixes' into sy/sparse-rm * vd/sparse-reset-checkout-fixes: unpack-trees: unpack new trees as sparse directories cache.h: create 'index_name_pos_sparse()' oneway_diff: handle removed sparse directories checkout: fix nested sparse directory diff in sparse index	2022-08-08 13:23:06 -07:00
Victoria Dye	b15207b8cf	unpack-trees: unpack new trees as sparse directories If 'unpack_single_entry()' is unpacking a new directory tree (that is, one not already present in the index) into a sparse index, unpack the tree as a sparse directory rather than traversing its contents and unpacking each file individually. This helps keep the sparse index as collapsed as possible in cases such as 'git reset --hard' restoring a outside-of-cone directory removed with 'git rm -r --sparse'. Without this patch, 'unpack_single_entry()' will only unpack a directory into the index as a sparse directory (rather than traversing into it and unpacking its files one-by-one) if an entry with the same name already exists in the index. This patch allows sparse directory unpacking without a matching index entry when the following conditions are met: 1. the directory's path is outside the sparse cone, and 2. there are no children of the directory in the index If a directory meets these requirements (as determined by 'is_new_sparse_dir()'), 'unpack_single_entry()' unpacks the sparse directory index entry and propagates the decision back up to 'unpack_callback()' to prevent unnecessary tree traversal into the unpacked directory. Reported-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:21:50 -07:00
Victoria Dye	49ff3cb90f	checkout: fix nested sparse directory diff in sparse index Add the 'recursive' diff flag to the local changes reporting done by 'git checkout' in 'show_local_changes()'. Without the flag enabled, unexpanded sparse directories will not be recursed into to report the diff of each file's contents, resulting in the reported local changes including "modified" sparse directories. The same issue was found and fixed for 'git status' in `2c521b0e49` (status: fix nested sparse directory diff in sparse index, 2022-03-01) Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:21:49 -07:00
Junio C Hamano	1b53bea29a	Merge branch 'js/t5351-freebsd-fix' Some tests assumed that core.fsyncMethod=batch is supported everywhere, which broke FreeBSD. * js/t5351-freebsd-fix: t5351: avoid using `test_cmp` for binary data t5351: avoid relying on `core.fsyncMethod = batch` to be supported	2022-08-08 13:13:14 -07:00
Junio C Hamano	1e92768aa1	Merge branch 'tb/cat-file-z' Operating modes like "--batch" of "git cat-file" command learned to take NUL-terminated input, instead of one-item-per-line. * tb/cat-file-z: builtin/cat-file.c: support NUL-delimited input with `-z` t1006: extract --batch-command inputs to variables	2022-08-05 15:52:14 -07:00
Junio C Hamano	ef7b9ad032	Merge branch 'ds/doc-wo-whitelist' into maint Avoid "white/black-list" in documentation and code comments. source: <pull.1274.v3.git.1658255537.gitgitgadget@gmail.com> * ds/doc-wo-whitelist: transport.c: avoid "whitelist" t: avoid "whitelist" git.txt: remove redundant language git-cvsserver: clarify directory list daemon: clarify directory arguments	2022-08-05 15:51:37 -07:00
Junio C Hamano	d16978517c	Merge branch 'mb/config-document-include' into maint Add missing documentation for "include" and "includeIf" features in "git config" file format, which incidentally teaches the command line completion to include them in its offerings. source: <pull.1285.v2.git.1658002423864.gitgitgadget@gmail.com> * mb/config-document-include: config.txt: document include, includeIf	2022-08-05 15:51:36 -07:00
Junio C Hamano	de28459136	Merge branch 'jk/clone-unborn-confusion' into maint "git clone" from a repository with some ref whose HEAD is unborn did not set the HEAD in the resulting repository correctly, which has been corrected. source: <YsdyLS4UFzj0j/wB@coredump.intra.peff.net> * jk/clone-unborn-confusion: clone: move unborn head creation to update_head() clone: use remote branch if it matches default HEAD clone: propagate empty remote HEAD even with other branches clone: drop extra newline from warning message	2022-08-05 15:51:35 -07:00
Derrick Stolee	863a8ae97b	maintenance: stop writing log.excludeDecoration This reverts commit `96eaffebbf` (maintenance: set log.excludeDecoration durin prefetch, 2021-01-19). The previous change created a default decoration filter that does not include refs/prefetch/, so this modification of the config is no longer needed. One issue that can happen from this point on is that users who ran the prefetch task on previous versions of Git will still have a log.excludeDecoration value and that will prevent the new default decoration filter from being active. Thus, when we add the refs/bundle/ namespace as part of the bundle URI feature, those users will see refs/bundle/ decorations. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-05 14:13:13 -07:00
Derrick Stolee	3e103ed23f	log: create log.initialDecorationSet=all The previous change introduced the --clear-decorations option for users who do not want their decorations limited to a narrow set of ref namespaces. Add a config option that is equivalent to specifying --clear-decorations by default. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-05 14:13:12 -07:00
Derrick Stolee	748706d713	log: add --clear-decorations option The previous changes introduced a new default ref filter for decorations in the 'git log' command. This can be overridden using --decorate-refs=HEAD and --decorate-refs=refs/, but that is cumbersome for users. Instead, add a --clear-decorations option that resets all previous filters to a blank filter that accepts all refs. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-05 14:13:12 -07:00
Derrick Stolee	92156291ca	log: add default decoration filter When a user runs 'git log', they expect a certain set of helpful decorations. This includes: * The HEAD ref * Branches (refs/heads/) * Stashes (refs/stash) * Tags (refs/tags/) * Remote branches (refs/remotes/) * Replace refs (refs/replace/ or $GIT_REPLACE_REF_BASE) Each of these namespaces was selected due to existing test cases that verify these namespaces appear in the decorations. In particular, stashes and replace refs can have custom colors from the color.decorate.<slot> config option. While one test checks for a decoration from notes, it only applies to the tip of refs/notes/commit (or its configured ref name). Notes form their own kind of decoration instead. Modify the expected output for the tests in t4013 that expect this note decoration. There are several tests throughout the codebase that verify that --decorate-refs, --decorate-refs-exclude, and log.excludeDecoration work as designed and the tests continue to pass without intervention. However, there are other refs that are less helpful to show as decoration: * Prefetch refs (refs/prefetch/) * Rebase refs (refs/rebase-merge/ and refs/rebase-apply/) * Bundle refs (refs/bundle/) [!] [!] The bundle refs are part of a parallel series that bootstraps a repo from a bundle file, storing the bundle's refs into the repo's refs/bundle/ namespace. In the case of prefetch refs, `96eaffebbf` (maintenance: set log.excludeDecoration durin prefetch, 2021-01-19) added logic to add refs/prefetch/ to the log.excludeDecoration config option. Additional feedback pointed out that having such a side-effect can be confusing and perhaps not helpful to users. Instead, we should hide these ref namespaces that are being used by Git for internal reasons but are not helpful for the users to see. The way to provide a seamless user experience without setting the config is to modify the default decoration filters to match our expectation of what refs the user actually wants to see. In builtin/log.c, after parsing the --decorate-refs and --decorate-refs-exclude options from the command-line, call set_default_decoration_filter(). This method populates the exclusions from log.excludeDecoration, then checks if the list of pattern modifications are empty. If none are specified, then the default set is restricted to the set of inclusions mentioned earlier (HEAD, branches, etc.). A previous change introduced the ref_namespaces array, which includes all of these currently-used namespaces. The 'decoration' value is non-zero when that namespace is associated with a special coloring and fits into the list of "expected" decorations as described above, which makes the implementation of this filter very simple. Note that the logic in ref_filter_match() in log-tree.c follows this matching pattern: 1. If there are exclusion patterns and the ref matches one, then ignore the decoration. 2. If there are inclusion patterns and the ref matches one, then definitely include the decoration. 3. If there are config-based exclusions from log.excludeDecoration and the ref matches one, then ignore the decoration. With this logic in mind, we need to ensure that we do not populate our new defaults if any of these filters are manually set. Specifically, if a user runs git -c log.excludeDecoration=HEAD log then we expect the HEAD decoration to not appear. If we left the default inclusions in the set, then HEAD would match that inclusion before reaching the config-based exclusions. A potential alternative would be to check the list of default inclusions at the end, after the config-based exclusions. This would still create a behavior change for some uses of --decorate-refs-exclude=<X>, and could be overwritten somewhat with --decorate-refs=refs/ and --decorate-refs=HEAD. However, it no longer becomes possible to include refs outside of the defaults while also excluding some using log.excludeDecoration. Another alternative would be to exclude the known namespaces that are not intended to be shown. This would reduce the visible effect of the change for expert users who use their own custom ref namespaces. The implementation change would be very simple to swap due to our use of ref_namespaces: int i; struct string_list exclude = decoration_filter->exclude_ref_pattern; / * No command-line or config options were given, so * populate with sensible defaults. */ for (i = 0; i < NAMESPACE__COUNT; i++) { if (ref_namespaces[i].decoration) continue; string_list_append(exclude, ref_namespaces[i].ref); } The main downside of this approach is that we expect to add new hidden namespaces in the future, and that means that Git versions will be less stable in how they behave as those namespaces are added. It is critical that we provide ways for expert users to disable this behavior change via command-line options and config keys. These changes will be implemented in a future change. Add a test that checks that the defaults are not added when --decorate-refs is specified. We verify this by showing that HEAD is not included as it normally would. Also add a test that shows that the default filter avoids the unwanted decorations from refs/prefetch, refs/rebase-merge, and refs/bundle. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-05 14:13:12 -07:00
Derrick Stolee	5797b13919	t4207: test coloring of grafted decorations The color.decorate.<slot> config option added the 'grafted' slot in `09c4ba410b` (log-tree: allow to customize 'grafted' color, 2018-05-26) but included no tests for this behavior. When modifying some logic around decorations, this ref namespace was ignored and could have been lost as a default namespace for 'git log' decorations by default. Add two tests to t4207 that check that the replaced objects are correctly decorated. Use "black" as the color since it is distinct from the other colors already in the test. The first test uses regular replace-objects while the second creates a commit graft. Be sure to test both modes with GIT_REPLACE_REF_BASE unset and set to an alternative base. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-05 14:13:11 -07:00
Derrick Stolee	b004521aa6	t4207: modernize test Before adding new tests to t4207-log-decoration-colors.sh, update the existing test to use modern test conventions. This includes: 1. Use lowercase in test names. 2. Keep all test setup inside the test_expect_success blocks. We need to be careful about left whitespace in the broken lines of the input file. 3. Do not use 'git' commands on the left side of a pipe. 4. Create a cmp_filtered_decorations helper to perform the 'log', 'sed', and test_decode_color manipulations. Move the '--all' option to be an argument so we can change that value in future tests. 5. Modify the 'sed' command to use a simpler form that is more portable. The next change will introduce new tests usinge these new conventions. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-05 14:13:11 -07:00
Derrick Stolee	b877e617e6	refs: allow "HEAD" as decoration filter The normalize_glob_ref() method was introduced in `65516f586b` (log: add option to choose which refs to decorate, 2017-11-21) to help with decoration filters such as --decorate-refs=<filter> and --decorate-refs-exclude=<filter>. The method has not been used anywhere else. At the moment, it is impossible to specify HEAD as a decoration filter since normalize_glob_ref() prepends "refs/" to the filter if it isn't already there. Allow adding HEAD as a decoration filter by allowing the exact string "HEAD" to not be prepended with "refs/". Add a test in t4202-log.sh that would previously fail since the HEAD decoration would exist in the output. It is sufficient to only cover "HEAD" here and not include other special refs like REBASE_HEAD. This is because HEAD is the only ref outside of refs/* that is added to the list of decorations. However, we may want to special-case these other refs in normalize_glob_ref() in the future. Leave a NEEDSWORK comment for now. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-05 14:13:11 -07:00
Ævar Arnfjörð Bjarmason	99ddc24672	hook API: don't segfault on strbuf_addf() to NULL "out" Fix a logic error in `a082345372` (hook API: fix v2.36.0 regression: hooks should be connected to a TTY, 2022-06-07). When it started using the "ungroup" API added in `fd3aaf53f7` (run-command: add an "ungroup" option to run_process_parallel(), 2022-06-07) it should have made the same sort of change that `fd3aaf53f7` itself made in "t/helper/test-run-command.c". The correct way to emit this "Couldn't start" output with "ungroup" would be: fprintf(stderr, _("Couldn't start hook '%s'\n"), hook_path); But we should instead remove the emitting of this output. As the added test shows we already emit output when we can't run the child. The "cannot run" output here is emitted by run-command.c's child_err_spew(). So the addition of the "Couldn't start hook" output here in `96e7225b31` (hook: add 'run' subcommand, 2021-12-22) was always redundant. For the pre-commit hook we'll now emit exactly the same output as we did before `f443246b9f` (commit: convert {pre-commit,prepare-commit-msg} hook to hook.h, 2021-12-22) (and likewise for others). We could at this point add this to the pick_next_hook() callbacks in hook.c: assert(!out); assert(!*pp_task_cb); And this to notify_start_failure() and notify_hook_finished() (in the latter case the parameter is called "pp_task_cp"): assert(!out); assert(!pp_task_cb); But let's leave any such instrumentation for some eventual cleanup of the "ungroup" API. Reported-by: Ilya K <me@0upti.me> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-05 14:12:00 -07:00
Ævar Arnfjörð Bjarmason	5db921054e	docs: move protocol-related docs to man section 5 Continue the move of existing Documentation/technical/* protocol and file-format documentation into our main documentation space. By moving the things that discuss the protocol we can properly link from e.g. lsrefs.unborn and protocol.version documentation to a manpage we build by default. So far we have been using the "gitformat-" prefix for the documentation we've been moving over from Documentation/technical/, but for protocol documentation let's use "gitprotocol-". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 14:12:23 -07:00
Ævar Arnfjörð Bjarmason	844739ba27	git docs: add a category for file formats, protocols and interfaces Create a new "File formats, protocols and other developer interfaces" section in the main "git help git" manual page and start moving the documentation that now lives in "Documentation/technical/.git" over to it. This complements the newly added and adjacent "Repository, command and file interfaces" section. This makes the technical documentation more accessible and discoverable. Before this we wouldn't install it by default, and had no ability to build man page versions of them. The links to them from our existing documentation link to the generated HTML version of these docs. So let's start moving those over, starting with just the "bundle-format.txt" documentation added in `7378ec90e1` (doc: describe Git bundle format, 2020-02-07). We'll now have a new gitformat-bundle(5) man page. Subsequent commits will move more git internal format documentation over. Unfortunately the syntax of the current Documentation/technical/.txt is not the same (when it comes to section headings etc.) as our Documentation/*.txt documentation, so change the relevant bits of syntax as we're moving this over. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 14:12:23 -07:00
Ævar Arnfjörð Bjarmason	d976c5100f	git docs: add a category for user-facing file, repo and command UX Create a new "Repository, command and file interfaces" section in the main "git help git" manual page. Move things that belong under this new criteria from the generic "Guides" section. The "Guides" section was added in `f442f28a81` (git.txt: add list of guides, 2020-08-05). It makes sense to have e.g. "giteveryday(7)" and "gitfaq(7)" listed under "Guides". But placing e.g. "gitignore(5)" in it is stretching the meaning of what a "guide" is, ideally that section should list things similar to "giteveryday(7)" and "gitcore-tutorial(7)". An alternate name that was considered for this new section was "User formats", for consistency with the nomenclature used for man section 5 in general. My man(1) lists it as "File formats and conventions, e.g. /etc/passwd". So calling this "git help --formats" or "git help --user-formats" would make sense for e.g. gitignore(5), but would be stretching it somewhat for githooks(5), and would seem really suspect for the likes of gitcli(7). Let's instead pick a name that's closer to the generic term "User interface", which is really what this documentation discusses: General user-interface documentation that doesn't obviously belong elsewhere. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 14:12:23 -07:00
Calvin Wan	4057523a40	submodule merge: update conflict error message When attempting to merge in a superproject with conflicting submodule pointers that cannot be fast-forwarded or trivially resolved, the merge fails and Git prints an error message that accurately describes the failure, but does not provide steps for the user to resolve the error. Git is left in a conflicted state, which requires the user to: 1. merge submodules or update submodules to an already existing commit that reflects the merge 2. add submodules changes to the superproject 3. finish merging superproject These steps are non-obvious for newer submodule users to figure out based on the error message and neither `git submodule status` nor `git status` provide any useful pointers. Update error message to provide steps to resolve submodule merge conflict. Future work could involve adding an advice flag to the message. Although the message is long, it also has the id of the submodule commit that needs to be merged, which could be useful information for the user. Additionally, 5 merge failures that resulted in an early return have been updated to reflect the status of the merge. 1. Null merge base (null o): CONFLICT_SUBMODULE_NULL_MERGE_BASE added as a new conflict type and will print updated error message. 2. Null merge side a (null a): BUG(). See [1] for discussion 3. Null merge side b (null b): BUG(). See [1] for discussion 4. Submodule not checked out: added NEEDSWORK bit 5. Submodule commits not present: added NEEDSWORK bit The errors with a NEEDSWORK bit deserve a more detailed explanation of how to resolve them. See [2] for more context. [1] https://lore.kernel.org/git/CABPp-BE0qGwUy80dmVszkJQ+tcpfLRW0OZyErymzhZ9+HWY1mw@mail.gmail.com/ [2] https://lore.kernel.org/git/xmqqpmhjjwo9.fsf@gitster.g/ Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 13:43:07 -07:00
Phillip Wood	a6a58f7801	tests: cache glibc version check `131b94a10a` ("test-lib.sh: Use GLIBC_TUNABLES instead of MALLOC_CHECK_ on glibc >= 2.34", 2022-03-04) introduced a check for the version of glibc that is in use. This check is performed as part of setup_malloc_check() which is called at least once for each test. As the test involves forking `getconf` and `expr` cache the result and use that within setup_malloc_check() to avoid forking these extra processes for each test. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 11:09:18 -07:00
Junio C Hamano	966ff64a30	Merge branch 'en/merge-restore-to-pristine' When "git merge" finds that it cannot perform a merge, it should restore the working tree to the state before the command was initiated, but in some corner cases it didn't. * en/merge-restore-to-pristine: merge: do not exit restore_state() prematurely merge: ensure we can actually restore pre-merge state merge: make restore_state() restore staged state too merge: fix save_state() to work when there are stat-dirty files merge: do not abort early if one strategy fails to handle the merge merge: abort if index does not match HEAD for trivial merges merge-resolve: abort if index does not match HEAD merge-ort-wrappers: make printed message match the one from recursive	2022-08-03 13:36:09 -07:00
Junio C Hamano	4e0d160bbc	Merge branch 'rs/mergesort' Make our mergesort implementation type-safe. * rs/mergesort: mergesort: remove llist_mergesort() packfile: use DEFINE_LIST_SORT fetch-pack: use DEFINE_LIST_SORT commit: use DEFINE_LIST_SORT blame: use DEFINE_LIST_SORT test-mergesort: use DEFINE_LIST_SORT test-mergesort: use DEFINE_LIST_SORT_DEBUG mergesort: add macros for typed sort of linked lists mergesort: tighten merge loop mergesort: unify ranks loops	2022-08-03 13:36:09 -07:00
Junio C Hamano	87098a047b	Merge branch 'sa/cat-file-mailmap' "git cat-file" learned an option to use the mailmap when showing commit and tag objects. * sa/cat-file-mailmap: cat-file: add mailmap support ident: rename commit_rewrite_person() to apply_mailmap_to_header() ident: move commit_rewrite_person() to ident.c revision: improve commit_rewrite_person()	2022-08-03 13:36:08 -07:00
Junio C Hamano	8e56affcb5	Merge branch 'zh/ls-files-format' "git ls-files" learns the "--format" option to tweak its output. * zh/ls-files-format: ls-files: introduce "--format" option	2022-08-03 13:36:08 -07:00
Junio C Hamano	37e4bdd5ee	Merge branch 'tb/commit-graph-genv2-upgrade-fix' There was a bug in the codepath to upgrade generation information in commit-graph from v1 to v2 format, which has been corrected. * tb/commit-graph-genv2-upgrade-fix: commit-graph: fix corrupt upgrade from generation v1 to v2 commit-graph: introduce `repo_find_commit_pos_in_graph()` t5318: demonstrate commit-graph generation v2 corruption	2022-08-03 13:36:08 -07:00

... 4 5 6 7 8 ...

20373 Commits