When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.
To further assist the test author, display line numbers along with the
annotated test definition, thus allowing the author to jump directly to
each problematic line.
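With line numbers added, the annotated output might look roughly like
this (an illustrative mock-up, not the literal format):

    t9999-example.sh
    'setup sub-repo'
    12     mkdir sub &&
    13     (
    14             cd sub ?!AMP?!
    15             git init
    16     ) &&
    17     test -d sub/.git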
Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, it prints out the name of the
test script, the name of the problematic test, and a copy of the test
definition with "?!FOO?!" annotations inserted at the locations where
problems were detected. Taken together this information is sufficient
for the test author to identify the problematic code in the original
test definition. However, in a lengthy script or a lengthy test
definition, the author may still end up using the editor's search
feature to home in on the exact problem location.
To further assist the test author, an upcoming change will display line
numbers along with the annotated test definition, thus allowing the
author to jump directly to each problematic line. As preparation,
upgrade Lexer to latch the line numbers at which each token starts and
ends, and return that information with the token itself.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Although the macOS Terminal.app is "xterm"-compatible, its corresponding
"terminfo" entries -- such as "xterm", "xterm-256color", and
"xterm-new"[1] -- neglect to mention capabilities which Terminal.app
actually supports (such as "dim text"). This oversight on Apple's part
ends up penalizing users of "good citizen" console programs which
consult "terminfo" to tailor their output based upon reported terminal
capabilities (as opposed to programs which assume that the terminal
supports ANSI codes). The same problem is present in other Apple
"terminfo" entries, such as "nsterm"[2], with which macOS Terminal.app
may be configured.
Sidestep this Apple problem by imbuing get_colors() with specific
knowledge of capabilities common to "xterm" and "nsterm", rather than
trusting "terminfo" to report them correctly. Although hard-coding such
knowledge is ugly, "xterm" support is nearly ubiquitous these days, and
Git itself sets a precedent by assuming support for ANSI color codes. For
other terminal types, fall back to querying "terminfo" via `tput` as
usual.
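In shell terms, the idea is roughly this (an illustrative sketch only;
the variable names are made up and the real get_colors() differs in
detail):

    case "$TERM" in
    xterm*|nsterm*)
        # assume capabilities every xterm-compatible terminal supports,
        # rather than trusting what "terminfo" reports
        color_dim=$(printf '\033[2m')
        ;;
    *)
        # for any other terminal type, consult "terminfo" as usual
        color_dim=$(tput dim)
        ;;
    esac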
FOOTNOTES
[1] iTerm2 FAQ suggests "xterm-new": https://iterm2.com/faq.html
[2] Neovim documentation recommends terminal type "nsterm" with
Terminal.app: https://neovim.io/doc/user/term.html#terminfo
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The `label` command creates a ref refs/rewritten/<label> that the
`reset` and `merge` commands resolve by calling lookup_label(). That
uses lookup_commit_reference_by_name() to look up the label ref. As
lookup_commit_reference_by_name() uses the dwim rules when looking up
the label, it will look for a branch named
refs/heads/refs/rewritten/<label> and return that instead of an error if
the branch exists and the label does not. Fix this by using read_ref()
followed by lookup_commit_object() when looking up labels.
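The dwim behavior being side-stepped here can be seen with plain
commands (illustrative; the sequencer hits the same rules via
lookup_commit_reference_by_name()):

    $ git branch refs/rewritten/onto HEAD
    $ git rev-parse --verify refs/rewritten/onto
    # resolves via refs/heads/refs/rewritten/onto instead of erroring out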
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The arguments to the `reset` and `merge` commands may be a label created
with a `label` command or an arbitrary commit name. The `merge` command
uses the lookup_label() function to look up its arguments, but `reset` has
a slightly different version of that function in do_reset(). Reduce this
code duplication by calling lookup_label() from do_reset() as well.
This change improves the behavior of `reset` when the argument is a
tree. Previously `reset` would accept a tree, only for the rebase to
fail with:

    update_ref failed for ref 'HEAD': cannot update ref 'HEAD': trying to write non-commit object da5497437fd67ca928333aab79c4b4b55036ea66 to branch 'HEAD'
Using lookup_label() means do_reset() will now error out straight away
if its argument is not a commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Enable the 'skip_cache_tree_update' option in both 'do_reset()'
('sequencer.c') and 'reset_head()' ('reset.c'). Both of these callers invoke
'prime_cache_tree()' after 'unpack_trees()', so we can remove an unnecessary
cache tree rebuild by skipping 'cache_tree_update()'.
When testing with 'p3400-rebase.sh' and 'p3404-rebase-interactive.sh', the
performance change of this update was negligible, likely due to the
operation being dominated by more expensive operations (like checking out
trees). However, since the change doesn't harm performance, it's worth
keeping this 'unpack_trees()' usage consistent with others that subsequently
invoke 'prime_cache_tree()'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When running 'read-tree' with a single tree and no prefix,
'prime_cache_tree()' is called after the tree is unpacked. In that
situation, skip a redundant call to 'cache_tree_update()' in
'unpack_trees()' by enabling the 'skip_cache_tree_update' unpack option.
Removing the redundant cache tree update provides a substantial performance
improvement to 'git read-tree <tree-ish>', as shown by a test added to
'p0006-read-tree-checkout.sh':
Test                                    before            after
----------------------------------------------------------------------
read-tree br_ballast_plus_1             3.94(1.80+1.57)   3.00(1.14+1.28) -23.9%
Note that the 'read-tree' in 't1022-read-tree-partial-clone.sh' is updated
to read two trees, rather than one. The test was first introduced in
d3da223f22 (cache-tree: prefetch in partial clone read-tree, 2021-07-23) to
exercise the 'cache_tree_update()' code path, as used in 'git merge'. Since
this patch drops the call to 'cache_tree_update()' in single-tree 'git
read-tree', change the test to use the two-tree variant so that
'cache_tree_update()' is called as intended.
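For reference, the two flavors look something like this (illustrative
invocations, not the test's exact commands):

    $ git read-tree HEAD            # one tree, no prefix: prime_cache_tree() follows
    $ git read-tree -m HEAD^ HEAD   # two trees: cache_tree_update() is still exercised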
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Enable the 'skip_cache_tree_update' option in the variants that call
'prime_cache_tree()' after 'unpack_trees()' (specifically, 'git reset
--mixed' and 'git reset --hard'). This avoids redundantly rebuilding the
cache tree in both 'cache_tree_update()' at the end of 'unpack_trees()' and
in 'prime_cache_tree()', resulting in a small (but consistent) performance
improvement. From the newly-added 'p7102-reset.sh' test:
Test                              before            after
--------------------------------------------------------------------
7102.1: reset --hard (...)        2.11(0.40+1.54)   1.97(0.38+1.47) -6.6%
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add (disabled by default) option to skip the 'cache_tree_update()' at the
end of 'unpack_trees()'. In many cases, this cache tree update is redundant
because the caller of 'unpack_trees()' immediately follows it with
'prime_cache_tree()', rebuilding the entire cache tree from scratch. While
these operations aren't the most expensive part of operations like 'git
reset', the duplicate calls still create a minor unnecessary slowdown.
Introduce an option for callers to skip the 'cache_tree_update()' in
'unpack_trees()' if it is redundant (that is, if 'prime_cache_tree()' is
called afterwards). At the moment, no 'unpack_trees()' callers use the new
option; they will be updated in subsequent patches.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add a performance test comparing the execution times of 'prime_cache_tree()'
and 'cache_tree_update(_, WRITE_TREE_SILENT | WRITE_TREE_REPAIR)'. The goal
of comparing these two is to identify which is the faster method for
rebuilding an invalid cache tree, ultimately to remove one when both are
(redundantly) called in immediate succession.
Both methods are fast, so the new tests in 'p0090-cache-tree.sh' must call
each tested function multiple times to ensure the reported times (to 0.01s
resolution) convey the differences between them.
The tests compare the timing of a 'test-tool cache-tree' run as a no-op (to
capture a baseline for the overhead associated with running the tool),
'cache_tree_update()', and 'prime_cache_tree()' on four scenarios:
- A completely valid cache tree
- A cache tree with 2 invalid paths
- A cache tree with 50 invalid paths
- A completely empty cache tree
Example results:
Test                                          this tree
-----------------------------------------------------------
0090.2: no-op, clean                          1.27(0.48+0.52)
0090.3: prime_cache_tree, clean               2.02(0.83+0.85)
0090.4: cache_tree_update, clean              1.30(0.49+0.54)
0090.5: no-op, invalidate 2                   1.29(0.48+0.54)
0090.6: prime_cache_tree, invalidate 2        1.98(0.81+0.83)
0090.7: cache_tree_update, invalidate 2       2.12(0.94+0.86)
0090.8: no-op, invalidate 50                  1.32(0.50+0.55)
0090.9: prime_cache_tree, invalidate 50       2.10(0.86+0.89)
0090.10: cache_tree_update, invalidate 50     2.35(1.14+0.90)
0090.11: no-op, empty                         1.33(0.50+0.54)
0090.12: prime_cache_tree, empty              2.04(0.84+0.87)
0090.13: cache_tree_update, empty             2.51(1.27+0.92)
These timings show that, while 'cache_tree_update()' is faster when the
cache tree is completely valid, it is equal to or slower than
'prime_cache_tree()' when there are any invalid paths. Since the redundant
calls are mostly in scenarios where the cache tree will be at least
partially invalid (e.g., 'git reset --hard'), 'prime_cache_tree()' will
likely perform better than 'cache_tree_update()' in typical cases.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When deleting a branch, "git branch -d" has a safety check that ensures
the branch is merged to its upstream (if any), or to HEAD. To do that,
naturally we try to resolve HEAD to a commit object. If we're on an
orphan branch (i.e., HEAD points to a branch that does not yet exist),
that will fail, and we'll bail with an error:
$ git branch -d to-delete
fatal: Couldn't look up commit object for HEAD
This usually isn't that big of a deal. The deletion would fail anyway,
since the branch isn't merged to HEAD, and you'd need to use "-D" (or
"-f"). And doing so skips the HEAD resolution, courtesy of 67affd5173
(git-branch -D: make it work even when on a yet-to-be-born branch,
2006-11-24).
But there are still two problems:
1. The error message isn't very helpful. We should give the usual "not
fully merged" message, which points the user at "branch -D". That
was a problem even back in 67affd5173.
2. Even without a HEAD, these days it's still possible for the
deletion to succeed. After 67affd5173, commit 99c419c915 (branch
-d: base the "already-merged" safety on the branch it merges with,
2009-12-29) made it OK to delete a branch if it is merged to its
upstream.
We can fix both by removing the die() in delete_branches() completely,
leaving head_rev NULL in this case. It's tempting to stop there, as it
appears at first glance that the rest of the code does the right thing
with a NULL. But sadly, it's not quite true.
We end up feeding the NULL to repo_is_descendant_of(). In the
traditional code path there, we call repo_in_merge_bases_many(). It
feeds the NULL to repo_parse_commit(), which is smart enough to return
an error, and we immediately return "no, it's not a descendant".
But there's an alternate code path: if we have a commit graph with
generation numbers, we end up in can_all_from_reach(), which does
eventually try to set a flag on the NULL commit and segfaults.
So instead, we'll teach the local branch_merged() helper to treat a NULL
as "not merged". This would be a little more elegant in in_merge_bases()
itself, but that function is called in a lot of places, and it's not
clear that quietly returning "not merged" is the right thing everywhere
(I'd expect in many cases, feeding a NULL is a sign of a bug).
There are four tests here:
a. The first one confirms that deletion succeeds with an orphaned HEAD
when the branch is merged to its upstream. This is case (2) above.
b. Same, but with commit graphs enabled. Even if it is merged to
upstream, we still check head_rev so that we can say "deleting
because it's merged to upstream, even though it's not merged to
HEAD". Without the second hunk in branch_merged(), this test would
segfault in can_all_from_reach().
c. The third one confirms that we correctly say "not merged to HEAD"
when we can't resolve HEAD, and reject the deletion.
d. Same, but with commit graphs enabled. Without the first hunk in
branch_merged(), this one would segfault.
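Scenario (a) can be reproduced with something along these lines
(assuming a clone in which 'origin/main' exists):

    $ git checkout --orphan orphaned
    $ git branch --track to-delete origin/main
    $ git branch -d to-delete

Before this change the final command died with "fatal: Couldn't look up
commit object for HEAD"; afterwards it succeeds, since the branch is
merged to its upstream.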
Reported-by: Martin von Zweigbergk <martinvonz@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Previously, the example *decreased* the cache timeout compared to the
default, making it less user friendly.
Instead, nudge users to make cache more usable. Many users choose
store over cache.
https://lore.kernel.org/git/CAGJzqskRYN49SeS8kSEN5-vbB_Jt1QvAV9QhS6zNuKh0u8wxPQ@mail.gmail.com/
The default timeout remains 15 minutes. A stronger nudge would
be to increase that.
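For instance, the example can nudge toward a longer timeout along the
lines of (the exact value shown in the documentation may differ):

    $ git config --global credential.helper "cache --timeout=86400"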
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Now that struct replay_opts has a reflog_action member we no longer
need to export GIT_REFLOG_ACTION when starting a rebase. If the user
has set GIT_REFLOG_ACTION then we use it when initializing
reflog_action.
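For example (illustrative):

    $ GIT_REFLOG_ACTION='my rebase' git rebase main
    $ git reflog   # entries written by the rebase start with "my rebase"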
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Each time it picks a commit, the sequencer copies the GIT_REFLOG_ACTION
environment variable so it can temporarily change it and then restore
the previous value. This results in code that is hard to follow and also
leaks memory because (i) we fail to free the copy when we've finished
with it and (ii) each call to setenv() leaks the previous value. Instead
pass the reflog action around in a variable and use it to set
GIT_REFLOG_ACTION in the child environment when running "git commit".
Within the sequencer GIT_REFLOG_ACTION is no longer set and is only read
by sequencer_reflog_action(). It is still set by rebase before calling
the sequencer; that will be addressed in the next commit. cherry-pick
and revert are unaffected as they do not set GIT_REFLOG_ACTION before
calling the sequencer.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When t7610-mergetool.sh runs without failures, the git://example.com
submodule URLs will never be used. That's because we "git submodule
add" it, but then manually populate them so that subsequent "git
submodule update -N" won't attempt to clone it, only update it without
fetching.
But if we fail in an earlier test it'll have the knock-on effect of
having later tests hang on that "git submodule update -N" as we
attempt to clone this repository from example.com.
This can be reproduced on "master" by running the test with
SANITIZE=leak without "--immediate". With
"GIT_TEST_PASSING_SANITIZE_LEAK=true" (which the linux-leaks job uses)
we'll skip the test entirely. So we'll only run into this when running
it manually, or with the "GIT_TEST_PASSING_SANITIZE_LEAK=check" mode.
That's not because the failure has anything to do with leak detection
per se. It just so happens that we have a leak that will cause a test to
fail before we've managed to fully set these up, and therefore "git
submodule update -N" ends up spawning "git clone".
Let's instead continue lying about the origin of this submodule by
providing a URL for it that doesn't work, but now one that *really*
doesn't work: /dev/null. If the test is passing we won't ever use
this, and if we have knock-on failures we'll fail early, instead of
waiting for a timeout.
The behavior of "-N" here might be surprising to some, since it's
explained as "[if you use -N we] don’t fetch new objects from the
remote site". But (perhaps counter-intuitively) it's only talking
about if it needs to do so via "git fetch". In this case we'll end up
spawning a "git clone", as we have no submodule set up.
See ff7f089ed1 (mergetool: Teach about submodules, 2011-04-13) for
the commit that implemented these "example.com" tests.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
* es/chainlint-output:
chainlint: annotate original test definition rather than token stream
chainlint: latch start/end position of each token
chainlint: tighten accuracy when consuming input stream
chainlint: add explanatory comments
Simplify the run-command API.
* rs/no-more-run-command-v:
replace and remove run_command_v_opt()
replace and remove run_command_v_opt_cd_env_tr2()
replace and remove run_command_v_opt_tr2()
replace and remove run_command_v_opt_cd_env()
use child_process members "args" and "env" directly
use child_process member "args" instead of string array variable
sequencer: simplify building argument list in do_exec()
bisect--helper: factor out do_bisect_run()
bisect: simplify building "checkout" argument list
am: simplify building "show" argument list
run-command: fix return value comment
merge: remove always-the-same "verbose" arguments
"git archive" mistakenly complained twice about a missing executable,
which has been corrected.
* rs/archive-filter-error-once:
archive-tar: report filter start error only once
A redundant diagnostic message is dropped from test_path_is_missing().
* ma/drop-redundant-diagnostic:
test-lib-functions: drop redundant diagnostic print
Various tests exercising the transfer.credentialsInUrl configuration
are taught to avoid making requests which require resolving localhost
to reduce CI-flakiness.
* jk/ref-filter-parsing-bugs:
ref-filter: fix parsing of signatures with CRLF and no body
ref-filter: fix parsing of signatures without blank lines
The glossary entries for "commit-graph file" and "reachability
bitmap" have been added.
* po/glossary-around-traversal:
glossary: add reachability bitmap description
glossary: add "commit graph" description
doc: use 'object database' not ODB or abbreviation
doc: use "commit-graph" hyphenation consistently
The adjust_shared_perm() helper function learned to refrain from
setting the "g+s" bit on directories when it is not necessary.
* jc/set-gid-bit-less-aggressively:
adjust_shared_perm(): leave g+s alone when the group does not matter
Enable gc.cruftpacks by default for those who opt into
feature.experimental setting.
* es/mark-gc-cruft-as-experimental:
config: let feature.experimental imply gc.cruftPacks=true
gc: add tests for --cruft and friends
Git asks for a "password", but the user might use a
personal access token or OAuth access token instead.
Example:
Password for 'https://AzureDiamond@github.com':
Signed-off-by: M Hickford <mirth.hickford@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Resolves a problem where symbolic links were not showing up in diff when
created or modified.
kFSEventStreamEventFlagItemIsSymlink is also treated as a file update.
This is because kFSEventStreamEventFlagItemIsFile is not included in
FSEvents when creating or deleting symbolic links. For example:
$ ln -snf t test
fsevent: '/path/to/dir/test', flags=0x40100 ItemCreated|ItemIsSymlink|
$ ln -snf ci test
fsevent: '/path/to/dir/test', flags=0x40200 ItemIsSymlink|ItemRemoved|
fsevent: '/path/to/dir/test', flags=0x40100 ItemCreated|ItemIsSymlink|
Signed-off-by: srz_zumix <zumix.cpp@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Have the REV_INFO_INIT macro added in [1] declare more members of
"struct rev_info" that we can initialize statically, and have
repo_init_revisions() do so with the memcpy(..., &blank) idiom
introduced in [2].
As the comment for the "REV_INFO_INIT" macro notes this still isn't
sufficient to initialize a "struct rev_info" for use yet. But we are
getting closer to that eventual goal.
Even though we can't fully initialize a "struct rev_info" with
REV_INFO_INIT it's useful for readability to clearly separate those
things that we can statically initialize, and those that we can't.
This change could replace the:

    list_objects_filter_init(&revs->filter);

in repo_init_revisions() with this line, at the end of the
REV_INFO_INIT declaration in revisions.h:

    .filter = LIST_OBJECTS_FILTER_INIT, \
But doing so would produce a minor conflict with an outstanding
topic[3]. Let's skip that for now. I have follow-ups to initialize
more of this statically, e.g. changes to get rid of grep_init(). We
can initialize more members with the macro in a future series.
1. f196c1e908 (revisions API users: use release_revisions() needing
REV_INFO_INIT, 2022-04-13)
2. 5726a6b401 (*.c *_init(): define in terms of corresponding *_INIT
macro, 2021-07-01)
3. https://lore.kernel.org/git/265b292ed5c2de19b7118dfe046d3d9d932e2e89.1667901510.git.ps@pks.im/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The old version we currently use runs in node.js v12.x, which is being
deprecated in GitHub Actions. The new version uses node.js v16.x.
Incidentally, this also avoids the warning about the deprecated
`::set-output::` workflow command because the newer version of the
`github-script` Action uses the recommended new way to specify outputs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.
However, now that each parsed token carries positional information, the
location of a detected problem can be pinpointed precisely in the
original test definition. Therefore, take advantage of this information
to annotate the test definition itself rather than annotating the parsed
token stream, thus making it easier for a test author to relate a
problem back to the source.
Maintaining the positional meta-information associated with each
detected problem requires a slight change in how the problems are
managed internally. In particular, shell syntax such as:
msg="total: $(cd data; wc -w *.txt) words"
requires the lexical analyzer to recursively invoke the parser in order
to detect problems within the $(...) expression inside the double-quoted
string. In this case, the recursive parse context will detect the broken
&&-chain between the `cd` and `wc` commands, returning the token stream:
cd data ; ?!AMP?! wc -w *.txt
However, the parent parse context will see everything inside the
double-quotes as a single string token:
"total: $(cd data ; ?!AMP?! wc -w *.txt) words"
losing whatever positional information was attached to the ";" token
where the problem was detected.
One way to preserve the positional information of a detected problem in
a recursive parse context within a string would be to attach the
positional information to the annotation textually; for instance:
"total: $(cd data ; ?!AMP:21:22?! wc -w *.txt) words"
and then extract the positional information when annotating the original
test definition.
However, a cleaner and much simpler approach is to maintain the list of
detected problems separately rather than embedding the problems as
annotations directly in the parsed token stream. Not only does this
ensure that positional information within recursive parse contexts is
not lost, but it keeps the token stream free from non-token pollution,
which may simplify implementation of validations added in the future
since they won't have to handle non-token "?!FOO?!" items specially.
Finally, the chainlint self-test "expect" files need a few mechanical
adjustments now that the original test definitions are emitted rather
than the parsed token stream. In particular, the following items missing
from the historic parsed-token output are now preserved verbatim:
* indentation (and whitespace, in general)
* comments
* here-doc bodies
* here-doc tag quoting (i.e. "\EOF")
* line-splices (i.e. "\" at the end of a line)
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When chainlint detects problems in a test, such as a broken &&-chain, it
prints out the test with "?!FOO?!" annotations inserted at each problem
location. However, rather than annotating the original test definition,
it instead dumps out a parsed token representation of the test. Since it
lacks comments, indentations, here-doc bodies, and so forth, this
tokenized representation can be difficult for the test author to digest
and relate back to the original test definition.
To address this shortcoming, an upcoming change will make it print out
an annotated copy of the original test definition rather than the
tokenized representation. In order to do so, it will need to know the
start and end positions of each token in the original test definition.
As preparation, upgrade TestParser::scan_token() to latch the start and
end position of the token being scanned, and return that information
along with the token itself. A subsequent change will take advantage of
this positional information.
In terms of implementation, TestParser::scan_token() is retrofitted to
return a tuple consisting of the token's lexeme and its start and end
positions, rather than returning just the lexeme. However, an
alternative would be to define a class which represents a token:
    package Token;

    sub new {
        my ($class, $lexeme, $start, $end) = @_;
        bless [$lexeme, $start, $end] => $class;
    }

    sub as_string {
        my $self = shift @_;
        return $self->[0];
    }

    sub compare {
        my ($x, $y) = @_;
        if (UNIVERSAL::isa($y, 'Token')) {
            return $x->[0] cmp $y->[0];
        }
        return $x->[0] cmp $y;
    }

    use overload (
        '""' => 'as_string',
        'cmp' => 'compare'
    );
The major benefit of the class-based approach is that it is entirely
non-invasive; it requires no additional changes to the rest of the
script since a Token converts automatically to a string, which is what
scan_token() historically returned.
The big downside to the Token approach, however, is that it is _slow_;
on this developer's (old) machine, it increases user-time by an
unacceptable seven seconds when scanning all test scripts in the
project. Hence, the simple tuple approach is employed instead since it
adds only a fraction of a second user-time.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
To extract the next token in the input stream, Lexer::scan_token() finds
the start of the token by skipping whitespace, then consumes characters
belonging to the token until it encounters a non-token character, such
as an operator, punctuation, or whitespace. In the case of an operator
or punctuation which ends a token, before returning the just-scanned
token, it pushes that operator or punctuation character back onto the
input stream to ensure that it will be the first character consumed by
the next call to scan_token().
However, scan_token() is intentionally lax when whitespace ends a token;
it doesn't bother pushing the whitespace character back onto the token
stream since it knows that the next call to scan_token() will, as its
first step, skip over whitespace anyhow when looking for the start of
the token.
Although such laxity is harmless for the proper functioning of the
lexical analyzer, it does make it difficult to precisely identify the
token's end position in the input stream. Accurate token position
information may be desirable, for instance, to annotate problems or
highlight other interesting facets of the input found during the parsing
phase. To accommodate such possibilities, tighten scan_token() by making
it push the token-ending whitespace character back onto the input
stream, just as it does for other token-ending characters.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The logic in TestParser::accumulate() for detecting broken &&-chains is
mostly well-commented, but a couple branches which were deemed obvious
and straightforward lack comments. In retrospect, though, these cases
may give future readers pause, so comment them, as well.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Have cmd_submodule__helper() use the OPT_SUBCOMMAND() API
introduced in fa83cc834d (parse-options: add support for parsing
subcommands, 2022-08-19).
This is only a marginal reduction in line count, but once we start
unifying this with a yet-to-be-added "builtin/submodule.c" it'll be
much easier to reason about those changes, as they'll both use
OPT_SUBCOMMAND().
We don't need to worry about "argv[0]" being NULL in the die() because
we'd have errored out in parse_options() as we're not using
"PARSE_OPT_SUBCOMMAND_OPTIONAL".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since 29a5e9e1ff (submodule--helper update-clone: learn --init,
2022-03-04) we've been passing "-C <prefix>" from "git-submodule.sh"
whenever we pass "--prefix <prefix>", so the latter is redundant to
the former. Let's drop the "--prefix" option.
Suggested-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Let's pass the "-C <prefix>" option instead to "absorbgitdirs" from
its only caller.
When it was added in f6f8586140 (submodule: add absorb-git-dir
function, 2016-12-12) there were other "submodule--helper" subcommands
that were invoked with "-C <prefix>", so we could have done this all
along.
Suggested-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Remove the "----recursive" option to "git submodule--helper
absorbgitdirs" (yes, with 4 dashes, not 2).
This option, and all the "else" code that runs when "flags &
ABSORB_GITDIR_RECURSE_SUBMODULES" is false, has never been used since
it was added in f6f8586140 (submodule: add absorb-git-dir function,
2016-12-12): to use it we'd have had to spell it "----recursive", as a
"--recursive" would have errored out.
It would be nice to follow-up with an optbug() assertion to
parse-options.c for such funnily named options, I manually validated
that this was the only long option whose name started with "-", but
let's skip adding such an assertion for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
A move and indentation-only change to move the
ABSORB_GITDIR_RECURSE_SUBMODULES case into its own function, which as
we'll see makes the subsequent commit changing this code much smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We tested for "--" followed by command names, but not for "--"
followed by an argument that looks like an option; let's do that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The "status" sub-command was leaking the "struct strvec" it was
setting up for the reasons explained in f92dbdbc6a (revisions API:
don't leak memory on argv elements that need free()-ing, 2022-08-02),
so let's use the "free_removed_argv_elements" option to
setup_revisions() to fix the leak.
Even if we did that, clobbering the "diff_files_args.nr" with the
return value of setup_revisions() would leave leaks in place, but we
can just stop clobbering it.
Ever since that code was added in a9f8a37584 (submodule: port
submodule subcommand 'status' from shell to C, 2017-10-06) we've had
no reason to modify the "nr" member ("argc" at the time): The next use
of "diff_files_args" after this is the "strvec_clear()" at the end of
the function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Exhaustively test how combining various "mixed-level" "git
submodule" options works. "Mixed-level" here means options that are
accepted by a mixture of the top-level "submodule" command, and
e.g. the "status" sub-command.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
As with other moves to "test-tool" in f322e9f51b (Merge branch
'ab/submodule-helper-prep', 2022-09-13) the "config" sub-command was
only used by our own tests.
It was last used by "git submodule" itself in code that went away with
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10).
Let's move it over, and while doing so make it easier to reason about
by splitting up the various uses for it into separate sub-commands, so
that we don't need to count arguments to see what it does.
This also has the advantage that we stop wasting future translator
time on this command; currently the usage information for this
internal-only tool has been translated into several languages. The use
of the "_" function has also been removed from the "please make
sure..." message.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Let's mention the SHAttered attack and more generally why we use the
sha1collisiondetection backend by default, and note that for SHA-256
the user should feel free to pick any of the supported backends as far
as hashing security is concerned.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since [1] the default SHA-1 backend on OSX has been
APPLE_COMMON_CRYPTO. Per [2] we'll skip using it on anything older
than Mac OS X 10.4 "Tiger"[3].
When "DC_SHA1" was made the default in [4] this interaction between it
and APPLE_COMMON_CRYPTO seems to have been missed in. Ever since
DC_SHA1 was "made the default" we've still used Apple's CommonCrypto
instead of sha1collisiondetection on modern versions of Darwin and
OSX.
1. 61067954ce (cache.h: eliminate SHA-1 deprecation warnings on Mac
OS X, 2013-05-19)
2. 9c7a0beee0 (config.mak.uname: set NO_APPLE_COMMON_CRYPTO on older
systems, 2014-08-15)
3. We could probably drop "NO_APPLE_COMMON_CRYPTO", as nobody's likely
to care about such on old version of OSX anymore. But let's leave that
for now.
4. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Address the root cause of technical debt we've been carrying since
sha1collisiondetection was made the default in [1]. In a preceding
commit we narrowly fixed a bug where the "DC_SHA1" variable would be
unset (in combination with "NO_APPLE_COMMON_CRYPTO=" on OSX), even
though we had the sha1collisiondetection library enabled.
But the only reason we needed to have such a user-exposed knob went
away with [1], and it's been doing nothing useful since then. We don't
care if you define DC_SHA1=*, we only care that you don't ask for any
other SHA-1 implementation. If it turns out that you didn't, we'll use
sha1collisiondetection, whether you had "DC_SHA1" set or not.
As a result of this being confusing we had e.g. [2] for cmake and the
recent [3] for ci/lib.sh setting "DC_SHA1" explicitly, even though
this was always a NOOP.
A much simpler way to do this is to stop having the Makefile and
CMakeLists.txt set "DC_SHA1" to be picked up by the test-lib.sh, let's
instead add a trivial "test-tool sha1-is-sha1dc". It returns zero if
we're using sha1collisiondetection, non-zero otherwise.
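The test suite can then ask the built git directly, e.g.:

    if test-tool sha1-is-sha1dc
    then
        : # the SHA-1 backend is sha1collisiondetection
    fi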
1. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
2. c4b2f41b5f (cmake: support for testing git with ctest, 2020-06-26)
3. 1ad5c3df35 (ci: use DC_SHA1=YesPlease on osx-clang job for CI,
2022-10-20)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
For the *_SHA1 and *_SHA256 flags we've discussed the various flags,
but not the fact that when you define multiple flags we'll pick one.
Which one we pick depends on the order in which they're listed in the
Makefile, which differed from the order in which they were discussed in
this documentation.
Let's be explicit about how we select these, and re-arrange the
listings so that they're listed in the priority order we've picked.
I'd personally prefer that the selection was more explicit, and that
we'd error out if conflicting flags were provided, but per the
discussion downthread of [1] the consensus was to keep these semantics.
This behavior makes it easier to e.g. integrate with autoconf-like
systems, where the configuration can provide everything it can
support, and Git is tasked with picking the first one it prefers.
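For example, such a system can pass along every backend it detected and
let the Makefile's listed priority decide (illustrative):

    $ make OPENSSL_SHA1=YesPlease BLK_SHA1=YesPlease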
1. https://lore.kernel.org/git/220710.86mtdh81ty.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since 27dc04c545 (sha256: add an SHA-256 implementation using
libgcrypt, 2018-11-14) we've claimed to support a BLK_SHA256 flag, but
there's no such SHA-256 backend.
Instead we fall back on adding "sha256/block/sha256.o" to "LIB_OBJS"
and adding "-DSHA256_BLK" to BASIC_CFLAGS.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In the preceding commit the discussion of the *_SHA1 knobs was left
as-is to benefit from a smaller diff, but since we're changing these
let's use the same phrasing we use for most other knobs. E.g. "define
X", not "define X environment variable", and get rid of the "when
running make to link with" entirely.
Furthermore the discussion of DC_SHA1* options is now under a "Options
for the sha1collisiondetection implementation" heading, so we don't
need to clarify that these options go along with DC_SHA1=Y, so let's
rephrase them accordingly.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since the "Define ..." template of comments at the top of the Makefile
was started in 5bdac8b326 ([PATCH] Improve the compilation-time
settings interface, 2005-07-29) we've had a lot more flags added,
including flags that come in "groups". Not having any obvious
structure to the >500 line comment at the top of the Makefile has made
it hard to follow.
This change is almost entirely a move-only change, the two paragraphs
at the start of the first two sections are new, and so are the added
sections themselves, but other than that no lines are changed, only
moved.
We now list Makefile-only flags at the start, followed by stand-alone
flags, and then cover "optional library" flags in their respective
groups, followed by SHA-1 and SHA-256 flags, and finally
DEVELOPER-specific flags.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>