git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	c5305bbe32	Merge branch 'ow/ref-format-remove-unused-member' Code clean-up. * ow/ref-format-remove-unused-member: ref-filter: remove unused ref_format member	2023-04-06 13:38:32 -07:00
Junio C Hamano	6047b28eb7	Merge branch 'en/header-split-cleanup' Split key function and data structure definitions out of cache.h to new header files and adjust the users. * en/header-split-cleanup: csum-file.h: remove unnecessary inclusion of cache.h write-or-die.h: move declarations for write-or-die.c functions from cache.h treewide: remove cache.h inclusion due to setup.h changes setup.h: move declarations for setup.c functions from cache.h treewide: remove cache.h inclusion due to environment.h changes environment.h: move declarations for environment.c functions from cache.h treewide: remove unnecessary includes of cache.h wrapper.h: move declarations for wrapper.c functions from cache.h path.h: move function declarations for path.c functions from cache.h cache.h: remove expand_user_path() abspath.h: move absolute path functions from cache.h environment: move comment_line_char from cache.h treewide: remove unnecessary cache.h inclusion from several sources treewide: remove unnecessary inclusion of gettext.h treewide: be explicit about dependence on gettext.h treewide: remove unnecessary cache.h inclusion from a few headers	2023-04-06 13:38:31 -07:00
Junio C Hamano	72871b198f	Merge branch 'ab/remove-implicit-use-of-the-repository' Code clean-up around the use of the_repository. * ab/remove-implicit-use-of-the-repository: libs: use "struct repository " argument, not "the_repository" post-cocci: adjust comments for recent repo_ migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"	2023-04-06 13:38:30 -07:00
Junio C Hamano	7727da99df	Merge branch 'ds/ahead-behind' "git for-each-ref" learns '%(ahead-behind:<base>)' that computes the distances from a single reference point in the history with bunch of commits in bulk. * ds/ahead-behind: commit-reach: add tips_reachable_from_bases() for-each-ref: add ahead-behind format atom commit-reach: implement ahead_behind() logic commit-graph: introduce `ensure_generations_valid()` commit-graph: return generation from memory commit-graph: simplify compute_generation_numbers() commit-graph: refactor compute_topological_levels() for-each-ref: explicitly test no matches for-each-ref: add --stdin option	2023-04-06 13:38:21 -07:00
Øystein Walle	4833b08426	ref-filter: remove unused ref_format member use_rest was added in `b9dee075eb` (ref-filter: add %(rest) atom, 2021-07-26) but was never used. As far as I can tell it was used in a later patch that was submitted to the mailing list but never applied. Signed-off-by: Øystein Walle <oystwa@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-30 10:17:49 -07:00
Ævar Arnfjörð Bjarmason	d850b7a545	cocci: apply the "cache.h" part of "the_repository.pending" Apply the part of "the_repository.pending.cocci" pertaining to "cache.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:36:36 -07:00
Elijah Newren	32a8f51061	environment.h: move declarations for environment.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:53 -07:00
Elijah Newren	f394e093df	treewide: be explicit about dependence on gettext.h Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:51 -07:00
Elijah Newren	a6dc3d364c	treewide: remove unnecessary cache.h inclusion from a few headers Ever since `a64215b6cd` ("object.h: stop depending on cache.h; make cache.h depend on object.h", 2023-02-24), we have a few headers that could have replaced their include of cache.h with an include of object.h. Make that change now. Some C files had to start including cache.h after this change (or some smaller header it had brought in), because the C files were depending on things from cache.h but were only formerly implicitly getting cache.h through one of these headers being modified in this patch. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:50 -07:00
Derrick Stolee	cbfe360b14	commit-reach: add tips_reachable_from_bases() Both 'git for-each-ref --merged=<X>' and 'git branch --merged=<X>' use the ref-filter machinery to select references or branches (respectively) that are reachable from a set of commits presented by one or more --merged arguments. This happens within reach_filter(), which uses the revision-walk machinery to walk history in a standard way. However, the commit-reach.c file is full of custom searches that are more efficient, especially for reachability queries that can terminate early when reachability is discovered. Add a new tips_reachable_from_bases() method to commit-reach.c and call it from within reach_filter() in ref-filter.c. This affects both 'git branch' and 'git for-each-ref' as tested in p1500-graph-walks.sh. For the Linux kernel repository, we take an already-fast algorithm and make it even faster: Test HEAD~1 HEAD ------------------------------------------------------------------- 1500.5: contains: git for-each-ref --merged 0.13 0.02 -84.6% 1500.6: contains: git branch --merged 0.14 0.02 -85.7% 1500.7: contains: git tag --merged 0.15 0.03 -80.0% (Note that we remove the iterative 'git rev-list' test from p1500 because it no longer makes sense as a comparison to 'git for-each-ref' and would just waste time running it for these comparisons.) The algorithm is implemented in commit-reach.c in the method tips_reachable_from_base(). This method takes a string_list of tips and assigns the 'util' for each item with the value 1 if the base commit can reach those tips. Like other reachability queries in commit-reach.c, the fastest way to search for "can A reach B?" is to do a depth-first search up to the generation number of B, preferring to explore first parents before later parents. While we must walk all reachable commits up to that generation number when the answer is "no", the depth-first search can answer "yes" much faster than other approaches in most cases. This search becomes trickier when there are multiple targets for the depth-first search. The commits with lower generation number are more likely to be within the history of the start commit, but we don't want to waste time searching commits of low generation number if the commit target with lowest generation number has already been found. The trick here is to take the input commits and sort them by generation number in ascending order. Track the index within this order as min_generation_index. When we find a commit, if its index in the list is equal to min_generation_index, then we can increase the generation number boundary of our search to the next-lowest value in the list. With this mechanism, the number of commits to search is minimized with respect to the depth-first search heuristic. We will walk all commits up to the minimum generation number of a commit that is _not_ reachable from the start, but we will walk only the necessary portion of the depth-first search for the reachable commits of lower generation. Add extra tests for this behavior in t6600-test-reach.sh as the interesting data shape of that repository can sometimes demonstrate corner case bugs. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Derrick Stolee	49abcd21da	for-each-ref: add ahead-behind format atom The previous change implemented the ahead_behind() method, including an algorithm to compute the ahead/behind values for a number of commit tips relative to a number of commit bases. Now, integrate that algorithm as part of 'git for-each-ref' hidden behind a new format atom, ahead-behind. This naturally extends to 'git branch' and 'git tag' builtins, as well. This format allows specifying multiple bases, if so desired, and all matching references are compared against all of those bases. For this reason, failing to read a reference provided from these atoms results in an error. In order to translate the ahead_behind() method information to the format output code in ref-filter.c, we must populate arrays of ahead_behind_count structs. In struct ref_array, we store the full array that will be passed to ahead_behind(). In struct ref_array_item, we store an array of pointers that point to the relvant items within the full array. In this way, we can pull all relevant ahead/behind values directly when formatting output for a specific item. It also ensures the lifetime of the ahead_behind_count structs matches the time that the array is being used. Add specific tests of the ahead/behind counts in t6600-test-reach.sh, as it has an interesting repository shape. In particular, its merging strategy and its use of different commit-graphs would demonstrate over- counting if the ahead_behind() method did not already account for that possibility. Also add tests for the specific for-each-ref, branch, and tag builtins. In the case of 'git tag', there are intersting cases that happen when some of the selected tips are not commits. This requires careful logic around commits_nr in the second loop of filter_ahead_behind(). Also, the test in t7004 is carefully located to avoid being dependent on the GPG prereq. It also avoids using the test_commit helper, as that will add ticks to the time and disrupt the expected timestamps in later tag tests. Also add performance tests in a new p1300-graph-walks.sh script. This will be useful for more uses in the future, but for now compare the ahead-behind counting algorithm in 'git for-each-ref' to the naive implementation by running 'git rev-list --count' processes for each input. For the Git source code repository, the improvement is already obvious: Test this tree --------------------------------------------------------------- 1500.2: ahead-behind counts: git for-each-ref 0.07(0.07+0.00) 1500.3: ahead-behind counts: git branch 0.07(0.06+0.00) 1500.4: ahead-behind counts: git tag 0.07(0.06+0.00) 1500.5: ahead-behind counts: git rev-list 1.32(1.04+0.27) But the standard performance benchmark is the Linux kernel repository, which demosntrates a significant improvement: Test this tree --------------------------------------------------------------- 1500.2: ahead-behind counts: git for-each-ref 0.27(0.24+0.02) 1500.3: ahead-behind counts: git branch 0.27(0.24+0.03) 1500.4: ahead-behind counts: git tag 0.28(0.27+0.01) 1500.5: ahead-behind counts: git rev-list 4.57(4.03+0.54) The 'git rev-list' test exists in this change as a demonstration, but it will be removed in the next change to avoid wasting time on this comparison. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Junio C Hamano	d0732a8120	Merge branch 'jk/unused-post-2.39-part2' More work towards -Wunused. * jk/unused-post-2.39-part2: (21 commits) help: mark unused parameter in git_unknown_cmd_config() run_processes_parallel: mark unused callback parameters userformat_want_item(): mark unused parameter for_each_commit_graft(): mark unused callback parameter rewrite_parents(): mark unused callback parameter fetch-pack: mark unused parameter in callback function notes: mark unused callback parameters prio-queue: mark unused parameters in comparison functions for_each_object: mark unused callback parameters list-objects: mark unused callback parameters mark unused parameters in signal handlers run-command: mark error routine parameters as unused mark "pointless" data pointers in callbacks ref-filter: mark unused callback parameters http-backend: mark unused parameters in virtual functions http-backend: mark argc/argv unused object-name: mark unused parameters in disambiguate callbacks serve: mark unused parameters in virtual functions serve: use repository pointer to get config ls-refs: drop config caching ...	2023-03-17 14:03:09 -07:00
Jeff King	5fe9e1ce2f	ref-filter: mark unused callback parameters The ref-filter code uses virtual functions to handle specific atoms, but many of the functions ignore some parameters: - most atom parsers do not need the ref_format itself, unless they are looking at centralized options like use_color, quote_style, etc. - meta-atom handlers like append_atom(), align_atom_handler(), etc, can't generate errors, so ignore their "err" parameter - likewise, the handlers for then/else/end do not even need to look at their atom_value, as the "if" handler put everything they need into the ref_formatting_state stack Since these functions all have to conform to virtual function interfaces, we can't just drop the unused parameters, but must mark them as UNUSED (to appease -Wunused-parameter). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-24 09:13:30 -08:00
Jeff King	fe6258c348	ref-filter: drop unused atom parameter from get_worktree_path() The get_worktree_path() function is used to populate the %(worktreepath) value, but it has never used its "atom" parameter since it was added in `2582083fa1` (ref-filter: add worktreepath atom, 2019-04-28). Normally we'd use the atom struct to cache any work we do, but in this case there's a global hashmap that does that caching already. So we can just drop the unused parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-24 09:13:28 -08:00
Elijah Newren	41771fa435	cache.h: remove dependence on hex.h; make other files include it explicitly Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-23 17:25:29 -08:00
Elijah Newren	36bf195890	alloc.h: move ALLOC_GROW() functions from cache.h This allows us to replace includes of cache.h with includes of the much smaller alloc.h in many places. It does mean that we also need to add includes of alloc.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-23 17:25:28 -08:00
Jeff King	20869d1a1d	convert trivial uses of strncmp() to starts_with() It's more readable to use starts_with() instead of strncmp() to match a prefix, as the latter requires a manually-computed length, and has the funny "matching is zero" return value common to cmp functions. This patch converts several cases which were found with: git grep 'strncmp(., [0-9])' But note that it doesn't convert all such cases. There are several where the magic length number is repeated elsewhere in the code, like: /* handle "buf" which isn't NUL-terminated and might be too small / if (len >= 3 && !strncmp(buf, "foo", 3)) or: / exact match for "foo", but within a larger string */ if (end - buf == 3 && !strncmp(buf, "foo", 3)) While it would not produce the wrong outcome to use starts_with() in these cases, we'd still be left with one instance of "3". We're better to leave them for now, as the repeated "3" makes it clear that the two are linked (there may be other refactorings that handle both, but they're out of scope for this patch). A few things to note while reading the patch: - all cases but one are trying to match, and so lose the extra "!". The case in the first hunk of urlmatch.c is not-matching, and hence gains a "!". - the case in remote-fd.c is matching the beginning of "connect foo", but we never look at str+8 to parse the "foo" part (which would make this a candidate for skip_prefix(), not starts_with()). This seems at first glance like a bug, but is a limitation of how remote-fd works. - the second hunk in urlmatch.c shows some cases adjacent to other strncmp() calls that are left. These are of the "exact match within a larger string" type, as described above. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-08 10:34:35 +09:00
Junio C Hamano	78d15022e7	Merge branch 'jk/ref-filter-error-reporting-fix' Clean-ups in error messages produced by "git for-each-ref" and friends. * jk/ref-filter-error-reporting-fix: ref-filter: convert email atom parser to use err_bad_arg() ref-filter: truncate atom names in error messages ref-filter: factor out "unrecognized %(foo) arg" errors ref-filter: factor out "%(foo) does not take arguments" errors ref-filter: reject arguments to %(HEAD)	2022-12-26 11:42:06 +09:00
Junio C Hamano	179547932f	Merge branch 'jk/unused-post-2.39' Code clean-up around unused function parameters. * jk/unused-post-2.39: userdiff: mark unused parameter in internal callback list-objects-filter: mark unused parameters in virtual functions diff: mark unused parameters in callbacks xdiff: mark unused parameter in xdl_call_hunk_func() xdiff: drop unused parameter in def_ff() ws: drop unused parameter from ws_blank_line() list-objects: drop process_gitlink() function blob: drop unused parts of parse_blob_buffer() ls-refs: use repository parameter to iterate refs	2022-12-26 11:42:05 +09:00
Jeff King	285da4321a	ref-filter: convert email atom parser to use err_bad_arg() The error message for a bogus argument to %(authoremail), etc, is: $ git for-each-ref --format='%(authoremail:foo)' fatal: unrecognized email option: foo Saying just "email" is a little vague; most of the other atom parsers would use the full name "%(authoremail)", but we can't do that here because the same function also handles %(taggeremail), etc. Until recently, passing atom->name was a bad idea, because it erroneously included the arguments in the atom name. But since the previous commit taught err_bad_arg() to handle this, we can now do so and get: fatal: unrecognized %(authoremail) argument: foo which is consistent with other atoms. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-15 09:14:09 +09:00
Jeff King	1955ef10ed	ref-filter: truncate atom names in error messages If you pass a bogus argument to %(refname), you may end up with a message like this: $ git for-each-ref --format='%(refname:foo)' fatal: unrecognized %(refname:foo) argument: foo which is confusing. It should just say: fatal: unrecognized %(refname) argument: foo which is clearer, and is consistent with most other atom parsers. Those other parsers do not have the same problem because they pass the atom name from a string literal in the parser function. But because the parser for %(refname) also handles %(upstream) and %(push), it instead uses atom->name, which includes the arguments. The oid atom parser which handles %(tree), %(parent), etc suffers from the same problem. It seems like the cleanest fix would be for atom->name to be _just_ the name, since there's already a separate "args" field. But since that field is also used for other things, we can't change it easily (e.g., it's how we find things in the used_atoms array, and clearly %(refname) and %(refname:short) are not the same thing). Instead, we'll teach our error_bad_arg() function to stop at the first ":". This is a little hacky, as we're effectively re-parsing the name, but the format is simple enough to do this as a one-liner, and this localizes the change to the error-reporting code. We'll give the same treatment to err_no_arg(). None of its callers use this atom->name trick, but it's worth future-proofing it while we're here. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-15 09:14:04 +09:00
Jeff King	dda4fc1a84	ref-filter: factor out "unrecognized %(foo) arg" errors Atom parsers that take arguments generally have a catch-all for "this arg is not recognized". Most of them use the same printf template, which is good, because it makes life easier for translators. Let's pull this template into a helper function, which makes the code in the parsers shorter and avoids any possibility of differences. As with the previous commit, we'll pick an arbitrary atom to make sure the test suite covers this code. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-15 09:14:00 +09:00
Jeff King	a33d0fae76	ref-filter: factor out "%(foo) does not take arguments" errors Many atom parsers give the same error message, differing only in the name of the atom. If we use "%s does not take arguments", that should make life easier for translators, as they only need to translate one string. And in doing so, we can easily pull it into a helper function to make sure they are all using the exact same string. I've added a basic test here for %(HEAD), just to make sure this code is exercised at all in the test suite. We could cover each such atom, but the effort-to-reward ratio of trying to maintain an exhaustive list doesn't seem worth it. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-15 09:13:56 +09:00
Jeff King	afc1a946b2	ref-filter: reject arguments to %(HEAD) The %(HEAD) atom doesn't take any arguments, but unlike other atoms in the same boat (objecttype, deltabase, etc), it does not detect this situation and complain. Let's make it consistent with the others. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-15 09:13:35 +09:00
Junio C Hamano	9ea1378d04	Merge branch 'ab/various-leak-fixes' Various leak fixes. * ab/various-leak-fixes: built-ins: use free() not UNLEAK() if trivial, rm dead code revert: fix parse_options_concat() leak cherry-pick: free "struct replay_opts" members rebase: don't leak on "--abort" connected.c: free the "struct packed_git" sequencer.c: fix "opts->strategy" leak in read_strategy_opts() ls-files: fix a --with-tree memory leak revision API: call graph_clear() in release_revisions() unpack-file: fix ancient leak in create_temp_file() built-ins & libs & helpers: add/move destructors, fix leaks dir.c: free "ident" and "exclude_per_dir" in "struct untracked_cache" read-cache.c: clear and free "sparse_checkout_patterns" commit: discard partial cache before (re-)reading it {reset,merge}: call discard_index() before returning tests: mark tests as passing with SANITIZE=leak	2022-12-14 15:55:46 +09:00
Jeff King	91e2ab1587	ls-refs: use repository parameter to iterate refs The ls_refs() function (for the v2 protocol command of the same name) takes a repository parameter (like all v2 commands), but ignores it. It should use it to access the refs. This isn't a bug in practice, since we only call this function when serving upload-pack from the main repository. But it's an awkward gotcha, and it causes -Wunused-parameter to complain. The main reason we don't use the repository parameter is that the ref iteration interface we call doesn't have a "refs_" variant that takes a ref_store. However we can easily add one. In fact, since there is only one other caller (in ref-filter.c), there is no need to maintain the non-repository wrapper; that caller can just use the_repository. It's still a long way from consistently using a repository object, but it's one small step in the right direction. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-13 22:16:22 +09:00
Ævar Arnfjörð Bjarmason	b6046abc0c	built-ins & libs & helpers: add/move destructors, fix leaks Fix various leaks in built-ins, libraries and a test helper here we were missing a call to strbuf_release(), string_list_clear() etc, or were calling them after a potential "return". Comments on individual changes: - builtin/checkout.c: Fix a memory leak that was introduced in [1]. A sibling leak introduced in [2] was recently fixed in [3]. As with [3] we should be using the wt_status_state_free_buffers() API introduced in [4]. - builtin/repack.c: Fix a leak that's been here since this use of "strbuf_release()" was added in `a1bbc6c017` (repack: rewrite the shell script in C, 2013-09-15). We don't use the variable for anything except this loop, so we can instead free it right afterwards. - builtin/rev-parse: Fix a leak that's been here since this code was added in `21d4783538` (Add a parseopt mode to git-rev-parse to bring parse-options to shell scripts., 2007-11-04). - builtin/stash.c: Fix a couple of leaks that have been here since this code was added in `d4788af875` (stash: convert create to builtin, 2019-02-25), we strbuf_release()'d only some of the "struct strbuf" we allocated earlier in the function, let's release all of them. - ref-filter.c: Fix a leak in `482c119186` (gpg-interface: improve interface for parsing tags, 2021-02-11), we don't use the "payload" variable that we ask parse_signature() to populate for us, so let's free it. - t/helper/test-fake-ssh.c: Fix a leak that's been here since this code was added in `3064d5a38c` (mingw: fix t5601-clone.sh, 2016-01-27). Let's free the "struct strbuf" as soon as we don't need it anymore. 1. `c45f0f525d` (switch: reject if some operation is in progress, 2019-03-29) 2. `2708ce62d2` (branch: sort detached HEAD based on a flag, 2021-01-07) 3. `abcac2e19f` (ref-filter.c: fix a leak in get_head_description, 2022-09-25) 4. `962dd7ebc3` (wt-status: introduce wt_status_state_free_buffers(), 2020-09-27). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-21 12:32:48 +09:00
Jeff King	8e1c5fcf28	ref-filter: fix parsing of signatures with CRLF and no body This commit fixes a bug when parsing tags that have CRLF line endings, a signature, and no body, like this (the "^M" are marking the CRs): this is the subject^M -----BEGIN PGP SIGNATURE-----^M ^M ...some stuff...^M -----END PGP SIGNATURE-----^M When trying to find the start of the body, we look for a blank line separating the subject and body. In this case, there isn't one. But we search for it using strstr(), which will find the blank line in the signature. In the non-CRLF code path, we check whether the line we found is past the start of the signature, and if so, put the body pointer at the start of the signature (effectively making the body empty). But the CRLF code path doesn't catch the same case, and we end up with the body pointer in the middle of the signature field. This has two visible problems: - printing %(contents:subject) will show part of the signature, too, since the subject length is computed as (body - subject) - the length of the body is (sig - body), which makes it negative. Asking for %(contents:body) causes us to cast this to a very large size_t when we feed it to xmemdupz(), which then complains about trying to allocate too much memory. These are essentially the same bugs fixed in the previous commit, except that they happen when there is a CRLF blank line in the signature, rather than no blank line at all. Both are caused by the refactoring in `9f75ce3d8f` (ref-filter: handle CRLF at end-of-line more gracefully, 2020-10-29). We can fix this by doing the same "sigstart" check that we do in the non-CRLF case. And rather than repeat ourselves, we can just use short-circuiting OR to collapse both cases into a single conditional. I.e., rather than: if (strstr("\n\n")) ...found blank, check if it's in signature... else if (strstr("\r\n\r\n")) ...found blank, check if it's in signature... else ...no blank line found... we can collapse this to: if (strstr("\n\n")) \|\| strstr("\r\n\r\n"))) ...found blank, check if it's in signature... else ...no blank line found... The tests show the problem and the fix. Though it wasn't broken, I included contents:signature here to make sure it still behaves as expected, but note the shell hackery needed to make it work. A less-clever option would be to skip using test_atom and just "append_cr >expected" ourselves. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-02 21:36:04 -04:00
Jeff King	b01e1c7ef0	ref-filter: fix parsing of signatures without blank lines When ref-filter is asked to show %(content:subject), etc, we end up in find_subpos() to parse out the three major parts: the subject, the body, and the signature (if any). When searching for the blank line between the subject and body, if we don't find anything, we try to treat the whole message as the subject, with no body. But our idea of "the whole message" needs to take into account the signature, too. Since `9f75ce3d8f` (ref-filter: handle CRLF at end-of-line more gracefully, 2020-10-29), the code instead goes all the way to the end of the buffer, which produces confusing output. Here's an example. If we have a tag message like this: this is the subject -----BEGIN SSH SIGNATURE----- ...some stuff... -----END SSH SIGNATURE----- then the current parser will put the start of the body at the end of the whole buffer. This produces two buggy outcomes: - since the subject length is computed as (body - subject), showing %(contents:subject) will print both the subject and the signature, rather than just the single line - since the body length is computed as (sig - body), and the body now starts _after_ the signature, we end up with a negative length! Fortunately we never access out-of-bounds memory, because the negative length is fed to xmemdupz(), which casts it to a size_t, and xmalloc() bails trying to allocate an absurdly large value. In theory it would be possible for somebody making a malicious tag to wrap it around to a more reasonable value, but it would require a tag on the order of 2^63 bytes. And even if they did, all they get is an out of bounds string read. So the security implications are probably not interesting. We can fix both by correctly putting the start of the body at the same index as the start of the signature (effectively making the body empty). Note that this is a real issue with signatures generated with gpg.format set to "ssh", which would look like the example above. In the new tests here I use a hard-coded tag message, for a few reasons: - regardless of what the ssh-signing code produces now or in the future, we should be testing this particular case - skipping the actual signature makes the tests simpler to write (and allows them to run on more systems) - t6300 has helpers for working with gpg signatures; for the purposes of this bug, "BEGIN PGP" is just as good a demonstration, and this simplifies the tests Curiously, the same issue doesn't happen with real gpg signatures (and there are even existing tests in t6300 with cover this). Those have a blank line between the header and the content, like: this is the subject -----BEGIN PGP SIGNATURE----- ...some stuff... -----END PGP SIGNATURE----- Because we search for the subject/body separator line with a strstr(), we find the blank line in the signature, even though it's outside of what we'd consider the body. But that puts us unto a separate code path, which realizes that we're now in the signature and adjusts the line back to "sigstart". So this patch is basically just making the "no line found at all" case match that. And note that "sigstart" is always defined (if there is no signature, it points to the end of the buffer as you'd expect). Reported-by: Martin Englund <martin@englund.nu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-02 21:36:04 -04:00
Junio C Hamano	83b2b47850	Merge branch 'rj/ref-filter-get-head-description-leakfix' Leakfix. * rj/ref-filter-get-head-description-leakfix: ref-filter.c: fix a leak in get_head_description	2022-10-10 10:08:42 -07:00
Rubén Justo	abcac2e19f	ref-filter.c: fix a leak in get_head_description In `2708ce62d2` (branch: sort detached HEAD based on a flag, 2021-01-07) a call to wt_status_state_free_buffers, responsible of freeing the resources that could be allocated in the local struct wt_status_state state, was eliminated. The call to wt_status_state_free_buffers was introduced in `962dd7ebc3` (wt-status: introduce wt_status_state_free_buffers(), 2020-09-27). This commit brings back that call in get_head_description. Signed-off-by: Rubén Justo <rjusto@gmail.com> Reviewed-by: Martin Ågren <martin.agren@gmail.com> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-26 11:14:49 -07:00
Ævar Arnfjörð Bjarmason	5cf88fd8b0	git-compat-util.h: use "UNUSED", not "UNUSED(var)" As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of `9b24034754` (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:49:48 -07:00
Jeff King	02c3c59e62	hashmap: mark unused callback parameters Hashmap comparison functions must conform to a particular callback interface, but many don't use all of their parameters. Especially the void cmp_data pointer, but some do not use keydata either (because they can easily form a full struct to pass when doing lookups). Let's mark these to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:55 -07:00
Jeff King	359b01ca84	ref-filter: disable save_commit_buffer while traversing Various ref-filter options like "--contains" or "--merged" may cause us to traverse large segments of the history graph. It's counter-productive to have save_commit_buffer turned on, as that will instruct the commit code to cache in-memory the object contents for each commit we traverse. This increases the amount of heap memory used while providing little or no benefit, since we're not actually planning to display those commits (which is the usual reason that tools like git-log want to keep them around). We can easily disable this feature while ref-filter is running. This lowers peak heap (as measured by massif) for running: git tag --contains 1da177e4c3 in linux.git from ~100MB to ~20MB. It also seems to improve runtime by 4-5% (600ms vs 630ms). A few points to note: - it should be safe to temporarily disable save_commit_buffer like this. The saved buffers are accessed through get_commit_buffer(), which treats the saved ones like a cache, and loads on-demand from the object database on a cache miss. So any code that was using this would not be wrong, it might just incur an extra object lookup for some objects. But... - I don't think any ref-filter related code is using the cache. While it's true that an option like "--format=%(contents:subject)" or "--sort=authordate" will need to look at the commit contents, ref-filter doesn't use get_commit_buffer() to do so! It always reads the objects directly via read_object_file(), though it does avoid re-reading objects if the format can be satisfied without them. Timing "git tag --format=%(*authordate)" shows that we're the same before and after, as expected. - Note that all of this assumes you don't have a commit-graph file. if you do, then the heap usage is even lower, and the runtime is 10x faster. So in that sense this is not urgent, as there's a much better solution. But since it's such an obvious and easy win for fallback cases (including commits which aren't yet in the graph file), there's no reason not to. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-11 14:27:31 -07:00
Junio C Hamano	2da81d1efb	Merge branch 'ab/plug-leak-in-revisions' Plug the memory leaks from the trickiest API of all, the revision walker. * ab/plug-leak-in-revisions: (27 commits) revisions API: add a TODO for diff_free(&revs->diffopt) revisions API: have release_revisions() release "topo_walk_info" revisions API: have release_revisions() release "date_mode" revisions API: call diff_free(&revs->pruning) in revisions_release() revisions API: release "reflog_info" in release revisions() revisions API: clear "boundary_commits" in release_revisions() revisions API: have release_revisions() release "prune_data" revisions API: have release_revisions() release "grep_filter" revisions API: have release_revisions() release "filter" revisions API: have release_revisions() release "cmdline" revisions API: have release_revisions() release "mailmap" revisions API: have release_revisions() release "commits" revisions API users: use release_revisions() for "prune_data" users revisions API users: use release_revisions() with UNLEAK() revisions API users: use release_revisions() in builtin/log.c revisions API users: use release_revisions() in http-push.c revisions API users: add "goto cleanup" for release_revisions() stash: always have the owner of "stash_info" free it revisions API users: use release_revisions() needing REV_INFO_INIT revision.[ch]: document and move code declared around "init" ...	2022-06-07 14:10:56 -07:00
Junio C Hamano	538dc459a0	Merge branch 'ep/maint-equals-null-cocci' Introduce and apply coccinelle rule to discourage an explicit comparison between a pointer and NULL, and applies the clean-up to the maintenance track. * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci	2022-05-20 15:26:59 -07:00
Junio C Hamano	2b0a58d164	Merge branch 'ep/maint-equals-null-cocci' for maint-2.35 * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci	2022-05-02 10:06:04 -07:00
Junio C Hamano	afe8a9070b	tree-wide: apply equals-null.cocci Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-02 09:50:37 -07:00
Ævar Arnfjörð Bjarmason	2108fe4a19	revisions API users: add straightforward release_revisions() Add a release_revisions() to various users of "struct rev_list" in those straightforward cases where we only need to add the release_revisions() call to the end of a block, and don't need to e.g. refactor anything to use a "goto cleanup" pattern. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-13 23:56:08 -07:00
Ævar Arnfjörð Bjarmason	974c919d36	date API: add and use a date_mode_release() Fix a memory leak in the parse_date_format() function by providing a new date_mode_release() companion function. By using this in "t/helper/test-date.c" we can mark the "t0006-date.sh" test as passing when git is compiled with SANITIZE=leak, and whitelist it to run under "GIT_TEST_PASSING_SANITIZE_LEAK=true" by adding "TEST_PASSES_SANITIZE_LEAK=true" to the test itself. The other tests that expose this memory leak (i.e. take the "mode->type == DATE_STRFTIME" branch in parse_date_format()) are "t6300-for-each-ref.sh" and "t7004-tag.sh". The former is due to an easily fixed leak in "ref-filter.c", and brings the failures in "t6300-for-each-ref.sh" down from 51 to 48. Fixing the remaining leaks will have to wait until there's a release_revisions() in "revision.c", as they have to do with leaks via "struct rev_info". There is also a leak in "builtin/blame.c" due to its call to parse_date_format() to parse the "blame.date" configuration. However as it declares a file-level "static struct date_mode blame_date_mode" to track the data, LSAN will not report it as a leak. It's possible to get valgrind(1) to complain about it with e.g.: valgrind --leak-check=full --show-leak-kinds=all ./git -P -c blame.date=format:%Y blame README.md But let's focus on things LSAN complains about, and are thus observable with "TEST_PASSES_SANITIZE_LEAK=true". We should get to fixing memory leaks in "builtin/blame.c", but as doing so would require some re-arrangement of cmd_blame() let's leave it for some other time. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-16 09:40:00 -08:00
Ævar Arnfjörð Bjarmason	f184289832	date API: provide and use a DATE_MODE_INIT Provide and use a DATE_MODE_INIT macro. Most of the users of struct date_mode" use it via pretty.h's "struct pretty_print_context" which doesn't have an initialization macro, so we're still bound to being initialized to "{ 0 }" by default. But we can change the couple of callers that directly declared a variable on the stack to instead use the initializer, and thus do away with the "mode.local = 0" added in `add00ba2de` (date: make "local" orthogonal to date format, 2015-09-03). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-16 09:40:00 -08:00
Jean-Noël Avila	d7d30badbf	i18n: ref-filter: factorize "%(foo) atom used without %(bar) atom" Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-01-05 13:31:00 -08:00
Jean-Noël Avila	68e2ea0b30	i18n: refactor "unrecognized %(foo) argument" strings Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-01-05 13:31:00 -08:00
Junio C Hamano	5126145ba8	Merge branch 'jc/fix-ref-sorting-parse' Things like "git -c branch.sort=bogus branch new HEAD", i.e. the operation modes of the "git branch" command that do not need the sort key information, no longer errors out by seeing a bogus sort key. * jc/fix-ref-sorting-parse: for-each-ref: delay parsing of --sort=<atom> options	2021-11-29 15:41:47 -08:00
Jiang Xin	f733719316	i18n: fix typos found during l10n for git 2.34.0 Emir and Jean-Noël reported typos in some i18n messages when preparing l10n for git 2.34.0. * Fix unstable spelling of config variable "gpg.ssh.defaultKeyCommand" which was introduced in commit `fd9e226776` (ssh signing: retrieve a default key from ssh-agent, 2021-09-10). * Add missing space between "with" and "--python" which was introduced in commit `bd0708c7eb` (ref-filter: add %(raw) atom, 2021-07-26). * Fix unmatched single quote in 'builtin/index-pack.c' which was introduced in commit `8737dab346` (index-pack: refactor renaming in final(), 2021-09-09) [1] https://github.com/git-l10n/git-po/pull/567 Reported-by: Emir Sarı <bitigchi@me.com> Reported-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-31 22:49:49 -07:00
Junio C Hamano	98e7ab6d42	for-each-ref: delay parsing of --sort=<atom> options The for-each-ref family of commands invoke parsers immediately when it sees each --sort=<atom> option, and die before even seeing the other options on the command line when the <atom> is unrecognised. Instead, accumulate them in a string list, and have them parsed into a ref_sorting structure after the command line parsing is done. As a consequence, "git branch --sort=bogus -h" used to fail to give the brief help, which arguably may have been a feature, now does so, which is more consistent with how other options work. The patch is smaller than the actual extent of the "damage" to the codebase, thanks to the fact that the original code consistently used OPT_REF_SORT() macro to handle command line options. We only needed to replace the variable used for the list, and implementation of the callback function used in the macro. The old rule was for the users of the API to: - Declare ref_sorting and ref_sorting_tail variables; - OPT_REF_SORT() macro will instantiate ref_sorting instance (which may barf and die) and append it to the tail; - Append to the tail each ref_sorting read from the configuration by parsing in the config callback (which may barf and die); - See if ref_sorting is null and use ref_sorting_default() instead. Now the rule is not all that different but is simpler: - Declare ref_sorting_options string list. - OPT_REF_SORT() macro will append it to the string list; - Append to the string list the sort key read from the configuration; - call ref_sorting_options() to turn the string list to ref_sorting structure (which also deals with the default value). As side effects, this change also cleans up a few issues: - `95be717c` (parse_opt_ref_sorting: always use with NONEG flag, 2019-03-20) muses that "git for-each-ref --no-sort" should simply clear the sort keys accumulated so far; it now does. - The implementation detail of "struct ref_sorting" and the helper function parse_ref_sorting() can now be private to the ref-filter API implementation. - If you set branch.sort to a bogus value, the any "git branch" invocation, not only the listing mode, would abort with the original code; now it doesn't Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-20 14:33:07 -07:00
Ævar Arnfjörð Bjarmason	e5fb028688	ref-filter API user: add and use a ref_sorting_release() Add a ref_sorting_release() and use it for some of the current API users, the ref_sorting_default() function and its siblings will do a malloc() which wasn't being free'd previously. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-20 11:36:13 -07:00
Junio C Hamano	404c4a5462	Merge branch 'ab/designated-initializers' Code clean-up. * ab/designated-initializers: cbtree.h: define cb_init() in terms of CBTREE_INIT .h: move some _INIT to designated initializers .h _INIT macros: don't specify fields equal to 0 .[ch] *_INIT macros: use { 0 } for a "zero out" idiom submodule-config.h: remove unused SUBMODULE_INIT macro	2021-10-11 10:21:48 -07:00
Ævar Arnfjörð Bjarmason	9865b6e6a4	.[ch] _INIT macros: use { 0 } for a "zero out" idiom In C it isn't required to specify that all members of a struct are zero'd out to 0, NULL or '\0', just providing a "{ 0 }" will accomplish that. Let's also change code that provided N zero'd fields to just provide one, and change e.g. "{ NULL }" to "{ 0 }" for consistency. I.e. even if the first member is a pointer let's use "0" instead of "NULL". The point of using "0" consistently is to pick one, and to not have the reader wonder why we're not using the same pattern everywhere. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 14:47:59 -07:00
Jeff King	67985e4e4a	refs: drop "broken" flag from for_each_fullref_in() No callers pass in anything but "0" here. Likewise to our sibling functions. Note that some of them ferry along the flag, but none of their callers pass anything but "0" either. Nor is anybody likely to change that. Callers which really want to see all of the raw refs use for_each_rawref(). And anybody interested in iterating a subset of the refs will likely be happy to use the now-default behavior of showing broken refs, but omitting dangling symlinks. So we can get rid of this whole feature. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00

1 2 3 4 5 ...

343 Commits