git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	dbb4102f7b	Merge branch 'sg/parse-options-h-users' Code clean-up to include and/or uninclude parse-options.h file as needed. * sg/parse-options-h-users: treewide: remove unnecessary inclusions of parse-options.h from headers treewide: include parse-options.h in source files	2023-03-30 13:47:11 -07:00
SZEDER Gábor	49fd551194	treewide: include parse-options.h in source files The builtins 'ls-remote', 'pack-objects', 'receive-pack', 'reflog' and 'send-pack' use parse_options(), but their source files don't directly include 'parse-options.h'. Furthermore, the source files 'diagnose.c', 'list-objects-filter-options.c', 'remote.c' and 'send-pack.c' define option parsing callback functions, while 'revision.c' defines an option parsing helper function, and thus need access to various fields in 'struct option' and 'struct parse_opt_ctx_t', but they don't directly include 'parse-options.h' either. They all can still be built, of course, because they include one of the header files that does include 'parse-options.h' (though unnecessarily, see the next commit). Add those missing includes to these files, as our general rule is that "a C file must directly include the header files that declare the functions and the types it uses". Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 11:26:47 -07:00
Junio C Hamano	4d87411ffe	Merge branch 'ew/fetch-hiderefs' A new "fetch.hideRefs" option can be used to exclude specified refs from "rev-list --objects --stdin --not --all" traversal for checking object connectivity, most useful when there are many unrelated histories in a single repository. * ew/fetch-hiderefs: fetch: support hideRefs to speed up connectivity checks	2023-03-17 14:03:10 -07:00
Junio C Hamano	d0732a8120	Merge branch 'jk/unused-post-2.39-part2' More work towards -Wunused. * jk/unused-post-2.39-part2: (21 commits) help: mark unused parameter in git_unknown_cmd_config() run_processes_parallel: mark unused callback parameters userformat_want_item(): mark unused parameter for_each_commit_graft(): mark unused callback parameter rewrite_parents(): mark unused callback parameter fetch-pack: mark unused parameter in callback function notes: mark unused callback parameters prio-queue: mark unused parameters in comparison functions for_each_object: mark unused callback parameters list-objects: mark unused callback parameters mark unused parameters in signal handlers run-command: mark error routine parameters as unused mark "pointless" data pointers in callbacks ref-filter: mark unused callback parameters http-backend: mark unused parameters in virtual functions http-backend: mark argc/argv unused object-name: mark unused parameters in disambiguate callbacks serve: mark unused parameters in virtual functions serve: use repository pointer to get config ls-refs: drop config caching ...	2023-03-17 14:03:09 -07:00
Eric Wong	c6ce27ab08	fetch: support hideRefs to speed up connectivity checks With roughly 800 remotes all fetching into their own refs/remotes/$REMOTE/* island, the connectivity check[1] gets expensive for each fetch on systems which lack sufficient RAM to cache objects. To do a no-op fetch on one $REMOTE out of hundreds, hideRefs now allows the no-op fetch to take ~30 seconds instead of ~20 minutes on a noisy, RAM-constrained machine (localhost, so no network latency): git -c fetch.hideRefs=refs \ -c fetch.hideRefs='!refs/remotes/$REMOTE/' \ fetch $REMOTE [1] `git rev-list --objects --stdin --not --all --quiet --alternate-refs' Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-27 09:27:03 -08:00
Jeff King	be252d3349	for_each_object: mark unused callback parameters The for_each_{loose,packed}_object interface uses callback functions, but not every callback needs all of the parameters. Mark the unused ones to satisfy -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-24 09:13:31 -08:00
Jeff King	d3dcfa047f	mark "pointless" data pointers in callbacks Both the object_array_filter() and trie_find() functions use callback functions that let the caller specify which elements match. These callbacks take a void pointer in case the caller wants to pass in extra data. But in each case, the single user of these functions just passes NULL, and the callback ignores the extra pointer. We could just remove these unused parameters from the callback interface entirely. But it's good practice to provide such a pointer, as it guides future callers of the function in the right direction (rather than tempting them to access global data). Plus it's consistent with other generic callback interfaces. So let's instead annotate the unused parameters, in order to silence the compiler's -Wunused-parameter warning. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-24 09:13:30 -08:00
Elijah Newren	41771fa435	cache.h: remove dependence on hex.h; make other files include it explicitly Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-23 17:25:29 -08:00
Elijah Newren	36bf195890	alloc.h: move ALLOC_GROW() functions from cache.h This allows us to replace includes of cache.h with includes of the much smaller alloc.h in many places. It does mean that we also need to add includes of alloc.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-23 17:25:28 -08:00
Ævar Arnfjörð Bjarmason	6269f8eaad	treewide: always have a valid "index_state.repo" member When the "repo" member was added to "the_index" in [1] the repo_read_index() was made to populate it, but the unpopulated "the_index" variable didn't get the same treatment. Let's do that in initialize_the_repository() when we set it up, and likewise for all of the current callers initialized an empty "struct index_state". This simplifies code that needs to deal with "the_index" or a custom "struct index_state", we no longer need to second-guess this part of the "index_state" deep in the stack. A recent example of such second-guessing is the "istate->repo ? istate->repo : the_repository" code in [2]. We can now simply use "istate->repo". We're doing this by making use of the INDEX_STATE_INIT() macro (and corresponding function) added in [3], which now have mandatory "repo" arguments. Because we now call index_state_init() in repository.c's initialize_the_repository() we don't need to handle the case where we have a "repo->index" whose "repo" member doesn't match the "repo" we're setting up, i.e. the "Complete the double-reference" code in repo_read_index() being altered here. That logic was originally added in [1], and was working around the lack of what we now have in initialize_the_repository(). For "fsmonitor-settings.c" we can remove the initialization of a NULL "r" argument to "the_repository". This was added back in [4], and was needed at the time for callers that would pass us the "r" from an "istate->repo". Before this change such a change to "fsmonitor-settings.c" would segfault all over the test suite (e.g. in t0002-gitfile.sh). This change has wider eventual implications for "fsmonitor-settings.c". The reason the other lazy loading behavior in it is required (starting with "if (!r->settings.fsmonitor) ..." is because of the previously passed "r" being "NULL". I have other local changes on top of this which move its configuration reading to "prepare_repo_settings()" in "repo-settings.c", as we could now start to rely on it being called for our "r". But let's leave all of that for now, and narrowly remove this particular part of the lazy-loading. 1. `1fd9ae517c` (repository: add repo reference to index_state, 2021-01-23) 2. `ee1f0c242e` (read-cache: add index.skipHash config option, 2023-01-06) 3. `2f6b1eb794` (cache API: add a "INDEX_STATE_INIT" macro/function, add release_index(), 2023-01-12) 4. `1e0ea5c431` (fsmonitor: config settings are repository-specific, 2022-03-25) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-17 14:32:06 -08:00
Ævar Arnfjörð Bjarmason	2f6b1eb794	cache API: add a "INDEX_STATE_INIT" macro/function, add release_index() Hopefully in some not so distant future, we'll get advantages from always initializing the "repo" member of the "struct index_state". To make that easier let's introduce an initialization macro & function. The various ad-hoc initialization of the structure can then be changed over to it, and we can remove the various "0" assignments in discard_index() in favor of calling index_state_init() at the end. While not strictly necessary, let's also change the CALLOC_ARRAY() of various "struct index_state *" to use an ALLOC_ARRAY() followed by index_state_init() instead. We're then adding the release_index() function and converting some callers (including some of these allocations) over to it if they either won't need to use their "struct index_state" again, or are just about to call index_state_init(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-16 10:46:58 -08:00
Junio C Hamano	179547932f	Merge branch 'jk/unused-post-2.39' Code clean-up around unused function parameters. * jk/unused-post-2.39: userdiff: mark unused parameter in internal callback list-objects-filter: mark unused parameters in virtual functions diff: mark unused parameters in callbacks xdiff: mark unused parameter in xdl_call_hunk_func() xdiff: drop unused parameter in def_ff() ws: drop unused parameter from ws_blank_line() list-objects: drop process_gitlink() function blob: drop unused parts of parse_blob_buffer() ls-refs: use repository parameter to iterate refs	2022-12-26 11:42:05 +09:00
Junio C Hamano	9ea1378d04	Merge branch 'ab/various-leak-fixes' Various leak fixes. * ab/various-leak-fixes: built-ins: use free() not UNLEAK() if trivial, rm dead code revert: fix parse_options_concat() leak cherry-pick: free "struct replay_opts" members rebase: don't leak on "--abort" connected.c: free the "struct packed_git" sequencer.c: fix "opts->strategy" leak in read_strategy_opts() ls-files: fix a --with-tree memory leak revision API: call graph_clear() in release_revisions() unpack-file: fix ancient leak in create_temp_file() built-ins & libs & helpers: add/move destructors, fix leaks dir.c: free "ident" and "exclude_per_dir" in "struct untracked_cache" read-cache.c: clear and free "sparse_checkout_patterns" commit: discard partial cache before (re-)reading it {reset,merge}: call discard_index() before returning tests: mark tests as passing with SANITIZE=leak	2022-12-14 15:55:46 +09:00
Jeff King	61bdc7c5d8	diff: mark unused parameters in callbacks The diff code provides a format_callback interface, but not every callback needs each parameter (e.g., the "opt" and "data" parameters are frequently left unused). Likewise for the output_prefix callback, the low-level change/add_remove interfaces, the callbacks used by xdi_diff(), etc. Mark unused arguments in the callback implementations to quiet -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-13 22:16:23 +09:00
Junio C Hamano	f8828f9125	Merge branch 'ps/receive-use-only-advertised' "git receive-pack" used to use all the local refs as the boundary for checking connectivity of the data "git push" sent, but now it uses only the refs that it advertised to the pusher. In a repository with the .hideRefs configuration, this reduces the resources needed to perform the check. cf. <221028.86bkpw805n.gmgdl@evledraar.gmail.com> cf. <xmqqr0yrizqm.fsf@gitster.g> * ps/receive-use-only-advertised: receive-pack: only use visible refs for connectivity check rev-parse: add `--exclude-hidden=` option revision: add new parameter to exclude hidden refs revision: introduce struct to handle exclusions revision: move together exclusion-related functions refs: get rid of global list of hidden refs refs: fix memory leak when parsing hideRefs config	2022-11-23 11:22:25 +09:00
Ævar Arnfjörð Bjarmason	fc47252d5b	revision API: call graph_clear() in release_revisions() Call graph_clear() in release_revisions(), this will free memory allocated by e.g. this command, which will now run without memory leaks: git -P log -1 --graph --no-graph --graph Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-21 12:32:48 +09:00
Patrick Steinhardt	8c1bc2a71a	revision: add new parameter to exclude hidden refs Users can optionally hide refs from remote users in git-upload-pack(1), git-receive-pack(1) and others via the `transfer.hideRefs`, but there is not an easy way to obtain the list of all visible or hidden refs right now. We'll require just that though for a performance improvement in our connectivity check. Add a new option `--exclude-hidden=` that excludes any hidden refs from the next pseudo-ref like `--all` or `--branches`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-17 16:22:52 -05:00
Patrick Steinhardt	1e9f273ac0	revision: introduce struct to handle exclusions The functions that handle exclusion of refs work on a single string list. We're about to add a second mechanism for excluding refs though, and it makes sense to reuse much of the same architecture for both kinds of exclusion. Introduce a new `struct ref_exclusions` that encapsulates all the logic related to excluding refs and move the `struct string_list` that holds all wildmatch patterns of excluded refs into it. Rename functions that operate on this struct to match its name. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-17 16:22:52 -05:00
Patrick Steinhardt	05b9425960	revision: move together exclusion-related functions Move together the definitions of functions that handle exclusions of refs so that related functionality sits in a single place, only. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-17 16:22:51 -05:00
Ævar Arnfjörð Bjarmason	916ebb327c	revisions API: extend the nascent REV_INFO_INIT macro Have the REV_INFO_INIT macro added in [1] declare more members of "struct rev_info" that we can initialize statically, and have repo_init_revisions() do so with the memcpy(..., &blank) idiom introduced in [2]. As the comment for the "REV_INFO_INIT" macro notes this still isn't sufficient to initialize a "struct rev_info" for use yet. But we are getting closer to that eventual goal. Even though we can't fully initialize a "struct rev_info" with REV_INFO_INIT it's useful for readability to clearly separate those things that we can statically initialize, and those that we can't. This change could replace the: list_objects_filter_init(&revs->filter); In the repo_init_revisions() with this line, at the end of the REV_INFO_INIT deceleration in revisions.h: .filter = LIST_OBJECTS_FILTER_INIT, \ But doing so would produce a minor conflict with an outstanding topic[3]. Let's skip that for now. I have follow-ups to initialize more of this statically, e.g. changes to get rid of grep_init(). We can initialize more members with the macro in a future series. 1. `f196c1e908` (revisions API users: use release_revisions() needing REV_INFO_INIT, 2022-04-13) 2. `5726a6b401` (.c _init(): define in terms of corresponding *_INIT macro, 2021-07-01) 3. https://lore.kernel.org/git/265b292ed5c2de19b7118dfe046d3d9d932e2e89.1667901510.git.ps@pks.im/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-08 16:34:01 -05:00
Junio C Hamano	b988427918	Merge branch 'rs/diff-caret-bang-with-parents' "git diff rev^!" did not show combined diff to go to the rev from its parents. * rs/diff-caret-bang-with-parents: diff: support ^! for merges revisions.txt: unspecify order of resolved parts of ^! revision: use strtol_i() for exclude_parent	2022-10-25 17:11:43 -07:00
Ævar Arnfjörð Bjarmason	82dd01d81b	CodingGuidelines: allow declaring variables in for loops Since `44ba10d671` (revision: use C99 declaration of variable in for() loop, 2021-11-14) released with v2.35.0 we've had a variable declared with in a for loop. Since then we've had inadvertent follow-ups to that with at least `cb2607759e` (merge-ort: store more specific conflict information, 2022-06-18) released with v2.38.0. As November 2022 is within the window of this upcoming release, let's update the guideline to allow this. We can have the promised "revisit" discussion while this patch cooks, and drop it if it turns out that it is still premature, which is not expected to happen at this moment. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-10 13:41:11 -07:00
René Scharfe	793c21182e	revision: use strtol_i() for exclude_parent Avoid silent overflow of the int exclude_parent by using the appropriate function, strtol_i(), to parse its value. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-01 15:58:33 -07:00
Junio C Hamano	298a958224	Merge branch 'jk/list-objects-filter-cleanup' A couple of bugfixes with code clean-up. * jk/list-objects-filter-cleanup: list-objects-filter: convert filter_spec to a strbuf list-objects-filter: add and use initializers list-objects-filter: handle null default filter spec list-objects-filter: don't memset after releasing filter struct	2022-09-19 14:35:24 -07:00
Junio C Hamano	dd407f1c7c	Merge branch 'ab/unused-annotation' Undoes 'jk/unused-annotation' topic and redoes it to work around Coccinelle rules misfiring false positives in unrelated codepaths. * ab/unused-annotation: git-compat-util.h: use "deprecated" for UNUSED variables git-compat-util.h: use "UNUSED", not "UNUSED(var)"	2022-09-14 12:56:39 -07:00
Junio C Hamano	a6b42ec0c6	Merge branch 'jk/unused-annotation' Annotate function parameters that are not used (but cannot be removed for structural reasons), to prepare us to later compile with -Wunused warning turned on. * jk/unused-annotation: is_path_owned_by_current_uid(): mark "report" parameter as unused run-command: mark unused async callback parameters mark unused read_tree_recursive() callback parameters hashmap: mark unused callback parameters config: mark unused callback parameters streaming: mark unused virtual method parameters transport: mark bundle transport_options as unused refs: mark unused virtual method parameters refs: mark unused reflog callback parameters refs: mark unused each_ref_fn parameters git-compat-util: add UNUSED macro	2022-09-14 12:56:39 -07:00
Junio C Hamano	655e494047	Merge branch 'jk/rev-list-verify-objects-fix' "git rev-list --verify-objects" ought to inspect the contents of objects and notice corrupted ones, but it didn't when the commit graph is in use, which has been corrected. * jk/rev-list-verify-objects-fix: rev-list: disable commit graph with --verify-objects lookup_commit_in_graph(): use prepare_commit_graph() to check for graph	2022-09-13 11:38:24 -07:00
Jeff King	2a01bdedf8	list-objects-filter: add and use initializers In `7e2619d8ff` (list_objects_filter_options: plug leak of filter_spec strings, 2022-09-08), we noted that the filter_spec string_list was inconsistent in how it handled memory ownership of strings stored in the list. The fix there was a bit of a band-aid to set the "strdup_strings" variable right before adding anything. That works OK, and it lets the users of the API continue to zero-initialize the struct. But it makes the code a bit hard to follow and accident-prone, as any other spots appending the filter_spec need to think about whether to set the strdup_strings value, too (there's one such spot in partial_clone_get_default_filter_spec(), which is probably a possible memory leak). So let's do that full cleanup now. We'll introduce a LIST_OBJECTS_FILTER_INIT macro and matching function, and use them as appropriate (though it is for the "_options" struct, this matches the corresponding list_objects_filter_release() function). This is harder than it seems! Many other structs, like git_transport_data, embed the filter struct. So they need to initialize it themselves even if the rest of the enclosing struct is OK with zero-initialization. I found all of the relevant spots by grepping manually for declarations of list_objects_filter_options. And then doing so recursively for structs which embed it, and ones which embed those, and so on. I'm pretty sure I got everything, but there's no change that would alert the compiler if any topics in flight added new declarations. To catch this case, we now double-check in the parsing function that things were initialized as expected and BUG() if appropriate. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-12 08:38:59 -07:00
Jeff King	9a8c3c4a5f	parse_object(): check commit-graph when skip_hash set If the caller told us that they don't care about us checking the object hash, then we're free to implement any optimizations that get us the parsed value more quickly. An obvious one is to check the commit graph before loading an object from disk. And in fact, both of the callers who pass in this flag are already doing so before they call parse_object()! So we can simplify those callers, as well as any possible future ones, by moving the logic into parse_object(). There are two subtle things to note in the diff, but neither has any impact in practice: - it seems least-surprising here to do the graph lookup on the git-replace'd oid, rather than the original. This is in theory a change of behavior from the earlier code, as neither caller did a replace lookup itself. But in practice it doesn't matter, as we disable the commit graph entirely if there are any replace refs. - the caller in get_reference() passes the skip_hash flag only if revs->verify_objects isn't set, whereas it would look in the commit graph unconditionally. In practice this should not matter as we should disable the commit graph entirely when using verify_objects (and that was done recently in another patch). So this should be a pure cleanup with no behavior change. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-07 12:27:02 -07:00
Jeff King	0bc2557951	upload-pack: skip parse-object re-hashing of "want" objects Imagine we have a history with commit C pointing to a large blob B. If a client asks us for C, we can generally serve both objects to them without accessing the uncompressed contents of B. In upload-pack, we figure out which commits we have and what the client has, and feed those tips to pack-objects. In pack-objects, we traverse the commits and trees (or use bitmaps!) to find the set of objects needed, but we never open up B. When we serve it to the client, we can often pass the compressed bytes directly from the on-disk packfile over the wire. But if a client asks us directly for B, perhaps because they are doing an on-demand fetch to fill in the missing blob of a partial clone, we end up much slower. Upload-pack calls parse_object() on the oid we receive, which opens up the object and re-checks its hash (even though if it were a commit, we might skip this parse entirely in favor of the commit graph!). And then we feed the oid directly to pack-objects, which again calls parse_object() and opens the object. And then finally, when we write out the result, we may send bytes straight from disk, but only after having unnecessarily uncompressed and computed the sha1 of the object twice! This patch teaches both code paths to use the new SKIP_HASH_CHECK flag for parse_object(). You can see the speed-up in p5600, which does a blob:none clone followed by a checkout. The savings for git.git are modest: Test HEAD^ HEAD ---------------------------------------------------------------------- 5600.3: checkout of result 2.23(4.19+0.24) 1.72(3.79+0.18) -22.9% But the savings scale with the number of bytes. So on a repository like linux.git with more files, we see more improvement (in both absolute and relative numbers): Test HEAD^ HEAD ---------------------------------------------------------------------------- 5600.3: checkout of result 51.62(77.26+2.76) 34.86(61.41+2.63) -32.5% And here's an even more extreme case. This is the android gradle-plugin repository, whose tip checkout has ~3.7GB of files: Test HEAD^ HEAD -------------------------------------------------------------------------- 5600.3: checkout of result 79.51(90.84+5.55) 40.28(51.88+5.67) -49.3% Keep in mind that these timings are of the whole checkout operation. So they count the client indexing the pack and actually writing out the files. If we want to see just the server's view, we can hack up the GIT_TRACE_PACKET output from those operations and replay it via upload-pack. For the gradle example, that gives me: Benchmark 1: GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input Time (mean ± σ): 50.884 s ± 0.239 s [User: 51.450 s, System: 1.726 s] Range (min … max): 50.608 s … 51.025 s 3 runs Benchmark 2: GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input Time (mean ± σ): 9.728 s ± 0.112 s [User: 10.466 s, System: 1.535 s] Range (min … max): 9.618 s … 9.842 s 3 runs Summary 'GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input' ran 5.23 ± 0.07 times faster than 'GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input' So a server would see an 80% reduction in CPU serving the initial checkout of a partial clone for this repository. Or possibly even more depending on the packing; most of the time spent in the faster one were objects we had to open during the write phase. In both cases skipping the extra hashing on the server should be pretty safe. The client doesn't trust the server anyway, so it will re-hash all of the objects via index-pack. There is one thing to note, though: the change in get_reference() affects not just pack-objects, but rev-list, git-log, etc. We could use a flag to limit to index-pack here, but we may already skip hash checks in this instance. For commits, we'd skip anything we load via the commit-graph. And while before this commit we would check a blob fed directly to rev-list on the command-line, we'd skip checking that same blob if we found it by traversing a tree. The exception for both is if --verify-objects is used. In that case, we'll skip this optimization, and the new test makes sure we do this correctly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-07 12:20:02 -07:00
Jeff King	b27ccae34b	rev-list: disable commit graph with --verify-objects Since the point of --verify-objects is to actually load and checksum the bytes of each object, optimizing out reads using the commit graph runs contrary to our goal. The most targeted way to implement this would be for the revision traversal code to check revs->verify_objects and avoid using the commit graph. But it's difficult to be sure we've hit all of the correct spots. For instance, I started this patch by writing the first of the included test cases, where the corrupted commit is directly on rev-list's command line. And that is easy to fix by teaching get_reference() to check revs->verify_objects before calling lookup_commit_in_graph(). But that doesn't cover the second test case: when we traverse to a corrupted commit, we'd parse the parent in process_parents(). So we'd need to check there, too. And it keeps going. In handle_commit() we sometimes parses commits, too, though I couldn't figure out a way to trigger it that did not already parse via get_reference() or tag peeling. And try_to_simplify_commit() has its own parse call, and so on. So it seems like the safest thing is to just disable the commit graph for the whole process when we see the --verify-objects option. We can do that either in builtin/rev-list.c, where we use the option, or in revision.c, where we parse it. There are some subtleties: - putting it in rev-list.c is less surprising in some ways, because there we know we are just doing a single traversal. In a command which does multiple traversals in a single process, it's rather unexpected to globally disable the commit graph. - putting it in revision.c is less surprising in some ways, because the caller does not have to remember to disable the graph themselves. But this is already tricky! The verify_objects flag in rev_info doesn't do anything by itself. The caller has to provide an object callback which does the right thing. - for that reason, in practice nobody but rev-list uses this option in the first place. So the distinction is probably not important either way. Arguably it should just be an option of rev-list, and not the general revision machinery; right now you can run "git log --verify-objects", but it does not actually do anything useful. - checking for a parsed revs.verify_objects flag in rev-list.c is too late. By that time we've already passed the arguments to setup_revisions(), which will have parsed the commits using the graph. So this commit disables the graph as soon as we see the option in revision.c. That's a pretty broad hammer, but it does what we want, and in practice nobody but rev-list is using this flag anyway. The tests cover both the "tip" and "parent" cases. Obviously our hammer hits them both in this case, but it's good to check both in case somebody later tries the more focused approach. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-07 09:44:30 -07:00
Ævar Arnfjörð Bjarmason	5cf88fd8b0	git-compat-util.h: use "UNUSED", not "UNUSED(var)" As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of `9b24034754` (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:49:48 -07:00
Jeff King	02c3c59e62	hashmap: mark unused callback parameters Hashmap comparison functions must conform to a particular callback interface, but many don't use all of their parameters. Especially the void cmp_data pointer, but some do not use keydata either (because they can easily form a full struct to pass when doing lookups). Let's mark these to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:55 -07:00
Jeff King	c006e9fa59	refs: mark unused reflog callback parameters Functions used with for_each_reflog_ent() need to conform to a particular interface, but not every function needs all of the parameters. Mark the unused ones to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:54 -07:00
Jeff King	63e14ee2d6	refs: mark unused each_ref_fn parameters Functions used with for_each_ref(), etc, need to conform to the each_ref_fn interface. But most of them don't need every parameter; let's annotate the unused ones to quiet -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:54 -07:00
Elijah Newren	257418c590	revision: allow --ancestry-path to take an argument We have long allowed users to run e.g. git log --ancestry-path master..seen which shows all commits which satisfy all three of these criteria: * are an ancestor of seen * are not an ancestor of master * have master as an ancestor This commit allows another variant: git log --ancestry-path=$TOPIC master..seen which shows all commits which satisfy all of these criteria: * are an ancestor of seen * are not an ancestor of master * have $TOPIC in their ancestry-path that last bullet can be defined as commits meeting any of these criteria: * are an ancestor of $TOPIC * have $TOPIC as an ancestor * are $TOPIC This also allows multiple --ancestry-path arguments, which can be used to find commits with any of the given topics in their ancestry path. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 10:45:08 -07:00
Junio C Hamano	83489a5b20	Merge branch 'ab/plug-revisions-leak' Plug a bit more leaks in the revisions API. * ab/plug-revisions-leak: revisions API: don't leak memory on argv elements that need free()-ing bisect.c: partially fix bisect_rev_setup() memory leak log: refactor "rev.pending" code in cmd_show() log: fix a memory leak in "git show <revision>..." test-fast-rebase helper: use release_revisions() (again) bisect.c: add missing "goto" for release_revisions()	2022-08-12 13:19:08 -07:00
Junio C Hamano	a6aeb2fef9	Merge branch 'jc/resolve-undo' into maint The resolve-undo information in the index was not protected against GC, which has been corrected. source: <xmqq35f7kzad.fsf@gitster.g> * jc/resolve-undo: fsck: do not dereference NULL while checking resolve-undo data revision: mark blobs needed for resolve-undo as reachable	2022-08-10 21:52:32 -07:00
Junio C Hamano	87098a047b	Merge branch 'sa/cat-file-mailmap' "git cat-file" learned an option to use the mailmap when showing commit and tag objects. * sa/cat-file-mailmap: cat-file: add mailmap support ident: rename commit_rewrite_person() to apply_mailmap_to_header() ident: move commit_rewrite_person() to ident.c revision: improve commit_rewrite_person()	2022-08-03 13:36:08 -07:00
Ævar Arnfjörð Bjarmason	f92dbdbc6a	revisions API: don't leak memory on argv elements that need free()-ing Add a "free_removed_argv_elements" member to "struct setup_revision_opt", and use it to fix several memory leaks. We have various memory leaks in APIs that take and munge "const char **argv", e.g. parse_options(). Sometimes these APIs are given the "argv" we get to the "main" function, in which case we don't leak memory, but other times we're giving it the "v" member of a "struct strvec" we created. There's several potential ways to fix those sort of leaks, we could add a "nodup" mode to "struct strvec", which would work for the cases where we push constant strings to it. But that wouldn't work as soon as we used strvec_pushf(), or otherwise needed to duplicate or create a string for that "struct strvec". Let's instead make it the responsibility of the revisions API. If it's going to clobber elements of argv it can also free() them, which it will now do if instructed to do so via "free_removed_argv_elements". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-03 11:12:36 -07:00
Junio C Hamano	418aef9055	Merge branch 'jc/resolve-undo' The resolve-undo information in the index was not protected against GC, which has been corrected. * jc/resolve-undo: fsck: do not dereference NULL while checking resolve-undo data revision: mark blobs needed for resolve-undo as reachable	2022-07-19 16:40:16 -07:00
Siddharth Asthana	66a8a95315	ident: rename commit_rewrite_person() to apply_mailmap_to_header() commit_rewrite_person() takes a commit buffer and replaces the idents in the header with their canonical versions using the mailmap mechanism. The name "commit_rewrite_person()" is misleading as it doesn't convey what kind of rewrite are we going to do to the buffer. It also doesn't clearly mention that the function will limit itself to the header part of the buffer. The new name, "apply_mailmap_to_header()", expresses the functionality of the function pretty clearly. We intend to use apply_mailmap_to_header() in git-cat-file to replace idents in the headers of commit and tag object buffers. So, we will be extending this function to take tag objects buffer as well and replace idents on the tagger header using the mailmap mechanism. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: John Cai <johncai86@gmail.com> Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-18 12:55:53 -07:00
Siddharth Asthana	dc88e349a2	ident: move commit_rewrite_person() to ident.c commit_rewrite_person() and rewrite_ident_line() are static functions defined in revision.c. Their usages are as follows: - commit_rewrite_person() takes a commit buffer and replaces the author and committer idents with their canonical versions using the mailmap mechanism - rewrite_ident_line() takes author/committer header lines from the commit buffer and replaces the idents with their canonical versions using the mailmap mechanism. This patch moves commit_rewrite_person() and rewrite_ident_line() to ident.c which contains many other functions related to idents like split_ident_line(). By moving commit_rewrite_person() to ident.c, we also intend to use it in git-cat-file to replace committer and author idents from the headers to their canonical versions using the mailmap mechanism. The function is moved as is for now to make it clear that there are no other changes, but it will be renamed in a following commit. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: John Cai <johncai86@gmail.com> Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-18 12:55:53 -07:00
Siddharth Asthana	e9c1b0e38c	revision: improve commit_rewrite_person() The function, commit_rewrite_person(), is designed to find and replace an ident string in the header part, and the way it avoids a random occurrence of "author A U Thor <author@example.com" in the text is by insisting "author" to appear at the beginning of line by passing "\nauthor " as "what". The implementation also doesn't make any effort to limit itself to the commit header by locating the blank line that appears after the header part and stopping the search there. Also, the interface forces the caller to make multiple calls if it wants to rewrite idents on multiple headers. It shouldn't be the case. To support the existing caller better, update commit_rewrite_person() to: - Make a single pass in the input buffer to locate headers named "author" and "committer" and replace idents on them. - Stop at the end of the header, ensuring that nothing in the body of the commit object is modified. The return type of the function commit_rewrite_person() has also been changed from int to void. This has been done because the caller of the function doesn't do anything with the return value of the function. By simplifying the interface of the commit_rewrite_person(), we also intend to expose it as a public function. We will also be renaming the function in a future commit to a different name which clearly tells that the function replaces idents in the header of the commit buffer. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: John Cai <johncai86@gmail.com> Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-18 12:55:53 -07:00
Junio C Hamano	5a5ea141e7	revision: mark blobs needed for resolve-undo as reachable The resolve-undo extension was added to the index in `cfc5789a` (resolve-undo: record resolved conflicts in a new index extension section, 2009-12-25). This extension records the blob object names and their modes of conflicted paths when the path gets resolved (e.g. with "git add"), to allow "undoing" the resolution with "checkout -m path". These blob objects should be guarded from garbage-collection while we have the resolve-undo information in the index (otherwise unresolve operation may try to use a blob object that has already been pruned away). But the code called from mark_reachable_objects() for the index forgets to do so. Teach add_index_objects_to_pending() helper to also add objects referred to by the resolve-undo extension. Also make matching changes to "fsck", which has code that is fairly similar to the reachability stuff, but have parallel implementations for all these stuff, which may (or may not) someday want to be unified. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-06-09 16:45:07 -07:00
Junio C Hamano	2da81d1efb	Merge branch 'ab/plug-leak-in-revisions' Plug the memory leaks from the trickiest API of all, the revision walker. * ab/plug-leak-in-revisions: (27 commits) revisions API: add a TODO for diff_free(&revs->diffopt) revisions API: have release_revisions() release "topo_walk_info" revisions API: have release_revisions() release "date_mode" revisions API: call diff_free(&revs->pruning) in revisions_release() revisions API: release "reflog_info" in release revisions() revisions API: clear "boundary_commits" in release_revisions() revisions API: have release_revisions() release "prune_data" revisions API: have release_revisions() release "grep_filter" revisions API: have release_revisions() release "filter" revisions API: have release_revisions() release "cmdline" revisions API: have release_revisions() release "mailmap" revisions API: have release_revisions() release "commits" revisions API users: use release_revisions() for "prune_data" users revisions API users: use release_revisions() with UNLEAK() revisions API users: use release_revisions() in builtin/log.c revisions API users: use release_revisions() in http-push.c revisions API users: add "goto cleanup" for release_revisions() stash: always have the owner of "stash_info" free it revisions API users: use release_revisions() needing REV_INFO_INIT revision.[ch]: document and move code declared around "init" ...	2022-06-07 14:10:56 -07:00
Junio C Hamano	538dc459a0	Merge branch 'ep/maint-equals-null-cocci' Introduce and apply coccinelle rule to discourage an explicit comparison between a pointer and NULL, and applies the clean-up to the maintenance track. * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci	2022-05-20 15:26:59 -07:00
Junio C Hamano	2b0a58d164	Merge branch 'ep/maint-equals-null-cocci' for maint-2.35 * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci	2022-05-02 10:06:04 -07:00
Junio C Hamano	afe8a9070b	tree-wide: apply equals-null.cocci Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-02 09:50:37 -07:00
Miklos Vajna	96697781e0	log: "--since-as-filter" option is a non-terminating "--since" variant The "--since=<time>" option of "git log" limits the commits displayed by the command by stopping the traversal once it sees a commit whose timestamp is older than the given time and not digging further into its parents. This is OK in a history where a commit always has a newer timestamp than any of its parents'. Once you see a commit older than the given <time>, all ancestor commits of it are even older than the time anyway. It poses, however, a problem when there is a commit with a wrong timestamp that makes it appear older than its parents. Stopping traversal at the "incorrectly old" commit will hide its ancestors that are newer than that wrong commit and are newer than the cut-off time given with the --since option. --max-age and --after being the synonyms to --since, they share the same issue. Add a new "--since-as-filter" option that is a variant of "--since=<time>". Instead of stopping the traversal to hide an old enough commit and its all ancestors, exclude commits with an old timestamp from the output but still keep digging the history. Without other traversal stopping options, this will force the command in "git log" family to dig down the history to the root. It may be an acceptable cost for a small project with short history and many commits with screwy timestamps. It is quite unlikely for us to add traversal stopper other than since, so have this as a --since-as-filter option, rather than a separate --as-filter, that would be probably more confusing. Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-23 09:36:07 -07:00

1 2 3 4 5 ...

955 Commits