git-commit-vandalism

Author	SHA1	Message	Date
Derrick Stolee	49abcd21da	for-each-ref: add ahead-behind format atom The previous change implemented the ahead_behind() method, including an algorithm to compute the ahead/behind values for a number of commit tips relative to a number of commit bases. Now, integrate that algorithm as part of 'git for-each-ref' hidden behind a new format atom, ahead-behind. This naturally extends to 'git branch' and 'git tag' builtins, as well. This format allows specifying multiple bases, if so desired, and all matching references are compared against all of those bases. For this reason, failing to read a reference provided from these atoms results in an error. In order to translate the ahead_behind() method information to the format output code in ref-filter.c, we must populate arrays of ahead_behind_count structs. In struct ref_array, we store the full array that will be passed to ahead_behind(). In struct ref_array_item, we store an array of pointers that point to the relvant items within the full array. In this way, we can pull all relevant ahead/behind values directly when formatting output for a specific item. It also ensures the lifetime of the ahead_behind_count structs matches the time that the array is being used. Add specific tests of the ahead/behind counts in t6600-test-reach.sh, as it has an interesting repository shape. In particular, its merging strategy and its use of different commit-graphs would demonstrate over- counting if the ahead_behind() method did not already account for that possibility. Also add tests for the specific for-each-ref, branch, and tag builtins. In the case of 'git tag', there are intersting cases that happen when some of the selected tips are not commits. This requires careful logic around commits_nr in the second loop of filter_ahead_behind(). Also, the test in t7004 is carefully located to avoid being dependent on the GPG prereq. It also avoids using the test_commit helper, as that will add ticks to the time and disrupt the expected timestamps in later tag tests. Also add performance tests in a new p1300-graph-walks.sh script. This will be useful for more uses in the future, but for now compare the ahead-behind counting algorithm in 'git for-each-ref' to the naive implementation by running 'git rev-list --count' processes for each input. For the Git source code repository, the improvement is already obvious: Test this tree --------------------------------------------------------------- 1500.2: ahead-behind counts: git for-each-ref 0.07(0.07+0.00) 1500.3: ahead-behind counts: git branch 0.07(0.06+0.00) 1500.4: ahead-behind counts: git tag 0.07(0.06+0.00) 1500.5: ahead-behind counts: git rev-list 1.32(1.04+0.27) But the standard performance benchmark is the Linux kernel repository, which demosntrates a significant improvement: Test this tree --------------------------------------------------------------- 1500.2: ahead-behind counts: git for-each-ref 0.27(0.24+0.02) 1500.3: ahead-behind counts: git branch 0.27(0.24+0.03) 1500.4: ahead-behind counts: git tag 0.28(0.27+0.01) 1500.5: ahead-behind counts: git rev-list 4.57(4.03+0.54) The 'git rev-list' test exists in this change as a demonstration, but it will be removed in the next change to avoid wasting time on this comparison. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Derrick Stolee	fd67d149bd	commit-reach: implement ahead_behind() logic Fully implement the commit-counting logic required to determine ahead/behind counts for a batch of commit pairs. This is a new library method within commit-reach.h. This method will be linked to the for-each-ref builtin in the next change. The interface for ahead_behind() uses two arrays. The first array of commits contains the list of all starting points for the walk. This includes all tip commits _and_ base commits. The second array specifies base/tip pairs by pointing to commits within the first array, by index. The second array also stores the resulting ahead/behind counts for each of these pairs. This implementation of ahead_behind() allows multiple bases, if desired. Even with multiple bases, there is only one commit walk used for counting the ahead/behind values, saving time when the base/tip ranges overlap significantly. This interface for ahead_behind() also makes it very easy to call ensure_generations_valid() on the entire array of bases and tips. This call is necessary because it is critical that the walk that counts ahead/behind values never walks a commit more than once. Without generation numbers on every commit, there is a possibility that a commit date skew could cause the walk to revisit a commit and then double-count it. For this reason, it is strongly recommended that 'git ahead-behind' is only run in a repository with a commit-graph file that covers most of the reachable commits, storing precomputed generation numbers. If no commit-graph exists, this walk will be much slower as it must walk all reachable commits in ensure_generations_valid() before performing the counting logic. It is possible to detect if generation numbers are available at run time and redirect the implementation to another algorithm that does not require this property. However, that implementation requires a commit walk per base/tip pair _and_ can be slower due to the commit date heuristics required. Such an implementation could be considered in the future if there is a reason to include it, but most Git hosts should already be generating a commit-graph file as part of repository maintenance. Most Git clients should also be generating commit-graph files as part of background maintenance or automatic GCs. Now, let's discuss the ahead/behind counting algorithm. The first array of commits are considered the starting commits. The index within that array will play a critical role. We create a new commit slab that maps commits to a bitmap. For a given commit (anywhere in the history), its bitmap stores information relative to which of the input commits can reach that commit. The ith bit will be on if the ith commit from the starting list can reach that commit. It is important to notice that these bitmaps are not the typical "reachability bitmaps" that are stored in .bitmap files. Instead of signalling which objects are reachable from the current commit, they instead signal "which starting commits can reach me?" It is also important to know that the bitmap is not necessarily "complete" until we walk that commit. We will perform a commit walk by generation number in such a way that we can guarantee the bitmap is correct when we visit that commit. At the beginning of the ahead_behind() method, we initialize the bitmaps for each of the starting commits. By enabling the ith bit for the ith starting commit, we signal "the ith commit can reach itself." We walk commits by popping the commit with maximum generation number out of the queue, guaranteeing that we will never walk a child of that commit in any future steps. As we walk, we load the bitmap for the current commit and perform two main steps. The _second_ step examines each parent of the current commit and adds the current commit's bitmap bits to each parent's bitmap. (We create a new bitmap for the parent if this is our first time seeing that parent.) After adding the bits to the parent's bitmap, the parent is added to the walk queue. Due to this passing of bits to parents, the current commit has a guarantee that the ith bit is enabled on its bitmap if and only if the ith commit can reach the current commit. The first step of the walk is to examine the bitmask on the current commit and decide which ranges the commit is in or not. Due to the "bit pushing" in the second step, we have a guarantee that the ith bit of the current commit's bitmap is on if and only if the ith starting commit can reach it. For each ahead_behind_count struct, check the base_index and tip_index to see if those bits are enabled on the current bitmap. If exactly one bit is enabled, then increment the corresponding 'ahead' or 'behind' count. This increment is the reason we _absolutely need_ to walk commits at most once. The only subtle thing to do with this walk is to check to see if a parent has all bits on in its bitmap, in which case it becomes "stale" and is marked with the STALE bit. This allows queue_has_nonstale() to be the terminating condition of the walk, which greatly reduces the number of commits walked if all of the commits are nearby in history. It avoids walking a large number of common commits when there is a deep history. We also use the helper method insert_no_dup() to add commits to the priority queue without adding them multiple times. This uses the PARENT2 flag. Thus, we must clear both the STALE and PARENT2 bits of all commits, in case ahead_behind() is called multiple times in the same process. Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Taylor Blau	c08645b353	commit-graph: introduce `ensure_generations_valid()` Use the just-introduced compute_reachable_generation_numbers_1() to implement a function which dynamically computes topological levels (or corrected commit dates) for out-of-graph commits. This will be useful for the ahead-behind algorithm we are about to introduce, which needs accurate topological levels on _all_ commits reachable from the tips in order to avoid over-counting. Co-authored-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Derrick Stolee	2ee11f7261	commit-graph: return generation from memory The commit_graph_generation() method used to report a value of GENERATION_NUMBER_INFINITY if the commit_graph_data_slab had an instance for the given commit but the graph_pos indicated the commit was not in the commit-graph file. However, an upcoming change will introduce the ability to set generation values in-memory without writing the commit-graph file. Thus, we can no longer trust 'graph_pos' to indicate whether or not the generation member can be trusted. Instead, trust the 'generation' member if the commit has a value in the slab _and_ the 'generation' member is non-zero. Otherwise, treat it as GENERATION_NUMBER_INFINITY. This only makes a difference for a very old case for the commit-graph: the very first Git release to write commit-graph files wrote zeroes in the topological level positions. If we are parsing a commit-graph with all zeroes, those commits will now appear to have GENERATION_NUMBER_INFINITY (as if they were not parsed from the commit-graph). I attempted several variations to work around the need for providing an uninitialized 'generation' member, but this was the best one I found. It does require a change to a verification test in t5318 because it reports a different error than the one about non-zero generation numbers. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Derrick Stolee	80c928d947	commit-graph: simplify compute_generation_numbers() The previous change introduced the generic algorithm compute_reachable_generation_numbers() and used it as the core functionality of compute_topological_levels(). Now, use it as the core functionality of compute_generation_numbers(). The main difference here is that we use generation version 2, which is used in to toggle the logic in compute_generation_from_max() for computing the corrected commit date based on the corrected commit dates of the parent commits (and the commit date of the current commit). It also uses different methods for (get\|set)_generation in the vtable in order to store and access the value in the correct places. Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Derrick Stolee	368d19b0b7	commit-graph: refactor compute_topological_levels() This patch extracts the common code used to compute topological levels and corrected committer dates into a common routine, compute_reachable_generation_numbers(). For ease of reading, it only modifies compute_topological_levels() to use this new routine, leaving compute_generation_numbers() to be modified in the next change. This new routine dispatches to call the necessary functions to get and set the generation number for a given commit through a vtable (the compute_generation_info struct). Computing the generation number itself is done in compute_generation_from_max(), which dispatches its implementation based on the generation version requested, or issuing a BUG() for unrecognized generation versions. This does not use a vtable because the logic depends only on the generation number version, not where the data is being loaded from or being stored to. This is a subtle point that will make more sense in a future change that modifies the in-memory generation values instead of just preparing values for writing to a commit-graph file. This change looks like it adds a lot of new code. However, two upcoming changes will be quite small due to the work being done in this change. Co-authored-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Derrick Stolee	b2c51b7590	for-each-ref: explicitly test no matches The for-each-ref builtin can take a list of ref patterns, but if none match, it still succeeds (but with no output). Add an explicit test that demonstrates that behavior. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:32 -07:00
Derrick Stolee	b73dec5530	for-each-ref: add --stdin option When a user wishes to input a large list of patterns to 'git for-each-ref' (likely a long list of exact refs) there are frequently system limits on the number of command-line arguments. Add a new --stdin option to instead read the patterns from standard input. Add tests that check that any unrecognized arguments are considered an error when --stdin is provided. Also, an empty pattern list is interpreted as the complete ref set. When reading from stdin, we populate the filter.name_patterns array dynamically as opposed to pointing to the 'argv' array directly. This is simple when using a strvec, as it is NULL-terminated in the same way. We then free the memory directly from the strvec. Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:32 -07:00
SZEDER Gábor	353e6d4554	parse-options.h: use designated initializers in OPT_* macros Use designated initializers in the expansions of the OPT_* macros to make it more readable which one-letter macro parameter initializes which field in the resulting 'struct option'. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:04:07 -07:00
SZEDER Gábor	aa0275a2c0	parse-options.h: rename _OPT_CONTAINS_OR_WITH()'s parameters Rename the 'help' parameter as it matches one of the fields in 'struct option', and, while at it, rename all other parameters to the usual one-letter name used in similar macro definitions. Furthermore, put all parameters in the replacement list between parentheses, like all other OPT_* macros do. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:04:06 -07:00
SZEDER Gábor	ab0845b382	parse-options.h: use consistent name for the callback parameters In the various OPT_* macros the 'f' parameter is usually used to specify flags, while the 'cb' parameter is used to specify a callback function. OPT_CALLBACK and OPT_NUMBER_CALLBACKS, however, are inconsistent with the rest, as they use 'f' to specify their callback function. Rename their callback macro parameters to 'cb' to avoid the inconsistency. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:04:06 -07:00
SZEDER Gábor	c4d9c79378	treewide: remove unnecessary inclusions of parse-options.h from headers The headers 'diagnose.h', 'list-objects-filter-options.h', 'ref-filter.h' and 'remote.h' declare option parsing callback functions with a 'struct option' parameter, and 'revision.h' declares an option parsing helper function taking 'struct parse_opt_ctx_t' and 'struct option' parameters. These headers all include 'parse-options.h', although they don't need any of the type definitions from that header file. Furthermore, 'list-objects-filter-options.h' and 'ref-filter.h' also define some OPT_ macros to initialize a 'struct option', but these don't necessitate the inclusion of parse-options.h in these headers either, because these macros are only expanded in source files. Remove these unnecessary inclusions of parse-options.h and use forward declarations to declare the necessary types. After this patch none of the header files include parse-options.h anymore. With these changes, the build time after modifying only parse-options.h is reduced by about 30%, and the number of targets built is almost 20% less: Before: $ touch parse-options.h && time make -j4 \|wc -l 353 real 1m1.527s user 3m32.205s sys 0m15.903s After: 289 real 0m39.285s user 2m12.540s sys 0m11.164s Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 11:55:18 -07:00
SZEDER Gábor	49fd551194	treewide: include parse-options.h in source files The builtins 'ls-remote', 'pack-objects', 'receive-pack', 'reflog' and 'send-pack' use parse_options(), but their source files don't directly include 'parse-options.h'. Furthermore, the source files 'diagnose.c', 'list-objects-filter-options.c', 'remote.c' and 'send-pack.c' define option parsing callback functions, while 'revision.c' defines an option parsing helper function, and thus need access to various fields in 'struct option' and 'struct parse_opt_ctx_t', but they don't directly include 'parse-options.h' either. They all can still be built, of course, because they include one of the header files that does include 'parse-options.h' (though unnecessarily, see the next commit). Add those missing includes to these files, as our general rule is that "a C file must directly include the header files that declare the functions and the types it uses". Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 11:26:47 -07:00
Patrick Steinhardt	d6606e02aa	fetch: centralize printing of reference updates In order to print updated references during a fetch, the two different call sites that do this will first call `format_display()` followed by a call to `fputs()`. This is needlessly roundabout now that we have the `display_state` structure that encapsulates all of the printing logic for references. Move displaying the reference updates into `format_display()` and rename it to `display_ref_update()` to better match its new purpose, which finalizes the conversion to make both the formatting and printing logic of reference updates self-contained. This will make it easier to add new output formats and printing to a different file descriptor than stderr. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 11:02:43 -07:00
Patrick Steinhardt	c4ef5edbc9	fetch: centralize logic to print remote URL When fetching from a remote, we not only print the actual references that have changed, but will also print the URL from which we have fetched them to standard output. The logic to handle this is duplicated across two different callsites with some non-trivial logic to compute the anonymized URL. Furthermore, we're using global state to track whether we have already shown the URL to the user or not. Refactor the code by moving it into `format_display()`. Like this, we can convert the global variable into a member of `display_state`. And second, we can deduplicate the logic to compute the anonymized URL. This also works as expected when fetching from multiple remotes, for example via a group of remotes, as we do this by forking a standalone git-fetch(1) process per remote that is to be fetched. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 11:02:43 -07:00
Patrick Steinhardt	331b7d29f0	fetch: centralize handling of per-reference format The function `format_display()` is used to print a single reference update to a buffer which will then ultimately be printed by the caller. This architecture causes us to duplicate some logic across the different callsites of this function. This makes it hard to follow the code as some parts of the logic are located in one place, while other parts of the logic are located in a different place. Furthermore, by having the logic scattered around it becomes quite hard to implement a new output format for the reference updates. We can make the logic a whole lot easier to understand by making the `format_display()` function self-contained so that it handles formatting and printing of the references. This will eventually allow us to easily implement a completely different output format, but also opens the door to conditionally print to either stdout or stderr depending on the output format. As a first step towards that goal we move the formatting directive used by both callers to print a single reference update into this function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 11:02:43 -07:00
Patrick Steinhardt	7c978db889	fetch: pass the full local reference name to `format_display` Before printing the name of the local references that would be updated by a fetch we first prettify the reference name. This is done at the calling side so that `format_display()` never sees the full name of the local reference. This restricts our ability to introduce new output formats that might want to print the full reference name. Right now, all callsites except one are prettifying the reference name anyway. And the only callsite that doesn't passes `FETCH_HEAD` as the hardcoded reference name to `format_display()`, which would never be changed by a call to `prettify_refname()` anyway. So let's refactor the code to pass in the full local reference name and then prettify it in the formatting code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 11:02:43 -07:00
Patrick Steinhardt	5cab51ff71	fetch: move output format into `display_state` The git-fetch(1) command supports printing references either in "full" or "compact" format depending on the `fetch.ouput` config key. The format that is to be used is tracked in a global variable. De-globalize the variable by moving it into the `display_state` structure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 11:02:43 -07:00
Patrick Steinhardt	ce9636d645	fetch: move reference width calculation into `display_state` In order to print references in proper columns we need to calculate the width of the reference column before starting to print the references. This is done with the help of a global variable `refcol_width`. Refactor the code to instead use a new structure `display_state` that contains the computed width and plumb it through the stack as required. This is only the first step towards de-globalizing the state required to print references. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 11:02:43 -07:00
Phillip Wood	91b81b64e3	wildmatch: hide internal return values WM_ABORT_ALL and WM_ABORT_TO_STARSTAR are used internally to limit backtracking when a match fails, they are not of interest to the caller and so should not be public. Suggested-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 10:58:53 -07:00
Phillip Wood	81b26f8f28	wildmatch: avoid undefined behavior The code changed in this commit is designed to check if the pattern starts with "/" or contains "//" (see `3a078dec33` (wildmatch: fix "" special case, 2013-01-01)). Unfortunately when the pattern begins with "/" `prev_p = p - 2` is evaluated when `p` points to the second "*" and so the subtraction is undefined according to section 6.5.6 of the C standard because the result does not point within the same object as `p`. Fix this by avoiding the subtraction unless it is well defined. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 10:58:53 -07:00
Phillip Wood	1f2e05f0b7	wildmatch: fix exponential behavior When dowild() cannot match a '' or '//' wildcard then it must return WM_ABORT_TO_STARSTAR or WM_ABORT_ALL respectively. Failure to observe this results in unnecessary backtracking and the time taken for a failed match increases exponentially with the number of wildcards in the pattern [1]. Unfortunately in some instances dowild() returns WM_NOMATCH for a failed match resulting in long match times for patterns containing multiple wildcards as can be seen in the following benchmark. (Note that the timings in the Benchmark 1 are really measuring the time to execute test-tool rather than the time to match the pattern) Benchmark 1: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "a" Time (mean ± σ): 22.8 ms ± 1.7 ms [User: 12.1 ms, System: 10.6 ms] Range (min … max): 19.4 ms … 26.9 ms 113 runs Warning: Ignoring non-zero exit code. Benchmark 2: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "aaaaaaaaa" Time (mean ± σ): 5.244 s ± 0.228 s [User: 5.229 s, System: 0.010 s] Range (min … max): 4.969 s … 5.707 s 10 runs Warning: Ignoring non-zero exit code. Summary 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "a"' ran 230.37 ± 20.04 times faster than 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "aaaaaaaaa"' The security implications are limited as it only affects operations that are potentially DoS vectors. For example by creating a blob containing such a pattern a malicious user can exploit this behavior to use large amounts of CPU time on a remote server by pushing the blob and then creating a new clone with --filter=sparse:oid. However this filter type is usually disabled as it is known to consume large amounts of CPU time even without this bug. The WM_MATCH changed in the first hunk of this patch comes from the original implementation imported from rsync in `5230f605e1` (Import wildmatch from rsync, 2012-10-15). Compared to the others converted here it is fairly harmless as it only triggers at the end of the pattern and so will only cause a single unnecessary backtrack. The others introduced by `6f1a31f0aa` (wildmatch: advance faster in <asterisk> + <literal> patterns, 2013-01-01) and `46983441ae` (wildmatch: make a special case for "/" with FNM_PATHNAME, 2013-01-01) are more pernicious and will cause exponential behavior. A new test is added to protect against future regressions. [1] https://research.swtch.com/glob Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 10:58:53 -07:00
Andrei Rybak	a93cbe8d78	t1507: assert output of rev-parse Tests in t1507-rev-parse-upstream.sh compare files "expect" and "actual" to assert the output of "git rev-parse", "git show", and "git log". However, two of the tests '@{reflog}-parsing does not look beyond colon' and '@{upstream}-parsing does not look beyond colon' don't inspect the contents of the created files. Assert output of "git rev-parse" in tests in t1507-rev-parse-upstream.sh to improve test coverage. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 09:11:42 -07:00
Andrei Rybak	7deec9442f	t1404: don't create unused file Some tests in file t1404-update-ref-errors.sh create file "unchanged" as the expected side for a test_cmp assertion at the end of the test for output of "git for-each-ref". Test 'no bogus intermediate values during delete' also creates a file named "unchanged" using "git for-each-ref". However, the file isn't used for any assertions in the test. Instead, "git rev-parse" is used to compare the reference with variable $D. Don't create unused file "unchanged" in test 'no bogus intermediate values during delete' of t1404-update-ref-errors.sh. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 09:11:42 -07:00
Andrei Rybak	94f07b5544	t1400: assert output of update-ref In t1400-update-ref.sh test 'transaction can create and delete' creates files "expect" and "actual", but doesn't compare them. Similarly, test 'transaction cannot restart ongoing transaction' redirects output of "git update-ref" to file "actual", but doesn't check its contents with any assertions. Assert output of "git update-ref" in tests to improve test coverage in t1400-update-ref.sh. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 09:11:42 -07:00
Andrei Rybak	17ae7f758e	t1302: don't create unused file Test 'gitdir selection on unsupported repo' in t1302-repo-version.sh writes output of a "git config" invocation to file "actual". However, the test doesn't have any assertions for the file. The file was used by this test until commit `b9605bc4f2` (config: only read .git/config from configured repos, 2016-09-12), before which "git config" was expected to print the bogus value of "core.repositoryformatversion" to standard output. Don't redirect output of "git config" to file "actual" in test 'gitdir selection on unsupported repo'. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 09:11:41 -07:00
Andrei Rybak	f4b98e17cf	t1010: don't create unused files Builtin "git mktree" writes the the object name of the tree object built to the standard output. Tests 'mktree refuses to read ls-tree -r output (1)' and 'mktree refuses to read ls-tree -r output (2)' in "t1010-mktree.sh" redirect output of "git mktree" to a file, but don't use its contents in assertions. Don't redirect output of "git mktree" to file "actual" in tests that assert that an invocation of "git mktree" must fail. Output of "git mktree" is empty when it refuses to build a tree object. So, alternatively, the test could assert that the output is empty. However, there isn't a good reason for the user to expect the command to be silent in such cases, so we shouldn't enforce it. The user shouldn't use the output of a failing command anyway. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 09:11:41 -07:00
Andrei Rybak	4e273368ce	t1006: assert error output of cat-file Test "cat-file $arg1 $arg2 error on missing full OID" in t1006-cat-file.sh compares files "expect.err" and "err.actual" to assert the expected error output of "git cat-file". A similar test in the same file named "cat-file $arg1 $arg2 error on missing short OID" also creates these two files, but doesn't use them in assertions. Assert error output of "git cat-file" in test "cat-file $arg1 $arg2 error on missing short OID" of t1006-cat-file.sh to improve test coverage. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 09:11:41 -07:00
Andrei Rybak	8fc184c0eb	t1005: assert output of ls-files Test 'reset should work' in t1005-read-tree-reset.sh compares two files "expect" and "actual" to assert the expected output of "git ls-files". Several other tests in the same file also create files "expect" and "actual", but don't use them in assertions. Assert output of "git ls-files" in t1005-read-tree-reset.sh to improve test coverage. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 09:11:41 -07:00
Junio C Hamano	e25cabbf6b	The second batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-19 15:03:22 -07:00
Junio C Hamano	a9f4a01760	Merge branch 'jk/add-p-unmerged-fix' "git add -p" while the index is unmerged sometimes failed to parse the diff output it internally produces and died, which has been corrected. * jk/add-p-unmerged-fix: add-patch: handle "* Unmerged path" lines	2023-03-19 15:03:13 -07:00
Junio C Hamano	947604ddb7	Merge branch 'ew/fetch-no-write-fetch-head-fix' * ew/fetch-no-write-fetch-head-fix: fetch: pass --no-write-fetch-head to subprocesses	2023-03-19 15:03:13 -07:00
Junio C Hamano	9de14c71f7	Merge branch 'fc/advice-diverged-history' After "git pull" that is configured with pull.rebase=false merge.ff=only fails due to our end having our own development, give advice messages to get out of the "Not possible to fast-forward" state. * fc/advice-diverged-history: advice: add diverging advice for novices	2023-03-19 15:03:13 -07:00
Junio C Hamano	fc1a4ce043	Merge branch 'ab/fix-strategy-opts-parsing' The code to parse "git rebase -X<opt>" was not prepared to see an unparsable option string, which has been corrected. * ab/fix-strategy-opts-parsing: sequencer.c: fix overflow & segfault in parse_strategy_opts()	2023-03-19 15:03:12 -07:00
Junio C Hamano	0717a424a7	Merge branch 'ds/reprepare-alternates-when-repreparing-packfiles' Once we start running, we assumed that the list of alternate object databases would never change. Hook into the machinery used to update the list of packfiles during runtime to update this list as well. * ds/reprepare-alternates-when-repreparing-packfiles: object-file: reprepare alternates when necessary	2023-03-19 15:03:12 -07:00
Junio C Hamano	5c92a451be	Merge branch 'jk/format-patch-change-format-for-empty-commits' "git format-patch" learned to write a log-message only output file for empty commits. * jk/format-patch-change-format-for-empty-commits: format-patch: output header for empty commits	2023-03-19 15:03:12 -07:00
Junio C Hamano	95de376349	Merge branch 'jk/bundle-use-dash-for-stdfiles' "git bundle" learned that "-" is a common way to say that the input comes from the standard input and/or the output goes to the standard output. It used to work only for output and only from the root level of the working tree. * jk/bundle-use-dash-for-stdfiles: parse-options: use prefix_filename_except_for_dash() helper parse-options: consistently allocate memory in fix_filename() bundle: don't blindly apply prefix_filename() to "-" bundle: document handling of "-" as stdin bundle: let "-" mean stdin for reading operations	2023-03-19 15:03:12 -07:00
Junio C Hamano	12201fd756	Merge branch 'jk/bundle-progress' Simplify UI to control progress meter given by "git bundle" command. * jk/bundle-progress: bundle: turn on --all-progress-implied by default	2023-03-19 15:03:11 -07:00
Junio C Hamano	3f3bb90c8f	Merge branch 'as/doc-markup-fix' Fix for a mis-mark-up in doc made in Git 2.39 days. * as/doc-markup-fix: git-merge-tree.txt: replace spurious HTML entity	2023-03-19 15:03:11 -07:00
Junio C Hamano	96a806f87a	Merge branch 'rj/avoid-switching-to-already-used-branch' A few subcommands have been taught to stop users from working on a branch that is being used in another worktree linked to the same repository. * rj/avoid-switching-to-already-used-branch: switch: reject if the branch is already checked out elsewhere (test) rebase: refuse to switch to a branch already checked out elsewhere (test) branch: fix die_if_checked_out() when ignore_current_worktree worktree: introduce is_shared_symref()	2023-03-19 15:03:11 -07:00
Junio C Hamano	c79786c486	Merge branch 'rj/bisect-already-used-branch' Allow "git bisect reset" to check out the original branch when the branch is already checked out in a different worktree linked to the same repository. * rj/bisect-already-used-branch: bisect: fix "reset" when branch is checked out elsewhere	2023-03-19 15:03:11 -07:00
Junio C Hamano	4a25b911cd	Merge branch 'zh/push-to-delete-onelevel-ref' "git push" has been taught to allow deletion of refs with one-level names to help repairing a repository who acquired such a ref by mistake. In general, we don't encourage use of such a ref, and creation or update to such a ref is rejected as before. * zh/push-to-delete-onelevel-ref: push: allow delete single-level ref receive-pack: fix funny ref error messsage	2023-03-19 15:03:10 -07:00
Junio C Hamano	67076b85b8	Merge branch 'ak/restore-both-incompatible-with-conflicts' "git restore" supports options like "--ours" that are only meaningful during a conflicted merge, but these options are only meaningful when updating the working tree files. These options are marked to be incompatible when both "--staged" and "--worktree" are in effect. * ak/restore-both-incompatible-with-conflicts: restore: fault --staged --worktree with merge opts	2023-03-19 15:03:10 -07:00
Junio C Hamano	b0d2440442	Merge branch 'ew/commit-reach-clean-up-flags-fix' Fix a segfaulting loop. The function and its caller may need further clean-up. * ew/commit-reach-clean-up-flags-fix: commit-reach: avoid NULL dereference	2023-03-19 15:03:10 -07:00
Junio C Hamano	6f54213718	Merge branch 'ab/avoid-losing-exit-codes-in-tests' Test clean-up. * ab/avoid-losing-exit-codes-in-tests: tests: don't lose misc "git" exit codes tests: don't lose exit status with "test <op> $(git ...)" tests: don't lose "git" exit codes in "! ( git ... \| grep )" tests: don't lose exit status with "(cd ...; test <op> $(git ...))" t/lib-patch-mode.sh: fix ignored exit codes auto-crlf tests: don't lose exit code in loops and outside tests	2023-03-19 15:03:10 -07:00
Jeff King	eaa0fd6584	git_connect(): fix corner cases in downgrading v2 to v0 There's code in git_connect() that checks whether we are doing a push with protocol_v2, and if so, drops us to protocol_v0 (since we know how to do v2 only for fetches). But it misses some corner cases: 1. it checks the "prog" variable, which is actually the path to receive-pack on the remote side. By default this is just "git-receive-pack", but it could be an arbitrary string (like "/path/to/git receive-pack", etc). We'd accidentally stay in v2 mode in this case. 2. besides "receive-pack" and "upload-pack", there's one other value we'd expect: "upload-archive" for handling "git archive --remote". Like receive-pack, this doesn't understand v2, and should use the v0 protocol. In practice, neither of these causes bugs in the real world so far. We do send a "we understand v2" probe to the server, but since no server implements v2 for anything but upload-pack, it's simply ignored. But this would eventually become a problem if we do implement v2 for those endpoints, as older clients would falsely claim to understand it, leading to a server response they can't parse. We can fix (1) by passing in both the program path and the "name" of the operation. I treat the name as a string here, because that's the pattern set in transport_connect(), which is one of our callers (we were simply throwing away the "name" value there before). We can fix (2) by allowing only known-v2 protocols ("upload-pack"), rather than blocking unknown ones ("receive-pack" and "upload-archive"). That will mean whoever eventually implements v2 push will have to adjust this list, but that's reasonable. We'll do the safe, conservative thing (sticking to v0) by default, and anybody working on v2 will quickly realize this spot needs to be updated. The new tests cover the receive-pack and upload-archive cases above, and re-confirm that we allow v2 with an arbitrary "--upload-pack" path (that already worked before this patch, of course, but it would be an easy thing to break if we flipped the allow/block logic without also handling "name" separately). Here are a few miscellaneous implementation notes, since I had to do a little head-scratching to understand who calls what: - transport_connect() is called only for git-upload-archive. For non-http git remotes, that resolves to the virtual connect_git() function (which then calls git_connect(); confused yet?). So plumbing through "name" in connect_git() covers that. - for regular fetches and pushes, callers use higher-level functions like transport_fetch_refs(). For non-http git remotes, that means calling git_connect() under the hood via connect_setup(). And that uses the "for_push" flag to decide which name to use. - likewise, plumbing like fetch-pack and send-pack may call git_connect() directly; they each know which name to use. - for remote helpers (including http), we already have separate parameters for "name" and "exec" (another name for "prog"). In process_connect_service(), we feed the "name" to the helper via "connect" or "stateless-connect" directives. There's also a "servpath" option, which can be used to tell the helper about the "exec" path. But no helpers we implement support it! For http it would be useless anyway (no reasonable server implementation will allow you to send a shell command to run the server). In theory it would be useful for more obscure helpers like remote-ext, but even there it is not implemented. It's tempting to get rid of it simply to reduce confusion, but we have publicly documented it since it was added in `fa8c097cc9` (Support remote helpers implementing smart transports, 2009-12-09), so it's possible some helper in the wild is using it. - So for v2, helpers (again, including http) are mainly used via stateless-connect, driven by the main program. But they do still need to decide whether to do a v2 probe. And so there's similar logic in remote-curl.c's discover_refs() that looks for "git-receive-pack". But it's not buggy in the same way. Since it doesn't support servpath, it is always dealing with a "service" string like "git-receive-pack". And since it doesn't support straight "connect", it can't be used for "upload-archive". So we could leave that spot alone. But I've updated it here to match the logic we're changing in connect_git(). That seems like the least confusing thing for somebody who has to touch both of these spots later (say, to add v2 push support). I didn't add a new test to make sure this doesn't break anything; we already have several tests (in t5551 and elsewhere) that make sure we are using v2 over http. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-17 15:15:59 -07:00
Jeff King	b3edf335df	transport: mark unused parameters in fetch_refs_from_bundle() We don't look at the "to_fetch" or "nr_heads" parameters at all. At first glance this seems like a bug (or at least pessimisation), because it means we fetch more objects from the bundle than we actually need. But the bundle does not have any way of computing the set of reachable objects itself (we'd have to pull all of the objects out to walk them). And anyway, we've probably already paid most of the cost of grabbing the objects, since we must copy the bundle locally before accessing it. So it's perfectly reasonable for the bundle code to just pull everything into the local object store. Unneeded objects can be dropped later via gc, etc. But we should mark these unused parameters as such to avoid the wrath of -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-17 14:17:48 -07:00
Jeff King	1e5e097496	http: mark unused parameter in fill_active_slot() callbacks We have a generic "fill" function that is used by both the dumb http push and fetch code paths. It takes a void parameter in case the caller wants to pass along extra data, but (since the previous commit) neither does so. So we could simply drop the extra parameter. But since it's good practice to provide a void pointer for in callback functions, we'll leave it here for the future, and just annotate it as unused (to appease -Wunused-parameter). While we're marking it, let's also fix the type in http-walker's function to have the correct "void" type. The original had to cast the function pointer and was technically undefined behavior (though generally OK in practice). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-17 14:17:48 -07:00
Jeff King	647edf79d6	http: drop unused parameter from start_object_request() We take a "walker" parameter for the request, but don't actually look at it. This is due to `5424bc557f` (http*: add helper methods for fetching objects (loose), 2009-06-06). Before then, we consulted the "walker" struct to tell us if we should be verbose, but now those messages are printed elsewhere. Let's drop the unused parameter to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-17 14:17:48 -07:00
Jeff King	910d07a861	mailmap: drop debugging code There's some debugging code in mailmap.c which is only compiled if you manually tweak the source to set DEBUG_MAILMAP. When it's not set, the fallback noop uses static inline functions; we couldn't use macros here because one of the functions is variadic (and variadic macros were forbidden back then, but aren't now). As a result, this triggers a -Wunused-parameter warning. We have a few options here: 1. Leave it be. Just mark it as UNUSED, or switch to a variadic macro. 2. Assume the debugging code is useful, compile it always, and trigger it with a run-time flag (e.g., with a trace key). This is pretty easy to do, and carries a pretty small runtime cost. 3. Assume the debugging is not very useful, and just rip it out. This matches what we did with a similar case in `69c5f17f11` (attr: drop DEBUG_ATTR code, 2022-10-06). The debugging flag has been mentioned only three times on the list. Once, when it was added in 2009: https://lore.kernel.org/git/cover.1234102794.git.marius@trolltech.com/ In 2013, when somebody fixed some compilation errors in the conditional code (presumably because they used it while making other changes): https://lore.kernel.org/git/1373871253-96480-1-git-send-email-sunshine@sunshineco.com/ And finally it seemed to have been useful to somebody in 2020: https://lore.kernel.org/git/87eejswql6.fsf@evledraar.gmail.com/ So it's not totally without value. On the other hand, it's not likely to be useful to non-developers (and certainly isn't if you have to recompile). And using a debugger or adding your own inspection code is likely to be as useful. So I've just dropped the code entirely here. Note that we do still have to mark a few parameters unused in callback functions which are passed to string_list_clear_func(). Those get an extra pointer with the string being cleared, which we previously fed to the debugging code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-17 14:17:19 -07:00

... 4 5 6 7 8 ...

69975 Commits