git-commit-vandalism

Author	SHA1	Message	Date
Ævar Arnfjörð Bjarmason	1fd2aa543d	grep.h: remove unused grep_threads_ok() declaration This function was removed in `0579f91dd7` (grep: enable threading with -p and -W using lazy attribute lookup, 2011-12-12), but not its corresponding *.h declaration. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 14:39:46 -07:00
Ævar Arnfjörð Bjarmason	f787ebd51c	builtin.h: remove cmd_tar_tree() declaration The cmd_tar_tree() function itself was removed in `925ceccf05` (tar-tree: remove deprecated command, 2013-11-10). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 14:39:46 -07:00
Ævar Arnfjörð Bjarmason	0000e81811	builtin/remote.c: add and use SHOW_INFO_INIT In the preceding commit we introduced REF_STATES_INIT, but did not change the "struct show_info" to have a corresponding initializer. Let's do that, and make it use "REF_STATES_INIT" and "STRING_LIST_INIT_DUP", doing that requires changing "list" and "states" away from being pointers. The resulting end-state is simpler since we omit the local "info_list" and "states" variables in show() as well as the memset(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 14:22:51 -07:00
Ævar Arnfjörð Bjarmason	0bc7787ca9	builtin/remote.c: add and use a REF_STATES_INIT Use a new REF_STATES_INIT designated initializer instead of assigning to the "strdup_strings" member of the previously memzero()'d version of this struct. The pattern of assigning to "strdup_strings" dates back to `211c89682e` (Make git-remote a builtin, 2008-02-29) (when it was "strdup_paths"), i.e. long before we used anything like our current established *_INIT patterns consistently. Then in `e61e0cc6b7` (builtin-remote: teach show to display remote HEAD, 2009-02-25) and `e5dcbfd9ab` (builtin-remote: new show output style for push refspecs, 2009-02-25) we added some more of these. As it turns out we only initialized this struct three times, all the other uses were of pointers to those initialized structs. So let's initialize it in those three places, skip the memset(), and pass those structs down appropriately. This would be a behavior change if we had codepaths that relied say on implicitly having had "new_refs" initialized to STRING_LIST_INIT_NODUP with the memset(), but only set the "strdup_strings" on some other struct, but then called string_list_append() on "new_refs". There isn't any such codepath, all of the late assignments to "strdup_strings" assigned to those structs that we'd use for those codepaths. So just initializing them all up-front makes for easier to understand code, i.e. in the pre-image it looked as though we had that tricky edge case, but we didn't. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 14:22:51 -07:00
Ævar Arnfjörð Bjarmason	73ee449bbf	urlmatch.[ch]: add and use URLMATCH_CONFIG_INIT Change the initialization pattern of "struct urlmatch_config" to use an _INIT macro and designated initializers. Right now there's no other "struct" member of "struct urlmatch_config" which would require its own _INIT, but it's good practice not to assume that. Let's also change this to a designated initializer while we're at it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 14:22:51 -07:00
René Scharfe	afc72b5d3a	mergesort: use ranks stack The bottom-up mergesort implementation needs to skip through sublists a lot. A recursive version could avoid that, but would require log2(n) stack frames. Explicitly manage a stack of sorted sublists of various lengths instead to avoid fast-forwarding while also keeping a lid on memory usage. While this patch was developed independently, a ranks stack is also used in https://github.com/mono/mono/blob/master/mono/eglib/sort.frag.h by the Mono project. The idea is to keep slots for log2(n_max) sorted sublists, one for each power of 2. Such a construct can accommodate lists of any length up to n_max. Since there is a known maximum number of items (effectively SIZE_MAX), we can preallocate the whole rank stack. We add items one by one, which is akin to incrementing a binary number. Make use of that by keeping track of the number of items and check bits in it instead of checking for NULL in the rank stack when checking if a sublist of a certain rank exists, in order to avoid memory accesses. The first item can go into the empty first slot as a sublist of length 2^0. The second one needs to be merged with the previous sublist and the result goes into the empty second slot as a sublist of length 2^1. The third one goes into vacated first slot and so on. At the end we merge all the sublists to get the result. The new version still performs a stable sort by making sure to put items seen earlier first when the compare function indicates equality. That's done by preferring items from sublists with a higher rank. The new merge function also tries to minimize the number of operations. Like blame.c::blame_merge(), the function doesn't set the next pointer if it already points to the right item, and it exits when it reaches the end of one of the two sublists that it's given. The old code couldn't do the latter because it kept all items in a single list. The number of comparisons stays the same, though. 
Here's example output of "test-tool mergesort test" for the rand distributions with the highest number of comparisons with the ranks stack: $ t/helper/test-tool mergesort test | awk ' NR > 1 && $1 != "rand" {next} $7 > max[$3] {max[$3] = $7; line[$3] = $0} END {for (n in line) print line[n]} ' distribut mode n m get_next set_next compare verdict rand copy 100 32 669 420 569 OK rand dither 1023 64 9997 5396 8974 OK rand dither 1024 512 10007 6159 8983 OK rand dither 1025 256 10993 5988 9968 OK Here are the differences to the results without this patch: distribut mode n m get_next set_next compare rand copy 100 32 -515 -280 0 rand dither 1023 64 -6376 -4834 0 rand dither 1024 512 -6377 -4081 0 rand dither 1025 256 -7461 -5287 0 The numbers of get_next and set_next calls are reduced significantly. NB: These winners are different than the ones shown in the patch that introduced the unriffle mode because the addition of the unriffle_skewed mode in between changed the consumption of rand() values. Here are the distributions with the most comparisons overall with the ranks stack: $ t/helper/test-tool mergesort test | awk ' $7 > max[$3] {max[$3] = $7; line[$3] = $0} END {for (n in line) print line[n]} ' distribut mode n m get_next set_next compare verdict sawtooth unriffle_skewed 100 128 689 632 589 OK sawtooth unriffle_skewed 1023 1024 10230 10220 9207 OK sawtooth unriffle 1024 1024 10241 10240 9217 OK sawtooth unriffle_skewed 1025 2048 11266 10242 10241 OK And here the differences to before: distribut mode n m get_next set_next compare sawtooth unriffle_skewed 100 128 -495 -68 0 sawtooth unriffle_skewed 1023 1024 -6143 -10 0 sawtooth unriffle 1024 1024 -6143 0 0 sawtooth unriffle_skewed 1025 2048 -7188 -1033 0 We get a similar reduction of get_next calls here, but only a slight reduction of set_next calls, if at all. 
And here are the results of p0071-sort.sh before: 0071.12: llist_mergesort() unsorted 0.36(0.33+0.01) 0071.14: llist_mergesort() sorted 0.15(0.13+0.01) 0071.16: llist_mergesort() reversed 0.16(0.14+0.01) ... and here the ones with this patch: 0071.12: llist_mergesort() unsorted 0.24(0.22+0.01) 0071.14: llist_mergesort() sorted 0.12(0.10+0.01) 0071.16: llist_mergesort() reversed 0.12(0.10+0.01) NB: We can't use t/perf/run to compare revisions in one run because it uses the test-tool from the worktree, not from the revisions being tested. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 12:43:09 -07:00
René Scharfe	40bc872adb	p0071: test performance of llist_mergesort() Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 12:43:09 -07:00
René Scharfe	84edc40676	p0071: measure sorting of already sorted and reversed files Check if sorting takes advantage of already sorted or reversed content, or if that corner case actually decreases performance, like it would for a simplistic quicksort implementation. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 12:43:09 -07:00
René Scharfe	f1ed4ce9e3	test-mergesort: add unriffle_skewed mode Add a mode that turns a sorted list into adversarial input for a bottom-up mergesort implementation that doubles the length of sorted sublists at each level -- like our llist_mergesort(). While unriffle mode splits the list in half at each recursion step, unriffle_skewed splits it into 2^l items and the rest, with 2^l being the highest power of two smaller than the number of items and thus 2^l >= rest. The rest is unriffled with the tail of the first half to require a merge to compare the maximum number of elements. It complements the unriffle mode, which targets balanced merges. If the number of elements is a power of two then both actually produce the same result, as 2^l == rest == n/2 at each recursion step in that case. Here are the results: $ t/helper/test-tool mergesort test | awk ' $7 > max[$3] {max[$3] = $7; line[$3] = $0} END {for (n in line) print line[n]} ' distribut mode n m get_next set_next compare verdict sawtooth unriffle_skewed 100 128 1184 700 589 OK sawtooth unriffle_skewed 1023 1024 16373 10230 9207 OK sawtooth unriffle 1024 1024 16384 10240 9217 OK sawtooth unriffle_skewed 1025 2048 18454 11275 10241 OK The sawtooth distribution with m>=n produces a sorted list and unriffle_skewed mode turns it into adversarial input for unbalanced merges, which it wins in all cases except for n=1024 -- the resulting list is the same, but unriffle is tested before unriffle_skewed, so its result is selected by the AWK script. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 12:43:09 -07:00
René Scharfe	1aa589922b	test-mergesort: add unriffle mode Add a mode that turns sorted items into adversarial input for mergesort. Do that by running mergesort in reverse and rearranging the items in such a way that each merge needs the maximum number of operations to undo it. Riffling is a card shuffling technique that involves splitting a deck in two and then interleaving the halves. A perfect riffle takes one card from each half in turn. That's similar to the most expensive merge, which has to take one item from each sublist in turn, which requires the maximum number of comparisons (n-1). So unriffle does that in reverse, i.e. it generates the first sublist out of the items at even indexes and the second sublist out of the items at odd indexes, without changing their order in any other way. Done recursively until we reach the trivial sublist length of one, this twists the list into an order that requires the maximum effort for mergesort to untangle. As a baseline, here are the rand distributions with the highest number of comparisons from "test-tool mergesort test": $ t/helper/test-tool mergesort test | awk ' NR > 1 && $1 != "rand" {next} $7 > max[$3] {max[$3] = $7; line[$3] = $0} END {for (n in line) print line[n]} ' distribut mode n m get_next set_next compare verdict rand copy 100 32 1184 700 569 OK rand reverse_1st_half 1023 256 16373 10230 8976 OK rand reverse_1st_half 1024 512 16384 10240 8993 OK rand dither 1025 64 18454 11275 9970 OK And here are the most expensive ones overall: $ t/helper/test-tool mergesort test | awk ' $7 > max[$3] {max[$3] = $7; line[$3] = $0} END {for (n in line) print line[n]} ' distribut mode n m get_next set_next compare verdict stagger reverse 100 64 1184 700 580 OK sawtooth unriffle 1023 1024 16373 10230 9179 OK sawtooth unriffle 1024 1024 16384 10240 9217 OK stagger unriffle 1025 2048 18454 11275 10241 OK The sawtooth distribution with m>=n generates a sorted list. 
The unriffle mode is designed to turn that into adversarial input for mergesort, and that checks out for n=1023 and n=1024, where it produces the list that requires the most comparisons. Item counts that are not powers of two have other winners, and that's because unriffle recursively splits lists into equal-sized halves, while llist_mergesort() splits them into the biggest power of two smaller than n and the rest, e.g. for n=1025 it sorts the first 1024 separately and finally merges them to the last item. So unriffle mode works as designed for the intended use case, but to consistently generate adversarial input for unbalanced merges we need something else. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 12:43:08 -07:00
René Scharfe	0cecb75531	test-mergesort: add generate subcommand Add a subcommand for printing test data. It can be used to generate special test cases and feed them into the sort subcommand or sort(1) for performance measurements. It may also be useful to illustrate the effect of distributions, modes and their parameters. It generates n integers with the specified distribution and its distribution-specific parameter m. E.g. m is the maximum value for the plateau distribution and the length and height of individual teeth of the sawtooth distribution. The generated values are printed as zero-padded eight-digit hexadecimal numbers to make sure alphabetic and numeric order are the same. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 12:43:08 -07:00
René Scharfe	e031e9719d	test-mergesort: add test subcommand Adapt the qsort certification program from "Engineering a Sort Function" by Bentley and McIlroy for testing our linked list sort function. It generates several lists with various distribution patterns and counts the number of operations llist_mergesort() needs to order them. It compares the result to the output of a trusted sort function (qsort(3)) and also checks if the sort is stable. Also add a test script that makes use of the new subcommand. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 12:43:08 -07:00
René Scharfe	d536a71169	test-mergesort: add sort subcommand Give the code for sorting a text file its own sub-command. This allows extending the helper, which we'll do in the following patches. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 12:43:08 -07:00
René Scharfe	2e6701017e	test-mergesort: use strbuf_getline() Strip line ending characters to make sure empty lines are sorted like sort(1) does. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 12:43:08 -07:00
David Aguilar	28c10ecbfc	difftool: add a missing space to the run_dir_diff() comments Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-30 18:48:51 -07:00
David Aguilar	8e2af8f0db	difftool: remove an unnecessary call to strbuf_release() The `buf` strbuf is reused again later in the same function, so there is no benefit to calling strbuf_release(). The subsequent usage is already using strbuf_reset() to reset the buffer, so releasing it early is only going to lead to a wasteful reallocation. Remove the early call to strbuf_release(). The same strbuf is already cleaned up in the "finish:" section so nothing is leaked, either. Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-30 18:48:51 -07:00
David Aguilar	2255c80c91	difftool: refactor dir-diff to write files using helper functions Add helper functions to handle the unlinking and writing of the dir-diff submodule and symlink stand-in files. Use the helpers to implement the guts of the hashmap loops. This eliminates duplicate code and safeguards the submodules hashmap loop against the symlink-chasing behavior that `5bafb3576a` (difftool: fix symlink-file writing in dir-diff mode, 2021-09-22) addressed. The submodules loop should not strictly require the unlink() call that this is introducing to them, but it does not necessarily hurt them either beyond the cost of the extra unlink(). Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-30 18:48:51 -07:00
David Aguilar	4ac9f15492	difftool: create a tmpdir path without repeated slashes The paths generated by difftool are passed to user-facing diff tools. Using paths with repeated slashes in them is a cosmetic blemish that is exposed to users and can be avoided. Use a strbuf to create the buffer used for the dir-diff tmpdir. Strip trailing slashes from the value read from TMPDIR to avoid repeated slashes in the generated paths. Adjust the error handling to avoid leaking strbufs and to avoid returning -1 to cmd_main(). Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-30 18:48:51 -07:00
Hamza Mahfooz	3f566c4e69	grep: refactor next_match() and match_one_pattern() for external use These changes are made in preparation for the colorization support for the "git log" subcommands that rely on regex functionality (i.e. "--author", "--committer" and "--grep"). These changes are necessary primarily because match_one_pattern() expects header lines to be prefixed; in pretty, however, the prefixes are stripped from the lines because the name-email pairs need to go through additional parsing before they can be printed, and because next_match() doesn't handle the case of "ctx == GREP_CONTEXT_HEAD" at all. So, teach next_match() how to handle the new case and move match_one_pattern()'s core logic to headerless_match_one_pattern() while preserving match_one_pattern()'s uses that depend on the additional processing. Signed-off-by: Hamza Mahfooz <someguy@effective-light.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-29 13:23:11 -07:00
Matheus Tavares	45bde58ef8	grep: demonstrate bug with textconv attributes and submodules In some circumstances, "git grep --textconv --recurse-submodules" ignores the textconv attributes from the submodules and erroneously applies the attributes defined in the superproject on the submodules' files. The textconv cache is also saved on the superproject, even for submodule objects. A fix for these problems will probably require at least three changes: - Some textconv and attributes functions (as well as their callees) will have to be adjusted to work with arbitrary repositories. Note that "fill_textconv()", for example, already receives a "struct repository" but it writes the textconv cache using "write_loose_object()", which implicitly works on "the_repository". - grep.c functions will have to call textconv/userdiff routines passing the "repo" field from "struct grep_source" instead of the one from "struct grep_opt". The latter always points to "the_repository" on "git grep" executions (see its initialization in builtin/grep.c), but the former points to the correct repository that each source (an object, file, or buffer) comes from. - "userdiff_find_by_path()" might need to use a different attributes stack for each repository it works on or reset its internal static stack when the repository is changed throughout the calls. For now, let's add some tests to demonstrate these problems, and also update a NEEDSWORK comment in grep.h that mentions this bug to reference the added tests. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-29 13:19:38 -07:00
Taylor Blau	6d08b9d4ca	builtin/repack.c: make largest pack preferred When repacking into a geometric series and writing a multi-pack bitmap, it is beneficial to have the largest resulting pack be the preferred object source in the bitmap's MIDX, since selecting the large packs can lead to fewer broken delta chains and better compression. Teach 'git repack' to identify this pack and pass it to the MIDX write machinery in order to mark it as preferred. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 21:20:56 -07:00
Taylor Blau	1d89d88d37	builtin/repack.c: support writing a MIDX while repacking Teach `git repack` a new `--write-midx` option for callers that wish to persist a multi-pack index in their repository while repacking. There are two existing alternatives to this new flag, but they don't cover our particular use-case. These alternatives are: - Call 'git multi-pack-index write' after running 'git repack', or - Set 'GIT_TEST_MULTI_PACK_INDEX=1' in your environment when running 'git repack'. The former works, but introduces a gap in bitmap coverage between repacking and writing a new MIDX (since the repack may have deleted a pack included in the existing MIDX, invalidating it altogether). Setting the 'GIT_TEST_' environment variable is obviously unsupported. In fact, even if it were supported officially, it still wouldn't work, because it generates the MIDX after redundant packs have been dropped, leading to the same issue as above. Introduce a new option which eliminates this race by teaching `git repack` to generate the MIDX at the critical point: after the new packs have been written and moved into place, but before the redundant packs have been removed. This option is compatible with `git repack`'s '--bitmap' option (it changes the interpretation to be: "write a bitmap corresponding to the MIDX after one has been generated"). There is a little bit of additional noise in the patch below to avoid repeating ourselves when selecting which packs to delete. Instead of a single loop as before (where we iterate over 'existing_packs', decide if a pack is worth deleting, and if so, delete it), we have two loops (the first where we decide which ones are worth deleting, and the second where we actually do the deleting). This makes it so we have a single check we can make consistently when (1) telling the MIDX which packs we want to exclude, and (2) actually unlinking the redundant packs. 
There is also a tiny change to short-circuit the body of write_midx_included_packs() when no packs remain in the case of an empty repository. The MIDX code does not handle this, so avoid trying to generate a MIDX covering zero packs in the first place. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 21:20:56 -07:00
Taylor Blau	5f18e31f46	builtin/repack.c: extract showing progress to a variable We only ask whether stderr is a tty before calling 'prune_packed_objects()', but the subsequent patch will add another use. Extract this check into a variable so that both can use it without having to call 'isatty()' twice. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 21:20:56 -07:00
Taylor Blau	a169166d2b	builtin/repack.c: rename variables that deal with non-kept packs The new variable `existing_kept_packs` (and corresponding parameter `fname_kept_list`) added by the previous patch make it seem like `existing_packs` and `fname_list` are each subsets of the other two respectively. In reality, each pair is disjoint: one stores the packs without .keep files, and the other stores the packs with .keep files. Rename each to more clearly reflect this. Suggested-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 21:20:56 -07:00
Taylor Blau	90f838bc36	builtin/repack.c: keep track of existing packs unconditionally In order to be able to write a multi-pack index during repacking, `git repack` must keep track of which packs it wants to write into the MIDX. This set is the union of existing packs which will not be deleted, new pack(s) generated as a result of the repack, and .keep packs. Prior to this patch, `git repack` populated the list of existing packs only when repacking all-into-one (i.e., with `-A` or `-a`), but we will soon need to know this list when repacking when writing a MIDX without a-i-o. Populate the list of existing packs unconditionally, and guard removing packs from that list only when repacking a-i-o. Additionally, keep track of filenames of kept packs separately, since this, too, will be used in an upcoming patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 21:20:56 -07:00
Taylor Blau	08944d1c22	midx: preliminary support for `--refs-snapshot` To figure out which commits we can write a bitmap for, the multi-pack index/bitmap code does a reachability traversal, marking any commit which can be found in the MIDX as eligible to receive a bitmap. This approach will cause a problem when multi-pack bitmaps are able to be generated from `git repack`, since the reference tips can change during the repack. Even though we ignore commits that don't exist in the MIDX (when doing a scan of the ref tips), it's possible that a commit in the MIDX reaches something that isn't. This can happen when a multi-pack index contains some pack which refers to loose objects (e.g., if a pack was pushed after starting the repack but before generating the MIDX which depends on an object which is stored as loose in the repository, and by definition isn't included in the multi-pack index). By taking a snapshot of the references before we start repacking, we can close that race window. In the above scenario (where we have a packed object pointing at a loose one), we'll either (a) take a snapshot of the references before seeing the packed one, or (b) take it after, at which point we can guarantee that the loose object will be packed and included in the MIDX. This patch does just that. It writes a temporary "reference snapshot", which is a list of OIDs that are at the ref tips before writing a multi-pack bitmap. References that are "preferred" (i.e., are a suffix of at least one value of the 'pack.preferBitmapTips' configuration) are marked with a special '+'. The format is simple: one line per commit at each tip, with an optional '+' at the beginning (for preferred references, as described above). When provided, the reference snapshot is used to drive bitmap selection instead of the MIDX code doing its own traversal. When it isn't provided, the usual traversal takes place instead. 
Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 21:20:56 -07:00
Taylor Blau	6fb22ca463	builtin/multi-pack-index.c: support `--stdin-packs` mode To power a new `--write-midx` mode, `git repack` will want to write a multi-pack index containing a certain set of packs in the repository. This new option will be used by `git repack` to write a MIDX which contains only the packs which will survive after the repack (that is, it will exclude any packs which are about to be deleted). This patch effectively exposes the function implemented in the previous commit via the `git multi-pack-index` builtin. An alternative approach would have been to call that function from the `git repack` builtin directly, but this introduces awkward problems around closing and reopening the object store, so the MIDX will be written out-of-process. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 21:20:55 -07:00
Taylor Blau	56d863e979	midx: expose `write_midx_file_only()` publicly Expose a variant of the write_midx_file() function which ignores packs that aren't included in an explicit "allow" list. This will be used in an upcoming patch to power a new `--stdin-packs` mode of `git multi-pack-index write` for callers that only want to include certain packs in a MIDX (and ignore any packs which may have happened to enter the repository independently, e.g., from pushes). Those patches will provide test coverage for this new function. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 21:20:55 -07:00
Carlo Marcelo Arenas Belón	ebd2e4a13a	Makefile: restrict -Wpedantic and -Wno-pedantic-ms-format better `6a8cbc41ba` (developer: enable pedantic by default, 2021-09-03) enables pedantic mode in as many compilers as possible to help gather feedback on future tightening, so let's do so. -Wpedantic is missing in some really old gcc 4 versions so let's restrict it to gcc5 and clang4 (it does work in clang3 AFAIK, but it will be unlikely that a developer will use such an old compiler anyway). MinGW gcc is the only one which has -Wno-pedantic-ms-format, and while that is available also in older compilers, the Windows SDK provides gcc10 so let's aim for that. Note that in order to target the flag to only Windows, additional changes were needed in config.mak.uname to propagate the OS detection which also did some minor refactoring, but which is functionally equivalent. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 21:15:53 -07:00
Ævar Arnfjörð Bjarmason	3b723f722d	parse-options.h: move PARSE_OPT_SHELL_EVAL between enums Fix a bad landmine of a bug which has been with us ever since PARSE_OPT_SHELL_EVAL was added in `47e9cd28f8` (parseopt: wrap rev-parse --parseopt usage for eval consumption, 2010-06-12). It's an argument to parse_options() and should therefore be in "enum parse_opt_flags", but it was added to the per-option "enum parse_opt_option_flags" by mistake. Therefore as soon as we'd have an enum member in the former that reached its value of "1 << 8" we'd run into a seemingly bizarre bug where that new option would turn on the unrelated PARSE_OPT_SHELL_EVAL in "git rev-parse --parseopt" by proxy. I manually checked that no other enum members suffered from such overlap, by setting the values to non-overlapping values, and making the relevant codepaths BUG() out if the given value was above/below the expected (excluding flags=0 in the case of "enum parse_opt_flags"). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 16:50:42 -07:00
Orgad Shaneh	6ffb990dc4	doc: fix capitalization in "git status --porcelain=v2" description The summary line had xy, while the description (and other sub-sections) has XY. Signed-off-by: Orgad Shaneh <orgads@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 16:29:04 -07:00
Junio C Hamano	b6b210c5e1	Merge branch 'jk/ref-paranoia' into jt/no-abuse-alternate-odb-for-submodules * jk/ref-paranoia: (71 commits) refs: drop "broken" flag from for_each_fullref_in() ref-filter: drop broken-ref code entirely ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN repack, prune: drop GIT_REF_PARANOIA settings refs: turn on GIT_REF_PARANOIA by default refs: omit dangling symrefs when using GIT_REF_PARANOIA refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag refs-internal.h: reorganize DO_FOR_EACH_* flag documentation refs-internal.h: move DO_FOR_EACH_* flags next to each other t5312: be more assertive about command failure t5312: test non-destructive repack t5312: create bogus ref as necessary t5312: drop "verbose" helper t5600: provide detached HEAD for corruption failures t5516: don't use HEAD ref for invalid ref-deletion tests t7900: clean up some more broken refs The eighth batch t0000: avoid masking git exit value through pipes tree-diff: fix leak when not HAVE_ALLOCA_H pack-revindex.h: correct the time complexity descriptions ...	2021-09-28 15:15:42 -07:00
Ævar Arnfjörð Bjarmason	750036c8f7	refs/ref-cache.[ch]: remove "incomplete" from create_dir_entry() Remove the now-unused "incomplete" parameter from create_dir_entry(), all its callers specify it as "1", so let's drop the "incomplete=0" case. The last caller to use it was search_for_subdir(), but that code was removed in the preceding commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 15:12:04 -07:00
Ævar Arnfjörð Bjarmason	5e4546d599	refs/ref-cache.c: remove "mkdir" parameter from find_containing_dir() Remove the "mkdir" parameter from the find_containing_dir() function, the add_ref_entry() function removed in the preceding commit was its last user. Since "mkdir" is always "0" we can also remove the parameter from search_for_subdir(), which in turn means that we can delete most of that function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 15:12:04 -07:00
Ævar Arnfjörð Bjarmason	6a99fa2e9e	refs/ref-cache.[ch]: remove unused add_ref_entry() This function has not been used since `9dd389f3d8` (packed_ref_store: get rid of the `ref_cache` entirely, 2017-09-25). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 15:12:04 -07:00
Ævar Arnfjörð Bjarmason	34e8a20d76	refs/ref-cache.[ch]: remove unused remove_entry_from_dir() This function was missed in `9939b33d6a` (packed-backend: rip out some now-unused code, 2017-09-08), and has been orphaned since then. Let's delete it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 15:12:04 -07:00
Ævar Arnfjörð Bjarmason	98961e42f0	refs.[ch]: remove unused ref_storage_backend_exists() This function was added in `3dce444f17` (refs: add a backend method structure, 2016-09-04), but has never been used by anything. The only caller that might care uses find_ref_storage_backend() directly. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 15:12:04 -07:00
Ævar Arnfjörð Bjarmason	73c5f67071	config.c: remove unused git_config_key_is_valid() The git_config_key_is_valid() function got left behind in a refactoring in `a9bcf6586d` (alias: use the early config machinery to expand aliases, 2017-06-14). It previously had two users when it was added in `9e9de18f1a` (config: silence warnings for command names with invalid keys, 2015-08-24), and after `6a1e1bc0a1` (pager: use callbacks instead of configset, 2016-09-12) only one remained. By removing it we can get rid of the "quiet" branches in this function, as well as cases where "store_key" is NULL, for which there are no other users. Out of the 5 callers of git_config_parse_key() only one needs to pass a non-NULL "size_t *baselen_", so we could remove the third parameter from the public interface. I did not find that potential simplification to be worthwhile. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 14:54:15 -07:00
Ævar Arnfjörð Bjarmason	abf897bacd	string-list.[ch]: remove string_list_init() compatibility function Remove this function left over to accommodate in-flight changes, see `770fedaf9f` (string-list.[ch]: add a string_list_init_{nodup,dup}(), 2021-07-01) for the recent change to add "string_list_init_{nodup,dup}()" initializers. There was only one user of the API left in remote-curl.c. I don't know why I didn't include this change to remote-curl.c in `bc40dfb10a` (string-list.h users: change to use *_{nodup,dup}(), 2021-07-01), perhaps I just missed it. In any case, let's change that one user to use the new API, as of writing this there are no in-flight changes that use it, so this seems like a good time to drop this before we get any new users of this compatibility API. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 14:43:38 -07:00
Junio C Hamano	cefe983a32	The ninth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 13:06:53 -07:00
Junio C Hamano	45d141a1dd	Merge branch 'en/typofixes' Typofixes. * en/typofixes: merge-ort: fix completely wrong comment trace2.h: fix trivial comment typo	2021-09-28 13:06:53 -07:00
Junio C Hamano	3d875f96f1	Merge branch 'cb/unicode-14' The unicode character width table (used for output alignment) has been updated. * cb/unicode-14: unicode: update the width tables to Unicode 14	2021-09-28 13:06:53 -07:00
Junio C Hamano	bb1677fc29	Merge branch 'jk/reduce-malloc-in-v2-servers' Code cleanup to limit memory consumption and tighten protocol message parsing. * jk/reduce-malloc-in-v2-servers: ls-refs: reject unknown arguments serve: reject commands used as capabilities serve: reject bogus v2 "command=ls-refs=foo" docs/protocol-v2: clarify some ls-refs ref-prefix details ls-refs: ignore very long ref-prefix counts serve: drop "keys" strvec serve: provide "receive" function for session-id capability serve: provide "receive" function for object-format capability serve: add "receive" method for v2 capabilities table serve: return capability "value" from get_capability() serve: rename is_command() to parse_command()	2021-09-28 13:06:53 -07:00
Derrick Stolee	6579e788c0	advice: update message to suggest '--sparse' The previous changes modified the behavior of 'git add', 'git rm', and 'git mv' to not adjust paths outside the sparse-checkout cone, even if they exist in the working tree and their cache entries lack the SKIP_WORKTREE bit. The intention is to warn users that they are doing something potentially dangerous. The '--sparse' option was added to each command to allow careful users the same ability they had before. To improve the discoverability of this new functionality, add a message to advice.updateSparsePath that mentions the existence of the option. The previous set of changes also modified the purpose of this message to include possibly a list of paths instead of only a list of pathspecs. Make the warning message more clear about this new behavior. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 10:31:02 -07:00
Derrick Stolee	93d2c16041	mv: refuse to move sparse paths Since cmd_mv() does not operate on cache entries and instead directly checks the filesystem, we can only use path_in_sparse_checkout() as a mechanism for seeing if a path is sparse or not. Be sure to skip returning a failure if '-k' is specified. To ensure that the advice around sparse paths is the only reason a move failed, be sure to check this as the very last thing before inserting into the src_for_dst list. The tests cover a variety of cases such as whether the target is tracked or untracked, and whether the source or destination are in or outside of the sparse-checkout definition. Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 10:31:02 -07:00
Derrick Stolee	d7c4415e55	rm: skip sparse paths with missing SKIP_WORKTREE If a path does not match the sparse-checkout cone but is somehow missing the SKIP_WORKTREE bit, then 'git rm' currently succeeds in removing the file. One reason a user might be in this situation is a merge conflict outside of the sparse-checkout cone. Removing such a file might be problematic for users who are not sure what they are doing. Add a check to path_in_sparse_checkout() when 'git rm' is checking if a path should be considered for deletion. Of course, this check is ignored if the '--sparse' option is specified, allowing users who accept the risks to continue with the removal. This also removes a confusing behavior where a user asks for a directory to be removed, but only the entries that are within the sparse-checkout definition are removed. Now, 'git rm <dir>' will fail without '--sparse' and will succeed in removing all contained paths with '--sparse'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 10:31:02 -07:00
Derrick Stolee	f9786f9b85	rm: add --sparse option As we did previously in 'git add', add a '--sparse' option to 'git rm' that allows modifying paths outside of the sparse-checkout definition. The existing checks in 'git rm' are restricted to tracked files that have the SKIP_WORKTREE bit in the current index. Future changes will cause 'git rm' to reject removing paths outside of the sparse-checkout definition, even if they are untracked or do not have the SKIP_WORKTREE bit. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 10:31:02 -07:00
Derrick Stolee	61d450f049	add: update --renormalize to skip sparse paths We added checks for path_in_sparse_checkout() to portions of 'git add' that add warnings and prevent staging a modification, but we skipped the --renormalize mode. Update renormalize_tracked_files() to ignore cache entries whose path is outside of the sparse-checkout cone (unless --sparse is provided). Add a test in t3705. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 10:31:02 -07:00
Derrick Stolee	63b60b3add	add: update --chmod to skip sparse paths We added checks for path_in_sparse_checkout() to portions of 'git add' that add warnings and prevent staging a modification, but we skipped the --chmod mode. Update chmod_pathspec() to ignore cache entries whose path is outside of the sparse-checkout cone (unless --sparse is provided). Add a test in t3705. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 10:31:02 -07:00
Derrick Stolee	0299a69694	add: implement the --sparse option We previously modified 'git add' to refuse updating index entries outside of the sparse-checkout cone. This is justified to prevent users from accidentally getting into a confusing state when Git removes those files from the working tree at some later point. Unfortunately, this caused some workflows that were previously possible to become impossible, especially around merge conflicts outside of the sparse-checkout cone. These were documented in tests within t1092. We now re-enable these workflows using a new '--sparse' option to 'git add'. This allows users to signal "Yes, I do know what I'm doing with these files," and accept the consequences of the files leaving the worktree later. We delay updating the advice message until implementing a similar option in 'git rm' and 'git mv'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-28 10:31:02 -07:00
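
The enum overlap described in 3b723f722d above can be sketched in C. This is a minimal illustration under stated assumptions, not Git's code: the enum names, the member names (`CALL_NEW_FEATURE`, `OPT_SHELL_EVAL` and the others) and the helper `shell_eval_enabled()` are hypothetical. Only the value `1 << 8` is taken from the commit message.

```c
/* Hypothetical names for illustration; these are not Git's definitions. */

/* Flags that apply to one whole parser invocation. */
enum call_flags {
	CALL_KEEP_DASHDASH = 1 << 0,
	CALL_STOP_AT_NON_OPTION = 1 << 1,
	CALL_NEW_FEATURE = 1 << 8 /* a later addition that reaches bit 8 */
};

/* Flags that apply to a single option. */
enum option_flags {
	OPT_OPTARG = 1 << 0,
	OPT_NOARG = 1 << 1,
	OPT_SHELL_EVAL = 1 << 8 /* misplaced: it is really a call-level flag */
};

/*
 * The check is made against the call-level flags word, but the constant
 * comes from the per-option enum, so any call-level flag that shares
 * bit 8 switches it on as a side effect.
 */
int shell_eval_enabled(int flags)
{
	return (flags & OPT_SHELL_EVAL) != 0;
}
```

With these definitions, passing only `CALL_NEW_FEATURE` makes `shell_eval_enabled()` return nonzero although shell-eval output was never requested. Keeping each flag in the enum whose flags word it is tested against avoids the collision.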


65095 Commits