git-commit-vandalism

Author	SHA1	Message	Date
Eric Wong	33f379eee6	make object_directory.loose_objects_subdir_seen a bitmap There's no point in using 8 bits per-directory when 1 bit will do. This saves us 224 bytes per object directory, which ends up being 22MB when dealing with 100K alternates. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-07 21:27:58 -07:00
Eric Wong	407532f82d	avoid strlen via strbuf_addstr in link_alt_odb_entry We can save a few milliseconds (across 100K odbs) by using strbuf_addbuf() instead of strbuf_addstr() by passing `entry' as a strbuf pointer rather than a "const char *". Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-07 21:27:56 -07:00
Eric Wong	cf2dc1c238	speed up alt_odb_usable() with many alternates With many alternates, the duplicate check in alt_odb_usable() wastes many cycles doing repeated fspathcmp() on every existing alternate. Use a khash to speed up lookups by odb->path. Since the kh_put_* API uses the supplied key without duplicating it, we also take advantage of it to replace both xstrdup() and strbuf_release() in link_alt_odb_entry() with strbuf_detach() to avoid the allocation and copy. In a test repository with 50K alternates and each of those 50K alternates having one alternate each (for a total of 100K total alternates); this speeds up lookup of a non-existent blob from over 16 minutes to roughly 2.7 seconds on my busy workstation. Note: all underlying git object directories were small and unpacked with only loose objects and no packs. Having to load packs increases times significantly. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-07 17:21:12 -07:00
Ævar Arnfjörð Bjarmason	351bca2d1f	imap-send.c: use less verbose strbuf_fread() idiom When looking for things that hardcoded a non-zero "hint" parameter to strbuf_fread() I discovered that since `f2561fda36` (Add git-imap-send, derived from isync 1.0.1., 2006-03-10) we've been passing a hardcoded 4096 in imap-send.c to read stdin. Since we're not doing anything unusual here let's use a less verbose pattern used in a lot of other places (the hint of "0" will default to 8192). We don't need to take a FILE * here either, so we can use "0" instead of "stdin". While we're at it improve the error message if we can't read the input to use error_errno(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-07 14:29:12 -07:00
Atharva Raykar	84069fcc14	t7400: test failure to add submodule in tracked path Add a test to ensure failure on adding a submodule to a directory with tracked contents in the index. As we are going to refactor and port to C some parts of `git submodule add`, let's add a test to help ensure no regression is introduced. Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Based-on-patch-by: Shourya Shukla <periperidip@gmail.com> Mentored-by: Shourya Shukla <periperidip@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-07 11:00:01 -07:00
Andrew Berry	e04170697a	docs: .gitignore parsing is to the top of the repo The current documentation reads as if .gitignore files will be parsed in every parent directory, and not until they reach a repository boundary. This clarifies the current behaviour. As well, this corrects 'toplevel' to 'top-level', matching usage for 'top-level domain'. Signed-off-by: Andrew Berry <andrew@furrypaws.ca> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 17:20:02 -07:00
Atharva Raykar	6a24cc71ed	submodule--helper: remove redundant include "dir.h" should have been included only once. Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Shourya Shukla <periperidip@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 14:01:08 -07:00
Andrei Rybak	d5659f856f	help: convert git_cmd to page in one place Depending on the chosen format of help pages, git-help uses function show_man_page, show_info_page, or show_html_page. The first thing all three functions do is to convert given `git_cmd` to a `page` using function cmd_to_page. Move the common part of these three functions to function cmd_help to avoid code duplication. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Reviewed-by: Felipe Contreras <felipe.contreras@gmail.com> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 13:09:20 -07:00
René Scharfe	5632e838f8	khash: clarify that allocations never fail We use our standard allocation functions and macros (xcalloc, ALLOC_ARRAY, REALLOC_ARRAY) in our version of khash.h. They terminate the program on error instead, so code that's using them doesn't have to handle allocation failures. Make this behavior explicit by turning kh_resize_ into a void function and removing the related unreachable error handling code. Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 13:07:50 -07:00
Han-Wen Nienhuys	52a47aea70	t7509: avoid direct file access for writing CHERRY_PICK_HEAD Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:56:38 -07:00
Han-Wen Nienhuys	ae815940f6	t1415: avoid direct filesystem access for writing refs Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:56:38 -07:00
Đoàn Trần Công Danh	ba8c2680ec	t6402: preserve git exit status code In t6402, we're checking number of files in the index and the working tree by piping the output of Git's command to "wc -l", thus losing the exit status code of git. Let's use the new helper test_stdout_line_count in order to preserve Git's exit status code. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:24:12 -07:00
Đoàn Trần Công Danh	66c9562013	t6400: preserve git ls-files exit status code In t6400, we're checking number of files in the index and the working tree by piping the output of "git ls-files" to "wc -l", thus losing the exit status code of git. Let's use the newly introduced test_stdout_line_count in order to check the exit status code of Git's command. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:24:11 -07:00
Đoàn Trần Công Danh	cdff1bb5a3	test-lib-functions: introduce test_stdout_line_count In some tests, we're checking the number of lines in output of some commands, including but not limited to Git's command. We're doing the check by running those commands in the left side of a pipe, thus losing the exit status code of those commands. Meanwhile, we really want to check the exit status code of Git's command. Let's write the output of those commands to a temporary file, and use test_line_count separately in order to check exit status code of those commands properly. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:22:27 -07:00
Johannes Schindelin	0dc787a9f2	ci: accelerate the checkout By upgrading from v1 to v2 of `actions/checkout`, we avoid fetching all the tags and the complete history: v2 only fetches one revision by default. This should make things a lot faster. Note that `actions/checkout@v2` seems to be incompatible with running in containers: https://github.com/actions/checkout/issues/151. Therefore, we stick with v1 there. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:20:58 -07:00
Dennis Ameling	9ab0b66129	ci (vs-build): build with NO_GETTEXT We already build Git for Windows with `NO_GETTEXT` when compiling with GCC. Let's do the same with Visual C, too. Note that we do not technically _need_ to pass `NO_GETTEXT` explicitly in that `make artifacts-tar` invocation because we do this while `MSVC` is set (which will set `uname_S := Windows`, which in turn will set `NO_GETTEXT = YesPlease`). But it is definitely nicer to be explicit here. Signed-off-by: Dennis Ameling <dennis@dennisameling.com> Helped-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:20:58 -07:00
Johannes Schindelin	4348824059	artifacts-tar: respect NO_GETTEXT We obviously do not want to bundle `.mo` files during `make artifacts-tar NO_GETTEXT=Yep`, but that was the case. To fix that, go a step beyond just fixing the symptom, and simply define the lists of `.po` and `.mo` files as empty if `NO_GETTEXT` is set. Helped-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:20:58 -07:00
Johannes Schindelin	d681d0dc3a	ci (windows): transfer also the Git-tracked files to the test jobs Git's test suite is excruciatingly slow on Windows, mainly due to the fact that it executes a lot of shell script code, and that's simply not native to Windows. To help with that, we established the pattern where the artifacts are first built in one job, and then multiple test jobs run in parallel using the artifacts built in the first job. We take pains in transferring only the build outputs, and letting `actions/checkout` fill in the rest of the files. One major downside of that strategy is that the test jobs might fail to check out the intended revision (e.g. because the branch has been updated while the build was running, as is frequently the case with the `seen` branch). Let's transfer also the files tracked by Git, and skip the checkout step in the test jobs. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:20:58 -07:00
Ævar Arnfjörð Bjarmason	10b635b773	bundle: remove "ref_list" in favor of string-list.c API Move away from the "struct ref_list" in bundle.c in favor of the almost identical string-list.c API. That API fits this use-case perfectly, but did not exist in its current form when this code was added in `2e0afafebd` (Add git-bundle: move objects and references by archive, 2007-02-22), with hindsight we could have used the path-list API, which later got renamed to string-list. See `8fd2cb4069` (Extract helper bits from c-merge-recursive work, 2006-07-25) We need to change "name" to "string" and "oid" to "util" to make this conversion, but other than that the APIs are pretty much identical for what bundle.c made use of. Let's also replace the memset(..,0,...) pattern with a more idiomatic "INIT" macro, and finally add a *_release() function so to free the allocated memory. Before this the add_to_ref_list() would leak memory, now e.g. "bundle list-heads" reports no memory leaks at all under valgrind. In the bundle_header_init() function we're using a clever trick to memcpy() what we'd get from the corresponding BUNDLE_HEADER_INIT. There is a concurrent series to make use of that pattern more generally, see [1]. 1. https://lore.kernel.org/git/cover-0.5-00000000000-20210701T104855Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:10:17 -07:00
Ævar Arnfjörð Bjarmason	15e7c7dca6	bundle.c: use a temporary variable for OIDs and names In preparation for moving away from accessing the OID and name via the "oid" and "name" slots in a subsequent commit, change the code that accesses it to use named variables. This makes the subsequent change smaller. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:10:16 -07:00
Ævar Arnfjörð Bjarmason	db6bfb9fe8	bundle cmd: stop leaking memory from parse_options_cmd_bundle() Fix a memory leak from the prefix_filename() function introduced with its use in `3b754eedd5` (bundle: use prefix_filename with bundle path, 2017-03-20). As noted in that commit the leak was intentional as a part of being sloppy about freeing resources just before we exit, I'm changing this because I'll be fixing other memory leaks in the bundle API (including the library version) in subsequent commits. It's easier to reason about those fixes if valgrind runs cleanly at the end without any leaks whatsoever. An earlier version of this change[1] went out of its way to not leak memory on the die() codepaths here, but doing so will only avoid reports of potential leaks under heap-only leak trackers such as valgrind, not the SANITIZE=leak mode. Avoiding those leaks as well might be useful to enable us to run cleanly under the likes of valgrind in the future. But for now the relative verbosity of the resulting code, and the fact that we don't have some valgrind or SANITIZE=leak mode as part of our CI (it's only run ad-hoc, see [2]), means we're not worrying about that for now. 1. https://lore.kernel.org/git/87v95vdxrc.fsf@evledraar.gmail.com/ 2. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-06 12:10:16 -07:00
Bagas Sanjaya	f3abf6aa63	l10n: id.po: fix mismatched variable names Jiang Xin notified me [1] to fix some typos. Running git-po-helper [2] check, it found following: 1. git submodule--helper typo (was git submodule==helper) 2. new_index improper translation (DO NOT translate instead) Fix them. [1]: https://lore.kernel.org/git/20210703111837.14894-1-worldhello.net@gmail.com/ [2]: `e44df847ab` Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>	2021-07-05 10:02:34 +08:00
Patrick Steinhardt	3663e5904d	perf: fix when running with TEST_OUTPUT_DIRECTORY When the TEST_OUTPUT_DIRECTORY is defined, then all test data will be written in that directory instead of the default directory located in "t/". While this works as expected for our normal tests, performance tests fail to locate and aggregate performance data because they don't know to handle TEST_OUTPUT_DIRECTORY correctly and always look at the default location. Fix the issue by adding a `--results-dir` parameter to "aggregate.perl" which identifies the directory where results are and by making the "run" script awake of the TEST_OUTPUT_DIRECTORY variable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-02 15:47:30 -07:00
Ævar Arnfjörð Bjarmason	bc40dfb10a	string-list.h users: change to use *_{nodup,dup}() Change all in-tree users of the string_list_init(LIST, BOOL) API to use string_list_init_{nodup,dup}(LIST) instead. As noted in the preceding commit let's leave the now-unused string_list_init() wrapper in-place for any in-flight users, it can be removed at some later date. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-01 12:32:22 -07:00
Ævar Arnfjörð Bjarmason	770fedaf9f	string-list.[ch]: add a string_list_init_{nodup,dup}() In order to use the new "memcpy() a 'blank' struct on the stack" pattern for string_list_init(), and to make the macro initialization consistent with the function initialization introduce two new string_list_init_{nodup,dup}() functions. These are like the old string_list_init() when called with a false and true second argument, respectively. I think this not only makes things more consistent, but also easier to read. I often had to lookup what the ", 0)" or ", 1)" in these invocations meant, now it's right there in the function name, and corresponds to the macros. A subsequent commit will convert existing API users to this pattern, but as this is a very common API let's leave a compatibility function in place for later removal. This intermediate state also proves that the compatibility function works. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-01 12:32:22 -07:00
Ævar Arnfjörð Bjarmason	ce93a4c612	dir.[ch]: replace dir_init() with DIR_INIT Remove the dir_init() function and replace it with a DIR_INIT macro. In many cases in the codebase we need to initialize things with a function for good reasons, e.g. needing to call another function on initialization. The "dir_init()" function was not one such case, and could trivially be replaced with a more idiomatic macro initialization pattern. The only place where we made use of its use of memset() was in dir_clear() itself, which resets the contents of an an existing struct pointer. Let's use the new "memcpy() a 'blank' struct on the stack" idiom to do that reset. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-01 12:32:22 -07:00
Ævar Arnfjörð Bjarmason	5726a6b401	.c _init(): define in terms of corresponding _INIT macro Change the common patter in the codebase of duplicating the initialization logic between an _INIT macro and a corresponding _init() function to use the macro as the canonical source of truth. Now we no longer need to keep the function up-to-date with the macro version. This implements a suggestion by Jeff King who found that under -O2 [1] modern compilers will init new version in place without the extra copy[1]. The performance of a single _init() won't matter in most cases, but even if it does we're going to be producing efficient machine code to perform these operations. 1. https://lore.kernel.org/git/YNyrDxUO1PlGJvCn@coredump.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-01 12:32:22 -07:00
Ævar Arnfjörð Bjarmason	3d97ea479f	.h: move some _INIT to designated initializers Move *_INIT macros I'll use in a subsequent commits to designated initializers. This isn't required for those follow-up changes, but since next commits will change things in this area, let's use the modern pattern over the old one while we're at it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-01 12:31:45 -07:00
Jeff King	560bf51892	test-lib: avoid accidental globbing in match_pattern_list() We have a custom match_pattern_list() function which we use for matching test names (like "t1234") against glob-like patterns (like "t1???") for $GIT_SKIP_TESTS, --verbose-only, etc. Those patterns may have multiple whitespace-separated elements (e.g., "t0* t1234 t5?78"). The callers of match_pattern_list thus pass the strings unquoted, so that the shell does the usual field-splitting into separate arguments. But this also means the shell will do the usual globbing for each argument, which can result in us seeing an expansion based on what's in the filesystem, rather than the real pattern. For example, if I have the path "t5000" in the filesystem, and you feed the pattern "t?000", that _should_ match the string "t0000", but it won't after the shell has expanded it to "t5000". This has been a bug ever since that function was introduced. But it didn't usually trigger since we typically use the function inside the trash directory, which has a very limited set of files that are unlikely to match. It became a lot easier to trigger after `edc23840b0` (test-lib: bring $remove_trash out of retirement, 2021-05-10), because now we match $GIT_SKIP_TESTS before even entering the trash directory. So the t5000 example above can be seen with: GIT_SKIP_TESTS=t?000 ./t0000-basic.sh which should skip all tests but doesn't. We can fix this by using "set -f" to ask the shell not to glob (which is in POSIX, so should hopefully be portable enough). We only want to do this in a subshell (to avoid polluting the rest of the script), which means we need to get the whole string intact into the match_pattern_list function by quoting it. Arguably this is a good idea anyway, since it makes it much more obvious that we intend to split, and it's not simply sloppy scripting. Diagnosed-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-01 12:29:32 -07:00
Ævar Arnfjörð Bjarmason	60fadf8bd2	fetch: document the --negotiate-only option There was no documentation for the --negotiate-only option added in `9c1e657a8f` (fetch: teach independent negotiation (no packfile), 2021-05-04), only documentation for the related push.negotiation option added in the following commit in `477673d6f3` (send-pack: support push negotiation, 2021-05-04). Let's document it, and update the cross-linking I'd added between --negotiation-tip=* and 'fetch.negotiationAlgorithm' in `526608284a` (fetch doc: cross-link two new negotiation options, 2018-08-01). I think it would be better to say "in common with the remote" here than "...the server", but the documentation for --negotiation-tip=* above this talks about "the server", so let's continue doing that in this related option. See `3390e42adb` (fetch-pack: support negotiation tip whitelist, 2018-07-02) for that documentation. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:57:22 -07:00
Ævar Arnfjörð Bjarmason	1e5b5ea538	send-pack.c: move "no refs in common" abort earlier Move the early return if we have no remote refs in send_pack() earlier. When this was added in `4c353e890c` (Warn when send-pack does nothing, 2005-12-04) one of the first things we'd do was to abort, but as of `cfee10a773` (send-pack/receive-pack: allow errors to be reported back to pusher., 2005-12-25) we've added numerous server_supports() conditions that are acted on later in the function, that won't be used if we don't have remote refs. Then as of `477673d6f3` (send-pack: support push negotiation, 2021-05-04) we started doing even more work on the assumption that we had some remote refs to feed to --negotiation-tip=* options. We only hit this condition if we have nothing to push, so we don't need to consider "push.negotiate" etc. only to do nothing with that information. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:57:22 -07:00
Elijah Newren	3585d0ea23	merge-recursive: handle rename-to-self case Directory rename detection can cause transitive renames, e.g. if the two different sides of history each do one half of: A/file -> B/file B/ -> C/ then directory rename detection transitively renames to give us A/file -> C/file However, when C/ == A/, note that this gives us A/file -> A/file. merge-recursive assumed that any rename D -> E would have D != E. While that is almost always true, the above is a special case where it is not. So we cannot do things like delete the rename source, we cannot assume that a file existing at path E implies a rename/add conflict and we have to be careful about what stages end up in the output. This change feels a bit hackish. It took me surprisingly many hours to find, and given merge-recursive's design causing it to attempt to enumerate all combinations of edge and corner cases with special code for each combination, I'm worried there are other similar fixes needed elsewhere if we can just come up with the right special testcase. Perhaps an audit would rule it out, but I have not the energy. merge-recursive deserves to die, and since it is on its way out anyway, fixing this particular bug narrowly will have to be good enough. Reported-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:40:10 -07:00
Elijah Newren	a492d5331c	merge-ort: ensure we consult df_conflict and path_conflicts Path conflicts (typically rename path conflicts, e.g. rename/rename(1to2) or rename/add/delete), and directory/file conflicts should obviously result in files not being marked as clean in the merge. We had a codepath where we missed consulting the path_conflict and df_conflict flags, based on match_mask. Granted, it requires an unusual setup to trigger this codepath (directory rename causing rename-to-self is the only case I can think of), but we still need to handle it. To make it clear that we have audited the other codepaths that do not explicitly mention these flags, add some assertions that the flags are not set. Reported-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:40:10 -07:00
Elijah Newren	806f83287f	t6423: test directory renames causing rename-to-self Directory rename detection can cause transitive renames, e.g. if the two different sides of history each do one half of: A/file -> B/file B/ -> C/ then directory rename detection transitively renames to give us C/file. Since the default for merge.directoryRenames is conflict, this results in an error message saying it is unclear whether the file should be placed at B/file or C/file. What if C/ is A/, though? In such a case, the transitive rename would give us A/file, the original name we started with. Logically, having an error message with B/file vs. A/file should be fine, as should leaving the file where it started. But the logic in both merge-recursive and merge-ort did not handle a case of a filename being renamed to itself correctly; merge-recursive had two bugs, and merge-ort had one. Add some testcases covering such a scenario. Based-on-testcase-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:40:09 -07:00
René Scharfe	fe7fe62d8d	grep: report missing left operand of --and Git grep allows combining two patterns with --and. It checks and reports if the second pattern is missing when compiling the expression. A missing first pattern, however, is only reported later at match time. Thus no error is returned if no matching is done, e.g. because no file matches the also given pathspec. When that happens we get an expression tree with an GREP_NODE_AND node and a NULL pointer to the missing left child. free_pattern_expr() tries to dereference it during the cleanup at the end, which results in a segmentation fault. Fix this by verifying the presence of the left operand at expression compilation time. Reported-by: Matthew Hughes <matthewhughes934@gmail.com> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:19:03 -07:00
Eric Wong	dc05929411	xmmap: inform Linux users of tuning knobs on ENOMEM Linux users may benefit from additional information on how to avoid ENOMEM from mmap despite the system having enough RAM to accomodate them. We can't reliably unmap pack windows to work around the issue since malloc and other library routines may mmap without our knowledge. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-29 23:14:25 -07:00
Ævar Arnfjörð Bjarmason	c49a177bec	test-lib.sh: set COLUMNS=80 for --verbose repeatability Some tests will fail under --verbose because while we've unset COLUMNS since `b1d645b58a` (tests: unset COLUMNS inherited from environment, 2012-03-27), we also look for the columns with an ioctl(.., TIOCGWINSZ, ...) on some platforms. By setting COLUMNS again we preempt the TIOCGWINSZ lookup in pager.c's term_columns(), it'll take COLUMNS over TIOCGWINSZ, This fixes t0500-progress-display.sh., which broke because of a combination of the this issue and the progress output reacting to the column width since `545dc345eb` (progress: break too long progress bar lines, 2019-04-12). The t5324-split-commit-graph.sh fails in a similar manner due to progress output, see [1] for details. The issue is not specific to progress.c, the diff code also checks COLUMNS and some of its tests can be made to fail in a similar manner[2], anything that invokes a pager is potentially affected. See `ea77e675e5` (Make "git help" react to window size correctly, 2005-12-18) and `ad6c3739a3` (pager: find out the terminal width before spawning the pager, 2012-02-12) for how the TIOCGWINSZ code ended up in pager.c 1. http://lore.kernel.org/git/20210624051253.GG6312@szeder.dev 2. https://lore.kernel.org/git/20210627074419.GH6312@szeder.dev/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-29 13:06:30 -07:00
Ævar Arnfjörð Bjarmason	7b76d6bf22	Makefile: add and use the ".DELETE_ON_ERROR" flag Use the GNU make ".DELETE_ON_ERROR" flag in our main Makefile, as we already do in the Documentation/Makefile since `db10fc6c09` (doc: simplify Makefile using .DELETE_ON_ERROR, 2021-05-21). Now if a command to make X fails X will be removed, the default behavior of GNU make is to only do so if "make" itself is interrupted with a signal. E.g. if we now intentionally break one of the rules with: - mv $@+ $@ + mv $@+ $@ && \ + false We'll get output like: $ make git CC git.o LINK git make: * [Makefile:2179: git] Error 1 make: * Deleting file 'git' $ file git git: cannot open `git' (No such file or directory) Before this change we'd leave the file in place in under this scenario. As in `db10fc6c09` this allows us to remove patterns of removing leftover $@ files at the start of rules, since previous failing runs of the Makefile won't have left those littered around anymore. I'm not as confident that we should be replacing the "mv $@+ $@" pattern entirely, since that means that external programs or one of our other Makefiles might race and get partial content. I'm not changing $(REMOTE_CURL_ALIASES) since that uses a ln/ln -s/cp dance, and would require the addition of "-f" flags if the "rm" at the start was removed. I've also got plans to fix that ln/ln -s/cp pattern in another series. For $(LIB_FILE) and $(XDIFF_LIB) we can rely on the "c" (create) being present in ARFLAGS. I'm not changing "$(ETAGS_TARGET)", "tags" and "cscope" because they've got a messy combination of removing "$@+" not "$@" at the beginning, or "$@*". I'm also addressing those in another series. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-29 08:03:45 -07:00
Taylor Blau	f89ecf7988	midx: report checksum mismatches during 'verify' 'git multi-pack-index verify' inspects the data in an existing MIDX for correctness by checking that the recorded object offsets are correct, and so on. But it does not check that the file's trailing checksum matches the data that it records. So, if an on-disk corruption happened to occur in the final few bytes (and all other data was recorded correctly), we would: - get a clean result from 'git multi-pack-index verify', but - be unable to reuse the existing MIDX when writing a new one (since we now check for checksum mismatches before reusing a MIDX) Teach the 'verify' sub-command to recognize corruption in the checksum by calling midx_checksum_valid(). Suggested-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:36:17 -07:00
Taylor Blau	ec1e28ef9c	midx: don't reuse corrupt MIDXs when writing When writing a new multi-pack index, Git tries to reuse as much of the data from an existing MIDX as possible, like object offsets. This is done to avoid re-opening a bunch of *.idx files unnecessarily, but can lead to problems if the data we are reusing is corrupt. That's because we'll blindly reuse data from an existing MIDX without checking its trailing checksum for validity. So if there is memory corruption while writing a MIDX, or disk corruption in the intervening period between writing and reuse, we'll blindly propagate those bad values forward. Suppose we experience a memory corruption while writing a MIDX such that we write an incorrect object offset (or alternatively, the disk corrupts the data after being written, but before being reused). Then when we go to write a new MIDX, we'll reuse the bad object offset without checking its validity. This means that the MIDX we just wrote is broken, but its trailing checksum is in-tact, since we never bothered to look at the values before writing. In the above, a "git multi-pack-index verify" would have caught the problem before writing, but writing a new MIDX wouldn't have noticed anything wrong, blindly carrying forward the corrupt offset. Individual pack indexes check their validity by verifying the crc32 attached to each entry when carrying data forward during a repack. We could solve this problem for MIDXs in the same way, but individual crc32's don't make much sense, since their entries are so small. Likewise, checking the whole file on every read may be prohibitively expensive if a repository has a lot of objects, packs, or both. But we can check the trailing checksum when reusing an existing MIDX when writing a new one. And a corrupt MIDX need not stop us from writing a new one, since we can just avoid reusing the existing one at all and pretend as if we are writing a new MIDX from scratch. Suggested-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:36:17 -07:00
Taylor Blau	15316a4732	commit-graph: rewrite to use checksum_valid() Rewrite an existing caller in `git commit-graph verify` to take advantage of checksum_valid(). Note that the replacement isn't a verbatim cut-and-paste, since the new function avoids using hashfile at all and instead talks to the_hash_algo directly, but it is functionally equivalent. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:36:17 -07:00
Taylor Blau	f9221e2cf5	csum-file: introduce checksum_valid() Introduce a new function which checks the validity of a file's trailing checksum. This is similar to hashfd_check(), but different since it is intended to be used by callers who aren't writing the same data (like `git index-pack --verify`), but who instead want to validate the integrity of data that they are reading. Rewrite the first of two callers which could benefit from this new function in pack-check.c. Subsequent callers will be added in the following patches. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:36:17 -07:00
Johannes Schindelin	e9f79acb28	ci: upgrade to using actions/{up,down}load-artifacts v2 The GitHub Actions to upload/download workflow artifacts saw a major upgrade since Git's GitHub workflow was established. Let's use it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:35:40 -07:00
Johannes Schindelin	abb2b389f7	ci (vs-build): use `cmd` to copy the DLLs, not `powershell` We use a `.bat` script to copy the DLLs in the `vs-build` job, and those type of scripts are native to CMD, not to PowerShell. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:35:39 -07:00
Johannes Schindelin	0eb6c189a3	ci: use the new GitHub Action to download git-sdk-64-minimal In our continuous builds, Windows is the odd cookie that requires a complete development environment to be downloaded because there is no suitable one installed by default on Windows. Side note: technically, there _is_ a development environment present in GitHub Actions' build agents: MSYS2. But it differs from Git for Windows' SDK in subtle points, unfortunately enough so to prevent Git's test suite from running without failures. Traditionally, we support downloading this environment (which we nicknamed `git-sdk-64-minimal`) via a PowerShell scriptlet that accesses the build artifacts of a dedicated Azure Pipeline (which packages a tiny subset of the full Git for Windows SDK, containing just enough to build Git and run its test suite). This PowerShell script is unfortunately not very robust and sometimes fails due to network issues. Of course, we could add code to detect that situation, wait a little, try again, if it fails again wait a little longer, lather, rinse and repeat. Instead of doing all of this in Git's own `.github/workflows/`, though, let's offload this logic to the new GitHub Action at https://github.com/marketplace/actions/setup-git-for-windows-sdk This Action not only downloads and extracts git-sdk-64-minimal _outside_ the worktree (making it no longer necessary to meddle with `.gitignore` or `.git/info/exclude`), it also adds the `bash.exe` to the `PATH` and sets the environment variable `MSYSTEM` (an implementation detail that Git's workflow should never have needed to know about). This allows us to convert all those funny PowerShell tasks that wanted to call git-sdk-64-minimal's `bash.exe`: they all are now regular `bash` scriptlets. This finally lets us get rid of the funny quoting and escaping where we had to pay attention not only to quote and escape the Bash scriptlets properly, but also to add a second level of escaping (with backslashes for double quotes and backticks for dollar signs) to stop PowerShell from doing unintended things. Further, this Action uses a fast caching strategy native to GitHub Actions that should accelerate the download across CI runs: git-sdk-64-minimal is usually updated once per 24h, and needs to be cached only once within that period. Caching it (unfortunately only on a per-branch basis) speeds up the download step, and makes it much more robust at the same time by virtue of accessing a cache location that is closer in the network topology. With this we can drop the home-rolled caching where we try to accelerate the test phase by uploading git-sdk-64-minimal as a workflow artifact after using it to build Git, and then download it as workflow artifact in the test phase. Even better: the `vs-test` job no longer needs to depend on the `windows-build` job. The only reason it depended on it was to ensure that the `git-sdk-64-minimal` workflow artifact was available. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:35:39 -07:00
Jeff King	6afb265b96	add_ref_decoration(): rename s/type/deco_type/ Now that we have two types (a decoration type and an object type) in the function, let's give them both unique names to avoid accidentally using one instead of the other. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:32:32 -07:00
Jeff King	88473c8bae	load_ref_decorations(): avoid parsing non-tag objects When we load the ref decorations, we parse the object pointed to by each ref in order to get a "struct object". This is unnecessarily expensive; we really only need the object struct, and don't even look at the parsed contents. The exception is tags, which we do need to peel. We can improve this by looking up the object type first (which is much cheaper), and skipping the parse entirely for non-tags. This increases the work slightly for annotated tags (which now do a type lookup _and_ a parse), but decreases it a lot for other types. On balance, this seems to be a good tradeoff. In my git.git clone, with ~2k refs, most of which are branches, the time to run "git log -1 --decorate" drops from 34ms to 11ms. Even on my linux.git clone, which contains mostly tags and only a handful of branches, the time drops from 30ms to 19ms. And on a more extreme real-world case with ~220k refs, mostly non-tags, the time drops from 2.6s to 650ms. That command is a lop-sided example, of course, because it does as little non-loading work as possible. But it does show the absolute time improvement. Even in something like a full "git log --decorate" on that extreme repo, we'd still be saving 2s of CPU time. Ideally we could push this even further, and avoid parsing even tags, by relying on the packed-refs "peel" optimization (which we could do by calling peel_iterated_oid() instead of peeling manually). But we can't do that here. The packed-refs file only stores the bottom-layer of the peel (so in a "tag->tag->commit" chain, it stores only the commit as the peel result). But the decoration code wants to peel the layers individually, annotating the middle layers of the chain. If the packed-refs file ever learns to store all of the peeled layers, then we could switch to it. Or even if it stored a flag to indicate the peel was not multi-layer (because most of them aren't), then we could use it most of the time and fall back to a manual peel for the rare cases. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:31:40 -07:00
Jeff King	7463064b28	object.h: add lookup_object_by_type() function In some cases it's useful for efficiency reasons to get the type of an object before deciding whether to parse it, but we still want an object struct. E.g., in reachable.c, bitmaps give us the type, but we just want to mark flags on each object. Likewise, we may loop over every object and only parse tags in order to peel them; checking the type first lets us avoid parsing the non-tags. But our lookup_blob(), etc, functions make getting an object struct annoying: we have to call the right function for every type. And we cannot just use the generic lookup_object(), because it only returns an already-seen object; it won't allocate a new object struct. Let's provide a function that dispatches to the correct lookup_* function based on a run-time type. In fact, reachable.c already has such a helper, so we'll just make that public. I did change the return type from "void " to "struct object ". While the former is a clever way to avoid casting inside the function, it's less safe and less informative to people reading the function declaration. The next commit will add a new caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:30:18 -07:00
Jeff King	542d6abbb4	object.h: expand docstring for lookup_unknown_object() The lookup_unknown_object() system is not often used and is somewhat confusing. Let's try to explain it a bit more (which is especially important as I'm adding a related but slightly different function in the next commit). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:30:17 -07:00
Jeff King	b2086b5183	log: avoid loading decorations for userformats that don't need it If no --decorate option is given, we default to auto-decoration. And when that kicks in, cmd_log_init_finish() will unconditionally load the decoration refs. However, if we are using a user-format that does not include "%d" or "%D", we won't show the decorations at all, so we don't need to load them. We can detect this case and auto-disable them by adding a new field to our userformat_want helper. We can do this even when the user explicitly asked for --decorate, because it can't affect the output at all. This patch consistently reduces the time to run "git log -1 --format=%H" on my git.git clone (with ~2k refs) from 34ms to 7ms. On a much more extreme real-world repository (with ~220k refs), it goes from 2.5s to 4ms. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 20:30:17 -07:00

... 5 6 7 8 9 ...

63792 Commits