git-commit-vandalism

Author	SHA1	Message	Date
Derrick Stolee	d52fcf493b	p2000: remove stray '--sparse' flag from test This argument was added in `7cae7627c4` (builtin/grep.c: integrate with sparse index, 2022-09-22), but it was a carry-over from an earlier version where the --sparse flag was added to the 'git grep' builtin. This argument does not exist, so currently the p2000-sparse-operations.sh performance test script fails when reaching this step. With this fix, the script works with these numbers for my copy of the Git source code repository: Test HEAD ------------------------------------------------------------ 2000.30: git grep --cached ... (full-v3) 0.34(1.20+0.14) 2000.31: git grep --cached ... (full-v4) 0.31(1.15+0.13) 2000.32: git grep --cached ... (sparse-v3) 0.26(1.13+0.12) 2000.33: git grep --cached ... (sparse-v4) 0.27(1.13+0.12) Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 13:25:52 -07:00
Shaoxuan Yuan	7cae7627c4	builtin/grep.c: integrate with sparse index Turn on sparse index and remove ensure_full_index(). Before this patch, `git-grep` utilizes the ensure_full_index() method to expand the index and search all the entries. Because this method requires walking all the trees and constructing the index, it is the slow part within the whole command. To achieve better performance, this patch uses grep_tree() to search the sparse directory entries and get rid of the ensure_full_index() method. Why grep_tree() is a better choice over ensure_full_index()? 1) grep_tree() is as correct as ensure_full_index(). grep_tree() looks into every sparse-directory entry (represented by a tree) recursively when looping over the index, and the result of doing so matches the result of expanding the index. 2) grep_tree() utilizes pathspecs to limit the scope of searching. ensure_full_index() always expands the index, which means it will always walk all the trees and blobs in the repo without caring if the user only wants a subset of the content, i.e. using a pathspec. On the other hand, grep_tree() will only search the contents that match the pathspec, and thus possibly walking fewer trees. 3) grep_tree() does not construct and copy back a new index, while ensure_full_index() does. This also saves some time. ---------------- Performance test - Summary: p2000 tests demonstrate a ~71% execution time reduction for `git grep --cached bogus -- "f2/f1/f1/"` using tree-walking logic. However, notice that this result varies depending on the pathspec given. See below "Command used for testing" for more details. Test HEAD~ HEAD ------------------------------------------------------- 2000.78: git grep ... (full-v3) 0.35 0.39 (≈) 2000.79: git grep ... (full-v4) 0.36 0.30 (≈) 2000.80: git grep ... (sparse-v3) 0.88 0.23 (-73.8%) 2000.81: git grep ... 
(sparse-v4) 0.83 0.26 (-68.6%) - Command used for testing: git grep --cached bogus -- "f2/f1/f1/" The reason for specifying a pathspec is that, if we don't specify a pathspec, then grep_tree() will walk all the trees and blobs to find the pattern, and the time consumed doing so is not too different from using the original ensure_full_index() method, which also spends most of the time walking trees. However, when a pathspec is specified, this latest logic will only walk the area of trees enclosed by the pathspec, and the time consumed is reasonably a lot less. Generally speaking, because the performance gain is achieved by walking fewer trees, which are specified by the pathspec, the HEAD time v.s. HEAD~ time in sparse-v[3|4], should be proportional to "pathspec enclosed area" v.s. "all area", respectively. Namely, the wider the <pathspec> is encompassing, the less the performance difference between HEAD~ and HEAD, and vice versa. That is, if we don't specify a pathspec, the performance difference [1] is indistinguishable: both methods walk all the trees and take generally the same amount of time (even with the index construction time included for ensure_full_index()). [1] Performance test result without pathspec (hence walking all trees): Command used: git grep --cached bogus Test HEAD~ HEAD --------------------------------------------------- 2000.78: git grep ... (full-v3) 6.17 5.19 (≈) 2000.79: git grep ... (full-v4) 6.19 5.46 (≈) 2000.80: git grep ... (sparse-v3) 6.57 6.44 (≈) 2000.81: git grep ... (sparse-v4) 6.65 6.28 (≈) -------------------------- NEEDSWORK about submodules There are a few NEEDSWORKs that belong to improvements beyond this topic. See the NEEDSWORK in builtin/grep.c::grep_submodule() for more context. The other two NEEDSWORKs in t1092 are also related. 
Suggested-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Victoria Dye <vdye@github.com> Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-23 09:41:27 -07:00
Junio C Hamano	f00ddc9f48	Merge branch 'vd/scalar-generalize-diagnose' The "diagnose" feature to create a zip archive for diagnostic material has been lifted from "scalar" and made into a feature of "git bugreport". * vd/scalar-generalize-diagnose: scalar: update technical doc roadmap scalar-diagnose: use 'git diagnose --mode=all' builtin/bugreport.c: create '--diagnose' option builtin/diagnose.c: add '--mode' option builtin/diagnose.c: create 'git diagnose' builtin diagnose.c: add option to configure archive contents scalar-diagnose: move functionality to common location scalar-diagnose: move 'get_disk_info()' to 'compat/' scalar-diagnose: add directory to archiver more gently scalar-diagnose: avoid 32-bit overflow of size_t scalar-diagnose: use "$GIT_UNZIP" in test	2022-08-25 14:42:32 -07:00
Junio C Hamano	a103ad6f3d	Merge branch 'jk/pipe-command-nonblock' Fix deadlocks between main Git process and subprocess spawned via the pipe_command() API, that can kill "git add -p" that was reimplemented in C recently. * jk/pipe-command-nonblock: pipe_command(): mark stdin descriptor as non-blocking pipe_command(): handle ENOSPC when writing to a pipe pipe_command(): avoid xwrite() for writing to pipe git-compat-util: make MAX_IO_SIZE define globally available nonblock: support Windows compat: add function to enable nonblocking pipes	2022-08-25 14:42:32 -07:00
Junio C Hamano	098b7bfaa6	Merge branch 'js/fetch-negotiation-trace' The common ancestor negotiation exchange during a "git fetch" session now leaves trace log. * js/fetch-negotiation-trace: fetch-pack: add tracing for negotiation rounds	2022-08-25 14:42:31 -07:00
Junio C Hamano	01a30a5a58	Merge branch 'jk/is-promisor-object-keep-tree-in-use' An earlier optimization discarded a tree-object buffer that is still in use, which has been corrected. * jk/is-promisor-object-keep-tree-in-use: is_promisor_object(): fix use-after-free of tree buffer	2022-08-25 14:42:31 -07:00
Junio C Hamano	df3c129e24	Merge branch 'en/submodule-merge-messages-fixes' Further update the help messages given while merging submodules. * en/submodule-merge-messages-fixes: merge-ort: provide helpful submodule update message when possible merge-ort: avoid surprise with new sub_flag variable merge-ort: remove translator lego in new "submodule conflict suggestion" submodule merge: update conflict error message	2022-08-25 14:42:29 -07:00
Junio C Hamano	fddd8b4801	Merge branch 'll/disk-usage-humanise' "git rev-list --disk-usage" learned to take an optional value "human" to show the reported value in human-readable format, like "3.40MiB". * ll/disk-usage-humanise: rev-list: support human-readable output for `--disk-usage`	2022-08-18 13:07:05 -07:00
Junio C Hamano	9b9445cfde	Merge branch 'sy/sparse-rm' "git rm" has become more aware of the sparse-index feature. * sy/sparse-rm: rm: integrate with sparse-index rm: expand the index only when necessary pathspec.h: move pathspec_needs_expanded_index() from reset.c to here t1092: add tests for `git-rm`	2022-08-18 13:07:05 -07:00
Junio C Hamano	80ffc849bd	Merge branch 'vd/sparse-reset-checkout-fixes' Fixes to sparse index compatibility work for "reset" and "checkout" commands. * vd/sparse-reset-checkout-fixes: unpack-trees: unpack new trees as sparse directories cache.h: create 'index_name_pos_sparse()' oneway_diff: handle removed sparse directories checkout: fix nested sparse directory diff in sparse index	2022-08-18 13:07:04 -07:00
Junio C Hamano	363a193c3a	Merge branch 'jk/fsck-tree-mode-bits-fix' "git fsck" reads mode from tree objects but canonicalizes the mode before passing it to the logic to check object sanity, which has hidden broken tree objects from the checking logic. This has been corrected, but to help existing projects with broken tree objects that they cannot fix retroactively, the severity of anomalies this code detects has been demoted to "info" for now. * jk/fsck-tree-mode-bits-fix: fsck: downgrade tree badFilemode to "info" fsck: actually detect bad file modes in trees tree-walk: add a mechanism for getting non-canonicalized modes	2022-08-18 13:07:04 -07:00
Jeff King	716c1f649e	pipe_command(): mark stdin descriptor as non-blocking Our pipe_command() helper lets you both write to and read from a child process on its stdin/stdout. It's supposed to work without deadlocks because we use poll() to check when descriptors are ready for reading or writing. But there's a bug: if both the data to be written and the data to be read back exceed the pipe buffer, we'll deadlock. The issue is that the code assumes that if you have, say, a 2MB buffer to write and poll() tells you that the pipe descriptor is ready for writing, that calling: write(cmd->in, buf, 2*1024*1024); will do a partial write, filling the pipe buffer and then returning what it did write. And that is what it would do on a socket, but not for a pipe. When writing to a pipe, at least on Linux, it will block waiting for the child process to read() more. And now we have a potential deadlock, because the child may be writing back to us, waiting for us to read() ourselves. An easy way to trigger this is: git -c add.interactive.useBuiltin=true \ -c interactive.diffFilter=cat \ checkout -p HEAD~200 The diff against HEAD~200 will be big, and the filter wants to write all of it back to us (obviously this is a dummy filter, but in the real world something like diff-highlight would similarly stream back a big output). If you set add.interactive.useBuiltin to false, the problem goes away, because now we're not using pipe_command() anymore (instead, that part happens in perl). But this isn't a bug in the interactive code at all. It's the underlying pipe_command() code which is broken, and has been all along. We presumably didn't notice because most calls only do input _or_ output, not both. And the few that do both, like gpg calls, may have large inputs or outputs, but never both at the same time (e.g., consider signing, which has a large payload but a small signature comes back). 
The obvious fix is to put the descriptor into non-blocking mode, and indeed, that makes the problem go away. Callers shouldn't need to care, because they never see the descriptor (they hand us a buffer to feed into it). The included test fails reliably on Linux without this patch. Curiously, it doesn't fail in our Windows CI environment, but has been reported to do so for individual developers. It should pass in any environment after this patch (courtesy of the compat/ layers added in the last few commits). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-17 09:21:41 -07:00
Josh Steadmon	a29263cf5f	fetch-pack: add tracing for negotiation rounds Currently, negotiation for V0/V1/V2 fetch has trace2 regions covering the entire negotiation process. However, we'd like additional data, such as timing for each round of negotiation or the number of "haves" in each round. Additionally, "independent negotiation" (AKA push negotiation) has no tracing at all. Having this data would allow us to compare the performance of the various negotiation implementations, and to debug unexpectedly slow fetch & push sessions. Add per-round trace2 regions for all negotiation implementations (V0+V1, V2, and independent negotiation), as well as an overall region for independent negotiation. Add trace2 data logging for the number of haves and "in vain" objects for each round, and for the total number of rounds once negotiation completes. Finally, add a few checks into various tests to verify that the number of rounds is logged as expected. Signed-off-by: Josh Steadmon <steadmon@google.com> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-15 09:17:03 -07:00
Junio C Hamano	7d0a1c8895	Merge branch 'pw/use-glibc-tunable-for-malloc-optim' Avoid repeatedly running getconf to ask libc version in the test suite, and instead just ask it once per script. * pw/use-glibc-tunable-for-malloc-optim: tests: cache glibc version check	2022-08-14 23:19:28 -07:00
Junio C Hamano	c0f6dd49f1	Merge branch 'ab/tech-docs-to-help' Expose a lot of "tech docs" via "git help" interface. * ab/tech-docs-to-help: docs: move http-protocol docs to man section 5 docs: move cruft pack docs to gitformat-pack docs: move pack format docs to man section 5 docs: move signature docs to man section 5 docs: move index format docs to man section 5 docs: move protocol-related docs to man section 5 docs: move commit-graph format docs to man section 5 git docs: add a category for file formats, protocols and interfaces git docs: add a category for user-facing file, repo and command UX git help doc: use "<doc>" instead of "<guide>" help.c: remove common category behavior from drop_prefix() behavior help.c: refactor drop_prefix() to use a "switch" statement	2022-08-14 23:19:28 -07:00
Junio C Hamano	d86ac14dd7	Merge branch 'ab/hooks-regression-fix' A follow-up fix to a fix for a regression in 2.36. * ab/hooks-regression-fix: hook API: don't segfault on strbuf_addf() to NULL "out"	2022-08-14 23:19:27 -07:00
Jeff King	1490d7d82d	is_promisor_object(): fix use-after-free of tree buffer Since commit `fcc07e980b` (is_promisor_object(): free tree buffer after parsing, 2021-04-13), we'll always free the buffers attached to a "struct tree" after searching them for promisor links. But there's an important case where we don't want to do so: if somebody else is already using the tree! This can happen during a "rev-list --missing=allow-promisor" traversal in a partial clone that is missing one or more trees or blobs. The backtrace for the free looks like this: #1 free_tree_buffer tree.c:147 #2 add_promisor_object packfile.c:2250 #3 for_each_object_in_pack packfile.c:2190 #4 for_each_packed_object packfile.c:2215 #5 is_promisor_object packfile.c:2272 #6 finish_object__ma builtin/rev-list.c:245 #7 finish_object builtin/rev-list.c:261 #8 show_object builtin/rev-list.c:274 #9 process_blob list-objects.c:63 #10 process_tree_contents list-objects.c:145 #11 process_tree list-objects.c:201 #12 traverse_trees_and_blobs list-objects.c:344 [...] We're in the middle of walking through the entries of a tree object via process_tree_contents(). We see a blob (or it could even be another tree entry) that we don't have, so we call is_promisor_object() to check it. That function loops over all of the objects in the promisor packfile, including the tree we're currently walking. When we're done with it there, we free the tree buffer. But as we return to the walk in process_tree_contents(), it's still holding on to a pointer to that buffer, via its tree_desc iterator, and it accesses the freed memory. Even a trivial use of "--missing=allow-promisor" triggers this problem, as the included test demonstrates (it's just a vanilla --blob:none clone). We can detect this case by only freeing the tree buffer if it was allocated on our behalf. 
This is a little tricky since that happens inside parse_object(), and it doesn't tell us whether the object was already parsed, or whether it allocated the buffer itself. But by checking for an already-parsed tree beforehand, we can distinguish the two cases. That feels a little hacky, and does incur an extra lookup in the object-hash table. But that cost is fairly minimal compared to actually loading objects (and since we're iterating the whole pack here, we're likely to be loading most objects, rather than reusing cached results). It may also be a good direction for this function in general, as there are other possible optimizations that rely on doing some analysis before parsing: - we could detect blobs and avoid reading their contents; they can't link to other objects, but parse_object() doesn't know that we don't care about checking their hashes. - we could avoid allocating object structs entirely for most objects (since we really only need them in the oidset), which would save some memory. - promisor commits could use the commit-graph rather than loading the object from disk This commit doesn't do any of those optimizations, but I think it argues that this direction is reasonable, rather than relying on parse_object() and trying to teach it to give us more information about whether it parsed. The included test fails reliably under SANITIZE=address just when running "rev-list --missing=allow-promisor". Checking the output isn't strictly necessary to detect the bug, but it seems like a reasonable addition given the general lack of coverage for "allow-promisor" in the test suite. Reported-by: Andrew Olsen <andrew.olsen@koordinates.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-14 18:03:36 -07:00
Victoria Dye	aac0e8ffee	builtin/bugreport.c: create '--diagnose' option Create a '--diagnose' option for 'git bugreport' to collect additional information about the repository and write it to a zipped archive. The '--diagnose' option behaves effectively as an alias for simultaneously running 'git bugreport' and 'git diagnose'. In the documentation, users are explicitly recommended to attach the diagnostics alongside a bug report to provide additional context to readers, ideally reducing some back-and-forth between reporters and those debugging the issue. Note that '--diagnose' may take an optional string arg (either 'stats' or 'all'). If specified without the arg, the behavior corresponds to running 'git diagnose' without '--mode'. As with 'git diagnose', this default is intended to help reduce unintentional leaking of sensitive information. Users can also explicitly specify '--diagnose=(stats|all)' to generate the respective archive created by 'git diagnose --mode=(stats|all)'. Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-12 13:20:02 -07:00
Victoria Dye	7ecf193f7d	builtin/diagnose.c: add '--mode' option Create '--mode=<mode>' option in 'git diagnose' to allow users to optionally select non-default diagnostic information to include in the output archive. Additionally, document the currently-available modes, emphasizing the importance of not sharing a '--mode=all' archive publicly due to the presence of sensitive information. Note that the option parsing callback - 'option_parse_diagnose()' - is added to 'diagnose.c' rather than 'builtin/diagnose.c' so that it may be reused in future callers configuring a diagnostics archive. Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-12 13:20:02 -07:00
Victoria Dye	6783fd3cef	builtin/diagnose.c: create 'git diagnose' builtin Create a 'git diagnose' builtin to generate a standalone zip archive of repository diagnostics. The "diagnose" functionality was originally implemented for Scalar in `aa5c79a331` (scalar: implement `scalar diagnose`, 2022-05-28). However, the diagnostics gathered are not specific to Scalar-cloned repositories and can be useful when diagnosing issues in any Git repository. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-12 13:20:02 -07:00
Junio C Hamano	83489a5b20	Merge branch 'ab/plug-revisions-leak' Plug a bit more leaks in the revisions API. * ab/plug-revisions-leak: revisions API: don't leak memory on argv elements that need free()-ing bisect.c: partially fix bisect_rev_setup() memory leak log: refactor "rev.pending" code in cmd_show() log: fix a memory leak in "git show <revision>..." test-fast-rebase helper: use release_revisions() (again) bisect.c: add missing "goto" for release_revisions()	2022-08-12 13:19:08 -07:00
Junio C Hamano	657c7403a3	Merge branch 'ab/leak-check' Extend SANITIZE=leak checking and declare more tests "currently leak-free". * ab/leak-check: CI: use "GIT_TEST_SANITIZE_LEAK_LOG=true" in linux-leaks upload-pack: fix a memory leak in create_pack_file() leak tests: mark passing SANITIZE=leak tests as leak-free leak tests: don't skip some tests under SANITIZE=leak test-lib: have the "check" mode for SANITIZE=leak consider leak logs test-lib: add a GIT_TEST_PASSING_SANITIZE_LEAK=check mode test-lib: simplify by removing test_external tests: move copy/pasted PERL + Test::More checks to a lib-perl.sh t/Makefile: don't remove test-results in "clean-except-prove-cache" test-lib: add a SANITIZE=leak logging mode t/README: reword the "GIT_TEST_PASSING_SANITIZE_LEAK" description test-lib: add a --invert-exit-code switch test-lib: fix GIT_EXIT_OK logic errors, use BAIL_OUT test-lib: don't set GIT_EXIT_OK before calling test_atexit_handler test-lib: use $1, not $@ in test_known_broken_{ok,failure}_	2022-08-12 13:19:08 -07:00
Junio C Hamano	8faaf690f7	Merge branch 'lt/symbolic-ref-sanity' "git symbolic-ref symref non..sen..se" is now diagnosed as an error. * lt/symbolic-ref-sanity: symbolic-ref: refuse to set syntactically invalid target	2022-08-12 13:19:08 -07:00
Li Linchao	9096451acd	rev-list: support human-readable output for `--disk-usage` The '--disk-usage' option for git-rev-list was introduced in `16950f8384` (rev-list: add --disk-usage option for calculating disk usage, 2021-02-09). This is very useful for people inspecting their git repo's object usage information, but the resulting number is quite hard for a human to read. Teach git rev-list to output a human readable result when using '--disk-usage'. Signed-off-by: Li Linchao <lilinchao@oschina.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-11 13:45:23 -07:00
Junio C Hamano	5856cb98c0	Merge branch 'ma/t4200-update' into maint Test fix. source: <20220718154322.2177166-1-martin.agren@gmail.com> * ma/t4200-update: t4200: drop irrelevant code	2022-08-10 21:52:35 -07:00
Junio C Hamano	042159a509	Merge branch 'tb/commit-graph-genv2-upgrade-fix' into maint There was a bug in the codepath to upgrade generation information in commit-graph from v1 to v2 format, which has been corrected. source: <cover.1657667404.git.me@ttaylorr.com> * tb/commit-graph-genv2-upgrade-fix: commit-graph: fix corrupt upgrade from generation v1 to v2 commit-graph: introduce `repo_find_commit_pos_in_graph()` t5318: demonstrate commit-graph generation v2 corruption	2022-08-10 21:52:35 -07:00
Junio C Hamano	4f049a16bf	Merge branch 'tk/untracked-cache-with-uall' into maint Fix for a bug that makes write-tree fail to write out a non-existent index as a tree, introduced in 2.37. source: <20220722212232.833188-1-martin.agren@gmail.com> * tk/untracked-cache-with-uall: read-cache: make `do_read_index()` always set up `istate->repo`	2022-08-10 21:52:34 -07:00
Junio C Hamano	340a6120e5	Merge branch 'mt/checkout-count-fix' into maint "git checkout" miscounted the paths it updated, which has been corrected. source: <cover.1657799213.git.matheus.bernardino@usp.br> * mt/checkout-count-fix: checkout: fix two bugs on the final count of updated entries checkout: show bug about failed entries being included in final report checkout: document bug where delayed checkout counts entries twice	2022-08-10 21:52:34 -07:00
Junio C Hamano	312d5b7429	Merge branch 'hx/lookup-commit-in-graph-fix' into maint A corner case bug where lazily fetching objects from a promisor remote resulted in infinite recursion has been corrected. source: <cover.1656593279.git.hanxin.hx@bytedance.com> * hx/lookup-commit-in-graph-fix: t5330: remove run_with_limited_processses() commit-graph.c: no lazy fetch in lookup_commit_in_graph()	2022-08-10 21:52:32 -07:00
Junio C Hamano	a6aeb2fef9	Merge branch 'jc/resolve-undo' into maint The resolve-undo information in the index was not protected against GC, which has been corrected. source: <xmqq35f7kzad.fsf@gitster.g> * jc/resolve-undo: fsck: do not dereference NULL while checking resolve-undo data revision: mark blobs needed for resolve-undo as reachable	2022-08-10 21:52:32 -07:00
Jeff King	4dd3b045f5	fsck: downgrade tree badFilemode to "info" The previous commit un-broke the "badFileMode" check; before then it was literally testing nothing. And as far as I can tell, it has been so since the very initial version of fsck. The current severity of "badFileMode" is just "warning". But in the --strict mode used by transfer.fsckObjects, that is elevated to an error. This will potentially cause hassle for users, because historical objects with bad modes will suddenly start causing pushes to many server operators to be rejected. At the same time, these bogus modes aren't actually a big risk. Because we canonicalize them everywhere besides fsck, they can't cause too much mischief in the real world. The worst thing you can do is end up with two almost-identical trees that have different hashes but are interpreted the same. That will generally cause things to be inefficient rather than wrong, and is a bug somebody working on a Git implementation would want to fix, but probably not worth inconveniencing users by refusing to push or fetch. So let's downgrade this to "info" by default, which is our setting for "mention this when fscking, but don't ever reject, even under strict mode". If somebody really wants to be paranoid, they can still adjust the level using config. Suggested-by: Xavier Morel <xavier.morel@masklinn.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 14:26:29 -07:00
Jeff King	53602a937d	fsck: actually detect bad file modes in trees We use the normal tree_desc code to iterate over trees in fsck, meaning we only see the canonicalized modes it returns. And hence we'd never see anything unexpected, since it will coerce literally any garbage into one of our normal and accepted modes. We can use the new RAW_MODES flag to see the real modes, and then use the existing code to actually analyze them. The existing code is written as allow-known-good, so there's not much point in testing a variety of breakages. The one tested here should be S_IFREG but with nonsense permissions. Do note that the error-reporting here isn't great. We don't mention the specific bad mode, but just that the tree has one or more broken modes. But when you go to look at it with "git ls-tree", we'll report the canonicalized mode! This isn't ideal, but given that this should come up rarely, and that any number of other tree corruptions might force you into looking at the binary bytes via "cat-file", it's not the end of the world. And it's something we can improve on top later if we choose. Reported-by: Xavier Morel <xavier.morel@masklinn.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-10 14:26:27 -07:00
Shaoxuan Yuan	ede241c715	rm: integrate with sparse-index Enable the sparse index within the `git-rm` command. The `p2000` tests demonstrate a ~92% execution time reduction for 'git rm' using a sparse index. Test HEAD~1 HEAD -------------------------------------------------------------------------- 2000.74: git rm ... (full-v3) 0.41(0.37+0.05) 0.43(0.36+0.07) +4.9% 2000.75: git rm ... (full-v4) 0.38(0.34+0.05) 0.39(0.35+0.05) +2.6% 2000.76: git rm ... (sparse-v3) 0.57(0.56+0.01) 0.05(0.05+0.00) -91.2% 2000.77: git rm ... (sparse-v4) 0.57(0.55+0.02) 0.03(0.03+0.00) -94.7% ---- Also, normalize a behavioral difference of `git-rm` under sparse-index. See related discussion [1]. `git-rm` a sparse-directory entry within a sparse-index enabled repo behaves differently from a sparse directory within a sparse-checkout enabled repo. For example, in a sparse-index repo, where 'folder1' is a sparse-directory entry, `git rm -r --sparse folder1` provides this: rm 'folder1/' Whereas in a sparse-checkout repo without sparse-index, doing so provides this: rm 'folder1/0/0/0' rm 'folder1/0/1' rm 'folder1/a' Because `git rm` a sparse-directory entry does not need to expand the index, therefore we should accept the current behavior, which is faster than "expand the sparse-directory entry to match the sparse-checkout situation". Modify a previous test so such difference is not considered as an error. [1] https://github.com/ffyuanda/git/pull/6#discussion_r934861398 Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:23:26 -07:00
Shaoxuan Yuan	bcf96cfca6	rm: expand the index only when necessary Remove the `ensure_full_index()` method so `git-rm` does not always expand the index when the expansion is unnecessary, i.e. when <pathspec> does not have any possibilities to match anything outside of sparse-checkout definition. Expand the index when the <pathspec> needs an expanded index, i.e. the <pathspec> contains wildcard that may need a full-index or the <pathspec> is simply outside of sparse-checkout definition. Notice that the test 'rm pathspec expands index when necessary' in t1092 is testing this code change behavior, though it will be marked as 'test_expect_success' only in the next patch, where we officially mark `command_requires_full_index = 0`, so the index does not expand unless we tell it to do so. Notice that because we also want `ensure_full_index` to record the stdout and stderr from Git command, a corresponding modification is also included in this patch. The reason we want the "sparse-index-out" and "sparse-index-err", is that we need to make sure there is no error from Git command itself, so we can rely on the `test_region` result and determine if the index is expanded or not. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:23:26 -07:00
Shaoxuan Yuan	ba808251aa	t1092: add tests for `git-rm` Add tests for `git-rm`, make sure it behaves as expected when <pathspec> is both inside or outside of sparse-checkout definition. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:23:26 -07:00
Junio C Hamano	3f61790678	Merge branch 'vd/sparse-reset-checkout-fixes' into sy/sparse-rm * vd/sparse-reset-checkout-fixes: unpack-trees: unpack new trees as sparse directories cache.h: create 'index_name_pos_sparse()' oneway_diff: handle removed sparse directories checkout: fix nested sparse directory diff in sparse index	2022-08-08 13:23:06 -07:00
Victoria Dye	b15207b8cf	unpack-trees: unpack new trees as sparse directories If 'unpack_single_entry()' is unpacking a new directory tree (that is, one not already present in the index) into a sparse index, unpack the tree as a sparse directory rather than traversing its contents and unpacking each file individually. This helps keep the sparse index as collapsed as possible in cases such as 'git reset --hard' restoring an outside-of-cone directory removed with 'git rm -r --sparse'. Without this patch, 'unpack_single_entry()' will only unpack a directory into the index as a sparse directory (rather than traversing into it and unpacking its files one-by-one) if an entry with the same name already exists in the index. This patch allows sparse directory unpacking without a matching index entry when the following conditions are met: 1. the directory's path is outside the sparse cone, and 2. there are no children of the directory in the index If a directory meets these requirements (as determined by 'is_new_sparse_dir()'), 'unpack_single_entry()' unpacks the sparse directory index entry and propagates the decision back up to 'unpack_callback()' to prevent unnecessary tree traversal into the unpacked directory. Reported-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:21:50 -07:00
Victoria Dye	49ff3cb90f	checkout: fix nested sparse directory diff in sparse index Add the 'recursive' diff flag to the local changes reporting done by 'git checkout' in 'show_local_changes()'. Without the flag enabled, unexpanded sparse directories will not be recursed into to report the diff of each file's contents, resulting in the reported local changes including "modified" sparse directories. The same issue was found and fixed for 'git status' in `2c521b0e49` (status: fix nested sparse directory diff in sparse index, 2022-03-01) Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:21:49 -07:00
Junio C Hamano	1b53bea29a	Merge branch 'js/t5351-freebsd-fix' Some tests assumed that core.fsyncMethod=batch is supported everywhere, which broke FreeBSD. * js/t5351-freebsd-fix: t5351: avoid using `test_cmp` for binary data t5351: avoid relying on `core.fsyncMethod = batch` to be supported	2022-08-08 13:13:14 -07:00
Junio C Hamano	1e92768aa1	Merge branch 'tb/cat-file-z' Operating modes like "--batch" of "git cat-file" command learned to take NUL-terminated input, instead of one-item-per-line. * tb/cat-file-z: builtin/cat-file.c: support NUL-delimited input with `-z` t1006: extract --batch-command inputs to variables	2022-08-05 15:52:14 -07:00
Junio C Hamano	ef7b9ad032	Merge branch 'ds/doc-wo-whitelist' into maint Avoid "white/black-list" in documentation and code comments. source: <pull.1274.v3.git.1658255537.gitgitgadget@gmail.com> * ds/doc-wo-whitelist: transport.c: avoid "whitelist" t: avoid "whitelist" git.txt: remove redundant language git-cvsserver: clarify directory list daemon: clarify directory arguments	2022-08-05 15:51:37 -07:00
Junio C Hamano	d16978517c	Merge branch 'mb/config-document-include' into maint Add missing documentation for "include" and "includeIf" features in "git config" file format, which incidentally teaches the command line completion to include them in its offerings. source: <pull.1285.v2.git.1658002423864.gitgitgadget@gmail.com> * mb/config-document-include: config.txt: document include, includeIf	2022-08-05 15:51:36 -07:00
Junio C Hamano	de28459136	Merge branch 'jk/clone-unborn-confusion' into maint "git clone" from a repository with some ref whose HEAD is unborn did not set the HEAD in the resulting repository correctly, which has been corrected. source: <YsdyLS4UFzj0j/wB@coredump.intra.peff.net> * jk/clone-unborn-confusion: clone: move unborn head creation to update_head() clone: use remote branch if it matches default HEAD clone: propagate empty remote HEAD even with other branches clone: drop extra newline from warning message	2022-08-05 15:51:35 -07:00
Ævar Arnfjörð Bjarmason	99ddc24672	hook API: don't segfault on strbuf_addf() to NULL "out" Fix a logic error in `a082345372` (hook API: fix v2.36.0 regression: hooks should be connected to a TTY, 2022-06-07). When it started using the "ungroup" API added in `fd3aaf53f7` (run-command: add an "ungroup" option to run_process_parallel(), 2022-06-07) it should have made the same sort of change that `fd3aaf53f7` itself made in "t/helper/test-run-command.c". The correct way to emit this "Couldn't start" output with "ungroup" would be: fprintf(stderr, _("Couldn't start hook '%s'\n"), hook_path); But we should instead remove the emitting of this output. As the added test shows we already emit output when we can't run the child. The "cannot run" output here is emitted by run-command.c's child_err_spew(). So the addition of the "Couldn't start hook" output here in `96e7225b31` (hook: add 'run' subcommand, 2021-12-22) was always redundant. For the pre-commit hook we'll now emit exactly the same output as we did before `f443246b9f` (commit: convert {pre-commit,prepare-commit-msg} hook to hook.h, 2021-12-22) (and likewise for others). We could at this point add this to the pick_next_hook() callbacks in hook.c: assert(!out); assert(!*pp_task_cb); And this to notify_start_failure() and notify_hook_finished() (in the latter case the parameter is called "pp_task_cp"): assert(!out); assert(!pp_task_cb); But let's leave any such instrumentation for some eventual cleanup of the "ungroup" API. Reported-by: Ilya K <me@0upti.me> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-05 14:12:00 -07:00
Ævar Arnfjörð Bjarmason	5db921054e	docs: move protocol-related docs to man section 5 Continue the move of existing Documentation/technical/* protocol and file-format documentation into our main documentation space. By moving the things that discuss the protocol we can properly link from e.g. lsrefs.unborn and protocol.version documentation to a manpage we build by default. So far we have been using the "gitformat-" prefix for the documentation we've been moving over from Documentation/technical/, but for protocol documentation let's use "gitprotocol-". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 14:12:23 -07:00
Ævar Arnfjörð Bjarmason	844739ba27	git docs: add a category for file formats, protocols and interfaces Create a new "File formats, protocols and other developer interfaces" section in the main "git help git" manual page and start moving the documentation that now lives in "Documentation/technical/*.git" over to it. This complements the newly added and adjacent "Repository, command and file interfaces" section. This makes the technical documentation more accessible and discoverable. Before this we wouldn't install it by default, and had no ability to build man page versions of them. The links to them from our existing documentation link to the generated HTML version of these docs. So let's start moving those over, starting with just the "bundle-format.txt" documentation added in `7378ec90e1` (doc: describe Git bundle format, 2020-02-07). We'll now have a new gitformat-bundle(5) man page. Subsequent commits will move more git internal format documentation over. Unfortunately the syntax of the current Documentation/technical/*.txt is not the same (when it comes to section headings etc.) as our Documentation/*.txt documentation, so change the relevant bits of syntax as we're moving this over. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 14:12:23 -07:00
Ævar Arnfjörð Bjarmason	d976c5100f	git docs: add a category for user-facing file, repo and command UX Create a new "Repository, command and file interfaces" section in the main "git help git" manual page. Move things that belong under this new criteria from the generic "Guides" section. The "Guides" section was added in `f442f28a81` (git.txt: add list of guides, 2020-08-05). It makes sense to have e.g. "giteveryday(7)" and "gitfaq(7)" listed under "Guides". But placing e.g. "gitignore(5)" in it is stretching the meaning of what a "guide" is, ideally that section should list things similar to "giteveryday(7)" and "gitcore-tutorial(7)". An alternate name that was considered for this new section was "User formats", for consistency with the nomenclature used for man section 5 in general. My man(1) lists it as "File formats and conventions, e.g. /etc/passwd". So calling this "git help --formats" or "git help --user-formats" would make sense for e.g. gitignore(5), but would be stretching it somewhat for githooks(5), and would seem really suspect for the likes of gitcli(7). Let's instead pick a name that's closer to the generic term "User interface", which is really what this documentation discusses: General user-interface documentation that doesn't obviously belong elsewhere. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 14:12:23 -07:00
Calvin Wan	4057523a40	submodule merge: update conflict error message When attempting to merge in a superproject with conflicting submodule pointers that cannot be fast-forwarded or trivially resolved, the merge fails and Git prints an error message that accurately describes the failure, but does not provide steps for the user to resolve the error. Git is left in a conflicted state, which requires the user to: 1. merge submodules or update submodules to an already existing commit that reflects the merge 2. add submodules changes to the superproject 3. finish merging superproject These steps are non-obvious for newer submodule users to figure out based on the error message and neither `git submodule status` nor `git status` provide any useful pointers. Update error message to provide steps to resolve submodule merge conflict. Future work could involve adding an advice flag to the message. Although the message is long, it also has the id of the submodule commit that needs to be merged, which could be useful information for the user. Additionally, 5 merge failures that resulted in an early return have been updated to reflect the status of the merge. 1. Null merge base (null o): CONFLICT_SUBMODULE_NULL_MERGE_BASE added as a new conflict type and will print updated error message. 2. Null merge side a (null a): BUG(). See [1] for discussion 3. Null merge side b (null b): BUG(). See [1] for discussion 4. Submodule not checked out: added NEEDSWORK bit 5. Submodule commits not present: added NEEDSWORK bit The errors with a NEEDSWORK bit deserve a more detailed explanation of how to resolve them. See [2] for more context. [1] https://lore.kernel.org/git/CABPp-BE0qGwUy80dmVszkJQ+tcpfLRW0OZyErymzhZ9+HWY1mw@mail.gmail.com/ [2] https://lore.kernel.org/git/xmqqpmhjjwo9.fsf@gitster.g/ Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 13:43:07 -07:00
Phillip Wood	a6a58f7801	tests: cache glibc version check `131b94a10a` ("test-lib.sh: Use GLIBC_TUNABLES instead of MALLOC_CHECK_ on glibc >= 2.34", 2022-03-04) introduced a check for the version of glibc that is in use. This check is performed as part of setup_malloc_check() which is called at least once for each test. As the test involves forking `getconf` and `expr` cache the result and use that within setup_malloc_check() to avoid forking these extra processes for each test. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-04 11:09:18 -07:00
Junio C Hamano	966ff64a30	Merge branch 'en/merge-restore-to-pristine' When "git merge" finds that it cannot perform a merge, it should restore the working tree to the state before the command was initiated, but in some corner cases it didn't. * en/merge-restore-to-pristine: merge: do not exit restore_state() prematurely merge: ensure we can actually restore pre-merge state merge: make restore_state() restore staged state too merge: fix save_state() to work when there are stat-dirty files merge: do not abort early if one strategy fails to handle the merge merge: abort if index does not match HEAD for trivial merges merge-resolve: abort if index does not match HEAD merge-ort-wrappers: make printed message match the one from recursive	2022-08-03 13:36:09 -07:00

19876 Commits