git-commit-vandalism

Author	SHA1	Message	Date
brian m. carlson	e74b606d47	builtin/verify-pack: implement an --object-format option A recently added test in t5702 started using git verify-pack outside of a repository. While this poses no problems with SHA-1, with SHA-256 we implicitly rely on the setup of the repository to initialize our hash algorithm settings. Since we're not in a repository here, we need to provide git verify-pack help to set things up properly. git index-pack already knows an --object-format option, so let's accept one as well and pass it down to our git index-pack invocation. Since we're now dynamically adjusting the elements in argv, let's switch to using struct argv_array to manage them. Finally, let's make t5702 pass the proper argument on down to its git verify-pack caller. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-30 09:16:48 -07:00
Jeff King	9ab89a2439	log: enable "-m" automatically with "--first-parent" When using "--first-parent" to consider history as a single line of commits, git-log still defaults to treating merges specially, even though they could be considered as single commits in the linearized history (that just introduce all of the changes from the second and higher parents). Let's instead have "--first-parent" imply "-m", which makes something like: git log --first-parent -p do what you'd expect. Likewise: git log --first-parent -Sfoo will find "foo" in merge commits. No new test is needed; we'll tweak the output of the existing "--first-parent -p" test, which now matches the "-m --first-parent -p" test. The unchanged existing test for "--no-diff-merges" confirms that the user can get the old behavior if they want. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-29 13:43:57 -07:00
Jeff King	6fae74b418	revision: add "--no-diff-merges" option to counteract "-m" The "-m" option sets revs->ignore_merges to "0", but there's no way to undo it. This probably isn't something anybody overly cares about, since "1" is already the default, but it will serve as an escape hatch when we flip the default for ignore_merges to "0" in more situations. We'll also add a few extra niceties: - initialize the value to "-1" to indicate "not set", and then resolve it to the normal 0/1 bool in setup_revisions(). This lets any tweak functions, as well as setup_revisions() itself, avoid clobbering the user's preference (which until now they couldn't actually express). - since we now have --no-diff-merges, let's add the matching --diff-merges, which is just a synonym for "-m". Then we don't even need to document --no-diff-merges separately; it countermands the long form of "-m" in the usual way. The new test shows that this behaves just the same as the current behavior without "-m". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-29 13:43:57 -07:00
Jeff King	eed5332a13	log: drop "--cc implies -m" logic This was added by `82dee4160c` (log: show merge commit when --cc is given, 2015-08-20), which explains why we need it. But that commit failed to notice that setup_revisions() already does the same thing, since `cd2bdc5309` (Common option parsing for "git log --diff" and friends, 2006-04-14). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-29 13:43:57 -07:00
René Scharfe	98c6871fad	grep: avoid using oid_to_hex() with parse_object_or_die() parse_object_or_die() is passed an object ID and a name to show if the object cannot be parsed. If the name is NULL then it shows the hexadecimal object ID. Use that feature instead of preparing and passing the hexadecimal representation to the function proactively. That's shorter and a bit more efficient. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:26:12 -07:00
Jeff King	f6d8942b1f	strvec: fix indentation in renamed calls Code which split an argv_array call across multiple lines, like: argv_array_pushl(&args, "one argument", "another argument", "and more", NULL); was recently mechanically renamed to use strvec, which results in mis-matched indentation like: strvec_pushl(&args, "one argument", "another argument", "and more", NULL); Let's fix these up to align the arguments with the opening paren. I did this manually by sifting through the results of: git jump grep 'strvec_.*,$' and liberally applying my editor's auto-format. Most of the changes are of the form shown above, though I also normalized a few that had originally used a single-tab indentation (rather than our usual style of aligning with the open paren). I also rewrapped a couple of obvious cases (e.g., where previously too-long lines became short enough to fit on one), but I wasn't aggressive about it. In cases broken to three or more lines, the grouping of arguments is sometimes meaningful, and it wasn't worth my time or reviewer time to ponder each case individually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	22f9b7f3f5	strvec: convert builtin/ callers away from argv_array name We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts all of the files in builtin/ to keep the diff to a manageable size. The conversion was done purely mechanically with: git ls-files '.c' '.h' \| xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' and then selectively staging files with "git add builtin/". We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	2745b6b450	quote: rename sq_dequote_to_argv_array to mention strvec We want to eventually drop the use of the "argv_array" name in favor of "strvec." Unlike most other uses of the name, this one is embedded in a function name, so the definition and all of the callers need to be updated at the same time. We don't technically need to update the parameter types here (our preprocessor compat macros make the two names interchangeable), but let's do so to keep the site consistent for now. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	dbbcd44fb4	strvec: rename files from argv-array to strvec This requires updating #include lines across the code-base, but that's all fairly mechanical, and was done with: git ls-files '.c' '.h' \| xargs perl -i -pe 's/argv-array.h/strvec.h/' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:17 -07:00
Jonathan Tan	e00549aa9b	pack-objects: prefetch objects to be packed When an object to be packed is noticed to be missing, prefetch all to-be-packed objects in one batch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-21 14:29:42 -07:00
Jonathan Tan	8d5cf95735	pack-objects: refactor to oid_object_info_extended Use oid_object_info_extended() instead of oid_object_info() because a subsequent commit needs to specify an additional flag here. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-21 14:29:42 -07:00
Chris Torek	9b906af657	git-mv: improve error message for conflicted file 'git mv' has always complained about renaming a conflicted file, as it cannot handle multiple index entries for one file. However, the error message it uses has been the same as the one for an untracked file: fatal: not under version control, src=... which is patently wrong. Distinguish the two cases and add a test to make sure we produce the correct message. Signed-off-by: Chris Torek <chris.torek@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-20 14:35:43 -07:00
Junio C Hamano	d1ae8ba096	Merge branch 'tb/commit-graph-no-check-oids' into master Fix to the code to produce progress bar, which is new in the upcoming release. * tb/commit-graph-no-check-oids: commit-graph: fix "Collecting commits from input" progress line	2020-07-15 16:29:45 -07:00
SZEDER Gábor	862aead24e	commit-graph: fix "Collecting commits from input" progress line To display a progress line while reading commits from standard input and looking them up, `5b6653e523` (builtin/commit-graph.c: dereference tags in builtin, 2020-05-13) should have added a pair of start_delayed_progress() and stop_progress() calls around the loop reading stdin. Alas, the stop_progress() call ended up at the wrong place, after write_commit_graph(), which does all the commit-graph computation and writing, and has several progress lines of its own. Consequently, that new Collecting commits from input: 1234 progress line is overwritten by the first progress line shown by write_commit_graph(), and its final "done" line is shown last, after everything is finished: $ { sleep 3 ; git rev-list -3 HEAD ; sleep 1 ; } \| ~/src/git/git commit-graph write --stdin-commits Expanding reachable commits in commit graph: 873402, done. Writing out commit graph in 4 passes: 100% (3493608/3493608), done. Collecting commits from input: 3, done. Furthermore, that stop_progress() call was added after the 'cleanup' label, where that loop reading stdin jumps in case of an error. In case of invalid input this then results in the "done" line shown after the error message: $ { sleep 3 ; git rev-list -3 HEAD ; echo junk ; } \| ~/src/git/git commit-graph write --stdin-commits error: unexpected non-hex object ID: junk Collecting commits from input: 3, done. Move that stop_progress() call to the right place. While at it, drop the unnecessary 'if (progress)' condition protecting the stop_progress() call, because that function is prepared to handle a NULL progress struct. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-15 11:57:19 -07:00
Han-Wen Nienhuys	de966e39a8	bisect: treat BISECT_HEAD as a pseudo ref Both the git-bisect.sh as bisect--helper inspected the file system directly. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-10 13:53:37 -07:00
Ben Wijen	dfaa209a79	git clone: don't clone into non-empty directory When using git clone with --separate-git-dir realgitdir and realgitdir already exists, it's content is destroyed. So, make sure we don't clone into an existing non-empty directory. When `d45420c1` (clone: do not clean up directories we didn't create, 2018-01-02) tightened the clean-up procedure after a failed cloning into an empty directory, it assumed that the existing directory given is an empty one so it is OK to keep that directory, while running the clean-up procedure that is designed to remove everything in it (since there won't be any, anyway). Check and make sure that the $GIT_DIR is empty even cloning into an existing repository. Signed-off-by: Ben Wijen <ben@wijen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-10 11:43:29 -07:00
Junio C Hamano	46be023084	Merge branch 'ct/diff-with-merge-base-clarification' into master Recent update to "git diff" meant as a code clean-up introduced a bug in its error handling code, which has been corrected. * ct/diff-with-merge-base-clarification: diff: check for merge bases before assigning sym->base	2020-07-09 14:00:43 -07:00
Junio C Hamano	8251695fe7	Merge branch 'cc/cat-file-usage-update' into master Doc/usage update. * cc/cat-file-usage-update: cat-file: add missing [=<format>] to usage/synopsis	2020-07-09 14:00:41 -07:00
Jeff King	5f46e610cb	diff: check for merge bases before assigning sym->base In symdiff_prepare(), we iterate over the set of parsed objects to pick out any symmetric differences, including the left, right, and base elements. We assign the results into pointers in a "struct symdiff", and then complain if we didn't find a base, like so: sym->left = rev->pending.objects[lpos].name; sym->right = rev->pending.objects[rpos].name; sym->base = rev->pending.objects[basepos].name; if (basecount == 0) die(_("%s...%s: no merge base"), sym->left, sym->right); But the least lines are backwards. If basecount is 0, then basepos will be -1, and we will access memory outside of the pending array. This isn't usually that big a deal, since we don't do anything besides a single pointer-sized read before exiting anyway, but it does violate the C standard, and of course memory-checking tools like ASan complain. Let's put the basecount check first. Note that we haveto split it from the other assignments, since the die() relies on sym->left and sym->right having been assigned (this isn't strictly necessary, but is easier to read than dereferencing the pending array again). Reported-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-08 13:57:18 -07:00
Junio C Hamano	0a23331aa6	Merge branch 'jk/fast-export-anonym-alt' "git fast-export --anonymize" learned to take customized mapping to allow its users to tweak its output more usable for debugging. * jk/fast-export-anonym-alt: fast-export: use local array to store anonymized oid fast-export: anonymize "master" refname fast-export: allow seeding the anonymized mapping fast-export: add a "data" callback parameter to anonymize_str() fast-export: move global "idents" anonymize hashmap into function fast-export: use a flex array to store anonymized entries fast-export: stop storing lengths in anonymized hashmaps fast-export: tighten anonymize_mem() interface to handle only strings fast-export: store anonymized oids as hex strings fast-export: use xmemdupz() for anonymizing oids t9351: derive anonymized tree checks from original repo	2020-07-06 22:09:17 -07:00
Junio C Hamano	11cbda2add	Merge branch 'js/default-branch-name' The name of the primary branch in existing repositories, and the default name used for the first branch in newly created repositories, is made configurable, so that we can eventually wean ourselves off of the hardcoded 'master'. * js/default-branch-name: contrib: subtree: adjust test to change in fmt-merge-msg testsvn: respect `init.defaultBranch` remote: use the configured default branch name when appropriate clone: use configured default branch name when appropriate init: allow setting the default for the initial branch name via the config init: allow specifying the initial branch name for the new repository docs: add missing diamond brackets submodule: fall back to remote's HEAD for missing remote.<name>.branch send-pack/transport-helper: avoid mentioning a particular branch fmt-merge-msg: stop treating `master` specially	2020-07-06 22:09:17 -07:00
Junio C Hamano	0258ed1e08	Merge branch 'cb/is-descendant-of' Code clean-up. * cb/is-descendant-of: commit-reach: avoid is_descendant_of() shim	2020-07-06 22:09:16 -07:00
Junio C Hamano	645f63111b	Merge branch 'es/get-worktrees-unsort' API cleanup for get_worktrees() * es/get-worktrees-unsort: worktree: drop get_worktrees() unused 'flags' argument worktree: drop get_worktrees() special-purpose sorting option	2020-07-06 22:09:15 -07:00
Junio C Hamano	d80bea479d	Merge branch 'ak/commit-graph-to-slab' A few fields in "struct commit" that do not have to always be present have been moved to commit slabs. * ak/commit-graph-to-slab: commit-graph: minimize commit_graph_data_slab access commit: move members graph_pos, generation to a slab commit-graph: introduce commit_graph_data_slab object: drop parsed_object_pool->commit_count	2020-07-06 22:09:14 -07:00
Junio C Hamano	12210859da	Merge branch 'bc/sha-256-part-2' SHA-256 migration work continues. * bc/sha-256-part-2: (44 commits) remote-testgit: adapt for object-format bundle: detect hash algorithm when reading refs t5300: pass --object-format to git index-pack t5704: send object-format capability with SHA-256 t5703: use object-format serve option t5702: offer an object-format capability in the test t/helper: initialize the repository for test-sha1-array remote-curl: avoid truncating refs with ls-remote t1050: pass algorithm to index-pack when outside repo builtin/index-pack: add option to specify hash algorithm remote-curl: detect algorithm for dumb HTTP by size builtin/ls-remote: initialize repository based on fetch t5500: make hash independent serve: advertise object-format capability for protocol v2 connect: parse v2 refs with correct hash algorithm connect: pass full packet reader when parsing v2 refs Documentation/technical: document object-format for protocol v2 t1302: expect repo format version 1 for SHA-256 builtin/show-index: provide options to determine hash algo t5302: modernize test formatting ...	2020-07-06 22:09:13 -07:00
Christian Couder	0172f7834a	cat-file: add missing [=<format>] to usage/synopsis When displaying cat-file usage, the fact that a <format> can be specified is only visible when lookling at the --batch and --batch-check options which are shown like this: --batch[=<format>] show info and content of objects fed from the standard input --batch-check[=<format>] show info about objects fed from the standard input It seems more coherent and improves discovery to also show it on the usage line. In the documentation the DESCRIPTION tells us that "The output format can be overridden using the optional <format> argument", but we can't see the <format> argument in the SYNOPSIS above the description which is confusing. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-01 15:54:05 -07:00
Derrick Stolee	0087a87ba8	commit-graph: persist existence of changed-paths The changed-path Bloom filters were released in v2.27.0, but have a significant drawback. A user can opt-in to writing the changed-path filters using the "--changed-paths" option to "git commit-graph write" but the next write will drop the filters unless that option is specified. This becomes even more important when considering the interaction with gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of features.experimental). These config options trigger commit-graph writes that the user did not signal, and hence there is no --changed-paths option available. Allow a user that opts-in to the changed-path filters to persist the property of "my commit-graph has changed-path filters" automatically. A user can drop filters using the --no-changed-paths option. In the process, we need to be extremely careful to match the Bloom filter settings as specified by the commit-graph. This will allow future versions of Git to customize these settings, and the version with this change will persist those settings as commit-graphs are rewritten on top. Use the trace2 API to signal the settings used during the write, and check that output in a test after manually adjusting the correct bytes in the commit-graph file. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-01 14:17:43 -07:00
Junio C Hamano	298d704e70	Merge branch 'sk/diff-files-show-i-t-a-as-new' "git diff-files" has been taught to say paths that are marked as intent-to-add are new files, not modified from an empty blob. * sk/diff-files-show-i-t-a-as-new: diff-files: treat "i-t-a" files as "not-in-index"	2020-06-29 14:17:27 -07:00
Junio C Hamano	b381c98891	Merge branch 'rs/pull-leakfix' Leakfix. * rs/pull-leakfix: pull: plug minor memory leak after using is_descendant_of()	2020-06-29 14:17:26 -07:00
Junio C Hamano	1ea1f93fd9	Merge branch 'dl/diff-usage-comment-update' An in-code comment in "git diff" has been updated. * dl/diff-usage-comment-update: builtin/diff: fix botched update of usage comment builtin/diff: update usage comment	2020-06-29 14:17:25 -07:00
Junio C Hamano	1033b98291	Merge branch 'xl/upgrade-repo-format' Allow runtime upgrade of the repository format version, which needs to be done carefully. There is a rather unpleasant backward compatibility worry with the last step of this series, but it is the right thing to do in the longer term. * xl/upgrade-repo-format: check_repository_format_gently(): refuse extensions for old repositories sparse-checkout: upgrade repository to version 1 when enabling extension fetch: allow adding a filter after initial clone repository: add a helper function to perform repository format upgrade	2020-06-29 14:17:24 -07:00
Jeff King	f39ad38410	fast-export: use local array to store anonymized oid Some older versions of gcc complain about this line: builtin/fast-export.c:412:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing] put_be32(oid.hash + hashsz - 4, counter++); ^ This seems to be a false positive, as there's no type-punning at all here. oid.hash is an array of unsigned char; when we pass it to a function it decays to a pointer to unsigned char. We do take a void pointer in put_be32(), but it's immediately aliased with another pointer to unsigned char (and clearly the compiler is looking inside the inlined put_be32(), since the warning doesn't happen with -O0). This happens on gcc 4.8 and 4.9, but not later versions (I tested gcc 6, 7, 8, and 9). We can work around it by using a local array instead of an object_id struct. This is a little more intimate with the details of object_id, but for whatever reason doesn't seem to trigger the compiler warning. We can revert this patch once we decide that those gcc versions are too old to care about for a warning like this (gcc 4.8 is the default compiler for Ubuntu Trusty, which is out-of-support but not fully end-of-life'd until April 2022). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-25 14:19:23 -07:00
Jeff King	8a49495583	fast-export: anonymize "master" refname Running "fast-export --anonymize" will leave "refs/heads/master" untouched in the output, for two reasons: - it helped to have some known reference point between the original and anonymized repository - since it's historically the default branch name, it doesn't leak any information Now that we can ask fast-export to retain particular tokens, we have a much better tool for the first one (because it works for any ref, not just master). For the second, the notion of "default branch name" is likely to become configurable soon, at which point the name _does_ leak information. Let's drop this special case in preparation. Note that we have to adjust the test a bit, since it relied on using the name "master" in the anonymized repos. We could just use --anonymize-map=master to keep the same output, but then we wouldn't know if it works because of our hard-coded master or because of the explicit map. So let's flip the test a bit, and confirm that we anonymize "master", but keep "other" in the output. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-25 14:19:23 -07:00
Jeff King	65b5d9fae7	fast-export: allow seeding the anonymized mapping After you anonymize a repository, it can be hard to find which commits correspond between the original and the result, and thus hard to reproduce commands that triggered bugs in the original. Let's make it possible to seed the anonymization map. This lets users either: - mark names to be retained as-is, if they don't consider them secret (in which case their original commands would just work) - map names to new values, which lets them adapt the reproduction recipe to the new names without revealing the originals The implementation is fairly straight-forward. We already store each anonymized token in a hashmap (so that the same token appearing twice is converted to the same result). We can just introduce a new "seed" hashmap which is consulted first. This does make a few more promises to the user about how we'll anonymize things (e.g., token-splitting pathnames). But it's unlikely that we'd want to change those rules, even if the actual anonymization of a single token changes. And it makes things much easier for the user, who can unblind only a directory name without having to specify each path within it. One alternative to this approach would be to anonymize as we see fit, and then dump the whole refname and pathname mappings to a file. This does work, but it's a bit awkward to use (you have to manually dig the items you care about out of the mapping). Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-25 14:19:23 -07:00
Junio C Hamano	34e849b05a	Merge branch 'jt/cdn-offload' The "fetch/clone" protocol has been updated to allow the server to instruct the clients to grab pre-packaged packfile(s) in addition to the packed object data coming over the wire. * jt/cdn-offload: upload-pack: fix a sparse '0 as NULL pointer' warning upload-pack: send part of packfile response as uri fetch-pack: support more than one pack lockfile upload-pack: refactor reading of pack-objects out Documentation: add Packfile URIs design doc Documentation: order protocol v2 sections http-fetch: support fetching packfiles by URL http-fetch: refactor into function http: refactor finish_http_pack_request() http: use --stdin when indexing dumb HTTP pack	2020-06-25 12:27:47 -07:00
Junio C Hamano	10462829e3	Merge branch 'ss/submodule-set-branch-in-c' Rewrite of parts of the scripted "git submodule" Porcelain command continues; this time it is "git submodule set-branch" subcommand's turn. * ss/submodule-set-branch-in-c: submodule: port subcommand 'set-branch' from shell to C	2020-06-25 12:27:47 -07:00
Junio C Hamano	7b2685ef2d	Merge branch 'dl/branch-cleanup' Code clean-up around "git branch" with a minor bugfix. * dl/branch-cleanup: branch: don't mix --edit-description t3200: test for specific errors t3200: rename "expected" to "expect"	2020-06-25 12:27:47 -07:00
Junio C Hamano	1457886ce2	Merge branch 'ct/diff-with-merge-base-clarification' "git diff" used to take arguments in random and nonsense range notation, e.g. "git diff A..B C", "git diff A..B C...D", etc., which has been cleaned up. * ct/diff-with-merge-base-clarification: Documentation: usage for diff combined commits git diff: improve range handling t/t3430: avoid undefined git diff behavior	2020-06-25 12:27:46 -07:00
Junio C Hamano	53674699c0	Merge branch 'en/clean-cleanups' Code clean-up of "git clean" resulted in a fix of recent performance regression. * en/clean-cleanups: clean: optimize and document cases where we recurse into subdirectories clean: consolidate handling of ignored parameters dir, clean: avoid disallowed behavior dir: fix a few confusing comments	2020-06-25 12:27:45 -07:00
Johannes Schindelin	0cc1b475bb	clone: use configured default branch name when appropriate When cloning a repository without any branches, Git chooses a default branch name for the as-yet unborn branch. As part of the implicit initialization of the local repository, Git just learned to respect `init.defaultBranch` to choose a different initial branch name. We now really want that branch name to be used as a fall-back. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Don Goodman-Wilson	8747ebb7cd	init: allow setting the default for the initial branch name via the config We just introduced the command-line option `--initial-branch=<branch-name>` to allow initializing a new repository with a different initial branch than the hard-coded one. To allow users to override the initial branch name more permanently (i.e. without having to specify the name manually for each and every `git init` invocation), let's introduce the `init.defaultBranch` config setting. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Don Goodman-Wilson <don@goodman-wilson.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Johannes Schindelin	32ba12dab2	init: allow specifying the initial branch name for the new repository There is a growing number of projects and companies desiring to change the main branch name of their repositories (see e.g. https://twitter.com/mislav/status/1270388510684598272 for background on this). To change that branch name for new repositories, currently the only way to do that automatically is by copying all of Git's template directory, then hard-coding the desired default branch name into the `.git/HEAD` file, and then configuring `init.templateDir` to point to those copied template files. To make this process much less cumbersome, let's introduce a new option: `--initial-branch=<branch-name>`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Johannes Schindelin	f0a96e8d4c	submodule: fall back to remote's HEAD for missing remote.<name>.branch When `remote.<name>.branch` is not configured, `git submodule update` currently falls back to using the branch name `master`. A much better idea, however, is to use the remote `HEAD`: on all Git servers running reasonably recent Git versions, the symref `HEAD` points to the main branch. Note: t7419 demonstrates that there _might_ be use cases out there that _expect_ `git submodule update --remote` to update submodules to the remote `master` branch even if the remote `HEAD` points to another branch. Arguably, this patch makes the behavior more intuitive, but there is a slight possibility that this might cause regressions in obscure setups. Even so, it should be okay to fix this behavior without anything like a longer transition period: - The `git submodule update --remote` command is not really common. - Current Git's behavior when running this command is outright confusing, unless the remote repository's current branch _is_ `master` (in which case the proposed behavior matches the old behavior). - If a user encounters a regression due to the changed behavior, the fix is actually trivial: setting `submodule.<name>.branch` to `master` will reinstate the old behavior. Helped-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-24 09:14:21 -07:00
Jeff King	d5bf91fde4	fast-export: add a "data" callback parameter to anonymize_str() The anonymize_str() function takes a generator callback, but there's no way to pass extra context to it. Let's add the usual "void *data" parameter to the generator interface and pass it along. This is mildly annoying for existing callers, all of which pass NULL, but is necessary to avoid extra globals in some cases we'll add in a subsequent patch. While we're touching each of these callbacks, we can further observe that none of them use the existing orig/len parameters at all. This makes sense, since the point is for their output to have no discernable basis in the original (my original version had some notion that we might use a one-way function to obfuscate the names, but it was never implemented). So let's drop those extra parameters. If a caller really wants to do something with them, it can pass a struct through the new data parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00
Jeff King	6416a865da	fast-export: move global "idents" anonymize hashmap into function All of the other anonymization functions keep their static mappings inside the function to avoid polluting the global namespace. Let's do the same for "idents", as nobody needs it outside of anonymize_ident_line(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00
Jeff King	55b01456a9	fast-export: use a flex array to store anonymized entries Now that we're using a separate keydata struct for hash lookups, we have more flexibility in how we allocate anonymized_entry structs. Let's push the "orig" key into a flex member within the struct. That should save us a few bytes of memory per entry (a pointer plus any malloc overhead), and may make lookups a little faster (since it's one less pointer to chase in the comparison function). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00
Jeff King	a0f65641df	fast-export: stop storing lengths in anonymized hashmaps Now that the anonymize_str() interface is restricted to NUL-terminated strings, there's no need for us to keep track of the length of each entry in the hashmap. This simplifies the code and saves a bit of memory. Note that we do still need to compare the stored results to partial strings passed in by the callers. We can do that by using hashmap's keydata feature to get the ptr/len pair into the comparison function, and then using strncmp(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00
Jeff King	7f40759496	fast-export: tighten anonymize_mem() interface to handle only strings While the anonymize_mem() interface _can_ store arbitrary byte sequences, none of the callers uses this feature (as of the previous commit). We'd like to keep it that way, as we'll be exposing the string-like nature of the anonymization routines to the user. So let's tighten up the interface a bit: - don't treat "len" as an out-parameter from anonymize_mem(); this ensures callers treat the pointer result as a NUL-terminated string - likewise, don't treat "len" as an out-parameter from generator functions - swap out "void " for "char " as appropriate to signal that we don't handle arbitrary memory - rename the function to anonymize_str() This will also open up some optimization opportunities in a future patch. Note that we can't drop the "len" parameter entirely. Some callers do pass in partial strings (e.g., "foo/bar", len=3) to avoid copying, and we need to handle those still. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00
Jeff King	750bb32589	fast-export: store anonymized oids as hex strings When fast-export stores anonymized oids, it does so as binary strings. And while the anonymous mapping storage is binary-clean (at least as of the previous commit), this will become awkward when we start exposing more of it to the user. In particular, if we allow a method for retaining token "foo", then users may want to specify a hex oid as such a token. Let's just switch to storing the hex strings. The difference in memory usage is negligible (especially considering how infrequently we'd generally store an oid compared to, say, path components). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00
Jeff King	b897bf5f37	fast-export: use xmemdupz() for anonymizing oids Our anonymize_mem() function is careful to take a ptr/len pair to allow storing binary tokens like object ids, as well as partial strings (e.g., just "foo" of "foo/bar"). But it duplicates the hash key using xstrdup()! That means that: - for a partial string, we'd store all bytes up to the NUL, even though we'd never look at anything past "len". This didn't produce wrong behavior, but was wasteful. - for a binary oid that doesn't contain a zero byte, we'd copy garbage bytes off the end of the array (though as long as nothing complained about reading uninitialized bytes, further reads would be limited by "len", and we'd produce the correct results) - for a binary oid that does contain a zero byte, we'd copy _fewer_ bytes than intended into the hashmap struct. When we later try to look up a value, we'd access uninitialized memory and potentially falsely claim that a particular oid is not present. The most common reason to store an oid is an anonymized gitlink, but our test case doesn't have any gitlinks at all. So let's add one whose oid contains a NUL and is present at two different paths. ASan catches the memory error, but even without it we can detect the bug because the oid is not anonymized the same way for both paths. And of course the fix is to copy the correct number of bytes. We don't technically need the appended NUL from xmemdupz(), but it doesn't hurt as an extra protection against anybody treating it like a string (plus a future patch will push us more in that direction). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00
Denton Liu	c592fd4c83	builtin/diff: fix botched update of usage comment In the previous commit, an attempt was made to correct the "N=1, M=0" case. However, the fix was botched and it introduced two half-correct sections by mistake. Combine these half-correct sections into one fully correct section. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 16:39:41 -07:00
Carlo Marcelo Arenas Belón	c1ea625f72	commit-reach: avoid is_descendant_of() shim `d91d6fbf26` (commit-reach: create repo_is_descendant_of(), 2020-06-17) adds a repository aware version of is_descendant_of() and a backward compatibility shim that is barely used. Update all callers to directly use the new repo_is_descendant_of() function instead; making the codebase simpler and pushing more the_repository references higher up the stack. Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 16:36:53 -07:00
Junio C Hamano	9740ef888e	Merge branch 'es/worktree-duplicate-paths' The same worktree directory must be registered only once, but "git worktree move" allowed this invariant to be violated, which has been corrected. * es/worktree-duplicate-paths: worktree: make "move" refuse to move atop missing registered worktree worktree: generalize candidate worktree path validation worktree: prune linked worktree referencing main worktree path worktree: prune duplicate entries referencing same worktree path worktree: make high-level pruning re-usable worktree: give "should be pruned?" function more meaningful name worktree: factor out repeated string literal	2020-06-22 15:55:03 -07:00
Srinidhi Kaushik	feea6946a5	diff-files: treat "i-t-a" files as "not-in-index" The `diff-files' command and related commands which call the function `cmd_diff_files()', consider the "intent-to-add" files as a part of the index when comparing the work-tree against it. This was previously addressed in commits [1] and [2] by turning the option `--ita-invisible-in-index' (introduced in [3]) on by default. For `diff-files' (and `add -p' as a consequence) to show the i-t-a files as as new, `ita_invisible_in_index' will be enabled by default here as well. [1] `0231ae71d3` (diff: turn --ita-invisible-in-index on by default, 2018-05-26) [2] `425a28e0a4` (diff-lib: allow ita entries treated as "not yet exist in index", 2016-10-24) [3] `b42b451919` (diff: add --ita-[in]visible-in-index, 2016-10-24) Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 10:46:45 -07:00
Eric Sunshine	03f2465bb1	worktree: drop get_worktrees() unused 'flags' argument get_worktrees() accepts a 'flags' argument, however, there are no existing flags (the lone flag GWT_SORT_LINKED was recently retired) and no behavior which can be tweaked. Therefore, drop the 'flags' argument. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 10:31:15 -07:00
Eric Sunshine	d9c54c2bbf	worktree: drop get_worktrees() special-purpose sorting option Of all the clients of get_worktrees(), only "git worktree list" wants the list sorted in a very specific way; other clients simply don't care about the order. Rather than imbuing get_worktrees() with special knowledge about how various clients -- now and in the future -- may want the list sorted, drop the sorting capability altogether and make it the client's responsibility to sort the list if needed. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 10:30:29 -07:00
brian m. carlson	586740aa6e	builtin/index-pack: add option to specify hash algorithm git index-pack is usually run in a repository, but need not be. Since packs don't contains information on the algorithm in use, instead relying on context, add an option to index-pack to tell it which one we're using in case someone runs it outside of a repository. Since using --stdin necessarily implies a repository, don't allow specifying an object format if it's provided to prevent users from passing an option that won't work. Add documentation for this option. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:08 -07:00
René Scharfe	0c9a4f638a	pull: plug minor memory leak after using is_descendant_of() cmd_pull() builds a commit_list to pass a single potential ancestor to is_descendant_of(). The latter leaves the list intact. Release the allocated memory after the call. Leaking in cmd_*() isn't a big deal, but sets a bad example for other users of is_descendant_of(). Signed-off-by: René Scharfe <l.s.r@web.de> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 12:17:21 -07:00
Denton Liu	a9d7689cd4	builtin/diff: update usage comment A comment in cmd_diff() states that if one tree-ish and no blobs are provided, (the "N=1, M=0" case), it will provide a diff between the tree and the cache. This is incorrect because a diff happens between the tree-ish and the working tree. Remove the `--cached` in the comment so that the correct behavior is shown. Add a new section describing the "N=1, M=0, --cached" behavior. Next, describe the "N=0, M=0, --cached" case, similar to the above since it is undocumented. Finally, fix some spacing issues. Add spaces between each section for consistency and readability. Also, change tabs within the comment into spaces. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-18 15:01:15 -07:00
Junio C Hamano	a554228ffb	Merge branch 'en/sparse-checkout' The behaviour of "sparse-checkout" in the state "git clone --no-checkout" left was changed accidentally in 2.27, which has been corrected. * en/sparse-checkout: sparse-checkout: avoid staging deletions of all files	2020-06-17 21:54:02 -07:00
Junio C Hamano	524caf8035	Merge branch 'js/reflog-anonymize-for-clone-and-fetch' The reflog entries for "git clone" and "git fetch" did not anonymize the URL they operated on. * js/reflog-anonymize-for-clone-and-fetch: clone/fetch: anonymize URLs in the reflog	2020-06-17 21:54:01 -07:00
Abhishek Kumar	6da43d937c	object: drop parsed_object_pool->commit_count `14ba97f8` (alloc: allow arbitrary repositories for alloc functions, 2018-05-15) introduced parsed_object_pool->commit_count to keep count of commits per repository and was used to assign commit->index. However, commit-slab code requires commit->index values to be unique and a global count would be correct, rather than a per-repo count. Let's introduce a static counter variable, `parsed_commits_count` to keep track of parsed commits so far. As commit_count has no use anymore, let's also drop it from the struct. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-17 14:37:14 -07:00
Denton Liu	dc44639904	branch: don't mix --edit-description `git branch` accepts `--edit-description` in conjunction with other arguments. However, `--edit-description` is its own mode, similar to `--set-upstream-to`, which is also made mutually exclusive with other modes. Prevent `--edit-description` from being mixed with other modes. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-17 11:12:34 -07:00
Elijah Newren	7233f17577	clean: optimize and document cases where we recurse into subdirectories Commit `6b1db43109` ("clean: teach clean -d to preserve ignored paths", 2017-05-23) added the following code block (among others) to git-clean: if (remove_directories) dir.flags \|= DIR_SHOW_IGNORED_TOO \| DIR_KEEP_UNTRACKED_CONTENTS; The reason for these flags is well documented in the commit message, but isn't obvious just from looking at the code. Add some explanations to the code to make it clearer. Further, it appears git-2.26 did not correctly handle this combination of flags from git-clean. With both these flags and without DIR_SHOW_IGNORED_TOO_MODE_MATCHING set, git is supposed to recurse into all untracked AND ignored directories. git-2.26.0 clearly was not doing that. I don't know the full reasons for that or whether git < 2.27.0 had additional unknown bugs because of that misbehavior, because I don't feel it's worth digging into. As per the huge changes and craziness documented in commit `8d92fb2927` ("dir: replace exponential algorithm with a linear one", 2020-04-01), the old algorithm was a mess and was thrown out. What I can say is that git-2.27.0 correctly recurses into untracked AND ignored directories with that combination. However, in clean's case we don't need to recurse into ignored directories; that is just a waste of time. Thus, when git-2.27.0 started correctly handling those flags, we got a performance regression report. Rather than relying on other bugs in fill_directory()'s former logic to provide the behavior of skipping ignored directories, make use of the DIR_SHOW_IGNORED_TOO_MODE_MATCHING value specifically added in commit `eec0f7f2b7` ("status: add option to show ignored files differently", 2017-10-30) for this purpose. Reported-by: Brian Malehorn <bmalehorn@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-12 17:27:16 -07:00
Elijah Newren	f7f5c6c0ba	clean: consolidate handling of ignored parameters I spent a long time trying to figure out how and whether the code worked with different values of ignore, ignore_only, and remove_directories. After lots of time setting up lots of testcases, sifting through lots of print statements, and walking through the debugger, I finally realized that one piece of code related to how it was all setup was found in clean.c rather than dir.c. Make a change that would have made it easier for me to do the extra testing by putting this handling in one spot. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-12 17:27:16 -07:00
Elijah Newren	351ea1c3cb	dir, clean: avoid disallowed behavior dir.h documented quite clearly that DIR_SHOW_IGNORED and DIR_SHOW_IGNORED_TOO are mutually exclusive, with a big comment to this effect by the definition of both enum values. However, a command like git clean -fx $DIR would set both values for dir.flags. I _think_ it happened to work because: * As dir.h points out, DIR_KEEP_UNTRACKED_CONTENTS only takes effect if DIR_SHOW_IGNORED_TOO is set. * As coded, I believe DIR_SHOW_IGNORED would just happen to take precedence over DIR_SHOW_IGNORED_TOO in the code as currently constructed. Which is a long way of saying "we just got lucky". Fix clean.c to avoid setting these mutually exclusive values at the same time, and add a check to dir.c that will throw a BUG() to prevent anyone else from making this mistake. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-12 17:27:16 -07:00
Chris Torek	b7e10b2ca2	Documentation: usage for diff combined commits Document the usage for producing combined commits with "git diff". This includes updating the synopsis section. While here, add the three-dot notation to the synopsis. Make "git diff -h" print the same usage summary as the manual page synopsis, minus the "A..B" form, which is now discouraged. Signed-off-by: Chris Torek <chris.torek@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-12 10:53:44 -07:00
Chris Torek	8bfcb3a690	git diff: improve range handling When git diff is given a symmetric difference A...B, it chooses some merge base from the two specified commits (as documented). This fails, however, if there is no merge base: instead, you see the differences between A and B, which is certainly not what is expected. Moreover, if additional revisions are specified on the command line ("git diff A...B C"), the results get a bit weird: * If there is a symmetric difference merge base, this is used as the left side of the diff. The last final ref is used as the right side. * If there is no merge base, the symmetric status is completely lost. We will produce a combined diff instead. Similar weirdness occurs if you use, e.g., "git diff C A...B D". Likewise, using multiple two-dot ranges, or tossing extra revision specifiers into the command line with two-dot ranges, or mixing two and three dot ranges, all produce nonsense. To avoid all this, add a routine to catch the range cases and verify that that the arguments make sense. As a side effect, produce a warning showing which merge base is being used when there are multiple choices; die if there is no merge base. Signed-off-by: Chris Torek <chris.torek@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-12 10:53:44 -07:00
Jonathan Tan	dd4b732df7	upload-pack: send part of packfile response as uri Teach upload-pack to send part of its packfile response as URIs. An administrator may configure a repository with one or more "uploadpack.blobpackfileuri" lines, each line containing an OID, a pack hash, and a URI. A client may configure fetch.uriprotocols to be a comma-separated list of protocols that it is willing to use to fetch additional packfiles - this list will be sent to the server. Whenever an object with one of those OIDs would appear in the packfile transmitted by upload-pack, the server may exclude that object, and instead send the URI. The client will then download the packs referred to by those URIs before performing the connectivity check. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 18:06:34 -07:00
Jonathan Tan	9da69a6539	fetch-pack: support more than one pack lockfile Whenever a fetch results in a packfile being downloaded, a .keep file is generated, so that the packfile can be preserved (from, say, a running "git repack") until refs are written referring to the contents of the packfile. In a subsequent patch, a successful fetch using protocol v2 may result in more than one .keep file being generated. Therefore, teach fetch_pack() and the transport mechanism to support multiple .keep files. Implementation notes: - builtin/fetch-pack.c normally does not generate .keep files, and thus is unaffected by this or future changes. However, it has an undocumented "--lock-pack" feature, used by remote-curl.c when implementing the "fetch" remote helper command. In keeping with the remote helper protocol, only one "lock" line will ever be written; the rest will result in warnings to stderr. However, in practice, warnings will never be written because the remote-curl.c "fetch" is only used for protocol v0/v1 (which will not generate multiple .keep files). (Protocol v2 uses the "stateless-connect" command, not the "fetch" command.) - connected.c has an optimization in that connectivity checks on a ref need not be done if the target object is in a pack known to be self-contained and connected. If there are multiple packfiles, this optimization can no longer be done. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 18:06:34 -07:00
Eric Sunshine	810382ed37	worktree: make "move" refuse to move atop missing registered worktree "git worktree add" takes special care to avoid creating a new worktree at a location already registered to an existing worktree even if that worktree is missing (which can happen, for instance, if the worktree resides on removable media). "git worktree move", however, is not so careful when validating the destination location and will happily move the source worktree atop the location of a missing worktree. This leads to the anomalous situation of multiple worktrees being associated with the same path, which is expressly forbidden by design. For example: $ git clone foo.git $ cd foo $ git worktree add ../bar $ git worktree add ../baz $ rm -rf ../bar $ git worktree move ../baz ../bar $ git worktree list .../foo beefd00f [master] .../bar beefd00f [bar] .../bar beefd00f [baz] $ git worktree remove ../bar fatal: validation failed, cannot remove working tree: '.../bar' does not point back to '.git/worktrees/bar' Fix this shortcoming by enhancing "git worktree move" to perform the same additional validation of the destination directory as done by "git worktree add". While at it, add a test to verify that "git worktree move" won't move a worktree atop an existing (non-worktree) path -- a restriction which has always been in place but was never tested. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 10:54:49 -07:00
Eric Sunshine	d179af679b	worktree: generalize candidate worktree path validation "git worktree add" checks that the specified path is a valid location for a new worktree by ensuring that the path does not already exist and is not already registered to another worktree (a path can be registered but missing, for instance, if it resides on removable media). Since "git worktree add" is not the only command which should perform such validation ("git worktree move" ought to also), generalize the the validation function for use by other callers, as well. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 10:54:49 -07:00
Eric Sunshine	916133ef8e	worktree: prune linked worktree referencing main worktree path "git worktree prune" detects when multiple entries are associated with the same path and prunes the duplicates, however, it does not detect when a linked worktree points at the path of the main worktree. Although "git worktree add" disallows creating a new worktree with the same path as the main worktree, such a case can arise outside the control of Git even without the user mucking with .git/worktree/<id>/ administrative files. For instance: $ git clone foo.git $ git -C foo worktree add ../bar $ rm -rf bar $ mv foo bar $ git -C bar worktree list .../bar deadfeeb [master] .../bar deadfeeb [bar] Help the user recover from such corruption by extending "git worktree prune" to also detect when a linked worktree is associated with the path of the main worktree. Reported-by: Jonathan Müller <jonathanmueller.dev@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 10:54:49 -07:00
Eric Sunshine	4a3ce479ce	worktree: prune duplicate entries referencing same worktree path A fundamental restriction of linked working trees is that there must only ever be a single worktree associated with a particular path, thus "git worktree add" explicitly disallows creation of a new worktree at the same location as an existing registered worktree. Nevertheless, users can still "shoot themselves in the foot" by mucking with administrative files in .git/worktree/<id>/. Worse, "git worktree move" is careless[1] and allows a worktree to be moved atop a registered but missing worktree (which can happen, for instance, if the worktree is on removable media). For instance: $ git clone foo.git $ cd foo $ git worktree add ../bar $ git worktree add ../baz $ rm -rf ../bar $ git worktree move ../baz ../bar $ git worktree list .../foo beefd00f [master] .../bar beefd00f [bar] .../bar beefd00f [baz] Help users recover from this form of corruption by teaching "git worktree prune" to detect when multiple worktrees are associated with the same path. [1]: A subsequent commit will fix "git worktree move" validation to be more strict. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 10:54:49 -07:00
Eric Sunshine	dd9609a12e	worktree: make high-level pruning re-usable The low-level logic for removing a worktree is well encapsulated in delete_git_dir(). However, high-level details related to pruning a worktree -- such as dealing with verbosity and dry-run mode -- are not encapsulated. Factor out this high-level logic into its own function so it can be re-used as new worktree corruption detectors are added. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 10:54:49 -07:00
Eric Sunshine	1b14d40b38	worktree: give "should be pruned?" function more meaningful name Readers of the name prune_worktree() are likely to expect the function to actually prune a worktree, however, it only answers the question "should this worktree be pruned?". Give it a name more reflective of its true purpose to avoid such confusion. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 10:54:49 -07:00
Junio C Hamano	63e50b8678	Merge branch 'cb/bisect-helper-parser-fix' The code to parse "git bisect start" command line was lax in validating the arguments. * cb/bisect-helper-parser-fix: bisect--helper: avoid segfault with bad syntax in `start --term-*`	2020-06-08 18:06:32 -07:00
Junio C Hamano	b37fd14beb	Merge branch 'dl/remote-curl-deadlock-fix' On-the-wire protocol v2 easily falls into a deadlock between the remote-curl helper and the fetch-pack process when the server side prematurely throws an error and disconnects. The communication has been updated to make it more robust. * dl/remote-curl-deadlock-fix: stateless-connect: send response end packet pkt-line: define PACKET_READ_RESPONSE_END remote-curl: error on incomplete packet pkt-line: extern packet_length() transport: extract common fetch_pack() call remote-curl: remove label indentation remote-curl: fix typo	2020-06-08 18:06:30 -07:00
Junio C Hamano	ded44afa02	Merge branch 'bc/filter-process' Code simplification and test coverage enhancement. * bc/filter-process: t2060: add a test for switch with --orphan and --discard-changes builtin/checkout: simplify metadata initialization	2020-06-08 18:06:30 -07:00
Junio C Hamano	dc57a9be5e	Merge branch 'tb/commit-graph-no-check-oids' Clean-up the commit-graph codepath. * tb/commit-graph-no-check-oids: commit-graph: drop COMMIT_GRAPH_WRITE_CHECK_OIDS flag t5318: reorder test below 'graph_read_expect' commit-graph.c: simplify 'fill_oids_from_commits' builtin/commit-graph.c: dereference tags in builtin builtin/commit-graph.c: extract 'read_one_commit()' commit-graph.c: peel refs in 'add_ref_to_set' commit-graph.c: show progress of finding reachable commits commit-graph.c: extract 'refs_cb_data'	2020-06-08 18:06:27 -07:00
Eric Sunshine	c9b77f2cea	worktree: factor out repeated string literal For each worktree removed by "git worktree prune", it reports the reason for the removal. All reasons share the common prefix "Removing worktrees/%s:". As new removal reasons are added, this prefix needs to be duplicated, which is error-prone and potentially cumbersome. Therefore, factor out the common prefix. Although this change seems to increase the "sentence lego quotient", it should be reasonably safe, as the reason for removal is a distinct clause, not strictly related to the prefix. Moreover, the "worktrees" in "Removing worktrees/%s:" is a path literal which ought not be localized, so by factoring it out, we can more easily avoid exposing that path fragment to translators. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-08 13:31:27 -07:00
Xin Li	98564d8059	sparse-checkout: upgrade repository to version 1 when enabling extension The 'extensions' configuration variable gets special meaning in the new repository version, so when enabling the extension we should upgrade the repository to version 1. Signed-off-by: Xin Li <delphij@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-05 10:13:30 -07:00
Xin Li	01bbbbd9da	fetch: allow adding a filter after initial clone Retroactively adding a filter can be useful for existing shallow clones as they allow users to see earlier change histories without downloading all git objects in a regular --unshallow fetch. Without this patch, users can make a clone partial by editing the repository configuration to convert the remote into a promisor, like: git config core.repositoryFormatVersion 1 git config extensions.partialClone origin git fetch --unshallow --filter=blob:none origin Since the hard part of making this work is already in place and such edits can be error-prone, teach Git to perform the required configuration change automatically instead. Note that this change does not modify the existing git behavior which recognizes setting extensions.partialClone without changing repositoryFormatVersion. Signed-off-by: Xin Li <delphij@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-05 10:13:30 -07:00
Elijah Newren	b5bfc08a97	sparse-checkout: avoid staging deletions of all files sparse-checkout's purpose is to update the working tree to have it reflect a subset of the tracked files. As such, it shouldn't be switching branches, making commits, downloading or uploading data, or staging or unstaging changes. Other than updating the worktree, the only thing sparse-checkout should touch is the SKIP_WORKTREE bit of the index. In particular, this sets up a nice invariant: running sparse-checkout will never change the status of any file in `git status` (reflecting the fact that we only set the SKIP_WORKTREE bit if the file is safe to delete, i.e. if the file is unmodified). Traditionally, we did a _really_ bad job with this goal. The predecessor to sparse-checkout involved manual editing of .git/info/sparse-checkout and running `git read-tree -mu HEAD`. That command would stage and unstage changes and overwrite dirty changes in the working tree. The initial implementation of the sparse-checkout command was no better; it simply invoked `git read-tree -mu HEAD` as a subprocess and had the same caveats, though this issue came up repeatedly in review comments and workarounds for the problems were put in place before the feature was merged[1, 2, 3, 4, 5, 6; especially see 4 & 6]. [1] https://lore.kernel.org/git/CABPp-BFT9A5n=_bx5LsjCvbogqwSjiwgr5amcjgbU1iAk4KLJg@mail.gmail.com/ [2] https://lore.kernel.org/git/CABPp-BEmwSwg4tgJg6nVG8a3Hpn_g-=ZjApZF4EiJO+qVgu4uw@mail.gmail.com/ [3] https://lore.kernel.org/git/CABPp-BFV7TA0qwZCQpHCqx9N+JifyRyuBQ-pZ_oGfe-NOgyh7A@mail.gmail.com/ [4] https://lore.kernel.org/git/CABPp-BHYCCD+Vx5fq35jH82eHc1-P53Lz_aGNpHJNcx9kg2K-A@mail.gmail.com/ [5] https://lore.kernel.org/git/CABPp-BF+JWYZfDqp2Tn4AEKVp4b0YMA=Mbz4Nz62D-gGgiduYQ@mail.gmail.com/ [6] https://lore.kernel.org/git/20191121163706.GV23183@szeder.dev/ However, these workarounds, in addition to disabling the feature in a number of important cases, also missed one special case. I'll get back to it later. In the 2.27.0 cycle, the disabling of the feature was lifted by finally replacing the internal equivalent of `git read-tree -mu HEAD` with something that did what we wanted: the new update_sparsity() function in unpack-trees.c that only ever updates SKIP_WORKTREE bits in the index and updates the working tree to match. This new function handles all the cases that were problematic for the old implementation, except that it breaks the same special case that avoided the workarounds of the old implementation, but broke it in a different way. So...that brings us to the special case: a git clone performed with --no-checkout. As per the meaning of the flag, --no-checkout does not check out any branch, with the implication that you aren't on one and need to switch to one after the clone. Implementationally, HEAD is still set (so in some sense you are partially on a branch), but * the index is "unborn" (non-existent) * there are no files in the working tree (other than .git/) * the next time git switch (or git checkout) is run it will run unpack_trees with `initial_checkout` flag set to true. It is not until you run, e.g. `git switch <somebranch>` that the index will be written and files in the working tree populated. With this special --no-checkout case, the traditional `read-tree -mu HEAD` behavior would have done the equivalent of acting like checkout -- switch to the default branch (HEAD), write out an index that matches HEAD, and update the working tree to match. This special case slipped through the avoid-making-changes checks in the original sparse-checkout command and thus continued there. After update_sparsity() was introduced and used (see commit `f56f31af03` ("sparse-checkout: use new update_sparsity() function", 2020-03-27)), the behavior for the --no-checkout case changed: Due to git's auto-vivification of an empty in-memory index (see do_read_index() and note that `must_exist` is false), and due to sparse-checkout's update_working_directory() code to always write out the index after it was done, we got a new bug. That made it so that sparse-checkout would switch the repository from a clone with an "unborn" index (i.e. still needing an initial_checkout), to one that had a recorded index with no entries. Thus, instead of all the files appearing deleted in `git status` being known to git as a special artifact of not yet being on a branch, our recording of an empty index made it suddenly look to git as though it was definitely on a branch with ALL files staged for deletion! A subsequent checkout or switch then had to contend with the fact that it wasn't on an initial_checkout but had a bunch of staged deletions. Make sure that sparse-checkout changes nothing in the index other than the SKIP_WORKTREE bit; in particular, when the index is unborn we do not have any branch checked out so there is no sparsification or de-sparsification work to do. Simply return from update_working_directory() early. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-05 08:05:50 -07:00
Johannes Schindelin	46da295a77	clone/fetch: anonymize URLs in the reflog Even if we strongly discourage putting credentials into the URLs passed via the command-line, there _is_ support for that, and users _do_ do that. Let's scrub them before writing them to the reflog. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-04 13:20:21 -07:00
Junio C Hamano	de82fb45db	Merge branch 'rs/checkout-b-track-error' The error message from "git checkout -b foo -t bar baz" was confusing. * rs/checkout-b-track-error: checkout: improve error messages for -b with extra argument checkout: add tests for -b and --track	2020-06-02 13:35:04 -07:00
Junio C Hamano	0739479c6a	Merge branch 'an/merge-single-strategy-optim' Code optimization for a common case. * an/merge-single-strategy-optim: merge: optimization to skip evaluate_result for single strategy	2020-06-02 13:35:01 -07:00
Shourya Shukla	2964d6e5e1	submodule: port subcommand 'set-branch' from shell to C Convert submodule subcommand 'set-branch' to a builtin and call it via 'git-submodule.sh'. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Helped-by: Denton Liu <liu.denton@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-02 10:51:54 -07:00
brian m. carlson	d96dab868e	builtin/ls-remote: initialize repository based on fetch ls-remote may or may not operate within a repository, and as such will not have been initialized with the repository's hash algorithm. Even if it were, the remote side could be using a different algorithm and we would still want to display those refs properly. Find the hash algorithm used by the remote side by querying the transport object and set our hash algorithm accordingly. Without this change, if the remote side is using SHA-256, we truncate the refs to 40 hex characters, since that's the length of the default hash algorithm (SHA-1). Note that technically this is not a correct setting of the repository hash algorithm since, if we are in a repository, it might be one of a different hash algorithm from the remote side. However, our current code paths don't handle multiple algorithms and won't for some time, so this is the best we can do. We rely on the fact that ls-remote never modifies the current repository, which is a reasonable assumption to make. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:07 -07:00
brian m. carlson	88a09a557c	builtin/show-index: provide options to determine hash algo show-index is capable of reading any possible index file whether or not the index is inside a repository. However, because our index files lack metadata about the hash algorithm in use, it's not possible to autodetect the algorithm that a particular index file is using. In order to allow us to read index files of any algorithm, let's set up the .git directory gently so that we default to the algorithm for the current repository, and add an --object-format option to allow users to override this setting and continue to run show-index outside of a repository altogether. Let's also document this new option so that people can find it and use it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:07 -07:00
brian m. carlson	629dffc461	packfile: compute and use the index CRC offset Both v2 pack index files and the v3 format specified as part of the NewHash work have similar data starting at the CRC table. Much of the existing code wants to read either this table or the offset entries following it, and in doing so computes the offset each time. In order to share as much code between v2 and v3, compute the offset of the CRC table and store it when the pack is opened. Use this value to compute offsets to not only the CRC table, but to the offset entries beyond it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:07 -07:00
brian m. carlson	b65dc2cebd	builtin/clone: initialize hash algorithm properly When performing a clone, we don't know what hash algorithm the other end will support. Currently, we don't support fetching data belonging to a different algorithm, so we must know what algorithm the remote side is using in order to properly initialize the repository. We can know that only after fetching the refs, so if the remote side has any references, use that information to reinitialize the repository with the correct hash algorithm information. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:06 -07:00
brian m. carlson	bb095d0875	builtin/receive-pack: detect when the server doesn't support our hash Detect when the server doesn't support our hash algorithm and abort. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:06 -07:00
brian m. carlson	bf30dbf826	remote: advertise the object-format capability on the server side Advertise the current hash algorithm in use by using the object-format capability as part of the ref advertisement. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:06 -07:00
Junio C Hamano	d55a4ae71d	Merge branch 'ds/multi-pack-verify' Fix for a copy-and-paste error introduced during 2.20 era. * ds/multi-pack-verify: fsck: use ERROR_MULTI_PACK_INDEX	2020-05-24 19:39:39 -07:00
Denton Liu	b0df0c16ea	stateless-connect: send response end packet Currently, remote-curl acts as a proxy and blindly forwards packets between an HTTP server and fetch-pack. In the case of a stateless RPC connection where the connection is terminated before the transaction is complete, remote-curl will blindly forward the packets before waiting on more input from fetch-pack. Meanwhile, fetch-pack will read the transaction and continue reading, expecting more input to continue the transaction. This results in a deadlock between the two processes. This can be seen in the following command which does not terminate: $ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012 Cloning into 'git'... whereas the v1 version does terminate as expected: $ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012 Cloning into 'git'... fatal: the remote end hung up unexpectedly Instead of blindly forwarding packets, make remote-curl insert a response end packet after proxying the responses from the remote server when using stateless_connect(). On the RPC client side, ensure that each response ends as described. A separate control packet is chosen because we need to be able to differentiate between what the remote server sends and remote-curl's control packets. By ensuring in the remote-curl code that a server cannot send response end packets, we prevent a malicious server from being able to perform a denial of service attack in which they spoof a response end packet and cause the described deadlock to happen. Reported-by: Force Charlie <charlieio@outlook.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:26:00 -07:00
René Scharfe	bb2198fb91	checkout: improve error messages for -b with extra argument When we try to create a branch "foo" based on "origin/master" and give git commit -b an extra unsupported argument "bar", it confusingly reports: $ git checkout -b foo origin/master bar fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it $ git checkout --track -b foo origin/master bar fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it That's wrong, because it very well understands that "origin/master" is supposed to be the start point for the new branch and not "bar". Check if we got a commit and show more fitting messages in that case instead: $ git checkout -b foo origin/master bar fatal: Cannot update paths and switch to branch 'foo' at the same time. $ git checkout --track -b foo origin/master bar fatal: '--track' cannot be used with updating paths Original-patch-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:21:30 -07:00
Carlo Marcelo Arenas Belón	4d9005ff5d	bisect--helper: avoid segfault with bad syntax in `start --term-*` `06f5608c14` (bisect--helper: `bisect_start` shell function partially in C, 2019-01-02) adds a lax parser for `git bisect start` which could result in a segfault under a bad syntax call for start with custom terms. Detect if there are enough arguments left in the command line to use for --term-{old,good,new,bad} and abort with the same syntax error the original implementation will show if not. While at it, remove an unnecessary (and incomplete) check for unknown arguments and make sure to add a test to avoid regressions. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Acked-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 09:00:11 -07:00
brian m. carlson	81861288a9	builtin/checkout: simplify metadata initialization When we call init_checkout_metadata in reset_tree, we want to pass the object ID of the commit in question so that it can be passed to filters, or if there is no commit, the tree. We anticipated this latter case, which can occur elsewhere in the checkout code, but it cannot occur here. The only case in which we do not have a commit object is when invoking git switch with --orphan. Moreover, we can only hit this code path without a commit object additionally with either --force or --discard-changes. In such a case, there is no point initializing the checkout metadata with a commit or tree because (a) there is no commit, only the empty tree, and (b) we will never use the data, since no files will be smudged when checking out a branch with no files. Pass the all-zeros object ID in this case, since we just need some value which is a valid pointer. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-21 09:55:21 -07:00
Derrick Stolee	e68a5272b1	fsck: use ERROR_MULTI_PACK_INDEX The multi-pack-index was added to the data verified by git-fsck in ea5ae6c3 "fsck: verify multi-pack-index". This implementation was based on the implementation for verifying the commit-graph, and a copy-paste error kept the ERROR_COMMIT_GRAPH flag as the bit set when an error appears in the multi-pack-index. Add a new flag, ERROR_MULTI_PACK_INDEX, and use that instead. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-19 16:13:22 -07:00

1 2 3 4 5 ...

8642 Commits