git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	77348b0e6e	Merge branch 'js/range-diff-wo-dotdot' There are other ways than ".." for a single token to denote a "commit range", namely "<rev>^!" and "<rev>^-<n>", but "git range-diff" did not understand them. * js/range-diff-wo-dotdot: range-diff(docs): explain how to specify commit ranges range-diff/format-patch: handle commit ranges other than A..B range-diff/format-patch: refactor check for commit range	2021-02-17 17:21:41 -08:00
Junio C Hamano	69571dfe21	Merge branch 'jt/clone-unborn-head' "git clone" tries to locally check out the branch pointed at by HEAD of the remote repository after it is done, but the protocol did not convey the information necessary to do so when copying an empty repository. The protocol v2 learned how to do so. * jt/clone-unborn-head: clone: respect remote unborn HEAD connect, transport: encapsulate arg in struct ls-refs: report unborn targets of symrefs	2021-02-17 17:21:40 -08:00
Junio C Hamano	0871fb9af5	Merge branch 'mr/bisect-in-c-4' Piecemeal of rewrite of "git bisect" in C continues. * mr/bisect-in-c-4: bisect--helper: retire `--check-and-set-terms` subcommand bisect--helper: reimplement `bisect_skip` shell function in C bisect--helper: retire `--bisect-auto-next` subcommand bisect--helper: use `res` instead of return in BISECT_RESET case option bisect--helper: retire `--bisect-write` subcommand bisect--helper: reimplement `bisect_replay` shell function in C bisect--helper: reimplement `bisect_log` shell function in C	2021-02-17 17:21:40 -08:00
Ævar Arnfjörð Bjarmason	cbe81e653f	grep/pcre2: move back to thread-only PCREv2 structures Change the setup of the "pcre2_general_context" to happen per-thread in compile_pcre2_pattern() instead of in grep_init(). This change brings it in line with how the rest of the pcre2_* members in the grep_pat structure are set up. As noted in the preceding commit the approach `513f2b0bbd` (grep: make PCRE2 aware of custom allocator, 2019-10-16) took to allocate the pcre2_general_context seems to have been initially based on a misunderstanding of how PCREv2 memory allocation works. The approach of creating a global context in grep_init() is just added complexity for almost zero gain. On my system it's 24 bytes saved per-thread. For comparison PCREv2 will then go on to allocate at least a kilobyte for its own thread-local state. As noted in `6d423dd542` (grep: don't redundantly compile throwaway patterns under threading, 2017-05-25) the grep code is intentionally not trying to micro-optimize allocations by e.g. sharing some PCREv2 structures globally, while making others thread-local. So let's remove this special case and make all of them thread-local again for simplicity. With this change we could move the pcre2_{malloc,free} functions around to live closer to their current use. I'm not doing that here to keep this change small, that cleanup will be done in a follow-up commit. See also the discussion in `94da9193a6` (grep: add support for PCRE v2, 2017-06-01) about thread safety, and Johannes's comments[1] to the effect that we should be doing what this patch is doing. 1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:32:19 -08:00
Rafael Silva	8e16effe97	blame: remove unnecessary use of get_commit_info() When `git blame --color-by-age`, the determine_line_heat() is called to select how to color the output based on the commit's author date. It uses the get_commit_info() to parse the information into a `commit_info` structure, however, this is actually unnecessary because the determine_line_heat() caller also does the same. Instead, let's change the determine_line_heat() to take a `commit_info` structure and remove the internal call to get_commit_info() thus cleaning up and optimizing the code path. Enabling Git's trace2 API in order to record the execution time for every call to determine_line_heat() function: + trace2_region_enter("blame", "determine_line_heat", the_repository); determine_line_heat(ent, &default_color); + trace2_region_enter("blame", "determine_line_heat", the_repository); Then, running `git blame` for "kernel/fork.c" in linux.git and summing all the execution time for every call (around 1.3k calls) resulted in 2.6x faster execution (best out 3): git built from `328c109303` (The eighth batch, 2021-02-12) = 42ms git built from `328c109303` + this change = 16ms Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 11:04:17 -08:00
Matheus Tavares	3f7ba60350	checkout-index: omit entries with no tempname from --temp output With --temp (or --stage=all, which implies --temp), checkout-index writes a list to stdout associating temporary file names to the entries' names. But if it fails to write an entry, and the failure happens before even assigning a temporary filename to that entry, we get an odd output line. This can be seen when trying to check out a symlink whose blob is missing: $ missing_blob=$(git hash-object --stdin </dev/null) $ git update-index --add --cacheinfo 120000,$missing_blob,foo $ git checkout-index --temp foo error: unable to read sha1 file of foo (`e69de29bb2`) foo The 'TAB foo' line is not much useful and it might break scripts that expect the 'tempname TAB foo' output. So let's omit such entries from the stdout list (but leaving the error message on stderr). We could also consider omitting _all_ failed entries from the output list, but that's probably not a good idea as the associated tempfiles may have been created even when checkout failed, so scripts may want to use the output list for cleanup. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 11:27:18 -08:00
Jeff King	1679d60bfc	exclude: add flags parameter to add_patterns() There are a number of callers of add_patterns() and its sibling functions. Let's give them a "flags" parameter for adding new options without having to touch each caller. We'll use this in a future patch to add O_NOFOLLOW support. But for now each caller just passes 0. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:41:33 -08:00
Junio C Hamano	1eb4136ac2	diff: --{rotate,skip}-to=<path> In the implementation of "git difftool", there is a case where the user wants to start viewing the diffs at a specific path and continue on to the rest, optionally wrapping around to the beginning. Since it is somewhat cumbersome to implement such a feature as a post-processing step of "git diff" output, let's support it internally with two new options. - "git diff --rotate-to=C", when the resulting patch would show paths A B C D E without the option, would "rotate" the paths to shows patch to C D E A B instead. It is an error when there is no patch for C is shown. - "git diff --skip-to=C" would instead "skip" the paths before C, and shows patch to C D E. Again, it is an error when there is no patch for C is shown. - "git log [-p]" also accepts these two options, but it is not an error if there is no change to the specified path. Instead, the set of output paths are rotated or skipped to the specified path or the first path that sorts after the specified path. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:30:42 -08:00
Junio C Hamano	8b25dee615	Merge branch 'tb/precompose-prefix-too' When commands are started from a subdirectory, they may have to compare the path to the subdirectory (called prefix and found out from $(pwd)) with the tracked paths. On macOS, $(pwd) and readdir() yield decomposed path, while the tracked paths are usually normalized to the precomposed form, causing mismatch. This has been fixed by taking the same approach used to normalize the command line arguments. * tb/precompose-prefix-too: MacOS: precompose_argv_prefix()	2021-02-12 14:21:04 -08:00
Junio C Hamano	3c12d0b885	Merge branch 'tb/pack-revindex-on-disk' Introduce an on-disk file to record revindex for packdata, which traditionally was always created on the fly and only in-core. * tb/pack-revindex-on-disk: t5325: check both on-disk and in-memory reverse index pack-revindex: ensure that on-disk reverse indexes are given precedence t: support GIT_TEST_WRITE_REV_INDEX t: prepare for GIT_TEST_WRITE_REV_INDEX Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' builtin/pack-objects.c: respect 'pack.writeReverseIndex' builtin/index-pack.c: write reverse indexes builtin/index-pack.c: allow stripping arbitrary extensions pack-write.c: prepare to write 'pack-.rev' files packfile: prepare for the existence of '.rev' files	2021-02-12 14:21:04 -08:00
Denton Liu	3e885f0277	stash: declare ref_stash as an array Save sizeof(const char *) bytes by declaring ref_stash as an array instead of having a redundant pointer to an array. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 13:34:58 -08:00
Jeff King	16950f8384	rev-list: add --disk-usage option for calculating disk usage It can sometimes be useful to see which refs are contributing to the overall repository size (e.g., does some branch have a bunch of objects not found elsewhere in history, which indicates that deleting it would shrink the size of a clone). You can find that out by generating a list of objects, getting their sizes from cat-file, and then summing them, like: git rev-list --objects --no-object-names main..branch git cat-file --batch-check='%(objectsize:disk)' \| perl -lne '$total += $_; END { print $total }' Though note that the caveats from git-cat-file(1) apply here. We "blame" base objects more than their deltas, even though the relationship could easily be flipped. Still, it can be a useful rough measure. But one problem is that it's slow to run. Teaching rev-list to sum up the sizes can be much faster for two reasons: 1. It skips all of the piping of object names and sizes. 2. If bitmaps are in use, for objects that are in the bitmapped packfile we can skip the oid_object_info() lookup entirely, and just ask the revindex for the on-disk size. This patch implements a --disk-usage option which produces the same answer in a fraction of the time. Here are some timings using a clone of torvalds/linux: [rev-list piped to cat-file, no bitmaps] $ time git rev-list --objects --no-object-names --all \| git cat-file --buffer --batch-check='%(objectsize:disk)' \| perl -lne '$total += $_; END { print $total }' 1459938510 real 0m29.635s user 0m38.003s sys 0m1.093s [internal, no bitmaps] $ time git rev-list --disk-usage --objects --all 1459938510 real 0m31.262s user 0m30.885s sys 0m0.376s Even though the wall-clock time is slightly worse due to parallelism, notice the CPU savings between the two. We saved 21% of the CPU just by avoiding the pipes. But the real win is with bitmaps. If we use them without the new option: [rev-list piped to cat-file, bitmaps] $ time git rev-list --objects --no-object-names --all --use-bitmap-index \| git cat-file --batch-check='%(objectsize:disk)' \| perl -lne '$total += $_; END { print $total }' 1459938510 real 0m6.244s user 0m8.452s sys 0m0.311s then we're faster to generate the list of objects, but we still spend a lot of time piping and looking things up. But if we do both together: [internal, bitmaps] $ time git rev-list --disk-usage --objects --all --use-bitmap-index 1459938510 real 0m0.219s user 0m0.169s sys 0m0.049s then we get the same answer much faster. For "--all", that answer will correspond closely to "du objects/pack", of course. But we're actually checking reachability here, so we're still fast when we ask for more interesting things: $ time git rev-list --disk-usage --use-bitmap-index v5.0..v5.10 374798628 real 0m0.429s user 0m0.356s sys 0m0.072s Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 09:57:55 -08:00
Johannes Schindelin	c809798b2a	reflog expire --stale-fix: be generous about missing objects Whenever a user runs `git reflog expire --stale-fix`, the most likely reason is that their repository is at least _somewhat_ corrupt. Which means that it is more than just possible that some objects are missing. If that is the case, that can currently let the command abort through the phase where it tries to mark all reachable objects. Instead of adding insult to injury, let's be gentle and continue as best as we can in such a scenario, simply by ignoring the missing objects and moving on. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 09:21:52 -08:00
Ævar Arnfjörð Bjarmason	e900d494dc	diff: add an API for deferred freeing Add a diff_free() function to free anything we may have allocated in the "diff_options" struct, and the ability to make calling it a noop by setting "no_free" in "diff_options". This is required because when e.g. "git diff" is run we'll allocate things in that struct, use the diff machinery once, and then exit. But if we run e.g. "git log -p" we're going to re-use what we allocated across multiple diff_flush() calls, and only want to free things at the end. We've thus ended up with features like the recently added "diff -I"[1] where we'll leak memory. As it turns out it could have simply used the pattern established in `6ea57703f6` (log: prepare log/log-tree to reuse the diffopt.close_file attribute, 2016-06-22). Manually adding more such flags to things log_tree_commit() every time we need to allocate something would be tedious. Let's instead move that fclose() code it to a new diff_free(), in anticipation of freeing more things in that function in follow-up commits. Some functions such as log_tree_commit() need an idiom of optionally retaining a previous "no_free", as they may either free the memory themselves, or their caller may do so. I'm keeping that idiom in log_show_early() for good measure, even though I don't think it's currently called in this manner. It also gets passed an existing "struct rev_info", so future callers may want to set the "no_free" flag. This change is a bit hard to read because while the freeing pattern we're introducing isn't unusual, the "file" member is a special snowflake. We usually don't want to fclose() it. This is because "file" is usually stdout, in which case we don't want to fclose() it. We only want to opt-in to closing it when we e.g. open a file on the filesystem. Thus the opt-in "close_file" flag. So the API in general just needs a "no_free" flag to defer freeing, but the "file" member still needs its "close_file" flag. This is made more confusing because while refactoring this code we could replace some "close_file=0" with "no_free=1", whereas others need to set both flags. This is because there were some cases where an existing "close_file=0" meant "let's defer deallocation", and others where it meant "we don't want to close this file handle at all". 1. `296d4a94e7` (diff: add -I<regex> that ignores matching changes, 2020-10-20) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-11 09:21:05 -08:00
brian m. carlson	482c119186	gpg-interface: improve interface for parsing tags We have a function which parses a buffer with a signature at the end, parse_signature, and this function is used for signed tags. However, we'll need to store values for multiple algorithms, and we'll do this by using a header for the non-default algorithm. Adjust the parse_signature interface to store the parsed data in two strbufs and turn the existing function into parse_signed_buffer. The latter is still used in places where we know we always have a signed buffer, such as push certs. Adjust all the callers to deal with this new interface. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:35:42 -08:00
Junio C Hamano	2f794620f5	Merge branch 'ds/more-index-cleanups' Cleaning various codepaths up. * ds/more-index-cleanups: t1092: test interesting sparse-checkout scenarios test-lib: test_region looks for trace2 regions sparse-checkout: load sparse-checkout patterns name-hash: use trace2 regions for init repository: add repo reference to index_state fsmonitor: de-duplicate BUG()s around dirty bits cache-tree: extract subtree_pos() cache-tree: simplify verify_cache() prototype cache-tree: clean up cache_tree_update()	2021-02-10 14:48:33 -08:00
Junio C Hamano	02fb21617e	Merge branch 'rs/worktree-list-verbose' `git worktree list` now annotates worktrees as prunable, shows locked and prunable attributes in --porcelain mode, and gained a --verbose option. * rs/worktree-list-verbose: worktree: teach `list` verbose mode worktree: teach `list` to annotate prunable worktree worktree: teach `list --porcelain` to annotate locked worktree t2402: ensure locked worktree is properly cleaned up worktree: teach worktree_lock_reason() to gently handle main worktree worktree: teach worktree to lazy-load "prunable" reason worktree: libify should_prune_worktree()	2021-02-10 14:48:32 -08:00
Junio C Hamano	c9f94ab4fa	Merge branch 'ab/lose-grep-debug' Lose the debugging aid that may have been useful in the past, but no longer is, in the "grep" codepaths. * ab/lose-grep-debug: grep/log: remove hidden --debug and --grep-debug options	2021-02-10 14:48:31 -08:00
Junio C Hamano	9d5b1c06ac	Merge branch 'jk/use-oid-pos' Code clean-up to ensure our use of hashtables using object names as keys use the "struct object_id" objects, not the raw hash values. * jk/use-oid-pos: oid_pos(): access table through const pointers hash_pos(): convert to oid_pos() rerere: use strmap to store rerere directories rerere: tighten rr-cache dirname check rerere: check dirname format while iterating rr_cache directory commit_graft_pos(): take an oid instead of a bare hash	2021-02-10 14:48:31 -08:00
Matheus Tavares	42d906bec4	grep: honor sparse-checkout on working tree searches On a sparse checked out repository, `git grep` (without --cached) ends up searching the cache when an entry matches the search pathspec and has the SKIP_WORKTREE bit set. This is confusing both because the sparse paths are not expected to be in a working tree search (as they are not checked out), and because the output mixes working tree and cache results without distinguishing them. (Note that grep also resorts to the cache on working tree searches that include --assume-unchanged paths. But the whole point in that case is to assume that the contents of the index entry and the file are the same. This does not apply to the case of sparse paths, where the file isn't even expected to be present.) Fix that by teaching grep to honor the sparse-checkout rules for working tree searches. If the user wants to grep paths outside the current sparse-checkout definition, they may either update the sparsity rules to materialize the files, or use --cached to search all blobs registered in the index. Note: it might also be interesting to add a configuration option that allow users to search paths that are present despite having the SKIP_WORKTREE bit set, and/or to restrict searches in the index and past revisions too. These ideas are left as future improvements to avoid conflicting with other sparse-checkout topics currently in flight. Suggested-by: Elijah Newren <newren@gmail.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 23:10:51 -08:00
Derrick Stolee	acc1c4d5d4	maintenance: incremental strategy runs pack-refs weekly When the 'maintenance.strategy' config option is set to 'incremental', a default maintenance schedule is enabled. Add the 'pack-refs' task to that strategy at the weekly cadence. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 23:09:29 -08:00
Derrick Stolee	41abfe15d9	maintenance: add pack-refs task It is valuable to collect loose refs into a more compressed form. This is typically the packed-refs file, although this could be the reftable in the future. Having packed refs can be extremely valuable in repos with many tags or remote branches that are not modified by the local user, but still are necessary for other queries. For instance, with many exploded refs, commands such as git describe --tags --exact-match HEAD can be very slow (multiple seconds). This command in particular is used by terminal prompts to show when a detatched HEAD is pointing to an existing tag, so having it be slow causes significant delays for users. Add a new 'pack-refs' maintenance task. It runs 'git pack-refs --all --prune' to move loose refs into a packed form. For now, that is the packed-refs file, but could adjust to other file formats in the future. This is the first of several sub-tasks of the 'gc' task that could be extracted to their own tasks. In this process, we should not change the behavior of the 'gc' task since that remains the default way to keep repositories maintained. Creating a new task for one of these sub-tasks only provides more customization options for those choosing to not use the 'gc' task. It is certainly possible to have both the 'gc' and 'pack-refs' tasks enabled and run regularly. While they may repeat effort, they do not conflict in a destructive way. The 'auto_condition' function pointer is left NULL for now. We could extend this in the future to have a condition check if pack-refs should be run during 'git maintenance run --auto'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 23:09:24 -08:00
Matheus Tavares	0c5d83b248	grep: error out if --untracked is used with --cached The options --untracked and --cached are not compatible, but if they are used together, grep just silently ignores --cached and searches the working tree. Error out, instead, to avoid any potential confusion. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 12:39:06 -08:00
Junio C Hamano	f2d156dc48	Merge branch 'ab/branch-sort' into maint The implementation of "git branch --sort" wrt the detached HEAD display has always been hacky, which has been cleaned up. * ab/branch-sort: branch: show "HEAD detached" first under reverse sort branch: sort detached HEAD based on a flag ref-filter: move ref_sorting flags to a bitfield ref-filter: move "cmp_fn" assignment into "else if" arm ref-filter: add braces to if/else if/else chain branch tests: add to --sort tests branch: change "--local" to "--list" in comment	2021-02-08 14:05:55 -08:00
Junio C Hamano	171675a6c5	Merge branch 'ma/more-opaque-lock-file' into maint Code clean-up. * ma/more-opaque-lock-file: read-cache: try not to peek into `struct {lock_,temp}file` refs/files-backend: don't peek into `struct lock_file` midx: don't peek into `struct lock_file` commit-graph: don't peek into `struct lock_file` builtin/gc: don't peek into `struct lock_file`	2021-02-08 14:05:55 -08:00
Junio C Hamano	3e52ab222a	Merge branch 'zh/arg-help-format' into maint Clean up option descriptions in "git cmd --help". * zh/arg-help-format: builtin/*: update usage format parse-options: format argh like error messages	2021-02-08 14:05:54 -08:00
Johannes Schindelin	1e79f97326	range-diff: offer --left-only/--right-only options When comparing commit ranges, one is frequently interested only in one side, such as asking the question "Has this patch that I submitted to the Git mailing list been applied?": one would only care about the part of the output that corresponds to the commits in a local branch. To make that possible, imitate the `git rev-list` options `--left-only` and `--right-only`. This addresses https://github.com/gitgitgadget/git/issues/206 Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-06 21:14:31 -08:00
Johannes Schindelin	f1ce6c191e	range-diff: combine all options in a single data structure This will make it easier to implement the `--left-only` and `--right-only` options. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-06 21:14:31 -08:00
Junio C Hamano	77db59c2f9	Merge branch 'jv/pack-objects-narrower-ref-iteration' The "pack-objects" command needs to iterate over all the tags when automatic tag following is enabled, but it actually iterated over all refs and then discarded everything outside "refs/tags/" hierarchy, which was quite wasteful. * jv/pack-objects-narrower-ref-iteration: builtin/pack-objects.c: avoid iterating all refs	2021-02-05 16:40:45 -08:00
Junio C Hamano	f6ef8baba2	Merge branch 'ph/use-delete-refs' When removing many branches and tags, the code used to do so one ref at a time. There is another API it can use to delete multiple refs, and it makes quite a lot of performance difference when the refs are packed. * ph/use-delete-refs: use delete_refs when deleting tags or branches	2021-02-05 16:40:45 -08:00
Junio C Hamano	5198426d91	Merge branch 'zh/ls-files-deduplicate' "git ls-files" can and does show multiple entries when the index is unmerged, which is a source for confusion unless -s/-u option is in use. A new option --deduplicate has been introduced. * zh/ls-files-deduplicate: ls-files.c: add --deduplicate option ls_files.c: consolidate two for loops into one ls_files.c: bugfix for --deleted and --modified	2021-02-05 16:40:44 -08:00
Junio C Hamano	aac006aa99	Merge branch 'so/log-diff-merge' "git log" learned a new "--diff-merges=<how>" option. * so/log-diff-merge: (32 commits) t4013: add tests for --diff-merges=first-parent doc/git-show: include --diff-merges description doc/rev-list-options: document --first-parent changes merges format doc/diff-generate-patch: mention new --diff-merges option doc/git-log: describe new --diff-merges options diff-merges: add '--diff-merges=1' as synonym for 'first-parent' diff-merges: add old mnemonic counterparts to --diff-merges diff-merges: let new options enable diff without -p diff-merges: do not imply -p for new options diff-merges: implement new values for --diff-merges diff-merges: make -m/-c/--cc explicitly mutually exclusive diff-merges: refactor opt settings into separate functions diff-merges: get rid of now empty diff_merges_init_revs() diff-merges: group diff-merge flags next to each other inside 'rev_info' diff-merges: split 'ignore_merges' field diff-merges: fix -m to properly override -c/--cc t4013: add tests for -m failing to override -c/--cc t4013: support test_expect_failure through ':failure' magic diff-merges: revise revs->diff flag handling diff-merges: handle imply -p on -c/--cc logic for log.c ...	2021-02-05 16:40:44 -08:00
Junio C Hamano	897d28bcc2	Merge branch 'ds/for-each-repo-noopfix' into maint "git for-each-repo --config=<var> <cmd>" should not run <cmd> for any repository when the configuration variable <var> is not defined even once. * ds/for-each-repo-noopfix: for-each-repo: do nothing on empty config	2021-02-05 16:31:23 -08:00
Junio C Hamano	a4031f6dc0	Merge branch 'en/stash-apply-sparse-checkout' into maint "git stash" did not work well in a sparsely checked out working tree. * en/stash-apply-sparse-checkout: stash: fix stash application in sparse-checkouts stash: remove unnecessary process forking t7012: add a testcase demonstrating stash apply bugs in sparse checkouts	2021-02-05 16:31:22 -08:00
Junio C Hamano	a08832f16e	Merge branch 'rs/rebase-commit-validation' into maint Diagnose command line error of "git rebase" early. * rs/rebase-commit-validation: rebase: verify commit parameter	2021-02-05 16:31:22 -08:00
Jonathan Tan	4f37d45706	clone: respect remote unborn HEAD Teach Git to use the "unborn" feature introduced in a previous patch as follows: Git will always send the "unborn" argument if it is supported by the server. During "git clone", if cloning an empty repository, Git will use the new information to determine the local branch to create. In all other cases, Git will ignore it. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 13:49:55 -08:00
Jonathan Tan	39835409d1	connect, transport: encapsulate arg in struct In a future patch we plan to return the name of an unborn current branch from deep in the callchain to a caller via a new pointer parameter that points at a variable in the caller when the caller calls get_remote_refs() and transport_get_remote_refs(). In preparation for that, encapsulate the existing ref_prefixes parameter into a struct. The aforementioned unborn current branch will go into this new struct in the future patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 13:49:54 -08:00
Junio C Hamano	973e20b83f	Merge branch 'jk/peel-iterated-oid' The peel_ref() API has been replaced with peel_iterated_oid(). * jk/peel-iterated-oid: refs: switch peel_ref() to peel_iterated_oid()	2021-02-03 15:04:49 -08:00
Junio C Hamano	15bf48b987	Merge branch 'ds/maintenance-prefetch-cleanup' Test clean-up plus UI improvement by hiding extra refs that the prefetch task uses from "log --decorate" output. * ds/maintenance-prefetch-cleanup: t7900: clean up some broken refs maintenance: set log.excludeDecoration durin prefetch	2021-02-03 15:04:48 -08:00
Pranit Bauva	97b8294474	bisect--helper: retire `--check-and-set-terms` subcommand The `--check-and-set-terms` subcommand is no longer from the git-bisect.sh shell script. Instead the function `check_and_set_terms()` is called from the C implementation. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:09 -08:00
Pranit Bauva	e4c7b33747	bisect--helper: reimplement `bisect_skip` shell function in C Reimplement the `bisect_skip()` shell function in C and also add `bisect-skip` subcommand to `git bisect--helper` to call it from git-bisect.sh Using `--bisect-skip` subcommand is a temporary measure to port shell function to C so as to use the existing test suite. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:09 -08:00
Pranit Bauva	9feea34810	bisect--helper: retire `--bisect-auto-next` subcommand The --bisect-auto-next subcommand is no longer used from the git-bisect.sh shell script. Instead the function bisect_auto_next() is directly called from the C implementation. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:09 -08:00
Pranit Bauva	b7a6f163d6	bisect--helper: use `res` instead of return in BISECT_RESET case option Use `res` variable to store `bisect_reset()` output in BISECT_RESET case option to make bisect--helper.c more consistent. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:09 -08:00
Pranit Bauva	68efed8c8a	bisect--helper: retire `--bisect-write` subcommand The `--bisect-write` subcommand is no longer used from the git-bisect.sh shell script. Instead the function `bisect_write()` is directly called from the C implementation. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:08 -08:00
Pranit Bauva	2b1fd947f6	bisect--helper: reimplement `bisect_replay` shell function in C Reimplement the `bisect_replay` shell function in C and also add `--bisect-replay` subcommand to `git bisect--helper` to call it from git-bisect.sh Using `--bisect-replay` subcommand is a temporary measure to port shell function to C so as to use the existing test suite. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:08 -08:00
Pranit Bauva	97d5ba6a39	bisect--helper: reimplement `bisect_log` shell function in C Reimplement the `bisect_log()` shell function in C and also add `--bisect-log` subcommand to `git bisect--helper` to call it from git-bisect.sh . Using `--bisect-log` subcommand is a temporary measure to port shell function to C so as to use the existing test suite. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Helped-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:52:08 -08:00
Torsten Bögershausen	5c327502db	MacOS: precompose_argv_prefix() The following sequence leads to a "BUG" assertion running under MacOS: DIR=git-test-restore-p Adiarnfd=$(printf 'A\314\210') DIRNAME=xx${Adiarnfd}yy mkdir $DIR && cd $DIR && git init && mkdir $DIRNAME && cd $DIRNAME && echo "Initial" >file && git add file && echo "One more line" >>file && echo y \| git restore -p . Initialized empty Git repository in /tmp/git-test-restore-p/.git/ BUG: pathspec.c:495: error initializing pathspec_item Cannot close git diff-index --cached --numstat [snip] The command `git restore` is run from a directory inside a Git repo. Git needs to split the $CWD into 2 parts: The path to the repo and "the rest", if any. "The rest" becomes a "prefix" later used inside the pathspec code. As an example, "/path/to/repo/dir-inside-repå" would determine "/path/to/repo" as the root of the repo, the place where the configuration file .git/config is found. The rest becomes the prefix ("dir-inside-repå"), from where the pathspec machinery expands the ".", more about this later. If there is a decomposed form, (making the decomposing visible like this), "dir-inside-rep°a" doesn't match "dir-inside-repå". Git commands need to: (a) read the configuration variable "core.precomposeunicode" (b) precocompose argv[] (c) precompose the prefix, if there was any The first commit, `76759c7dff` "git on Mac OS and precomposed unicode" addressed (a) and (b). The call to precompose_argv() was added into parse-options.c, because that seemed to be a good place when the patch was written. Commands that don't use parse-options need to do (a) and (b) themselfs. The commands `diff-files`, `diff-index`, `diff-tree` and `diff` learned (a) and (b) in commit `90a78b83e0` "diff: run arguments through precompose_argv" Branch names (or refs in general) using decomposed code points resulting in decomposed file names had been fixed in commit `8e712ef6fc` "Honor core.precomposeUnicode in more places" The bug report from above shows 2 things: - more commands need to handle precomposed unicode - (c) should be implemented for all commands using pathspecs Solution: precompose_argv() now handles the prefix (if needed), and is renamed into precompose_argv_prefix(). Inside this function the config variable core.precomposeunicode is read into the global variable precomposed_unicode, as before. This reading is skipped if precomposed_unicode had been read before. The original patch for preocomposed unicode, `76759c7dff`, placed precompose_argv() into parse-options.c Now add it into git.c::run_builtin() as well. Existing precompose calls in diff-files.c and others may become redundant, and if we audit the callflows that reach these places to make sure that they can never be reached without going through the new call added to run_builtin(), we might be able to remove these existing ones. But in this commit, we do not bother to do so and leave these precompose callsites as they are. Because precompose() is idempotent and can be called on an already precomposed string safely, this is safer than removing existing calls without fully vetting the callflows. There is certainly room for cleanups - this change intends to be a bug fix. Cleanups needs more tests in e.g. t/t3910-mac-os-precompose.sh, and should be done in future commits. [1] git-bugreport-2021-01-06-1209.txt (git can't deal with special characters) [2] https://lore.kernel.org/git/A102844A-9501-4A86-854D-E3B387D378AA@icloud.com/ Reported-by: Daniel Troger <random_n0body@icloud.com> Helped-By: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-03 14:09:37 -08:00
Rafael Silva	076b444a62	worktree: teach `list` verbose mode "git worktree list" annotates each worktree according to its state such as "prunable" or "locked", however it is not immediately obvious why these worktrees are being annotated. For prunable worktrees a reason is available that is returned by should_prune_worktree() and for locked worktrees a reason might be available provided by the user via `lock` command. Let's teach "git worktree list" a --verbose mode that outputs the reason why the worktrees are being annotated. The reason is a text that can take virtually any size and appending the text on the default columned format will make it difficult to extend the command with other annotations and not fit nicely on the screen. In order to address this shortcoming the annotation is then moved to the next line indented followed by the reason If the reason is not available the annotation stays on the same line as the worktree itself. The output of "git worktree list" with verbose becomes like so: $ git worktree list --verbose ... /path/to/locked-no-reason acb124 [branch-a] locked /path/to/locked-with-reason acc125 [branch-b] locked: worktree with a locked reason /path/to/prunable-reason ace127 [branch-d] prunable: gitdir file points to non-existent location ... Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:40 -08:00
Rafael Silva	9b19a58f66	worktree: teach `list` to annotate prunable worktree The "git worktree list" command shows the absolute path to the worktree, the commit that is checked out, the name of the branch, and a "locked" annotation if the worktree is locked, however, it does not indicate whether the worktree is prunable. The "prune" command will remove a worktree if it is prunable unless `--dry-run` option is specified. This could lead to a worktree being removed without the user realizing before it is too late, in case the user forgets to pass --dry-run for instance. If the "list" command shows which worktree is prunable, the user could verify before running "git worktree prune" and hopefully prevents the working tree to be removed "accidentally" on the worse case scenario. Let's teach "git worktree list" to show when a worktree is a prunable candidate for both default and porcelain format. In the default format a "prunable" text is appended: $ git worktree list /path/to/main aba123 [main] /path/to/linked 123abc [branch-a] /path/to/prunable ace127 (detached HEAD) prunable In the --porcelain format a prunable label is added followed by its reason: $ git worktree list --porcelain ... worktree /path/to/prunable HEAD abc1234abc1234abc1234abc1234abc1234abc12 detached prunable gitdir file points to non-existent location ... Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:35 -08:00
Rafael Silva	862c723d18	worktree: teach `list --porcelain` to annotate locked worktree Commit `c57b3367be` (worktree: teach `list` to annotate locked worktree, 2020-10-11) taught "git worktree list" to annotate locked worktrees by appending "locked" text to its output, however, this is not listed in the --porcelain format. Teach "list --porcelain" to do the same and add a "locked" attribute followed by its reason, thus making both default and porcelain format consistent. If the locked reason is not available then only "locked" is shown. The output of the "git worktree list --porcelain" becomes like so: $ git worktree list --porcelain ... worktree /path/to/locked HEAD 123abcdea123abcd123acbd123acbda123abcd12 detached locked worktree /path/to/locked-with-reason HEAD abc123abc123abc123abc123abc123abc123abc1 detached locked reason why it is locked ... In porcelain mode, if the lock reason contains special characters such as newlines, they are escaped with backslashes and the entire reason is enclosed in double quotes. For example: $ git worktree list --porcelain ... locked "worktree's path mounted in\nremovable device" ... Furthermore, let's update the documentation to state that some attributes in the porcelain format might be listed alone or together with its value depending whether the value is available or not. Thus documenting the case of the new "locked" attribute. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:29 -08:00
Rafael Silva	eb36135af7	worktree: teach worktree_lock_reason() to gently handle main worktree worktree_lock_reason() aborts with an assertion failure when called on the main worktree since locking the main worktree is nonsensical. Not only is this behavior undocumented, thus callers might not even be aware that the call could potentially crash the program, but it also forces clients to be extra careful: if (!is_main_worktree(wt) && worktree_locked_reason(...)) ... Since we know that locking makes no sense in the context of the main worktree, we can simply return false for the main worktree, thus making client code less complex by eliminating the need for the callers to have inside knowledge about the implementation: if (worktree_lock_reason(...)) ... Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:20 -08:00
Rafael Silva	a29a8b7574	worktree: libify should_prune_worktree() As part of teaching "git worktree list" to annotate worktree that is a candidate for pruning, let's move should_prune_worktree() from builtin/worktree.c to worktree.c in order to make part of the worktree public API. should_prune_worktree() knows how to select the given worktree for pruning based on an expiration date, however the expiration value is stored in a static file-scope variable and it is not local to the function. In order to move the function, teach should_prune_worktree() to take the expiration date as an argument and document the new parameter that is not immediately obvious. Also, change the function comment to clearly state that the worktree's path is returned in `wtpath` argument. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-30 09:57:08 -08:00
Jeff King	8380dcd700	oid_pos(): access table through const pointers When we are looking up an oid in an array, we obviously don't need to write to the array. Let's mark it as const in the function interfaces, as well as in the local variables we use to derference the void pointer (note a few cases use pointers-to-pointers, so we mark everything const). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 12:03:26 -08:00
Jeff King	45ee13b942	hash_pos(): convert to oid_pos() All of our callers are actually looking up an object_id, not a bare hash. Likewise, the arrays they are looking in are actual arrays of object_id (not just raw bytes of hashes, as we might find in a pack .idx; those are handled by bsearch_hash()). Using an object_id gives us more type safety, and makes the callers slightly shorter. It also gets rid of the word "sha1" from several access functions, though we could obviously also rename those with s/sha1/hash/. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 12:02:39 -08:00
Johannes Schindelin	679b5916cd	range-diff/format-patch: refactor check for commit range Currently, when called with exactly two arguments, `git range-diff` tests for a literal `..` in each of the two. Likewise, the argument provided via `--range-diff` to `git format-patch` is checked in the same manner. However, `<commit>^!` is a perfectly valid commit range, equivalent to `<commit>^..<commit>` according to the `SPECIFYING RANGES` section of gitrevisions[7]. In preparation for allowing more sophisticated ways to specify commit ranges, let's refactor the check into its own function. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-27 22:01:49 -08:00
Ævar Arnfjörð Bjarmason	15c9649730	grep/log: remove hidden --debug and --grep-debug options Remove the hidden "grep --debug" and "log --grep-debug" options added in `17bf35a3c7` (grep: teach --debug option to dump the parse tree, 2012-09-13). At the time these options seem to have been intended to go along with a documentation discussion and to help the author of relevant tests to perform ad-hoc debugging on them[1]. Reasons to want this gone: 1. They were never documented, and the only (rather trivial) use of them in our own codebase for testing is something I removed back in `e01b4dab01` (grep: change non-ASCII -i test to stop using --debug, 2017-05-20). 2. Googling around doesn't show any in-the-wild uses I could dig up, and on the Git ML the only mentions after the original discussion seem to have been when they came up in unrelated diff contexts, or that test commit of mine. 3. An exception to that is `c581e4a749` (grep: under --debug, show whether PCRE JIT is enabled, 2019-08-18) where we added the ability to dump out when PCREv2 has the JIT in effect. The combination of that and my earlier `b65abcafc7` (grep: use PCRE v2 for optimized fixed-string search, 2019-07-01) means Git prints this out in its most common in-the-wild configuration: $ git log --grep-debug --grep=foo --grep=bar --grep=baz --all-match pcre2_jit_on=1 pcre2_jit_on=1 pcre2_jit_on=1 [all-match] (or pattern_body<body>foo (or pattern_body<body>bar pattern_body<body>baz ) ) $ git grep --debug $ -e foo --and -e bar $ --or -e baz pcre2_jit_on=1 pcre2_jit_on=1 pcre2_jit_on=1 (or (and patternfoo patternbar ) patternbaz ) I.e. for each pattern we're considering for the and/or/--all-match etc. debugging we'll now diligently spew out another identical line saying whether the PCREv2 JIT is on or not. I think that nobody's complained about that rather glaringly obviously bad output says something about how much this is used, i.e. it's not. The need for this debugging aid for the composed grep/log patterns seems to have passed, and the desire to dump the JIT config seems to have been another one-off around the time we had JIT-related issues on the PCREv2 codepath. That the original author of this debugging facility seemingly hasn't noticed the bad output since then[2] is probably some indicator. 1. https://lore.kernel.org/git/cover.1347615361.git.git@drmicha.warpmail.net/ 2. https://lore.kernel.org/git/xmqqk1b8x0ac.fsf@gitster-ct.c.googlers.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-26 11:36:20 -08:00
Taylor Blau	e8c58f894b	t: support GIT_TEST_WRITE_REV_INDEX Add a new option that unconditionally enables the pack.writeReverseIndex setting in order to run the whole test suite in a mode that generates on-disk reverse indexes. Additionally, enable this mode in the second run of tests under linux-gcc in 'ci/run-build-and-tests.sh'. Once on-disk reverse indexes are proven out over several releases, we can change the default value of that configuration to 'true', and drop this patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:44 -08:00
Taylor Blau	c97733435a	builtin/pack-objects.c: respect 'pack.writeReverseIndex' Now that we have an implementation that can write the new reverse index format, enable writing a .rev file in 'git pack-objects' by consulting the pack.writeReverseIndex configuration variable. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Taylor Blau	e37d0b8730	builtin/index-pack.c: write reverse indexes Teach 'git index-pack' to optionally write and verify reverse index with '--[no-]rev-index', as well as respecting the 'pack.writeReverseIndex' configuration option. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Taylor Blau	84d544943c	builtin/index-pack.c: allow stripping arbitrary extensions To derive the filename for a .idx file, 'git index-pack' uses derive_filename() to strip the '.pack' suffix and add the new suffix. Prepare for stripping off suffixes other than '.pack' by making the suffix to strip a parameter of derive_filename(). In order to make this consistent with the "suffix" parameter which does not begin with a ".", an additional check in derive_filename. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Taylor Blau	2f4ba2a867	packfile: prepare for the existence of '.rev' files Specify the format of the on-disk reverse index 'pack-.rev' file, as well as prepare the code for the existence of such files. The reverse index maps from pack relative positions (i.e., an index into the array of object which is sorted by their offsets within the packfile) to their position within the 'pack-.idx' file. Today, this is done by building up a list of (off_t, uint32_t) tuples for each object (the off_t corresponding to that object's offset, and the uint32_t corresponding to its position in the index). To convert between pack and index position quickly, this array of tuples is radix sorted based on its offset. This has two major drawbacks: First, the in-memory cost scales linearly with the number of objects in a pack. Each 'struct revindex_entry' is sizeof(off_t) + sizeof(uint32_t) + padding bytes for a total of 16. To observe this, force Git to load the reverse index by, for e.g., running 'git cat-file --batch-check="%(objectsize:disk)"'. When asking for a single object in a fresh clone of the kernel, Git needs to allocate 120+ MB of memory in order to hold the reverse index in memory. Second, the cost to sort also scales with the size of the pack. Luckily, this is a linear function since 'load_pack_revindex()' uses a radix sort, but this cost still must be paid once per pack per process. As an example, it takes ~60x longer to print the _size_ of an object as it does to print that entire object's _contents_: Benchmark #1: git.compile cat-file --batch <obj Time (mean ± σ): 3.4 ms ± 0.1 ms [User: 3.3 ms, System: 2.1 ms] Range (min … max): 3.2 ms … 3.7 ms 726 runs Benchmark #2: git.compile cat-file --batch-check="%(objectsize:disk)" <obj Time (mean ± σ): 210.3 ms ± 8.9 ms [User: 188.2 ms, System: 23.2 ms] Range (min … max): 193.7 ms … 224.4 ms 13 runs Instead, avoid computing and sorting the revindex once per process by writing it to a file when the pack itself is generated. The format is relatively straightforward. It contains an array of uint32_t's, the length of which is equal to the number of objects in the pack. The ith entry in this table contains the index position of the ith object in the pack, where "ith object in the pack" is determined by pack offset. One thing that the on-disk format does _not_ contain is the full (up to) eight-byte offset corresponding to each object. This is something that the in-memory revindex contains (it stores an off_t in 'struct revindex_entry' along with the same uint32_t that the on-disk format has). Omit it in the on-disk format, since knowing the index position for some object is sufficient to get a constant-time lookup in the pack-.idx file to ask for an object's offset within the pack. This trades off between the on-disk size of the 'pack-.rev' file for runtime to chase down the offset for some object. Even though the lookup is constant time, the constant is heavier, since it can potentially involve two pointer walks in v2 indexes (one to access the 4-byte offset table, and potentially a second to access the double wide offset table). Consider trying to map an object's pack offset to a relative position within that pack. In a cold-cache scenario, more page faults occur while switching between binary searching through the reverse index and searching through the .idx file for an object's offset. Sure enough, with a cold cache (writing '3' into '/proc/sys/vm/drop_caches' after 'sync'ing), printing out the entire object's contents is still marginally faster than printing its size: Benchmark #1: git.compile cat-file --batch-check="%(objectsize:disk)" <obj >/dev/null Time (mean ± σ): 22.6 ms ± 0.5 ms [User: 2.4 ms, System: 7.9 ms] Range (min … max): 21.4 ms … 23.5 ms 41 runs Benchmark #2: git.compile cat-file --batch <obj >/dev/null Time (mean ± σ): 17.2 ms ± 0.7 ms [User: 2.8 ms, System: 5.5 ms] Range (min … max): 15.6 ms … 18.2 ms 45 runs (Numbers taken in the kernel after cheating and using the next patch to generate a reverse index). There are a couple of approaches to improve cold cache performance not pursued here: - We could include the object offsets in the reverse index format. Predictably, this does result in fewer page faults, but it triples the size of the file, while simultaneously duplicating a ton of data already available in the .idx file. (This was the original way I implemented the format, and it did show `--batch-check='%(objectsize:disk)'` winning out against `--batch`.) On the other hand, this increase in size also results in a large block-cache footprint, which could potentially hurt other workloads. - We could store the mapping from pack to index position in more cache-friendly way, like constructing a binary search tree from the table and writing the values in breadth-first order. This would result in much better locality, but the price you pay is trading O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you can no longer directly index the table). So, neither of these approaches are taken here. (Thankfully, the format is versioned, so we are free to pursue these in the future.) But, cold cache performance likely isn't interesting outside of one-off cases like asking for the size of an object directly. In real-world usage, Git is often performing many operations in the revindex (i.e., asking about many objects rather than a single one). The trade-off is worth it, since we will avoid the vast majority of the cost of generating the revindex that the extra pointer chase will look like noise in the following patch's benchmarks. This patch describes the format and prepares callers (like in pack-revindex.c) to be able to read *.rev files once they exist. An implementation of the writer will appear in the next patch, and callers will gradually begin to start using the writer in the patches that follow after that. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Junio C Hamano	bcaaf972e6	Merge branch 'tb/pack-revindex-api' Abstract accesses to in-core revindex that allows enumerating objects stored in a packfile in the order they appear in the pack, in preparation for introducing an on-disk precomputed revindex. * tb/pack-revindex-api: (21 commits) for_each_object_in_pack(): clarify pack vs index ordering pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()' pack-revindex: hide the definition of 'revindex_entry' pack-revindex: remove unused 'find_revindex_position()' pack-revindex: remove unused 'find_pack_revindex()' builtin/gc.c: guess the size of the revindex for_each_object_in_pack(): convert to new revindex API unpack_entry(): convert to new revindex API packed_object_info(): convert to new revindex API retry_bad_packed_offset(): convert to new revindex API get_delta_base_oid(): convert to new revindex API rebuild_existing_bitmaps(): convert to new revindex API try_partial_reuse(): convert to new revindex API get_size_by_pos(): convert to new revindex API show_objects_for_type(): convert to new revindex API bitmap_position_packfile(): convert to new revindex API check_object(): convert to new revindex API write_reused_pack_verbatim(): convert to new revindex API write_reused_pack_one(): convert to new revindex API write_reuse_object(): convert to new revindex API ...	2021-01-25 14:19:20 -08:00
Junio C Hamano	7eefa1349b	Merge branch 'cc/write-promisor-file' A bit of code refactoring. * cc/write-promisor-file: pack-write: die on error in write_promisor_file() fetch-pack: refactor writing promisor file fetch-pack: rename helper to create_promisor_file()	2021-01-25 14:19:19 -08:00
Junio C Hamano	42342b3ee6	Merge branch 'ab/mailmap' Clean-up docs, codepaths and tests around mailmap. * ab/mailmap: (22 commits) shortlog: remove unused(?) "repo-abbrev" feature mailmap doc + tests: document and test for case-insensitivity mailmap tests: add tests for empty "<>" syntax mailmap tests: add tests for whitespace syntax mailmap tests: add a test for comment syntax mailmap doc + tests: add better examples & test them tests: refactor a few tests to use "test_commit --append" test-lib functions: add an --append option to test_commit test-lib functions: add --author support to test_commit test-lib functions: document arguments to test_commit test-lib functions: expand "test_commit" comment template mailmap: test for silent exiting on missing file/blob mailmap tests: get rid of overly complex blame fuzzing mailmap tests: add a test for "not a blob" error mailmap tests: remove redundant entry in test mailmap tests: improve --stdin tests mailmap tests: modernize syntax & test idioms mailmap tests: use our preferred whitespace syntax mailmap doc: start by mentioning the comment syntax check-mailmap doc: note config options ...	2021-01-25 14:19:19 -08:00
Junio C Hamano	60ecad090d	Merge branch 'ps/fetch-atomic' "git fetch" learns to treat ref updates atomically in all-or-none fashion, just like "git push" does, with the new "--atomic" option. * ps/fetch-atomic: fetch: implement support for atomic reference updates fetch: allow passing a transaction to `s_update_ref()` fetch: refactor `s_update_ref` to use common exit path fetch: use strbuf to format FETCH_HEAD updates fetch: extract writing to FETCH_HEAD	2021-01-25 14:19:19 -08:00
Junio C Hamano	dfcd905069	Merge branch 'jc/deprecate-pack-redundant' Warn loudly when the "pack-redundant" command, which has been left stale with almost unusable performance issues, gets used, as we no longer want to recommend its use (instead just "repack -d" instead). * jc/deprecate-pack-redundant: pack-redundant: gauge the usage before proposing its removal	2021-01-25 14:19:18 -08:00
Junio C Hamano	9e409d7e07	Merge branch 'ab/branch-sort' The implementation of "git branch --sort" wrt the detached HEAD display has always been hacky, which has been cleaned up. * ab/branch-sort: branch: show "HEAD detached" first under reverse sort branch: sort detached HEAD based on a flag ref-filter: move ref_sorting flags to a bitfield ref-filter: move "cmp_fn" assignment into "else if" arm ref-filter: add braces to if/else if/else chain branch tests: add to --sort tests branch: change "--local" to "--list" in comment	2021-01-25 14:19:17 -08:00
Junio C Hamano	58e2ce9112	Merge branch 'ma/more-opaque-lock-file' Code clean-up. * ma/more-opaque-lock-file: read-cache: try not to peek into `struct {lock_,temp}file` refs/files-backend: don't peek into `struct lock_file` midx: don't peek into `struct lock_file` commit-graph: don't peek into `struct lock_file` builtin/gc: don't peek into `struct lock_file`	2021-01-25 14:19:17 -08:00
Junio C Hamano	c7d6d419b0	Merge branch 'ab/mktag' "git mktag" validates its input using its own rules before writing a tag object---it has been updated to share the logic with "git fsck". * ab/mktag: (23 commits) mktag: add a --[no-]strict option mktag: mark strings for translation mktag: convert to parse-options mktag: allow omitting the header/body \n separator mktag: allow turning off fsck.extraHeaderEntry fsck: make fsck_config() re-usable mktag: use fsck instead of custom verify_tag() mktag: use puts(str) instead of printf("%s\n", str) mktag: remove redundant braces in one-line body "if" mktag: use default strbuf_read() hint mktag tests: test verify_object() with replaced objects mktag tests: improve verify_object() test coverage mktag tests: test "hash-object" compatibility mktag tests: stress test whitespace handling mktag tests: run "fsck" after creating "mytag" mktag tests: don't create "mytag" twice mktag tests: don't redirect stderr to a file needlessly mktag tests: remove needless SHA-1 hardcoding mktag tests: use "test_commit" helper mktag tests: don't needlessly use a subshell ...	2021-01-25 14:19:17 -08:00
Derrick Stolee	dd23022acb	sparse-checkout: load sparse-checkout patterns A future feature will want to load the sparse-checkout patterns into a pattern_list, but the current mechanism to do so is a bit complicated. This is made difficult due to needing to find the sparse-checkout file in different ways throughout the codebase. The logic implemented in the new get_sparse_checkout_patterns() was duplicated in populate_from_existing_patterns() in unpack-trees.c. Use the new method instead, keeping the logic around handling the struct unpack_trees_options. The callers to get_sparse_checkout_filename() in builtin/sparse-checkout.c manipulate the sparse-checkout file directly, so it is not appropriate to replace logic in that file with get_sparse_checkout_patterns(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:07 -08:00
Derrick Stolee	fb0882648e	cache-tree: clean up cache_tree_update() Make the method safer by allocating a cache_tree member for the given index_state if it is not already present. This is preferrable to a BUG() statement or returning with an error because future callers will want to populate an empty cache-tree using this method. Callers can also remove their conditional allocations of cache_tree. Also drop local variables that can be found directly from the 'istate' parameter. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 17:14:07 -08:00
ZheNing Hu	93a7d9835f	ls-files.c: add --deduplicate option During a merge conflict, the name of a file may appear multiple times in "git ls-files" output, once for each stage. If you use both `--delete` and `--modify` at the same time, the output may mention a deleted file twice. When none of the '-t', '-u', or '-s' options is in use, these duplicate entries do not add much value to the output. Introduce a new '--deduplicate' option to suppress them. Signed-off-by: ZheNing Hu <adlternative@gmail.com> [jc: extended doc and rewritten commit log] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 11:48:20 -08:00
ZheNing Hu	ed644d1666	ls_files.c: consolidate two for loops into one This will make it easier to show only one entry per filename in the next step. Signed-off-by: ZheNing Hu <adlternative@gmail.com> [jc: corrected the log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 11:48:20 -08:00
ZheNing Hu	f1c462ea41	ls_files.c: bugfix for --deleted and --modified This situation may occur in the original code: lstat() failed but we use `&st` to feed ie_modified() later. Therefore, we can directly execute show_ce without the judgment of ie_modified() when lstat() has failed. Signed-off-by: ZheNing Hu <adlternative@gmail.com> [jc: fixed misindented code] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-23 11:48:11 -08:00
Jacob Vosmaer	be18153b97	builtin/pack-objects.c: avoid iterating all refs In git-pack-objects, we iterate over all the tags if the --include-tag option is passed on the command line. For some reason this uses for_each_ref which is expensive if the repo has many refs. We should use for_each_tag_ref instead. Because the add_ref_tag callback will now only visit tags we simplified it a bit. The motivation for this change is that we observed performance issues with a repository on gitlab.com that has 500,000 refs but only 2,000 tags. The fetch traffic on that repo is dominated by CI, and when we changed CI to fetch with 'git fetch --no-tags' we saw a dramatic change in the CPU profile of git-pack-objects. This lead us to this particular ref walk. More details in: https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/746#note_483546598 Signed-off-by: Jacob Vosmaer <jacob@gitlab.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-22 17:27:42 -08:00
Phil Hord	8198907795	use delete_refs when deleting tags or branches 'git tag -d' accepts one or more tag refs to delete, but each deletion is done by calling `delete_ref` on each argv. This is very slow when removing from packed refs. Use delete_refs instead so all the removals can be done inside a single transaction with a single update. Do the same for 'git branch -d'. Since delete_refs performs all the packed-refs delete operations inside a single transaction, if any of the deletes fail then all them will be skipped. In practice, none of them should fail since we verify the hash of each one before calling delete_refs, but some network error or odd permissions problem could have different results after this change. Also, since the file-backed deletions are not performed in the same transaction, those could succeed even when the packed-refs transaction fails. After deleting branches, remove the branch config only if the branch ref was removed and was not subsequently added back in. A manual test deleting 24,000 tags took about 30 minutes using delete_ref. It takes about 5 seconds using delete_refs. Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Phil Hord <phil.hord@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-21 16:05:05 -08:00
Jeff King	36a317929b	refs: switch peel_ref() to peel_iterated_oid() The peel_ref() interface is confusing and error-prone: - it's typically used by ref iteration callbacks that have both a refname and oid. But since they pass only the refname, we may load the ref value from the filesystem again. This is inefficient, but also means we are open to a race if somebody simultaneously updates the ref. E.g., this: int some_ref_cb(const char refname, const struct object_id oid, ...) { if (!peel_ref(refname, &peeled)) printf("%s peels to %s", oid_to_hex(oid), oid_to_hex(&peeled); } could print nonsense. It is correct to say "refname peels to..." (you may see the "before" value or the "after" value, either of which is consistent), but mentioning both oids may be mixing before/after values. Worse, whether this is possible depends on whether the optimization to read from the current iterator value kicks in. So it is actually not possible with: for_each_ref(some_ref_cb); but it _is_ possible with: head_ref(some_ref_cb); which does not use the iterator mechanism (though in practice, HEAD should never peel to anything, so this may not be triggerable). - it must take a fully-qualified refname for the read_ref_full() code path to work. Yet we routinely pass it partial refnames from callbacks to for_each_tag_ref(), etc. This happens to work when iterating because there we do not call read_ref_full() at all, and only use the passed refname to check if it is the same as the iterator. But the requirements for the function parameters are quite unclear. Instead of taking a refname, let's instead take an oid. That fixes both problems. It's a little funny for a "ref" function not to involve refs at all. The key thing is that it's optimizing under the hood based on having access to the ref iterator. So let's change the name to make it clear why you'd want this function versus just peel_object(). There are two other directions I considered but rejected: - we could pass the peel information into the each_ref_fn callback. However, we don't know if the caller actually wants it or not. For packed-refs, providing it is essentially free. But for loose refs, we actually have to peel the object, which would be wasteful in most cases. We could likewise pass in a flag to the callback indicating whether the peeled information is known, but that complicates those callbacks, as they then have to decide whether to manually peel themselves. Plus it requires changing the interface of every callback, whether they care about peeling or not, and there are many of them. - we could make a function to return the peeled value of the current iterated ref (computing it if necessary), and BUG() otherwise. I.e.: int peel_current_iterated_ref(struct object_id *out); Each of the current callers is an each_ref_fn callback, so they'd mostly be happy. But: - we use those callbacks with functions like head_ref(), which do not use the iteration code. So we'd need to handle the fallback case there, anyway. - it's possible that a caller would want to call into generic code that sometimes is used during iteration and sometimes not. This encapsulates the logic to do the fast thing when possible, and fallback when necessary. The implementation is mostly obvious, but I want to call out a few things in the patch: - the test-tool coverage for peel_ref() is now meaningless, as it all collapses to a single peel_object() call (arguably they were pretty uninteresting before; the tricky part of that function is the fast-path we see during iteration, but these calls didn't trigger that). I've just dropped it entirely, though note that some other tests relied on the tags we created; I've moved that creation to the tests where it matters. - we no longer need to take a ref_store parameter, since we'd never look up a ref now. We do still rely on a global "current iterator" variable which _could_ be kept per-ref-store. But in practice this is only useful if there are multiple recursive iterations, at which point the more appropriate solution is probably a stack of iterators. No caller used the actual ref-store parameter anyway (they all call the wrapper that passes the_repository). - the original only kicked in the optimization when the "refname" pointer matched (i.e., not string comparison). We do likewise with the "oid" parameter here, but fall back to doing an actual oideq() call. This in theory lets us kick in the optimization more often, though in practice no current caller cares. It should never be wrong, though (peeling is a property of an object, so two refs pointing to the same object would peel identically). - the original took care not to touch the peeled out-parameter unless we found something to put in it. But no caller cares about this, and anyway, it is enforced by peel_object() itself (and even in the optimized iterator case, that's where we eventually end up). We can shorten the code and avoid an extra copy by just passing the out-parameter through the stack. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-21 15:51:31 -08:00
Derrick Stolee	96eaffebbf	maintenance: set log.excludeDecoration durin prefetch The 'prefetch' task fetches refs from all remotes and places them in the refs/prefetch/<remote>/ refspace. As this task is intended to run in the background, this allows users to keep their local data very close to the remote servers' data while not updating the users' understanding of the remote refs in refs/remotes/<remote>/. However, this can clutter 'git log' decorations with copies of the refs with the full name 'refs/prefetch/<remote>/<branch>'. The log.excludeDecoration config option was added in `a6be5e67` (log: add log.excludeDecoration config option, 2020-05-16) for exactly this purpose. Ensure we set this only for users that would benefit from it by assigning it at the beginning of the prefetch task. Other alternatives would be during 'git maintenance register' or 'git maintenance start', but those might assign the config even when the prefetch task is disabled by existing config. Further, users could run 'git maintenance run --task=prefetch' using their own scripting or scheduling. This provides the best coverage to automatically update the config when valuable. It is improbable, but possible, that users might want to run the prefetch task _and_ see these refs in their log decorations. This seems incredibly unlikely to me, but users can always opt-in on a command-by-command basis using --decorate-refs=refs/prefetch/. Test that this works in a few cases. In particular, ensure that our assignment of log.excludeDecoration=refs/prefetch/ is additive to other existing exclusions. Further, ensure we do not add multiple copies in multiple runs. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 18:46:22 -08:00
Junio C Hamano	aa08688362	Merge branch 'ds/for-each-repo-noopfix' "git for-each-repo --config=<var> <cmd>" should not run <cmd> for any repository when the configuration variable <var> is not defined even once. * ds/for-each-repo-noopfix: for-each-repo: do nothing on empty config	2021-01-15 21:48:46 -08:00
Junio C Hamano	b2ace18759	Merge branch 'ds/maintenance-part-4' Follow-up on the "maintenance part-3" which introduced scheduled maintenance tasks to support platforms whose native scheduling methods are not 'cron'. * ds/maintenance-part-4: maintenance: use Windows scheduled tasks maintenance: use launchctl on macOS maintenance: include 'cron' details in docs maintenance: extract platform-specific scheduling	2021-01-15 21:48:45 -08:00
Junio C Hamano	62fb47a4d3	Merge branch 'en/stash-apply-sparse-checkout' "git stash" did not work well in a sparsely checked out working tree. * en/stash-apply-sparse-checkout: stash: fix stash application in sparse-checkouts stash: remove unnecessary process forking t7012: add a testcase demonstrating stash apply bugs in sparse checkouts	2021-01-15 15:20:29 -08:00
Junio C Hamano	2ce8de6bf9	Merge branch 'zh/arg-help-format' Clean up option descriptions in "git cmd --help". * zh/arg-help-format: builtin/*: update usage format parse-options: format argh like error messages	2021-01-15 15:20:29 -08:00
Junio C Hamano	df26861c56	Merge branch 'rs/rebase-commit-validation' Diagnose command line error of "git rebase" early. * rs/rebase-commit-validation: rebase: verify commit parameter	2021-01-15 15:20:29 -08:00
Junio C Hamano	8b327f1784	Merge branch 'ma/sha1-is-a-hash' Retire more names with "sha1" in it. * ma/sha1-is-a-hash: hash-lookup: rename from sha1-lookup sha1-lookup: rename `sha1_pos()` as `hash_pos()` object-file.c: rename from sha1-file.c object-name.c: rename from sha1-name.c	2021-01-15 15:20:29 -08:00
Junio C Hamano	9ba366f12b	Merge branch 'bc/rev-parse-path-format' "git rev-parse" can be explicitly told to give output as absolute or relative path with the `--path-format=(absolute\|relative)` option. * bc/rev-parse-path-format: rev-parse: add option for absolute or relative path formatting abspath: add a function to resolve paths with missing components	2021-01-15 15:20:28 -08:00
Taylor Blau	2891b434ac	builtin/gc.c: guess the size of the revindex 'estimate_repack_memory()' takes into account the amount of memory required to load the reverse index in memory by multiplying the assumed number of objects by the size of the 'revindex_entry' struct. Prepare for hiding the definition of 'struct revindex_entry' by removing a 'sizeof()' of that type from outside of pack-revindex.c. Instead, guess that one off_t and one uint32_t are required per object. Strictly speaking, this is a worse guess than asking for 'sizeof(struct revindex_entry)' directly, since the true size of this struct is 16 bytes with padding on the end of the struct in order to align the offset field. But, this is an approximation anyway, and it does remove a use of the 'struct revindex_entry' from outside of pack-revindex internals. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:47 -08:00
Taylor Blau	eb3fd99efd	check_object(): convert to new revindex API Replace direct accesses to the revindex with calls to 'offset_to_pack_pos()' and 'pack_pos_to_index()'. Since this caller already had some error checking (it can jump to the 'give_up' label if it encounters an error), we can easily check whether or not the provided offset points to an object in the given pack. This error checking existed prior to this patch, too, since the caller checks whether the return value from 'find_pack_revindex()' was NULL or not. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:45 -08:00
Taylor Blau	6a5c10c45f	write_reused_pack_verbatim(): convert to new revindex API Replace a direct access to the revindex array with 'pack_pos_to_offset()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:45 -08:00
Taylor Blau	66cbd3e2fb	write_reused_pack_one(): convert to new revindex API Replace direct revindex accesses with calls to 'pack_pos_to_offset()' and 'pack_pos_to_index()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:45 -08:00
Taylor Blau	952fc6870d	write_reuse_object(): convert to new revindex API First replace 'find_pack_revindex()' with its replacement 'offset_to_pack_pos()'. This prevents any bogus OFS_DELTA that may make its way through until 'write_reuse_object()' from causing a bad memory read (if 'revidx' is 'NULL') Next, replace a direct access of '->nr' with the wrapper function 'pack_pos_to_index()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-13 21:53:45 -08:00
Christian Couder	33add2ad7d	fetch-pack: refactor writing promisor file Let's replace the 2 different pieces of code that write a promisor file in 'builtin/repack.c' and 'fetch-pack.c' with a new function called 'write_promisor_file()' in 'pack-write.c' and 'pack.h'. This might also help us in the future, if we want to put back the ref names and associated hashes that were in the promisor files we are repacking in 'builtin/repack.c' as suggested by a NEEDSWORK comment just above the code we are refactoring. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 16:01:07 -08:00
Ævar Arnfjörð Bjarmason	4e168333a8	shortlog: remove unused(?) "repo-abbrev" feature Remove support for the magical "repo-abbrev" comment in .mailmap files. This was added to .mailmap parsing in [1], as a generalized feature of the git-shortlog Perl script added earlier in [2]. There was no documentation or tests for this feature, and I don't think it's used in practice anymore. What it did was to allow you to specify a single string to be search-replaced with "/.../" in the .mailmap file. E.g. for linux.git's current .mailmap: git archive --remote=git@gitlab.com:linux-kernel/linux.git \ HEAD -- .mailmap \| grep -a repo-abbrev # repo-abbrev: /pub/scm/linux/kernel/git/ Then when running e.g.: git shortlog --merges --author=Linus -1 v5.10-rc7..v5.10 \| grep Merge We'd emit (the [...] is mine): Merge tag [...]git://git.kernel.org/.../tip/tip But will now emit: Merge tag [...]git.kernel.org/pub/scm/linux/kernel/git/tip/tip I think at this point this is just a historical artifact we can get rid of. It was initially meant for Linus's own use when we integrated the Perl script[2], but since then it seems he's stopped using it. Digging through Linus's release announcements on the LKML[3] the last release I can find that made use of this output is Linux 2.6.25-rc6 back in March 2008[4]. Later on Linus started using --no-merges[5], and nowadays seems to prefer some custom not-quite-shortlog format of merges from lieutenants[6]. You will still see it on linux.git if you run "git shortlog" manually yourself with --merges, with this removed you can still get the same output with: git log --pretty=fuller v5.10-rc7..v5.10 \| sed 's!/pub/scm/linux/kernel/git/!/.../!g' \| git shortlog Arguably we should do the same for the search-replacing of "[PATCH]" at the beginning with "". That seems to be another relic of a bygone era when linux.git patches would have their E-Mail subject lines applied as-is by "git am" or whatever. But we documented that feature in "git-shortlog(1)", and it seems more widely applicable than something purely kernel-specific. 1. `7595e2ee6e` (git-shortlog: make common repository prefix configurable with .mailmap, 2006-11-25) 2. `fa375c7f1b` (Add git-shortlog perl script, 2005-06-04) 3. https://lore.kernel.org/lkml/ 4. https://lore.kernel.org/lkml/alpine.LFD.1.00.0803161651350.3020@woody.linux-foundation.org/ 5. https://lore.kernel.org/lkml/BANLkTinrbh7Xi27an3uY7pDWrNKhJRYmEA@mail.gmail.com/ 6. https://lore.kernel.org/lkml/CAHk-=wg1+kf1AVzXA-RQX0zjM6t9J2Kay9xyuNqcFHWV-y5ZYw@mail.gmail.com/ Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 14:04:42 -08:00
Patrick Steinhardt	c7b190dabd	fetch: implement support for atomic reference updates When executing a fetch, then git will currently allocate one reference transaction per reference update and directly commit it. This means that fetches are non-atomic: even if some of the reference updates fail, others may still succeed and modify local references. This is fine in many scenarios, but this strategy has its downsides. - The view of remote references may be inconsistent and may show a bastardized state of the remote repository. - Batching together updates may improve performance in certain scenarios. While the impact probably isn't as pronounced with loose references, the upcoming reftable backend may benefit as it needs to write less files in case the update is batched. - The reference-update hook is currently being executed twice per updated reference. While this doesn't matter when there is no such hook, we have seen severe performance regressions when doing a git-fetch(1) with reference-transaction hook when the remote repository has hundreds of thousands of references. Similar to `git push --atomic`, this commit thus introduces atomic fetches. Instead of allocating one reference transaction per updated reference, it causes us to only allocate a single transaction and commit it as soon as all updates were received. If locking of any reference fails, then we abort the complete transaction and don't update any reference, which gives us an all-or-nothing fetch. Note that this may not completely fix the first of above downsides, as the consistent view also depends on the server-side. If the server doesn't have a consistent view of its own references during the reference negotiation phase, then the client would get the same inconsistent view the server has. This is a separate problem though and, if it actually exists, can be fixed at a later point. This commit also changes the way we write FETCH_HEAD in case `--atomic` is passed. Instead of writing changes as we go, we need to accumulate all changes first and only commit them at the end when we know that all reference updates succeeded. Ideally, we'd just do so via a temporary file so that we don't need to carry all updates in-memory. This isn't trivially doable though considering the `--append` mode, where we do not truncate the file but simply append to it. And given that we support concurrent processes appending to FETCH_HEAD at the same time without any loss of data, seeding the temporary file with current contents of FETCH_HEAD initially and then doing a rename wouldn't work either. So this commit implements the simple strategy of buffering all changes and appending them to the file on commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:15 -08:00
Patrick Steinhardt	d4c8db8f1b	fetch: allow passing a transaction to `s_update_ref()` The handling of ref updates is completely handled by `s_update_ref()`, which will manage the complete lifecycle of the reference transaction. This is fine right now given that git-fetch(1) does not support atomic fetches, so each reference gets its own transaction. It is quite inflexible though, as `s_update_ref()` only knows about a single reference update at a time, so it doesn't allow us to alter the strategy. This commit prepares `s_update_ref()` and its only caller `update_local_ref()` to allow passing an external transaction. If none is given, then the existing behaviour is triggered which creates a new transaction and directly commits it. Otherwise, if the caller provides a transaction, then we only queue the update but don't commit it. This optionally allows the caller to manage when a transaction will be committed. Given that `update_local_ref()` is always called with a `NULL` transaction for now, no change in behaviour is expected from this change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:15 -08:00
Patrick Steinhardt	c45889f104	fetch: refactor `s_update_ref` to use common exit path The cleanup code in `s_update_ref()` is currently duplicated for both succesful and erroneous exit paths. This commit refactors the function to have a shared exit path for both cases to remove the duplication. Suggested-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:15 -08:00
Patrick Steinhardt	929d044575	fetch: use strbuf to format FETCH_HEAD updates This commit refactors `append_fetch_head()` to use a `struct strbuf` for formatting the update which we're about to append to the FETCH_HEAD file. While the refactoring doesn't have much of a benefit right now, it serves as a preparatory step to implement atomic fetches where we need to buffer all updates to FETCH_HEAD and only flush them out if all reference updates succeeded. No change in behaviour is expected from this commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:14 -08:00
Patrick Steinhardt	58a646a368	fetch: extract writing to FETCH_HEAD When performing a fetch with the default `--write-fetch-head` option, we write all updated references to FETCH_HEAD while the updates are performed. Given that updates are not performed atomically, it means that we we write to FETCH_HEAD even if some or all of the reference updates fail. Given that we simply update FETCH_HEAD ad-hoc with each reference, the logic is completely contained in `store_update_refs` and thus quite hard to extend. This can already be seen by the way we skip writing to the FETCH_HEAD: instead of having a conditional which simply skips writing, we instead open "/dev/null" and needlessly write all updates there. We are about to extend git-fetch(1) to accept an `--atomic` flag which will make the fetch an all-or-nothing operation with regards to the reference updates. This will also require us to make the updates to FETCH_HEAD an all-or-nothing operation, but as explained doing so is not easy with the current layout. This commit thus refactors the wa we write to FETCH_HEAD and pulls out the logic to open, append to, commit and close the file. While this may seem rather over-the top at first, pulling out this logic will make it a lot easier to update the code in a subsequent commit. It also allows us to easily skip writing completely in case `--no-write-fetch-head` was passed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-12 12:06:14 -08:00
Derrick Stolee	6c62f01552	for-each-repo: do nothing on empty config 'git for-each-repo --config=X' should return success without calling any subcommands when the config key 'X' has no value. The current implementation instead segfaults. A user could run into this issue if they used 'git maintenance start' to initialize their cron schedule using 'git for-each-repo --config=maintenance.repo ...' but then using 'git maintenance unregister' to remove the config option. (Note: 'git maintenance stop' would remove the config _and_ remove the cron schedule.) Add a simple test to ensure this works. Use 'git help --no-such-option' as the potential subcommand to ensure that we will hit a failure if the subcommand is ever run. Reported-by: Andreas Bühmann <dev@uuml.de> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-07 19:12:02 -08:00
Ævar Arnfjörð Bjarmason	2708ce62d2	branch: sort detached HEAD based on a flag Change the ref-filter sorting of detached HEAD to check the FILTER_REFS_DETACHED_HEAD flag, instead of relying on the ref description filled-in by get_head_description() to start with "(", which in turn we expect to ASCII-sort before any other reference. For context, we'd like the detached line to appear first at the start of "git branch -l", e.g.: $ git branch -l * (HEAD detached at <hash>) master This doesn't change that, but improves on a fix made in `28438e84e0` (ref-filter: sort detached HEAD lines firstly, 2019-06-18) and gives the Chinese translation the ability to use its preferred punctuation marks again. In Chinese the fullwidth versions of punctuation like "()" are typically written as (U+FF08 fullwidth left parenthesis), (U+FF09 fullwidth right parenthesis) instead[1]. This form is used in both po/zh_{CN,TW}.po in most cases where "()" is translated in a string. Aside from that improvement to the Chinese translation, it also just makes for cleaner code that we mark any special cases in the ref_array we're sorting with flags and make the sort function aware of them, instead of piggy-backing on the general-case of strcmp() doing the right thing. As seen in the amended tests this made reverse sorting a bit more consistent. Before this we'd sometimes sort this message in the middle, now it's consistently at the beginning or end, depending on whether we're doing a normal or reverse sort. Having it at the end doesn't make much sense either, but at least it behaves consistently now. A follow-up commit will make this behavior under reverse sorting even better. I'm removing the "TRANSLATORS" comments that were in the old code while I'm at it. Those were added in `d4919bb288` (ref-filter: move get_head_description() from branch.c, 2017-01-10). I think it's obvious from context, string and translation memory in typical translation tools that these are the same or similar string. 1. https://en.wikipedia.org/wiki/Chinese_punctuation#Marks_similar_to_European_punctuation Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-07 15:13:21 -08:00
Ævar Arnfjörð Bjarmason	7c269a7b16	ref-filter: move ref_sorting flags to a bitfield Change the reverse/ignore_case/version sort flags in the ref_sorting struct into a bitfield. Having three of them was already a bit unwieldy, but it would be even more so if another flag needed a function like ref_sorting_icase_all() introduced in `76f9e569ad` (ref-filter: apply --ignore-case to all sorting keys, 2020-05-03). A follow-up change will introduce such a flag, so let's move this over to a bitfield. Instead of using the usual '#define' pattern I'm using the "enum" pattern from builtin/rebase.c's `b4c8eb024a` (builtin rebase: support --quiet, 2018-09-04). Perhaps there's a more idiomatic way of doing the "for each in list amend mask" pattern than this "mask/on" variable combo. This function doesn't allow us to e.g. do any arbitrary changes to the bitfield for multiple flags, but I think in this case that's fine. The common case is that we're calling this with a list of one. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-07 15:13:21 -08:00
Junio C Hamano	8664fcb83b	Merge branch 'es/worktree-repair-both-moved' "git worktree repair" learned to deal with the case where both the repository and the worktree moved. * es/worktree-repair-both-moved: worktree: teach `repair` to fix multi-directional breakage	2021-01-06 23:33:44 -08:00
Junio C Hamano	d3fa84d528	Merge branch 'fc/pull-merge-rebase' When a user does not tell "git pull" to use rebase or merge, the command gives a loud message telling a user to choose between rebase or merge but creates a merge anyway, forcing users who would want to rebase to redo the operation. Fix an early part of this problem by tightening the condition to give the message---there is no reason to stop or force the user to choose between rebase or merge if the history fast-forwards. * fc/pull-merge-rebase: pull: display default warning only when non-ff pull: correct condition to trigger non-ff advice pull: get rid of unnecessary global variable pull: give the advice for choosing rebase/merge much later pull: refactor fast-forward check	2021-01-06 23:33:44 -08:00
Junio C Hamano	c256631065	Merge branch 'tb/pack-bitmap' Various improvements to the codepath that writes out pack bitmaps. * tb/pack-bitmap: (24 commits) pack-bitmap-write: better reuse bitmaps pack-bitmap-write: relax unique revwalk condition pack-bitmap-write: use existing bitmaps pack-bitmap: factor out 'add_commit_to_bitmap()' pack-bitmap: factor out 'bitmap_for_commit()' pack-bitmap-write: ignore BITMAP_FLAG_REUSE pack-bitmap-write: build fewer intermediate bitmaps pack-bitmap.c: check reads more aggressively when loading pack-bitmap-write: rename children to reverse_edges t5310: add branch-based checks commit: implement commit_list_contains() bitmap: implement bitmap_is_subset() pack-bitmap-write: fill bitmap with commit history pack-bitmap-write: pass ownership of intermediate bitmaps pack-bitmap-write: reimplement bitmap writing ewah: add bitmap_dup() function ewah: implement bitmap_or() ewah: make bitmap growth less aggressive ewah: factor out bitmap growth rev-list: die when --test-bitmap detects a mismatch ...	2021-01-06 23:33:43 -08:00
Ævar Arnfjörð Bjarmason	ffdd02a55d	branch: change "--local" to "--list" in comment There has never been a "git branch --local", this is just a typo for "--list". Fixes a comment added in `23e714df91` (branch: roll show_detached HEAD into regular ref_list, 2015-09-23). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-06 15:15:39 -08:00
ZheNing Hu	e73fe3dd02	builtin/*: update usage format According to the guidelines in parse-options.h, we should not end in a full stop or start with a capital letter. Fix old error and usage messages to match this expectation. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-06 15:10:49 -08:00
Ævar Arnfjörð Bjarmason	06ce79152b	mktag: add a --[no-]strict option Now that mktag has been migrated to use the fsck machinery to check its input, it makes sense to teach it to run in the equivalent of "git fsck"'s default mode. For cases where mktag is used to (re)create a tag object using data from an existing and malformed tag object, the validation may optionally have to be loosened. Teach the command to take the "--[no-]strict" option to do so. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-06 14:22:24 -08:00
Martin Ågren	d4a4976648	builtin/gc: don't peek into `struct lock_file` A `struct lock_file` is pretty much just a wrapper around a tempfile. But it's easy enough to avoid relying on this. Use the wrappers that the lock file API provides rather than peeking at the temp file or even into its internals. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-06 13:53:32 -08:00
Ævar Arnfjörð Bjarmason	2aa9425fbe	mktag: mark strings for translation Mark the errors mktag might emit for translation. This is a plumbing command, but the errors it emits are intended to be human-readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	3f390a366c	mktag: convert to parse-options Convert the "mktag" command to use parse-options.h instead of its own ad-hoc argc handling. This doesn't matter much in practice since it doesn't support any options, but removes another special-case in our codebase, and makes it easier to add options to it in the future. It does marginally improve the situation for programs that want to execute git commands in a consistent manner and e.g. always use --end-of-options. E.g. "gitaly" does that, and has a blacklist of built-ins that don't support --end-of-options. This is one less special case for it and other similar programs to support. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	acfc01332b	mktag: allow turning off fsck.extraHeaderEntry In earlier commits mktag learned to use the fsck machinery, at which point we needed to add fsck.extraHeaderEntry so it could be as strict about extra headers as it's been ever since it was implemented. But it's not nice to need to switch away from "mktag" to "hash-object" + manual "fsck" just because you'd like to have an extra header. So let's support turning it off by getting "fsck.*" variables from the config. Pedantically speaking it's still not possible to make "mktag" behave just like "hash-object -t tag" does, since we're unconditionally going to check the referenced object in verify_object_in_tag(), which is our own check, and not one that exists in fsck.c. But the spirit of "this works like fsck" is preserved, in that if you created such a tag with "hash-object" and did a full "fsck" on the repository it would also error out about that invalid object, it just wouldn't emit the same message as fsck does. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	1f3299fda9	fsck: make fsck_config() re-usable Move the fsck_config() function from builtin/fsck.c to fsck.[ch]. This allows for re-using it in other tools that expose fsck logic and want to support its configuration variables. A logical continuation of this change would be to use a common function for all of {fetch,receive}.fsck.* and fsck.. See `5d477a334a` (fsck (receive-pack): allow demoting errors to warnings, 2015-06-22) and my own `1362df0d41` (fetch: implement fetch.fsck., 2018-07-27) for the relevant code. However, those routines want to not parse the fsck.skipList into OIDs, but rather pass them along with the --strict option to another process. It would be possible to refactor that whole thing so we support e.g. a "fetch." prefix, then just keep track of the skiplist as a filename instead of parsing it, and learn to spew that all out from our internal structures into something we can append to the --strict option. But instead I'm planning to re-use this in "mktag", which'll just re-use these "fsck.*" variables as-is. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	acf9de4c94	mktag: use fsck instead of custom verify_tag() Change the validation logic in "mktag" to use fsck's fsck_tag() instead of its own custom parser. Curiously the logic for both dates back to the same commit[1]. Let's unify them so we're not maintaining two sets functions to verify that a tag is OK. The behavior of fsck_tag() and the old "mktag" code being removed here is different in few aspects. I think it makes sense to remove some of those checks, namely: A. fsck only cares that the timezone matches [-+][0-9]{4}. The mktag code disallowed values larger than 1400. Yes there's currently no timezone with a greater offset[2], but since we allow any number of non-offical timezones (e.g. +1234) passing this through seems fine. Git also won't break in the future if e.g. French Polynesia decides it needs to outdo the Line Islands when it comes to timezone extravagance. B. fsck allows missing author names such as "tagger <email>", mktag wouldn't, but would allow e.g. "tagger [2 spaces] <email>" (but not "tagger [1 space] <email>"). Now we allow all of these. C. Like B, but "mktag" disallowed spaces in the <email> part, fsck allows it. In some ways fsck_tag() is stricter than "mktag" was, namely: D. fsck disallows zero-padded dates, but mktag didn't care. So e.g. the timestamp "0000000000 +0000" produces an error now. A test in "t1006-cat-file.sh" relied on this, it's been changed to use "hash-object" (without fsck) instead. There was one check I deemed worth keeping by porting it over to fsck_tag(): E. "mktag" did not allow any custom headers, and by extension (as an empty commit is allowed) also forbade an extra stray trailing newline after the headers it knew about. Add a new check in the "ignore" category to fsck and use it. This somewhat abuses the facility added in `efaba7cc77` (fsck: optionally ignore specific fsck issues completely, 2015-06-22). This is somewhat of hack, but probably the least invasive change we can make here. The fsck command will shuffle these categories around, e.g. under --strict the "info" becomes a "warn" and "warn" becomes "error". Existing users of fsck's (and others, e.g. index-pack) --strict option rely on this. So we need to put something into a category that'll be ignored by all existing users of the API. Pretending that fsck.extraHeaderEntry=error ("ignore" by default) was set serves to do this for us. 1. `ec4465adb3` (Add "tag" objects that can be used to sign other objects., 2005-04-25) 2. https://en.wikipedia.org/wiki/List_of_UTC_time_offsets Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	40ef015a27	mktag: use puts(str) instead of printf("%s\n", str) This introduces no functional change, but refactors the print-out of the hash at the end to do the same thing with less code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	dfe3948728	mktag: remove redundant braces in one-line body "if" This minor stylistic churn is usually something we'd avoid, but if we don't do this then the file after changes in subsequent commits will only have this minor style inconsistency, so let's change this while we're at it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	0c439117bb	mktag: use default strbuf_read() hint Change the hardcoded hint of 2^12 to 0. The default strbuf hint is perfectly fine here, and the only reason we were hardcoding it is because it survived migration from a pre-strbuf fixed-sized buffer. See `fd17f5b5f7` (Replace all read_fd use with strbuf_read, and get rid of it., 2007-09-10) for that migration. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Derrick Stolee	3797a0a7b7	maintenance: use Windows scheduled tasks Git's background maintenance uses cron by default, but this is not available on Windows. Instead, integrate with Task Scheduler. Tasks can be scheduled using the 'schtasks' command. There are several command-line options that can allow for some advanced scheduling, but unfortunately these seem to all require authenticating using a password. Instead, use the "/xml" option to pass an XML file that contains the configuration for the necessary schedule. These XML files are based on some that I exported after constructing a schedule in the Task Scheduler GUI. These options only run background maintenance when the user is logged in, and more fields are populated with the current username and SID at run-time by 'schtasks'. Since the GIT_TEST_MAINT_SCHEDULER environment variable allows us to specify 'schtasks' as the scheduler, we can test the Windows-specific logic on other platforms. Thus, add a check that the XML file written by Git is valid when xmllint exists on the system. Since we use a temporary file for the XML files sent to 'schtasks', we prefix the random characters with the frequency so it is easier to examine the proper file during tests. Instead of an exact match on the 'args' file, we 'grep' for the arguments other than the filename. There is a deficiency in the current design. Windows has two kinds of applications: GUI applications that start by "winmain()" and console applications that start by "main()". Console applications are attached to a new Console window if they are not already associated with a GUI application. This means that every hour the scheudled task launches a command window for the scheduled tasks. Not only is this visually obtrusive, but it also takes focus from whatever else the user is doing! A simple fix would be to insert a GUI application that acts as a shim between the scheduled task and Git. This is currently possible in Git for Windows by setting the <Command> tag equal to C:\Program Files\Git\git-bash.exe with options "--hide --no-needs-console --command=cmd\git.exe" followed by the arguments currently used. Since git-bash.exe is not included in Windows builds of core Git, I chose to leave out this feature. My plan is to submit a small patch to Git for Windows that converts the use of git.exe with this use of git-bash.exe in the short term. In the long term, we can consider creating this GUI shim application within core Git, perhaps in contrib/. Co-authored-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:38:02 -08:00
Derrick Stolee	2afe7e3567	maintenance: use launchctl on macOS The existing mechanism for scheduling background maintenance is done through cron. The 'crontab -e' command allows updating the schedule while cron itself runs those commands. While this is technically supported by macOS, it has some significant deficiencies: 1. Every run of 'crontab -e' must request elevated privileges through the user interface. When running 'git maintenance start' from the Terminal app, it presents a dialog box saying "Terminal.app would like to administer your computer. Administration can include modifying passwords, networking, and system settings." This is more alarming than what we are hoping to achieve. If this alert had some information about how "git" is trying to run "crontab" then we would have some reason to believe that this dialog might be fine. However, it also doesn't help that some scenarios just leave Git waiting for a response without presenting anything to the user. I experienced this when executing the command from a Bash terminal view inside Visual Studio Code. 2. While cron initializes a user environment enough for "git config --global --show-origin" to show the correct config file information, it does not set up the environment enough for Git Credential Manager Core to load credentials during a 'prefetch' task. My prefetches against private repositories required re-authenticating through UI pop-ups in a way that should not be required. The solution is to switch from cron to the Apple-recommended [1] 'launchd' tool. [1] https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/ScheduledJobs.html The basics of this tool is that we need to create XML-formatted "plist" files inside "~/Library/LaunchAgents/" and then use the 'launchctl' tool to make launchd aware of them. The plist files include all of the scheduling information, along with the command-line arguments split across an array of <string> tags. For example, here is my plist file for the weekly scheduled tasks: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"> <plist version="1.0"><dict> <key>Label</key><string>org.git-scm.git.weekly</string> <key>ProgramArguments</key> <array> <string>/usr/local/libexec/git-core/git</string> <string>--exec-path=/usr/local/libexec/git-core</string> <string>for-each-repo</string> <string>--config=maintenance.repo</string> <string>maintenance</string> <string>run</string> <string>--schedule=weekly</string> </array> <key>StartCalendarInterval</key> <array> <dict> <key>Day</key><integer>0</integer> <key>Hour</key><integer>0</integer> <key>Minute</key><integer>0</integer> </dict> </array> </dict> </plist> The schedules for the daily and hourly tasks are more complicated since we need to use an array for the StartCalendarInterval with an entry for each of the six days other than the 0th day (to avoid colliding with the weekly task), and each of the 23 hours other than the 0th hour (to avoid colliding with the daily task). The "Label" value is currently filled with "org.git-scm.git.X" where X is the frequency. We need a different plist file for each frequency. The launchctl command needs to be aligned with a user id in order to initialize the command environment. This must be done using the 'launchctl bootstrap' subcommand. This subcommand is new as of macOS 10.11, which was released in September 2015. Before that release the 'launchctl load' subcommand was recommended. The best source of information on this transition I have seen is available at [2]. The current design does not preclude a future version that detects the available fatures of 'launchctl' to use the older commands. However, it is best to rely on the newest version since Apple might completely remove the deprecated version on short notice. [2] https://babodee.wordpress.com/2016/04/09/launchctl-2-0-syntax/ To remove a schedule, we must run 'launchctl bootout' with a valid plist file. We also need to 'bootout' a task before the 'bootstrap' subcommand will succeed, if such a task already exists. The need for a user id requires us to run 'id -u' which works on POSIX systems but not Windows. Further, the need for fully-qualitifed path names including $HOME behaves differently in the Git internals and the external test suite. The $HOME variable starts with "C:\..." instead of the "/c/..." that is provided by Git in these subcommands. The test therefore has a prerequisite that we are not on Windows. The cross- platform logic still allows us to test the macOS logic on a Linux machine. We can verify the commands that were run by 'git maintenance start' and 'git maintenance stop' by injecting a script that writes the command-line arguments into GIT_TEST_MAINT_SCHEDULER. An earlier version of this patch accidentally had an opening "<dict>" tag when it should have had a closing "</dict>" tag. This was caught during manual testing with actual 'launchctl' commands, but we do not want to update developers' tasks when running tests. It appears that macOS includes the "xmllint" tool which can verify the XML format. This is useful for any system that might contain the tool, so use it whenever it is available. We strive to make these tests work on all platforms, but Windows caused some headaches. In particular, the value of getuid() called by the C code is not guaranteed to be the same as `$(id -u)` invoked by a test. This is because `git.exe` is a native Windows program, whereas the utility programs run by the test script mostly utilize the MSYS2 runtime, which emulates a POSIX-like environment. Since the purpose of the test is to check that the input to the hook is well-formed, the actual user ID is immaterial, thus we can work around the problem by making the the test UID-agnostic. Another subtle issue is the $HOME environment variable being a Windows-style path instead of a Unix-style path. We can be more flexible here instead of expecting exact path matches. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Co-authored-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:38:02 -08:00
René Scharfe	ca5120c339	rebase: verify commit parameter If the user specifies a base commit to switch to, check if it actually references a commit right away to avoid getting confused later on when it turns out to be an invalid object. Reported-by: LeSeulArtichaut <leseulartichaut@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 15:24:13 -08:00
Martin Ågren	bc62692757	hash-lookup: rename from sha1-lookup Change all remnants of "sha1" in hash-lookup.c and .h and rename them to reflect that we're not just able to handle SHA-1 these days. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 13:01:55 -08:00
Martin Ågren	7a7d992d0d	sha1-lookup: rename `sha1_pos()` as `hash_pos()` Rename this function to reflect that we're not just able to handle SHA-1 these days. There are a few instances of "sha1" left in sha1-lookup.[ch] after this, but those will be addressed in the next commit. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 13:01:55 -08:00
Martin Ågren	e5afd4449d	object-file.c: rename from sha1-file.c Drop the last remnant of "sha1" in this file and rename it to reflect that we're not just able to handle SHA-1 these days. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 13:01:55 -08:00
Junio C Hamano	7a50265295	Merge branch 'ma/maintenance-crontab-fix' Hotfix for a topic of this cycle. * ma/maintenance-crontab-fix: t7900-maintenance: test for magic markers gc: fix handling of crontab magic markers git-maintenance.txt: add missing word	2020-12-23 13:59:46 -08:00
Junio C Hamano	04cd999638	Merge branch 'dl/checkout-p-merge-base' Fix to a regression introduced during this cycle. * dl/checkout-p-merge-base: checkout -p: handle tree arguments correctly again	2020-12-23 13:59:46 -08:00
Junio C Hamano	c46f849f8a	Merge branch 'jx/pack-redundant-on-single-pack' "git pack-redandant" when there is only one packfile used to crash, which has been corrected. * jx/pack-redundant-on-single-pack: pack-redundant: fix crash when one packfile in repo	2020-12-23 13:59:46 -08:00
Martin Ågren	66dc0a3625	gc: fix handling of crontab magic markers On `git maintenance start`, we add a few entries to the user's cron table. We wrap our entries using two magic markers, "# BEGIN GIT MAINTENANCE SCHEDULE" and "# END GIT MAINTENANCE SCHEDULE". At a later `git maintenance stop`, we will go through the table and remove these lines. Or rather, we will remove the "BEGIN" marker, the "END" marker and everything between them. Alas, we have a bug in how we detect the "END" marker: we don't. As we loop through all the lines of the crontab, if we are in the "old region", i.e., the region we're aiming to remove, we make an early `continue` and don't get as far as checking for the "END" marker. Thus, once we've seen our "BEGIN", we remove everything until the end of the file. Rewrite the logic for identifying these markers. There are four cases that are mutually exclusive: The current line starts a region or it ends it, or it's firmly within the region, or it's outside of it (and should be printed). Signed-off-by: Martin Ågren <martin.agren@gmail.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 14:33:08 -08:00
Johannes Schindelin	5c29f19cda	checkout -p: handle tree arguments correctly again This fixes a segmentation fault. The bug is caused by dereferencing `new_branch_info->commit` when it is `NULL`, which is the case when the tree-ish argument is actually a tree, not a commit-ish. This was introduced in `5602b500c3` (builtin/checkout: fix `git checkout -p HEAD...` bug, 2020-10-07), where we tried to ensure that the special tree-ish `HEAD...` is handled correctly. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 14:06:09 -08:00
Sergey Organov	0c627f5d3c	diff-merges: handle imply -p on -c/--cc logic for log.c Move logic that handles implying -p on -c/--cc from log_setup_revisions_tweak() to diff_merges_setup_revs(), where it belongs. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	3b6c17b5c0	diff-merges: new function diff_merges_set_dense_combined_if_unset() Call it where given functionality is needed instead of direct checking/tweaking of diff merges related fields. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	09322b1da9	diff-merges: new function diff_merges_suppress() This function sets all the relevant flags to disabled state, so that no code that checks only one of them get it wrong. Then we call this new function everywhere where diff merges output suppression is needed. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	4f54544d73	diff-merges: rename diff_merges_default_to_enable() to match semantics Rename diff_merges_default_to_enable() to diff_merges_default_to_first_parent() to match its semantics. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:31 -08:00
Sergey Organov	7acf0d06f5	diff-merges: move checks for first_parent_only out of the module The checks for first_parent_only don't in fact belong to this module, as the primary purpose of this flag is history traversal limiting, so get it out of this module and rename the diff_merges_first_parent_defaults_to_enable() to diff_merges_default_to_enable() to match new semantics. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Sergey Organov	18f09473bf	diff-merges: rename all functions to have common prefix Use the same "diff_merges" prefix for all the diff merges function names. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Sergey Organov	a37eec6333	revision: move diff merges functions to its own diff-merges.c Create separate diff-merges.c and diff-merges.h files, and move all the code related to handling of diff merges there. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Sergey Organov	3d4fd94363	revision: provide implementation for diff merges tweaks Use these implementations from show_setup_revisions_tweak() and log_setup_revisions_tweak() in builtin/log.c. This completes moving of management of diff merges parameters to a single place, where we can finally observe them simultaneously. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:47:30 -08:00
Eric Sunshine	cf76baea41	worktree: teach `repair` to fix multi-directional breakage `git worktree repair` knows how to repair the two-way links between the repository and a worktree as long as a link in one or the other direction is sound. For instance, if a linked worktree is moved (without using `git worktree move`), repair is possible because the worktree still knows the location of the repository even though the repository no longer knows where the worktree is. Similarly, if the repository is moved, repair is possible since the repository still knows the locations of the worktrees even though the worktrees no longer know where the repository is. However, if both the repository and the worktrees are moved, then links are severed in both directions, and no repair is possible. This is the case even when the new worktree locations are specified as arguments to `git worktree repair`. The reason for this limitation is twofold. First, when `repair` consults the worktree's gitfile (/path/to/worktree/.git) to determine the corresponding <repo>/worktrees/<id>/gitdir file to fix, <repo> is the old path to the repository, thus it is unable to fix the `gitdir` file at its new location since it doesn't know where it is. Second, when `repair` consults <repo>/worktrees/<id>/gitdir to find the location of the worktree's gitfile (/path/to/worktree/.git), the path recorded in `gitdir` is the old location of the worktree's gitfile, thus it is unable to repair the gitfile since it doesn't know where it is. Fix these shortcomings by teaching `repair` to attempt to infer the new location of the <repo>/worktrees/<id>/gitdir file when the location recorded in the worktree's gitfile has become stale but the file is otherwise well-formed. The inference is intentionally simple-minded. For each worktree path specified as an argument, `git worktree repair` manually reads the ".git" gitfile at that location and, if it is well-formed, extracts the <id>. It then searches for a corresponding <id> in <repo>/worktrees/ and, if found, concludes that there is a reasonable match and updates <repo>/worktrees/<id>/gitdir to point at the specified worktree path. In order for <repo> to be known, `git worktree repair` must be run in the main worktree or bare repository. `git worktree repair` first attempts to repair each incoming /path/to/worktree/.git gitfile to point at the repository, and then attempts to repair outgoing <repo>/worktrees/<id>/gitdir files to point at the worktrees. This sequence was chosen arbitrarily when originally implemented since the order of fixes is immaterial as long as one side of the two-way link between the repository and a worktree is sound. However, for this new repair technique to work, the order must be reversed. This is because the new inference mechanism, when it is successful, allows the outgoing <repo>/worktrees/<id>/gitdir file to be repaired, thus fixing one side of the two-way link. Once that side is fixed, the other side can be fixed by the existing repair mechanism, hence the order of repairs is now significant. Two safeguards are employed to avoid hijacking a worktree from a different repository if the user accidentally specifies a foreign worktree as an argument. The first, as described above, is that it requires an <id> match between the repository and the worktree. That itself is not foolproof for preventing hijack, so the second safeguard is that the inference will only kick in if the worktree's /path/to/worktree/.git gitfile does not point at a repository. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-21 13:44:28 -08:00
Junio C Hamano	3517022568	Merge branch 'ab/unreachable-break' Code clean-up. * ab/unreachable-break: style: do not "break" in switch() after "return"	2020-12-18 15:15:18 -08:00
Junio C Hamano	772bdcd429	Merge branch 'js/init-defaultbranch-advice' Our users are going to be trained to prepare for future change of init.defaultBranch configuration variable. * js/init-defaultbranch-advice: init: provide useful advice about init.defaultBranch get_default_branch_name(): prepare for showing some advice branch -m: allow renaming a yet-unborn branch init: document `init.defaultBranch` better	2020-12-18 15:15:17 -08:00
Junio C Hamano	21127fa982	Merge branch 'tb/partial-clone-filters-fix' Fix potential server side resource deallocation issues when responding to a partial clone request. * tb/partial-clone-filters-fix: upload-pack.c: don't free allowed_filters util pointers builtin/clone.c: don't ignore transport_fetch_refs() errors	2020-12-17 15:06:40 -08:00
Jiang Xin	0696232390	pack-redundant: fix crash when one packfile in repo Command `git pack-redundant --all` will crash if there is only one packfile in the repository. This is because, if there is only one packfile in local_packs, `cmp_local_packs` will do nothing and will leave `pl->unique_objects` as uninitialized. Also add testcases for repository with no packfile and one packfile in t5323. Reported-by: Daniel C. Klauer <daniel.c.klauer@web.de> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-16 21:21:06 -08:00
Felipe Contreras	c525de335e	pull: display default warning only when non-ff There's no need to display the annoying warning on every pull... only the ones that are not fast-forward. The current warning tests still pass, but not because of the arguments or the configuration, but because they are all fast-forward. We need to test non-fast-forward situations now. Suggestions-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:39:42 -08:00
Junio C Hamano	7539fdc629	pull: correct condition to trigger non-ff advice Refactor the advise() call that teaches users how they can choose between merge and rebase into a helper function. This revealed that the caller's logic needs to be further clarified to allow future actions (like "erroring out" instead of the current "go ahead and merge anyway") that should happen whether the advice message is squelched out. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:39:42 -08:00
Junio C Hamano	b044db9172	pull: get rid of unnecessary global variable It is easy enough to do, and gives a more descriptive name to the variable that is scoped in a more focused way. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 17:39:17 -08:00
Ævar Arnfjörð Bjarmason	56f56ac50b	style: do not "break" in switch() after "return" Remove this unreachable code. It was found by SunCC, it's found by a non-fatal warning emitted by SunCC. It's one of the things it's more vehement about than GCC & Clang. It complains about a lot of other similarly unreachable code, e.g. a BUG(...) without a "return", and a "return 0" after a long if/else, both of whom have "return" statements. Those are also genuine redundancies to a compiler, but arguably make the code a bit easier to read & less fragile to maintain. These return/break cases are just unnecessary however, and as seen here the surrounding code just did a plain "return" without a "break" already. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 16:32:50 -08:00
Junio C Hamano	c3b58472be	pack-redundant: gauge the usage before proposing its removal The subcommand is unusably slow and the reason why nobody reports it as a performance bug is suspected to be the absense of users. Let's show a big message that asks the user to tell us that they still care about the command when an attempt is made to run the command, with an escape hatch to override it with a command line option. In a few releases, we may turn it into an error and keep it for a few more releases before finally removing it (during the whole time, the plan to remove it would be interrupted by end user raising hand). Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 14:30:11 -08:00
Felipe Contreras	278f4be806	pull: give the advice for choosing rebase/merge much later Eventually we want to be omit the advice when we can fast-forward in which case there is no reason to require the user to choose between rebase or merge. In order to do so, we need to delay giving the advice up to the point where we can check if we can fast-forward or not. Additionally, config_get_rebase() was probably never its true home. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 09:03:17 -08:00
Felipe Contreras	77a7ec6329	pull: refactor fast-forward check We would like to be able to make this check before the decision to rebase is made in a future step. Besides, using a separate helper makes the code easier to follow. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 08:59:40 -08:00
Johannes Schindelin	cc0f13c57d	get_default_branch_name(): prepare for showing some advice We are about to introduce a message giving users running `git init` some advice about `init.defaultBranch`. This will necessarily be done in `repo_default_branch_name()`. Not all code paths want to show that advice, though. In particular, the `git clone` codepath _specifically_ asks for `init_db()` to be quiet, via the `INIT_DB_QUIET` flag. In preparation for showing users above-mentioned advice, let's change the function signature of `get_default_branch_name()` to accept the parameter `quiet`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 15:53:50 -08:00
Johannes Schindelin	cfaff3aac8	branch -m: allow renaming a yet-unborn branch In one of the next commits, we would like to give users some advice regarding the initial branch name, and how to modify it. To that end, it would be good if `git branch -m <name>` worked in a freshly initialized repository without any commits. Let's make it so. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-13 15:53:50 -08:00
brian m. carlson	fac60b8925	rev-parse: add option for absolute or relative path formatting git rev-parse has several options which print various paths. Some of these paths are printed relative to the current working directory, and some are absolute. Normally, this is not a problem, but there are times when one wants paths entirely in one format or another. This can be done trivially if the paths are canonical, but canonicalizing paths is not possible on some shell scripting environments which lack realpath(1) and also in Go, which lacks functions that properly canonicalize paths on Windows. To help out the scripter, let's provide an option which turns most of the paths printed by git rev-parse to be either relative to the current working directory or absolute and canonical. Document which options are affected and which are not so that users are not confused. This approach is cleaner and tidier than providing duplicates of existing options which are either relative or absolute. Note that if the user needs both forms, it is possible to pass an additional option in the middle of the command line which changes the behavior of subsequent operations. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-12 23:35:51 -08:00
Junio C Hamano	bb48056cb2	Merge branch 'tb/bugreport-no-localtime' Use of non-reentrant localtime() has been removed. * tb/bugreport-no-localtime: builtin/bugreport.c: use thread-safe localtime_r()	2020-12-08 15:11:21 -08:00
Junio C Hamano	f2a75cb312	Merge branch 'rs/maintenance-run-outside-repo' "git maintenance run/start/stop" needed to be run in a repository to hold the lockfile they use, but didn't make sure they are actually in a repository, which has been corrected. * rs/maintenance-run-outside-repo: t7900: fix typo: "test_execpt_success" maintenance: fix SEGFAULT when no repository	2020-12-08 15:11:21 -08:00
Junio C Hamano	5dfb976460	Merge branch 'ma/grep-init-default' Code clean-up. * ma/grep-init-default: MyFirstObjectWalk: drop `init_walken_defaults()` grep: copy struct in one fell swoop grep: use designated initializers for `grep_defaults` grep: don't set up a "default" repo for grep	2020-12-08 15:11:20 -08:00
Junio C Hamano	01b8886a62	Merge branch 'js/trace2-session-id' The transport layer was taught to optionally exchange the session ID assigned by the trace2 subsystem during fetch/push transactions. * js/trace2-session-id: receive-pack: log received client session ID send-pack: advertise session ID in capabilities upload-pack, serve: log received client session ID fetch-pack: advertise session ID in capabilities transport: log received server session ID serve: advertise session ID in v2 capabilities receive-pack: advertise session ID in v0 capabilities upload-pack: advertise session ID in v0 capabilities trace2: add a public function for getting the SID docs: new transfer.advertiseSID option docs: new capability to advertise session IDs	2020-12-08 15:11:20 -08:00
Junio C Hamano	d702cb9e89	Merge branch 'ds/maintenance-part-3' "git maintenance" command had trouble working in a directory whose pathname contained an ERE metacharacter like '+'. * ds/maintenance-part-3: maintenance: use 'git config --fixed-value'	2020-12-08 15:11:19 -08:00
Junio C Hamano	a10e7842ab	Merge branch 'ds/config-literal-value' Various subcommands of "git config" that takes value_regex learn the "--literal-value" option to take the value_regex option as a literal string. * ds/config-literal-value: config doc: value-pattern is not necessarily a regexp config: implement --fixed-value with --get* config: plumb --fixed-value into config API config: add --fixed-value option, un-implemented t1300: add test for --replace-all with value-pattern t1300: test "set all" mode with value-pattern config: replace 'value_regex' with 'value_pattern' config: convert multi_replace to flags	2020-12-08 15:11:19 -08:00
Junio C Hamano	1bc550effe	Merge branch 'ps/update-ref-multi-transaction' "git update-ref --stdin" learns to take multiple transactions in a single session. * ps/update-ref-multi-transaction: update-ref: disallow "start" for ongoing transactions p1400: use `git-update-ref --stdin` to test multiple transactions update-ref: allow creation of multiple transactions t1400: avoid touching refs on filesystem	2020-12-08 15:11:17 -08:00
Jeff King	449fa5ee06	pack-bitmap-write: ignore BITMAP_FLAG_REUSE The on-disk bitmap format has a flag to mark a bitmap to be "reused". This is a rather curious feature, and works like this: - a run of pack-objects would decide to mark the last 80% of the bitmaps it generates with the reuse flag - the next time we generate bitmaps, we'd see those reuse flags from the last run, and mark those commits as special: - we'd be more likely to select those commits to get bitmaps in the new output - when generating the bitmap for a selected commit, we'd reuse the old bitmap as-is (rearranging the bits to match the new pack, of course) However, neither of these behaviors particularly makes sense. Just because a commit happened to be bitmapped last time does not make it a good candidate for having a bitmap this time. In particular, we may choose bitmaps based on how recent they are in history, or whether a ref tip points to them, and those things will change. We're better off re-considering fresh which commits are good candidates. Reusing the existing bitmap _is_ a reasonable thing to do to save computation. But only reusing exact bitmaps is a weak form of this. If we have an old bitmap for A and now want a new bitmap for its child, we should be able to compute that only by looking at trees and that are new to the child. But this code would consider only exact reuse (which is perhaps why it was eager to select those commits in the first place). Furthermore, the recent switch to the reverse-edge algorithm for generating bitmaps dropped this optimization entirely (and yet still performs better). So let's do a few cleanups: - drop the whole "reusing bitmaps" phase of generating bitmaps. It's not helping anything, and is mostly unused code (or worse, code that is using CPU but not doing anything useful) - drop the use of the on-disk reuse flag to select commits to bitmap - stop setting the on-disk reuse flag in bitmaps we generate (since nothing respects it anymore) We will keep a few innards of the reuse code, which will help us implement a more capable version of the "reuse" optimization: - simplify rebuild_existing_bitmaps() into a function that only builds the mapping of bits between the old and new orders, but doesn't actually convert any bitmaps - make rebuild_bitmap() public; we'll call it lazily to convert bitmaps as we traverse (using the mapping created above) Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-08 14:48:17 -08:00
Taylor Blau	aab179d937	builtin/clone.c: don't ignore transport_fetch_refs() errors If 'git clone' couldn't execute 'transport_fetch_refs()' (e.g., because of an error on the remote's side in 'git upload-pack'), then it will silently ignore it. Even though this has been the case at least since clone was ported to C (way back in `8434c2f1af` (Build in clone, 2008-04-27)), 'git fetch' doesn't ignore these and reports any failures it sees. That suggests that ignoring the return value in 'git clone' is simply an oversight that should be corrected. That's exactly what this patch does. (Noticing and fixing this is no coincidence, we'll want it in the next patch in order to demonstrate a regression in 'git upload-pack' via a 'git clone'.) There's no additional logging here, but that matches how 'git fetch' handles the same case. An assumption there is that whichever part of transport_fetch_refs() fails will complain loudly, so any additional logging here is redundant. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-03 12:42:29 -08:00
Junio C Hamano	39d38a5c5f	Merge branch 'tb/repack-simplify' Simplify the logic to deal with a repack operation that ended up creating the same packfile. * tb/repack-simplify: builtin/repack.c: don't move existing packs out of the way builtin/repack.c: keep track of what pack-objects wrote repack: make "exts" array available outside cmd_repack()	2020-12-03 00:18:06 -08:00
Junio C Hamano	c692e1b673	Merge branch 'pb/pull-rebase-recurse-submodules' "git pull --rebase --recurse-submodules" checked for local changes in a wrong range and failed to run correctly when it should. * pb/pull-rebase-recurse-submodules: pull: check for local submodule modifications with the right range t5572: describe '--rebase' tests a little more t5572: add notes on a peculiar test pull --rebase: compute rebase arguments in separate function	2020-12-03 00:18:06 -08:00
Elijah Newren	ba359fd507	stash: fix stash application in sparse-checkouts sparse-checkouts are built on the patterns in the $GIT_DIR/info/sparse-checkout file, where commands have modified behavior for paths that do not match those patterns. The differences in behavior, as far as the bugs concerned here, fall into three different categories (with git subcommands that fall into each category listed): * commands that only look at files matching the patterns: * status * diff * clean * update-index * commands that remove files from the working tree that do not match the patterns, and restore files that do match them: * read-tree * switch * checkout * reset (--hard) * commands that omit writing files to the working tree that do not match the patterns, unless those files are not clean: * merge * rebase * cherry-pick * revert There are some caveats above, e.g. a plain `git diff` ignores files outside the sparsity patterns but will show diffs for paths outside the sparsity patterns when revision arguments are passed. (Technically, diff is treating the sparse paths as matching HEAD.) So, there is some internal inconsistency among these commands. There are also additional commands that should behave differently in the face of sparse-checkouts, as the sparse-checkout documentation alludes to, but the above is sufficient for me to explain how `git stash` is affected. What is relevant here is that logically 'stash' should behave like a merge; it three-way merges the changes the user had in progress at stash creation time, the HEAD at the time the stash was created, and the current HEAD, in order to get the stashed changes applied to the current branch. However, this simplistic view doesn't quite work in practice, because stash tweaks it a bit due to two factors: (1) flags like --keep-index and --include-untracked (why we used two different verbs, 'keep' and 'include', is a rant for another day) modify what should be staged at the end and include more things that should be quasi-merged, (2) stash generally wants changes to NOT be staged. It only provides exceptions when (a) some of the changes had conflicts and thus we want to use stages to denote the clean merges and higher order stages to mark the conflicts, or (b) if there is a brand new file we don't want it to become untracked. stash has traditionally gotten this special behavior by first doing a merge, and then when it's clean, applying a pipeline of commands to modify the result. This series of commands for unstaging-non-newly-added-files came from the following commands: git diff-index --cached --name-only --diff-filter=A $CTREE >"$a" git read-tree --reset $CTREE git update-index --add --stdin <"$a" rm -f "$a" Looking back at the different types of special sparsity handling listed at the beginning of this message, you may note that we have at least one of each type covered here: merge, diff-index, and read-tree. The weird mix-and-match led to 3 different bugs: (1) If a path merged cleanly and it didn't match the sparsity patterns, the merge backend would know to avoid writing it to the working tree and keep the SKIP_WORKTREE bit, simply only updating it in the index. Unfortunately, the subsequent commands would essentially undo the changes in the index and thus simply toss the changes altogether since there was nothing left in the working tree. This means the stash is only partially applied. (2) If a path existed in the worktree before `git stash apply` despite having the SKIP_WORKTREE bit set, then the `git read-tree --reset` would print an error message of the form error: Entry 'modified' not uptodate. Cannot merge. and cause stash to abort early. (3) If there was a brand new file added by the stash, then the diff-index command would save that pathname to the temporary file, the read-tree --reset would remove it from the index, and the update-index command would barf due to no such file being present in the working copy; it would print a message of the form: error: NEWFILE: does not exist and --remove not passed fatal: Unable to process path NEWFILE and then cause stash to abort early. Basically, the whole idea of unstage-unless-brand-new requires special care when you are dealing with a sparse-checkout. Fix these problems by applying the following simple rule: When we unstage files, if they have the SKIP_WORKTREE bit set, clear that bit and write the file out to the working directory. (*) If there's already a file present in the way, rename it first. This fixes all three problems in t7012.13 and allows us to mark it as passing. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-01 14:39:04 -08:00
Elijah Newren	b34ab4a43b	stash: remove unnecessary process forking When stash was converted from shell to a builtin, it merely transliterated the forking of various git commands from shell to a C program that would fork the same commands. Some of those were converted over to actual library calls, but much of the pipeline-of-commands design still remains. Fix some of this by replacing the portion corresponding to git diff-index --cached --name-only --diff-filter=A $CTREE >"$a" git read-tree --reset $CTREE git update-index --add --stdin <"$a" rm -f "$a" into a library function that does the same thing. (The read-tree --reset was already partially converted over to a library call, but as an independent piece.) Note here that this came after a merge operation was performed. The merge machinery always stages anything that cleanly merges, and the above code only runs if there are no conflicts. Its purpose is to make it so that when there are no conflicts, all the changes from the stash are unstaged. However, that causes brand new files from the stash to become untracked, so the code above first saves those files off and then re-adds them afterwards. We replace the whole series of commands with a simple function that will unstage files that are not newly added. This doesn't fix any bugs in the usage of these commands, it simply matches the existing behavior but makes it into a single atomic operation that we can then operate on as a whole. A subsequent commit will take advantage of this to fix issues with these commands in sparse-checkouts. This conversion incidentally fixes t3906.1, because the separate update-index process would die with the following error messages: error: uninitialized_sub: is a directory - add files inside instead fatal: Unable to process path uninitialized_sub The unstaging of the directory as a submodule meant it was no longer tracked, and thus as an uninitialized directory it could not be added back using `git update-index --add`, thus resulting in this error and early abort. Most of the submodule tests in 3906 continue to fail after this change, this change was just enough to push the first of those tests to success. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-01 14:39:04 -08:00
Taylor Blau	4f6460df55	builtin/bugreport.c: use thread-safe localtime_r() To generate its filename, the 'git bugreport' builtin asks the system for the current time with 'localtime()'. Since this uses a shared buffer, it is not thread-safe. Even though 'git bugreport' is not multi-threaded, using localtime() can trigger some static analysis tools to complain, and a quick $ git grep -oh 'localtime$_.$\?' -- */.c \| sort \| uniq -c shows that the only usage of the thread-unsafe 'localtime' is in a piece of documentation. So, convert this instance to use the thread-safe version for consistency, and to appease some analysis tools. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-01 13:05:37 -08:00
Junio C Hamano	3d8f81f21b	Merge branch 'sa/credential-store-timeout' Multiple "credential-store" backends can race to lock the same file, causing everybody else but one to fail---reattempt locking with some timeout to reduce the rate of the failure. * sa/credential-store-timeout: crendential-store: use timeout when locking file	2020-11-30 14:49:45 -08:00
Junio C Hamano	fa27e2d103	Merge branch 'km/stash-error-message-fix' Error message fix. * km/stash-error-message-fix: stash: add missing space to an error message	2020-11-30 14:49:45 -08:00
Junio C Hamano	f73ee0c6be	Merge branch 'mt/worktree-error-message-fix' Fix formulation of an error message with two placeholders in "git worktree add" subcommand. * mt/worktree-error-message-fix: worktree: fix order of arguments in error message	2020-11-30 14:49:43 -08:00
Junio C Hamano	1c04cdd424	Merge branch 'ab/gc-keep-base-option' Fix an option name in "gc" documentation. * ab/gc-keep-base-option: gc: rename keep_base_pack variable for --keep-largest-pack gc docs: change --keep-base-pack to --keep-largest-pack	2020-11-30 14:49:43 -08:00
Junio C Hamano	290c94085b	Merge branch 'js/pull-rebase-use-advise' UI improvement. * js/pull-rebase-use-advise: pull: colorize the hint about setting `pull.rebase`	2020-11-30 14:49:42 -08:00
Rafael Silva	e72f7defc4	maintenance: fix SEGFAULT when no repository The "git maintenance run" and "git maintenance start/stop" commands holds a file-based lock at the .git/maintenance.lock and .git/schedule.lock respectively. These locks are used to ensure only one maintenance process is executed at the time as both operations involves writing data into the git repository. The path to the lock file is built using "the_repository->objects->odb->path" that results in SEGFAULT when we have no repository available as "the_repository->objects->odb" is set to NULL. Let's teach maintenance command to use RUN_SETUP option that will provide the validation and fail when running outside of a repository. Hence fixing the SEGFAULT for all three operations and making the behaviour consistent across all subcommands. Setting the RUN_SETUP also provides the same protection for all subcommands given that the "register" and "unregister" also requires to be executed inside a repository. Furthermore let's remove the local validation implemented by the "register" and "unregister" as this will not be required anymore with the new option. Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-30 13:44:15 -08:00
Junio C Hamano	2ba70a330b	Merge branch 'rs/gc-sort-func-cast-fix' Fix broken sorting of maintenance tasks. * rs/gc-sort-func-cast-fix: gc: fix cast in compare_tasks_by_selection()	2020-11-25 15:24:53 -08:00
Junio C Hamano	fcf26ef53a	Merge branch 'jk/4gb-idx' The code was not prepared to deal with pack .idx file that is larger than 4GB. * jk/4gb-idx: packfile: detect overflow in .idx file size checks block-sha1: take a size_t length parameter fsck: correctly compute checksums on idx files larger than 4GB use size_t to store pack .idx byte offsets compute pack .idx byte offsets using size_t	2020-11-25 15:24:52 -08:00
Junio C Hamano	8f8f10ac09	Merge branch 'jx/t5411-flake-fix' The exchange between receive-pack and proc-receive hook did not carefully check for errors. * jx/t5411-flake-fix: receive-pack: use default version 0 for proc-receive receive-pack: gently write messages to proc-receive t5411: new helper filter_out_user_friendly_and_stable_output	2020-11-25 15:24:52 -08:00
Derrick Stolee	483a6d9b5d	maintenance: use 'git config --fixed-value' When a repository's leading directories contain regex metacharacters, the config calls for 'git maintenance register' and 'git maintenance unregister' are not careful enough. Use the new --fixed-value option to direct the config machinery to use exact string matches. This is a more robust option than escaping these arguments in a piecemeal fashion. For the test, require that we are not running on Windows since the '+' and '*' characters are not allowed on that filesystem. Reported-by: Emily Shaffer <emilyshaffer@google.com> Reported-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-25 15:04:55 -08:00
Derrick Stolee	3f1bae1dc3	config: implement --fixed-value with --get* The config builtin does its own regex matching of values for the --get, --get-all, and --get-regexp modes. Plumb the existing 'flags' parameter to the get_value() method so we can initialize the value-pattern argument as a fixed string instead of a regex pattern. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-25 14:43:48 -08:00
Derrick Stolee	c90702a1f6	config: plumb --fixed-value into config API The git_config_set_multivar_in_file_gently() and related methods now take a 'flags' bitfield, so add a new bit representing the --fixed-value option from 'git config'. This alters the purpose of the value_pattern parameter to be an exact string match. This requires some initialization changes in git_config_set_multivar_in_file_gently() and a new strcmp() call in the matches() method. The new CONFIG_FLAGS_FIXED_VALUE flag is initialized in builtin/config.c based on the --fixed-value option, and that needs to be updated in several callers. This patch only affects some of the modes of 'git config', and the rest will be completed in the next change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-25 14:43:48 -08:00
Derrick Stolee	fda43942d7	config: add --fixed-value option, un-implemented The 'git config' builtin takes a 'value-pattern' parameter for several actions. This can cause confusion when expecting exact value matches instead of regex matches, especially when the input string contains metacharacters. While callers can escape the patterns themselves, it would be more friendly to allow an argument to disable the pattern matching in favor of an exact string match. Add a new '--fixed-value' option that does not currently change the behavior. The implementation will be filled in by later changes for each appropriate action. For now, check and test that --fixed-value will abort the command when included with an incompatible action or without a 'value-pattern' argument. The name '--fixed-value' was chosen over something simpler like '--fixed' because some commands allow regular expressions on the key in addition to the value. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-25 14:43:48 -08:00
Derrick Stolee	247e2f822e	config: replace 'value_regex' with 'value_pattern' The 'value_regex' argument in the 'git config' builtin is poorly named, especially related to an upcoming change that allows exact string matches instead of ERE pattern matches. Perform a mostly mechanical change of every instance of 'value_regex' to 'value_pattern' in the codebase. This is only critical for documentation and error messages, but it is best to be consistent inside the codebase, too. For documentation, use 'value-pattern' which is better punctuation. This affects Documentation/git-config.txt and the usage in builtin/config.c, which was already mixed between 'value_regex' and 'value-regex'. I gave some thought to leaving the value_regex variables inside config.c that are regex_t pointers. However, it is probably best to keep the name consistent with the rest of the variables. This does not update the translations inside the po/ directory, as that creates conflicts with ongoing work. The input strings should automatically update through automation, and a few of the output strings currently use "[value_regex]" directly. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-25 14:43:48 -08:00
Derrick Stolee	504ee1290e	config: convert multi_replace to flags We will extend the flexibility of the config API. Before doing so, let's take an existing 'int multi_replace' parameter and replace it with a new 'unsigned flags' parameter that can take multiple options as a bit field. Update all callers that specified multi_replace to now specify the CONFIG_FLAGS_MULTI_REPLACE flag. To add more clarity, extend the documentation of git_config_set_multivar_in_file() including a clear labeling of its arguments. Other config API methods in config.h require only a change of the final parameter from 'int' to 'unsigned'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-25 14:43:47 -08:00
Simão Afonso	df7f915fb6	crendential-store: use timeout when locking file When holding the lock for rewriting the credential file, use a timeout to avoid race conditions when the credentials file needs to be updated in parallel. An example would be doing `fetch --all` on a repository with several remotes that need credentials, using parallel fetching. The timeout can be configured using "credentialStore.lockTimeoutMS", defaulting to 1 second. Signed-off-by: Simão Afonso <simao.afonso@powertools-tech.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-25 12:30:18 -08:00
Derrick Stolee	31345d5545	maintenance: extract platform-specific scheduling The existing schedule mechanism using 'cron' is supported by POSIX platforms, but not Windows. It also works slightly differently on macOS to significant detriment of the user experience. To allow for new implementations on these platforms, extract a method that performs the platform-specific scheduling mechanism. This will be swapped at compile time with new implementations on specialized platforms. As we add this generality, rename GIT_TEST_CRONTAB to GIT_TEST_MAINT_SCHEDULER. Further, this variable is now parsed as "<scheduler>:<command>" so we can test platform-specific scheduling logic even when not on the correct platform. By specifying the <scheduler> in this string, we will be able to test all three sets of Git logic from a Linux machine. Co-authored-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-24 13:02:29 -08:00
Kyle Meyer	eaf5341538	stash: add missing space to an error message Restore a space that was lost in `8a0fc8d19d` (stash: convert apply to builtin, 2019-02-25). Signed-off-by: Kyle Meyer <kyle@kyleam.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-24 12:56:31 -08:00
Junio C Hamano	bf0a430f70	Merge branch 'en/strmap' A specialization of hashmap that uses a string as key has been introduced. Hopefully it will see wider use over time. * en/strmap: shortlog: use strset from strmap.h Use new HASHMAP_INIT macro to simplify hashmap initialization strmap: take advantage of FLEXPTR_ALLOC_STR when relevant strmap: enable allocations to come from a mem_pool strmap: add a strset sub-type strmap: split create_entry() out of strmap_put() strmap: add functions facilitating use as a string->int map strmap: enable faster clearing and reusing of strmaps strmap: add more utility functions strmap: new utility functions hashmap: provide deallocation function names hashmap: introduce a new hashmap_partial_clear() hashmap: allow re-use after hashmap_free() hashmap: adjust spacing to fix argument alignment hashmap: add usage documentation explaining hashmap_free[_entries]()	2020-11-21 15:14:38 -08:00
Junio C Hamano	0dd171f0bc	Merge branch 'jk/rev-parse-end-of-options' "git rev-parse" learned the "--end-of-options" to help scripts to safely take a parameter that is supposed to be a revision, e.g. "git rev-parse --verify -q --end-of-options $rev". * jk/rev-parse-end-of-options: rev-parse: handle --end-of-options rev-parse: put all options under the "-" check rev-parse: don't accept options after dashdash	2020-11-21 15:14:38 -08:00
Junio C Hamano	473c6224c6	Merge branch 'jc/format-patch-name-max' The maximum length of output filenames "git format-patch" creates has become configurable (used to be capped at 64). * jc/format-patch-name-max: format-patch: make output filename configurable	2020-11-21 15:14:38 -08:00
Martin Ågren	96313423a7	grep: use designated initializers for `grep_defaults` In `15fabd1bbd` ("builtin/grep.c: make configuration callback more reusable", 2012-10-09), we learned to fill a `static struct grep_opt grep_defaults` which we can use as a blueprint for other such structs. At the time, we didn't consider designated initializers to be widely useable, but these days, we do. (See, e.g., `cbc0f81d96` ("strbuf: use designated initializers in STRBUF_INIT", 2017-07-10).) Use designated initializers to let the compiler set up the struct and so that we don't need to remember to call `init_grep_defaults()`. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-21 14:50:33 -08:00
Martin Ågren	1d3878799f	grep: don't set up a "default" repo for grep `init_grep_defaults()` fills a `static struct grep_opt grep_defaults`. This struct is then used by `grep_init()` as a blueprint for other such structs. Notably, `grep_init()` takes a `struct repo ` and assigns it into the target struct. As a result, it is unnecessary for us to take a `struct repo ` in `init_grep_defaults()` as well. We assign it into the default struct and never look at it again. And in light of how we return early if we have already set up the default struct, it's not just unnecessary, but is also a bit confusing: If we are called twice and with different repos, is it a bug or a feature that we ignore the second repo? Drop the repo parameter for `init_grep_defaults()`. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-21 14:50:29 -08:00
Matheus Tavares	b86339b12b	worktree: fix order of arguments in error message `git worktree add` (without --force) errors out when given a path that is already registered as a worktree and the path is missing on disk. But the `cmd` and `path` strings are switched on the error message. Let's fix that. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-21 13:31:34 -08:00
Ævar Arnfjörð Bjarmason	793c1464d3	gc: rename keep_base_pack variable for --keep-largest-pack As noted in an earlier change the keep_base_pack variable name is a relic from an earlier on-list version of `ae4e89e549` ("gc: add --keep-largest-pack option", 2018-04-15) before it was renamed to --keep-largest-pack. Let's change the variable name to avoid that confusion, it's easier to read the code if there's a 1=1 mapping between the variable name and option name. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-21 11:39:59 -08:00
Johannes Schindelin	e01ae2a4a7	pull: colorize the hint about setting `pull.rebase` In `d18c950a69` (pull: warn if the user didn't say whether to rebase or to merge, 2020-03-09), a new hint was introduced to encourage users to make a conscious decision about whether they want their pull to merge or to rebase by configuring the `pull.rebase` setting. This warning was clearly intended to advise users, but as pointed out in https://lore.kernel.org/git/87ima2rdsm.fsf%40evledraar.gmail.com, it uses `warning()` instead of `advise()`. One consequence is that the advice is not colorized in the same manner as other, similar messages. So let's use `advise()` instead. Pointed-out-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-19 14:13:30 -08:00
René Scharfe	a1c74791d5	gc: fix cast in compare_tasks_by_selection() compare_tasks_by_selection() is used with QSORT and gets passed pointers to the elements of "static struct maintenance_task tasks[]". It casts the addresses of these passed pointers to element pointers, though, and thus effectively compares some unrelated values from the stack. Fix the casts to actually compare array elements. Detected by USan (make SANITIZE=undefined test). Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-18 14:15:58 -08:00
Junio C Hamano	3f6dc9c366	Merge branch 'pb/blame-funcname-range-userdiff' "git blame -L :funcname -- path" did not work well for a path for which a userdiff driver is defined. * pb/blame-funcname-range-userdiff: blame: simplify 'setup_blame_bloom_data' interface blame: simplify 'setup_scoreboard' interface blame: enable funcname blaming with userdiff driver line-log: mention both modes in 'blame' and 'log' short help doc: add more pointers to gitattributes(5) for userdiff blame-options.txt: also mention 'funcname' in '-L' description doc: line-range: improve formatting doc: log, gitk: move '-L' description to 'line-range-options.txt'	2020-11-18 13:32:53 -08:00
Junio C Hamano	a1f95951ef	Merge branch 'en/merge-ort-api-null-impl' Preparation for a new merge strategy. * en/merge-ort-api-null-impl: merge,rebase,revert: select ort or recursive by config or environment fast-rebase: demonstrate merge-ort's API via new test-tool command merge-ort-wrappers: new convience wrappers to mimic the old merge API merge-ort: barebones API of new merge strategy with empty implementation	2020-11-18 13:32:53 -08:00
Junio C Hamano	7660da1618	Merge branch 'ds/maintenance-part-3' Parts of "git maintenance" to ease writing crontab entries (and other scheduling system configuration) for it. * ds/maintenance-part-3: maintenance: add troubleshooting guide to docs maintenance: use 'incremental' strategy by default maintenance: create maintenance.strategy config maintenance: add start/stop subcommands maintenance: add [un]register subcommands for-each-repo: run subcommands on configured repos maintenance: add --schedule option and config maintenance: optionally skip --auto process	2020-11-18 13:32:53 -08:00
Junio C Hamano	c042c455d4	Merge branch 'pw/rebase-i-orig-head' "git rebase -i" did not store ORIG_HEAD correctly. * pw/rebase-i-orig-head: rebase -i: simplify get_revision_ranges() rebase -i: use struct object_id when writing state rebase -i: use struct object_id rather than looking up commit rebase -i: stop overwriting ORIG_HEAD buffer	2020-11-18 13:32:53 -08:00
Junio C Hamano	5edc8bdc06	Merge branch 'jk/format-patch-output' "git format-patch --output=there" did not work as expected and instead crashed. The option is now supported. * jk/format-patch-output: format-patch: support --output option format-patch: tie file-opening logic to output_directory format-patch: refactor output selection	2020-11-18 13:32:52 -08:00
Junio C Hamano	f8a1cee7b3	Merge branch 'jc/line-log-takes-no-pathspec' "git log -L<range>:<path>" is documented to take no pathspec, but this was not enforced by the command line option parser, which has been corrected. * jc/line-log-takes-no-pathspec: log: diagnose -L used with pathspec as an error	2020-11-18 13:32:52 -08:00
Junio C Hamano	30f5257611	Merge branch 'rs/empty-reflog-check-fix' The code to see if "git stash drop" can safely remove refs/stash has been made more carerful. * rs/empty-reflog-check-fix: stash: simplify reflog emptiness check	2020-11-18 13:32:52 -08:00
Taylor Blau	2fcb03b52d	builtin/repack.c: don't move existing packs out of the way When 'git repack' creates a pack with the same name as any existing pack, it moves the existing one to 'old-pack-xxx.{pack,idx,...}' and then renames the new one into place. Eventually, it would be nice to have 'git repack' allow for writing a multi-pack index at the critical time (after the new packs have been written / moved into place, but before the old ones have been deleted). Guessing that this option might be called '--write-midx', this makes the following situation (where repacks are issued back-to-back without any new objects) impossible: $ git repack -adb $ git repack -adb --write-midx In the second repack, the existing packs are overwritten verbatim with the same rename-to-old sequence. At that point, the current MIDX is invalidated, since it refers to now-missing packs. So that code wants to be run after the MIDX is re-written. But (prior to this patch) the new MIDX can't be written until the new packs are moved into place. So, we have a circular dependency. This is all hypothetical, since no code currently exists to write a MIDX safely during a 'git repack' (the 'GIT_TEST_MULTI_PACK_INDEX' does so unsafely). Putting hypothetical aside, though: why do we need to rename existing packs to be prefixed with 'old-' anyway? This behavior dates all the way back to `2ad47d6` (git-repack: Be careful when updating the same pack as an existing one., 2006-06-25). `2ad47d6` is mainly concerned about a case where a newly written pack would have a different structure than its index. This used to be possible when the pack name was a hash of the set of objects. Under this naming scheme, two packs that store the same set of objects could differ in delta selection, object positioning, or both. If this happened, then any such packs would be unreadable in the instant between copying the new pack and new index (i.e., either the index or pack will be stale depending on the order that they were copied). But since `1190a1a` (pack-objects: name pack files after trailer hash, 2013-12-05), this is no longer possible, since pack files are named not after their logical contents (i.e., the set of objects), but by the actual checksum of their contents. So, this old- behavior can safely go, which allows us to avoid our circular dependency above. In addition to avoiding the circular dependency, this patch also makes 'git repack' a lot simpler, since we don't have to deal with failures encountered when renaming existing packs to be prefixed with 'old-'. This patch is mostly limited to removing code paths that deal with the 'old' prefixing, with the exception of files that include the pack's name in their own filename, like .idx, .bitmap, and related files. The exception is that we want to continue to trust what pack-objects wrote. That is, it is not the case that we pretend as if pack-objects didn't write files identical to ones that already exist, but rather that we respect what pack-objects wrote as the source of truth. That cuts two ways: - If pack-objects produced an identical pack to one that already exists with a bitmap, but did not produce a bitmap, we remove the bitmap that already exists. (This behavior is codified in t7700.14). - If pack-objects produced an identical pack to one that already exists, we trust the just-written version of the coresponding .idx, .promisor, and other files over the ones that already exist. This ensures that we use the most up-to-date versions of this files, which is safe even in the face of format changes in, say, the .idx file (which would not be reflected in the .idx file's name). Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-17 13:31:55 -08:00
Philippe Blain	5176f20ffe	pull: check for local submodule modifications with the right range Ever since 'git pull' learned '--recurse-submodules' in `a6d7eb2c7a` (pull: optionally rebase submodules (remote submodule changes only), 2017-06-23), we check if there are local submodule modifications by checking the revision range 'curr_head --not rebase_fork_point'. The goal of this check is to abort the pull if there are submodule modifications in the local commits being rebased, since this scenario is not supported. However, the actual range of commits being rebased is not 'rebase_fork_point..curr_head', as the logic in 'get_rebase_newbase_and_upstream' reveals, it is 'upstream..curr_head'. If the 'git merge-base --fork-point' invocation in 'get_rebase_fork_point' fails to find a fork point between the current branch and the remote-tracking branch we are pulling from, 'rebase_fork_point' is null and since `4d36f88be7` (submodule: do not pass null OID to setup_revisions, 2018-05-24), 'submodule_touches_in_range' checks 'curr_head' and all its ancestors for submodule modifications. Since it is highly likely that there are submodule modifications in this range (which is in effect the whole history of the current branch), this prevents 'git pull --rebase --recurse-submodules' from succeeding if no fork point exists between the current branch and the remote-tracking branch being pulled. This can happen, for example, when the current branch was forked from a commit which was never recorded in the reflog of the remote-tracking branch we are pulling, as the last two paragraphs of the "Discussion on fork-point mode" section in git-merge-base(1) explain. Fix this bug by passing 'upstream' instead of 'rebase_fork_point' as the 'excl_oid' argument to 'submodule_touches_in_range'. Reported-by: Brice Goglin <bgoglin@free.fr> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-16 16:01:13 -08:00
Philippe Blain	4f66d79ae3	pull --rebase: compute rebase arguments in separate function The function 'run_rebase' is responsible for constructing the command line to be passed to 'git rebase'. This includes both forwarding pass-through options given to 'git pull' as well computing the <newbase> and <upstream> arguments to 'git rebase'. A following commit will need to access the <upstream> argument in 'cmd_pull' to fix a bug with 'git pull --rebase --recurse-submodules'. In order to do so, refactor the code so that the <newbase> and <upstream> commits are computed in a new, separate function, 'get_rebase_newbase_and_upstream'. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-16 16:01:13 -08:00
Taylor Blau	704c4a5c07	builtin/repack.c: keep track of what pack-objects wrote In the subsequent commit, it will become useful to keep track of which metadata files were written by pack-objects. We already do this to an extent with the 'exts' array, which only is used in the context of existing packs. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-16 15:57:44 -08:00
Jeff King	63f4d5cf57	repack: make "exts" array available outside cmd_repack() We'll use it in a helper function soon. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-16 15:57:43 -08:00
Patrick Steinhardt	8c4417f1cf	update-ref: disallow "start" for ongoing transactions It is currently possible to write multiple "start" commands into git-update-ref(1) for a single session, but none of them except for the first one actually have any effect. Using such nested "start"s may eventually have a sensible effect. One may imagine that it restarts the current transaction, effectively emptying it and creating a new one. It may also allow for creation of nested transactions. But currently, none of these are implemented. Silently ignoring this misuse is making it hard to iterate in the future if "start" is ever going to have meaningful semantics in such a context. This commit thus makes sure to error out in case we see such use. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-16 13:44:01 -08:00
Patrick Steinhardt	262a4d28fe	update-ref: allow creation of multiple transactions While git-update-ref has recently grown commands which allow interactive control of transactions in `e48cf33b61` (update-ref: implement interactive transaction handling, 2020-04-02), it is not yet possible to create multiple transactions in a single session. To do so, one currently still needs to invoke the executable multiple times. This commit addresses this shortcoming by allowing the "start" command to create a new transaction if the current transaction has already been either committed or aborted. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-16 13:44:01 -08:00
Jeff King	a9bc372ef8	use size_t to store pack .idx byte offsets We sometimes store the offset into a pack .idx file as an "unsigned long", but the mmap'd size of a pack .idx file can exceed 4GB. This is sufficient on LP64 systems like Linux, but will be too small on LLP64 systems like Windows, where "unsigned long" is still only 32 bits. Let's use size_t, which is a better type for an offset into a memory buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-16 13:41:35 -08:00
Jeff King	f86f769550	compute pack .idx byte offsets using size_t A pack and its matching .idx file are limited to 2^32 objects, because the pack format contains a 32-bit field to store the number of objects. Hence we use uint32_t in the code. But the byte count of even a .idx file can be much larger than that, because it stores at least a hash and an offset for each object. So using SHA-1, a v2 .idx file will cross the 4GB boundary at 153,391,650 objects. This confuses load_idx(), which computes the minimum size like this: unsigned long min_size = 8 + 4256 + nr(hashsz + 4 + 4) + hashsz + hashsz; Even though min_size will be big enough on most 64-bit platforms, the actual arithmetic is done as a uint32_t, resulting in a truncation. We actually exceed that min_size, but then we do: unsigned long max_size = min_size; if (nr) max_size += (nr - 1)8; to account for the variable-sized table. That computation doesn't overflow quite so low, but with the truncation for min_size, we end up with a max_size that is much smaller than our actual size. So we complain that the idx is invalid, and can't find any of its objects. We can fix this case by casting "nr" to a size_t, which will do the multiplication in 64-bits (assuming you're on a 64-bit platform; this will never work on a 32-bit system since we couldn't map the whole .idx anyway). Likewise, we don't have to worry about further additions, because adding a smaller number to a size_t will convert the other side to a size_t. A few notes: - obviously we could just declare "nr" as a size_t in the first place (and likewise, packed_git.num_objects). But it's conceptually a uint32_t because of the on-disk format, and we correctly treat it that way in other contexts that don't need to compute byte offsets (e.g., iterating over the set of objects should and generally does use a uint32_t). Switching to size_t would make all of those other cases look wrong. - it could be argued that the proper type is off_t to represent the file offset. But in practice the .idx file must fit within memory, because we mmap the whole thing. And the rest of the code (including the idx_size variable we're comparing against) uses size_t. - we'll add the same cast to the max_size arithmetic line. Even though we're adding to a larger type, which will convert our result, the multiplication is still done as a 32-bit value and can itself overflow. I didn't check this with my test case, since it would need an even larger pack (~530M objects), but looking at compiler output shows that it works this way. The standard should agree, but I couldn't find anything explicit in 6.3.1.8 ("usual arithmetic conversions"). The case in load_idx() was the most immediate one that I was able to trigger. After fixing it, looking up actual objects (including the very last one in sha1 order) works in a test repo with 153,725,110 objects. That's because bsearch_hash() works with uint32_t entry indices, and the actual byte access: int cmp = hashcmp(table + mi stride, sha1); is done with "stride" as a size_t, causing the uint32_t "mi" to be promoted to a size_t. This is the way most code will access the index data. However, I audited all of the other byte-wise accesses of packed_git.index_data, and many of the others are suspect (they are similar to the max_size one, where we are adding to a properly sized offset or directly to a pointer, but the multiplication in the sub-expression can overflow). I didn't trigger any of these in practice, but I believe they're potential problems, and certainly adding in the cast is not going to hurt anything here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-16 13:41:35 -08:00
Josh Steadmon	a2a066d96a	receive-pack: log received client session ID When receive-pack receives a session-id capability from the client, log the received session ID via a trace2 data event. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 18:26:53 -08:00
Josh Steadmon	8073d75bbf	receive-pack: advertise session ID in v0 capabilities When transfer.advertiseSID is true, advertise receive-pack's session ID via the new session-id capability. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 18:26:53 -08:00
Junio C Hamano	902f358555	Merge branch 'rs/clear-commit-marks-in-repo' Code clean-up. * rs/clear-commit-marks-in-repo: bisect: clear flags in passed repository object: allow clear_commit_marks_all to handle any repo	2020-11-11 13:18:37 -08:00
Elijah Newren	449a900969	shortlog: use strset from strmap.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 12:55:27 -08:00
Elijah Newren	b19315d8ab	Use new HASHMAP_INIT macro to simplify hashmap initialization Now that hashamp has lazy initialization and a HASHMAP_INIT macro, hashmaps allocated on the stack can be initialized without a call to hashmap_init() and in some cases makes the code a bit shorter. Convert some callsites over to take advantage of this. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 12:55:27 -08:00
Jiang Xin	80ffeb94f4	receive-pack: use default version 0 for proc-receive In the verison negotiation phase between "receive-pack" and "proc-receive", "proc-receive" can send an empty flush-pkt to end the negotiation and use default version 0. Capabilities (such as "push-options") are not supported in version 0. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 12:46:56 -08:00
Jiang Xin	f65003b4c4	receive-pack: gently write messages to proc-receive Johannes found a flaky hang in `t5411/test-0013-bad-protocol.sh` in the osx-clang job of the CI/PR builds, and ran into an issue when using the `--stress` option with the following error messages: fatal: unable to write flush packet: Broken pipe send-pack: unexpected disconnect while reading sideband packet fatal: the remote end hung up unexpectedly In this test case, the "proc-receive" hook sends an error message and dies earlier. While "receive-pack" on the other side of the pipe should forward the error message of the "proc-receive" hook to the client side, but it fails to do so. This is because "receive-pack" uses `packet_write_fmt()` and `packet_flush()` to write pkt-line message to "proc-receive" hook, and these functions die immediately when pipe is broken. Using "gently" forms for these functions will get more predicable output. Add more "--die-*" options to test helper to test different stages of the protocol between "receive-pack" and "proc-receive" hook. Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 12:46:56 -08:00
Jeff King	3a1f91cfd9	rev-parse: handle --end-of-options We taught rev-list a new way to separate options from revisions in `19e8789b23` (revision: allow --end-of-options to end option parsing, 2019-08-06), but rev-parse uses its own parser. It should know about --end-of-options not only for consistency, but because it may be presented with similarly ambiguous cases. E.g., if a caller does: git rev-parse "$rev" -- "$path" to parse an untrusted input, then it will get confused if $rev contains an option-like string like "--local-env-vars". Or even "--not-real", which we'd keep as an option to pass along to rev-list. Or even more importantly: git rev-parse --verify "$rev" can be confused by options, even though its purpose is safely parsing untrusted input. On the plus side, it will always fail the --verify part, as it will not have parsed a revision, so the caller will generally "fail closed" rather than continue to use the untrusted string. But it will still trigger whatever option was in "$rev"; this should be mostly harmless, since rev-parse options are all read-only, but I didn't carefully audit all paths. This patch lets callers write: git rev-parse --end-of-options "$rev" -- "$path" and: git rev-parse --verify --end-of-options "$rev" which will both treat "$rev" always as a revision parameter. The latter is a bit clunky. It would be nicer if we had defined "--verify" to require that its next argument be the revision. But we have not historically done so, and: git rev-parse --verify -q "$rev" does currently work. I added a test here to confirm that we didn't break that. A few implementation notes: - We don't document --end-of-options explicitly in commands, but rather in gitcli(7). So I didn't give it its own section in git-rev-parse(1). But I did call it out specifically in the --verify section, and include it in the examples, which should show best practices. - We don't have to re-indent the main option-parsing block, because we can combine our "did we see end of options" check with "does it start with a dash". The exception is the pre-setup options, which need their own block. - We do however have to pull the "--" parsing out of the "does it start with dash" block, because we want to parse it even if we've seen --end-of-options. - We'll leave "--end-of-options" in the output. This is probably not technically necessary, as a careful caller will do: git rev-parse --end-of-options $revs -- $paths and anything in $revs will be resolved to an object id. However, it does help a slightly less careful caller like: git rev-parse --end-of-options $revs_or_paths where a path "--foo" will remain in the output as long as it also exists on disk. In that case, it's helpful to retain --end-of-options to get passed along to rev-list, s it would otherwise see just "--foo". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-10 13:46:27 -08:00
Jeff King	9033addfa6	rev-parse: put all options under the "-" check The option-parsing loop of rev-parse checks whether the first character of an arg is "-". If so, then it enters a series of conditionals checking for individual options. But some options are inexplicably outside of that outer conditional. This doesn't produce the wrong behavior; the conditional is actually redundant with the individual option checks, and it's really only its fallback "continue" that we care about. But we should at least be consistent. One obvious alternative is that we could get rid of the conditional entirely. But we'll be using the extra block it provides in the next patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-10 13:46:27 -08:00
Jeff King	e05e2ae8fe	rev-parse: don't accept options after dashdash Because of the order in which we check options in rev-parse, there are a few options we accept even after a "--". This is wrong, because the whole point of "--" is to say "everything after here is a path". Let's move the "did we see a dashdash" check (it's called "as_is" in the code) to the top of the parsing loop. Note there is one subtlety here. The options are ordered so that some are checked before we even see if we're in a repository (they continue the loop, and if we get past a certain point, then we do the repository setup). By moving the as_is check higher, it's also in that "before setup" section, even though it might look at the repository via verify_filename(). However, this works out: we'd never set as_is until we parse "--", and we don't parse that until after doing the setup. An alternative here to avoid the subtlety is to put the as_is check at the top of the post-setup options. But then every pre-setup option would have to remember to check "if (!as_is && !strcmp(...))". So while this is a bit magical, it's harder for future code to get wrong. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-10 13:46:27 -08:00
Junio C Hamano	3baf58bfb4	format-patch: make output filename configurable For the past 15 years, we've used the hardcoded 64 as the length limit of the filename of the output from the "git format-patch" command. Since the value is shorter than the 80-column terminal, it could grow without line wrapping a bit. At the same time, since the value is longer than half of the 80-column terminal, we could fit two or more of them in "ls" output on such a terminal if we allowed to lower it. Introduce a new command line option --filename-max-length=<n> and a new configuration variable format.filenameMaxLength to override the hardcoded default. While we are at it, remove a check that the name of output directory does not exceed PATH_MAX---this check is pointless in that by the time control reaches the function, the caller would already have done an equivalent of "mkdir -p", so if the system does not like an overly long directory name, the control wouldn't have reached here, and otherwise, we know that the system allowed the output directory to exist. In the worst case, we will get an error when we try to open the output file and handle the error correctly anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-09 17:44:41 -08:00
Junio C Hamano	6a44c9c0d0	Merge branch 'jk/committer-date-is-author-date-fix-simplify' Code simplification. * jk/committer-date-is-author-date-fix-simplify: am, sequencer: stop parsing our own committer ident	2020-11-09 14:06:28 -08:00
Junio C Hamano	ecf95d938b	Merge branch 'ab/git-remote-exit-code' Exit codes from "git remote add" etc. were not usable by scripted callers. * ab/git-remote-exit-code: remote: add meaningful exit code on missing/existing	2020-11-09 14:06:26 -08:00
Junio C Hamano	92d6bd2e90	Merge branch 'jk/checkout-index-errors' "git checkout-index" did not consistently signal an error with its exit status. * jk/checkout-index-errors: checkout-index: propagate errors to exit code checkout-index: drop error message from empty --stage=all	2020-11-09 14:06:26 -08:00
Junio C Hamano	cfdc70b299	Merge branch 'mr/bisect-in-c-3' Rewriting "git bisect" in C continues. * mr/bisect-in-c-3: bisect--helper: retire `--bisect-autostart` subcommand bisect--helper: retire `--write-terms` subcommand bisect--helper: retire `--check-expected-revs` subcommand bisect--helper: reimplement `bisect_state` & `bisect_head` shell functions in C bisect--helper: retire `--next-all` subcommand bisect--helper: retire `--bisect-clean-state` subcommand bisect--helper: finish porting `bisect_start()` to C	2020-11-09 14:06:25 -08:00
Phillip Wood	8843302307	rebase -i: simplify get_revision_ranges() Now that all the external users of head_hash have been converted to use a opts->orig_head instead we can stop returning head_hash from get_revision_ranges(). Because we want to pass the full object names back to the caller in `revisions` the find_unique_abbrev_r() call that was used to initialize `head_hash` is replaced with oid_to_hex(). Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-04 14:10:41 -08:00
Phillip Wood	a2bb10d06d	rebase -i: use struct object_id when writing state Rather than passing a string around pass the struct object_id that the string was created from call oid_hex() when we write the file. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-04 14:10:41 -08:00
Phillip Wood	f3e27a02d5	rebase -i: use struct object_id rather than looking up commit We already have a struct object_id containing the oid that we want to set ORIG_HEAD to so use that rather than converting it to a string and then calling get_oid() on that string. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-04 14:10:41 -08:00
Phillip Wood	e100bea481	rebase -i: stop overwriting ORIG_HEAD buffer After rebasing, ORIG_HEAD is supposed to point to the old HEAD of the rebased branch. The code used find_unique_abbrev() to obtain the object name of the old HEAD and wrote to both .git/rebase-merge/orig-head (used by `rebase --abort` to go back to the previous state) and to ORIG_HEAD. The buffer find_unique_abbrev() gives back is volatile, unfortunately, and was overwritten after the former file is written but before ORIG_FILE is written, leaving an incorrect object name in it. Avoid relying on the volatile buffer of find_unique_abbrev(), and instead supply our own buffer to keep the object name. I think that all of the users of head_hash should actually be using opts->orig_head instead as passing a string rather than a struct object_id around is a hang over from the scripted implementation. This patch just fixes the immediate bug and adds a regression test based on Caspar's reproduction example[1]. The users will be converted to use struct object_id and head_hash removed in the next few commits. [1] https://lore.kernel.org/git/CAFzd1+7PDg2PZgKw7U0kdepdYuoML9wSN4kofmB_-8NHrbbrHg@mail.gmail.com Reported-by: Caspar Duregger <herr.kaste@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-04 14:10:41 -08:00
Jeff King	dc1672dd10	format-patch: support --output option We've never intended to support diff's --output option in format-patch. And until `baa4adc66a` (parse-options: disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN, 2019-01-27), it was impossible to trigger. We first parse the format-patch options before handing the remainder off to setup_revisions(). Before that commit, we'd accept "--output=foo" as an abbreviation for "--output-directory=foo". But afterwards, we don't check abbreviations, and --output gets passed to the diff code. This results in nonsense behavior and bugs. The diff code will have opened a filehandle at rev.diffopt.file, but we'll overwrite that with our own handles that we open for each individual patch file. So the --output file will always just be empty. But worse, the diff code also sets rev.diffopt.close_file, so log_tree_commit() will close the filehandle itself. And then the main loop in cmd_format_patch() will try to close it again, resulting in a double-free. The simplest solution would be to just disallow --output with format-patch, as nobody ever intended it to work. However, we have accidentally documented it (because format-patch includes diff-options). And it does work with "git log", which writes the whole output to the specified file. It's easy enough to make that work for format-patch, too: it's really the same as --stdout, but pointed at a specific file. We can detect the use of the --output option by the "close_file" flag (note that we can't use rev.diffopt.file, since the diff setup will otherwise set it to stdout). So we just need to unset that flag, but don't have to do anything else. Our situation is otherwise exactly like --stdout (note that we don't fclose() the file, but nor does the stdout case; exiting the program takes care of that for us). Reported-by: Johannes Postler <johannes.postler@txture.io> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-04 14:05:29 -08:00
Jeff King	1e1693b2bb	format-patch: tie file-opening logic to output_directory In format-patch we're either outputting to stdout or to individual files in an output directory (which may be just "./"). Our logic for whether to open a new file for each patch is checked with "!use_stdout", but it is equally correct to check for a non-NULL output_directory. The distinction will matter when we add a new single-stream output in a future patch, when only one of the three methods will want individual files. Let's swap the logic here in preparation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-04 14:05:28 -08:00
Jeff King	4c6f781f9c	format-patch: refactor output selection The --stdout and --output-directory options are mutually exclusive, but it's hard to tell from reading the code. We have three separate conditionals that check for use_stdout, and it's only after we've set up the output_directory fully that we check whether the user also specified --stdout. Instead, let's check the exclusion explicitly first, then have a single conditional that handles stdout versus an output directory. This is slightly easier to follow now, and also will keep things sane when we add another output mode in a future patch. We'll add a few tests as well, covering the mutual exclusion and the fact that we are not confused by a configured output directory. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-04 14:05:28 -08:00
Junio C Hamano	39664cb0ac	log: diagnose -L used with pathspec as an error The -L option is documented to accept no pathspec, but the command line option parser has allowed the combination without checking so far. Ensure that there is no pathspec when the -L option is in effect to fix this. Incidentally, this change fixes another bug in the command line option parser, which has allowed the -L option used together with the --follow option. Because the latter requires exactly one path given, but the former takes no pathspec, they become mutually incompatible automatically. Because the -L option follows renames on its own, there is no reason to give --follow at the same time. The new tests say they may fail with "-L and --follow being incompatible" instead of "-L and pathspec being incompatible". Currently the expected failure can come only from the latter, but this is to futureproof them, in case we decide to add code to explicititly die on -L and --follow used together. Heled-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-04 13:38:33 -08:00
Elijah Newren	14c4586c2d	merge,rebase,revert: select ort or recursive by config or environment Allow the testsuite to run where it treats requests for "recursive" or the default merge algorithm via consulting the environment variable GIT_TEST_MERGE_ALGORITHM which is expected to either be "recursive" (the old traditional algorithm) or "ort" (the new algorithm). Also, allow folks to pick the new algorithm via config setting. It turns out builtin/merge.c already had a way to allow users to specify a different default merge algorithm: pull.twohead. Rather odd configuration name (especially to be in the 'pull' namespace rather than 'merge') but it's there. Add that same configuration to rebase, cherry-pick, and revert. This required updating the various callsites that called merge_trees() or merge_recursive() to conditionally call the new API, so this serves as another demonstration of what the new API looks and feels like. There are almost certainly some callsites that have not yet been modified to work with the new merge algorithm, but this represents the ones that I have been testing with thus far. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-02 16:35:50 -08:00
Junio C Hamano	1ae0949a03	Merge branch 'mk/diff-ignore-regex' "git diff" family of commands learned the "-I<regex>" option to ignore hunks whose changed lines all match the given pattern. * mk/diff-ignore-regex: diff: add -I<regex> that ignores matching changes merge-base, xdiff: zero out xpparam_t structures	2020-11-02 13:17:44 -08:00
Junio C Hamano	51830654fc	Merge branch 'jk/fast-import-marks-cleanup' Code clean-up. * jk/fast-import-marks-cleanup: fast-import: remove duplicated option-parsing line	2020-11-02 13:17:40 -08:00
Junio C Hamano	e0f6ad2984	Merge branch 'tk/credential-config' "git credential' didn't honor the core.askPass configuration variable (among other things), which has been corrected. * tk/credential-config: credential: load default config	2020-11-02 13:17:39 -08:00
Junio C Hamano	b6fb70c985	Merge branch 'dl/diff-merge-base' "git diff A...B" learned "git diff --merge-base A B", which is a longer short-hand to say the same thing. * dl/diff-merge-base: contrib/completion: complete `git diff --merge-base` builtin/diff-tree: learn --merge-base builtin/diff-index: learn --merge-base t4068: add --merge-base tests diff-lib: define diff_get_merge_base() diff-lib: accept option flags in run_diff_index() contrib/completion: extract common diff/difftool options git-diff.txt: backtick quote command text git-diff-index.txt: make --cached description a proper sentence t4068: remove unnecessary >tmp	2020-11-02 13:17:39 -08:00
Junio C Hamano	761a4e9ab1	Merge branch 'bk/sob-dco' Document that the meaning of a Signed-off-by trailer can vary from project to project in the end-user documentation, and clarify what it means to this project. * bk/sob-dco: Documentation: stylistically normalize references to Signed-off-by: SubmittingPatches: clarify DCO is our --signoff rule Documentation: clarify and expand description of --signoff doc: preparatory clean-up of description on the sign-off option	2020-11-02 13:17:39 -08:00
Junio C Hamano	0be2d65132	Merge branch 'ds/maintenance-commit-graph-auto-fix' Test-coverage enhancement of running commit-graph task "git maintenance" as needed led to discovery and fix of a bug. * ds/maintenance-commit-graph-auto-fix: maintenance: core.commitGraph=false prevents writes maintenance: test commit-graph auto condition	2020-11-02 13:17:39 -08:00
Junio C Hamano	cd47bbe164	Merge branch 'jk/fast-import-marks-alloc-fix' "git fast-import" wasted a lot of memory when many marks were in use. * jk/fast-import-marks-alloc-fix: fast-import: fix over-allocation of marks storage	2020-11-02 13:17:37 -08:00
Elijah Newren	6da1a25814	hashmap: provide deallocation function names hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed for a while, but aren't necessarily the clearest names, especially with hashmap_partial_clear() being added to the mix and lazy-initialization now being supported. Peff suggested we adopt the following names[1]: - hashmap_clear() - remove all entries and de-allocate any hashmap-specific data, but be ready for reuse - hashmap_clear_and_free() - ditto, but free the entries themselves - hashmap_partial_clear() - remove all entries but don't deallocate table - hashmap_partial_clear_and_free() - ditto, but free the entries This patch provides the new names and converts all existing callers over to the new naming scheme. [1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-02 12:15:50 -08:00
Philippe Blain	3af31e8786	blame: simplify 'setup_blame_bloom_data' interface The penultimate commit moved the initialization of 'sb.path' in 'builtin/blame.c::cmd_blame' before the call to 'blame.c::setup_blame_bloom_data'. Since 'cmd_blame' is the only caller of 'setup_blame_bloom_data', it is now unnecessary for 'setup_blame_bloom_data' to receive 'path' as a separate argument, as 'sb.path' is already initialized. Remove this argument from setup_blame_bloom_data's interface and use the 'path' field of the 'sb' 'struct blame_scoreboard' instead. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-01 15:54:15 -08:00
Philippe Blain	88894aaeea	blame: simplify 'setup_scoreboard' interface The previous commit moved the initialization of 'sb.path' in 'builtin/blame.c::cmd_blame' before the call to 'blame.c::setup_scoreboard'. Since 'cmd_blame' is the only caller of 'setup_scoreboard', it is now unnecessary for 'setup_scoreboard' to receive 'path' as a separate argument, as 'sb.path' is already initialized. Remove this argument from setup_scoreboard's interface and use the 'path' field of the 'sb' 'struct blame_scoreboard' instead. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-01 15:54:15 -08:00
Philippe Blain	9466e3809d	blame: enable funcname blaming with userdiff driver In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct blame_scoreboard' as the 'path' argument to 'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set to the local variable 'path' a few lines later at line 1137. This 'path' argument is only used in 'parse_range_arg' if we are blaming a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it is sent to 'parse_range_funcname', where it is used to determine if a userdiff driver should be used for said <path> to match the given funcname. Since 'path' is yet unset, the userdiff driver is never used, so we fall back to the default funcname regex, which is usually not appropriate for paths that are set to use a specific userdiff driver, and thus either we match some unrelated lines, or we die with fatal: -L parameter '<funcname>' starting at line 1: no match This has been the case ever since `git blame` learned to blame a funcname in `13b8f68c1f` (log -L: :pattern:file syntax to find by funcname, 2013-03-28). Enable funcname blaming for paths using specific userdiff drivers by initializing 'sb.path' earlier in 'cmd_blame', when some of its other fields are initialized, so that it is set when passed to 'parse_range_arg'. Add a regression test in 'annotate-tests.sh', which is sourced in t8001-annotate.sh and t8002-blame.sh, leveraging an existing file used to test the userdiff patterns in t4018-diff-funcname. Also, use 'sb.path' instead of 'path' when constructing the error message at line 1114, for consistency. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-01 15:54:15 -08:00
Philippe Blain	180d641d7d	line-log: mention both modes in 'blame' and 'log' short help 'git blame -h' and 'git log -h' both show '-L <n,m>' and describe this option as "Process only line range n,m, counting from 1". No hint is given that a function name regex can also be used. Use <range> instead, and expand the description of the option to mention both modes. Remove "counting from 1" as it's uneeded; it's uncommon to refer to the first line of a file as "line 0". Also, for 'git log', improve the wording to better reflect the long help. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-01 15:54:14 -08:00
René Scharfe	4f44c5659b	stash: simplify reflog emptiness check Calling rev-parse to check if the drop subcommand removed the last stash and treating its failure as confirmation is fragile, as the command can fail for other reasons, e.g. because the system is out of memory. Directly check if the reflog is empty instead, which is more robust. Reported-by: Marek Mrva <mrva@eof-studios.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-01 15:51:31 -08:00
René Scharfe	cd8888452c	object: allow clear_commit_marks_all to handle any repo Allow callers to specify the repository to use. Rename the function to repo_clear_commit_marks to document its new scope. No functional change intended. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-31 10:46:34 -07:00
Junio C Hamano	4f9f7c1442	Merge branch 'jk/committer-date-is-author-date-fix' into maint In 2.29, "--committer-date-is-author-date" option of "rebase" and "am" subcommands lost the e-mail address by mistake, which has been corrected. * jk/committer-date-is-author-date-fix: rebase: fix broken email with --committer-date-is-author-date am: fix broken email with --committer-date-is-author-date t3436: check --committer-date-is-author-date result more carefully	2020-10-29 14:18:47 -07:00
Junio C Hamano	0e41cfad62	Merge branch 'dl/checkout-guess' "git checkout" learned to use checkout.guess configuration variable and enable/disable its "--[no-]guess" option accordingly. * dl/checkout-guess: checkout: learn to respect checkout.guess Documentation/config/checkout: replace sq with backticks	2020-10-27 15:09:51 -07:00
Junio C Hamano	f3cfeb3078	Merge branch 'dl/checkout-p-merge-base' "git checkout -p A...B [-- <path>]" did not work, even though the same command without "-p" correctly used the merge-base between commits A and B. * dl/checkout-p-merge-base: t2016: add a NEEDSWORK about the PERL prerequisite add-patch: add NEEDSWORK about comparing commits Doc: document "A...B" form for <tree-ish> in checkout and switch builtin/checkout: fix `git checkout -p HEAD...` bug	2020-10-27 15:09:51 -07:00
Junio C Hamano	40696c6727	Merge branch 'sb/clone-origin' "git clone" learned clone.defaultremotename configuration variable to customize what nickname to use to call the remote the repository was cloned from. * sb/clone-origin: clone: allow configurable default for `-o`/`--origin` clone: read new remote name from remote_name instead of option_origin clone: validate --origin option before use refs: consolidate remote name validation remote: add tests for add and rename with invalid names clone: use more conventional config/option layering clone: add tests for --template and some disallowed option pairs	2020-10-27 15:09:50 -07:00
Junio C Hamano	de0a7effc8	Merge branch 'sk/force-if-includes' "git push --force-with-lease[=<ref>]" can easily be misused to lose commits unless the user takes good care of their own "git fetch". A new option "--force-if-includes" attempts to ensure that what is being force-pushed was created after examining the commit at the tip of the remote ref that is about to be force-replaced. * sk/force-if-includes: t, doc: update tests, reference for "--force-if-includes" push: parse and set flag for "--force-if-includes" push: add reflog check for "--force-if-includes"	2020-10-27 15:09:49 -07:00
Junio C Hamano	52b8c8c716	Merge branch 'ds/maintenance-part-2' "git maintenance", an extended big brother of "git gc", continues to evolve. * ds/maintenance-part-2: maintenance: add incremental-repack auto condition maintenance: auto-size incremental-repack batch maintenance: add incremental-repack task midx: use start_delayed_progress() midx: enable core.multiPackIndex by default maintenance: create auto condition for loose-objects maintenance: add loose-objects task maintenance: add prefetch task	2020-10-27 15:09:47 -07:00
Junio C Hamano	26bb5437f6	Merge branch 'rs/worktree-list-show-locked' "git worktree list" now shows if each worktree is locked. This possibly may open us to show other kinds of states in the future. * rs/worktree-list-show-locked: worktree: teach `list` to annotate locked worktree	2020-10-27 15:09:47 -07:00
Junio C Hamano	ae84e924da	Merge branch 'rs/tighten-callers-of-deref-tag' Code clean-up. * rs/tighten-callers-of-deref-tag: line-log: handle deref_tag() returning NULL blame: handle deref_tag() returning NULL grep: handle deref_tag() returning NULL	2020-10-27 15:09:46 -07:00
Jeff King	7e41061588	checkout-index: propagate errors to exit code If we encounter an error while checking out an explicit path, we print a message to stderr but do not actually exit with a non-zero code. While this is a plumbing command and the behavior goes all the way back to `33db5f4d90` (Add a "checkout-cache" command which does what the name suggests., 2005-04-09), this is almost certainly an oversight: - we _do_ return an exit code from checkout_file(); the caller just never reads it - errors while checking out all paths (with "-a") do result in a non-zero exit code. - it would be quite unusual not to use the exit code for an error, as otherwise the caller has no idea the command failed except by scraping stderr To keep our tests simple and portable, we can use the most obvious error: asking to checkout a path which is not in the index at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-27 12:41:56 -07:00
Jeff King	0b809c8248	checkout-index: drop error message from empty --stage=all If checkout-index is given --stage=all for a specific path, it will try to write stages 1-3 (if present) for that path to temporary files. However, if the file is present only at stage 0, it writes nothing but gives a confusing message: $ git checkout-index --stage=all -- Makefile git checkout-index: Makefile does not exist at stage 4 This is nonsense. There is no stage 4 (it's just an internal enum value we use for "all"), and the documentation clearly states: Paths which only have a stage 0 entry will always be omitted from the output. Here it's talking about the list of tempfiles written to stdout, but it seems clear that this case was not meant to be an error. We even have a test which covers it, but it only checks that the command reports an exit code of 0, not its stderr. And it reports 0 only because of another bug which fails to propagate errors (which will be fixed in a subsequent patch). So let's make the test more thorough. We'll also cover the case that we found _no_ entry, not even a stage zero, which should still be an error. However, because of the other bug, we'll have to mark this as expecting failure for the moment. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-27 12:41:54 -07:00
Ævar Arnfjörð Bjarmason	9144ba4cf5	remote: add meaningful exit code on missing/existing Change the exit code for the likes of "git remote add/rename" to exit with 2 if the remote in question doesn't exist, and 3 if it does. Before we'd just die() and exit with the general 128 exit code. This changes the output message from e.g.: fatal: remote origin already exists. To: error: remote origin already exists. Which I believe is a feature, since we generally use "fatal" for the generic errors, and "error" for the more specific ones with a custom exit code, but this part of the change may break code that already relies on stderr parsing (not that we ever supported that...). The motivation for this is a discussion around some code in GitLab's gitaly which wanted to check this, and had to parse stderr to do so: https://gitlab.com/gitlab-org/gitaly/-/merge_requests/2695 It's worth noting as an aside that a method of checking this that doesn't rely on that is to check with "git config" whether the value in question does or doesn't exist. That introduces a TOCTOU race condition, but on the other hand this code (e.g. "git remote add") already has a TOCTOU race. We go through the config.lock for the actual setting of the config, but the pseudocode logic is: read_config(); check_config_and_arg_sanity(); save_config(); So e.g. if a sleep() is added right after the remote_is_configured() check in add() we'll clobber remote.NAME.url, and add another (usually duplicate) remote.NAME.fetch entry (and other values, depending on invocation). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-27 11:40:33 -07:00
Junio C Hamano	f34687dc81	Merge branch 'jk/committer-date-is-author-date-fix' In 2.29, "--committer-date-is-author-date" option of "rebase" and "am" subcommands lost the e-mail address by mistake, which has been corrected. * jk/committer-date-is-author-date-fix: rebase: fix broken email with --committer-date-is-author-date am: fix broken email with --committer-date-is-author-date t3436: check --committer-date-is-author-date result more carefully	2020-10-26 14:59:58 -07:00
Jeff King	2020451c5b	am, sequencer: stop parsing our own committer ident For the --committer-date-is-author-date option of git-am and git-rebase, we format the committer ident, then re-parse it to find the name and email, and then feed those back to fmt_ident(). We can simplify this by handling it all at the time of the fmt_ident() call. We pass in the appropriate getenv() results, and if they're not present, then our WANT_COMMITTER_IDENT flag tells fmt_ident() to fill in the appropriate value from the config. Which is exactly what git_committer_ident() was doing under the hood. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 09:59:57 -07:00
Jeff King	16b0bb99ea	am: fix broken email with --committer-date-is-author-date Commit `e8cbe2118a` (am: stop exporting GIT_COMMITTER_DATE, 2020-08-17) rewrote the code for setting the committer date to use fmt_ident(), rather than setting an environment variable and letting commit_tree() handle it. But it introduced two bugs: - we use the author email string instead of the committer email - when parsing the committer ident, we used the wrong variable to compute the length of the email, resulting in it always being a zero-length string This commit fixes both, which causes our test of this option via the rebase "apply" backend to now succeed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-23 08:25:19 -07:00
Michał Kępień	ec7967cfaf	merge-base, xdiff: zero out xpparam_t structures xpparam_t structures are usually zero-initialized before their specific fields are assigned to, but there are three locations in the tree where that does not happen. Add the missing memset() calls in order to make initialization of xpparam_t structures consistent tree-wide and to prevent stack garbage from being used as field values. Signed-off-by: Michał Kępień <michal@isc.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 12:53:26 -07:00
Bradley M. Kuhn	3abd4a67d9	Documentation: stylistically normalize references to Signed-off-by: Ted reported an old typo in the git-commit.txt and merge-options.txt. Namely, the phrase "Signed-off-by line" was used without either a definite nor indefinite article. Upon examination, it seems that the documentation (including items in Documentation/, but also option help strings) have been quite inconsistent on usage when referring to `Signed-off-by`. First, very few places used a definite or indefinite article with the phrase "Signed-off-by line", but that was the initial typo that led to this investigation. So, normalize using either an indefinite or definite article consistently. The original phrasing, in Commit `3f971fc425` (Documentation updates, 2005-08-14), is "Add Signed-off-by line". Commit `6f855371a5` (Add --signoff, --check, and long option-names. 2005-12-09) switched to using "Add `Signed-off-by:` line", but didn't normalize the former commit to match. Later commits seem to have cut and pasted from one or the other, which is likely how the usage became so inconsistent. Junio stated on the git mailing list in <xmqqy2k1dfoh.fsf@gitster.c.googlers.com> a preference to leave off the colon. Thus, prefer `Signed-off-by` (with backticks) for the documentation files and Signed-off-by (without backticks) for option help strings. Additionally, Junio argued that "trailer" is now the standard term to refer to `Signed-off-by`, saying that "becomes plenty clear that we are not talking about any random line in the log message". As such, prefer "trailer" over "line" anywhere the former word fits. However, leave alone those few places in documentation that use Signed-off-by to refer to the process (rather than the specific trailer), or in places where mail headers are generally discussed in comparison with Signed-off-by. Reported-by: "Theodore Y. Ts'o" <tytso@mit.edu> Signed-off-by: Bradley M. Kuhn <bkuhn@sfconservancy.org> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 11:57:40 -07:00
Thomas Koutcher	567ad2c0f9	credential: load default config Make `git credential fill` honour the core.askPass variable. Signed-off-by: Thomas Koutcher <thomas.koutcher@online.fr> [jk: added test] Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:30:45 -07:00
Pranit Bauva	b0f6494f70	bisect--helper: retire `--bisect-autostart` subcommand The `--bisect-autostart` subcommand is no longer used from the git-bisect.sh shell script. Instead the function `bisect_autostart()` is directly called from the C implementation. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:24:20 -07:00
Pranit Bauva	5c517fe345	bisect--helper: retire `--write-terms` subcommand The `--write-terms` subcommand is no longer used from the git-bisect.sh shell script. Instead the function `write_terms()` is called from the C implementation of `set_terms()` and `bisect_start()`. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:24:20 -07:00
Pranit Bauva	9b437b056d	bisect--helper: retire `--check-expected-revs` subcommand The `--check-expected-revs` subcommand is no longer used from the git-bisect.sh shell script. Functions `check_expected_revs` and `is_expected_revs` are also deleted. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:24:20 -07:00
Pranit Bauva	27257bc466	bisect--helper: reimplement `bisect_state` & `bisect_head` shell functions in C Reimplement the `bisect_state()` shell functions in C and also add a subcommand `--bisect-state` to `git-bisect--helper` to call them from git-bisect.sh . Using `--bisect-state` subcommand is a temporary measure to port shell function to C so as to use the existing test suite. As more functions are ported, this subcommand will be retired and will be called by some other methods. `bisect_head()` is only called from `bisect_state()`, thus it is not required to introduce another subcommand. Note that the `eval` in the changed line of `git-bisect.sh` cannot be dropped: it is necessary because the `rev` and the `tail` variables may contain multiple, quoted arguments that need to be passed to `bisect--helper` (without the quotes, naturally). Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:24:20 -07:00
Pranit Bauva	04774b4e70	bisect--helper: retire `--next-all` subcommand The `--next-all` subcommand is no longer used from the git-bisect.sh shell script. Instead the function `bisect_next_all()` is called from the C implementation of `bisect_next()`. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:24:20 -07:00
Pranit Bauva	e4396072e7	bisect--helper: retire `--bisect-clean-state` subcommand The `--bisect-clean-state` subcommand is no longer used from the git-bisect.sh shell script. Instead the function `bisect_clean_state()` is directly called from the C implementation. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:24:20 -07:00
Pranit Bauva	88ad372fc0	bisect--helper: finish porting `bisect_start()` to C Add the subcommand to `git bisect--helper` and call it from git-bisect.sh. With the conversion of `bisect_auto_next()` from shell to C in a previous commit, `bisect_start()` can now be fully ported to C. So let's complete the `--bisect-start` subcommand of `git bisect--helper` so that it fully implements `bisect_start()`, and let's use this subcommand in `git-bisect.sh` instead of `bisect_start()`. Note that the `eval` in the changed line of `git-bisect.sh` cannot be dropped: it is necessary because the `rev` and the `tail` variables may contain multiple, quoted arguments that need to be passed to `bisect--helper` (without the quotes, naturally). Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:24:20 -07:00
Jeff King	6db29ab213	fast-import: remove duplicated option-parsing line Commit `1bdca81641` (fast-import: add options for rewriting submodules, 2020-02-22) accidentally added two lines parsing the option "rewrite-submodules-from". This didn't do anything in practice, because they're in an if/else chain and so the second one can never trigger. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 08:48:47 -07:00
Derrick Stolee	61f7a383d3	maintenance: use 'incremental' strategy by default The 'git maintenance (register\|start)' subcommands add the current repository to the global Git config so maintenance will operate on that repository. It does not specify what maintenance should occur or how often. To make it simple for users to start background maintenance with a recommended schedlue, update the 'maintenance.strategy' config option in both the 'register' and 'start' subcommands. This allows users to customize beyond the defaults using individual 'maintenance.<task>.schedule' options, but also the user can opt-out of this strategy using 'maintenance.strategy=none'. Helped-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 08:36:42 -07:00
Derrick Stolee	a4cb1a2339	maintenance: create maintenance.strategy config To provide an on-ramp for users to use background maintenance without several 'git config' commands, create a 'maintenance.strategy' config option. Currently, the only important value is 'incremental' which assigns the following schedule: * gc: never * prefetch: hourly * commit-graph: hourly * loose-objects: daily * incremental-repack: daily These tasks are chosen to minimize disruptions to foreground Git commands and use few compute resources. The 'maintenance.strategy' is intended as a baseline that can be customzied further by manually assigning 'maintenance.<task>.enabled' and 'maintenance.<task>.schedule' config options, which will override any recommendation from 'maintenance.strategy'. This operates similarly to config options like 'feature.experimental' which operate as "meta" config options that change default config values. This presents a way forward for updating the 'incremental' strategy in the future or adding new strategies. For example, a potential strategy could be to include a 'full' strategy that runs the 'gc' task weekly and no other tasks by default. Helped-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 08:36:42 -07:00
Jeff King	3f018ec716	fast-import: fix over-allocation of marks storage Fast-import stores its marks in a trie-like structure made of mark_set structs. Each struct has a fixed size (1024). If our id number is too large to fit in the struct, then we allocate a new struct which shifts the id number by 10 bits. Our original struct becomes a child node of this new layer, and the new struct becomes the top level of the trie. This scheme was broken by `ddddf8d7e2` (fast-import: permit reading multiple marks files, 2020-02-22). Before then, we had a top-level "marks" pointer, and the push-down worked by assigning the new top-level struct to "marks". But after that commit, insert_mark() takes a pointer to the mark_set, rather than using the global "marks". It continued to assign to the global "marks" variable during the push down, which was wrong for two reasons: - we added a call in option_rewrite_submodules() which uses a separate mark set; pushing down on "marks" is outright wrong here. We'd corrupt the "marks" set, and we'd fail to correctly store any submodule mappings with an id over 1024. - the other callers passed "marks", but the push-down was still wrong. In read_mark_file(), we take the pointer to the mark_set as a parameter. So even though insert_mark() was updating the global "marks", the local pointer we had in read_mark_file() was not updated. As a result, we'd add a new level when needed, but then the next call to insert_mark() wouldn't see it! It would then allocate a new layer, which would also not be seen, and so on. Lookups for the lost layers obviously wouldn't work, but before we even hit any lookup stage, we'd generally run out of memory and die. Our tests didn't notice either of these cases because they didn't have enough marks to trigger the push-down behavior. The new tests in t9304 cover both cases (and fail without this patch). We can solve the problem by having insert_mark() take a pointer-to-pointer of the top-level of the set. Then our push down can assign to it in a way that the caller actually sees. Note the subtle reordering in option_rewrite_submodules(). Our call to read_mark_file() may modify our top-level set pointer, so we have to wait until after it returns to assign its value into the string_list. Reported-by: Sergey Brester <serg.brester@sebres.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-15 10:30:53 -07:00
René Scharfe	db7d07f610	blame: handle deref_tag() returning NULL Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-12 12:25:14 -07:00
René Scharfe	e30b1525fb	grep: handle deref_tag() returning NULL deref_tag() can return NULL. Exit gracefully in that case instead of blindly dereferencing the return value. .name shouldn't ever be NULL, but grep_object() handles that case explicitly, so let's be defensive here as well and show the broken object's ID if it happens to lack a name after all. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-12 12:25:14 -07:00
Rafael Silva	c57b3367be	worktree: teach `list` to annotate locked worktree The "git worktree list" shows the absolute path to the working tree, the commit that is checked out and the name of the branch. It is not immediately obvious which of the worktrees, if any, are locked. "git worktree remove" refuses to remove a locked worktree with an error message. If "git worktree list" told which worktrees are locked in its output, the user would not even attempt to remove such a worktree, or would realize that "git worktree remove -f -f <path>" is required. Teach "git worktree list" to append "locked" to its output. The output from the command becomes like so: $ git worktree list /path/to/main abc123 [master] /path/to/worktree 456def (detached HEAD) /path/to/locked-worktree 123abc (detached HEAD) locked Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-12 12:24:29 -07:00
Derrick Stolee	d334107c5d	maintenance: core.commitGraph=false prevents writes Recently, a user had an issue due to combining fetch.writeCommitGraph=true with core.commitGraph=false. The root bug has been resolved by preventing commit-graph writes when core.commitGraph is disabled. This happens inside the 'git commit-graph write' command, but we can be more aware of this situation and prevent that process from ever starting in the 'commit-graph' maintenance task. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-12 12:13:21 -07:00
Junio C Hamano	d620daaa34	Merge branch 'ja/misc-doc-fixes' Doc fixes. * ja/misc-doc-fixes: doc: fix the bnf like style of some commands doc: git-remote fix ups doc: use linkgit macro where needed. git-bisect-lk2009: make continuation of list indented	2020-10-08 21:53:26 -07:00
Junio C Hamano	c7ac8c0a7c	Merge branch 'jk/index-pack-hotfixes' Hotfix and clean-up for the jt/threaded-index-pack topic that has graduated to v2.29-rc0. * jk/index-pack-hotfixes: index-pack: make get_base_data() comment clearer index-pack: drop type_cas mutex index-pack: restore "resolving deltas" progress meter	2020-10-08 21:53:26 -07:00
Jean-Noël Avila	9f443f5531	doc: fix the bnf like style of some commands In command line options, variables are entered between < and > Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-08 14:01:19 -07:00
Derrick Stolee	8f801804be	maintenance: test commit-graph auto condition The auto condition for the commit-graph maintenance task walks refs looking for commits that are not in the commit-graph file. This was added in `4ddc79b2` (maintenance: add auto condition for commit-graph task, 2020-09-17) but was left untested. The initial goal of this change was to demonstrate the feature works properly by adding tests. However, there was an off-by-one error that caused the basic tests around maintenance.commit-graph.auto=1 to fail when it should work. The subtlety is that if a ref tip is not in the commit-graph, then we were not adding that to the total count. In the test, we see that we have only added one commit since our last commit-graph write, so the auto condition would say there is nothing to do. The fix is simple: add the check for the commit-graph position to see that the tip is not in the commit-graph file before starting our walk. Since this happens before adding to the DFS stack, we do not need to clear our (currently empty) commit list. This does add some extra complexity for the test, because we also want to verify that the walk along the parents actually does some work. This means we need to add at least two commits in a row without writing the commit-graph. However, we also need to make sure no additional refs are pointing to the middle of this list or else the for_each_ref() in should_write_commit_graph() might visit these commits as tips instead of doing a DFS walk. Hence, the last two commits are added with "git commit" instead of "test_commit". Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-08 10:24:40 -07:00
Denton Liu	64f1f58fe7	checkout: learn to respect checkout.guess The current behavior of git checkout/switch is that --guess is currently enabled by default. However, some users may not wish for this to happen automatically. Instead of forcing users to specify --no-guess manually each time, teach these commands the checkout.guess configuration variable that gives users the option to set a default behavior. Teach the completion script to recognize the new config variable and disable DWIM logic if it is set to false. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-08 09:25:29 -07:00
Jonathan Tan	ec6a8f9705	index-pack: make get_base_data() comment clearer A comment mentions that we may free cached delta bases via find_unresolved_deltas(), but that function went away in `f08cbf60fe` (index-pack: make quantum of work smaller, 2020-09-08). Since we need to rewrite that comment anyway, make the entire comment clearer. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-07 13:32:27 -07:00
Jeff King	bebe171947	index-pack: drop type_cas mutex The type_cas lock lost all of its callers in `f08cbf60fe` (index-pack: make quantum of work smaller, 2020-09-08), so we can safely delete it. The compiler didn't alert us that the variable became unused, because we still call pthread_mutex_init() and pthread_mutex_destroy() on it. It's worth considering also whether that commit was in error to remove the use of the lock. Why don't we need it now, if we did before, as described in `ab791dd138` (index-pack: fix race condition with duplicate bases, 2014-08-29)? I think the answer is that we now look at and assign the child_obj->real_type field in the main thread while holding the work_lock(). So we don't have to worry about racing with the worker threads. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-07 11:51:26 -07:00
Jeff King	cea69151a4	index-pack: restore "resolving deltas" progress meter Commit `f08cbf60fe` (index-pack: make quantum of work smaller, 2020-09-08) refactored the main loop in threaded_second_pass(), but also deleted the call to display_progress() at the top of the loop. This means that users typically see no progress at all during the delta resolution phase (and for large repositories, Git appears to hang). This looks like an accident that was unrelated to the intended change of that commit, since we continue to update nr_resolved_deltas in resolve_delta(). Let's restore the call to get that progress back. We'll also add a test that confirms we generate the expected progress. This isn't perfect, as it wouldn't catch a bug where progress was delayed to the end. That was probably possible to trigger when receiving a thin pack, because we'd eventually call display_progress() from fix_unresolved_deltas(), but only once after doing all the work. However, since our test case generates a complete pack, it reliably demonstrates this particular bug and its fix. And we can't do better without making the test racy. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-07 11:50:09 -07:00
Denton Liu	5602b500c3	builtin/checkout: fix `git checkout -p HEAD...` bug Running `git checkout -p` with a merge-base rev results in an error: $ git checkout -p HEAD... usage: git diff-index [-m] [--cached] [<common-diff-options>] <tree-ish> [<path>...] common diff options: -z output diff-raw with lines terminated with NUL. -p output patch format. -u synonym for -p. --patch-with-raw output both a patch and the diff-raw format. --stat show diffstat instead of patch. --numstat show numeric diffstat instead of patch. --patch-with-stat output a patch and prepend its diffstat. --name-only show only names of changed files. --name-status show names and status of changed files. --full-index show full object name on index lines. --abbrev=<n> abbreviate object names in diff-tree header and diff-raw. -R swap input file pairs. -B detect complete rewrites. -M detect renames. -C detect copies. --find-copies-harder try unchanged files as candidate for copy detection. -l<n> limit rename attempts up to <n> paths. -O<file> reorder diffs according to the <file>. -S<string> find filepair whose only one side contains the string. --pickaxe-all show all files diff when -S is used and hit is found. -a --text treat all files as text. Cannot close git diff-index --cached --numstat --summary HEAD... -- () at <redacted>/libexec/git-core/git-add--interactive line 183. This happens because checkout passes the literal argument (in the example, `HEAD...`) to diff-index which does not recognise merge-base revs. Fix this by using the hex of the found commit instead of the given name. Note that "HEAD" is handled specially in run_add_interactive() so it's explicitly not changed. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-07 09:49:05 -07:00
Junio C Hamano	5f8c70a148	Merge branch 'jk/format-auto-base-when-able' "git format-patch" learns to take "whenAble" as a possible value for the format.useAutoBase configuration variable to become no-op when the automatically computed base does not make sense. * jk/format-auto-base-when-able: format-patch: teach format.useAutoBase "whenAble" option	2020-10-05 14:01:55 -07:00
Junio C Hamano	8e3ec76a20	Merge branch 'jk/refspecs-negative' "git fetch" and "git push" support negative refspecs. * jk/refspecs-negative: refspec: add support for negative refspecs	2020-10-05 14:01:54 -07:00
Junio C Hamano	e68f0a4e57	Merge branch 'jt/keep-partial-clone-filter-upon-lazy-fetch' The lazy fetching done internally to make missing objects available in a partial clone incorrectly made permanent damage to the partial clone filter in the repository, which has been corrected. * jt/keep-partial-clone-filter-upon-lazy-fetch: fetch: do not override partial clone filter promisor-remote: remove unused variable	2020-10-05 14:01:53 -07:00
Junio C Hamano	19dd352d03	Merge branch 'jk/unused' Code cleanup. * jk/unused: dir.c: drop unused "untracked" from treat_path_fast() sequencer: handle ignore_footer when parsing trailers test-advise: check argument count with argc instead of argv sparse-checkout: fill in some options boilerplate sequencer: drop repository argument from run_git_commit() push: drop unused repo argument to do_push() assert PARSE_OPT_NONEG in parse-options callbacks env--helper: write to opt->value in parseopt helper drop unused argc parameters convert: drop unused crlf_action from check_global_conv_flags_eol()	2020-10-05 14:01:52 -07:00
Junio C Hamano	34415c76c8	Merge branch 'so/combine-diff-simplify' Code simplification. * so/combine-diff-simplify: diff: get rid of redundant 'dense' argument	2020-10-05 14:01:51 -07:00
Junio C Hamano	58138d3f26	Merge branch 'js/default-branch-name-part-2' Update the tests to drop word 'master' from them. * js/default-branch-name-part-2: t9902: avoid using the branch name `master` tests: avoid variations of the `master` branch name t3200: avoid variations of the `master` branch name fast-export: avoid using unnecessary language in a code comment t/test-terminal: avoid non-inclusive language	2020-10-05 14:01:50 -07:00
Junio C Hamano	2fa8aacc72	Merge branch 'jk/shortlog-group-by-trailer' "git shortlog" has been taught to group commits by the contents of the trailer lines, like "Reviewed-by:", "Coauthored-by:", etc. * jk/shortlog-group-by-trailer: shortlog: allow multiple groups to be specified shortlog: parse trailer idents shortlog: rename parse_stdin_ident() shortlog: de-duplicate trailer values shortlog: match commit trailers with --group trailer: add interface for iterating over commit trailers shortlog: add grouping option shortlog: change "author" variables to "ident"	2020-10-04 12:49:14 -07:00
Junio C Hamano	f4cc68cbd0	Merge branch 'mr/bisect-in-c-2' Rewrite of the "git bisect" script in C continues. * mr/bisect-in-c-2: bisect--helper: reimplement `bisect_next` and `bisect_auto_next` shell functions in C bisect: call 'clear_commit_marks_all()' in 'bisect_next_all()' bisect--helper: reimplement `bisect_autostart` shell function in C bisect--helper: introduce new `write_in_file()` function bisect--helper: use '-res' in 'cmd_bisect__helper' return bisect--helper: BUG() in cmd_*() on invalid subcommand	2020-10-04 12:49:08 -07:00
Junio C Hamano	03a01824a4	Merge branch 'cc/bisect-start-fix' "git bisect start X Y", when X and Y are not valid committish object names, should take X and Y as pathspec, but didn't. * cc/bisect-start-fix: bisect: don't use invalid oid as rev when starting	2020-10-04 12:49:08 -07:00
Junio C Hamano	230ff3e997	Merge branch 'jc/blame-ignore-fix' "git blame --ignore-rev/--ignore-revs-file" failed to validate their input are valid revision, and failed to take into account that the user may want to give an annotated tag instead of a commit, which has been corrected. * jc/blame-ignore-fix: blame: validate and peel the object names on the ignore list t8013: minimum preparatory clean-up	2020-10-04 12:49:07 -07:00
Junio C Hamano	86cca370e1	Merge branch 'jk/drop-unaligned-loads' Compilation fix around type punning. * jk/drop-unaligned-loads: Revert "fast-export: use local array to store anonymized oid" bswap.h: drop unaligned loads	2020-10-04 12:49:06 -07:00
Srinidhi Kaushik	3b990aa645	push: parse and set flag for "--force-if-includes" The previous commit added the necessary machinery to implement the "--force-if-includes" protection, when "--force-with-lease" is used without giving exact object the remote still ought to have. Surface the feature by adding a command line option and a configuration variable to enable it. - Add a flag: "TRANSPORT_PUSH_FORCE_IF_INCLUDES" to indicate that the new option was passed from the command line of via configuration settings; update command line and configuration parsers to set the new flag accordingly. - Introduce a new configuration option "push.useForceIfIncludes", which is equivalent to setting "--force-if-includes" in the command line. - Update "remote-curl" to recognize and pass this option to "send-pack" when enabled. - Update "advise" to catch the reject reason "REJECT_REF_NEEDS_UPDATE", set when the ref status is "REF_STATUS_REJECT_REMOTE_UPDATED" and (optionally) print a help message when the push fails. - The new option is a "no-op" in the following scenarios: * When used without "--force-with-lease". * When used with "--force-with-lease", and if the expected commit on the remote side is specified as an argument. Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-03 09:59:19 -07:00
Srinidhi Kaushik	99a1f9ae10	push: add reflog check for "--force-if-includes" Add a check to verify if the remote-tracking ref of the local branch is reachable from one of its "reflog" entries. The check iterates through the local ref's reflog to see if there is an entry for the remote-tracking ref and collecting any commits that are seen, into a list; the iteration stops if an entry in the reflog matches the remote ref or if the entry timestamp is older the latest entry of the remote ref's "reflog". If there wasn't an entry found for the remote ref, "in_merge_bases_many()" is called to check if it is reachable from the list of collected commits. When a local branch that is based on a remote ref, has been rewound and is to be force pushed on the remote, "--force-if-includes" runs a check that ensures any updates to the remote-tracking ref that may have happened (by push from another repository) in-between the time of the last update to the local branch (via "git-pull", for instance) and right before the time of push, have been integrated locally before allowing a forced update. If the new option is passed without specifying "--force-with-lease", or specified along with "--force-with-lease=<refname>:<expect>" it is a "no-op". Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-03 09:59:18 -07:00
Jacob Keller	7efba5fa39	format-patch: teach format.useAutoBase "whenAble" option The format.useAutoBase configuration option exists to allow users to enable '--base=auto' for format-patch by default. This can sometimes lead to poor workflow, due to unexpected failures when attempting to format an ancient patch: $ git format-patch -1 <an old commit> fatal: base commit shouldn't be in revision list This can be very confusing, as it is not necessarily immediately obvious that the user requested a --base (since this was in the configuration, not on the command line). We do want --base=auto to fail when it cannot provide a suitable base, as it would be equally confusing if a formatted patch did not include the base information when it was requested. Teach format.useAutoBase a new mode, "whenAble". This mode will cause format-patch to attempt to include a base commit when it can. However, if no valid base commit can be found, then format-patch will continue formatting the patch without a base commit. In order to avoid making yet another branch name unusable with --base, do not teach --base=whenAble or --base=whenable. Instead, refactor the base_commit option to use a callback, and rely on the global configuration variable auto_base. This does mean that a user cannot request this optional base commit generation from the command line. However, this is likely not too valuable. If the user requests base information manually, they will be immediately informed of the failure to acquire a suitable base commit. This allows the user to make an informed choice about whether to continue the format. Add tests to cover the new mode of operation for --base. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-01 15:22:10 -07:00
Sean Barag	de9ed3ef37	clone: allow configurable default for `-o`/`--origin` While the default remote name of "origin" can be changed at clone-time with `git clone`'s `--origin` option, it was previously not possible to specify a default value for the name of that remote. Add support for a new `clone.defaultRemoteName` config, with the newly-created remote name resolved in priority order: 1. (Highest priority) A remote name passed directly to `git clone -o` 2. A `clone.defaultRemoteName=new_name` in config `git clone -c` 3. A `clone.defaultRemoteName` value set in `/path/to/template/config`, where `--template=/path/to/template` is provided 4. A `clone.defaultRemoteName` value set in a non-template config file 5. The default value of `origin` Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Helped-by: Derrick Stolee <stolee@gmail.com> Helped-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 22:09:13 -07:00
Sean Barag	75ca3906b1	clone: read new remote name from remote_name instead of option_origin In a future patch, the name of the remote created by `git clone` may come from multiple sources. To avoid confusion, convert most uses of option_origin to remote_name, leaving option_origin to exclusively represent the -o/--origin option. Helped-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 22:09:13 -07:00
Sean Barag	ebe7e28a36	clone: validate --origin option before use Providing a bad origin name to `git clone` currently reports an 'invalid refspec' error instead of a more explicit message explaining that the `--origin` option was malformed. This behavior dates back to since `8434c2f1` (Build in clone, 2008-04-27). Reintroduce validation for the provided `--origin` option, but notably _don't_ include a multi-level check (e.g. "foo/bar") that was present in the original `git-clone.sh`. `git remote` allows multi-level remote names since at least `46220ca100` (remote.c: Fix overtight refspec validation, 2008-03-20), so that appears to be the desired behavior. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Derrick Stolee <stolee@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 22:09:13 -07:00
Sean Barag	f2c6fda886	refs: consolidate remote name validation In preparation for a future patch, extract from remote.c a function that validates possible remote names so that its rules can be used consistently in other places. Helped-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 22:09:13 -07:00
Sean Barag	552955ed7f	clone: use more conventional config/option layering Parsing command-line options before reading from config required careful handling to ensure CLI options were treated with higher priority. Read config first to let parsed CLI naively overwrite matching config values. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 22:09:13 -07:00
Jacob Keller	c0192df630	refspec: add support for negative refspecs Both fetch and push support pattern refspecs which allow fetching or pushing references that match a specific pattern. Because these patterns are globs, they have somewhat limited ability to express more complex situations. For example, suppose you wish to fetch all branches from a remote except for a specific one. To allow this, you must setup a set of refspecs which match only the branches you want. Because refspecs are either explicit name matches, or simple globs, many patterns cannot be expressed. Add support for a new type of refspec, referred to as "negative" refspecs. These are prefixed with a '^' and mean "exclude any ref matching this refspec". They can only have one "side" which always refers to the source. During a fetch, this refers to the name of the ref on the remote. During a push, this refers to the name of the ref on the local side. With negative refspecs, users can express more complex patterns. For example: git fetch origin refs/heads/:refs/remotes/origin/ ^refs/heads/dontwant will fetch all branches on origin into remotes/origin, but will exclude fetching the branch named dontwant. Refspecs today are commutative, meaning that order doesn't expressly matter. Rather than forcing an implied order, negative refspecs will always be applied last. That is, in order to match, a ref must match at least one positive refspec, and match none of the negative refspecs. This is similar to how negative pathspecs work. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 14:52:00 -07:00
Jeff King	75d3bee157	sparse-checkout: fill in some options boilerplate The sparse-checkout passes along argv and argc to its sub-command helper functions. Many of these sub-commands do not yet take any command-line options, and ignore those parameters. Let's instead add empty option lists and make sure we call parse_options(). That will give a useful error message for something like: git sparse-checkout list --nonsense which currently just silently ignores the unknown option. As a bonus, it also silences some -Wunused-parameter warnings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 12:53:48 -07:00
Jeff King	5b9427e0ac	push: drop unused repo argument to do_push() We stopped using the "repo" argument in `8e4c8af058` (push: disallow --all and refspecs when remote.<name>.mirror is set, 2019-09-02), which moved the pushremote handling to its caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 12:53:47 -07:00
Jeff King	8d2aa8dfac	assert PARSE_OPT_NONEG in parse-options callbacks In the spirit of `517fe807d6` (assert NOARG/NONEG behavior of parse-options callbacks, 2018-11-05), let's cover some parse-options callbacks which expect to be used with PARSE_OPT_NONEG but don't explicitly assert that this is the case. These callbacks are all used correctly in the current code, but this will help document their expectations and future-proof the code. As a bonus, it also silences -Wunused-parameters (these were added since the initial sweep of `517fe807d6`, and we can't yet turn on -Wunused-parameters to remind people because it has too many existing false positives). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 12:53:47 -07:00
Jeff King	424e28fcad	env--helper: write to opt->value in parseopt helper We use OPT_CALLBACK_F() to call the option_parse_type() callback, passing it the address of "cmdmode" as the value to write to. But the callback doesn't look at opt->value at all, and instead writes to a global variable. This works out because that's the same global variable we happen to pass in, but it's rather confusing. Let's use the passed-in value instead. We'll also make "cmdmode" a local variable of the main function, ensuring we can't make the same mistake again. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 12:53:47 -07:00
Jeff King	e885a84f1b	drop unused argc parameters Many functions take an argv/argc pair, but never actually look at argc. This makes it useless at best (we use the NULL sentinel in argv to find the end of the array), and misleading at worst (what happens if the argc count does not match the argv NULL?). In each of these instances, the argv NULL does match the argc count, so there are no bugs here. But let's tighten the interfaces to make it harder to get wrong (and to reduce some -Wunused-parameter complaints). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 12:53:47 -07:00
Junio C Hamano	299deeac8a	Merge branch 'ah/pull' Earlier we taught "git pull" to warn when the user does not say the histories need to be merged, rebased or accepts only fast- forwarding, but the warning triggered for those who have set the pull.ff configuration variable. * ah/pull: pull: don't warn if pull.ff has been set	2020-09-29 14:01:22 -07:00
Junio C Hamano	b28919c7bc	Merge branch 'bc/clone-with-git-default-hash-fix' "git clone" that clones from SHA-1 repository, while GIT_DEFAULT_HASH set to use SHA-256 already, resulted in an unusable repository that half-claims to be SHA-256 repository with SHA-1 objects and refs. This has been corrected. * bc/clone-with-git-default-hash-fix: builtin/clone: avoid failure with GIT_DEFAULT_HASH	2020-09-29 14:01:20 -07:00
Junio C Hamano	288ed98bf7	Merge branch 'tb/bloom-improvements' "git commit-graph write" learned to limit the number of bloom filters that are computed from scratch with the --max-new-filters option. * tb/bloom-improvements: commit-graph: introduce 'commitGraph.maxNewFilters' builtin/commit-graph.c: introduce '--max-new-filters=<n>' commit-graph: rename 'split_commit_graph_opts' bloom: encode out-of-bounds filters as non-empty bloom/diff: properly short-circuit on max_changes bloom: use provided 'struct bloom_filter_settings' bloom: split 'get_bloom_filter()' in two commit-graph.c: store maximum changed paths commit-graph: respect 'commitGraph.readChangedPaths' t/helper/test-read-graph.c: prepare repo settings commit-graph: pass a 'struct repository *' in more places t4216: use an '&&'-chain commit-graph: introduce 'get_bloom_filter_settings()'	2020-09-29 14:01:20 -07:00
Sergey Organov	d01141de5a	diff: get rid of redundant 'dense' argument Get rid of 'dense' argument that is redundant for every function that has 'struct rev_info *rev' argument as well, as the value of 'dense' passed is always taken from 'rev->dense_combined_merges' field. The only place where this was not the case is in 'submodule.c' where 'diff_tree_combined_merge()' was called with '1' for 'dense' argument. However, at that call the 'revs' instance used is local to the function, and we now just set 'revs->dense_combined_merges' to 1 in this local instance. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-29 11:54:53 -07:00
Jonathan Tan	23547c4051	fetch: do not override partial clone filter When a fetch with the --filter argument is made, the configured default filter is set even if one already exists. This change was made in `5e46139376` ("builtin/fetch: remove unique promisor remote limitation", 2019-06-25) - in particular, changing from: * If this is the FIRST partial-fetch request, we enable partial * on this repo and remember the given filter-spec as the default * for subsequent fetches to this remote. to: * If this is a partial-fetch request, we enable partial on * this repo if not already enabled and remember the given * filter-spec as the default for subsequent fetches to this * remote. (The given filter-spec is "remembered" even if there is already an existing one.) This is problematic whenever a lazy fetch is made, because lazy fetches are made using "git fetch --filter=blob:none", but this will also happen if the user invokes "git fetch --filter=<filter>" manually. Therefore, restore the behavior prior to `5e46139376`, which writes a filter-spec only if the current fetch request is the first partial-fetch one (for that remote). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-28 16:11:59 -07:00
Jeff King	63d24fa0b0	shortlog: allow multiple groups to be specified Now that shortlog supports reading from trailers, it can be useful to combine counts from multiple trailers, or between trailers and authors. This can be done manually by post-processing the output from multiple runs, but it's non-trivial to make sure that each name/commit pair is counted only once. This patch teaches shortlog to accept multiple --group options on the command line, and pull data from all of them. That makes it possible to run: git shortlog -ns --group=author --group=trailer:co-authored-by to get a shortlog that counts authors and co-authors equally. The implementation is mostly straightforward. The "group" enum becomes a bitfield, and the trailer key becomes a list. I didn't bother implementing the multi-group semantics for reading from stdin. It would be possible to do, but the existing matching code makes it awkward, and I doubt anybody cares. The duplicate suppression we used for trailers now covers authors and committers as well (though in non-trailer single-group mode we can skip the hash insertion and lookup, since we only see one value per commit). There is one subtlety: we now care about the case when no group bit is set (in which case we default to showing the author). The caller in builtin/log.c needs to be adapted to ask explicitly for authors, rather than relying on shortlog_init(). It would be possible with some gymnastics to make this keep working as-is, but it's not worth it for a single caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-27 12:21:05 -07:00
Jeff King	56d5dde752	shortlog: parse trailer idents Trailers don't necessarily contain name/email identity values, so shortlog has so far treated them as opaque strings. However, since many trailers do contain identities, it's useful to treat them as such when they can be parsed. That lets "-e" work as usual, as well as mailmap. When they can't be parsed, we'll continue with the old behavior of treating them as a single string (there's no new test for that here, since the existing tests cover a trailer like this). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-27 12:21:05 -07:00
Jeff King	87abb96222	shortlog: rename parse_stdin_ident() This function is actually useful for parsing any identity, whether from stdin or not. We'll need it for handling trailers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-27 12:21:05 -07:00
Jeff King	f17b0b99bf	shortlog: de-duplicate trailer values The current documentation is vague about what happens with --group=trailer:signed-off-by when we see a commit with: Signed-off-by: One Signed-off-by: Two Signed-off-by: One We clearly should credit both "One" and "Two", but should "One" get credited twice? The current code does so, but mostly because that was the easiest thing to do. It's probably more useful to count each commit at most once. This will become especially important when we allow values from multiple sources in a future patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-27 12:21:05 -07:00
Jeff King	47beb37bc6	shortlog: match commit trailers with --group If a project uses commit trailers, this patch lets you use shortlog to see who is performing each action. For example, running: git shortlog -ns --group=trailer:reviewed-by in git.git shows who has reviewed. You can even use a custom format to see things like who has helped whom: git shortlog --format="...helped %an (%ad)" \ --group=trailer:helped-by Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-27 12:21:05 -07:00
Jeff King	92338c450b	shortlog: add grouping option In preparation for adding more grouping types, let's refactor the committer/author grouping code and add a user-facing option that binds them together. In particular: - the main option is now "--group", to make it clear that the various group types are mutually exclusive. The "--committer" option is an alias for "--group=committer". - we keep an enum rather than a binary flag, to prepare for more values - we prefer switch statements to ternary assignment, since other group types will need more custom code Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-27 12:21:05 -07:00
Junio C Hamano	6c430a647c	Merge branch 'jx/proc-receive-hook' "git receive-pack" that accepts requests by "git push" learned to outsource most of the ref updates to the new "proc-receive" hook. * jx/proc-receive-hook: doc: add documentation for the proc-receive hook transport: parse report options for tracking refs t5411: test updates of remote-tracking branches receive-pack: new config receive.procReceiveRefs doc: add document for capability report-status-v2 New capability "report-status-v2" for git-push receive-pack: feed report options to post-receive receive-pack: add new proc-receive hook t5411: add basic test cases for proc-receive hook transport: not report a non-head push as a branch	2020-09-25 15:25:39 -07:00
Junio C Hamano	48794acc50	Merge branch 'ds/maintenance-part-1' A "git gc"'s big brother has been introduced to take care of more repository maintenance tasks, not limited to the object database cleaning. * ds/maintenance-part-1: maintenance: add trace2 regions for task execution maintenance: add auto condition for commit-graph task maintenance: use pointers to check --auto maintenance: create maintenance.<task>.enabled config maintenance: take a lock on the objects directory maintenance: add --task option maintenance: add commit-graph task maintenance: initialize task array maintenance: replace run_auto_gc() maintenance: add --quiet option maintenance: create basic maintenance runner	2020-09-25 15:25:38 -07:00
Derrick Stolee	2fec604f8d	maintenance: add start/stop subcommands Add new subcommands to 'git maintenance' that start or stop background maintenance using 'cron', when available. This integration is as simple as I could make it, barring some implementation complications. The schedule is laid out as follows: 0 1-23 * * * $cmd maintenance run --schedule=hourly 0 0 * * 1-6 $cmd maintenance run --schedule=daily 0 0 * * 0 $cmd maintenance run --schedule=weekly where $cmd is a properly-qualified 'git for-each-repo' execution: $cmd=$path/git --exec-path=$path for-each-repo --config=maintenance.repo where $path points to the location of the Git executable running 'git maintenance start'. This is critical for systems with multiple versions of Git. Specifically, macOS has a system version at '/usr/bin/git' while the version that users can install resides at '/usr/local/bin/git' (symlinked to '/usr/local/libexec/git-core/git'). This will also use your locally-built version if you build and run this in your development environment without installing first. This conditional schedule avoids having cron launch multiple 'git for-each-repo' commands in parallel. Such parallel commands would likely lead to the 'hourly' and 'daily' tasks competing over the object database lock. This could lead to to some tasks never being run! Since the --schedule=<frequency> argument will run all tasks with _at least_ the given frequency, the daily runs will also run the hourly tasks. Similarly, the weekly runs will also run the daily and hourly tasks. The GIT_TEST_CRONTAB environment variable is not intended for users to edit, but instead as a way to mock the 'crontab [-l]' command. This variable is set in test-lib.sh to avoid a future test from accidentally running anything with the cron integration from modifying the user's schedule. We use GIT_TEST_CRONTAB='test-tool crontab <file>' in our tests to check how the schedule is modified in 'git maintenance (start\|stop)' commands. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:59:44 -07:00
Derrick Stolee	0c18b70081	maintenance: add [un]register subcommands In preparation for launching background maintenance from the 'git maintenance' builtin, create register/unregister subcommands. These commands update the new 'maintenance.repos' config option in the global config so the background maintenance job knows which repositories to maintain. These commands allow users to add a repository to the background maintenance list without disrupting the actual maintenance mechanism. For example, a user can run 'git maintenance register' when no background maintenance is running and it will not start the background maintenance. A later update to start running background maintenance will then pick up this repository automatically. The opposite example is that a user can run 'git maintenance unregister' to remove the current repository from background maintenance without halting maintenance for other repositories. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:59:44 -07:00
Derrick Stolee	4950b2a2b5	for-each-repo: run subcommands on configured repos It can be helpful to store a list of repositories in global or system config and then iterate Git commands on that list. Create a new builtin that makes this process simple for experts. We will use this builtin to run scheduled maintenance on all configured repositories in a future change. The test is very simple, but does highlight that the "--" argument is optional. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:59:44 -07:00
Derrick Stolee	b08ff1fee0	maintenance: add --schedule option and config Maintenance currently triggers when certain data-size thresholds are met, such as number of pack-files or loose objects. Users may want to run certain maintenance tasks based on frequency instead. For example, a user may want to perform a 'prefetch' task every hour, or 'gc' task every day. To help these users, update the 'git maintenance run' command to include a '--schedule=<frequency>' option. The allowed frequencies are 'hourly', 'daily', and 'weekly'. These values are also allowed in a new config value 'maintenance.<task>.schedule'. The 'git maintenance run --schedule=<frequency>' checks the '.schedule' config value for each enabled task to see if the configured frequency is at least as frequent as the frequency from the '--schedule' argument. We use the following order, for full clarity: 'hourly' > 'daily' > 'weekly' Use new 'enum schedule_priority' to track these values numerically. The following cron table would run the scheduled tasks with the correct frequencies: 0 1-23 * * git -C <repo> maintenance run --schedule=hourly 0 0 * * 1-6 git -C <repo> maintenance run --schedule=daily 0 0 * * 0 git -C <repo> maintenance run --schedule=weekly This cron schedule will run --schedule=hourly every hour except at midnight. This avoids a concurrent run with the --schedule=daily that runs at midnight every day except the first day of the week. This avoids a concurrent run with the --schedule=weekly that runs at midnight on the first day of the week. Since --schedule=daily also runs the 'hourly' tasks and --schedule=weekly runs the 'hourly' and 'daily' tasks, we will still see all tasks run with the proper frequencies. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:59:44 -07:00
Derrick Stolee	e841a79a13	maintenance: add incremental-repack auto condition The incremental-repack task updates the multi-pack-index by deleting pack- files that have been replaced with new packs, then repacking a batch of small pack-files into a larger pack-file. This incremental repack is faster than rewriting all object data, but is slower than some other maintenance activities. The 'maintenance.incremental-repack.auto' config option specifies how many pack-files should exist outside of the multi-pack-index before running the step. These pack-files could be created by 'git fetch' commands or by the loose-objects task. The default value is 10. Setting the option to zero disables the task with the '--auto' option, and a negative value makes the task run every time. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:53:05 -07:00
Derrick Stolee	a13e3d0ec8	maintenance: auto-size incremental-repack batch When repacking during the 'incremental-repack' task, we use the --batch-size option in 'git multi-pack-index repack'. The initial setting used --batch-size=0 to repack everything into a single pack-file. This is not sustainable for a large repository. The amount of work required is also likely to use too many system resources for a background job. Update the 'incremental-repack' task by dynamically computing a --batch-size option based on the current pack-file structure. The dynamic default size is computed with this idea in mind for a client repository that was cloned from a very large remote: there is likely one "big" pack-file that was created at clone time. Thus, do not try repacking it as it is likely packed efficiently by the server. Instead, we select the second-largest pack-file, and create a batch size that is one larger than that pack-file. If there are three or more pack-files, then this guarantees that at least two will be combined into a new pack-file. Of course, this means that the second-largest pack-file size is likely to grow over time and may eventually surpass the initially-cloned pack-file. Recall that the pack-file batch is selected in a greedy manner: the packs are considered from oldest to newest and are selected if they have size smaller than the batch size until the total selected size is larger than the batch size. Thus, that oldest "clone" pack will be first to repack after the new data creates a pack larger than that. We also want to place some limits on how large these pack-files become, in order to bound the amount of time spent repacking. A maximum batch-size of two gigabytes means that large repositories will never be packed into a single pack-file using this job, but also that repack is rather expensive. This is a trade-off that is valuable to have if the maintenance is being run automatically or in the background. Users who truly want to optimize for space and performance (and are willing to pay the upfront cost of a full repack) can use the 'gc' task to do so. Create a test for this two gigabyte limit by creating an EXPENSIVE test that generates two pack-files of roughly 2.5 gigabytes in size, then performs an incremental repack. Check that the --batch-size argument in the subcommand uses the hard-coded maximum. Helped-by: Chris Torek <chris.torek@gmail.com> Reported-by: Son Luong Ngoc <sluongng@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:53:05 -07:00
Derrick Stolee	52fe41ff1c	maintenance: add incremental-repack task The previous change cleaned up loose objects using the 'loose-objects' that can be run safely in the background. Add a similar job that performs similar cleanups for pack-files. One issue with running 'git repack' is that it is designed to repack all pack-files into a single pack-file. While this is the most space-efficient way to store object data, it is not time or memory efficient. This becomes extremely important if the repo is so large that a user struggles to store two copies of the pack on their disk. Instead, perform an "incremental" repack by collecting a few small pack-files into a new pack-file. The multi-pack-index facilitates this process ever since 'git multi-pack-index expire' was added in `19575c7` (multi-pack-index: implement 'expire' subcommand, 2019-06-10) and 'git multi-pack-index repack' was added in `ce1e4a1` (midx: implement midx_repack(), 2019-06-10). The 'incremental-repack' task runs the following steps: 1. 'git multi-pack-index write' creates a multi-pack-index file if one did not exist, and otherwise will update the multi-pack-index with any new pack-files that appeared since the last write. This is particularly relevant with the background fetch job. When the multi-pack-index sees two copies of the same object, it stores the offset data into the newer pack-file. This means that some old pack-files could become "unreferenced" which I will use to mean "a pack-file that is in the pack-file list of the multi-pack-index but none of the objects in the multi-pack-index reference a location inside that pack-file." 2. 'git multi-pack-index expire' deletes any unreferenced pack-files and updaes the multi-pack-index to drop those pack-files from the list. This is safe to do as concurrent Git processes will see the multi-pack-index and not open those packs when looking for object contents. (Similar to the 'loose-objects' job, there are some Git commands that open pack-files regardless of the multi-pack-index, but they are rarely used. Further, a user that self-selects to use background operations would likely refrain from using those commands.) 3. 'git multi-pack-index repack --bacth-size=<size>' collects a set of pack-files that are listed in the multi-pack-index and creates a new pack-file containing the objects whose offsets are listed by the multi-pack-index to be in those objects. The set of pack- files is selected greedily by sorting the pack-files by modified time and adding a pack-file to the set if its "expected size" is smaller than the batch size until the total expected size of the selected pack-files is at least the batch size. The "expected size" is calculated by taking the size of the pack-file divided by the number of objects in the pack-file and multiplied by the number of objects from the multi-pack-index with offset in that pack-file. The expected size approximates how much data from that pack-file will contribute to the resulting pack-file size. The intention is that the resulting pack-file will be close in size to the provided batch size. The next run of the incremental-repack task will delete these repacked pack-files during the 'expire' step. In this version, the batch size is set to "0" which ignores the size restrictions when selecting the pack-files. It instead selects all pack-files and repacks all packed objects into a single pack-file. This will be updated in the next change, but it requires doing some calculations that are better isolated to a separate change. These steps are based on a similar background maintenance step in Scalar (and VFS for Git) [1]. This was incredibly effective for users of the Windows OS repository. After using the same VFS for Git repository for over a year, some users had _thousands_ of pack-files that combined to up to 250 GB of data. We noticed a few users were running into the open file descriptor limits (due in part to a bug in the multi-pack-index fixed by `af96fe3` (midx: add packs to packed_git linked list, 2019-04-29). These pack-files were mostly small since they contained the commits and trees that were pushed to the origin in a given hour. The GVFS protocol includes a "prefetch" step that asks for pre-computed pack- files containing commits and trees by timestamp. These pack-files were grouped into "daily" pack-files once a day for up to 30 days. If a user did not request prefetch packs for over 30 days, then they would get the entire history of commits and trees in a new, large pack-file. This led to a large number of pack-files that had poor delta compression. By running this pack-file maintenance step once per day, these repos with thousands of packs spanning 200+ GB dropped to dozens of pack- files spanning 30-50 GB. This was done all without removing objects from the system and using a constant batch size of two gigabytes. Once the work was done to reduce the pack-files to small sizes, the batch size of two gigabytes means that not every run triggers a repack operation, so the following run will not expire a pack-file. This has kept these repos in a "clean" state. [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/PackfileMaintenanceStep.cs Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:53:04 -07:00
Derrick Stolee	3e220e6069	maintenance: create auto condition for loose-objects The loose-objects task deletes loose objects that already exist in a pack-file, then place the remaining loose objects into a new pack-file. If this step runs all the time, then we risk creating pack-files with very few objects with every 'git commit' process. To prevent overwhelming the packs directory with small pack-files, place a minimum number of objects to justify the task. The 'maintenance.loose-objects.auto' config option specifies a minimum number of loose objects to justify the task to run under the '--auto' option. This defaults to 100 loose objects. Setting the value to zero will prevent the step from running under '--auto' while a negative value will force it to run every time. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:53:04 -07:00
Derrick Stolee	252cfb7cb8	maintenance: add loose-objects task One goal of background maintenance jobs is to allow a user to disable auto-gc (gc.auto=0) but keep their repository in a clean state. Without any cleanup, loose objects will clutter the object database and slow operations. In addition, the loose objects will take up extra space because they are not stored with deltas against similar objects. Create a 'loose-objects' task for the 'git maintenance run' command. This helps clean up loose objects without disrupting concurrent Git commands using the following sequence of events: 1. Run 'git prune-packed' to delete any loose objects that exist in a pack-file. Concurrent commands will prefer the packed version of the object to the loose version. (Of course, there are exceptions for commands that specifically care about the location of an object. These are rare for a user to run on purpose, and we hope a user that has selected background maintenance will not be trying to do foreground maintenance.) 2. Run 'git pack-objects' on a batch of loose objects. These objects are grouped by scanning the loose object directories in lexicographic order until listing all loose objects -or- reaching 50,000 objects. This is more than enough if the loose objects are created only by a user doing normal development. We noticed users with _millions_ of loose objects because VFS for Git downloads blobs on-demand when a file read operation requires populating a virtual file. This step is based on a similar step in Scalar [1] and VFS for Git. [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/LooseObjectsStep.cs Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:53:04 -07:00
Derrick Stolee	28cb5e66dd	maintenance: add prefetch task When working with very large repositories, an incremental 'git fetch' command can download a large amount of data. If there are many other users pushing to a common repo, then this data can rival the initial pack-file size of a 'git clone' of a medium-size repo. Users may want to keep the data on their local repos as close as possible to the data on the remote repos by fetching periodically in the background. This can break up a large daily fetch into several smaller hourly fetches. The task is called "prefetch" because it is work done in advance of a foreground fetch to make that 'git fetch' command much faster. However, if we simply ran 'git fetch <remote>' in the background, then the user running a foreground 'git fetch <remote>' would lose some important feedback when a new branch appears or an existing branch updates. This is especially true if a remote branch is force-updated and this isn't noticed by the user because it occurred in the background. Further, the functionality of 'git push --force-with-lease' becomes suspect. When running 'git fetch <remote> <options>' in the background, use the following options for careful updating: 1. --no-tags prevents getting a new tag when a user wants to see the new tags appear in their foreground fetches. 2. --refmap= removes the configured refspec which usually updates refs/remotes/<remote>/* with the refs advertised by the remote. While this looks confusing, this was documented and tested by `b40a50264a` (fetch: document and test --refmap="", 2020-01-21), including this sentence in the documentation: Providing an empty `<refspec>` to the `--refmap` option causes Git to ignore the configured refspecs and rely entirely on the refspecs supplied as command-line arguments. 3. By adding a new refspec "+refs/heads/:refs/prefetch/<remote>/" we can ensure that we actually load the new values somewhere in our refspace while not updating refs/heads or refs/remotes. By storing these refs here, the commit-graph job will update the commit-graph with the commits from these hidden refs. 4. --prune will delete the refs/prefetch/<remote> refs that no longer appear on the remote. 5. --no-write-fetch-head prevents updating FETCH_HEAD. We've been using this step as a critical background job in Scalar [1] (and VFS for Git). This solved a pain point that was showing up in user reports: fetching was a pain! Users do not like waiting to download the data that was created while they were away from their machines. After implementing background fetch, the foreground fetch commands sped up significantly because they mostly just update refs and download a small amount of new data. The effect is especially dramatic when paried with --no-show-forced-udpates (through fetch.showForcedUpdates=false). [1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/FetchStep.cs Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:53:04 -07:00
Jeff King	45d93eb824	shortlog: change "author" variables to "ident" We already match "committer", and we're about to start matching more things. Let's use a more neutral variable to avoid confusion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:47:50 -07:00
Christian Couder	73c6de06af	bisect: don't use invalid oid as rev when starting In `06f5608c14` (bisect--helper: `bisect_start` shell function partially in C, 2019-01-02), we changed the following shell code: - rev=$(git rev-parse -q --verify "$arg^{commit}") \|\| { - test $has_double_dash -eq 1 && - die "$(eval_gettext "'\$arg' does not appear to be a valid revision")" - break - } - revs="$revs $rev" into: + char *commit_id = xstrfmt("%s^{commit}", arg); + if (get_oid(commit_id, &oid) && has_double_dash) + die(_("'%s' does not appear to be a valid " + "revision"), arg); + + string_list_append(&revs, oid_to_hex(&oid)); + free(commit_id); In case of an invalid "arg" when "has_double_dash" is false, the old code would "break" out of the argument loop. In the new C code though, `oid_to_hex(&oid)` is unconditonally appended to "revs". This is wrong first because "oid" is junk as `get_oid(commit_id, &oid)` failed and second because it doesn't break out of the argument loop. Not breaking out of the argument loop means that "arg" is then not treated as a path restriction (which is wrong). Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 09:57:48 -07:00
Alex Henrie	54200cef86	pull: don't warn if pull.ff has been set A user who understands enough to set pull.ff does not need additional instructions. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-24 23:04:27 -07:00
Junio C Hamano	610e2b9240	blame: validate and peel the object names on the ignore list The command reads list of object names to place on the ignore list either from the command line or from a file, but they are not checked with their object type (those read from the file are not even checked for object existence). Extend the oidset_parse_file() API and allow it to take a callback that can be used to die (e.g. when an inappropriate input is read) or modify the object name read (e.g. when a tag pointing at a commit is read, and the caller wants a commit object name), and use it in the code that handles ignore list. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-24 22:20:58 -07:00
Jeff King	176380fd11	Revert "fast-export: use local array to store anonymized oid" This reverts commit `f39ad38410`. That commit was trying to silence a type-punning warning on older versions of gcc. However, its analysis was all wrong. I didn't notice that we _were_ in fact type-punning because there are two versions of put_be32(): one that uses casts and unaligned loads, and another that uses bitshifts. I looked at the latter, but on my platform we were defaulting to the former. However, as of the previous commit, we'll always use the bitshift version. So we can drop this hackery to avoid the warning, making the code slightly cleaner. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-24 12:30:11 -07:00
Pranit Bauva	517ecb3161	bisect--helper: reimplement `bisect_next` and `bisect_auto_next` shell functions in C Reimplement the `bisect_next()` and the `bisect_auto_next()` shell functions in C and add the subcommands to `git bisect--helper` to call them from git-bisect.sh . bisect_auto_next() function returns an enum bisect_error type as whole `git bisect` can exit with an error code when bisect_next() does. Return an error when `bisect_next()` fails, that fix a bug on shell script version. Using `--bisect-next` and `--bisect-auto-next` subcommands is a temporary measure to port shell function to C so as to use the existing test suite. As more functions are ported, `--bisect-auto-next` subcommand will be retired and will be called by some other methods. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-24 12:06:30 -07:00
Pranit Bauva	09535f056b	bisect--helper: reimplement `bisect_autostart` shell function in C Reimplement the `bisect_autostart()` shell function in C and add the C implementation from `bisect_next()` which was previously left uncovered. Add `--bisect-autostart` subcommand to be called from git-bisect.sh. Using `--bisect-autostart` subcommand is a temporary measure to port the shell function to C so as to use the existing test suite. As more functions are ported, this subcommand will be retired and bisect_autostart() will be called directly by `bisect_state()`. Change behavior of shell script that returned success when user aborted the bisection. Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-24 12:06:30 -07:00
Junio C Hamano	6854689e65	Merge branch 'ar/fetch-ipversion-in-all' "git fetch --all --ipv4/--ipv6" forgot to pass the protocol options to instances of the "git fetch" that talk to individual remotes, which has been corrected. * ar/fetch-ipversion-in-all: fetch: pass --ipv4 and --ipv6 options to sub-fetches	2020-09-22 12:36:34 -07:00
Junio C Hamano	39149df364	Merge branch 'cs/don-t-pretend-a-failed-remote-set-head-succeeded' "git remote set-head" that failed still said something that hints the operation went through, which was misleading. * cs/don-t-pretend-a-failed-remote-set-head-succeeded: remote: don't show success message when set-head fails	2020-09-22 12:36:32 -07:00
Junio C Hamano	26a3728bed	Merge branch 'al/ref-filter-merged-and-no-merged' "git for-each-ref" and friends that list refs used to allow only one --merged or --no-merged to filter them; they learned to take combination of both kind of filtering. * al/ref-filter-merged-and-no-merged: Doc: prefer more specific file name ref-filter: make internal reachable-filter API more precise ref-filter: allow merged and no-merged filters Doc: cover multiple contains/no-contains filters t3201: test multiple branch filter combinations	2020-09-22 12:36:31 -07:00
Junio C Hamano	4aff18a3f0	Merge branch 'ls/mergetool-meld-auto-merge' The 'meld' backend of the "git mergetool" learned to give the underlying 'meld' the '--auto-merge' option, which would help reduce the amount of text that requires manual merging. * ls/mergetool-meld-auto-merge: mergetool: allow auto-merge for meld to follow the vim-diff behavior	2020-09-22 12:36:29 -07:00
Junio C Hamano	b7e65b51e5	Merge branch 'jt/threaded-index-pack' "git index-pack" learned to resolve deltified objects with greater parallelism. * jt/threaded-index-pack: index-pack: make quantum of work smaller index-pack: make resolve_delta() assume base data index-pack: calculate {ref,ofs}_{first,last} early index-pack: remove redundant child field index-pack: unify threaded and unthreaded code index-pack: remove redundant parameter Documentation: deltaBaseCacheLimit is per-thread	2020-09-22 12:36:28 -07:00
Junio C Hamano	634e0084fa	Merge branch 'es/format-patch-interdiff-cleanup' "format-patch --range-diff=<prev> <origin>..HEAD" has been taught not to ignore <origin> when <prev> is a single version. * es/format-patch-interdiff-cleanup: format-patch: use 'origin' as start of current-series-range when known diff-lib: tighten show_interdiff()'s interface diff: move show_interdiff() from its own file to diff-lib	2020-09-22 12:36:28 -07:00
Junio C Hamano	bcb68bff80	Merge branch 'os/fetch-submodule-optim' Optimization around submodule handling. * os/fetch-submodule-optim: fetch: do not look for submodule changes in unchanged refs	2020-09-22 12:36:28 -07:00
brian m. carlson	47ac970309	builtin/clone: avoid failure with GIT_DEFAULT_HASH If a user is cloning a SHA-1 repository with GIT_DEFAULT_HASH set to "sha256", then we can end up with a repository where the repository format version is 0 but the extensions.objectformat key is set to "sha256". This is both wrong (the user has a SHA-1 repository) and nonfunctional (because the extension cannot be used in a v0 repository). This happens because in a clone, we initially set up the repository, and then change its algorithm based on what the remote side tells us it's using. We've initially set up the repository as SHA-256 in this case, and then later on reset the repository version without clearing the extension. We could just always set the extension in this case, but that would mean that our SHA-1 repositories weren't compatible with older Git versions, even though there's no reason why they shouldn't be. And we also don't want to initialize the repository as SHA-1 initially, since that means if we're cloning an empty repository, we'll have failed to honor the GIT_DEFAULT_HASH variable and will end up with a SHA-1 repository, not a SHA-256 repository. Neither of those are appealing, so let's tell the repository initialization code if we're doing a reinit like this, and if so, to clear the extension if we're using SHA-1. This makes sure we produce a valid and functional repository and doesn't break any of our other use cases. Reported-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-22 09:22:32 -07:00
Johannes Schindelin	5a0c32bd4b	fast-export: avoid using unnecessary language in a code comment In an ongoing effort to avoid non-inclusive language, let's avoid using the branch name "master" in a code comment. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-21 15:19:27 -07:00
Denton Liu	3d09c22869	builtin/diff-tree: learn --merge-base The previous commit introduced ---merge-base a way to take the diff between the working tree or index and the merge base between an arbitrary commit and HEAD. It makes sense to extend this option to support the case where two commits are given too and behave in a manner identical to `git diff A...B`. Introduce the --merge-base flag as an alternative to triple-dot notation. Thus, we would be able to write the above as `git diff --merge-base A B`. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-21 13:37:03 -07:00
Denton Liu	0f5a1d449b	builtin/diff-index: learn --merge-base There is currently no easy way to take the diff between the working tree or index and the merge base between an arbitrary commit and HEAD. Even diff's `...` notation doesn't allow this because it only works between commits. However, the ability to do this would be desirable to a user who would like to see all the changes they've made on a branch plus uncommitted changes without taking into account changes made in the upstream branch. Teach diff-index and diff (with one commit) the --merge-base option which allows a user to use the merge base of a commit and HEAD as the "before" side. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-20 21:30:26 -07:00
Denton Liu	4c3fe82ef1	diff-lib: accept option flags in run_diff_index() In a future commit, we will teach run_diff_index() to accept more options via flag bits. For now, change `cached` into a flag in the `option` bitfield. The behaviour should remain exactly the same. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-20 21:30:26 -07:00
Junio C Hamano	80cacaec41	Merge branch 'mt/config-fail-nongit-early' Unlike "git config --local", "git config --worktree" did not fail early and cleanly when started outside a git repository. * mt/config-fail-nongit-early: config: complain about --worktree outside of a git repo	2020-09-18 17:58:06 -07:00
Junio C Hamano	9d4e7ec4d9	Merge branch 'jc/quote-path-cleanup' "git status --short" quoted a path with SP in it when tracked, but not those that are untracked, ignored or unmerged. They are all shown quoted consistently. * jc/quote-path-cleanup: quote: turn 'nodq' parameter into a set of flags quote: rename misnamed sq_lookup[] to cq_lookup[] wt-status: consistently quote paths in "status --short" output quote_path: code clarification quote_path: optionally allow quoting a path with SP in it quote_path: give flags parameter to quote_path() quote_path: rename quote_path_relative() to quote_path()	2020-09-18 17:58:04 -07:00
Junio C Hamano	45f462b5c5	Merge branch 'es/wt-add-detach' "git worktree add" learns that the "-d" is a synonym to "--detach" option to create a new worktree without being on a branch. * es/wt-add-detach: git-worktree.txt: discuss branch-based vs. throwaway worktrees worktree: teach `add` to recognize -d as shorthand for --detach git-checkout.txt: document -d short option for --detach	2020-09-18 17:58:04 -07:00
Junio C Hamano	e96b271d18	Merge branch 'jc/add-i-use-builtin-experimental' The "add -i/-p" machinery has been written in C but it is not used by default yet. It is made default to those who are participating in feature.experimental experiment. * jc/add-i-use-builtin-experimental: add -i: use the built-in version when feature.experimental is set	2020-09-18 17:58:02 -07:00
Junio C Hamano	21de7e9c50	Merge branch 'rs/refspec-leakfix' Leakfix. * rs/refspec-leakfix: refspec: add and use refspec_appendf() push: release strbufs used for refspec formatting	2020-09-18 17:58:00 -07:00
Junio C Hamano	9b8074427b	Merge branch 'rs/misc-cleanups' Misc cleanups. * rs/misc-cleanups: pack-bitmap-write: use hashwrite_be32() in write_hash_cache() midx: use hashwrite_u8() in write_midx_header() fast-import: use write_pack_header()	2020-09-18 17:58:00 -07:00
Taylor Blau	d356d5debe	commit-graph: introduce 'commitGraph.maxNewFilters' Introduce a configuration variable to specify a default value for the recently-introduce '--max-new-filters' option of 'git commit-graph write'. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-18 10:39:22 -07:00
Taylor Blau	809e0327f5	builtin/commit-graph.c: introduce '--max-new-filters=<n>' Introduce a command-line flag to specify the maximum number of new Bloom filters that a 'git commit-graph write' is willing to compute from scratch. Prior to this patch, a commit-graph write with '--changed-paths' would compute Bloom filters for all selected commits which haven't already been computed (i.e., by a previous commit-graph write with '--split' such that a roll-up or replacement is performed). This behavior can cause prohibitively-long commit-graph writes for a variety of reasons: * There may be lots of filters whose diffs take a long time to generate (for example, they have close to the maximum number of changes, diffing itself takes a long time, etc). * Old-style commit-graphs (which encode filters with too many entries as not having been computed at all) cause us to waste time recomputing filters that appear to have not been computed only to discover that they are too-large. This can make the upper-bound of the time it takes for 'git commit-graph write --changed-paths' to be rather unpredictable. To make this command behave more predictably, introduce '--max-new-filters=<n>' to allow computing at most '<n>' Bloom filters from scratch. This lets "computing" already-known filters proceed quickly, while bounding the number of slow tasks that Git is willing to do. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-18 10:35:39 -07:00
Taylor Blau	98bb796191	commit-graph: rename 'split_commit_graph_opts' In the subsequent commit, additional options will be added to the commit-graph API which have nothing to do with splitting. Rename the 'split_commit_graph_opts' structure to the more-generic 'commit_graph_opts' to encompass both. Likewise, rename the 'flags' member to instead be 'split_flags' to clarify that it only has to do with the behavior implied by '--split'. Suggested-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 21:55:50 -07:00
Christian Schlack	5a07c6c3c2	remote: don't show success message when set-head fails Suppress the message 'origin/HEAD set to master' in case of an error. $ git remote set-head origin -a error: Not a valid ref: refs/remotes/origin/master origin/HEAD set to master Signed-off-by: Christian Schlack <christian@backhub.co> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:40:17 -07:00
Derrick Stolee	25914c4fde	maintenance: add trace2 regions for task execution Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	4ddc79b2da	maintenance: add auto condition for commit-graph task Instead of writing a new commit-graph in every 'git maintenance run --auto' process (when maintenance.commit-graph.enalbed is configured to be true), only write when there are "enough" commits not in a commit-graph file. This count is controlled by the maintenance.commit-graph.auto config option. To compute the count, use a depth-first search starting at each ref, and leaving markers using the SEEN flag. If this count reaches the limit, then terminate early and start the task. Otherwise, this operation will peel every ref and parse the commit it points to. If these are all in the commit-graph, then this is typically a very fast operation. Users with many refs might feel a slow-down, and hence could consider updating their limit to be very small. A negative value will force the step to run every time. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	916d0626c2	maintenance: use pointers to check --auto The 'git maintenance run' command has an '--auto' option. This is used by other Git commands such as 'git commit' or 'git fetch' to check if maintenance should be run after adding data to the repository. Previously, this --auto option was only used to add the argument to the 'git gc' command as part of the 'gc' task. We will be expanding the other tasks to perform a check to see if they should do work as part of the --auto flag, when they are enabled by config. First, update the 'gc' task to perform the auto check inside the maintenance process. This prevents running an extra 'git gc --auto' command when not needed. It also shows a model for other tasks. Second, use the 'auto_condition' function pointer as a signal for whether we enable the maintenance task under '--auto'. For instance, we do not want to enable the 'fetch' task in '--auto' mode, so that function pointer will remain NULL. Now that we are not automatically calling 'git gc', a test in t5514-fetch-multiple.sh must be changed to watch for 'git maintenance' instead. We continue to pass the '--auto' option to the 'git gc' command when necessary, because of the gc.autoDetach config option changes behavior. Likely, we will want to absorb the daemonizing behavior implied by gc.autoDetach as a maintenance.autoDetach config option. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	65d655b52d	maintenance: create maintenance.<task>.enabled config Currently, a normal run of "git maintenance run" will only run the 'gc' task, as it is the only one enabled. This is mostly for backwards- compatible reasons since "git maintenance run --auto" commands replaced previous "git gc --auto" commands after some Git processes. Users could manually run specific maintenance tasks by calling "git maintenance run --task=<task>" directly. Allow users to customize which steps are run automatically using config. The 'maintenance.<task>.enabled' option then can turn on these other tasks (or turn off the 'gc' task). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	d7514f6ed5	maintenance: take a lock on the objects directory Performing maintenance on a Git repository involves writing data to the .git directory, which is not safe to do with multiple writers attempting the same operation. Ensure that only one 'git maintenance' process is running at a time by holding a file-based lock. Simply the presence of the .git/maintenance.lock file will prevent future maintenance. This lock is never committed, since it does not represent meaningful data. Instead, it is only a placeholder. If the lock file already exists, then no maintenance tasks are attempted. This will become very important later when we implement the 'prefetch' task, as this is our stop-gap from creating a recursive process loop between 'git fetch' and 'git maintenance run --auto'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	090511bc0b	maintenance: add --task option A user may want to only run certain maintenance tasks in a certain order. Add the --task=<task> option, which allows a user to specify an ordered list of tasks to run. These cannot be run multiple times, however. Here is where our array of maintenance_task pointers becomes critical. We can sort the array of pointers based on the task order, but we do not want to move the struct data itself in order to preserve the hashmap references. We use the hashmap to match the --task=<task> arguments into the task struct data. Keep in mind that the 'enabled' member of the maintenance_task struct is a placeholder for a future 'maintenance.<task>.enabled' config option. Thus, we use the 'enabled' member to specify which tasks are run when the user does not specify any --task=<task> arguments. The 'enabled' member should be ignored if --task=<task> appears. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	663b2b1b90	maintenance: add commit-graph task The first new task in the 'git maintenance' builtin is the 'commit-graph' task. This updates the commit-graph file incrementally with the command git commit-graph write --reachable --split By writing an incremental commit-graph file using the "--split" option we minimize the disruption from this operation. The default behavior is to merge layers until the new "top" layer is less than half the size of the layer below. This provides quick writes most of the time, with the longer writes following a power law distribution. Most importantly, concurrent Git processes only look at the commit-graph-chain file for a very short amount of time, so they will verly likely not be holding a handle to the file when we try to replace it. (This only matters on Windows.) If a concurrent process reads the old commit-graph-chain file, but our job expires some of the .graph files before they can be read, then those processes will see a warning message (but not fail). This could be avoided by a future update to use the --expire-time argument when writing the commit-graph. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	3103e9848f	maintenance: initialize task array In anticipation of implementing multiple maintenance tasks inside the 'maintenance' builtin, use a list of structs to describe the work to be done. The struct maintenance_task stores the name of the task (as given by a future command-line argument) along with a function pointer to its implementation and a boolean for whether the step is enabled. A list these structs are initialized with the full list of implemented tasks along with a default order. For now, this list only contains the "gc" task. This task is also the only task enabled by default. The run subcommand will return a nonzero exit code if any task fails. However, it will attempt all tasks in its loop before returning with the failure. Also each failed task will print an error message. Helped-by: Taylor Blau <me@ttaylorr.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	a95ce12430	maintenance: replace run_auto_gc() The run_auto_gc() method is used in several places to trigger a check for repo maintenance after some Git commands, such as 'git commit' or 'git fetch'. To allow for extra customization of this maintenance activity, replace the 'git gc --auto [--quiet]' call with one to 'git maintenance run --auto [--quiet]'. As we extend the maintenance builtin with other steps, users will be able to select different maintenance activities. Rename run_auto_gc() to run_auto_maintenance() to be clearer what is happening on this call, and to expose all callers in the current diff. Rewrite the method to use a struct child_process to simplify the calls slightly. Since 'git fetch' already allows disabling the 'git gc --auto' subprocess, add an equivalent option with a different name to be more descriptive of the new behavior: '--[no-]maintenance'. Update the documentation to include these options at the same time. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	3ddaad0e06	maintenance: add --quiet option Maintenance activities are commonly used as steps in larger scripts. Providing a '--quiet' option allows those scripts to be less noisy when run on a terminal window. Turn this mode on by default when stderr is not a terminal. Pipe the option to the 'git gc' child process. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:05 -07:00
Derrick Stolee	2057d75038	maintenance: create basic maintenance runner The 'gc' builtin is our current entrypoint for automatically maintaining a repository. This one tool does many operations, such as repacking the repository, packing refs, and rewriting the commit-graph file. The name implies it performs "garbage collection" which means several different things, and some users may not want to use this operation that rewrites the entire object database. Create a new 'maintenance' builtin that will become a more general- purpose command. To start, it will only support the 'run' subcommand, but will later expand to add subcommands for scheduling maintenance in the background. For now, the 'maintenance' builtin is a thin shim over the 'gc' builtin. In fact, the only option is the '--auto' toggle, which is handed directly to the 'gc' builtin. The current change is isolated to this simple operation to prevent more interesting logic from being lost in all of the boilerplate of adding a new builtin. Use existing builtin/gc.c file because we want to share code between the two builtins. It is possible that we will have 'maintenance' replace the 'gc' builtin entirely at some point, leaving 'git gc' as an alias for some specific arguments to 'git maintenance run'. Create a new test_subcommand helper that allows us to test if a certain subcommand was run. It requires storing the GIT_TRACE2_EVENT logs in a file. A negation mode is available that will be used in later tests. Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:30:04 -07:00
Lin Sun	dbd8c09bfe	mergetool: allow auto-merge for meld to follow the vim-diff behavior Make the mergetool used with "meld" backend behave similarly to "vimdiff" by telling it to auto-merge non-conflicting parts and highlight the conflicting parts when `mergetool.meld.useAutoMerge` is configured with `true`, or `auto` for detecting the `--auto-merge` option automatically. Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Helped-by: David Aguilar <davvid@gmail.com> Signed-off-by: Lin Sun <lin.sun@zoom.us> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-16 17:11:20 -07:00
Aaron Lipman	21bf933928	ref-filter: allow merged and no-merged filters Enable ref-filter to process multiple merged and no-merged filters, and extend functionality to git branch, git tag and git for-each-ref. This provides an easy way to check for branches that are "graduation candidates:" $ git branch --no-merged master --merged next If passed more than one merged (or more than one no-merged) filter, refs must be reachable from any one of the merged commits, and reachable from none of the no-merged commits. Signed-off-by: Aaron Lipman <alipman88@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-16 12:38:10 -07:00
Alex Riesen	4e735c1326	fetch: pass --ipv4 and --ipv6 options to sub-fetches The options indicate user intent for the whole fetch operation, and ignoring them in sub-fetches (i.e. "--all" and recursive fetching of submodules) is quite unexpected when, for instance, it is intended to limit all of the communication to a specific transport protocol for some reason. Signed-off-by: Alex Riesen <alexander.riesen@cetitec.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-15 14:15:05 -07:00
Junio C Hamano	88910c9939	quote_path: give flags parameter to quote_path() The quote_path() function computes a path (relative to its base directory) and c-quotes the result if necessary. Teach it to take a flags parameter to allow its behaviour to be enriched later. No behaviour change intended. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-10 10:49:19 -07:00
Junio C Hamano	c34d24b8a4	quote_path: rename quote_path_relative() to quote_path() There is no quote_path_absolute() or anything that causes confusion, and one of the two large consumers already rename the long name locally with a preprocessor macro. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-10 10:49:17 -07:00
Junio C Hamano	0df670bc0b	Merge branch 'jt/interpret-branch-name-fallback' "git status" has trouble showing where it came from by interpreting reflog entries that recordcertain events, e.g. "checkout @{u}", and gives a hard/fatal error. Even though it inherently is impossible to give a correct answer because the reflog entries lose some information (e.g. "@{u}" does not record what branch the user was on hence which branch 'the upstream' needs to be computed, and even if the record were available, the relationship between branches may have changed), at least hide the error to allow "status" show its output. * jt/interpret-branch-name-fallback: wt-status: tolerate dangling marks refs: move dwim_ref() to header file sha1-name: replace unsigned int with option struct	2020-09-09 13:53:09 -07:00
Junio C Hamano	9f7833fd55	Merge branch 'ss/submodule-summary-in-c-fixes' Fixups to a topic in 'next'. * ss/submodule-summary-in-c-fixes: t7421: eliminate 'grep' check in t7421.4 for mingw compatibility submodule: fix style in function definition submodule: eliminate unused parameters from print_submodule_summary()	2020-09-09 13:53:07 -07:00
Junio C Hamano	eb7460fd31	Merge branch 'es/worktree-repair' "git worktree" gained a "repair" subcommand to help users recover after moving the worktrees or repository manually without telling Git. Also, "git init --separate-git-dir" no longer corrupts administrative data related to linked worktrees. * es/worktree-repair: init: make --separate-git-dir work from within linked worktree init: teach --separate-git-dir to repair linked worktrees worktree: teach "repair" to fix outgoing links to worktrees worktree: teach "repair" to fix worktree back-links to main worktree worktree: add skeleton "repair" command	2020-09-09 13:53:07 -07:00
Junio C Hamano	1aadb47aad	Merge branch 'jk/worktree-check-clean-leakfix' Leakfix. * jk/worktree-check-clean-leakfix: worktree: fix leak in check_clean_worktree()	2020-09-09 13:53:07 -07:00
Junio C Hamano	a31677dde3	Merge branch 'tb/repack-clearing-midx' When a packfile is removed by "git repack", multi-pack-index gets cleared; the code was taught to do so less aggressively by first checking if the midx actually refers to a pack that no longer exists. * tb/repack-clearing-midx: midx: traverse the local MIDX first builtin/repack.c: invalidate MIDX only when necessary	2020-09-09 13:53:06 -07:00
Junio C Hamano	bbdba3d883	Merge branch 'ss/submodule-summary-in-c' Yet another subcommand of "git submodule" is getting rewritten in C. * ss/submodule-summary-in-c: submodule: port submodule subcommand 'summary' from shell to C t7421: introduce a test script for verifying 'summary' output submodule: rename helper functions to avoid ambiguity submodule: remove extra line feeds between callback struct and macro	2020-09-09 13:53:05 -07:00
Taylor Blau	ab14d0676c	commit-graph: pass a 'struct repository ' in more places In a future commit, some commit-graph internals will want access to 'r->settings', but we only have the 'struct object_directory ' corresponding to that repository. Add an additional parameter to pass the repository around in more places. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-09 12:51:48 -07:00
Matheus Tavares	378fe5fc3d	config: complain about --worktree outside of a git repo Running `git config --worktree` outside of a git repository hits a BUG() when trying to enumerate the worktrees. Let's catch this error earlier and die() with a friendlier message. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-09 12:47:47 -07:00
Jonathan Tan	f08cbf60fe	index-pack: make quantum of work smaller Currently, when index-pack resolves deltas, it does not split up delta trees into threads: each delta base root (an object that is not a REF_DELTA or OFS_DELTA) can go into its own thread, but all deltas on that root (direct or indirect) are processed in the same thread. This is a problem when a repository contains a large text file (thus, delta-able) that is modified many times - delta resolution time during fetching is dominated by processing the deltas corresponding to that text file. This patch contains a solution to that. When cloning using git -c core.deltabasecachelimit=1g clone \ https://fuchsia.googlesource.com/third_party/vulkan-cts on my laptop, clone time improved from 3m2s to 2m5s (using 3 threads, which is the default). The solution is to have a global work stack. This stack contains delta bases (objects, whether appearing directly in the packfile or generated by delta resolution, that themselves have delta children) that need to be processed; whenever a thread needs work, it peeks at the top of the stack and processes its next unprocessed child. If a thread finds the stack empty, it will look for more delta base roots to push on the stack instead. The main weakness of having a global work stack is that more time is spent in the mutex, but profiling has shown that most time is spent in the resolution of the deltas themselves, so this shouldn't be an issue in practice. In any case, experimentation (as described in the clone command above) shows that this patch is a net improvement. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-08 15:52:17 -07:00
Eric Sunshine	07a7f8debf	format-patch: use 'origin' as start of current-series-range when known When formatting a patch series over `origin..HEAD`, one would expect that range to be used as the current-series-range when computing a range-diff between the previous and current versions of a patch series. However, infer_range_diff_ranges() ignores `origin..HEAD` when --range-diff=<prev> specifies a single revision rather than a range, and instead unexpectedly computes the current-series-range based upon <prev>. Address this anomaly by unconditionally using `origin..HEAD` as the current-series-range regardless of <prev> as long as `origin` is known, and only fall back to basing current-series-range on <prev> when `origin` is not known. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-08 15:03:27 -07:00
Eric Sunshine	72a7239016	diff-lib: tighten show_interdiff()'s interface To compute and show an interdiff, show_interdiff() needs only the two OID's to compare and a diffopts, yet it expects callers to supply an entire rev_info. The demand for rev_info is not only overkill, but also places unnecessary burden on potential future callers which might not otherwise have a rev_info at hand. Address this by tightening its signature to require only the items it needs instead of a full rev_info. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-08 15:03:27 -07:00
Eric Sunshine	cdffbdc217	diff: move show_interdiff() from its own file to diff-lib show_interdiff() is a relatively small function and not likely to grow larger or more complicated. Rather than dedicating an entire source file to it, relocate it to diff-lib.c which houses other "take two things and compare them" functions meant to be re-used but not so low-level as to reside in the core diff implementation. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-08 15:03:26 -07:00
Junio C Hamano	2df2d81ddd	add -i: use the built-in version when feature.experimental is set We have had parallel implementations of "add -i/-p" since 2.25 and have been using them from various codepaths since 2.26 days, but never made the built-in version the default. We have found and fixed a handful of corner case bugs in the built-in version, and it may be a good time to start switching over the user base from the scripted version to the built-in version. Let's enable the built-in version for those who opt into the feature.experimental guinea-pig program to give wider exposure. Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-08 14:53:36 -07:00
Eric Sunshine	c670aa47df	worktree: teach `add` to recognize -d as shorthand for --detach Like `git switch` and `git checkout`, `git worktree add` can check out a branch or set up a detached HEAD. However, unlike those other commands, `git worktree add` does not understand -d as shorthand for --detach, which may confound users accustomed to using -d for this purpose. Address this shortcoming by teaching `add` to recognize -d for --detach, thus bringing it in line with the other commands. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-06 18:53:56 -07:00
René Scharfe	ccb181d0f0	fast-import: use write_pack_header() Call write_pack_header() to hash and write a pack header instead of open-coding this function. This gets rid of duplicate code and of the magic version number 2 -- which has been used here since `c90be46abd` (Changed fast-import's pack header creation to use pack.h, 2006-08-16) and in pack.h (again) since `29f049a0c2` (Revert "move pack creation to version 3", 2006-10-14). Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-06 13:40:37 -07:00
René Scharfe	1af8b8c0a5	refspec: add and use refspec_appendf() Add a function for building a refspec using printf-style formatting. It frees callers from managing their own buffer. Use it throughout the tree to shorten and simplify its callers. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-06 13:15:46 -07:00
René Scharfe	30035d9c66	push: release strbufs used for refspec formatting map_refspec() either returns the passed in ref string or a detached strbuf. This makes it hard for callers to release the possibly allocated memory, and set_refspecs() consequently leaks it. Let map_refspec() append any refspecs directly and release its own strbufs after use. Rename it to refspec_append_mapped() and don't return anything to reflect its increased responsibility. set_refspecs() also leaks its strbufs. Do the same here and directly call refspec_append() in each if branch instead of holding onto a detached strbuf, then dispose of the allocated memory after use. We need to add an else branch for the final call because all the other conditional branches already add their formatted refspec now. setup_push_upstream() and setup_push_current() forgot to release their strbufs as well; plug these leaks, too, while at it. None of these leaks were likely to impact users, because the number and sizes of refspecs are usually small and the allocations are only done once per program run. Clean them up nevertheless, as another step on the long road towards zero memory leaks. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-06 13:15:45 -07:00
Orgad Shaneh	7ea0c2f44d	fetch: do not look for submodule changes in unchanged refs When fetching recursively with submodules, for each ref in the superproject, we call check_for_new_submodule_commits() which collects all the objects that have to be checked for submodule changes on calculate_changed_submodule_paths(). On the first call, it also collects all the existing refs for excluding them from the scan. calculate_changed_submodule_paths() creates an argument array with all the collected new objects, followed by --not and all the old objects. This argv is passed to setup_revisions, which parses each argument, converts it back to an oid and resolves the object. The parsing itself also does redundant work, because it is treated like user input, while in fact it is a full oid. So it needlessly attempts to look it up as ref (checks if it has ^, ~ etc.), checks if it is a file name etc. For a repository with many refs, all of this is expensive. But if the fetch in the superproject did not update the ref (i.e. the objects that are required to exist in the submodule did not change), there is no need to include it in the list. Before commit `be76c212` (fetch: ensure submodule objects fetched, 2018-12-06), submodule reference changes were only detected for refs that were changed, but not for new refs. This commit covered also this case, but what it did was to just include every ref. This change should reduce the number of scanned refs by about half (except the case of a no-op fetch, which will not scan any ref), because all the existing refs will still be listed after --not. The regression was reported here: https://public-inbox.org/git/CAGHpTBKSUJzFSWc=uznSu2zB33qCSmKXM- iAjxRCpqNK5bnhRg@mail.gmail.com/ Signed-off-by: Orgad Shaneh <orgads@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-06 09:50:49 -07:00
Junio C Hamano	da6b99c39a	Merge branch 'hl/bisect-doc-clarify-bad-good-ordering' Doc update. * hl/bisect-doc-clarify-bad-good-ordering: bisect: swap command-line options in documentation	2020-09-03 12:37:06 -07:00
Junio C Hamano	b4100f366c	Merge branch 'jt/lazy-fetch' Updates to on-demand fetching code in lazily cloned repositories. * jt/lazy-fetch: fetch: no FETCH_HEAD display if --no-write-fetch-head fetch-pack: remove no_dependents code promisor-remote: lazy-fetch objects in subprocess fetch-pack: do not lazy-fetch during ref iteration fetch: only populate existing_refs if needed fetch: avoid reading submodule config until needed fetch: allow refspecs specified through stdin negotiator/noop: add noop fetch negotiator	2020-09-03 12:37:04 -07:00
Junio C Hamano	18aff08e04	Merge branch 'jc/undash-in-tree-git-callers' A handful of places in in-tree code still relied on being able to execute the git subcommands, especially built-ins, in "git-foo" form, which have been corrected. * jc/undash-in-tree-git-callers: credential-cache: use child_process.args cvsexportcommit: do not run git programs in dashed form transport-helper: do not run git-remote-ext etc. in dashed form	2020-09-03 12:37:03 -07:00
Junio C Hamano	afd49c39dd	Merge branch 'jk/slimmed-down' Trim an unused binary and turn a bunch of commands into built-in. * jk/slimmed-down: drop vcs-svn experiment make git-fast-import a builtin make git-bugreport a builtin make credential helpers builtins Makefile: drop builtins from MSVC pdb list	2020-09-03 12:37:02 -07:00
Junio C Hamano	9c31b19dd0	Merge branch 'pw/rebase-i-more-options' "git rebase -i" learns a bit more options. * pw/rebase-i-more-options: t3436: do not run git-merge-recursive in dashed form rebase: add --reset-author-date rebase -i: support --ignore-date rebase -i: support --committer-date-is-author-date am: stop exporting GIT_COMMITTER_DATE rebase -i: add --ignore-whitespace flag	2020-09-03 12:37:01 -07:00
Jonathan Tan	f24c30e0b6	wt-status: tolerate dangling marks When a user checks out the upstream branch of HEAD, the upstream branch not being a local branch, and then runs "git status", like this: git clone $URL client cd client git checkout @{u} git status no status is printed, but instead an error message: fatal: HEAD does not point to a branch (This error message when running "git branch" persists even after checking out other things - it only stops after checking out a branch.) This is because "git status" reads the reflog when determining the "HEAD detached" message, and thus attempts to DWIM "@{u}", but that doesn't work because HEAD no longer points to a branch. Therefore, when calculating the status of a worktree, tolerate dangling marks. This is done by adding an additional parameter to dwim_ref() and repo_dwim_ref(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-02 14:39:25 -07:00
Jonathan Tan	db3c293ecd	fetch: no FETCH_HEAD display if --no-write-fetch-head `887952b8c6` ("fetch: optionally allow disabling FETCH_HEAD update", 2020-08-18) introduced the ability to disable writing to FETCH_HEAD during fetch, but did not suppress the "<source> -> FETCH_HEAD" message when this ability is used. This message is misleading in this case, because FETCH_HEAD is not written. Also, because "fetch" is used to lazy-fetch missing objects in a partial clone, this significantly clutters up the output in that case since the objects to be fetched are potentially numerous. Therefore, suppress this message when --no-write-fetch-head is passed (but not when --dry-run is set). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-02 14:26:55 -07:00
Junio C Hamano	c57afd73ef	Merge branch 'rs/checkout-no-overlay-pathspec-fix' "git restore/checkout --no-overlay" with wildcarded pathspec mistakenly removed matching paths in subdirectories, which has been corrected. * rs/checkout-no-overlay-pathspec-fix: checkout, restore: make pathspec recursive	2020-08-31 15:49:50 -07:00
Junio C Hamano	cca424ba90	Merge branch 'jk/refspecs-cleanup' Preliminary code clean-up before introducing "negative refspec". * jk/refspecs-cleanup: refspec: make sure stack refspec_item variables are zeroed refspec: fix documentation referring to refspec_item	2020-08-31 15:49:48 -07:00
Junio C Hamano	e699684cf6	Merge branch 'hn/refs-pseudorefs' Accesses to two pseudorefs have been updated to properly use ref API. * hn/refs-pseudorefs: sequencer: treat REVERT_HEAD as a pseudo ref builtin/commit: suggest update-ref for pseudoref removal sequencer: treat CHERRY_PICK_HEAD as a pseudo ref refs: make refs_ref_exists public	2020-08-31 15:49:48 -07:00
Junio C Hamano	53015c9dd4	Merge branch 'jk/index-pack-w-more-threads' Long ago, we decided to use 3 threads by default when running the index-pack task in parallel, which has been adjusted a bit upwards. * jk/index-pack-w-more-threads: index-pack: adjust default threading cap p5302: count up to online-cpus for thread tests p5302: disable thread-count parameter tests by default	2020-08-31 15:49:48 -07:00
Eric Sunshine	59d876ccd6	init: make --separate-git-dir work from within linked worktree The intention of `git init --separate-work-dir=<path>` is to move the .git/ directory to a location outside of the main worktree. When used within a linked worktree, however, rather than moving the .git/ directory as intended, it instead incorrectly moves the worktree's .git/worktrees/<id> directory to <path>, thus disconnecting the linked worktree from its parent repository and breaking the worktree in the process since its local .git file no longer points at a location at which it can find the object database. Fix this broken behavior. An intentional side-effect of this change is that it also closes a loophole not caught by `ccf236a23a` (init: disallow --separate-git-dir with bare repository, 2020-08-09) in which the check to prevent --separate-git-dir being used in conjunction with a bare repository was unable to detect the invalid combination when invoked from within a linked worktree. Therefore, add a test to verify that this loophole is closed, as well. Reported-by: Henré Botha <henrebotha@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-31 11:47:45 -07:00
Eric Sunshine	42264bc841	init: teach --separate-git-dir to repair linked worktrees A linked worktree's .git file is a "gitfile" pointing at the .git/worktrees/<id> directory within the repository. When `git init --separate-git-dir=<path>` is used on an existing repository to relocate the repository's .git/ directory to a different location, it neglects to update the .git files of linked worktrees, thus breaking the worktrees by making it impossible for them to locate the repository. Fix this by teaching --separate-git-dir to repair the .git file of each linked worktree to point at the new repository location. Reported-by: Henré Botha <henrebotha@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-31 11:47:45 -07:00
Eric Sunshine	b214ab5aa5	worktree: teach "repair" to fix outgoing links to worktrees The .git/worktrees/<id>/gitdir file points at the location of a linked worktree's .git file. Its content must be of the form /path/to/worktree/.git (from which the location of the worktree itself can be derived by stripping the "/.git" suffix). If the gitdir file is deleted or becomes corrupted or outdated, then Git will be unable to find the linked worktree. An easy way for the gitdir file to become outdated is for the user to move the worktree manually (without using "git worktree move"). Although it is possible to manually update the gitdir file to reflect the new linked worktree location, doing so requires a level of knowledge about worktree internals beyond what a user should be expected to know offhand. Therefore, teach "git worktree repair" how to repair broken or outdated .git/worktrees/<id>/gitdir files automatically. (For this to work, the command must either be invoked from within the worktree whose gitdir file requires repair, or from within the main or any linked worktree by providing the path of the broken worktree as an argument to "git worktree repair".) Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-31 11:47:45 -07:00
Eric Sunshine	bdd1f3e4da	worktree: teach "repair" to fix worktree back-links to main worktree The .git file in a linked worktree is a "gitfile" which points back to the .git/worktrees/<id> entry in the main worktree or bare repository. If a worktree's .git file is deleted or becomes corrupted or outdated, then the linked worktree won't know how to find the repository or any of its own administrative files (such as 'index', 'HEAD', etc.). An easy way for the .git file to become outdated is for the user to move the main worktree or bare repository. Although it is possible to manually update each linked worktree's .git file to reflect the new repository location, doing so requires a level of knowledge about worktree internals beyond what a user should be expected to know offhand. Therefore, teach "git worktree repair" how to repair broken or outdated worktree .git files automatically. (For this to work, the command must be invoked from within the main worktree or bare repository, or from within a worktree which has not become disconnected from the repository -- such as one which was created after the repository was moved.) Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-31 11:47:45 -07:00
Miriam Rubio	7b4de74b5d	bisect--helper: introduce new `write_in_file()` function Let's refactor code adding a new `write_in_file()` function that opens a file for writing a message and closes it and a wrapper for writing mode. This helper will be used in later steps and makes the code simpler and easier to understand. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-28 16:21:16 -07:00
Miriam Rubio	30276765c1	bisect--helper: use '-res' in 'cmd_bisect__helper' return Following 'enum bisect_error' vocabulary, return variable 'res' is always non-positive. Let's use '-res' instead of 'abs(res)' to make the code clearer. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-28 16:21:16 -07:00
Miriam Rubio	ef5aef5ee0	bisect--helper: BUG() in cmd_*() on invalid subcommand In cmd_bisect__helper() function, if an invalid or no subcommand is passed there is a BUG. BUG() out instead of returning an error. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Miriam Rubio <mirucam@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-28 16:21:15 -07:00
Taylor Blau	59552fb3e2	midx: traverse the local MIDX first When a repository has an alternate object directory configured, callers can traverse through each alternate's MIDX by walking the '->next' pointer. But, when 'prepare_multi_pack_index_one()' loads multiple MIDXs, it places the new ones at the front of this pointer chain, not at the end. This can be confusing for callers such as 'git repack -ad', causing test failures like in t7700.6 with 'GIT_TEST_MULTI_PACK_INDEX=1'. The occurs when dropping a pack known to the local MIDX with alternates configured that have their own MIDX. Since the alternate's MIDX is returned via 'get_multi_pack_index()', 'midx_contains_pack()' returns true (which is correct, since it traverses through the '->next' pointer to find the MIDX in the chain that does contain the requested object). But, we call 'clear_midx_file()' on 'the_repository', which drops the MIDX at the path of the first MIDX in the chain, which (in the case of t7700.6 is the one in the alternate). This patch addresses that by: - placing the local MIDX first in the chain when calling 'prepare_multi_pack_index_one()', and - introducing a new 'get_local_multi_pack_index()', which explicitly returns the repository-local MIDX, if any. Don't impose an additional order on the MIDX's '->next' pointer beyond that the first item in the chain must be local if one exists so that we avoid a quadratic insertion. Likewise, use 'get_local_multi_pack_index()' in 'remove_redundant_pack()' to fix the formerly broken t7700.6 when run with 'GIT_TEST_MULTI_PACK_INDEX=1'. Finally, note that the MIDX ordering invariant is only preserved by the insertion order in 'prepare_packed_git()', which traverses through the ODB's '->next' pointer, meaning we visit the local object store first. This fragility makes this an undesirable long-term solution if more callers are added, but it is acceptable for now since this is the only caller. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-28 14:07:09 -07:00
Hugo Locurcio	ef4d9f8a32	bisect: swap command-line options in documentation The positional arguments are specified in this order: "bad" then "good". To avoid confusion, the options above the positional arguments are now specified in the same order. They can still be specified in any order since they're options, not positional arguments. Signed-off-by: Hugo Locurcio <hugo.locurcio@hugo.pro> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-28 14:06:06 -07:00
Junio C Hamano	0d9a8e33f9	Merge branch 'jk/leakfix' Code clean-up. * jk/leakfix: submodule--helper: fix leak of core.worktree value config: fix leak in git_config_get_expiry_in_days() config: drop git_config_get_string_const() config: fix leaks from git_config_get_string_const() checkout: fix leak of non-existent branch names submodule--helper: use strbuf_release() to free strbufs clear_pattern_list(): clear embedded hashmaps	2020-08-27 14:04:49 -07:00
Jiang Xin	31e8595a11	receive-pack: new config receive.procReceiveRefs Add a new multi-valued config variable "receive.procReceiveRefs" for `receive-pack` command, like the follows: git config --system --add receive.procReceiveRefs refs/for git config --system --add receive.procReceiveRefs refs/drafts If the specific prefix strings given by the config variables match the reference names of the commands which are sent from git client to `receive-pack`, these commands will be executed by an external hook (named "proc-receive"), instead of the internal `execute_commands` function. For example, if it is set to "refs/for", pushing to a reference such as "refs/for/master" will not create or update reference "refs/for/master", but may create or update a pull request directly by running the hook "proc-receive". Optional modifiers can be provided in the beginning of the value to filter commands for specific actions: create (a), modify (m), delete (d). A `!` can be included in the modifiers to negate the reference prefix entry. E.g.: git config --system --add receive.procReceiveRefs ad:refs/heads git config --system --add receive.procReceiveRefs !:refs/heads Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-27 12:47:47 -07:00
Jiang Xin	63518a574a	New capability "report-status-v2" for git-push The new introduced "proc-receive" hook may handle a command for a pseudo-reference with a zero-old as its old-oid, while the hook may create or update a reference with different name, different new-oid, and different old-oid (the reference may exist already with a non-zero old-oid). Current "report-status" protocol cannot report the status for such reference rewrite. Add new capability "report-status-v2" and new report protocol which is not backward compatible for report of git-push. If a user pushes to a pseudo-reference "refs/for/master/topic", and "receive-pack" creates two new references "refs/changes/23/123/1" and "refs/changes/24/124/1", for client without the knowledge of "report-status-v2", "receive-pack" will only send "ok/ng" directives in the report, such as: ok ref/for/master/topic But for client which has the knowledge of "report-status-v2", "receive-pack" will use "option" directives to report more attributes for the reference given by the above "ok/ng" directive. ok refs/for/master/topic option refname refs/changes/23/123/1 option new-oid <new-oid> ok refs/for/master/topic option refname refs/changes/24/124/1 option new-oid <new-oid> The client will report two new created references to the end user. Suggested-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-27 12:47:47 -07:00
Jiang Xin	195d6eaea3	receive-pack: feed report options to post-receive When commands are fed to the "post-receive" hook, report options will be parsed and the real old-oid, new-oid, reference name will feed to the "post-receive" hook. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-27 12:47:47 -07:00
Jiang Xin	15d3af5e22	receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-27 12:47:47 -07:00
Shourya Shukla	d79b145569	t7421: eliminate 'grep' check in t7421.4 for mingw compatibility The 'grep' check in test 4 of t7421 resulted in the failure of t7421 on Windows due to a different error message error: cannot spawn git: No such file or directory instead of fatal: exec 'rev-parse': cd to 'my-subm' failed: No such file or directory Tighten up the check to compute 'src_abbrev' by guarding the 'verify_submodule_committish()' call using `p->status !='D'`, so that the former isn't called in case of non-existent submodule directory, consequently, there is no such error message on any execution environment. The same need not be implemented for 'dst_abbrev' and is rather redundant since the conditional 'if (S_ISGITLINK(p->mod_dst))' already guards the 'verify_submodule_committish()' when we have a status of 'D'. Therefore, eliminate the 'grep' check in t7421. Instead, verify the absence of an error message by doing a 'test_must_be_empty' on the file containing the error. Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-27 11:47:10 -07:00
Eric Sunshine	e8e1ff24c5	worktree: add skeleton "repair" command Worktree administrative files can become corrupted or outdated due to external factors. Although, it is often possible to recover from such situations by hand-tweaking these files, doing so requires intimate knowledge of worktree internals. While information necessary to make such repairs manually can be obtained from git-worktree.txt and gitrepository-layout.txt, we can assist users more directly by teaching git-worktree how to repair its administrative files itself (at least to some extent). Therefore, add a "git worktree repair" command which attempts to correct common problems which may arise due to factors beyond Git's control. At this stage, the "repair" command is a mere skeleton; subsequent commits will flesh out the functionality. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-27 08:59:13 -07:00
Jeff King	27ed6ccc12	worktree: fix leak in check_clean_worktree() We allocate a child_env strvec but never free its memory. Instead, let's just use the strvec that our child_process struct provides, which is cleaned up automatically when we run the command. And while we're moving the initialization of the child_process around, let's switch it to use the official init function (zero-initializing it works OK, since strvec is happy enough with that, but it sets a bad example). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-27 08:30:17 -07:00
Taylor Blau	e08f7bb093	builtin/repack.c: invalidate MIDX only when necessary In `525e18c04b` (midx: clear midx on repack, 2018-07-12), 'git repack' learned to remove a multi-pack-index file if it added or removed a pack from the object store. This mechanism is a little over-eager, since it is only necessary to drop a MIDX if 'git repack' removes a pack that the MIDX references. Adding a pack outside of the MIDX does not require invalidating the MIDX, and likewise for removing a pack the MIDX does not know about. Teach 'git repack' to check for this by loading the MIDX, and checking whether the to-be-removed pack is known to the MIDX. This requires a slightly odd alternation to a test in t5319, which is explained with a comment. A new test is added to show that the MIDX is left alone when both packs known to it are marked as .keep, but two packs unknown to it are removed and combined into one new pack. Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-26 13:55:46 -07:00
Shourya Shukla	f0c6b6467d	submodule: fix style in function definition The definitions of 'verify_submodule_committish()' and 'print_submodule_summary()' had wrong styling in terms of the asterisk placement. Amend them. Also, the warning printed in case of an unexpected file mode printed the mode in decimal. Print it in octal for enhanced readability. Reported-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-25 13:43:21 -07:00
Shourya Shukla	e0f7ae564e	submodule: eliminate unused parameters from print_submodule_summary() Eliminate the parameters 'missing_{src,dst}' from the 'print_submodule_summary()' function call since they are not used anywhere in the function. Reported-by: Jeff King <peff@peff.net> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-25 13:43:10 -07:00
Junio C Hamano	ad00f44f54	Merge branch 'en/dir-clear' Leakfix with code clean-up. * en/dir-clear: dir: fix problematic API to avoid memory leaks dir: make clear_directory() free all relevant memory	2020-08-24 14:54:34 -07:00
Junio C Hamano	b556050733	Merge branch 'jc/no-update-fetch-head' "git fetch" learned --no-write-fetch-head option to avoid writing the FETCH_HEAD file. * jc/no-update-fetch-head: fetch: optionally allow disabling FETCH_HEAD update	2020-08-24 14:54:31 -07:00
Junio C Hamano	ff20794402	Merge branch 'jk/unleak-fixes' Fix some incorrect UNLEAK() annotations. * jk/unleak-fixes: ls-remote: simplify UNLEAK() usage stop calling UNLEAK() before die()	2020-08-24 14:54:30 -07:00
Junio C Hamano	a654836d96	Merge branch 'es/init-no-separate-git-dir-in-bare' The purpose of "git init --separate-git-dir" is to initialize a new project with the repository separate from the working tree, or, in the case of an existing project, to move the repository (the .git/ directory) out of the working tree. It does not make sense to use --separate-git-dir with a bare repository for which there is no working tree, so disallow its use with bare repositories. * es/init-no-separate-git-dir-in-bare: init: disallow --separate-git-dir with bare repository	2020-08-24 14:54:28 -07:00
Jonathan Tan	ee6f058384	index-pack: make resolve_delta() assume base data A subsequent commit will make the quantum of work smaller, necessitating more locking. This commit allows resolve_delta() to be called outside the lock. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-24 14:14:52 -07:00
Jonathan Tan	b4718cae51	index-pack: calculate {ref,ofs}_{first,last} early This is refactoring 2 of 2 to simplify struct base_data. Whenever we make a struct base_data, immediately calculate its delta children. This eliminates confusion as to when the {ref,ofs}_{first,last} fields are initialized. Before this patch, the delta children were calculated at the last possible moment. This allowed the members of struct base_data to be populated in any order, superficially useful when we have the object contents before the struct object_entry. But this makes reasoning about the state of struct base_data more complicated, hence this patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-24 14:12:58 -07:00
Jonathan Tan	a7f7e84a49	index-pack: remove redundant child field This is refactoring 1 of 2 to simplify struct base_data. In index-pack, each thread maintains a doubly-linked list of the delta chain that it is currently processing (the "base" and "child" pointers in struct base_data). When a thread exceeds the delta base cache limit and needs to reclaim memory, it uses the "child" pointers to traverse the lineage, reclaiming the memory of the eldest delta bases first. A subsequent patch will perform memory reclaiming in a different way and will thus no longer need the "child" pointer. Because the "child" pointer is redundant even now, remove it so that the aforementioned subsequent patch will be clearer. In the meantime, reclaim memory in the reverse order of the "base" pointers. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-24 14:11:14 -07:00
Jonathan Tan	46e6fb1e44	index-pack: unify threaded and unthreaded code Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-24 14:02:31 -07:00
Jonathan Tan	fc968e26c2	index-pack: remove redundant parameter find_{ref,ofs}_delta_{,children} take an enum object_type parameter, but the object type is already present in the name of the function. Remove that parameter from these functions. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-24 13:55:57 -07:00
René Scharfe	bfda204ade	checkout, restore: make pathspec recursive The pathspec given to git checkout and git restore is used with both tree_entry_interesting (via read_tree_recursive) and match_pathspec (via ce_path_match). The latter effectively only supports recursive matching regardless of the value of the pathspec flag "recursive", which is unset here. That causes different match results for pathspecs with wildcards, and can lead checkout and restore in no-overlay mode to remove entries instead of modifying them. Enable recursive matching for both checkout and restore to make matching consistent. Setting the flag in checkout_main() technically also affects git switch, but since that command doesn't accept pathspecs at all this has no actual consequence. Reported-by: Sergii Shkarnikov <sergii.shkarnikov@globallogic.com> Initial-test-by: Sergii Shkarnikov <sergii.shkarnikov@globallogic.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-22 13:37:43 -07:00
Jeff King	fbff95b67f	index-pack: adjust default threading cap Commit `b8a2486f15` (index-pack: support multithreaded delta resolving, 2012-05-06) describes an experiment that shows that setting the number of threads for index-pack higher than 3 does not help. I repeated that experiment using a more modern version of Git and a more modern CPU and got different results. Here are timings for p5302 against linux.git run on my laptop, a Core i9-9880H with 8 cores plus hyperthreading (so online-cpus returns 16): 5302.3: index-pack 0 threads 256.28(253.41+2.79) 5302.4: index-pack 1 threads 257.03(254.03+2.91) 5302.5: index-pack 2 threads 149.39(268.34+3.06) 5302.6: index-pack 4 threads 94.96(294.10+3.23) 5302.7: index-pack 8 threads 68.12(339.26+3.89) 5302.8: index-pack 16 threads 70.90(655.03+7.21) 5302.9: index-pack default number of threads 116.91(290.05+3.21) You can see that wall-clock times continue to improve dramatically up to the number of cores, but bumping beyond that (into hyperthreading territory) does not help (and in fact hurts a little). Here's the same experiment on a machine with dual Xeon 6230's, totaling 40 cores (80 with hyperthreading): 5302.3: index-pack 0 threads 310.04(302.73+6.90) 5302.4: index-pack 1 threads 310.55(302.68+7.40) 5302.5: index-pack 2 threads 178.17(304.89+8.20) 5302.6: index-pack 5 threads 99.53(315.54+9.56) 5302.7: index-pack 10 threads 72.80(327.37+12.79) 5302.8: index-pack 20 threads 60.68(357.74+21.66) 5302.9: index-pack 40 threads 58.07(454.44+67.96) 5302.10: index-pack 80 threads 59.81(720.45+334.52) 5302.11: index-pack default number of threads 134.18(309.32+7.98) The results are similar; things stop improving at 40 threads. Curiously, going from 20 to 40 really doesn't help much, either (and increases CPU time considerably). So that may represent an actual barrier to parallelism, where we lose out due to context-switching and loss of cache locality, but don't reap the wall-clock benefits due to contention of our coarse-grained locks. So what's a good default value? It's clear that the current cap of 3 is too low; our default values are 42% and 57% slower than the best times on each machine. The results on the 40-core machine imply that 20 threads is an actual barrier regardless of the number of cores, so we'll take that as a maximum. We get the best results on these machines at half of the online-cpus value. That's presumably a result of the hyperthreading. That's common on multi-core Intel processors, but not necessarily elsewhere. But if we take it as an assumption, we can perform optimally on hyperthreaded machines and still do much better than the status quo on other machines, as long as we never half below the current value of 3. So that's what this patch does. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 12:02:36 -07:00
Han-Wen Nienhuys	b6d2558c9e	builtin/commit: suggest update-ref for pseudoref removal When pseudorefs move to a different ref storage mechanism, pseudorefs no longer can be removed with 'rm'. Instead, suggest a "update-ref -d" command, which will work regardless of ref storage backend. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 11:20:10 -07:00
Han-Wen Nienhuys	c8e4159efd	sequencer: treat CHERRY_PICK_HEAD as a pseudo ref Check for existence and delete CHERRY_PICK_HEAD through ref functions. This will help cherry-pick work with alternate ref storage backends. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 11:20:10 -07:00
Junio C Hamano	2a978f8273	Merge branch 'jc/object-names-are-not-sha-1' A few end-user facing messages have been updated to be hash-algorithm agnostic. * jc/object-names-are-not-sha-1: messages: avoid SHA-1 in end-user facing messages	2020-08-19 16:14:52 -07:00
Rohit Ashiwal	27126692ba	rebase: add --reset-author-date The previous commit introduced --ignore-date flag to rebase -i, but the name is rather vague as it does not say whether the author date or the committer date is ignored. Add an alias to convey the precise purpose. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-19 15:22:56 -07:00
Phillip Wood	a3894aad67	rebase -i: support --ignore-date Rebase is implemented with two different backends - 'apply' and 'merge' each of which support a different set of options. In particular the apply backend supports a number of options implemented by 'git am' that are not implemented in the merge backend. This means that the available options are different depending on which backend is used which is confusing. This patch adds support for the --ignore-date option to the merge backend. This option uses the current time as the author date rather than reusing the original author date when rewriting commits. We take care to handle the combination of --ignore-date and --committer-date-is-author-date in the same way as the apply backend. Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-19 15:19:59 -07:00
Elijah Newren	eceba53214	dir: fix problematic API to avoid memory leaks The dir structure seemed to have a number of leaks and problems around it. First I noticed that parent_hashmap and recursive_hashmap were being leaked (though Peff noticed and submitted fixes before me). Then I noticed in the previous commit that clear_directory() was only taking responsibility for a subset of fields within dir_struct, despite the fact that entries[] and ignored[] we allocated internally to dir.c. That, of course, resulted in many callers either leaking or haphazardly trying to free these arrays and their contents. Digging further, I found that despite the pretty clear documentation near the top of dir.h that folks were supposed to call clear_directory() when the user no longer needed the dir_struct, there were four callers that didn't bother doing that at all. However, two of them clearly thought about leaks since they had an UNLEAK(dir) directive, which to me suggests that the method to free the data was too unclear. I suspect the non-obviousness of the API and its holes led folks to avoid it, which then snowballed into further problems with the entries[], ignored[], parent_hashmap, and recursive_hashmap problems. Rename clear_directory() to dir_clear() to be more in line with other data structures in git, and introduce a dir_init() to handle the suggested memsetting of dir_struct to all zeroes. I hope that a name like "dir_clear()" is more clear, and that the presence of dir_init() will provide a hint to those looking at the code that they need to look for either a dir_clear() or a dir_free() and lead them to find dir_clear(). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-18 17:17:31 -07:00
Elijah Newren	dad4f23ce5	dir: make clear_directory() free all relevant memory The calling convention for the dir API is supposed to end with a call to clear_directory() to free up no longer needed memory. However, clear_directory() didn't free dir->entries or dir->ignored. I believe this was an oversight, but a number of callers noticed memory leaks and started free'ing these. Unfortunately, they did so somewhat haphazardly (sometimes freeing the entries in the arrays, and sometimes only free'ing the arrays themselves). This suggests the callers weren't trying to make sure any possible memory used might be free'd, but just the memory they noticed their usecase definitely had allocated. Fix this mess by moving all the duplicated free'ing logic into clear_directory(). End by resetting dir to a pristine state so it could be reused if desired. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-18 17:17:29 -07:00
Jonathan Tan	9dfa8dbeee	fetch-pack: remove no_dependents code Now that Git has switched to using a subprocess to lazy-fetch missing objects, remove the no_dependents code as it is no longer used. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-18 16:46:53 -07:00
Jonathan Tan	abcb7eeb31	fetch: only populate existing_refs if needed In "fetch", get_ref_map() iterates over all refs to populate "existing_refs" in order to populate peer_ref->old_oid in the returned refmap, even if the refmap has no peer_ref set - which is the case when only literal hashes (i.e. no refs by name) are fetched. Iterating over refs causes the targets of those refs to be checked for existence. Avoiding this is especially important when we use "git fetch" to perform lazy fetches in a partial clone because a target of such a ref may need to be itself lazy-fetched (and otherwise causing an infinite loop). Therefore, avoid populating "existing_refs" until necessary. With this patch, because Git lazy-fetches objects by literal hashes (to be done in a subsequent commit), it will then be able to guarantee avoiding reading targets of refs. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-18 13:25:05 -07:00
Jonathan Tan	e5b942136e	fetch: avoid reading submodule config until needed In "fetch", there are two parameters submodule_fetch_jobs_config and recurse_submodules that can be set in a variety of ways: through .gitmodules, through .git/config, and through the command line. Currently "fetch" handles this by first reading .gitmodules, then reading .git/config (allowing it to overwrite existing values), then reading the command line (allowing it to overwrite existing values). Notice that we can avoid reading .gitmodules if .git/config and/or the command line already provides us with what we need. In addition, if recurse_submodules is found to be "no", we do not need the value of submodule_fetch_jobs_config. Avoiding reading .gitmodules is especially important when we use "git fetch" to perform lazy fetches in a partial clone because the .gitmodules file itself might need to be lazy fetched (and otherwise causing an infinite loop). In light of all this, avoid reading .gitmodules until necessary. When reading it, we may only need one of the two parameters it provides, so teach fetch_config_from_gitmodules() to support NULL arguments. With this patch, users (including Git itself when invoking "git fetch" to lazy-fetch) will be able to guarantee avoiding reading .gitmodules by passing --recurse-submodules=no. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-18 13:25:05 -07:00
Jonathan Tan	2b713c272c	fetch: allow refspecs specified through stdin In a subsequent patch, partial clones will be taught to fetch missing objects using a "git fetch" subprocess. Because the number of objects fetched may be too numerous to fit on the command line, teach "fetch" to accept refspecs passed through stdin. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-18 13:25:05 -07:00
Junio C Hamano	887952b8c6	fetch: optionally allow disabling FETCH_HEAD update If you run fetch but record the result in remote-tracking branches, and either if you do nothing with the fetched refs (e.g. you are merely mirroring) or if you always work from the remote-tracking refs (e.g. you fetch and then merge origin/branchname separately), you can get away with having no FETCH_HEAD at all. Teach "git fetch" a command line option "--[no-]write-fetch-head". The default is to write FETCH_HEAD, and the option is primarily meant to be used with the "--no-" prefix to override this default, because there is no matching fetch.writeFetchHEAD configuration variable to flip the default to off (in which case, the positive form may become necessary to defeat it). Note that under "--dry-run" mode, FETCH_HEAD is never written; otherwise you'd see list of objects in the file that you do not actually have. Passing `--write-fetch-head` does not force `git fetch` to write the file. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-18 12:56:57 -07:00
Junio C Hamano	eca8c62a50	Merge branch 'jk/log-fp-implies-m' "git log --first-parent -p" showed patches only for single-parent commits on the first-parent chain; the "--first-parent" option has been made to imply "-m". Use "--no-diff-merges" to restore the previous behaviour to omit patches for merge commits. * jk/log-fp-implies-m: doc/git-log: clarify handling of merge commit diffs doc/git-log: move "-t" into diff-options list doc/git-log: drop "-r" diff option doc/git-log: move "Diff Formatting" from rev-list-options log: enable "-m" automatically with "--first-parent" revision: add "--no-diff-merges" option to counteract "-m" log: drop "--cc implies -m" logic	2020-08-17 17:02:49 -07:00
Junio C Hamano	47f0f94bc7	Merge branch 'al/bisect-first-parent' "git bisect" learns the "--first-parent" option to find the first breakage along the first-parent chain. * al/bisect-first-parent: bisect: combine args passed to find_bisection() bisect: introduce first-parent flag cmd_bisect__helper: defer parsing no-checkout flag rev-list: allow bisect and first-parent flags t6030: modernize "git bisect run" tests	2020-08-17 17:02:45 -07:00
Jeff King	55fe225dde	submodule--helper: fix leak of core.worktree value In the ensure_core_worktree() function, we load the core.worktree value of the submodule repository using repo_config_get_string(). This function copies the string, but we never free it, leaking the memory. We can instead use the "tmp" version of that function to avoid the allocation at all. We don't have to worry about lifetime issues, since we never even look at the value (we just want to know if it's set). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-17 15:35:47 -07:00
Phillip Wood	7573cec52c	rebase -i: support --committer-date-is-author-date Rebase is implemented with two different backends - 'apply' and 'merge' each of which support a different set of options. In particular the apply backend supports a number of options implemented by 'git am' that are not implemented in the merge backend. This means that the available options are different depending on which backend is used which is confusing. This patch adds support for the --committer-date-is-author-date option to the merge backend. This option uses the author date of the commit that is being rewritten as the committer date when the new commit is created. Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-17 11:58:37 -07:00
Phillip Wood	e8cbe2118a	am: stop exporting GIT_COMMITTER_DATE The implementation of --committer-date-is-author-date exports GIT_COMMITTER_DATE to override the default committer date but does not reset GIT_COMMITTER_DATE in the environment after creating the commit so it is set in the environment of any hooks that get run. We're about to add the same functionality to the sequencer and do not want to have GIT_COMMITTER_DATE set when running hooks or exec commands so lets update commit_tree_extended() to take an explicit committer so we override the default date without setting GIT_COMMITTER_DATE in the environment. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-17 11:58:37 -07:00
Jacob Keller	95e7c38539	refspec: make sure stack refspec_item variables are zeroed A couple of functions that used struct refspec_item did not zero out the structure memory. This can result in unexpected behavior, especially if additional parameters are ever added to refspec_item in the future. Use memset to ensure that unset structure members are zero. It may make sense to convert most of these uses of struct refspec_item to use either struct initializers or refspec_item_init_or_die. However, other similar code uses memset. Converting all of these uses has been left as a future exercise. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-17 10:39:21 -07:00
Jeff King	f1de981e8b	config: fix leaks from git_config_get_string_const() There are two functions to get a single config string: - git_config_get_string() - git_config_get_string_const() One might naively think that the first one allocates a new string and the second one just points us to the internal configset storage. But in fact they both allocate a new copy; the second one exists only to avoid having to cast when using it with a const global which we never intend to free. The documentation for the function explains that clearly, but it seems I'm not alone in being surprised by this. Of 17 calls to the function, 13 of them leak the resulting value. We could obviously fix these by adding the appropriate free(). But it would be simpler still if we actually had a non-allocating way to get the string. There's git_config_get_value() but that doesn't quite do what we want. If the config key is present but is a boolean with no value (e.g., "[foo]bar" in the file), then we'll get NULL (whereas the string versions will print an error and die). So let's introduce a new variant, git_config_get_string_tmp(), that behaves as these callers expect. We need a new name because we have new semantics but the same function signature (so even if we converted the four remaining callers, topics in flight might be surprised). The "tmp" is because this value should only be held onto for a short time. In practice it's rare for us to clear and refresh the configset, invalidating the pointer, but hopefully the "tmp" makes callers think about the lifetime. In each of the converted cases here the value only needs to last within the local function or its immediate caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-14 10:52:04 -07:00
Jeff King	c514c62a4f	checkout: fix leak of non-existent branch names We unconditionally write a branch name into a newly allocated buffer in new_branch_info->path, via setup_branch_path(). We then check to see if the branch exists; if not, we set that field to NULL, leaking the memory. We should take care to free() it when doing so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-14 10:52:04 -07:00
Jeff King	9101c8ea2d	submodule--helper: use strbuf_release() to free strbufs The prepare_to_clone_next_submodule() function has a few local-variable strbufs. We use strbuf_reset() throughout the function to reuse the buffers over and over. But at the end of the function we also use strbuf_reset() as they go out of scope, which means we end up leaking their heap buffers. This should be strbuf_release() instead. These were introduced by `48308681b0` (git submodule update: have a dedicated helper for cloning, 2016-02-29), but it doesn't seem to have the same mistake elsewhere. Likewise, I looked for other instances of the pattern in the submodule--helper file but couldn't find any. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-14 10:52:04 -07:00
Junio C Hamano	4279000d3e	messages: avoid SHA-1 in end-user facing messages There are still a handful mentions of SHA-1 when we meant the (hexadecimal) object names in end-user facing messages. Rewrite them. I was hoping that this can mostly be s/SHA-1/object name/, but a few messages needed rephrasing to keep the result readable. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-14 09:33:37 -07:00
Junio C Hamano	d1a8a8979d	Merge branch 'jt/has_object' A new helper function has_object() has been introduced to make it easier to mark object existence checks that do and don't want to trigger lazy fetches, and a few such checks are converted using it. * jt/has_object: fsck: do not lazy fetch known non-promisor object pack-objects: no fetch when allow-{any,promisor} apply: do not lazy fetch when applying binary sha1-file: introduce no-lazy-fetch has_object()	2020-08-13 14:13:39 -07:00
Jeff King	3e19816dc0	ls-remote: simplify UNLEAK() usage We UNLEAK() the "sorting" list created by parsing command-line options (which is essentially used until the program exits). But we do so right before leaving the cmd_ls_remote() function, which means we have to hit all of the exits. But the point of UNLEAK() is that it's an annotation which doesn't impact the variable itself. We can mark it as soon as we're done writing its value, and then we only have to do so once. This gives us a minor code reduction, and serves as a better example of how UNLEAK() can be used. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-13 11:05:26 -07:00
Jeff King	a006f875e2	make git-fast-import a builtin There's no reason that git-fast-import benefits from being a separate binary. And as it links against libgit.a, it has a non-trivial disk footprint. Let's make it a builtin, which reduces the size of a stripped installation from 22MB to 21MB. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-13 11:02:13 -07:00
Jeff King	d7a5649c82	make git-bugreport a builtin There's no reason that bugreport has to be a separate binary. And since it links against libgit.a, it has a rather large disk footprint. Let's make it a builtin, which reduces the size of a stripped installation from 24MB to 22MB. This also simplifies our Makefile a bit. And we can take advantage of builtin niceties like RUN_SETUP_GENTLY. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-13 11:02:12 -07:00
Jeff King	b5dd96b70a	make credential helpers builtins There's no real reason for credential helpers to be separate binaries. I did them this way originally under the notion that helper don't _need_ to be part of Git, and so can be built totally separately (and indeed, the ones in contrib/credential are). But the ones in our main Makefile build on libgit.a, and the resulting binaries are reasonably large. We can slim down our total disk footprint by just making them builtins. This reduces the size of: make strip install from 29MB to 24MB on my Debian system. Note that credential-cache can't operate without support for Unix sockets. Currently we just don't build it at all when NO_UNIX_SOCKETS is set. We could continue that with conditionals in the Makefile and our list of builtins. But instead, let's build a dummy implementation that dies with an informative message. That has two advantages: - it's simpler, because the conditional bits are all kept inside the credential-cache source - a user who is expecting it to exist will be told _why_ they can't use it, rather than getting the "credential-cache is not a git command" error which makes it look like the Git install is broken. Note that our dummy implementation does still respond to "-h" in order to appease t0012 (and this may be a little friendlier for users, as well). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-13 11:02:08 -07:00
Prathamesh Chavan	e83e3333b5	submodule: port submodule subcommand 'summary' from shell to C Convert submodule subcommand 'summary' to a builtin and call it via 'git-submodule.sh'. The shell version had to call $diff_cmd twice, once to find the modified modules cared by the user and then again, with that list of modules to do various operations for computing the summary of those modules. On the other hand, the C version does not need a second call to $diff_cmd since it reuses the module list from the first call to do the aforementioned tasks. In the C version, we use the combination of setting a child process' working directory to the submodule path and then calling 'prepare_submodule_repo_env()' which also sets the 'GIT_DIR' to '.git', so that we can be certain that those spawned processes will not access the superproject's ODB by mistake. A behavioural difference between the C and the shell version is that the shell version outputs two line feeds after the 'git log' output when run outside of the tests while the C version outputs one line feed in any case. The reason for this is that the shell version calls log with '--pretty=format:<fmt>' whose output is followed by two echo calls; 'format' does not have "terminator" semantics like its 'tformat' counterpart. So, the log output is terminated by a newline only when invoked by the user and not when invoked from the scripts. This results in the one & two line feed differences in the shell version. On the other hand, the C version calls log with '--pretty=<fmt>' which is equivalent to '--pretty:tformat:<fmt>' which is then followed by a 'printf("\n")'. Due to its "terminator" semantics the log output is always terminated by newline and hence one line feed in any case. Also, when we try to pass an option-like argument after a non-option argument, for instance: git submodule summary HEAD --foo-bar (or) git submodule summary HEAD --cached That argument would be treated like a path to the submodule for which the user is requesting a summary. So, the option ends up having no effect. Though, passing '--quiet' is an exception to this: git submodule summary HEAD --quiet While 'summary' doesn't support '--quiet', we don't get an output for the above command as '--quiet' is treated as a path which means we get an output only if a submodule whose path is '--quiet' exists. The error message in case of computing a summary for non-existent submodules in the C version is different from that of the shell version. Since the new error message is not marked for translation, change the 'test_i18ngrep' in t7421.4 to 'grep'. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Stefan Beller <stefanbeller@gmail.com> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-12 14:12:58 -07:00
Shourya Shukla	6414c3d316	submodule: remove extra line feeds between callback struct and macro Many `submodule--helper` subcommands follow the convention that a struct defines their callback data, and the declaration of that struct is followed immediately by a macro to use in static initializers, without any separating empty line. Let's align the `init`, `status` and `sync` subcommands with that convention. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Helped-by: Philip Oakley <philipoakley@iee.email> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-12 14:12:58 -07:00
Junio C Hamano	e0ad9574dd	Merge branch 'bc/sha-256-part-3' The final leg of SHA-256 transition. * bc/sha-256-part-3: (39 commits) t: remove test_oid_init in tests docs: add documentation for extensions.objectFormat ci: run tests with SHA-256 t: make SHA1 prerequisite depend on default hash t: allow testing different hash algorithms via environment t: add test_oid option to select hash algorithm repository: enable SHA-256 support by default setup: add support for reading extensions.objectformat bundle: add new version for use with SHA-256 builtin/verify-pack: implement an --object-format option http-fetch: set up git directory before parsing pack hashes t0410: mark test with SHA1 prerequisite t5308: make test work with SHA-256 t9700: make hash size independent t9500: ensure that algorithm info is preserved in config t9350: make hash size independent t9301: make hash size independent t9300: use $ZERO_OID instead of hard-coded object ID t9300: abstract away SHA-1-specific constants t8011: make hash size independent ...	2020-08-11 18:04:11 -07:00
Junio C Hamano	995c71986a	Merge branch 'pb/guide-docs' Update "git help guides" documentation organization. * pb/guide-docs: git.txt: add list of guides Documentation: don't hardcode command categories twice help: drop usage of 'common' and 'useful' for guides command-list.txt: add missing 'gitcredentials' and 'gitremote-helpers'	2020-08-10 10:24:04 -07:00
Junio C Hamano	4339259d5f	Merge branch 'en/eol-attrs-gotchas' All "mergy" operations that internally use the merge-recursive machinery should honor the merge.renormalize configuration, but many of them didn't. * en/eol-attrs-gotchas: checkout: support renormalization with checkout -m <paths> merge: make merge.renormalize work for all uses of merge machinery t6038: remove problematic test t6038: make tests fail for the right reason	2020-08-10 10:24:02 -07:00
Junio C Hamano	46b225f153	Merge branch 'jk/strvec' The argv_array API is useful for not just managing argv but any "vector" (NULL-terminated array) of strings, and has seen adoption to a certain degree. It has been renamed to "strvec" to reduce the barrier to adoption. * jk/strvec: strvec: rename struct fields strvec: drop argv_array compatibility layer strvec: update documention to avoid argv_array strvec: fix indentation in renamed calls strvec: convert remaining callers away from argv_array name strvec: convert more callers away from argv_array name strvec: convert builtin/ callers away from argv_array name quote: rename sq_dequote_to_argv_array to mention strvec strvec: rename files from argv-array to strvec argv-array: rename to strvec argv-array: use size_t for count and alloc	2020-08-10 10:23:57 -07:00
Eric Sunshine	ccf236a23a	init: disallow --separate-git-dir with bare repository The purpose of "git init --separate-git-dir" is to separate the repository from the worktree. This is true even when --separate-git-dir is used on an existing worktree, in which case, it moves the .git/ subdirectory to a new location outside the worktree. However, an outright bare repository (such as one created by "git init --bare"), has no worktree, so using --separate-git-dir to separate it from its non-existent worktree is nonsensical. Therefore, make it an error to use --separate-git-dir on a bare repository. Implementation note: "git init" considers a repository bare if told so explicitly via --bare or if it guesses it to be so based upon heuristics. In the explicit --bare case, a conflict with --separate-git-dir is easy to detect early. In the guessed case, however, the conflict can only be detected once "bareness" is guessed, which happens after "git init" has begun creating the repository. Technically, we can get by with a single late check which would cover both cases, however, erroring out early, when possible, without leaving detritus provides a better user experience. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-10 09:24:11 -07:00
Aaron Lipman	ad464a4e84	bisect: combine args passed to find_bisection() Now that find_bisection() accepts multiple boolean arguments, these may be combined into a single unsigned integer in order to declutter some of the code in bisect.c Also, rename the existing "flags" bitfield to "commit_flags", to explicitly differentiate it from the new "bisect_flags" bitfield. Based-on-patch-by: Harald Nordgren <haraldnordgren@gmail.com> Signed-off-by: Aaron Lipman <alipman88@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-07 15:13:03 -07:00
Aaron Lipman	e8861ffc20	bisect: introduce first-parent flag Upon seeing a merge commit when bisecting, this option may be used to follow only the first parent. In detecting regressions introduced through the merging of a branch, the merge commit will be identified as introduction of the bug and its ancestors will be ignored. This option is particularly useful in avoiding false positives when a merged branch contained broken or non-buildable commits, but the merge itself was OK. Signed-off-by: Aaron Lipman <alipman88@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-07 15:13:03 -07:00
Aaron Lipman	be5fe2000d	cmd_bisect__helper: defer parsing no-checkout flag cmd_bisect__helper() is intended as a temporary shim layer serving as an interface for git-bisect.sh. This function and git-bisect.sh should eventually be replaced by a C implementation, cmd_bisect(), serving as an entrypoint for all "git bisect ..." shell commands: cmd_bisect() will only parse the first token following "git bisect", and dispatch the remaining args to the appropriate function ["bisect_start()", "bisect_next()", etc.]. Thus, cmd_bisect__helper() should not be responsible for parsing flags like --no-checkout. Instead, let the --no-checkout flag remain in the argv array, so it may be evaluated alongside the other options already parsed by bisect_start(). Signed-off-by: Aaron Lipman <alipman88@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-07 15:13:03 -07:00
Aaron Lipman	0fe305a5d3	rev-list: allow bisect and first-parent flags Add first_parent_only parameter to find_bisection(), removing the barrier that prevented combining the --bisect and --first-parent flags when using git rev-list Based-on-patch-by: Tiago Botelho <tiagonbotelho@hotmail.com> Signed-off-by: Aaron Lipman <alipman88@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-07 15:11:59 -07:00
Jonathan Tan	9eb86f41de	fsck: do not lazy fetch known non-promisor object There is a call to has_object_file(), which lazily fetches missing objects in a partial clone, when the object is known to not be a promisor object. Change that call to has_object(), which does not do any lazy fetching. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-06 13:01:03 -07:00
Jonathan Tan	ee47243d76	pack-objects: no fetch when allow-{any,promisor} The options --missing=allow-{any,promisor} were introduced in `caf3827e2f` ("rev-list: add list-objects filtering support", 2017-11-22) with the following note in the commit message: This patch introduces handling of missing objects to help debugging and development of the "partial clone" mechanism, and once the mechanism is implemented, for a power user to perform operations that are missing-object aware without incurring the cost of checking if a missing link is expected. The idea that these options are missing-object aware (and thus do not need to lazily fetch objects, unlike unaware commands that assume that all objects are present) are assumed in later commits such as `07ef3c6604` ("fetch test: use more robust test for filtered objects", 2020-01-15). However, the current implementations of these options use has_object_file(), which indeed lazily fetches missing objects. Teach these implementations not to do so. Also, update the documentation of these options to be clearer. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-06 13:01:03 -07:00
Philippe Blain	0371a764d2	help: drop usage of 'common' and 'useful' for guides Since `1b81d8cb19` (help: use command-list.txt for the source of guides, 2018-05-20), all man5/man7 guides listed in command-list.txt appear in the output of 'git help -g'. However, 'git help -g' still prefixes this list with "The common Git guides are:", which makes one wonder if there are others! In the same spirit, the man page for 'git help' describes the '--guides' option as listing 'useful' guides, which is not false per se but can also be taken to mean that there are other guides that exist but are not useful. Instead of 'common' and 'useful', use 'Git concept guides' in both places. To keep the code in line with this change, rename help.c::list_common_guides_help to list_guides_help. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-04 18:34:01 -07:00
Junio C Hamano	5c454b3825	Merge branch 'jt/pack-objects-prefetch-in-batch' While packing many objects in a repository with a promissor remote, lazily fetching missing objects from the promissor remote one by one may be inefficient---the code now attempts to fetch all the missing objects in batch (obviously this won't work for a lazy clone that lazily fetches tree objects as you cannot even enumerate what blobs are missing until you learn which trees are missing). * jt/pack-objects-prefetch-in-batch: pack-objects: prefetch objects to be packed pack-objects: refactor to oid_object_info_extended	2020-08-04 13:53:57 -07:00
Elijah Newren	00906d6f22	checkout: support renormalization with checkout -m <paths> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-03 11:48:15 -07:00
Elijah Newren	8d552258f4	merge: make merge.renormalize work for all uses of merge machinery The 'merge' command is not the only one that does merges; other commands like checkout -m or rebase do as well. Unfortunately, the only area of the code that checked for the "merge.renormalize" config setting was in builtin/merge.c, meaning it could only affect merges performed by the "merge" command. Move the handling of this config setting to merge_recursive_config() so that other commands can benefit from it as well. Fixes a few tests in t6038. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-03 11:48:15 -07:00
Junio C Hamano	add0a35caa	Merge branch 'rs/grep-simpler-parse-object-or-die-call' into master * rs/grep-simpler-parse-object-or-die-call: grep: avoid using oid_to_hex() with parse_object_or_die()	2020-07-30 21:34:30 -07:00
Jeff King	d70a9eb611	strvec: rename struct fields The "argc" and "argv" names made sense when the struct was argv_array, but now they're just confusing. Let's rename them to "nr" (which we use for counts elsewhere) and "v" (which is rather terse, but reads well when combined with typical variable names like "args.v"). Note that we have to update all of the callers immediately. Playing tricks with the preprocessor is hard here, because we wouldn't want to rewrite unrelated tokens. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-30 19:18:06 -07:00
Junio C Hamano	be2dab9c80	Merge branch 'ct/mv-unmerged-path-error' into master "git mv src dst", when src is an unmerged path, errored out correctly but with an incorrect error message to claim that src is not tracked, which has been clarified. * ct/mv-unmerged-path-error: git-mv: improve error message for conflicted file	2020-07-30 13:20:35 -07:00
Junio C Hamano	3161cc6e6b	Merge branch 'hn/reftable' into master Preliminary clean-up of the refs API in preparation for adding a new refs backend "reftable". * hn/reftable: reflog: cleanse messages in the refs.c layer bisect: treat BISECT_HEAD as a pseudo ref t3432: use git-reflog to inspect the reflog for HEAD lib-t6000.sh: write tag using git-update-ref	2020-07-30 13:20:32 -07:00
Junio C Hamano	f175e9b845	Merge branch 'bw/fail-cloning-into-non-empty' into master "git clone --separate-git-dir=$elsewhere" used to stomp on the contents of the existing directory $elsewhere, which has been taught to fail when $elsewhere is not an empty directory. * bw/fail-cloning-into-non-empty: git clone: don't clone into non-empty directory	2020-07-30 13:20:32 -07:00
Junio C Hamano	70cdbbe3a7	Merge branch 'ds/commit-graph-bloom-updates' into master Updates to the changed-paths bloom filter. * ds/commit-graph-bloom-updates: commit-graph: check all leading directories in changed path Bloom filters revision: empty pathspecs should not use Bloom filters revision.c: fix whitespace commit-graph: check chunk sizes after writing commit-graph: simplify chunk writes into loop commit-graph: unify the signatures of all write_graph_chunk_*() functions commit-graph: persist existence of changed-paths bloom: fix logic in get_bloom_filter() commit-graph: change test to die on parse, not load commit-graph: place bloom_settings in context	2020-07-30 13:20:31 -07:00
brian m. carlson	eff45daab8	repository: enable SHA-256 support by default Now that we have a complete SHA-256 implementation in Git, let's enable it so people can use it. Remove the ENABLE_SHA256 define constant everywhere it's used. Add tests for initializing a repository with SHA-256. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-30 09:16:49 -07:00
brian m. carlson	c5aecfc866	bundle: add new version for use with SHA-256 Currently we detect the hash algorithm in use by the length of the object ID. This is inelegant and prevents us from using a different hash algorithm that is also 256 bits in length. Since we cannot extend the v2 format in a backward-compatible way, let's add a v3 format, which is identical, except for the addition of capabilities, which are prefixed by an at sign. We add "object-format" as the only capability and reject unknown capabilities, since we do not have a network connection and therefore cannot negotiate with the other side. For compatibility, default to the v2 format for SHA-1 and require v3 for SHA-256. In t5510, always use format v3 so we can be sure we produce consistent results across hash algorithms. Since head -n N lists the top N lines instead of the Nth line, let's run our output through sed to normalize it and compare it against a fixed value, which will make sure we get exactly what we're expecting. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-30 09:16:48 -07:00
brian m. carlson	e74b606d47	builtin/verify-pack: implement an --object-format option A recently added test in t5702 started using git verify-pack outside of a repository. While this poses no problems with SHA-1, with SHA-256 we implicitly rely on the setup of the repository to initialize our hash algorithm settings. Since we're not in a repository here, we need to provide git verify-pack help to set things up properly. git index-pack already knows an --object-format option, so let's accept one as well and pass it down to our git index-pack invocation. Since we're now dynamically adjusting the elements in argv, let's switch to using struct argv_array to manage them. Finally, let's make t5702 pass the proper argument on down to its git verify-pack caller. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-30 09:16:48 -07:00
Jeff King	9ab89a2439	log: enable "-m" automatically with "--first-parent" When using "--first-parent" to consider history as a single line of commits, git-log still defaults to treating merges specially, even though they could be considered as single commits in the linearized history (that just introduce all of the changes from the second and higher parents). Let's instead have "--first-parent" imply "-m", which makes something like: git log --first-parent -p do what you'd expect. Likewise: git log --first-parent -Sfoo will find "foo" in merge commits. No new test is needed; we'll tweak the output of the existing "--first-parent -p" test, which now matches the "-m --first-parent -p" test. The unchanged existing test for "--no-diff-merges" confirms that the user can get the old behavior if they want. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-29 13:43:57 -07:00
Jeff King	6fae74b418	revision: add "--no-diff-merges" option to counteract "-m" The "-m" option sets revs->ignore_merges to "0", but there's no way to undo it. This probably isn't something anybody overly cares about, since "1" is already the default, but it will serve as an escape hatch when we flip the default for ignore_merges to "0" in more situations. We'll also add a few extra niceties: - initialize the value to "-1" to indicate "not set", and then resolve it to the normal 0/1 bool in setup_revisions(). This lets any tweak functions, as well as setup_revisions() itself, avoid clobbering the user's preference (which until now they couldn't actually express). - since we now have --no-diff-merges, let's add the matching --diff-merges, which is just a synonym for "-m". Then we don't even need to document --no-diff-merges separately; it countermands the long form of "-m" in the usual way. The new test shows that this behaves just the same as the current behavior without "-m". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-29 13:43:57 -07:00
Jeff King	eed5332a13	log: drop "--cc implies -m" logic This was added by `82dee4160c` (log: show merge commit when --cc is given, 2015-08-20), which explains why we need it. But that commit failed to notice that setup_revisions() already does the same thing, since `cd2bdc5309` (Common option parsing for "git log --diff" and friends, 2006-04-14). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-29 13:43:57 -07:00
René Scharfe	98c6871fad	grep: avoid using oid_to_hex() with parse_object_or_die() parse_object_or_die() is passed an object ID and a name to show if the object cannot be parsed. If the name is NULL then it shows the hexadecimal object ID. Use that feature instead of preparing and passing the hexadecimal representation to the function proactively. That's shorter and a bit more efficient. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:26:12 -07:00
Jeff King	f6d8942b1f	strvec: fix indentation in renamed calls Code which split an argv_array call across multiple lines, like: argv_array_pushl(&args, "one argument", "another argument", "and more", NULL); was recently mechanically renamed to use strvec, which results in mis-matched indentation like: strvec_pushl(&args, "one argument", "another argument", "and more", NULL); Let's fix these up to align the arguments with the opening paren. I did this manually by sifting through the results of: git jump grep 'strvec_.*,$' and liberally applying my editor's auto-format. Most of the changes are of the form shown above, though I also normalized a few that had originally used a single-tab indentation (rather than our usual style of aligning with the open paren). I also rewrapped a couple of obvious cases (e.g., where previously too-long lines became short enough to fit on one), but I wasn't aggressive about it. In cases broken to three or more lines, the grouping of arguments is sometimes meaningful, and it wasn't worth my time or reviewer time to ponder each case individually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	22f9b7f3f5	strvec: convert builtin/ callers away from argv_array name We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts all of the files in builtin/ to keep the diff to a manageable size. The conversion was done purely mechanically with: git ls-files '.c' '.h' \| xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' and then selectively staging files with "git add builtin/". We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	2745b6b450	quote: rename sq_dequote_to_argv_array to mention strvec We want to eventually drop the use of the "argv_array" name in favor of "strvec." Unlike most other uses of the name, this one is embedded in a function name, so the definition and all of the callers need to be updated at the same time. We don't technically need to update the parameter types here (our preprocessor compat macros make the two names interchangeable), but let's do so to keep the site consistent for now. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	dbbcd44fb4	strvec: rename files from argv-array to strvec This requires updating #include lines across the code-base, but that's all fairly mechanical, and was done with: git ls-files '.c' '.h' \| xargs perl -i -pe 's/argv-array.h/strvec.h/' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:17 -07:00
Jonathan Tan	e00549aa9b	pack-objects: prefetch objects to be packed When an object to be packed is noticed to be missing, prefetch all to-be-packed objects in one batch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-21 14:29:42 -07:00

... 8 9 10 11 12 ...

9533 Commits