git-commit-vandalism

Author	SHA1	Message	Date
Laurent Arnoud	c28ded83fc	diff: add config option relative The `diff.relative` boolean option set to `true` shows only changes in the current directory/value specified by the `path` argument of the `relative` option and shows pathnames relative to the aforementioned directory. Teach `--no-relative` to override earlier `--relative` Add for git-format-patch(1) options documentation `--relative` and `--no-relative` Signed-off-by: Laurent Arnoud <laurent@spkdev.net> Acked-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:23:59 -07:00
Carlo Marcelo Arenas Belón	d63ae31962	t5608: avoid say() and use "skip_all" instead for consistency Printing a message directly to stdout could affect TAP processing and is not really needed, as there is a standard way to skip all tests that could be used instead, while printing an equivalent message. While at it; update the message to better reflect that since `a85efb5985` (t5608-clone-2gb.sh: turn GIT_TEST_CLONE_2GB into a bool, 2019-11-22), the enabling variable should be a recognized boolean (ex: true, false, 1, 0) and get rid of the prerequisite that used to guard all the tests, since "skip_all" is just much faster and idempotent. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:22:19 -07:00
René Scharfe	bb2198fb91	checkout: improve error messages for -b with extra argument When we try to create a branch "foo" based on "origin/master" and give git commit -b an extra unsupported argument "bar", it confusingly reports: $ git checkout -b foo origin/master bar fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it $ git checkout --track -b foo origin/master bar fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it That's wrong, because it very well understands that "origin/master" is supposed to be the start point for the new branch and not "bar". Check if we got a commit and show more fitting messages in that case instead: $ git checkout -b foo origin/master bar fatal: Cannot update paths and switch to branch 'foo' at the same time. $ git checkout --track -b foo origin/master bar fatal: '--track' cannot be used with updating paths Original-patch-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:21:30 -07:00
René Scharfe	16ab794b82	checkout: add tests for -b and --track Test git checkout -b with and without --track and demonstrate unexpected error messages when it's given an extra (i.e. unsupported) path argument. In both cases it reports: $ git checkout -b foo origin/master bar fatal: 'bar' is not a commit and a branch 'foo' cannot be created from it The problem is that the start point we gave for the new branch is "origin/master" and "bar" is just some extra argument -- it could even be a valid commit, which would make the message even more confusing. We have more fitting error messages in git commit, but get confused; use the text of the rights ones in the tests. Reported-by: Dana Dahlstrom <dahlstrom@google.com> Original-test-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:19:41 -07:00
Carlo Marcelo Arenas Belón	4d9005ff5d	bisect--helper: avoid segfault with bad syntax in `start --term-*` `06f5608c14` (bisect--helper: `bisect_start` shell function partially in C, 2019-01-02) adds a lax parser for `git bisect start` which could result in a segfault under a bad syntax call for start with custom terms. Detect if there are enough arguments left in the command line to use for --term-{old,good,new,bad} and abort with the same syntax error the original implementation will show if not. While at it, remove an unnecessary (and incomplete) check for unknown arguments and make sure to add a test to avoid regressions. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Acked-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 09:00:11 -07:00
René Scharfe	fe747043dc	fsck: detect more in-tree d/f conflicts If the conflict candidate file name from the top of the stack is not a prefix of the current candiate directory then we can discard it as no matching directory can come up later. But we are not done checking the candidate directory -- the stack might still hold a matching file name, so stay in the loop and check the next candidate file name. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-21 11:03:13 -07:00
René Scharfe	3d71b1cf60	t1450: demonstrate undetected in-tree d/f conflict Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-21 11:03:13 -07:00
René Scharfe	fc12aa7bfd	t1450: increase test coverage of in-tree d/f detection Exercise the case of putting a conflict candidate file name back on the stack because a matching directory might yet come up later. Do that by factoring out the test code into a function to allow for more concise notation in the form of parameters indicating names of trees (with trailing slash) and blobs (without trailing slash) in no particular order (they are sorted by git mktree). Then add the new test case as a second function call. Fix a typo in the test title while at it ("dublicate"). Reported-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-21 11:03:10 -07:00
brian m. carlson	8d3e33dadd	t2060: add a test for switch with --orphan and --discard-changes We have several code paths in the checkout code which are traversed only in this case, due to switch having different defaults from checkout. Let's add a test that the combination of options works and produces the expected behavior. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-21 09:55:23 -07:00
Han-Wen Nienhuys	cdb73ca56f	t: use update-ref and show-ref to reading/writing refs Reading and writing .git/refs/* assumes that refs are stored in the 'files' ref backend. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-20 10:39:01 -07:00
Carlo Marcelo Arenas Belón	c703309024	t5520: avoid alternation in grep's BRE (not POSIX) Instead of using a BRE, that broke tests 30-32, 37-39, 42 at least with OpenBSD 6.7; use a simpler ERE. Fixes: `d9f15d37f1` (pull: pass --autostash to merge, 2020-04-07) Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-20 08:57:07 -07:00
Carlo Marcelo Arenas Belón	784ce03d55	t4216: avoid unnecessary subshell in test_bloom_filters_not_used Seems to trigger a bug in at least OpenBSD's 6.7 sh where it is interpreted as a history lookup and therefore fails 125-126, 128, 130. Remove the subshell and get a space between ! and grep, so tests pass successfully. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-20 08:56:12 -07:00
Junio C Hamano	972ce8561d	Merge branch 'jc/fix-tap-output-under-bash' A recent attempt to make the test output nicer to view on CI systems broke TAP output under bash. The effort has been reverted to be re-attempted in the next cycle. * jc/fix-tap-output-under-bash: Revert "tests: when run in Bash, annotate test failures with file name/line number" Revert "ci: add a problem matcher for GitHub Actions" Revert "t/test_lib: avoid naked bash arrays in file_lineno"	2020-05-20 08:33:29 -07:00
Junio C Hamano	abbd1d9ebf	Merge branch 'en/merge-rename-rename-worktree-fix' When a binary file gets modified and renamed on both sides of history to different locations, both files would be written to the working tree but both would have the contents from "ours". This has been corrected so that the path from each side gets their original content. * en/merge-rename-rename-worktree-fix: merge-recursive: fix rename/rename(1to2) for working tree with a binary	2020-05-20 08:33:27 -07:00
Junio C Hamano	74c6cba6d8	Merge branch 'dd/t1509-i18n-fix' A few tests were not i18n clean. * dd/t1509-i18n-fix: t1509: correct i18n test	2020-05-20 08:33:26 -07:00
Jonathan Tan	293194c9f9	t4067: make rename detection test output raw diff `95acf11a3d` ("diff: restrict when prefetching occurs", 2020-04-07) taught diff to prefetch blobs in a more limited set of situations. These limited situations include when the output format requires blob data, and when inexact rename detection is needed. There is an existing test case that tests inexact rename detection, but it also uses an output format that requires blob data, resulting in the inexact-rename-detection-only code not being tested. Update this test to use the raw output format, which does not require blob data. Thanks to Derrick Stolee for noticing this lapse in code coverage and for doing the preliminary analysis [1]. [1] https://lore.kernel.org/git/853759d3-97c3-241f-98e1-990883cd204e@gmail.com/ Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-19 16:09:16 -07:00
Đoàn Trần Công Danh	1eb7371236	t5703: replace "grep -a" usage by perl On some platforms likes HP-UX, grep(1) doesn't understand "-a". Let's switch to perl. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-19 11:15:23 -07:00
Carlo Marcelo Arenas Belón	c4c2a96ec7	t4210: detect REG_ILLSEQ dynamically and skip affected tests `7187c7bbb8` (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27) adds a REG_ILLSEQ prerequisite, and to do that copies the common branch in test-lib and expands it to include it in a special case for FreeBSD. Instead; test for it using a previously added extension to test-tool and use that, together with a function that identifies when regcomp/regexec will be called with broken patterns to avoid any test that would otherwise rely on undefined behaviour. The description of the first test which wasn't accurate has been corrected, and the test rearranged for clarity, including a helper function that avoids overly long lines. Only the affected engines will have their tests suppressed, also including "fixed" if the PCRE optimization that uses LIBPCRE2 since `b65abcafc7` (grep: use PCRE v2 for optimized fixed-string search, 2019-07-01) is not available. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-18 13:03:36 -07:00
Carlo Marcelo Arenas Belón	aba8187e4d	t/helper: teach test-regex to report pattern errors (like REG_ILLSEQ) `7187c7bbb8` (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27) adds a REG_ILLSEQ prerequisite to avoid failures from the tests added in `4e2443b181` (log tests: test regex backends in "--encode=<enc>" tests, 2019-06-28), but hardcodes it to be only enabled in FreeBSD. Instead of hardcoding the affected platform, teach the test-regex helper, how to validate a pattern and report back, so it can be used to detect the same issue in other affected systems (like DragonFlyBSD or macOS). While at it, refactor the tool so it can report back the source of the errors it founds, and can be invoked also in a --silent mode, when needed, for backward compatibility. A missing flag has been added and the code reformatted, as well as updates to the way the parameters are handled, for consistency. To minimize changes, it is assumed the regcomp error is of the right type since we control the only caller, and is also assumed to affect both basic and extended syntax (only basic is tested, but both behave the same in all three affected platforms since they use the same function). Based-on-patch-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-18 13:03:35 -07:00
Taylor Blau	2f00c355cb	commit-graph: drop COMMIT_GRAPH_WRITE_CHECK_OIDS flag Since `7c5c9b9c57` (commit-graph: error out on invalid commit oids in 'write --stdin-commits', 2019-08-05), the commit-graph builtin dies on receiving non-commit OIDs as input to '--stdin-commits'. This behavior can be cumbersome to work around in, say, the case of piping 'git for-each-ref' to 'git commit-graph write --stdin-commits' if the caller does not want to cull out non-commits themselves. In this situation, it would be ideal if 'git commit-graph write' wrote the graph containing the inputs that did pertain to commits, and silently ignored the remainder of the input. Some options have been proposed to the effect of '--[no-]check-oids' which would allow callers to have the commit-graph builtin do just that. After some discussion, it is difficult to imagine a caller who wouldn't want to pass '--no-check-oids', suggesting that we should get rid of the behavior of complaining about non-commit inputs altogether. If callers do wish to retain this behavior, they can easily work around this change by doing the following: git for-each-ref --format='%(objectname) %(objecttype) %(*objecttype)' \| awk ' !/commit/ { print "not-a-commit:"$1 } /commit/ { print $1 } ' \| git commit-graph write --stdin-commits To make it so that valid OIDs that refer to non-existent objects are indeed an error after loosening the error handling, perform an extra lookup to make sure that object indeed exists before sending it to the commit-graph internals. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-18 12:51:11 -07:00
Taylor Blau	1f1304d497	t5318: reorder test below 'graph_read_expect' In the subsequent commit, we will introduce a dependency on 'graph_read_expect' from t5318.7. Preemptively move it below 'graph_read_expect()'s definition so that the test can call it. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-18 12:51:11 -07:00
Greg Price	ddcfc7c67d	tests: skip small-stack tests on hppa architecture On hppa these tests crash because the allocated stack space is too small, even after it was doubled in `b9a190789` (and the data size doubled to match) to make it work on powerpc. For this arch just skip these tests, which is enough to make the whole suite pass. Fixes: https://bugs.debian.org/757402 Based-on-patch-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Greg Price <gnprice@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-18 10:05:12 -07:00
Junio C Hamano	e31600b03f	Revert "tests: when run in Bash, annotate test failures with file name/line number" This reverts commit `662f9cf154`, to fix the TAP output broken for bash.	2020-05-15 10:25:58 -07:00
Junio C Hamano	3d7b2b4196	Revert "t/test_lib: avoid naked bash arrays in file_lineno" This reverts commit 303775a25f0b4ac5d6ad2e96eb4404c24209cad8; instead of trying to salvage the tap-breaking change, let's revert the whole thing for now.	2020-05-15 09:47:18 -07:00
Junio C Hamano	d98abce68f	Merge branch 'es/trace-log-progress' Teach codepaths that show progress meter to also use the start_progress() and the stop_progress() calls as a "region" to be traced. * es/trace-log-progress: trace2: log progress time and throughput	2020-05-14 14:39:45 -07:00
Junio C Hamano	ac140beebe	Merge branch 'jt/t5500-unflake' Test fix for a topic already in 'master' and meant for 'maint'. * jt/t5500-unflake: t5500: count objects through stderr, not trace	2020-05-14 14:39:45 -07:00
Junio C Hamano	6baba94afc	Merge branch 'sn/midx-repack-with-config' "git multi-pack-index repack" has been taught to honor some repack.* configuration variables. * sn/midx-repack-with-config: multi-pack-index: respect repack.packKeptObjects=false midx: teach "git multi-pack-index repack" honor "git repack" configurations	2020-05-14 14:39:44 -07:00
Junio C Hamano	4b1e5e5d8c	Merge branch 'ds/bloom-cleanup' Code cleanup and typofixes * ds/bloom-cleanup: completion: offer '--(no-)patch' among 'git log' options bloom: use num_changes not nr for limit detection bloom: de-duplicate directory entries Documentation: changed-path Bloom filters use byte words bloom: parse commit before computing filters test-bloom: fix usage typo bloom: fix whitespace around tab length	2020-05-14 14:39:44 -07:00
Junio C Hamano	0498840b35	Merge branch 'rs/fsck-duplicate-names-in-trees' "git fsck" ensures that the paths recorded in tree objects are sorted and without duplicates, but it failed to notice a case where a blob is followed by entries that sort before a tree with the same name. This has been corrected. * rs/fsck-duplicate-names-in-trees: fsck: report non-consecutive duplicate names in trees	2020-05-14 14:39:44 -07:00
Junio C Hamano	f4507cea24	Merge branch 'ao/p4-d-f-conflict-recover' "git p4" learned to recover from a (broken) state where a directory and a file are recorded at the same path in the Perforce repository the same way as their clients do. * ao/p4-d-f-conflict-recover: git-p4: recover from inconsistent perforce history	2020-05-14 14:39:43 -07:00
Junio C Hamano	a2a0942a16	Merge branch 'js/rebase-autosquash-double-fixup-fix' "rebase -i" segfaulted when rearranging a sequence that has a fix-up that applies another fix-up (which may or may not be a fix-up of yet another step). * js/rebase-autosquash-double-fixup-fix: rebase --autosquash: fix a potential segfault	2020-05-14 14:39:43 -07:00
Junio C Hamano	f9dbe28d62	Merge branch 'cw/bisect-replay-with-dos' "git bisect replay" had trouble with input files when they used CRLF line ending, which has been corrected. * cw/bisect-replay-with-dos: bisect: allow CRLF line endings in "git bisect replay" input	2020-05-14 14:39:41 -07:00
Junio C Hamano	3583730758	Merge branch 'es/bugreport-with-hooks' "git bugreport" learned to report enabled hooks in the repository. * es/bugreport-with-hooks: bugreport: collect list of populated hooks	2020-05-14 14:39:41 -07:00
Elijah Newren	95983da6b4	merge-recursive: fix rename/rename(1to2) for working tree with a binary With a rename/rename(1to2) conflict, we attempt to do a three-way merge of the file contents, so that the correct contents can be placed in the working tree at both paths. If the file is a binary, however, no content merging is possible and we should just use the original version of the file at each of the paths. Reported-by: Chunlin Zhang <zhangchunlin@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-14 12:14:19 -07:00
Junio C Hamano	a0125885f5	Merge branch 'cc/upload-pack-v2-fetch-fix' Serving a "git fetch" client over "git://" and "ssh://" protocols using the on-wire protocol version 2 was buggy on the server end when the client needs to make a follow-up request to e.g. auto-follow tags. * cc/upload-pack-v2-fetch-fix: upload-pack: clear filter_options for each v2 fetch command	2020-05-13 12:19:21 -07:00
Junio C Hamano	2e72299ec6	Merge branch 'dd/bloom-sparse-fix' Code clean-up. * dd/bloom-sparse-fix: bloom: fix `make sparse` warning	2020-05-13 12:19:20 -07:00
Junio C Hamano	69ae8ffa2a	Merge branch 'tb/bitmap-walk-with-tree-zero-filter' The object walk with object filter "--filter=tree:0" can now take advantage of the pack bitmap when available. * tb/bitmap-walk-with-tree-zero-filter: pack-bitmap: pass object filter to fill-in traversal pack-bitmap.c: support 'tree:0' filtering pack-bitmap.c: make object filtering functions generic list-objects-filter: treat NULL filter_options as "disabled"	2020-05-13 12:19:18 -07:00
Đoàn Trần Công Danh	27e29f859d	t1509: correct i18n test git-init(1)'s messages is subjected to i18n. They should be tested by test_i18n* family. Fix them. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-13 09:59:00 -07:00
brian m. carlson	a114296371	t1050: match object ID paths in a hash-insensitive way The pattern here looking for failures is specific to SHA-1. Let's create a variable that matches the regex or glob pattern for a path within the objects directory. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-12 22:36:17 -07:00
Emily Shaffer	98a1364740	trace2: log progress time and throughput Rather than teaching only one operation, like 'git fetch', how to write down throughput to traces, we can learn about a wide range of user operations that may seem slow by adding tooling to the progress library itself. Operations which display progress are likely to be slow-running and the kind of thing we want to monitor for performance anyways. By showing object counts and data transfer size, we should be able to make some derived measurements to ensure operations are scaling the way we expect. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-12 15:30:39 -07:00
Jonathan Tan	7167a62b9e	http, imap-send: stop using CURLOPT_VERBOSE Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting CURLOPT_VERBOSE. This is to prevent inadvertent revelation of sensitive data. In particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header nor any cookies specified by GIT_REDACT_COOKIES. Unifying the tracing mechanism also has the future benefit that any improvements to the tracing mechanism will benefit both users of GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to implement any improvement twice. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 11:18:01 -07:00
Jonathan Tan	373e9bd66e	t5551: test that GIT_TRACE_CURL redacts password Verify that when GIT_TRACE_CURL is set, Git prints out "Authorization: Basic <redacted>" instead of the base64-encoded authorization details. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 11:17:59 -07:00
Derrick Stolee	2f6775f00c	bloom: use num_changes not nr for limit detection As diff_tree_oid() computes a diff, it will terminate early if the total number of changed paths is strictly larger than max_changes. This includes the directories that changed, not just the file paths. However, only the file paths are reflected in the resulting diff queue's "nr" value. Use the "num_changes" from diffopt to check if the diff terminated early. This is incredibly important, as it can result in incorrect filters! For example, the first commit in the Linux kernel repo reports only 471 changes, but since these are nested inside several directories they expand to 513 "real" changes, and in fact the total list of changes is not reported. Thus, the computed filter for this commit is incorrect. Demonstrate the subtle difference by using one fewer file change in the 'get bloom filter for commit with 513 changes' test. Before, this edited 513 files inside "bigDir" which hit this inequality. However, dropping the file count by one demonstrates how the previous inequality was incorrect but the new one is correct. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
SZEDER Gábor	3cb9d2b6f9	line-log: more responsive, incremental 'git log -L' The current line-level log implementation performs a preprocessing step in prepare_revision_walk(), during which the line_log_filter() function filters and rewrites history to keep only commits modifying the given line range. This preprocessing affects both responsiveness and correctness: - Git doesn't produce any output during this preprocessing step. Checking whether a commit modified the given line range is somewhat expensive, so depending on the size of the given revision range this preprocessing can result in a significant delay before the first commit is shown. - Limiting the number of displayed commits (e.g. 'git log -3 -L...') doesn't limit the amount of work during preprocessing, because that limit is applied during history traversal. Alas, by that point this expensive preprocessing step has already churned through the whole revision range to find all commits modifying the revision range, even though only a few of them need to be shown. - It rewrites parents, with no way to turn it off. Without the user explicitly requesting parent rewriting any parent object ID shown should be that of the immediate parent, just like in case of a pathspec-limited history traversal without parent rewriting. However, after that preprocessing step rewrote history, the subsequent "regular" history traversal (i.e. get_revision() in a loop) only sees commits modifying the given line range. Consequently, it can only show the object ID of the last ancestor that modified the given line range (which might happen to be the immediate parent, but many-many times it isn't). This patch addresses both the correctness and, at least for the common case, the responsiveness issues by integrating line-level log filtering into the regular revision walking machinery: - Make process_ranges_arbitrary_commit(), the static function in 'line-log.c' deciding whether a commit modifies the given line range, public by removing the static keyword and adding the 'line_log_' prefix, so it can be called from other parts of the revision walking machinery. - If the user didn't explicitly ask for parent rewriting (which, I believe, is the most common case): - Call this now-public function during regular history traversal, namely from get_commit_action() to ignore any commits not modifying the given line range. Note that while this check is relatively expensive, it must be performed before other, much cheaper conditions, because the tracked line range must be adjusted even when the commit will end up being ignored by other conditions. - Skip the line_log_filter() call, i.e. the expensive preprocessing step, in prepare_revision_walk(), because, thanks to the above points, the revision walking machinery is now able to filter out commits not modifying the given line range while traversing history. This way the regular history traversal sees the unmodified history, and is therefore able to print the object ids of the immediate parents of the listed commits. The eliminated preprocessing step can greatly reduce the delay before the first commit is shown, see the numbers below. - However, if the user did explicitly ask for parent rewriting via '--parents' or a similar option, then stick with the current implementation for now, i.e. perform that expensive filtering and history rewriting in the preprocessing step just like we did before, leaving the initial delay as long as it was. I tried to integrate line-level log filtering with parent rewriting into the regular history traversal, but, unfortunately, several subtleties resisted... :) Maybe someday we'll figure out how to do that, but until then at least the simple and common (i.e. without parent rewriting) 'git log -L:func:file' commands can benefit from the reduced delay. This change makes the failing 'parent oids without parent rewriting' test in 't4211-line-log.sh' succeed. The reduced delay is most noticable when there's a commit modifying the line range near the tip of a large-ish revision range: # no parent rewriting requested, no commit-graph present $ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0 Before: real 0m9.570s user 0m9.494s sys 0m0.076s After: real 0m0.718s user 0m0.674s sys 0m0.044s A significant part of the remaining delay is spent reading and parsing commit objects in limit_list(). With the help of the commit-graph we can eliminate most of that reading and parsing overhead, so here are the timing results of the same command as above, but this time using the commit-graph: Before: real 0m8.874s user 0m8.816s sys 0m0.057s After: real 0m0.107s user 0m0.091s sys 0m0.013s The next patch will further reduce the remaining delay. To be clear: this patch doesn't actually optimize the line-level log, but merely moves most of the work from the preprocessing step to the history traversal, so the commits modifying the line range can be shown as soon as they are processed, and the traversal can be terminated as soon as the given number of commits are shown. Consequently, listing the full history of a line range, potentially all the way to the root commit, will take the same time as before (but at least the user might start reading the output earlier). Furthermore, if the most recent commit modifying the line range is far away from the starting revision, then that initial delay will still be significant. Additional testing by Derrick Stolee: In the Linux kernel repository, the MAINTAINERS file was changed ~3,500 times across the ~915,000 commits. In addition to that edit frequency, the file itself is quite large (~18,700 lines). This means that a significant portion of the computation is taken up by computing the patch-diff of the file. This patch improves the real time it takes to output the first result quite a bit: Command: git log -L 100,200:MAINTAINERS -n 1 >/dev/null Before: 3.88 s After: 0.71 s If we drop the "-n 1" in the command, then there is no change in end-to-end process time. This is because the command still needs to walk the entire commit history, which negates the point of this patch. This is expected. As a note for future reference, the ~4.3 seconds in the old code spends ~2.6 seconds computing the patch-diffs, and the rest of the time is spent walking commits and computing diffs for which paths changed at each commit. The changed-path Bloom filters could improve the end-to-end computation time (i.e. no "-n 1" in the command). Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
Derrick Stolee	65c1a28bb6	bloom: de-duplicate directory entries When computing a changed-path Bloom filter, we need to take the files that changed from the diff computation and extract the parent directories. That way, a directory pathspec such as "Documentation" could match commits that change "Documentation/git.txt". However, the current code does a poor job of this process. The paths are added to a hashmap, but we do not check if an entry already exists with that path. This can create many duplicate entries and cause the filter to have a much larger length than it should. This means that the filter is more sparse than intended, which helps the false positive rate, but wastes a lot of space. Properly use hashmap_get() before hashmap_add(). Also be sure to include a comparison function so these can be matched correctly. This has an effect on a test in t0095-bloom.sh. This makes sense, there are ten changes inside "smallDir" so the total number of paths in the filter should be 11. This would result in 11 * 10 bits required, and with 8 bits per byte, this results in 14 bytes. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
SZEDER Gábor	48da94ba37	t4211-line-log: add tests for parent oids None of the tests in 't4211-line-log.sh' really check which parent object IDs are shown in the output, either implicitly as part of "Merge: ..." lines [1] or explicitly via the '%p' or '%P' format specifiers in a custom pretty format. Add two tests to 't4211-line-log.sh' to check which parent object IDs are shown, one without and one with explicitly requested parent rewriting, IOW without and with the '--parents' option. The test without '--parents' is marked as failing, because without that option parent rewriting should not be performed, and thus the parent object ID should be that of the immediate parent, just like in case of a pathspec-limited history traversal without parent rewriting. The current line-level log implementation, however, performs parent rewriting unconditionally and without a possibility to turn it off, and, consequently, it shows the object ID of the most recent ancestor that modified the given line range. In both of these new tests we only really care about the object IDs of the listed commits and their parents, but not the diffs of the line ranges; the diffs have already been thoroughly checked in the previous tests. [1] While one of the tests ('-M -L ':f:b.c' parallel-change') does list a merge commit, both of its parents happen to modify the given line range and are listed as well, so the implications of parent rewriting remained hidden and untested. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
René Scharfe	9068cfb20f	fsck: report non-consecutive duplicate names in trees Tree entries are sorted in path order, meaning that directory names get a slash ('/') appended implicitly. Git fsck checks if trees contains consecutive duplicates, but due to that ordering there can be non-consecutive duplicates as well if one of them is a directory and the other one isn't. Such a tree cannot be fully checked out. Find these duplicates by recording candidate file names on a stack and check candidate directory names against that stack to find matches. Suggested-by: Brandon Williams <bwilliamseng@gmail.com> Original-test-by: Brandon Williams <bwilliamseng@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Reviewed-by: Luke Diamand <luke@diamand.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 08:40:28 -07:00
Andrew Oakley	82e46d6b83	git-p4: recover from inconsistent perforce history Perforce allows you commit files and directories with the same name, so you could have files //depot/foo and //depot/foo/bar both checked in. A p4 sync of a repository in this state fails. Deleting one of the files recovers the repository. When this happens we want git-p4 to recover in the same way as perforce. Note that Perforce has this change in their 2017.1 version: Bugs fixed in 2017.1 #1489051 (Job #2170) ** Submitting a file with the same name as an existing depot directory path (or vice versa) will now be rejected. so people hopefully will not creating damaged Perforce repos anymore, but "git p4" needs to be able to interact with already corrupt ones. Signed-off-by: Andrew Oakley <andrew@adoakley.name> Reviewed-by: Luke Diamand <luke@diamand.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-10 09:58:50 -07:00
Derrick Stolee	3ce4ca0a56	multi-pack-index: respect repack.packKeptObjects=false When selecting a batch of pack-files to repack in the "git multi-pack-index repack" command, Git should respect the repack.packKeptObjects config option. When false, this option says that the pack-files with an associated ".keep" file should not be repacked. This config value is "false" by default. There are two cases for selecting a batch of objects. The first is the case where the input batch-size is zero, which specifies "repack everything". The second is with a non-zero batch size, which selects pack-files using a greedy selection criteria. Both of these cases are updated and tested. Reported-by: Son Luong Ngoc <sluongng@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-10 09:50:55 -07:00
Johannes Schindelin	02471e7e20	rebase --autosquash: fix a potential segfault When rearranging the todo list so that the fixups/squashes are reordered just after the commits they intend to fix up, we use two arrays to maintain that list: `next` and `tail`. The idea is that `next[i]`, if set to a non-negative value, contains the index of the item that should be rearranged just after the `i`th item. To avoid having to walk the entire `next` chain when appending another fixup/squash, we also store the end of the `next` chain in `tail[i]`. The logic we currently use to update these array items is based on the assumption that given a fixup/squash item at index `i`, we just found the index `i2` indicating the first item in that fixup chain. However, as reported by Paul Ganssle, that need not be true: the special form `fixup! <commit-hash>` is allowed to point to _another_ fixup commit in the middle of the fixup chain. Example: * 0192a To fixup * 02f12 fixup! To fixup * 03763 fixup! To fixup * 04ecb fixup! 02f12 Note how the fourth commit targets the second commit, which is already a fixup that targets the first commit. Previously, we would update `next` and `tail` under our assumption that every `fixup!` commit would find the start of the `fixup!`/`squash!` chain. This would lead to a segmentation fault because we would actually end up with a `next[i]` pointing to a `fixup!` but the corresponding `tail[i]` pointing nowhere, which would the lead to a segmentation fault. Let's fix this by _inserting_, rather than _appending_, the item. In other words, if we make a given line successor of another line, we do not simply forget any previously set successor of the latter, but make it a successor of the former. In the above example, at the point when we insert 04ecb just after 02f12, 03763 would already be recorded as a successor of 04ecb, and we now "squeeze in" 04ecb. To complete the idea, we now no longer assume that `next[i]` pointing to a line means that `last[i]` points to a line, too. Instead, we extend the concept of `last` to cover also partial `fixup!`/`squash!` chains, i.e. chains starting in the middle of a larger such chain. In the above example, after processing all lines, `last[0]` (corresponding to 0192a) would point to 03763, which indeed is the end of the overall `fixup!` chain, and `last[1]` (corresponding to 02f12) would point to 04ecb (which is the last `fixup!` targeting 02f12, but it has 03763 as successor, i.e. it is not the end of overall `fixup!` chain). Reported-by: Paul Ganssle <paul@ganssle.io> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-09 13:59:55 -07:00
Junio C Hamano	282ce92448	Merge branch 'cb/test-bash-lineno-fix' Recent change to show files and line numbers of a breakage during test (only available when running the tests with bash) were hurting other shells with syntax errors, which has been corrected. * cb/test-bash-lineno-fix: t/test_lib: avoid naked bash arrays in file_lineno	2020-05-08 14:25:12 -07:00
Junio C Hamano	41eae3eaa8	Merge branch 'cb/t0000-use-the-configured-shell' The basic test did not honor $TEST_SHELL_PATH setting, which has been corrected. * cb/t0000-use-the-configured-shell: t/t0000-basic: make sure subtests also use TEST_SHELL_PATH	2020-05-08 14:25:12 -07:00
Junio C Hamano	4c2941a5fa	Merge branch 'es/restore-staged-from-head-by-default' "git restore --staged --worktree" now defaults to take the contents out of "HEAD", instead of erring out. * es/restore-staged-from-head-by-default: restore: default to HEAD when combining --staged and --worktree	2020-05-08 14:25:08 -07:00
Junio C Hamano	e9acbd6836	Merge branch 'ds/sparse-allow-empty-working-tree' The sparse-checkout patterns have been forbidden from excluding all paths, leaving an empty working tree, for a long time. This limitation has been lifted. * ds/sparse-allow-empty-working-tree: sparse-checkout: stop blocking empty workdirs	2020-05-08 14:25:06 -07:00
Junio C Hamano	6de1630898	Merge branch 'jk/for-each-ref-multi-key-sort-fix' "git branch" and other "for-each-ref" variants accepted multiple --sort=<key> options in the increasing order of precedence, but it had a few breakages around "--ignore-case" handling, and tie-breaking with the refname, which have been fixed. * jk/for-each-ref-multi-key-sort-fix: ref-filter: apply fallback refname sort only after all user sorts ref-filter: apply --ignore-case to all sorting keys	2020-05-08 14:25:04 -07:00
Junio C Hamano	dc4c3933b1	Merge branch 'ah/userdiff-markdown' The userdiff patterns for Markdown documents have been added. * ah/userdiff-markdown: userdiff: support Markdown	2020-05-08 14:25:01 -07:00
Junio C Hamano	933fdf8784	Merge branch 'cb/credential-store-ignore-bogus-lines' With the recent tightening of the code that is used to parse various parts of a URL for use in the credential subsystem, a hand-edited credential-store file causes the credential helper to die, which is a bit too harsh to the users. Demote the error behaviour to just ignore and keep using well-formed lines instead. * cb/credential-store-ignore-bogus-lines: credential-store: ignore bogus lines from store file credential-store: document the file format a bit more	2020-05-08 14:25:01 -07:00
Christian Couder	08450ef791	upload-pack: clear filter_options for each v2 fetch command Because of the request/response model of protocol v2, the upload_pack_v2() function is sometimes called twice in the same process, while 'struct list_objects_filter_options filter_options' was declared as static at the beginning of 'upload-pack.c'. This made the check in list_objects_filter_die_if_populated(), which is called by process_args(), fail the second time upload_pack_v2() is called, as filter_options had already been populated the first time. To fix that, filter_options is not static any more. It's now owned directly by upload_pack(). It's now also part of 'struct upload_pack_data', so that it's owned indirectly by upload_pack_v2(). In the long term, the goal is to also have upload_pack() use 'struct upload_pack_data', so adding filter_options to this struct makes more sense than to have it owned directly by upload_pack_v2(). This fixes the first of the 2 bugs documented by `d0badf8797` (partial-clone: demonstrate bugs in partial fetch, 2020-02-21). Helped-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-08 11:07:27 -07:00
Christopher Warrington	6c722cbe5a	bisect: allow CRLF line endings in "git bisect replay" input We advertise that the bisect log can be corrected in your editor before being fed to "git bisect replay", but some editors may turn the line endings to CRLF. Update the parser of the input lines so that the CR at the end of the line gets ignored. Were anyone to intentionally be using terms/revs with embedded CRs, replaying such bisects will no longer work with this change. I suspect that this is incredibly rare. Signed-off-by: Christopher Warrington <chwarr@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-08 10:54:27 -07:00
Emily Shaffer	788a776069	bugreport: collect list of populated hooks Occasionally a failure a user is seeing may be related to a specific hook which is being run, perhaps without the user realizing. While the contents of hooks can be sensitive - containing user data or process information specific to the user's organization - simply knowing that a hook is being run at a certain stage can help us to understand whether something is going wrong. Without a definitive list of hook names within the code, we compile our own list from the documentation. This is likely prone to bitrot, but designing a single source of truth for acceptable hooks is too much overhead for this small change to the bugreport tool. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-07 18:25:04 -07:00
Đoàn Trần Công Danh	066b70ae97	bloom: fix `make sparse` warning * We need a `final_new_line` to make our source code as text file, per POSIX and C specification. * `bloom_filters` should be limited to interal linkage only Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-07 17:08:21 -07:00
Carlo Marcelo Arenas Belón	303775a25f	t/test_lib: avoid naked bash arrays in file_lineno `662f9cf154` (tests: when run in Bash, annotate test failures with file name/line number, 2020-04-11), introduces a way to report the location (file:lineno) of a failed test case by traversing the bash callstack. The implementation requires bash and uses shell arrays and is therefore protected by a guard but NetBSD sh will still have to parse the function and therefore will result in: t0000-basic.sh * ./test-lib.sh: 681: Syntax error: Bad substitution Enclose the bash specific code inside an eval to avoid parsing errors in the same way than `5826b7b595` (test-lib: check Bash version for '-x' without using shell arrays, 2019-01-03) Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-07 13:04:26 -07:00
Carlo Marcelo Arenas Belón	0555e4af58	t/t0000-basic: make sure subtests also use TEST_SHELL_PATH `3f824e91c8` (t/Makefile: introduce TEST_SHELL_PATH, 2017-12-08) allows for setting a shell for running the tests, but the generated subtests weren't updated. Correct that and while at it update it to use write_script. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-07 13:01:32 -07:00
Jonathan Tan	2b695ecd74	t5500: count objects through stderr, not trace In two tests introduced by `4fa3f00abb` ("fetch-pack: in protocol v2, in_vain only after ACK", 2020-04-28) and `2f0a093dd6` ("fetch-pack: in protocol v2, reset in_vain upon ACK", 2020-04-28), the count of objects downloaded is checked by grepping for a specific message in the packet trace. However, this is flaky as that specific message may be delivered over 2 or more packet lines. Instead, grep over stderr, just like the "fetch creating new shallow root" test in the same file. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-06 15:38:06 -07:00
Junio C Hamano	da05cacd8a	Merge branch 'js/partial-urlmatch-2.17' Recent updates broke parsing of "credential.<url>.<key>" where <url> is not a full URL (e.g. [credential "https://"] helper = ...) stopped working, which has been corrected. * js/partial-urlmatch-2.17: credential: handle `credential.<partial-URL>.<key>` again credential: optionally allow partial URLs in credential_from_url_gently() credential: fix grammar	2020-05-05 14:54:29 -07:00
Junio C Hamano	1d7e9c4c4e	Merge branch 'tb/commit-graph-perm-bits' Some of the files commit-graph subsystem keeps on disk did not correctly honor the core.sharedRepository settings and some were left read-write. * tb/commit-graph-perm-bits: commit-graph.c: make 'commit-graph-chain's read-only commit-graph.c: ensure graph layers respect core.sharedRepository commit-graph.c: write non-split graphs as read-only lockfile.c: introduce 'hold_lock_file_for_update_mode' tempfile.c: introduce 'create_tempfile_mode'	2020-05-05 14:54:28 -07:00
Junio C Hamano	e34acbdc43	Merge branch 'jk/test-fail-prereqs-fix' Test update. * jk/test-fail-prereqs-fix: t0000: disable GIT_TEST_FAIL_PREREQS in sub-tests	2020-05-05 14:54:27 -07:00
Junio C Hamano	cdfa156a93	Merge branch 'dd/iso-8601-updates' The approxidate parser learns to parse seconds with fraction. * dd/iso-8601-updates: date.c: allow compact version of ISO-8601 datetime date.c: skip fractional second part of ISO-8601 date.c: validate and set time in a helper function date.c: s/is_date/set_date/	2020-05-05 14:54:26 -07:00
Junio C Hamano	fd65fc3960	Merge branch 'bc/wildcard-credential' Update the parser used for credential.<URL>.<variable> configuration, to handle <URL>s with '/' in them correctly. * bc/wildcard-credential: credential: fix matching URLs with multiple levels in path	2020-05-05 14:54:26 -07:00
Eric Sunshine	088018e34d	restore: default to HEAD when combining --staged and --worktree By default, files are restored from the index for --worktree, and from HEAD for --staged. When --worktree and --staged are combined, --source must be specified to disambiguate the restore source[1], thus making it cumbersome to restore a file in both the worktree and the index. However, HEAD is also a reasonable default for --worktree when combined with --staged, so make it the default anytime --staged is used (whether combined with --worktree or not). [1]: Due to an oversight, the --source requirement, though documented, is not actually enforced. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-05 11:27:38 -07:00
Jeff King	9639474b6d	pack-bitmap: pass object filter to fill-in traversal Sometimes a bitmap traversal still has to walk some commits manually, because those commits aren't included in the bitmap packfile (e.g., due to a push or commit since the last full repack). If we're given an object filter, we don't pass it down to this traversal. It's not necessary for correctness because the bitmap code has its own filters to post-process the bitmap result (which it must, to filter out the objects that _are_ mentioned in the bitmapped packfile). And with blob filters, there was no performance reason to pass along those filters, either. The fill-in traversal could omit them from the result, but it wouldn't save us any time to do so, since we'd still have to walk each tree entry to see if it's a blob or not. But now that we support tree filters, there's opportunity for savings. A tree:depth=0 filter means we can avoid accessing trees entirely, since we know we won't them (or any of the subtrees or blobs they point to). The new test in p5310 shows this off (the "partial bitmap" state is one where HEAD~100 and its ancestors are all in a bitmapped pack, but HEAD~100..HEAD are not). Here are the results (run against linux.git): Test HEAD^ HEAD ------------------------------------------------------------------------------------------------- [...] 5310.16: rev-list with tree filter (partial bitmap) 0.19(0.17+0.02) 0.03(0.02+0.01) -84.2% The absolute number of savings isn't _huge_, but keep in mind that we only omitted 100 first-parent links (in the version of linux.git here, that's 894 actual commits). In a more pathological case, we might have a much larger proportion of non-bitmapped commits. I didn't bother creating such a case in the perf script because the setup is expensive, and this is plenty to show the savings as a percentage. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-04 21:57:58 -07:00
Taylor Blau	b0a8d4820b	pack-bitmap.c: support 'tree:0' filtering In the previous patch, we made it easy to define other filters that exclude all objects of a certain type. Use that in order to implement bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is equal to 0. The general case is not helped by bitmaps, since for values of 'n > 0', the object filtering machinery requires a full-blown tree traversal in order to determine the depth of a given tree. Caching this is non-obvious, too, since the same tree object can have a different depth depending on the context (e.g., a tree was moved up in the directory hierarchy between two commits). But, the 'n = 0' case can be helped, and this patch does so. Running p5310.11 in this tree and on master with the kernel, we can see that this case is helped substantially: Test master this tree -------------------------------------------------------------------------------- 5310.11: rev-list count with tree:0 10.68(10.39+0.27) 0.06(0.04+0.01) -99.4% Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-04 21:57:58 -07:00
Jeff King	7c5045fc18	ref-filter: apply fallback refname sort only after all user sorts Commit `9e468334b4` (ref-filter: fallback on alphabetical comparison, 2015-10-30) taught ref-filter's sort to fallback to comparing refnames. But it did it at the wrong level, overriding the comparison result for a single "--sort" key from the user, rather than after all sort keys have been exhausted. This worked correctly for a single "--sort" option, but not for multiple ones. We'd break any ties in the first key with the refname and never evaluate the second key at all. To make matters even more interesting, we only applied this fallback sometimes! For a field like "taggeremail" which requires a string comparison, we'd truly return the result of strcmp(), even if it was 0. But for numerical "value" fields like "taggerdate", we did apply the fallback. And that's why our multiple-sort test missed this: it uses taggeremail as the main comparison. So let's start by adding a much more rigorous test. We'll have a set of commits expressing every combination of two tagger emails, dates, and refnames. Then we can confirm that our sort is applied with the correct precedence, and we'll be hitting both the string and value comparators. That does show the bug, and the fix is simple: moving the fallback to the outer compare_refs() function, after all ref_sorting keys have been exhausted. Note that in the outer function we don't have an "ignore_case" flag, as it's part of each individual ref_sorting element. It's debatable what such a fallback should do, since we didn't use the user's keys to match. But until now we have been trying to respect that flag, so the least-invasive thing is to try to continue to do so. Since all callers in the current code either set the flag for all keys or for none, we can just pull the flag from the first key. In a hypothetical world where the user really can flip the case-insensitivity of keys separately, we may want to extend the code to distinguish that case from a blanket "--ignore-case". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-04 13:44:46 -07:00
Jeff King	76f9e569ad	ref-filter: apply --ignore-case to all sorting keys All of the ref-filter users (for-each-ref, branch, and tag) take an --ignore-case option which makes filtering and sorting case-insensitive. However, this option was applied only to the first element of the ref_sorting list. So: git for-each-ref --ignore-case --sort=refname would do what you expect, but: git for-each-ref --ignore-case --sort=refname --sort=taggername would sort the primary key (taggername) case-insensitively, but sort the refname case-sensitively. We have two options here: - teach callers to set ignore_case on the whole list - replace the ref_sorting list with a struct that contains both the list of sorting keys, as well as options that apply to _all_ keys I went with the first one here, as it gives more flexibility if we later want to let the users set the flag per-key (presumably through some special syntax when defining the key; for now it's all or nothing through --ignore-case). The new test covers this by sorting on both tagger and subject case-insensitively, which should compare "a" and "A" identically, but still sort them before "b" and "B". We'll break ties by sorting on the refname to give ourselves a stable output (this is actually supposed to be done automatically, but there's another bug which will be fixed in the next commit). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-04 13:41:20 -07:00
Derrick Stolee	ace224ac5f	sparse-checkout: stop blocking empty workdirs Remove the error condition when updating the sparse-checkout leaves an empty working directory. This behavior was added in `9e1afb167` (sparse checkout: inhibit empty worktree, 2009-08-20). The comment was added in `a7bc906f2` (Add explanation why we do not allow to sparse checkout to empty working tree, 2011-09-22) in response to a "dubious" comment in `84563a624` (unpack-trees.c: cosmetic fix, 2010-12-22). With the recent "cone mode" and "git sparse-checkout init [--cone]" command, it is common to set a reasonable sparse-checkout pattern set of /* !/*/ which matches only files at root. If the repository has no such files, then their "git sparse-checkout init" command will fail. Now that we expect this to be a common pattern, we should not have the commands fail on an empty working directory. If it is a confusing result, then the user can recover with "git sparse-checkout disable" or "git sparse-checkout set". This is especially simple when using cone mode. Reported-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-04 12:57:15 -07:00
Carlo Marcelo Arenas Belón	c03859a665	credential-store: ignore bogus lines from store file With the added checks for invalid URLs in credentials, any locally modified store files which might have empty lines or even comments were reported[1] failing to parse as valid credentials. Instead of doing a hard check for credentials, do a soft one and therefore avoid the reported fatal error. While at it add tests for all known corruptions that are currently ignored to keep track of them and avoid the risk of regressions. [1] https://stackoverflow.com/a/61420852/5005936 Reported-by: Dirk <dirk@ed4u.de> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Based-on-patch-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-02 18:10:38 -07:00
Ash Holland	09dad9256a	userdiff: support Markdown It's typical to find Markdown documentation alongside source code, and having better context for documentation changes is useful; see also commit `69f9c87d4` (userdiff: add support for Fountain documents, 2015-07-21). The pattern is based on the CommonMark specification 0.29, section 4.2 <https://spec.commonmark.org/> but doesn't match empty headings, as seeing them in a hunk header is unlikely to be useful. Only ATX headings are supported, as detecting setext headings would require printing the line before a pattern matches, or matching a multiline pattern. The word-diff pattern is the same as the pattern for HTML, because many Markdown parsers accept inline HTML. Signed-off-by: Ash Holland <ash@sorrel.sh> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-02 18:04:12 -07:00
Junio C Hamano	0b07eecf6e	Merge branch 'jt/v2-fetch-nego-fix' The upload-pack protocol v2 gave up too early before finding a common ancestor, resulting in a wasteful fetch from a fork of a project. This has been corrected to match the behaviour of v0 protocol. * jt/v2-fetch-nego-fix: fetch-pack: in protocol v2, reset in_vain upon ACK fetch-pack: in protocol v2, in_vain only after ACK fetch-pack: return enum from process_acks()	2020-05-01 13:40:00 -07:00
Junio C Hamano	dd094c2b75	Merge branch 'es/bugreport' The "bugreport" tool. * es/bugreport: bugreport: drop extraneous includes bugreport: add compiler info bugreport: add uname info bugreport: gather git version and build info bugreport: add tool to generate debugging info help: move list_config_help to builtin/help	2020-05-01 13:39:59 -07:00
Junio C Hamano	bf04590ecd	Merge branch 'dd/sparse-fixes' Compilation fix. * dd/sparse-fixes: progress.c: silence cgcc suggestion about internal linkage graph.c: limit linkage of internal variable compat/regex: move stdlib.h up in inclusion chain test-parse-pathspec-file.c: s/0/NULL/ for pointer type	2020-05-01 13:39:56 -07:00
Junio C Hamano	6d56d4c7dc	Merge branch 'ds/blame-on-bloom' "git blame" learns to take advantage of the "changed-paths" Bloom filter stored in the commit-graph file. * ds/blame-on-bloom: test-bloom: check that we have expected arguments test-bloom: fix some whitespace issues blame: drop unused parameter from maybe_changed_path blame: use changed-path Bloom filters tests: write commit-graph with Bloom filters revision: complicated pathspecs disable filters	2020-05-01 13:39:54 -07:00
Junio C Hamano	9b6606f43d	Merge branch 'gs/commit-graph-path-filter' Introduce an extension to the commit-graph to make it efficient to check for the paths that were modified at each commit using Bloom filters. * gs/commit-graph-path-filter: bloom: ignore renames when computing changed paths commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag t4216: add end to end tests for git log with Bloom filters revision.c: add trace2 stats around Bloom filter usage revision.c: use Bloom filters to speed up path based revision walks commit-graph: add --changed-paths option to write subcommand commit-graph: reuse existing Bloom filters during write commit-graph: write Bloom filters to commit graph file commit-graph: examine commits by generation number commit-graph: examine changed-path objects in pack order commit-graph: compute Bloom filters for changed paths diff: halt tree-diff early after max_changes bloom.c: core Bloom filter implementation for changed paths. bloom.c: introduce core Bloom filter constructs bloom.c: add the murmur3 hash implementation commit-graph: define and use MAX_NUM_CHUNKS	2020-05-01 13:39:53 -07:00
Junio C Hamano	cf054f817a	Merge branch 'tb/commit-graph-fd-exhaustion-fix' The commit-graph code exhausted file descriptors easily when it does not have to. * tb/commit-graph-fd-exhaustion-fix: commit-graph: close descriptors after mmap commit-graph.c: gracefully handle file descriptor exhaustion t/test-lib.sh: make ULIMIT_FILE_DESCRIPTORS available to tests commit-graph.c: don't use discarded graph_name in error	2020-05-01 13:39:53 -07:00
Junio C Hamano	6a1c17d05b	Merge branch 'tb/commit-graph-split-strategy' "git commit-graph write" learned different ways to write out split files. * tb/commit-graph-split-strategy: Revert "commit-graph.c: introduce '--[no-]check-oids'" commit-graph.c: introduce '--[no-]check-oids' commit-graph.h: replace 'commit_hex' with 'commits' oidset: introduce 'oidset_size' builtin/commit-graph.c: introduce split strategy 'replace' builtin/commit-graph.c: introduce split strategy 'no-merge' builtin/commit-graph.c: support for '--split[=<strategy>]' t/helper/test-read-graph.c: support commit-graph chains	2020-05-01 13:39:52 -07:00
Junio C Hamano	2b4ff3d3dc	Merge branch 'tb/reset-shallow' Fix in-core inconsistency after fetching into a shallow repository that broke the code to write out commit-graph. * tb/reset-shallow: shallow.c: use '{commit,rollback}_shallow_file' t5537: use test_write_lines and indented heredocs for readability	2020-05-01 13:39:51 -07:00
Junio C Hamano	cc0c732f6b	Merge branch 'dd/mailinfo-with-nul' Tighten "git mailinfo" to notice and error out when decoded result contains NUL in it. * dd/mailinfo-with-nul: mailinfo: disallow NUL character in mail's header mailinfo.c: avoid strlen on strings that can contains NUL t4254: merge 2 steps of a single test	2020-05-01 13:39:51 -07:00
Junio C Hamano	81bfe5434a	Merge branch 'dl/test-must-fail-fixes-4' Test clean-up. * dl/test-must-fail-fixes-4: t9819: don't use test_must_fail with p4 t9164: use test_must_fail only on git commands t9160: use test_path_is_missing() t9141: use test_path_is_missing() t7508: don't use `test_must_fail test_cmp` t7408: replace incorrect uses of test_must_fail t6030: use test_path_is_missing()	2020-05-01 13:39:50 -07:00
Derrick Stolee	54c337be9c	test-bloom: fix usage typo Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-01 11:41:21 -07:00
Junio C Hamano	d2ea03ddee	Merge branch 'ps/transactional-update-ref-stdin' "git update-ref --stdin" learned a handful of new verbs to let the user control ref update transactions more explicitly, which helps as an ingredient to implement two-phase commit-style atomic ref-updates across multiple repositories. * ps/transactional-update-ref-stdin: update-ref: implement interactive transaction handling update-ref: read commands in a line-wise fashion update-ref: move transaction handling into `update_refs_stdin()` update-ref: pass end pointer instead of strbuf update-ref: drop unused argument for `parse_refname` update-ref: organize commands in an array strbuf: provide function to append whole lines git-update-ref.txt: add missing word refs: fix segfault when aborting empty transaction	2020-04-29 16:15:31 -07:00
Junio C Hamano	6eacc39b6d	Merge branch 'en/fill-directory-exponential' The directory traversal code had redundant recursive calls which made its performance characteristics exponential with respect to the depth of the tree, which was corrected. * en/fill-directory-exponential: completion: fix 'git add' on paths under an untracked directory Fix error-prone fill_directory() API; make it only return matches dir: replace double pathspec matching with single in treat_directory() dir: include DIR_KEEP_UNTRACKED_CONTENTS handling in treat_directory() dir: replace exponential algorithm with a linear one dir: refactor treat_directory to clarify control flow dir: fix confusion based on variable tense dir: fix broken comment dir: consolidate treat_path() and treat_one_path() dir: fix simple typo in comment t3000: add more testcases testing a variety of ls-files issues t7063: more thorough status checking	2020-04-29 16:15:31 -07:00
Junio C Hamano	48eee46d6a	Merge branch 'en/sparse-checkout' "sparse-checkout" UI improvements. * en/sparse-checkout: sparse-checkout: provide a new reapply subcommand unpack-trees: failure to set SKIP_WORKTREE bits always just a warning unpack-trees: provide warnings on sparse updates for unmerged paths too unpack-trees: make sparse path messages sound like warnings unpack-trees: split display_error_msgs() into two unpack-trees: rename ERROR_* fields meant for warnings to WARNING_* unpack-trees: move ERROR_WOULD_LOSE_SUBMODULE earlier sparse-checkout: use improved unpack_trees porcelain messages sparse-checkout: use new update_sparsity() function unpack-trees: add a new update_sparsity() function unpack-trees: pull sparse-checkout pattern reading into a new function unpack-trees: do not mark a dirty path with SKIP_WORKTREE unpack-trees: allow check_updates() to work on a different index t1091: make some tests a little more defensive against failures unpack-trees: simplify pattern_list freeing unpack-trees: simplify verify_absent_sparse() unpack-trees: remove unused error type unpack-trees: fix minor typo in comment	2020-04-29 16:15:30 -07:00
Junio C Hamano	8cb514d1cb	Merge branch 'dd/ci-swap-azure-pipelines-with-github-actions' Update the CI configuration to use GitHub Actions, retiring the one based on Azure Pipelines. * dd/ci-swap-azure-pipelines-with-github-actions: ci: let GitHub Actions upload failed tests' directories ci: add a problem matcher for GitHub Actions tests: when run in Bash, annotate test failures with file name/line number ci: retire the Azure Pipelines definition README: add a build badge for the GitHub Actions runs ci: configure GitHub Actions for CI/PR ci: run gem with sudo to install asciidoctor ci: explicit install all required packages ci: fix the `jobname` of the `GETTEXT_POISON` job ci/lib: set TERM environment variable if not exist ci/lib: allow running in GitHub Actions ci/lib: if CI type is unknown, show the environment variables	2020-04-29 16:15:29 -07:00
Junio C Hamano	3afdeef33e	Merge branch 'dl/merge-autostash-rebase-quit-fix' The stash entry created by "git rebase --autosquash" to keep the initial dirty state were discarded by mistake upon "git rebase --quit", which has been corrected. * dl/merge-autostash-rebase-quit-fix: rebase: save autostash entry into stash reflog on --quit	2020-04-29 16:15:27 -07:00
Junio C Hamano	bf10200871	Merge branch 'dl/merge-autostash' "git merge" learns the "--autostash" option. * dl/merge-autostash: (22 commits) pull: pass --autostash to merge t5520: make test_pull_autostash() accept expect_parent_num merge: teach --autostash option sequencer: implement apply_autostash_oid() sequencer: implement save_autostash() sequencer: unlink autostash in apply_autostash() sequencer: extract perform_autostash() from rebase rebase: generify create_autostash() rebase: extract create_autostash() reset: extract reset_head() from rebase rebase: generify reset_head() rebase: use apply_autostash() from sequencer.c sequencer: rename stash_sha1 to stash_oid sequencer: make apply_autostash() accept a path rebase: use read_oneliner() sequencer: make read_oneliner() extern sequencer: configurably warn on non-existent files sequencer: make read_oneliner() accept flags sequencer: make file exists check more efficient sequencer: stop leaking buf ...	2020-04-29 16:15:27 -07:00
Junio C Hamano	dbd5e0a186	Revert "commit-graph.c: introduce '--[no-]check-oids'" This reverts commit `7a9ce0269b`, which has not yet gained consensus.	2020-04-29 12:44:40 -07:00
Taylor Blau	45a4365cb6	commit-graph.c: make 'commit-graph-chain's read-only In a previous commit, we made incremental graph layers read-only by using 'git_mkstemp_mode' with permissions '0444'. There is no reason that 'commit-graph-chain's should be modifiable by the user, since they are generated at a temporary location and then atomically renamed into place. To ensure that these files are read-only, too, use 'hold_lock_file_for_update_mode' with the same read-only permission bits, and let the umask and 'adjust_shared_perm' take care of the rest. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-29 12:35:30 -07:00
Taylor Blau	f4d62847a4	commit-graph.c: ensure graph layers respect core.sharedRepository Non-layered commit-graphs use 'adjust_shared_perm' to make the commit-graph file readable (or not) to a combination of the user, group, and others. Call 'adjust_shared_perm' for split-graph layers to make sure that these also respect 'core.sharedRepository'. The 'commit-graph-chain' file already respects this configuration since it uses 'hold_lock_file_for_update' (which calls 'adjust_shared_perm' eventually in 'create_tempfile_mode'). Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-29 12:35:30 -07:00
Taylor Blau	1f9becaedc	commit-graph.c: write non-split graphs as read-only In the previous commit, Git learned 'hold_lock_file_for_update_mode' to allow the caller to specify the permission bits (prior to further adjustment by the umask and shared repository permissions) used when acquiring a temporary file. Use this in the commit-graph machinery for writing a non-split graph to acquire an opened temporary file with permissions read-only permissions to match the split behavior. (In the split case, Git uses git_mkstemp_mode' for each of the commit-graph layers with permission bits '0444'). One can notice this discrepancy when moving a non-split graph to be part of a new chain. This causes a commit-graph chain where all layers have read-only permission bits, except for the base layer, which is writable for the current user. Resolve this discrepancy by using the new 'hold_lock_file_for_update_mode' and passing the desired permission bits. Doing so causes some test fallout in t5318 and t6600. In t5318, this occurs in tests that corrupt a commit-graph file by writing into it. For these, 'chmod u+w'-ing the file beforehand resolves the issue. The additional spot in 'corrupt_graph_verify' is necessary because of the extra 'git commit-graph write' beforehand (which does rewrite the commit-graph file). In t6600, this is caused by copying a read-only commit-graph file into place and then trying to replace it. For these, make these files writable. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-29 12:35:30 -07:00
Denton Liu	add2452348	lib-submodule-update: consolidate --recurse-submodules Both test_submodule_switch_recursing_with_args() and test_submodule_forced_switch_recursing_with_args() call the internal function test_submodule_recursing_with_args_common() with the final argument of `--recurse-submodules`. Consolidate this duplication by appending the argument in test_submodule_recursing_with_args_common(). Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-29 10:36:55 -07:00
Denton Liu	27691ca27e	lib-submodule-update: add space after function name In the shell scripts in this codebase, the usual style is to include a space between the function name and the (). Add these missing spaces to conform to the usual style of the code. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-29 10:36:55 -07:00
Johannes Schindelin	9a121b0d22	credential: handle `credential.<partial-URL>.<key>` again In the patches for CVE-2020-11008, the ability to specify credential settings in the config for partial URLs got lost. For example, it used to be possible to specify a credential helper for a specific protocol: [credential "https://"] helper = my-https-helper Likewise, it used to be possible to configure settings for a specific host, e.g.: [credential "dev.azure.com"] useHTTPPath = true Let's reinstate this behavior. While at it, increase the test coverage to document and verify the behavior with a couple other categories of partial URLs. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-29 08:37:36 -07:00
Junio C Hamano	33a1060988	Merge branch 'mt/grep-cquote-path' "git grep" did not quote a path with unusual character like other commands (like "git diff", "git status") do, but did quote when run from a subdirectory, both of which has been corrected. * mt/grep-cquote-path: grep: follow conventions for printing paths w/ unusual chars	2020-04-28 15:50:09 -07:00
Junio C Hamano	d3fc8dc53a	Merge branch 'ds/log-exclude-decoration-config' The "--decorate-refs" and "--decorate-refs-exclude" options "git log" takes have learned a companion configuration variable log.excludeDecoration that sits at the lowest priority in the family. * ds/log-exclude-decoration-config: log: add log.excludeDecoration config option log-tree: make ref_filter_match() a helper method	2020-04-28 15:50:08 -07:00
Junio C Hamano	93d1f196a9	Merge branch 'vd/range-diff-with-custom-pretty-format-fix' "git range-diff" fixes. * vd/range-diff-with-custom-pretty-format-fix: range-diff: avoid negative string precision range-diff: fix a crash in parsing git-log output	2020-04-28 15:50:08 -07:00
Junio C Hamano	5a96715eb7	Merge branch 'tb/diff-tree-with-notes' "git diff-tree --pretty --notes" used to hit an assertion failure, as it forgot to initialize the notes subsystem. * tb/diff-tree-with-notes: diff-tree.c: load notes machinery when required	2020-04-28 15:50:07 -07:00
Junio C Hamano	e81ecff10a	Merge branch 'js/stash-p-fix' Allowing the user to split a patch hunk while "git stash -p" does not work well; a band-aid has been added to make this (partially) work better. * js/stash-p-fix: stash -p: (partially) fix bug concerning split hunks t3904: fix incorrect demonstration of a bug	2020-04-28 15:50:06 -07:00
Junio C Hamano	5b6864ca44	Merge branch 'jx/atomic-push' "git push --atomic" used to show failures for refs that weren't even pushed, which has been corrected. * jx/atomic-push: transport-helper: new method reject_atomic_push() transport-helper: mark failure for atomic push send-pack: mark failure of atomic push properly t5543: never report what we do not push send-pack: fix inconsistent porcelain output	2020-04-28 15:50:04 -07:00
Junio C Hamano	8f5dc5a4af	Merge branch 'jt/avoid-prefetch-when-able-in-diff' "git diff" in a partial clone learned to avoid lazy loading blob objects in more casese when they are not needed. * jt/avoid-prefetch-when-able-in-diff: diff: restrict when prefetching occurs diff: refactor object read diff: make diff_populate_filespec_options struct promisor-remote: accept 0 as oid_nr in function	2020-04-28 15:50:04 -07:00
Junio C Hamano	0ccf0bafff	Merge branch 'ds/t5319-touch-fix' Tests update to use "test-chmtime" instead of "touch -t". * ds/t5319-touch-fix: t5319: replace 'touch -m' with 'test-tool chmtime'	2020-04-28 15:50:02 -07:00
Junio C Hamano	25b336421f	Merge branch 'ds/commit-graph-expiry-fix' "git commit-graph write --expire-time=<timestamp>" did not use the given timestamp correctly, which has been corrected. * ds/commit-graph-expiry-fix: commit-graph: fix buggy --expire-time option	2020-04-28 15:50:02 -07:00
Junio C Hamano	9404128b34	Merge branch 'jc/log-no-mailmap' "git log" learns "--[no-]mailmap" as a synonym to "--[no-]use-mailmap" * jc/log-no-mailmap: log: give --[no-]use-mailmap a more sensible synonym --[no-]mailmap clone: reorder --recursive/--recurse-submodules parse-options: teach "git cmd -h" to show alias as alias	2020-04-28 15:50:00 -07:00
Junio C Hamano	6ae3c79788	Merge branch 'jk/fast-import-use-hashmap' The custom hash function used by "git fast-import" has been replaced with the one from hashmap.c, which gave us a nice performance boost. * jk/fast-import-use-hashmap: fast-import: replace custom hash with hashmap.c	2020-04-28 15:49:58 -07:00
Junio C Hamano	b07c72100f	Merge branch 'jc/doc-test-leaving-early' Document the recommended way to abort a failing test early (e.g. by exiting a loop), which is to say "return 1". * jc/doc-test-leaving-early: t/README: suggest how to leave test early with failure	2020-04-28 15:49:55 -07:00
Junio C Hamano	28ba5a7b27	Merge branch 'dd/test-with-busybox' Various tests have been updated to work around issues found with shell utilities that come with busybox etc. * dd/test-with-busybox: t5703: feed raw data into test-tool unpack-sideband t4124: tweak test so that non-compliant diff(1) can also be used t7063: drop non-POSIX argument "-ls" from find(1) t5616: use rev-parse instead to get HEAD's object_id t5003: skip conversion test if unzip -a is unavailable t5003: drop the subshell in test_lazy_prereq test-lib-functions: test_cmp: eval $GIT_TEST_CMP t4061: use POSIX compliant regex(7)	2020-04-28 15:49:55 -07:00
Denton Liu	9b2df3e8d0	rebase: save autostash entry into stash reflog on --quit In `a03b55530a` (merge: teach --autostash option, 2020-04-07), the --autostash option was introduced for `git merge`. Notably, when `git merge --quit` is run with an autostash entry present, it is saved into the stash reflog. This is contrasted with the current behaviour of `git rebase --quit` where the autostash entry is simply just dropped out of existence. Adopt the behaviour of `git merge --quit` in `git rebase --quit` and save the autostash entry into the stash reflog instead of just deleting it. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-28 12:35:38 -07:00
Jeff King	797e2cfd78	t0000: disable GIT_TEST_FAIL_PREREQS in sub-tests The test added by `477dcaddb6` (tests: do not let lazy prereqs inside `test_expect_*` turn off tracing, 2020-03-26) runs a sub-test script that traces a test with a lazy prereq, like: test_have_prereq LAZY && echo trace That won't work if GIT_TEST_FAIL_PREREQS is set in the environment, because our have_prereq will report failure, and we won't run the echo at all. We could work around this by avoiding the &&-chain, but we can fix this and any future tests at once by unsetting that variable for our sub-tests. These are meant to be controlled environments where we test the test-suite itself; the outer test snippet should be in charge of the sub-test environment, not whatever mode the user happens to be running in. Reported-by: Son Luong Ngoc <sluongng@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-28 10:26:01 -07:00
Jonathan Tan	2f0a093dd6	fetch-pack: in protocol v2, reset in_vain upon ACK In the function process_acks() in fetch-pack.c, the variable received_ack is meant to track that an ACK was received, but it was never set. This results in negotiation terminating prematurely through the in_vain counter, when the counter should have been reset upon every ACK. Therefore, reset the in_vain counter upon every ACK. Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-28 09:55:06 -07:00
Jonathan Tan	4fa3f00abb	fetch-pack: in protocol v2, in_vain only after ACK When fetching, Git stops negotiation when it has sent at least MAX_IN_VAIN (which is 256) "have" lines without having any of them ACK-ed. But this is supposed to trigger only after the first ACK, as pack-protocol.txt says: However, the 256 limit only turns on in the canonical client implementation if we have received at least one "ACK %s continue" during a prior round. This helps to ensure that at least one common ancestor is found before we give up entirely. The code path for protocol v0 observes this, but not protocol v2, resulting in shorter negotiation rounds but significantly larger packfiles. Teach the code path for protocol v2 to check this criterion only after at least one ACK was received. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-28 09:55:02 -07:00
brian m. carlson	b44d0118ac	credential: fix matching URLs with multiple levels in path `46fd7b3900` ("credential: allow wildcard patterns when matching config", 2020-02-20) introduced support for matching credential helpers using urlmatch. In doing so, it introduced code to percent-encode the paths we get from the credential helper so that they could be effectively matched by the urlmatch code. Unfortunately, that code had a bug: it percent-encoded the slashes in the path, resulting in any URL path that contained multiple levels (i.e., a directory component) not matching. We are currently the only caller of the percent-encoding code and could simply change it not to encode slashes. However, we still want to encode slashes in the username component, so we need to have both behaviors available. So instead, let's add a flag to control encoding slashes, which is the behavior we want here, and use it when calling the code in this case. Add a test for credential helper URLs using multiple slashes in the path, which our test suite previously lacked, as well as one ensuring that we handle usernames with slashes gracefully. Since we're testing other percent-encoding handling, let's add one for non-ASCII UTF-8 characters as well. Reported-by: Ilya Tretyakov <it@it3xl.ru> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-27 11:42:20 -07:00
Đoàn Trần Công Danh	3cacb9aaf4	progress.c: silence cgcc suggestion about internal linkage Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Reviewed-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-27 11:21:28 -07:00
Đoàn Trần Công Danh	1437ebf74a	test-parse-pathspec-file.c: s/0/NULL/ for pointer type Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Reviewed-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-27 11:21:12 -07:00
Đoàn Trần Công Danh	544ed961a5	date.c: allow compact version of ISO-8601 datetime Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-24 14:06:09 -07:00
Đoàn Trần Công Danh	b784840ca8	date.c: skip fractional second part of ISO-8601 git-commit(1) says ISO-8601 is one of our supported date format. ISO-8601 allows timestamps to have a fractional number of seconds. We represent time only in terms of whole seconds, so we never bothered parsing fractional seconds. However, it's better for us to parse and throw away the fractional part than to refuse to parse the timestamp at all. And refusing parsing fractional second part may confuse the parse to think fractional and timezone as day and month in this example: 2008-02-14 20:30:45.019-04:00 While doing this, make sure that we only interpret the number after the second and the dot as fractional when and only when the date is known, since only ISO-8601 allows the fractional part, and we've taught our users to interpret "12:34:56.7.days.ago" as a way to specify a time relative to current time. Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-24 14:06:09 -07:00
Taylor Blau	37b9dcabfc	shallow.c: use '{commit,rollback}_shallow_file' In `bd0b42aed3` (fetch-pack: do not take shallow lock unnecessarily, 2019-01-10), the author noted that 'is_repository_shallow' produces visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'. This is a problem for e.g., fetching with '--update-shallow' in a shallow repository with 'fetch.writeCommitGraph' enabled, since the update to '.git/shallow' will cause Git to think that the repository isn't shallow when it is, thereby circumventing the commit-graph compatibility check. This causes problems in shallow repositories with at least shallow refs that have at least one ancestor (since the client won't have those objects, and therefore can't take the reachability closure over commits when writing a commit-graph). Address this by introducing thin wrappers over 'commit_lock_file' and 'rollback_lock_file' for use specifically when the lock is held over '.git/shallow'. These wrappers (appropriately called 'commit_shallow_file' and 'rollback_shallow_file') call into their respective functions in 'lockfile.h', but additionally reset validity checks used by the shallow machinery. Replace each instance of 'commit_lock_file' and 'rollback_lock_file' with 'commit_shallow_file' and 'rollback_shallow_file' when the lock being held is over the '.git/shallow' file. As a result, 'prune_shallow' can now only be called once (since 'check_shallow_file_for_update' will die after calling 'reset_repository_shallow'). But, this is OK since we only call 'prune_shallow' at most once per process. Helped-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-24 13:56:39 -07:00
Taylor Blau	8a8da49728	t5537: use test_write_lines and indented heredocs for readability A number of spots in t5537 use the non-indented heredoc '<<EOF' when they would benefit from instead using '<<-EOF' or simply test_write_lines. In preparation for adding new tests in a good style and being consistent with the surrounding code, update the existing tests to improve their readability. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-24 13:53:06 -07:00
Jeff King	1b4c57fa87	test-bloom: check that we have expected arguments If "test-tool bloom" is not fed a command, or if arguments are missing for some commands, it will just segfault. Let's check argc and write a friendlier usage message. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-23 16:05:29 -07:00
Jeff King	24b7d1e7b0	test-bloom: fix some whitespace issues Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-23 16:05:29 -07:00
Taylor Blau	b78a556a6a	commit-graph.c: gracefully handle file descriptor exhaustion When writing a layered commit-graph, the commit-graph machinery uses 'commit_graph_filenames_after' and 'commit_graph_hash_after' to keep track of the layers in the chain that we are in the process of writing. When the number of commit-graph layers shrinks, we initialize all entries in the aforementioned arrays, because we know the structure of the new commit-graph chain immediately (since there are no new layers, there are no unknown hash values). But when the number of commit-graph layers grows (i.e., that 'num_commit_graphs_after > num_commit_graphs_before'), then we leave some entries in the filenames and hashes arrays as uninitialized, because we will fill them in later as those values become available. For instance, we rely on 'write_commit_graph_file's to store the filename and hash of the last layer in the new chain, which is the one that it is responsible for writing. But, it's possible that 'write_commit_graph_file' may fail, e.g., from file descriptor exhaustion. In this case it is possible that 'git_mkstemp_mode' will fail, and that function will return early before setting the values for the last commit-graph layer's filename and hash. This causes a number of upleasant side-effects. For instance, trying to 'free()' each entry in 'ctx->commit_graph_filenames_after' (and similarly for the hashes array) causes us to 'free()' uninitialized memory, since the area is allocated with 'malloc()' and is therefore subject to contain garbage (which is left alone when 'write_commit_graph_file' returns early). This can manifest in other issues, like a general protection fault, and/or leaving a stray 'commit-graph-chain.lock' around after the process dies. (The reasoning for this is still a mystery to me, since we'd otherwise usually expect the kernel to run tempfile.c's 'atexit()' handlers in the case of a normal death...) To resolve this, initialize the memory with 'CALLOC_ARRAY' so that uninitialized entries are filled with zeros, and can thus be 'free()'d as a noop instead of causing a fault. Helped-by: Jeff King <peff@peff.net> Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-23 14:58:52 -07:00
Taylor Blau	b30fdb4b4e	t/test-lib.sh: make ULIMIT_FILE_DESCRIPTORS available to tests In t1400 the prerequisite 'ULIMIT_FILE_DESCRIPTORS' is defined and used to effectively guard the helper function 'run_with_limited_open_files' from being used on systems that do not satisfy this prerequisite. In the subsequent patch, we will introduce another test outside of t1400 that would benefit from using this prerequisite. So, move it to 'test-lib.sh' instead so that it can be used by multiple tests. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-23 14:58:52 -07:00
Đoàn Trần Công Danh	3919997447	mailinfo: disallow NUL character in mail's header Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-22 14:01:03 -07:00
Đoàn Trần Công Danh	2a2ff60396	mailinfo.c: avoid strlen on strings that can contains NUL We're passing buffer from strbuf to reencode_string, which will call strlen(3) on that buffer, and discard the length of newly created buffer. Then, we compute the length of the return buffer to attach to strbuf. During this process, we introduce a discrimination between mail originally written in utf-8 and other encoding. * if the email was written in utf-8, we leave it as is. If there is a NUL character in that line, we complains loudly: error: a NUL byte in commit log message not allowed. * if the email was written in other encoding, we truncate the data as the NUL character in that line, then we used the truncated line for the metadata. We can do better by reusing all the available information, and call the underlying lower level function that will be called indirectly by reencode_string. By doing this, we will also postpone the NUL character processing to the commit step, which will complains about the faulty metadata. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-22 14:01:02 -07:00
Đoàn Trần Công Danh	2ed282cc0d	t4254: merge 2 steps of a single test While we are at it, make sure we run a clean up after testing. In a later patch, we will test for more corrupted patch. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-22 14:01:00 -07:00
Junio C Hamano	a397e9c236	Merge branch 'jk/credential-parsing-end-of-host-in-URL' Parsing of URL for the credential helper has been corrected. * jk/credential-parsing-end-of-host-in-URL: credential: treat "?" and "#" in URLs as end of host	2020-04-22 13:43:01 -07:00
Junio C Hamano	d6d561db1c	Merge branch 'jt/rebase-allow-duplicate' Allow "git rebase" to reapply all local commits, even if the may be already in the upstream, without checking first. * jt/rebase-allow-duplicate: rebase --merge: optionally skip upstreamed commits	2020-04-22 13:43:00 -07:00
Junio C Hamano	c7d8f69da5	Merge branch 'en/rebase-no-keep-empty' "git rebase" (again) learns to honor "--no-keep-empty", which lets the user to discard commits that are empty from the beginning (as opposed to the ones that become empty because of rebasing). The interactive rebase also marks commits that are empty in the todo. * en/rebase-no-keep-empty: rebase: fix an incompatible-options error message rebase: reinstate --no-keep-empty rebase -i: mark commits that begin empty in todo editor	2020-04-22 13:43:00 -07:00
Junio C Hamano	8b39dfdf47	Merge branch 'js/mingw-is-hidden-test-fix' A Windows-specific test element has been made more robust against misuse from both user's environment and programmer's errors. * js/mingw-is-hidden-test-fix: t: restrict `is_hidden` to be called only on Windows mingw: make test_path_is_hidden more robust t: consolidate the `is_hidden` functions	2020-04-22 13:42:59 -07:00
Junio C Hamano	9af3a7cb4d	Merge branch 'ds/revision-show-pulls' "git log" learned "--show-pulls" that helps pathspec limited history views; a merge commit that takes the whole change from a side branch, which is normally omitted from the output, is shown in addition to the commits that introduce real changes. * ds/revision-show-pulls: revision: --show-pulls adds helpful merges	2020-04-22 13:42:57 -07:00
Junio C Hamano	b3eb70e0f8	Merge branch 'js/mingw-fixes' Misc fixes for Windows. * js/mingw-fixes: mingw: help debugging by optionally executing bash with strace mingw: do not treat `COM0` as a reserved file name mingw: use modern strftime implementation if possible	2020-04-22 13:42:56 -07:00
Junio C Hamano	95ca48973d	Merge branch 'jc/missing-ref-store-fix' We've left the command line parsing of "git log :/a/b/" broken for about a full year without anybody noticing, which has been corrected. * jc/missing-ref-store-fix: repository: mark the "refs" pointer as private sha1-name: do not assume that the ref store is initialized	2020-04-22 13:42:55 -07:00
Junio C Hamano	f4216e5968	Merge branch 'eb/format-patch-no-encode-headers' The output from "git format-patch" uses RFC 2047 encoding for non-ASCII letters on From: and Subject: headers, so that it can directly be fed to e-mail programs. A new option has been added to produce these headers in raw. * eb/format-patch-no-encode-headers: format-patch: teach --no-encode-email-headers	2020-04-22 13:42:54 -07:00
Junio C Hamano	fc3f6fd7be	Merge branch 'dd/no-gpg-sign' "git rebase" learned the "--no-gpg-sign" option to countermand commit.gpgSign the user may have. * dd/no-gpg-sign: Documentation: document merge option --no-gpg-sign Documentation: merge commit-tree --[no-]gpg-sign Documentation: reword commit --no-gpg-sign Documentation: document am --no-gpg-sign cherry-pick/revert: honour --no-gpg-sign in all case rebase.c: honour --no-gpg-sign	2020-04-22 13:42:53 -07:00
Junio C Hamano	886fcb7aae	Merge branch 'js/t0007-typofix' Typofix in a test script. * js/t0007-typofix: t0007: fix a typo	2020-04-22 13:42:52 -07:00
Junio C Hamano	3aa30ccb1c	Merge branch 'en/sequencer-reflog-action' "git rebase -i" did not leave the reflog entries correctly. * en/sequencer-reflog-action: sequencer: honor GIT_REFLOG_ACTION	2020-04-22 13:42:51 -07:00
Junio C Hamano	3ea2b46628	Merge branch 'jk/use-quick-lookup-in-clone-for-tag-following' The logic to auto-follow tags by "git clone --single-branch" was not careful to avoid lazy-fetching unnecessary tags, which has been corrected. * jk/use-quick-lookup-in-clone-for-tag-following: clone: use "quick" lookup while following tags	2020-04-22 13:42:51 -07:00
Junio C Hamano	f72e06703b	Merge branch 'ag/rebase-merge-allow-ff-under-abbrev-command' "git rebase" with the merge backend did not work well when the rebase.abbreviateCommands configuration was set. * ag/rebase-merge-allow-ff-under-abbrev-command: t3432: test `--merge' with `rebase.abbreviateCommands = true', too sequencer: don't abbreviate a command if it doesn't have a short form	2020-04-22 13:42:50 -07:00
Junio C Hamano	a768f866e9	Merge branch 'jk/oid-array-cleanups' Code cleanup. * jk/oid-array-cleanups: oidset: stop referring to sha1-array ref-filter: stop referring to "sha1 array" bisect: stop referring to sha1_array test-tool: rename sha1-array to oid-array oid_array: rename source file from sha1-array oid_array: use size_t for iteration oid_array: use size_t for count and allocation	2020-04-22 13:42:49 -07:00
Junio C Hamano	220546156a	Merge branch 'jk/p5310-drop-non-bitmap-timing' Perf-test update. * jk/p5310-drop-non-bitmap-timing: p5310: stop timing non-bitmap pack-to-disk	2020-04-22 13:42:44 -07:00
Junio C Hamano	5ee5788af6	Merge branch 'jk/harden-protocol-v2-delim-handling' The server-end of the v2 protocol to serve "git clone" and "git fetch" was not prepared to see a delim packets at unexpected places, which led to a crash. * jk/harden-protocol-v2-delim-handling: test-lib-functions: simplify packetize() stdin code upload-pack: handle unexpected delim packets test-lib-functions: make packetize() more efficient	2020-04-22 13:42:44 -07:00
Junio C Hamano	dfe48154b1	Merge branch 'jk/test-cleanup' Test cleanup. * jk/test-cleanup: t/lib-*.sh: drop executable bit t/lib-credential.sh: drop shebang line	2020-04-22 13:42:44 -07:00
Junio C Hamano	8777ec119e	Merge branch 'dr/midx-avoid-int-underflow' When fed a midx that records no objects, some codepaths tried to loop from 0 through (num_objects-1), which, due to integer arithmetic wrapping around, made it nonsense operation with out of bounds array accesses. The code has been corrected to reject such an midx file. * dr/midx-avoid-int-underflow: midx.c: fix an integer underflow	2020-04-22 13:42:44 -07:00
Junio C Hamano	7a8e6305d8	Merge branch 'dl/test-must-fail-fixes-3' Test clean-up continues. * dl/test-must-fail-fixes-3: t5801: teach compare_refs() to accept ! t5612: stop losing return codes of git commands t5612: don't use `test_must_fail test_cmp` t5607: reorder `nongit test_must_fail` t5550: simplify no matching line check t5512: stop losing return codes of git commands t5512: stop losing git exit code in here-docs t5512: don't use `test_must_fail test_cmp`	2020-04-22 13:42:44 -07:00
Junio C Hamano	810dc6481a	Merge branch 'js/trace2-env-vars' Trace2 enhancement to allow logging of the environment variables. * js/trace2-env-vars: trace2: teach Git to log environment variables	2020-04-22 13:42:44 -07:00
Junio C Hamano	45fbdf54a2	Merge branch 'mt/test-lib-bundled-short-options' Minor test usability improvement. * mt/test-lib-bundled-short-options: test-lib: allow short options to be bundled	2020-04-22 13:42:43 -07:00
Junio C Hamano	d72fa768f4	Merge branch 'js/test-junit-finalization-fix' Test fix. * js/test-junit-finalization-fix: tests(junit-xml): avoid invalid XML	2020-04-22 13:42:43 -07:00
Junio C Hamano	d82c528fc1	Merge branch 'js/tests-gpg-integration-on-windows' Enable tests that require GnuPG on Windows. * js/tests-gpg-integration-on-windows: tests: increase the verbosity of the GPG-related prereqs tests: turn GPG, GPGSM and RFC1991 into lazy prereqs tests: do not let lazy prereqs inside `test_expect_*` turn off tracing t/lib-gpg.sh: stop pretending to be a stand-alone script tests(gpg): allow the gpg-agent to start on Windows	2020-04-22 13:42:43 -07:00
Junio C Hamano	21e3bb1299	Merge branch 'jk/t3419-drop-expensive-tests' Test update. * jk/t3419-drop-expensive-tests: t3419: drop EXPENSIVE tests	2020-04-22 13:42:42 -07:00
Junio C Hamano	1aef1360ae	Merge branch 'ar/test-style-fixes' Style fixes. * ar/test-style-fixes: t: fix whitespace around && t9500: remove spaces after redirect operators	2020-04-22 13:42:42 -07:00
Taylor Blau	5778b22b3d	diff-tree.c: load notes machinery when required Since its introduction in `7249e91` (revision.c: support --notes command-line option, 2011-03-29), combining '--notes' with any option that causes us to format notes (e.g., '--pretty', '--format="%N"', etc) results in a failed assertion at runtime. $ git rev-list HEAD \| git diff-tree --stdin --pretty=medium --notes commit 8f3d9f354286745c751374f5f1fcafee6b3f3136 git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed. Aborted This failure is due to diff-tree not calling 'load_display_notes' to initialize the notes machinery. Ordinarily, this failure isn't triggered, because it requires passing both '--notes' and another of the above mentioned options. In the case of '--pretty', for example, we set 'opt->verbose_header', causing 'show_log()' to eventually call 'format_display_notes()', which expects a non-NULL 'display_note_trees'. Without initializing the notes machinery, 'display_note_trees' remains NULL, and thus triggers an assertion failure. Fix this by initializing the notes machinery after parsing our options, and harden this behavior against regression with a test in t4013. (Note that the added ref in this test requires updating two unrelated tests which use 'log --all', and thus need to learn about the new refs). Reported-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-20 18:22:54 -07:00
Denton Liu	546f352638	t9819: don't use test_must_fail with p4 We were using `test_must_fail p4` to test that the p4 command failed as expected. However, test_must_fail() is used to ensure that commands fail in an expected way, not due to something like a segv. Since we are not in the business of verifying the sanity of the external world, replace `test_must_fail p4` with `! p4` and assume that the `p4` command does not die unexpectedly. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-20 13:30:55 -07:00
Denton Liu	bf12181068	t9164: use test_must_fail only on git commands The `test_must_fail` function should only be used for git commands; we are not in the business of catching segmentation fault by external commands. Shell helper functions test_cmp and svn_cmd used in this script are wrappers around external commands, so just use `! cmd` instead of `test_must_fail cmd` Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-20 13:30:11 -07:00
Denton Liu	5c65897d2b	t9160: use test_path_is_missing() The test_must_fail() function should only be used for git commands since we assume that external commands work sanely. Since, not only should this file not exist, but there shouldn't exit _any_ filesystem entity in these paths, replace `test_must_fail test -f` with `test_path_is_missing`. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-20 13:12:13 -07:00
Denton Liu	5935ae3ee9	t9141: use test_path_is_missing() The test_must_fail() function should only be used for git commands since we assume that external commands work sanely. Since, not only should these directories not exist, but there shouldn't exist _any_ filesystem entity in these paths, replace `test_must_fail test -d` with `test_path_is_missing`. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-20 13:12:13 -07:00
Denton Liu	e8a5f07d51	t7508: don't use `test_must_fail test_cmp` The test_must_fail function should only be used for git commands since we assume that external commands work sanely. Since test_cmp() just wraps an external command, replace `test_must_fail test_cmp` with `! test_cmp`. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-20 13:12:13 -07:00
Denton Liu	4cf795b842	t7408: replace incorrect uses of test_must_fail According to t/README, test_must_fail() should only be used to test for failure in Git commands. Replace the invocation of `test_must_fail test_path_is_file` with `test_path_is_missing` since, in this test case, the path should not exist at all. In all the cases where `test_must_fail test_alternate_is_used` appears, test_alternate_is_used() fails because test_line_count() cannot open the non-existent $alternates_file. Replace `test_must_fail test_alternate_is_used` with `test_path_is_missing` to test for this directly. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-20 13:12:13 -07:00
Denton Liu	085ba9b5dc	t6030: use test_path_is_missing() The test_must_fail() function should only be used for git commands since we should assume that external commands work sanely. Replace `test_must_fail test -e` with `test_path_is_missing`. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-20 13:12:13 -07:00
Matheus Tavares	45115d8490	grep: follow conventions for printing paths w/ unusual chars grep does not follow the conventions used by other Git commands when printing paths that contain unusual characters (as double-quotes or newlines). Commands such as ls-files, commit, status and diff will: - Quote and escape unusual pathnames, by default. - Print names verbatim and unquoted when "-z" is used. But grep never quotes/escapes absolute paths with unusual chars and always quotes/escapes relative ones, even with "-z". Besides being inconsistent in its own output, the deviation from other Git commands can be confusing. So let's make it follow the two rules above and add some tests for this new behavior. Note that, making grep quote/escape all unusual paths by default, also make it fully compliant with the core.quotePath configuration, which is currently ignored for absolute paths. Reported-by: Greg Hurrell <greg@hurrell.net> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-20 13:01:43 -07:00
Junio C Hamano	048abe1751	Git 2.26.2 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAl6dLM0ACgkQsLXohpav 5suZdA//Uv6ZDNw49kOTXYgvUwZXGx5jISv3rErIDIvZHeVCIFOUPhdkIKAUHcEQ iPFsXCl4VTnBoaFXY0wQ1zYksTowY8EDa1X4sWE4bxipJq3tE2M6o7DInCOwgFkF CNsNDoMPz+4r/QmCxkLZmCIdgRQtrol6pttSYnmshnCLrlNPR+OOeGwzACd5Wkx/ RcVSgfv9iBAIRoDeNep0pc3aQ/qpzFZ/PGOa4m1bR3QGsShnR5aLwsrFO3x11jFF MYBP1xrM5MmjMb2fHm2dOsLvVaqjeBj9nbacpWpn3ak3TdzuL9kP41klH2OoUVsp IpPuWS52cAHKYCIyb2EqvifM75EsEh/awxFZM//ZKA+GfMqxRC2DqPV9S9EJ9rdW Pnd+b3b6JYOtVhwjHW0gzk1FWIJ/MwZqMh9dXPEcAvYYcgAnH1lB9pNdzK9YlDGT BEcCKDthkw9B3azUn8uRaOFFhVQloQ7AGfAdmvedkIt9Xa2eFITE0nHPKNyNsM7c aG6ol5CNsR6kAHJjEMqrUPTeot3mvbvrTXaT2Qp24BWvTuc6LImvD3OttVcyfVOz j5H908VTaK2iq3Jd4bjTWsA2PpyfJRsgyow7bCSi6ZvBDowSvlWUFRdFi7fAvm9b Hiep0ar79l2p0VkpAa6Du6L3Dg1wydSIOQGdN1I4UAQZZsL3HvM= =qVLd -----END PGP SIGNATURE----- Sync with 2.26.2	2020-04-19 22:05:56 -07:00
Jonathan Nieder	af6b65d45e	Git 2.26.2 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:32:24 -07:00
Jonathan Nieder	7397ca3373	Git 2.25.4 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:31:07 -07:00
Jonathan Nieder	b86a4be245	Git 2.24.3 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:30:34 -07:00
Jonathan Nieder	f2771efd07	Git 2.23.3 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:30:27 -07:00
Jonathan Nieder	c9808fa014	Git 2.22.4 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:30:19 -07:00
Jonathan Nieder	9206d27eb5	Git 2.21.3 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:30:08 -07:00
Jonathan Nieder	041bc65923	Git 2.20.4 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:28:57 -07:00
Jonathan Nieder	76b54ee9b9	Git 2.19.5 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:26:41 -07:00
Jonathan Nieder	ba6f0905fd	Git 2.18.4 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:24:14 -07:00
Jonathan Nieder	1a3609e402	fsck: reject URL with empty host in .gitmodules Git's URL parser interprets https:///example.com/repo.git to have no host and a path of "example.com/repo.git". Curl, on the other hand, internally redirects it to https://example.com/repo.git. As a result, until "credential: parse URL without host as empty host, not unset", tricking a user into fetching from such a URL would cause Git to send credentials for another host to example.com. Teach fsck to block and detect .gitmodules files using such a URL to prevent sharing them with Git versions that are not yet protected. A relative URL in a .gitmodules file could also be used to trigger this. The relative URL resolver used for .gitmodules does not normalize sequences of slashes and can follow ".." components out of the path part and to the host part of a URL, meaning that such a relative URL can be used to traverse from a https://foo.example.com/innocent superproject to a https:///attacker.example.com/exploit submodule. Fortunately, redundant extra slashes in .gitmodules are rare, so we can catch this by detecting one after a leading sequence of "./" and "../" components. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net>	2020-04-19 16:10:58 -07:00
Jonathan Nieder	e7fab62b73	credential: treat URL with empty scheme as invalid Until "credential: refuse to operate when missing host or protocol", Git's credential handling code interpreted URLs with empty scheme to mean "give me credentials matching this host for any protocol". Luckily libcurl does not recognize such URLs (it tries to look for a protocol named "" and fails). Just in case that changes, let's reject them within Git as well. This way, credential_from_url is guaranteed to always produce a "struct credential" with protocol and host set. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:10:58 -07:00
Jonathan Nieder	c44088ecc4	credential: treat URL without scheme as invalid libcurl permits making requests without a URL scheme specified. In this case, it guesses the URL from the hostname, so I can run git ls-remote http::ftp.example.com/path/to/repo and it would make an FTP request. Any user intentionally using such a URL is likely to have made a typo. Unfortunately, credential_from_url is not able to determine the host and protocol in order to determine appropriate credentials to send, and until "credential: refuse to operate when missing host or protocol", this resulted in another host's credentials being leaked to the named host. Teach credential_from_url_gently to consider such a URL to be invalid so that fsck can detect and block gitmodules files with such URLs, allowing server operators to avoid serving them to downstream users running older versions of Git. This also means that when such URLs are passed on the command line, Git will print a clearer error so affected users can switch to the simpler URL that explicitly specifies the host and protocol they intend. One subtlety: .gitmodules files can contain relative URLs, representing a URL relative to the URL they were cloned from. The relative URL resolver used for .gitmodules can follow ".." components out of the path part and past the host part of a URL, meaning that such a relative URL can be used to traverse from a https://foo.example.com/innocent superproject to a https::attacker.example.com/exploit submodule. Fortunately a leading ':' in the first path component after a series of leading './' and '../' components is unlikely to show up in other contexts, so we can catch this by detecting that pattern. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net>	2020-04-19 16:10:58 -07:00
Jeff King	fe29a9b7b0	credential: die() when parsing invalid urls When we try to initialize credential loading by URL and find that the URL is invalid, we set all fields to NULL in order to avoid acting on malicious input. Later when we request credentials, we diagonse the erroneous input: fatal: refusing to work with credential missing host field This is problematic in two ways: - The message doesn't tell the user why we are missing the host field, so they can't tell from this message alone how to recover. There can be intervening messages after the original warning of bad input, so the user may not have the context to put two and two together. - The error only occurs when we actually need to get a credential. If the URL permits anonymous access, the only encouragement the user gets to correct their bogus URL is a quiet warning. This is inconsistent with the check we perform in fsck, where any use of such a URL as a submodule is an error. When we see such a bogus URL, let's not try to be nice and continue without helpers. Instead, die() immediately. This is simpler and obviously safe. And there's very little chance of disrupting a normal workflow. It's _possible_ that somebody has a legitimate URL with a raw newline in it. It already wouldn't work with credential helpers, so this patch steps that up from an inconvenience to "we will refuse to work with it at all". If such a case does exist, we should figure out a way to work with it (especially if the newline is only in the path component, which we normally don't even pass to helpers). But until we see a real report, we're better off being defensive. Reported-by: Carlo Arenas <carenas@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:10:58 -07:00
Jonathan Nieder	a2b26ffb1a	fsck: convert gitmodules url to URL passed to curl In `07259e74ec` (fsck: detect gitmodules URLs with embedded newlines, 2020-03-11), git fsck learned to check whether URLs in .gitmodules could be understood by the credential machinery when they are handled by git-remote-curl. However, the check is overbroad: it checks all URLs instead of only URLs that would be passed to git-remote-curl. In principle a git:// or file:/// URL does not need to follow the same conventions as an http:// URL; in particular, git:// and file:// protocols are not succeptible to issues in the credential API because they do not support attaching credentials. In the HTTP case, the URL in .gitmodules does not always match the URL that would be passed to git-remote-curl and the credential machinery: Git's URL syntax allows specifying a remote helper followed by a "::" delimiter and a URL to be passed to it, so that git ls-remote http::https://example.com/repo.git invokes git-remote-http with https://example.com/repo.git as its URL argument. With today's checks, that distinction does not make a difference, but for a check we are about to introduce (for empty URL schemes) it will matter. .gitmodules files also support relative URLs. To ensure coverage for the https based embedded-newline attack, urldecode and check them directly for embedded newlines. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net>	2020-04-19 16:10:58 -07:00
Jeff King	8ba8ed568e	credential: refuse to operate when missing host or protocol The credential helper protocol was designed to be very flexible: the fields it takes as input are treated as a pattern, and any missing fields are taken as wildcards. This allows unusual things like: echo protocol=https \| git credential reject to delete all stored https credentials (assuming the helpers themselves treat the input that way). But when helpers are invoked automatically by Git, this flexibility works against us. If for whatever reason we don't have a "host" field, then we'd match _any_ host. When you're filling a credential to send to a remote server, this is almost certainly not what you want. Prevent this at the layer that writes to the credential helper. Add a check to the credential API that the host and protocol are always passed in, and add an assertion to the credential_write function that speaks credential helper protocol to be doubly sure. There are a few ways this can be triggered in practice: - the "git credential" command passes along arbitrary credential parameters it reads from stdin. - until the previous patch, when the host field of a URL is empty, we would leave it unset (rather than setting it to the empty string) - a URL like "example.com/foo.git" is treated by curl as if "http://" was present, but our parser sees it as a non-URL and leaves all fields unset - the recent fix for URLs with embedded newlines blanks the URL but otherwise continues. Rather than having the desired effect of looking up no credential at all, many helpers will return _any_ credential Our earlier test for an embedded newline didn't catch this because it only checked that the credential was cleared, but didn't configure an actual helper. Configuring the "verbatim" helper in the test would show that it is invoked (it's obviously a silly helper which doesn't look at its input, but the point is that it shouldn't be run at all). Since we're switching this case to die(), we don't need to bother with a helper. We can see the new behavior just by checking that the operation fails. We'll add new tests covering partial input as well (these can be triggered through various means with url-parsing, but it's simpler to just check them directly, as we know we are covered even if the url parser changes behavior in the future). [jn: changed to die() instead of logging and showing a manual username/password prompt] Reported-by: Carlo Arenas <carenas@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:10:58 -07:00
Jeff King	24036686c4	credential: parse URL without host as empty host, not unset We may feed a URL like "cert:///path/to/cert.pem" into the credential machinery to get the key for a client-side certificate. That credential has no hostname field, which is about to be disallowed (to avoid confusion with protocols where a helper _would_ expect a hostname). This means as of the next patch, credential helpers won't work for unlocking certs. Let's fix that by doing two things: - when we parse a url with an empty host, set the host field to the empty string (asking only to match stored entries with an empty host) rather than NULL (asking to match _any_ host). - when we build a cert:// credential by hand, similarly assign an empty string It's the latter that is more likely to impact real users in practice, since it's what's used for http connections. But we don't have good infrastructure to test it. The url-parsing version will help anybody using git-credential in a script, and is easy to test. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:10:57 -07:00
Jeff King	73aafe9bc2	t0300: use more realistic inputs Many of the tests in t0300 give partial inputs to git-credential, omitting a protocol or hostname. We're checking only high-level things like whether and how helpers are invoked at all, and we don't care about specific hosts. However, in preparation for tightening up the rules about when we're willing to run a helper, let's start using input that's a bit more realistic: pretend as if http://example.com is being examined. This shouldn't change the point of any of the tests, but do note we have to adjust the expected output to accommodate this (filling a credential will repeat back the protocol/host fields to stdout, and the helper debug messages and askpass prompt will change on stderr). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:10:57 -07:00
Jeff King	a88dbd2f8c	t0300: make "quit" helper more realistic We test a toy credential helper that writes "quit=1" and confirms that we stop running other helpers. However, that helper is unrealistic in that it does not bother to read its stdin at all. For now we don't send any input to it, because we feed git-credential a blank credential. But that will change in the next patch, which will cause this test to racily fail, as git-credential will get SIGPIPE writing to the helper rather than exiting because it was asked to. Let's make this one-off helper more like our other sample helpers, and have it source the "dump" script. That will read stdin, fixing the SIGPIPE problem. But it will also write what it sees to stderr. We can make the test more robust by checking that output, which confirms that we do run the quit helper, don't run any other helpers, and exit for the reason we expected. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:10:52 -07:00
Jiang Xin	f38b16843d	transport-helper: mark failure for atomic push Commit v2.22.0-1-g3bca1e7f9f (transport-helper: enforce atomic in push_refs_with_push, 2019-07-11) noticed the incomplete report of failure of an atomic push for HTTP protocol. But the implementation has a flaw that mark all remote references as failure. Only mark necessary references as failure in `push_refs_with_push()` of transport-helper. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-17 12:16:32 -07:00
Jiang Xin	46701bde69	send-pack: mark failure of atomic push properly When pushing with SSH or other smart protocol, references are validated by function `check_to_send_update()` before they are sent in commands to `send_pack()` of "receve-pack". For atomic push, if a reference is rejected after the validation, only references pushed by user should be marked as failure, instead of report failure on all remote references. Commit v2.22.0-1-g3bca1e7f9f (transport-helper: enforce atomic in push_refs_with_push, 2019-07-11) wanted to fix report issue of HTTP protocol, but marked all remote references failure for atomic push. In order to fix the issue of status report for SSH or other built-in smart protocol, revert part of that commit and add additional status for function `atomic_push_failure()`. The additional status for it except the "REF_STATUS_EXPECTING_REPORT" status are: - REF_STATUS_NONE : Not marked as "REF_STATUS_EXPECTING_REPORT" yet. - REF_STATUS_OK : Assume OK for dryrun or status_report is disabled. This fix won't resolve the issue of status report in transport-helper for HTTP or other protocols, and breaks test case in t5541. Will fix it in additional commit. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-17 12:16:31 -07:00
Jiang Xin	865e23f532	t5543: never report what we do not push When we push some references to the git server, we expect git to report the status of the references we are pushing; no more, no less. But when pusing with atomic mode, if some references cannot be pushed, Git reports the reject message on all references in the remote repository. Add new test cases in t5543, and fix them in latter commit. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-17 12:16:31 -07:00
Jiang Xin	7dcbeaa0df	send-pack: fix inconsistent porcelain output The porcelain output of a failed `git-push` command is inconsistent for different protocols. For example, the following `git-push` command may fail due to the failure of the `pre-receive` hook. git push --porcelain origin HEAD:refs/heads/master For SSH protocol, the porcelain output does not end with a "Done" message: To <URL/of/upstream.git> ! HEAD:refs/heads/master [remote rejected] (pre-receive hook declined) While for HTTP protocol, the porcelain output does end with a "Done" message: To <URL/of/upstream.git> ! HEAD:refs/heads/master [remote rejected] (pre-receive hook declined) Done The following code at the end of function `send_pack()` indicates that `send_pack()` should not return an error if some references are rejected in porcelain mode. int send_pack(...) ... ... if (args->porcelain) return 0; for (ref = remote_refs; ref; ref = ref->next) { switch (ref->status) { case REF_STATUS_NONE: case REF_STATUS_UPTODATE: case REF_STATUS_OK: break; default: return -1; } } return 0; } So if atomic push failed, must check the porcelain mode before return an error. And `receive_status()` should not return an error for a failed updated reference, because `send_pack()` will check them instead. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-17 12:16:31 -07:00
Emily Shaffer	238b439d69	bugreport: add tool to generate debugging info Teach Git how to prompt the user for a good bug report: reproduction steps, expected behavior, and actual behavior. Later, Git can learn how to collect some diagnostic information from the repository. If users can send us a well-written bug report which contains diagnostic information we would otherwise need to ask the user for, we can reduce the number of question-and-answer round trips between the reporter and the Git contributor. Users may also wish to send a report like this to their local "Git expert" if they have put their repository into a state they are confused by. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-16 15:23:42 -07:00
Derrick Stolee	a6be5e6764	log: add log.excludeDecoration config option In 'git log', the --decorate-refs-exclude option appends a pattern to a string_list. This list is used to prevent showing some refs in the decoration output, or even by --simplify-by-decoration. Users may want to use their refs space to store utility refs that should not appear in the decoration output. For example, Scalar [1] runs a background fetch but places the "new" refs inside the refs/scalar/hidden/<remote>/* refspace instead of refs/<remote>/* to avoid updating remote refs when the user is not looking. However, these "hidden" refs appear during regular 'git log' queries. A similar idea to use "hidden" refs is under consideration for core Git [2]. Add the 'log.excludeDecoration' config option so users can exclude some refs from decorations by default instead of needing to use --decorate-refs-exclude manually. The config value is multi-valued much like the command-line option. The documentation is careful to point out that the config value can be overridden by the --decorate-refs option, even though --decorate-refs-exclude would always "win" over --decorate-refs. Since the 'log.excludeDecoration' takes lower precedence to --decorate-refs, and --decorate-refs-exclude takes higher precedence, the struct decoration_filter needed another field. This led also to new logic in load_ref_decorations() and ref_filter_match(). There are several tests in t4202-log.sh that test the --decorate-refs-(include\|exclude) options, so these are extended. Since the expected output is already stored as a file, most tests could simply replace a "--decorate-refs-exclude" option with an in-line config setting. Other tests involve the precedence of the config option compared to command-line options and needed more modification. [1] https://github.com/microsoft/scalar [2] https://lore.kernel.org/git/77b1da5d3063a2404cd750adfe3bb8be9b6c497d.1585946894.git.gitgitgadget@gmail.com/ Helped-by: Junio C Hamano <gister@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-16 11:05:48 -07:00
Vasil Dimov	8cf51561d1	range-diff: fix a crash in parsing git-log output `git range-diff` calls `git log` internally and tries to parse its output. But `git log` output can be customized by the user in their git config and for certain configurations either an error will be returned by `git range-diff` or it will crash. To fix this explicitly set the output format of the internally executed `git log` with `--pretty=medium`. Because that cancels `--notes`, add explicitly `--notes` at the end. Also, make sure we never crash in the same way - trying to dereference `util` which was never created and has remained NULL. It would happen if the first line of `git log` output does not begin with 'commit '. Alternative considered but discarded - somehow disable all git configs and behave as if no config is present in the internally executed `git log`, but that does not seem to be possible. GIT_CONFIG_NOSYSTEM is the closest to it, but even with that we would still read `.git/config`. Signed-off-by: Vasil Dimov <vd@FreeBSD.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 18:32:47 -07:00
Jeff King	4c5971e18a	credential: treat "?" and "#" in URLs as end of host It's unusual to see: https://example.com?query-parameters without an intervening slash, like: https://example.com/some-path?query-parameters or even: https://example.com/?query-parameters but it is a valid end to the hostname (actually "authority component") according to RFC 3986. Likewise for "#". And curl will parse the URL according to the standard, meaning it will contact example.com, but our credential code would ask about a bogus hostname with a "?" in it. Let's make sure we follow the standard, and more importantly ask about the same hosts that curl will be talking to. It would be nice if we could just ask curl to parse the URL for us. But it didn't grow a URL-parsing API until 7.62, so we'd be stuck with fallback code either way. Plus we'd need this code in the main Git binary, where we've tried to avoid having a link dependency on libcurl. But let's at least fix our parser. Moving to curl's parser would prevent other potential discrepancies, but this gives us immediate relief for the known problem, and would help our fallback code if we eventually use curl. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 10:31:03 -07:00
Taylor Blau	7a9ce0269b	commit-graph.c: introduce '--[no-]check-oids' When operating on a stream of commit OIDs on stdin, 'git commit-graph write' checks that each OID refers to an object that is indeed a commit. This is convenient to make sure that the given input is well-formed, but can sometimes be undesirable. For example, server operators may wish to feed the refnames that were updated during a push to 'git commit-graph write --input=stdin-commits', and silently discard refs that don't point at commits. This can be done by combing the output of 'git for-each-ref' with '--format %(*objecttype)', but this requires opening up a potentially large number of objects. Instead, it is more convenient to feed the updated refs to the commit-graph machinery, and let it throw out refs that don't point to commits. Introduce '--[no-]check-oids' to make such a behavior possible. With '--check-oids' (the default behavior to retain backwards compatibility), 'git commit-graph write' will barf on a non-commit line in its input. With 'no-check-oids', such lines will be silently ignored, making the above possible by specifying this option. No matter which is supplied, 'git commit-graph write' retains the behavior from the previous commit of rejecting non-OID inputs like "HEAD" and "refs/heads/foo" as before. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 09:20:34 -07:00
Taylor Blau	6830c36077	commit-graph.h: replace 'commit_hex' with 'commits' The 'write_commit_graph()' function takes in either a string list of pack indices, or a string list of hexadecimal commit OIDs. These correspond to the '--stdin-packs' and '--stdin-commits' mode(s) from 'git commit-graph write'. Using a string_list of hexadecimal commit IDs is not the most efficient use of memory, since we can instead use the 'struct oidset', which is more well-suited for this case. This has another benefit which will become apparent in the following commit. This is that we are about to disambiguate the kinds of errors we produce with '--stdin-commits' into "non-hex input" and "hex-input, but referring to a non-commit object". By having 'write_commit_graph' take in a 'struct oidset *' of commits, we place the burden on the caller (in this case, the builtin) to handle the first case, and the commit-graph machinery can handle the second case. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 09:20:30 -07:00
Taylor Blau	8a6ac287b2	builtin/commit-graph.c: introduce split strategy 'replace' When using split commit-graphs, it is sometimes useful to completely replace the commit-graph chain with a new base. For example, consider a scenario in which a repository builds a new commit-graph incremental for each push. Occasionally (say, after some fixed number of pushes), they may wish to rebuild the commit-graph chain with all reachable commits. They can do so with $ git commit-graph write --reachable but this removes the chain entirely and replaces it with a single commit-graph in 'objects/info/commit-graph'. Unfortunately, this means that the next push will have to move this commit-graph into the first layer of a new chain, and then write its new commits on top. Avoid such copying entirely by allowing the caller to specify that they wish to replace the entirety of their commit-graph chain, while also specifying that the new commit-graph should become the basis of a fresh, length-one chain. This addresses the above situation by making it possible for the caller to instead write: $ git commit-graph write --reachable --split=replace which writes a new length-one chain to 'objects/info/commit-graphs', making the commit-graph incremental generated by the subsequent push relatively cheap by avoiding the aforementioned copy. In order to do this, remove an assumption in 'write_commit_graph_file' that chains are always at least two incrementals long. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 09:20:28 -07:00
Taylor Blau	fdbde82fe5	builtin/commit-graph.c: introduce split strategy 'no-merge' In the previous commit, we laid the groundwork for supporting different splitting strategies. In this commit, we introduce the first splitting strategy: 'no-merge'. Passing '--split=no-merge' is useful for callers which wish to write a new incremental commit-graph, but do not want to spend effort condensing the incremental chain [1]. Previously, this was possible by passing '--size-multiple=0', but this no longer the case following `63020f175f` (commit-graph: prefer default size_mult when given zero, 2020-01-02). When '--split=no-merge' is given, the commit-graph machinery will never condense an existing chain, and it will always write a new incremental. [1]: This might occur when, for example, a server administrator running some program after each push may want to ensure that each job runs proportional in time to the size of the push, and does not "jump" when the commit-graph machinery decides to trigger a merge. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 09:20:27 -07:00
Taylor Blau	2fa05f31bd	t/helper/test-read-graph.c: support commit-graph chains In `61df89c8e5` (commit-graph: don't early exit(1) on e.g. "git status", 2019-03-25), the former 'load_commit_graph_one' was refactored into 'open_commit_graph' and 'load_commit_graph_one_fd_st' as a means of avoiding an early-exit from non-library code. However, 'load_commit_graph_one' does not support commit-graph chains, and hence the 'read-graph' test tool does not work with them. Replace 'load_commit_graph_one' with 'read_commit_graph_one' in order to support commit-graph chains. In the spirit of `61df89c8e5`, 'read_commit_graph_one' does not ever 'die()', making it a suitable replacement here. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 09:20:24 -07:00
Junio C Hamano	efe3874640	Git 2.26.1 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAl57uiUACgkQsLXohpav 5ssUdQ/+MINTr0vCUL6PUvdhDI4S3+GZ8tryZm6yysIzUHbngbGiPsG2I6XkZQb4 uCdMLcuFMn7BZ3fbP0tZeX9GwqJu8RnZ1Nt+Eh8uKc51AklS/5w322Xb/gL6xA3p 2zLXwruGiR5wmlbkFaeQmLggrFqFWknXVqGdpJ7ei/pl1RTfmfdVz/tWXUME/Q6/ thTEbW2ru6yMuR/GGHQs2J6xYlE0de/Yidvlcl3cK9aGN0rpKe/di/mZwBp1jWam icfDoncNpinzEtube9ML1/aG1fCy/aGKe7CYxKz0zEuM/ODCOaCqU7eLza03M5mJ KPjZGBrfGHwVDzlqeVp8MS7uvhHB4pA8wivHKrbEpbaeCtSc66EhtXGyyoPGpP77 aqwcgbExUUTZm0BNWEdPkKZ0IeYiXbabDsFmESVQnMKvjRcwLGZv5tfFIA56yCql Px/pkQSbKz07f6WR1P49EUDfLNONNYFpY9Q7LZo8Iw1FfiARaXwYvcb9/Mt2CGW3 VUxUKUXaz8PkYFbmEwSxSznqjA+5ZDeC5Q0Ntzvq0YghmenLO+fC3yjT4HSVyGUE uaxNygzbqMwCDLRhbOv0RC81thfwgLYr4X6ZAD0GKZqaGdQbvseo+vr9TwqsA+U/ 7lPh8FBJjoGsMfqt6mgLzLSVscussuOaNeojDMTHaSXJDt/psFU= =4jfG -----END PGP SIGNATURE----- Sync with v2.26.1	2020-04-13 18:40:10 -07:00
Johannes Schindelin	176a66a748	t: restrict `is_hidden` to be called only on Windows The function won't work anywhere else, so let's mark it as an explicit bug if it is called on a non-Windows platform. Let's also rename the function to avoid cluttering the global namespace with an overly-generic function name. While at it, we also fix the code comment above that function: the lower-case `windows` refers to something different than `Windows`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:24:40 -07:00
Johannes Schindelin	9814d0a4ad	mingw: make test_path_is_hidden more robust This function uses Windows' system tool `attrib` to determine the state of the hidden flag of a file or directory. We should not actually expect the first `attrib.exe` in the PATH to be the one we are looking for. Or that it is in the PATH, for that matter. Let's use the full path to the tool instead. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:23:25 -07:00
Johannes Schindelin	7c2dfca7e8	t: consolidate the `is_hidden` functions The `is_hidden` function can be used (only on Windows) to determine whether a directory or file have their `hidden` flag set. This function is duplicated between two test scripts. It is better to move it into `test-lib-functions.sh` so that it is reused. This patch is best viewed with `--color-moved`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:23:23 -07:00
Jonathan Tan	0fcb4f6b62	rebase --merge: optionally skip upstreamed commits When rebasing against an upstream that has had many commits since the original branch was created: O -- O -- ... -- O -- O (upstream) \ -- O (my-dev-branch) it must read the contents of every novel upstream commit, in addition to the tip of the upstream and the merge base, because "git rebase" attempts to exclude commits that are duplicates of upstream ones. This can be a significant performance hit, especially in a partial clone, wherein a read of an object may end up being a fetch. Add a flag to "git rebase" to allow suppression of this feature. This flag only works when using the "merge" backend. This flag changes the behavior of sequencer_make_script(), called from do_interactive_rebase() <- run_rebase_interactive() <- run_specific_rebase() <- cmd_rebase(). With this flag, limit_list() (indirectly called from sequencer_make_script() through prepare_revision_walk()) will no longer call cherry_pick_list(), and thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both means that the intermediate commits in upstream are no longer read (as shown by the test) and means that no PATCHSAME-caused skipping of commits is done by sequencer_make_script(), either directly or through make_script_with_merges(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:15:57 -07:00
Elijah Newren	b9cbd2958f	rebase: reinstate --no-keep-empty Commit `d48e5e21da` ("rebase (interactive-backend): make --keep-empty the default", 2020-02-15) turned --keep-empty (for keeping commits which start empty) into the default. The logic underpinning that commit was: 1) 'git commit' errors out on the creation of empty commits without an override flag 2) Once someone determines that the override is worthwhile, it's annoying and/or harmful to required them to take extra steps in order to keep such commits around (and to repeat such steps with every rebase). While the logic on which the decision was made is sound, the result was a bit of an overcorrection. Instead of jumping to having --keep-empty being the default, it jumped to making --keep-empty the only available behavior. There was a simple workaround, though, which was thought to be good enough at the time. People could still drop commits which started empty the same way the could drop any commits: by firing up an interactive rebase and picking out the commits they didn't want from the list. However, there are cases where external tools might create enough empty commits that picking all of them out is painful. As such, having a flag to automatically remove start-empty commits may be beneficial. Provide users a way to drop commits which start empty using a flag that existed for years: --no-keep-empty. Interpret --keep-empty as countermanding any previous --no-keep-empty, but otherwise leaving --keep-empty as the default. This might lead to some slight weirdness since commands like git rebase --empty=drop --keep-empty git rebase --empty=keep --no-keep-empty look really weird despite making perfect sense (the first will drop commits which become empty, but keep commits that started empty; the second will keep commits which become empty, but drop commits which started empty). However, --no-keep-empty was named years ago and we are predominantly keeping it for backward compatibility; also we suspect it will only be used rarely since folks already have a simple way to drop commits they don't want with an interactive rebase. Reported-by: Bryan Turner <bturner@atlassian.com> Reported-by: Sami Boukortt <sami@boukortt.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:15:52 -07:00
Johannes Schindelin	662f9cf154	tests: when run in Bash, annotate test failures with file name/line number When a test fails, it is nice to see where the corresponding code lives in the worktree. Sadly, it seems that only Bash allows us to infer this information. Let's do it when we detect that we're running in a Bash. This will come in handy in the next commit, where we teach the GitHub Actions workflow to annotate failed test runs with this information. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 10:30:40 -07:00
Derrick Stolee	8d049e182e	revision: --show-pulls adds helpful merges The default file history simplification of "git log -- <path>" or "git rev-list -- <path>" focuses on providing the smallest set of commits that first contributed a change. The revision walk greatly restricts the set of walked commits by visiting only the first TREESAME parent of a merge commit, when one exists. This means that portions of the commit-graph are not walked, which can be a performance benefit, but can also "hide" commits that added changes but were ignored by a merge resolution. The --full-history option modifies this by walking all commits and reporting a merge commit as "interesting" if it has _any_ parent that is not TREESAME. This tends to be an over-representation of important commits, especially in an environment where most merge commits are created by pull request completion. Suppose we have a commit A and we create a commit B on top that changes our file. When we merge the pull request, we create a merge commit M. If no one else changed the file in the first-parent history between M and A, then M will not be TREESAME to its first parent, but will be TREESAME to B. Thus, the simplified history will be "B". However, M will appear in the --full-history mode. However, suppose that a number of topics T1, T2, ..., Tn were created based on commits C1, C2, ..., Cn between A and M as follows: A----C1----C2--- ... ---Cn----M------P1---P2--- ... ---Pn \ \ \ \ / / / / \ \__.. \ \/ ..__T1 / Tn \ \__.. /\ ..__T2 / \_____________________B \____________________/ If the commits T1, T2, ... Tn did not change the file, then all of P1 through Pn will be TREESAME to their first parent, but not TREESAME to their second. This means that all of those merge commits appear in the --full-history view, with edges that immediately collapse into the lower history without introducing interesting single-parent commits. The --simplify-merges option was introduced to remove these extra merge commits. By noticing that the rewritten parents are reachable from their first parents, those edges can be simplified away. Finally, the commits now look like single-parent commits that are TREESAME to their "only" parent. Thus, they are removed and this issue does not cause issues anymore. However, this also ends up removing the commit M from the history view! Even worse, the --simplify-merges option requires walking the entire history before returning a single result. Many Git users are using Git alongside a Git service that provides code storage alongside a code review tool commonly called "Pull Requests" or "Merge Requests" against a target branch. When these requests are accepted and merged, they typically create a merge commit whose first parent is the previous branch tip and the second parent is the tip of the topic branch used for the request. This presents a valuable order to the parents, but also makes that merge commit slightly special. Users may want to see not only which commits changed a file, but which pull requests merged those commits into their branch. In the previous example, this would mean the users want to see the merge commit "M" in addition to the single- parent commit "C". Users are even more likely to want these merge commits when they use pull requests to merge into a feature branch before merging that feature branch into their trunk. In some sense, users are asking for the "first" merge commit to bring in the change to their branch. As long as the parent order is consistent, this can be handled with the following rule: Include a merge commit if it is not TREESAME to its first parent, but is TREESAME to a later parent. These merges look like the merge commits that would result from running "git pull <topic>" on a main branch. Thus, the option to show these commits is called "--show-pulls". This has the added benefit of showing the commits created by closing a pull request or merge request on any of the Git hosting and code review platforms. To test these options, extend the standard test example to include a merge commit that is not TREESAME to its first parent. It is surprising that that option was not already in the example, as it is instructive. In particular, this extension demonstrates a common issue with file history simplification. When a user resolves a merge conflict using "-Xours" or otherwise ignoring one side of the conflict, they create a TREESAME edge that probably should not be TREESAME. This leads users to become frustrated and complain that "my change disappeared!" In my experience, showing them history with --full-history and --simplify-merges quickly reveals the problematic merge. As mentioned, this option is expensive to compute. The --show-pulls option _might_ show the merge commit (usually titled "resolving conflicts") more quickly. Of course, this depends on the user having the correct parent order, which is backwards when using "git pull master" from a topic branch. There are some special considerations when combining the --show-pulls option with --simplify-merges. This requires adding a new PULL_MERGE object flag to store the information from the initial TREESAME comparisons. This helps avoid dropping those commits in later filters. This is covered by a test, including how the parents can be simplified. Since "struct object" has already ruined its 32-bit alignment by using 33 bits across parsed, type, and flags member, let's not make it worse. PULL_MERGE is used in revision.c with the same value (1u<<15) as REACHABLE in commit-graph.c. The REACHABLE flag is only used when writing a commit-graph file, and a revision walk using --show-pulls does not happen in the same process. Care must be taken in the future to ensure this remains the case. Update Documentation/rev-list-options.txt with significant details around this option. This requires updating the example in the History Simplification section to demonstrate some of the problems with TREESAME second parents. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 09:58:55 -07:00
Denton Liu	d9f15d37f1	pull: pass --autostash to merge Before, `--autostash` only worked with `git pull --rebase`. However, in the last patch, merge learned `--autostash` as well so there's no reason why we should have this restriction anymore. Teach pull to pass `--autostash` to merge, just like it did for rebase. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 09:28:02 -07:00
Denton Liu	f8a1785807	t5520: make test_pull_autostash() accept expect_parent_num Before, test_pull_autostash() was hardcoded to run `test_cmp_rev HEAD^ copy` to test that a rebase happened. However, in a future patch, we plan on testing merging as well. Make test_pull_autostash() accept a parent number as an argument so that, in the future, we can test if a merge happened in addition to a rebase. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 09:28:02 -07:00
Denton Liu	a03b55530a	merge: teach --autostash option In rebase, one can pass the `--autostash` option to cause the worktree to be automatically stashed before continuing with the rebase. This option is missing in merge, however. Implement the `--autostash` option and corresponding `merge.autoStash` option in merge which stashes before merging and then pops after. This option is useful when a developer has some local changes on a topic branch but they realize that their work depends on another branch. Previously, they had to run something like git fetch ... git stash push git merge FETCH_HEAD git stash pop but now, that is reduced to git fetch ... git merge --autostash FETCH_HEAD When an autostash is generated, it is automatically reapplied to the worktree only in three explicit situations: 1. An incomplete merge is commit using `git commit`. 2. A merge completes successfully. 3. A merge is aborted using `git merge --abort`. In all other situations where the merge state is removed using remove_merge_branch_state() such as aborting a merge via `git reset --hard`, the autostash is saved into the stash reflog instead keeping the worktree clean. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Suggested-by: Alban Gruin <alban.gruin@gmail.com> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 09:28:02 -07:00
Junio C Hamano	5ff4b920eb	sha1-name: do not assume that the ref store is initialized `c931ba4e` (sha1-name.c: remove the_repo from handle_one_ref(), 2019-04-16) replaced the use of for_each_ref() helper, which works with the main ref store of the default repository instance, with refs_for_each_ref(), which can work on any ref store instance, by assuming that the repository instance the function is given has its ref store already initialized. But it is possible that nobody has initialized it, in which case, the code ends up dereferencing a NULL pointer. Reported-by: Érico Rolim <erico.erc@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-09 17:22:20 -07:00
Johannes Schindelin	7723436149	stash -p: (partially) fix bug concerning split hunks When trying to stash part of the worktree changes by splitting a hunk and then only partially accepting the split bits and pieces, the user is presented with a rather cryptic error: error: patch failed: <file>:<line> error: test: patch does not apply Cannot remove worktree changes and the command would fail to stash the desired parts of the worktree changes (even if the `stash` ref was actually updated correctly). We even have a test case demonstrating that failure, carrying it for four years already. The explanation: when splitting a hunk, the changed lines are no longer separated by more than 3 lines (which is the amount of context lines Git's diffs use by default), but less than that. So when staging only part of the diff hunk for stashing, the resulting diff that we want to apply to the worktree in reverse will contain those changes to be dropped surrounded by three context lines, but since the diff is relative to HEAD rather than to the worktree, these context lines will not match. Example time. Let's assume that the file README contains these lines: We the people and the worktree added some lines so that it contains these lines instead: We are the kind people and the user tries to stash the line containing "are", then the command will internally stage this line to a temporary index file and try to revert the diff between HEAD and that index file. The diff hunk that `git stash` tries to revert will look somewhat like this: @@ -1776,3 +1776,4 We +are the people It is obvious, now, that the trailing context lines overlap with the part of the original diff hunk that the user did not want to stash. Keeping in mind that context lines in diffs serve the primary purpose of finding the exact location when the diff does not apply precisely (but when the exact line number in the file to be patched differs from the line number indicated in the diff), we work around this by reducing the amount of context lines: the diff was just generated. Note: this is not a full fix for the issue. Just as demonstrated in t3701's 'add -p works with pathological context lines' test case, there are ambiguities in the diff format. It is very rare in practice, of course, to encounter such repeated lines. The full solution for such cases would be to replace the approach of generating a diff from the stash and then applying it in reverse by emulating `git revert` (i.e. doing a 3-way merge). However, in `git stash -p` it would not apply to `HEAD` but instead to the worktree, which makes this non-trivial to implement as long as we also maintain a scripted version of `add -i`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-08 12:17:59 -07:00
Johannes Schindelin	121c0d4151	t3904: fix incorrect demonstration of a bug In `7e9e048661` (stash -p: demonstrate failure of split with mixed y/n, 2015-04-16), a regression test for a known breakage that was added to the test script `t3904-stash-patch.sh` that demonstrated that splitting a hunk and trying to stash only part of that split hunk fails (but shouldn't). As expected, it still fails, but for the wrong reason: once the bug is fixed, we would expect stderr to show nothing, yet the regression test expects stderr to show something. Let's fix that by telling that regression test case to expect nothing to be printed to stderr. While at it, also drop the obvious left-over from debugging where the regression test did not mind `git stash -p` to return a non-zero exit status. Of course, the regression test still fails, but this time for the correct reason. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-08 12:17:58 -07:00
Johannes Schindelin	b6852e1979	mingw: do not treat `COM0` as a reserved file name In `4dc42c6c18` (mingw: refuse paths containing reserved names, 2019-12-21), we started disallowing file names that are reserved, e.g. `NUL`, `CONOUT$`, etc. This included `COM<n>` where `<n>` is a digit. Unfortunately, this includes `COM0` but only `COM1`, ..., `COM9` are reserved, according to the official documentation, `COM0` is mentioned in the "NT Namespaces" section but it is explicitly _omitted_ from the list of reserved names: https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions Tests corroborate this: it is totally possible to write a file called `com0.c` on Windows 10, but not `com1.c`. So let's tighten the code to disallow only the reserved `COM<n>` file names, but to allow `COM0` again. This fixes https://github.com/git-for-windows/git/issues/2470. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-08 12:15:51 -07:00
Emma Brooks	19d097e3d7	format-patch: teach --no-encode-email-headers When commit subjects or authors have non-ASCII characters, git format-patch Q-encodes them so they can be safely sent over email. However, if the patch transfer method is something other than email (web review tools, sneakernet), this only serves to make the patch metadata harder to read without first applying it (unless you can decode RFC 2047 in your head). git am as well as some email software supports non-Q-encoded mail as described in RFC 6531. Add --[no-]encode-email-headers and format.encodeEmailHeaders to let the user control this behavior. Signed-off-by: Emma Brooks <me@pluvano.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 22:37:18 -07:00
Junio C Hamano	d1068592ab	Merge branch 'dd/test-with-busybox' into HEAD * dd/test-with-busybox: t5703: feed raw data into test-tool unpack-sideband t4124: tweak test so that non-compliant diff(1) can also be used t7063: drop non-POSIX argument "-ls" from find(1) t5616: use rev-parse instead to get HEAD's object_id t5003: skip conversion test if unzip -a is unavailable t5003: drop the subshell in test_lazy_prereq test-lib-functions: test_cmp: eval $GIT_TEST_CMP t4061: use POSIX compliant regex(7)	2020-04-07 22:16:21 -07:00
Jonathan Tan	95acf11a3d	diff: restrict when prefetching occurs Commit `7fbbcb21b1` ("diff: batch fetching of missing blobs", 2019-04-08) optimized "diff" by prefetching blobs in a partial clone, but there are some cases wherein blobs do not need to be prefetched. In these cases, any command that uses the diff machinery will unnecessarily fetch blobs. diffcore_std() may read blobs when it calls the following functions: (1) diffcore_skip_stat_unmatch() (controlled by the config variable diff.autorefreshindex) (2) diffcore_break() and diffcore_merge_broken() (for break-rewrite detection) (3) diffcore_rename() (for rename detection) (4) diffcore_pickaxe() (for detecting addition/deletion of specified string) Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(), diffcore_break(), and diffcore_rename() to prefetch blobs upon the first read of a missing object. This covers (1), (2), and (3): to cover the rest, teach diffcore_std() to prefetch if the output type is one that includes blob data (and hence blob data will be required later anyway), or if it knows that (4) will be run. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 16:09:29 -07:00
Elijah Newren	1f6965f994	sequencer: honor GIT_REFLOG_ACTION There is a lot of code to honor GIT_REFLOG_ACTION throughout git, including some in sequencer.c; unfortunately, reflog_message() and its callers ignored it. Instruct reflog_message() to check the existing environment variable, and use it when present as an override to action_name(). Also restructure pick_commits() to only temporarily modify GIT_REFLOG_ACTION for a short duration and then restore the old value, so that when we do this setting within a loop we do not keep adding " (pick)" substrings and end up with a reflog message of the form rebase (pick) (pick) (pick) (finish): returning to refs/heads/master Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 15:10:11 -07:00
Jeff King	d8410a816b	fast-import: replace custom hash with hashmap.c We use a custom hash in fast-import to store the set of objects we've imported so far. It has a fixed set of 2^16 buckets and chains any collisions with a linked list. As the number of objects grows larger than that, the load factor increases and we degrade to O(n) lookups and O(n^2) insertions. We can scale better by using our hashmap.c implementation, which will resize the bucket count as we grow. This does incur an extra memory cost of 8 bytes per object, as hashmap stores the integer hash value for each entry in its hashmap_entry struct (which we really don't care about here, because we're just reusing the embedded object hash). But I think the numbers below justify this (and our per-object memory cost is already much higher). I also looked at using khash, but it seemed to perform slightly worse than hashmap at all sizes, and worse even than the existing code for small sizes. It's also awkward to use here, because we want to look up a "struct object_entry" from a "struct object_id", and it doesn't handle mismatched keys as well. Making a mapping of object_id to object_entry would be more natural, but that would require pulling the embedded oid out of the object_entry or incurring an extra 32 bytes per object. In a synthetic test creating as many cheap, tiny objects as possible perl -e ' my $bits = shift; my $nr = 2**$bits; for (my $i = 0; $i < $nr; $i++) { print "blob\n"; print "data 4\n"; print pack("N", $i); } ' $bits \| git fast-import I got these results: nr_objects master khash hashmap 2^20 0m4.317s 0m5.109s 0m3.890s 2^21 0m10.204s 0m9.702s 0m7.933s 2^22 0m27.159s 0m17.911s 0m16.751s 2^23 1m19.038s 0m35.080s 0m31.963s 2^24 4m18.766s 1m10.233s 1m6.793s which points to hashmap as the winner. We didn't have any perf tests for fast-export or fast-import, so I added one as a more real-world case. It uses an export without blobs since that's significantly cheaper than a full one, but still is an interesting case people might use (e.g., for rewriting history). It will emphasize this change in some ways (as a percentage we spend more time making objects and less shuffling blob bytes around) and less in others (the total object count is lower). Here are the results for linux.git: Test HEAD^ HEAD ---------------------------------------------------------------------------- 9300.1: export (no-blobs) 67.64(66.96+0.67) 67.81(67.06+0.75) +0.3% 9300.2: import (no-blobs) 284.04(283.34+0.69) 198.09(196.01+0.92) -30.3% It only has ~5.2M commits and trees, so this is a larger effect than I expected (the 2^23 case above only improved by 50s or so, but here we gained almost 90s). This is probably due to actually performing more object lookups in a real import with trees and commits, as opposed to just dumping a bunch of blobs into a pack. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 13:41:24 -07:00
Garima Singh	d5b873c832	commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag Add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag to the test setup suite in order to toggle writing Bloom filters when running any of the git tests. If set to true, we will compute and write Bloom filters every time a test calls `git commit-graph write`, as if the `--changed-paths` option was passed in. The test suite passes when GIT_TEST_COMMIT_GRAPH and GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS are enabled. Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 11:08:37 -07:00
Garima Singh	a759bfa9ee	t4216: add end to end tests for git log with Bloom filters These tests exercises writing commit graph with Bloom filters and exercises 'git log -- path' with all the applicable options. They check that the output is the same with and without Bloom filters, confirm Bloom filters were used by checking if trace2 statistics were logged correctly. Also confirms cases where Bloom filters are not used: 1. Multiple path specs, 2. --walk-reflogs (see patch titled 'revision.c: use Bloom filters...' for details, 3. If the latest commit graph does not have Bloom filters Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 11:08:37 -07:00
Garima Singh	1217c03e7b	commit-graph: reuse existing Bloom filters during write Add logic to a) parse Bloom filter information from the commit graph file and, b) re-use existing Bloom filters. See Documentation/technical/commit-graph-format for the format in which the Bloom filter information is written to the commit graph file. To read Bloom filter for a given commit with lexicographic position 'i' we need to: 1. Read BIDX[i] which essentially gives us the starting index in BDAT for filter of commit i+1. It is essentially the index past the end of the filter of commit i. It is called end_index in the code. 2. For i>0, read BIDX[i-1] which will give us the starting index in BDAT for filter of commit i. It is called the start_index in the code. For the first commit, where i = 0, Bloom filter data starts at the beginning, just past the header in the BDAT chunk. Hence, start_index will be 0. 3. The length of the filter will be end_index - start_index, because BIDX[i] gives the cumulative 8-byte words including the ith commit's filter. We toggle whether Bloom filters should be recomputed based on the compute_if_not_present flag. Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 11:08:37 -07:00
Johannes Schindelin	a1aba0c95c	t0007: fix a typo Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-05 14:51:13 -07:00
Đoàn Trần Công Danh	cf0ad4d199	cherry-pick/revert: honour --no-gpg-sign in all case {cherry-pick,revert} --edit hasn't honoured --no-gpg-sign yet. Pass this option down to git-commit to honour it. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-03 11:37:22 -07:00
Đoàn Trần Công Danh	c241371c04	rebase.c: honour --no-gpg-sign Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-03 11:37:22 -07:00
Patrick Steinhardt	e48cf33b61	update-ref: implement interactive transaction handling The git-update-ref(1) command can only handle queueing transactions right now via its "--stdin" parameter, but there is no way for users to handle the transaction itself in a more explicit way. E.g. in a replicated scenario, one may imagine a coordinator that spawns git-update-ref(1) for multiple repositories and only if all agree that an update is possible will the coordinator send a commit. Such a transactional session could look like > start < start: ok > update refs/heads/master $OLD $NEW > prepare < prepare: ok # All nodes have returned "ok" > commit < commit: ok or > start < start: ok > create refs/heads/master $OLD $NEW > prepare < fatal: cannot lock ref 'refs/heads/master': reference already exists # On all other nodes: > abort < abort: ok In order to allow for such transactional sessions, this commit introduces four new commands for git-update-ref(1), which matches those we have internally already with the exception of "start": - start: start a new transaction - prepare: prepare the transaction, that is try to lock all references and verify their current value matches the expected one - commit: explicitly commit a session, that is update references to match their new expected state - abort: abort a session and roll back all changes By design, git-update-ref(1) will commit as soon as standard input is being closed. While fine in a non-transactional world, it is definitely unexpected in a transactional world. Because of this, as soon as any of the new transactional commands is used, the default will change to aborting without an explicit "commit". To avoid a race between queueing updates and the first "prepare" that starts a transaction, the "start" command has been added to start an explicit transaction. Add some tests to exercise this new functionality. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-02 11:09:49 -07:00
Derrick Stolee	e892a56845	t5319: replace 'touch -m' with 'test-tool chmtime' The use of 'touch -m' to modify a file's mtime is slightly less portable than using our own 'test-tool chmtime'. The important thing is that these pack-files are ordered in a special way to ensure the multi-pack-index selects some as the "newer" pack-files when resolving duplicate objects. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-01 14:37:50 -07:00
Derrick Stolee	b09b785c78	commit-graph: fix buggy --expire-time option The commit-graph builtin has an --expire-time option that takes a datetime using OPT_EXPIRY_DATE(). However, the implementation inside expire_commit_graphs() was treating a non-zero value as a number of seconds to subtract from "now". Update t5323-split-commit-graph.sh to demonstrate the correct value of the --expire-time option by actually creating a crud .graph file with mtime earlier than the expire time. Instead of using a super- early time (1980) we use an explicit, and recent, time. Using test-tool chmtime to create two files on either end of an exact second, we create a test that catches this failure no matter the current time. Using a fixed date is more portable than trying to format a relative date string into the --expiry-date input. I noticed this when inspecting some Scalar repos that had an excess number of commit-graph files. In Scalar, we were using this second interpretation by using "--expire-time=3600" to mean "delete graphs older than one hour ago" to avoid deleting a commit-graph that a foreground process may be trying to load. Also I noticed that the help text was copied from the --max-commits option. Fix that help text. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-01 14:36:26 -07:00
Elijah Newren	c0af173a13	completion: fix 'git add' on paths under an untracked directory As reported on the git mailing list, since git-2.25, git add untracked-dir/ has been tab completing to git add untracked-dir/./ The cause for this was that with commit `b9670c1f5e` (dir: fix checks on common prefix directory, 2019-12-19), git ls-files -o --directory untracked-dir/ (or the equivalent `git -C untracked-dir ls-files -o --directory`) began reporting untracked-dir/ instead of listing paths underneath that directory. It may also be worth noting that the real command in question was git -C untracked-dir ls-files -o --directory '' which is equivalent to git ls-files -o --directory 'untracked-dir/' which behaves the same for the purposes of this issue (the '' can match the empty string), but becomes relevant for the proposed fix. At first, based on the report, I decided to try to view this as a regression and tried to find a way to recover the old behavior without breaking other stuff, or at least breaking as little as possible. However, in the end, I couldn't figure out a way to do it that wouldn't just cause lots more problems than it solved. The old behavior was a bug: Although older git would avoid cleaning anything with `git clean -f .git`, it would wipe out everything under that direcotry with `git clean -f .git/`. Despite the difference in command used, this is relevant because the exact same change that fixed clean changed the behavior of ls-files. * Older git would report different results based solely on presence or absence of a trailing slash for $SUBDIR in the command `git ls-files -o --directory $SUBDIR`. * Older git violated the documented behavior of not recursing into directories that matched the pathspec when --directory was specified. * And, after all, commit `b9670c1f5e` (dir: fix checks on common prefix directory, 2019-12-19) didn't overlook this issue; it explicitly stated that the behavior of the command was being changed to bring it inline with the docs. (Also, if it helps, despite that commit being merged during the 2.25 series, this bug was not reported during the 2.25 cycle, nor even during most of the 2.26 cycle -- it was reported a day before 2.26 was released. So the impact of the change is at least somewhat small.) Instead of relying on a bug of ls-files in reporting the wrong content, change the invocation of ls-files used by git-completion to make it grab paths one depth deeper. Do this by changing '$DIR/' (match $DIR/ plus 0 or more characters) into '$DIR/?' (match $DIR/ plus 1 or more characters). Note that the '?' character should not be added when trying to complete a filename (e.g. 'git ls-files -o --directory "merge.c?*"' would not correctly return "merge.c" when such a file exists), so we have to make sure to add the '?' character only in cases where the path specified so far is a directory. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-01 11:11:31 -07:00
Elijah Newren	7260c7b66e	t3000: add more testcases testing a variety of ls-files issues This adds seven new ls-files tests. While currently all seven test pass, my earlier rounds of restructuring dir.c to replace an exponential algorithm with a linear one passed all the tests in the testsuite but failed six of these seven new tests. Add these tests to increase our case coverage. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-01 11:10:38 -07:00
Elijah Newren	ce5c61a3f5	t7063: more thorough status checking It turns out the t7063 has some testcases that even without using the untracked cache cover situations that nothing else in the testsuite handles. Checking the results of git status --porcelain both with and without the untracked cache, and comparing both against our expected results helped uncover a critical bug in some dir.c restructuring. Unfortunately, it's not easy to run status and tell it to ignore the untracked cache; the only knob we have is core.untrackedCache=false, which is used to instruct git to delete the untracked cache (which might also ignore the untracked cache when it operates, but that isn't specified in the docs). Create a simple helper that will create a clone of the index that is missing the untracked cache bits, and use it to compare that the results with the untracked cache match the results we get without the untracked cache. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-01 11:10:38 -07:00
Jeff King	167a575e2d	clone: use "quick" lookup while following tags When cloning with --single-branch, we implement git-fetch's usual tag-following behavior, grabbing any tag objects that point to objects we have locally. When we're a partial clone, though, our has_object_file() check will actually lazy-fetch each tag. That not only defeats the purpose of --single-branch, but it does it incredibly slowly, potentially kicking off a new fetch for each tag. This is even worse for a shallow clone, which implies --single-branch, because even tags which are supersets of each other will be fetched individually. We can fix this by passing OBJECT_INFO_SKIP_FETCH_OBJECT to the call, which is what git-fetch does in this case. Likewise, let's include OBJECT_INFO_QUICK, as that's what git-fetch does. The rationale is discussed in `5827a03545` (fetch: use "quick" has_sha1_file for tag following, 2016-10-13), but here the tradeoff would apply even more so because clone is very unlikely to be racing with another process repacking our newly-created repository. This may provide a very small speedup even in the non-partial case case, as we'd avoid calling reprepare_packed_git() for each tag (though in practice, we'd only have a single packfile, so that reprepare should be quite cheap). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-01 09:56:41 -07:00
Alban Gruin	de9f1d3ef4	t3432: test `--merge' with` rebase.abbreviateCommands = true', too When fast forwarding, `git --merge' should act the same whether `rebase.abbreviateCommands' is set or not, but so far it was not the case. This duplicates the tests ensuring that `--merge' works when fast forwarding to check if it also works with abbreviated commands. Signed-off-by: Alban Gruin <alban.gruin@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 12:47:25 -07:00
Jeff King	ed4b804e46	test-tool: rename sha1-array to oid-array This matches the actual data structure name, as well as the source file that contains the code we're testing. The test scripts need updating to use the new name, as well. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 10:59:08 -07:00
Jeff King	fe299ec5ae	oid_array: rename source file from sha1-array We renamed the actual data structure in `910650d2f8` (Rename sha1_array to oid_array, 2017-03-31), but the file is still called sha1-array. Besides being slightly confusing, it makes it more annoying to grep for leftover occurrences of "sha1" in various files, because the header is included in so many places. Let's complete the transition by renaming the source and header files (and fixing up a few comment references). I kept the "-" in the name, as that seems to be our style; cf. `fc1395f4a4` (sha1_file.c: rename to use dash in file name, 2018-04-10). We also have oidmap.h and oidset.h without any punctuation, but those are "struct oidmap" and "struct oidset" in the code. We _could_ make this "oidarray" to match, but somehow it looks uglier to me because of the length of "array" (plus it would be a very invasive patch for little gain). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 10:59:08 -07:00
Garima Singh	ed591febb4	bloom.c: core Bloom filter implementation for changed paths. Add the core implementation for computing Bloom filters for the paths changed between a commit and it's first parent. We fill the Bloom filters as (const char *data, int len) pairs as `struct bloom_filters" within a commit slab. Filters for commits with no changes and more than 512 changes, is represented with a filter of length zero. There is no gain in distinguishing between a computed filter of length zero for a commit with no changes, and an uncomputed filter for new commits or for commits with more than 512 changes. The effect on `git log -- path` is the same in both cases. We will fall back to the normal diffing algorithm when we can't benefit from the existence of Bloom filters. Helped-by: Jeff King <peff@peff.net> Helped-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 09:59:53 -07:00
Garima Singh	f1294eaf7f	bloom.c: introduce core Bloom filter constructs Introduce the constructs for Bloom filters, Bloom filter keys and Bloom filter settings. For details on what Bloom filters are and how they work, refer to Dr. Derrick Stolee's blog post [1]. It provides a concise explanation of the adoption of Bloom filters as described in [2] and [3]. Implementation specifics: 1. We currently use 7 and 10 for the number of hashes and the size of each entry respectively. They served as great starting values, the mathematical details behind this choice are described in [1] and [4]. The implementation, while not completely open to it at the moment, is flexible enough to allow for tweaking these settings in the future. Note: The performance gains we have observed with these values are significant enough that we did not need to tweak these settings. The performance numbers are included in the cover letter of this series and in the commit message of the subsequent commit where we use Bloom filters to speed up `git log -- path`. 2. As described in [1] and [3], we do not need 7 independent hashing functions. We use the Murmur3 hashing scheme, seed it twice and then combine those to procure an arbitrary number of hash values. 3. The filters will be sized according to the number of changes in each commit, in multiples of 8 bit words. [1] Derrick Stolee "Supercharging the Git Commit Graph IV: Bloom Filters" https://devblogs.microsoft.com/devops/super-charging-the-git-commit-graph-iv-Bloom-filters/ [2] Flavio Bonomi, Michael Mitzenmacher, Rina Panigrahy, Sushil Singh, George Varghese "An Improved Construction for Counting Bloom Filters" http://theory.stanford.edu/~rinap/papers/esa2006b.pdf https://doi.org/10.1007/11841036_61 [3] Peter C. Dillinger and Panagiotis Manolios "Bloom Filters in Probabilistic Verification" http://www.ccs.neu.edu/home/pete/pub/Bloom-filters-verification.pdf https://doi.org/10.1007/978-3-540-30494-4_26 [4] Thomas Mueller Graf, Daniel Lemire "Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters" https://arxiv.org/abs/1912.08258 Helped-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 09:59:53 -07:00
Garima Singh	f52207a45c	bloom.c: add the murmur3 hash implementation In preparation for computing changed paths Bloom filters, implement the Murmur3 hash algorithm as described in [1]. It hashes the given data using the given seed and produces a uniformly distributed hash value. [1] https://en.wikipedia.org/wiki/MurmurHash#Algorithm Helped-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Szeder Gábor <szeder.dev@gmail.com> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 09:59:53 -07:00
Junio C Hamano	9fadedd637	Merge branch 'ds/default-pack-use-sparse-to-true' The 'pack.useSparse' configuration variable now defaults to 'true', enabling an optimization that has been experimental since Git 2.21. * ds/default-pack-use-sparse-to-true: pack-objects: flip the use of GIT_TEST_PACK_SPARSE config: set pack.useSparse=true by default	2020-03-29 09:32:51 -07:00
Jeff King	cacae4329f	test-lib-functions: simplify packetize() stdin code The code path in packetize() for reading stdin needs to handle NUL bytes, so we can't rely on shell variables. However, the current code takes a whopping 4 processes and uses a temporary file. We can do this much more simply and efficiently by using a single perl invocation (and we already rely on perl in the matching depacketize() function). We'll keep the non-stdin code path as it is, since that uses zero extra processes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-29 08:49:47 -07:00
Junio C Hamano	7cc112dc95	t/README: suggest how to leave test early with failure Over time, we added the support to our test framework to make it easy to leave a test early with failure, but it was not clearly documented in t/README to help developers writing new tests. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-29 08:39:40 -07:00
Damien Robert	796d61cdc0	midx.c: fix an integer underflow When verifying a midx index with 0 objects, the m->num_objects - 1 underflows and wraps around to 4294967295. Fix this both by checking that the midx contains at least one oid, and also that we don't write any midx when there is no packfiles. Update the tests to check that `git multi-pack-index write` does not write an midx when there is no objects, and another to check that `git multi-pack-index verify` warns when it verifies an midx with no objects. For this last test, use t5319/no-objects.midx which was generated by an older version of git. Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-28 16:50:40 -07:00
Jeff King	14d277879c	p5310: stop timing non-bitmap pack-to-disk Commit `645c432d61` (pack-objects: use reachability bitmap index when generating non-stdout pack, 2016-09-10) added two timing tests for packing to an on-disk file, both with and without bitmaps. However, the non-bitmap one isn't interesting to have as part of p5310's regression suite. It _could_ be used as a baseline to show off the improvement in the bitmap case, but: - the point of the t/perf suite is to find performance regressions, and it won't help with that. We don't compare the numbers between two tests (which the perf suite has no idea are even related), and any change in its numbers would have nothing to do with bitmaps. - it did show off the improvement in the commit message of `645c432d61`, but it wasn't even necessary there. The bitmap case already shows an improvement (because before the patch, it behaved the same as the non-bitmap case), and the perf suite is even able to show the difference between the before and after measurements. On top of that, it's one of the most expensive tests in the suite, clocking in around 60s for linux.git on my machine (as compared to 16s for the bitmapped version). And by default when using "./run", we'd run it three times! So let's just drop it. It's not useful and is adding minutes to perf runs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 15:11:21 -07:00
Jeff King	4845b77245	upload-pack: handle unexpected delim packets When processing the arguments list for a v2 ls-refs or fetch command, we loop like this: while (packet_reader_read(request) != PACKET_READ_FLUSH) { const char *arg = request->line; ...handle arg... } to read and handle packets until we see a flush. The hidden assumption here is that anything except PACKET_READ_FLUSH will give us valid packet data to read. But that's not true; PACKET_READ_DELIM or PACKET_READ_EOF will leave packet->line as NULL, and we'll segfault trying to look at it. Instead, we should follow the more careful model demonstrated on the client side (e.g., in process_capabilities_v2): keep looping as long as we get normal packets, and then make sure that we broke out of the loop due to a real flush. That fixes the segfault and correctly diagnoses any unexpected input from the client. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 12:18:48 -07:00
Jeff King	88124ab263	test-lib-functions: make packetize() more efficient The packetize() function takes its input on stdin, and requires 4 separate sub-processes to format a simple string. We can do much better by getting the length via the shell's "${#packet}" construct. The one caveat is that the shell can't put a NUL into a variable, so we'll have to continue to provide the stdin form for a few calls. There are a few other cleanups here in the touched code: - the stdin form of packetize() had an extra stray "%s" when printing the packet - the converted calls in t5562 can be made simpler by redirecting output as a block, rather than repeated appending Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 11:50:54 -07:00
Elijah Newren	5644ca28cd	sparse-checkout: provide a new reapply subcommand If commands like merge or rebase materialize files as part of their work, or a previous sparse-checkout command failed to update individual files due to dirty changes, users may want a command to simply 'reapply' the sparsity rules. Provide one. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 11:33:31 -07:00
Elijah Newren	681c637b4a	unpack-trees: failure to set SKIP_WORKTREE bits always just a warning Setting and clearing of the SKIP_WORKTREE bit is not only done when users run 'sparse-checkout'; other commands such as 'checkout' also run through unpack_trees() which has logic for handling this special bit. As such, we need to consider how they handle special cases. A couple comparison points should help explain the rationale for changing how unpack_trees() handles these bits: Ignoring sparse checkouts for a moment, if you are switching branches and have dirty changes, it is only considered an error that will prevent the branch switching from being successful if the dirty file happens to be one of the paths with different contents. SKIP_WORKTREE has always been considered advisory; for example, if rebase or merge need or even want to materialize a path as part of their work, they have always been allowed to do so regardless of the SKIP_WORKTREE setting. This has been used for unmerged paths, but it was often used for paths it wasn't needed just because it made the code simpler. It was a best-effort consideration, and when it materialized paths contrary to the SKIP_WORKTREE setting, it was never required to even print a warning message. In the past if you trying to run e.g. 'git checkout' and: 1) you had a path that was materialized and had some dirty changes 2) the path was listed in $GITDIR/info/sparse-checkout 3) this path did not different between the current and target branches then despite the comparison points above, the inability to set SKIP_WORKTREE was treated as a hard error that would abort the checkout operation. This is completely inconsistent with how SKIP_WORKTREE is handled elsewhere, and rather annoying for users as leaving the paths materialized in the working copy (with a simple warning) should present no problem at all. Downgrade any errors from inability to toggle the SKIP_WORKTREE bit to a warning and allow the operations to continue. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 11:33:30 -07:00
Elijah Newren	ebb568b9e2	unpack-trees: provide warnings on sparse updates for unmerged paths too When sparse-checkout runs to update the list of sparsity patterns, it gives warnings if it can't remove paths from the working tree because those files have dirty changes. Add a similar warning for unmerged paths as well. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 11:33:30 -07:00
Elijah Newren	22ab0b37d8	unpack-trees: make sparse path messages sound like warnings The messages for problems with sparse paths are phrased as errors that cause the operation to abort, even though we are not making the operation abort. Reword the messages to make sense in their new context. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 11:33:30 -07:00
Elijah Newren	6271d77cb1	unpack-trees: split display_error_msgs() into two display_error_msgs() is never called to show messages of both ERROR_* and WARNING_* types at the same time; it is instead called multiple times, separately for each type. Since we want to display these types differently, make two slightly different versions of this function. A subsequent commit will further modify unpack_trees() and how it calls the new display_warning_msgs(). Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 11:33:30 -07:00
Elijah Newren	4ee5d50fc3	sparse-checkout: use improved unpack_trees porcelain messages setup_unpack_trees_porcelain() provides much improved error/warning messages; instead of a message that assumes that there is only one path with a given problem despite being used by code that intentionally is grouping and showing errors together, it uses a message designed to be used with groups of paths. For example, this transforms error: Entry ' folder1/a folder2/a ' not uptodate. Cannot update sparse checkout. into error: Cannot update sparse checkout: the following entries are not up to date: folder1/a folder2/a In the past the suboptimal messages were never actually triggered because we would error out if the working directory wasn't clean before we even called unpack_trees(). The previous commit changed that, though, so let's use the better error messages. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 11:33:30 -07:00

... 3 4 5 6 7 ...

16809 Commits