git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	ea64414426	Merge branch 'sg/t1404-update-ref-test-timeout' An attempt to unflake a test a bit. * sg/t1404-update-ref-test-timeout: t1404: increase core.packedRefsTimeout to avoid occasional test failure	2018-09-17 13:53:47 -07:00
Junio C Hamano	c2407322b6	Merge branch 'nd/clone-case-smashing-warning' Running "git clone" against a project that contain two files with pathnames that differ only in cases on a case insensitive filesystem would result in one of the files lost because the underlying filesystem is incapable of holding both at the same time. An attempt is made to detect such a case and warn. * nd/clone-case-smashing-warning: clone: report duplicate entries on case-insensitive filesystems	2018-09-17 13:53:47 -07:00
Junio C Hamano	660946196c	Merge branch 'mk/http-backend-content-length' Test update. * mk/http-backend-content-length: http-backend test: make empty CONTENT_LENGTH test more realistic	2018-09-17 13:53:46 -07:00
Derrick Stolee	66ec0390e7	fsck: verify multi-pack-index When core.multiPackIndex is true, we may have a multi-pack-index in our object directory. Add calls to 'git multi-pack-index verify' at the end of 'git fsck' if so. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	144d70333e	multi-pack-index: report progress during 'verify' When verifying a multi-pack-index, the only action that takes significant time is checking the object offsets. For example, to verify a multi-pack-index containing 6.2 million objects in the Linux kernel repository takes 1.3 seconds on my machine. 99% of that time is spent looking up object offsets in each of the packfiles and comparing them to the multi-pack-index offset. Add a progress indicator for that section of the 'verify' verb. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	cc6af73c02	multi-pack-index: verify object offsets The 'git multi-pack-index verify' command must verify the object offsets stored in the multi-pack-index are correct. There are two ways the offset chunk can be incorrect: the pack-int-id and the object offset. Replace the BUG() statement with a die() statement, now that we may hit a bad pack-int-id during a 'verify' command on a corrupt multi-pack-index, and it is covered by a test. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	d8ac9ee109	multi-pack-index: fix 32-bit vs 64-bit size check When loading a 64-bit offset, we intend to check that off_t can store the resulting offset. However, the condition accidentally checks the 32-bit offset to see if it is smaller than a 64-bit value. Fix it, and this will be covered by a test in the 'git multi-pack-index verify' command in a later commit. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	55c5648d80	multi-pack-index: verify oid lookup order Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	2f23d3f3f9	multi-pack-index: verify oid fanout order Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	d4bf1d88b9	multi-pack-index: verify missing pack Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	8e72a3c321	multi-pack-index: verify packname order The final check we make while loading a multi-pack-index is that the packfile names are in lexicographical order. Make this error be a die() instead. In order to test this condition, we need multiple packfiles. Earlier in t5319-multi-pack-index.sh, we tested the interaction with 'git repack' but this limits us to one packfile in our object dir. Move these repack tests until after the 'verify' tests. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	d3f8e21170	multi-pack-index: verify corrupt chunk lookup table Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	53ad040744	multi-pack-index: verify bad header When verifying if a multi-pack-index file is valid, we want the command to fail to signal an invalid file. Previously, we wrote an error to stderr and continued as if we had no multi-pack-index. Now, die() instead of error(). Add tests that check corrupted headers in a few ways: * Bad signature * Bad file version * Bad hash version * Truncated hash count * Extended hash count Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:41 -07:00
Derrick Stolee	56ee7ff156	multi-pack-index: add 'verify' verb The multi-pack-index builtin writes multi-pack-index files, and uses a 'write' verb to do so. Add a 'verify' verb that checks this file matches the contents of the pack-indexes it replaces. The current implementation is a no-op, but will be extended in small increments in later commits. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 13:49:38 -07:00
Junio C Hamano	6b6547b46f	Revert "doc/Makefile: drop doc-diff worktree and temporary files on "make clean"" This reverts commit `6f924265a0`, which started to require that we have an executable git available in order to say "make clean", which gives us a chicken-and-egg problem. Having to have Git installed, or be in a repository, in order to be able to run an optional "doc-diff" tool is fine. Requiring either in order to run "make clean" is a different story. Reported by Jonathan Nieder <jrnieder@gmail.com>.	2018-09-17 12:46:18 -07:00
Tao Qingyun	7b6057c852	refs: docstring typo Signed-off-by: Tao Qingyun <taoqy@ls-a.me> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 10:17:22 -07:00
Ævar Arnfjörð Bjarmason	1f7f557fd3	commit-graph verify: add progress output For the reasons explained in the "commit-graph write: add progress output" commit leading up to this one, emit progress on "commit-graph verify". Since `e0fd51e1d7` ("fsck: verify commit-graph", 2018-06-27) "git fsck" has called this command if core.commitGraph=true, but there's been no progress output to indicate that anything was different. Now there is (on my tiny dotfiles.git repository): $ git -c core.commitGraph=true -C ~/ fsck Checking object directories: 100% (256/256), done. Checking objects: 100% (2821/2821), done. dangling blob 5b8bbdb9b788ed90459f505b0934619c17cc605b Verifying commits in commit graph: 100% (867/867), done. And on a larger repository, such as the 2015-04-03-1M-git.git test repository: $ time git -c core.commitGraph=true -C ~/g/2015-04-03-1M-git/ commit-graph verify Verifying commits in commit graph: 100% (1000447/1000447), done. real 0m7.813s [...] Since the "commit-graph verify" subcommand is never called from "git gc", we don't have to worry about passing some some "report_progress" progress variable around for this codepath. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 10:12:30 -07:00
Ævar Arnfjörð Bjarmason	7b0f229222	commit-graph write: add progress output Before this change the "commit-graph write" command didn't report any progress. On my machine this command takes more than 10 seconds to write the graph for linux.git, and around 1m30s on the 2015-04-03-1M-git.git[1] test repository (a test case for a large monorepository). Furthermore, since the gc.writeCommitGraph setting was added in `d5d5d7b641` ("gc: automatically write commit-graph files", 2018-06-27), there was no indication at all from a "git gc" run that anything was different. This why one of the progress bars being added here uses start_progress() instead of start_delayed_progress(), so that it's guaranteed to be seen. E.g. on my tiny 867 commit dotfiles.git repository: $ git -c gc.writeCommitGraph=true gc Enumerating objects: 2821, done. [...] Computing commit graph generation numbers: 100% (867/867), done. On larger repositories, such as linux.git the delayed progress bar(s) will kick in, and we'll show what's going on instead of, as was previously happening, printing nothing while we write the graph: $ git -c gc.writeCommitGraph=true gc [...] Annotating commits in commit graph: 1565573, done. Computing commit graph generation numbers: 100% (782484/782484), done. Note that here we don't show "Finding commits for commit graph", this is because under "git gc" we seed the search with the commit references in the repository, and that set is too small to show any progress, but would e.g. on a smaller repo such as git.git with --stdin-commits: $ git rev-list --all \| git -c gc.writeCommitGraph=true write --stdin-commits Finding commits for commit graph: 100% (162576/162576), done. Computing commit graph generation numbers: 100% (162576/162576), done. With --stdin-packs we don't show any estimation of how much is left to do. This is because we might be processing more than one pack. We could be less lazy here and show progress, either by detecting that we're only processing one pack, or by first looping over the packs to discover how many commits they have. I don't see the point in doing that work. So instead we get (on 2015-04-03-1M-git.git): $ echo pack-<HASH>.idx \| git -c gc.writeCommitGraph=true --exec-path=$PWD commit-graph write --stdin-packs Finding commits for commit graph: 13064614, done. Annotating commits in commit graph: 3001341, done. Computing commit graph generation numbers: 100% (1000447/1000447), done. No GC mode uses --stdin-packs. It's what they use at Microsoft to manually compute the generation numbers for their collection of large packs which are never coalesced. The reason we need a "report_progress" variable passed down from "git gc" is so that we don't report this output when we're running in the process "git gc --auto" detaches from the terminal. Since we write the commit graph from the "git gc" process itself (as opposed to what we do with say the "git repack" phase), we'd end up writing the output to .git/gc.log and reporting it to the user next time as part of the "The last gc run reported the following[...]" error, see `329e6e8794` ("gc: save log from daemonized gc --auto and print it next time", 2015-09-19). So we must keep track of whether or not we're running in that demonized mode, and if so print no progress. See [2] and subsequent replies for a discussion of an approach not taken in compute_generation_numbers(). I.e. we're saying "Computing commit graph generation numbers", even though on an established history we're mostly skipping over all the work we did in the past. This is similar to the white lie we tell in the "Writing objects" phase (not all are objects being written). Always showing progress is considered more important than accuracy. I.e. on a repository like 2015-04-03-1M-git.git we'd hang for 6 seconds with no output on the second "git gc" if no changes were made to any objects in the interim if we'd take the approach in [2]. 1. https://github.com/avar/2015-04-03-1M-git 2. <c6960252-c095-fb2b-e0bc-b1e6bb261614@gmail.com> (https://public-inbox.org/git/c6960252-c095-fb2b-e0bc-b1e6bb261614@gmail.com/) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 10:12:30 -07:00
Nguyễn Thái Ngọc Duy	ae9af12287	status: show progress bar if refreshing the index takes too long Refreshing the index is usually very fast, but it can still take a long time sometimes. Cold cache is one. Or copying a repo to a new place (). It's good to show something to let the user know "git status" is not hanging, it's just busy doing something. () In this case, all stat info in the index becomes invalid and git falls back to rehashing all file content to see if there's any difference between updating stat info in the index. This is quite expensive. Even with a repo as small as git.git, it takes 3 seconds. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 09:38:50 -07:00
Nguyễn Thái Ngọc Duy	ce3a7ec8bd	archive.c: remove implicit dependency the_repository The new "repo" field in archive_args has been added since `b612ee202a` (archive.c: avoid access to the_index - 2018-08-13). Use it instead of hard coding the_repository. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 09:20:06 -07:00
Tim Schumacher	fef5f7fc43	t0014: introduce an alias testing suite Introduce a testing suite that is dedicated to aliases. For now, check only if nested aliases work and if looping aliases are detected successfully. The looping aliases check for mixed execution is there but disabled, because it is blocking the test suite for a full minute. As soon as there is a solution for loops using external commands, it should be enabled. Signed-off-by: Tim Schumacher <timschumi@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:50:24 -07:00
Tim Schumacher	82f71d9a5a	alias: show the call history when an alias is looping Just printing the command that the user entered is not particularly helpful when trying to find the alias that causes the loop. Print the history of substituted commands to help the user find the offending alias. Mark the entrypoint of the loop with "<==" and the last command (which looped back to the entrypoint) with "==>". Signed-off-by: Tim Schumacher <timschumi@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:50:04 -07:00
Tim Schumacher	c6d75bc17a	alias: add support for aliases of an alias Aliases can only contain non-alias git commands and their arguments, not other user-defined aliases. Resolving further (nested) aliases is prevented by breaking the loop after the first alias was processed. Git then fails with a command-not-found error. Allow resolving nested aliases by not breaking the loop in run_argv() after the first alias was processed. Instead, continue the loop until `handle_alias()` fails, which means that there are no further aliases that can be processed. Prevent looping aliases by storing substituted commands in `cmd_list` and checking if a command has been substituted previously. While we're at it, fix a styling issue just below the added code. Signed-off-by: Tim Schumacher <timschumi@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:27:52 -07:00
Derrick Stolee	ae0c89d41b	t5318: use test_oid for HASH_LEN Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:10:32 -07:00
brian m. carlson	43c94bbfd8	t1407: make hash size independent Instead of hard-coding a 40-based constant, split the output of for-each-ref and for-each-reflog by field. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:10:32 -07:00
brian m. carlson	b1484ca94a	t1406: make hash-size independent Instead of hard-coding a 40-based constant, split the output of for-each-ref and for-each-reflog by field. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:10:32 -07:00
brian m. carlson	63477b328e	t1405: make hash size independent Instead of hard-coding a 40-based constant, split the output of for-each-ref and for-each-reflog by field. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:10:32 -07:00
brian m. carlson	44171e5bda	t1400: switch hard-coded object ID to variable Switch a hard-coded all-zeros object ID to use a variable instead. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:10:32 -07:00
brian m. carlson	e95f53137d	t1006: make hash size independent Compute the size of the tree and commit objects we're creating by checking for the size of an object ID and computing the resulting sizes accordingly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:10:32 -07:00
brian m. carlson	1374003db1	t0064: make hash size independent Compute test values of the appropriate size instead of hard-coding 40-character values. Rename the echo20 function to echoid, since the values may be of varying sizes. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 08:10:32 -07:00
Shulhan	5025425dff	builtin/remote: quote remote name on error to display empty name When adding new remote name with empty string, git will print the following error message, fatal: '' is not a valid remote name\n But when removing remote name with empty string as input, git shows the empty string without quote, fatal: No such remote: \n To make these error messages consistent, quote the name of the remote that we tried and failed to find. Signed-off-by: Shulhan <m.shulhan@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-14 09:38:18 -07:00
Thomas Gummerer	e467a90c7a	linear-assignment: fix potential out of bounds memory access Currently the 'compute_assignment()' function may read memory out of bounds, even if used correctly. Namely this happens when we only have one column. In that case we try to calculate the initial minimum cost using '!j1' as column in the reduction transfer code. That in turn causes us to try and get the cost from column 1 in the cost matrix, which does not exist, and thus results in an out of bounds memory read. In the original paper [1], the example code initializes that minimum cost to "infinite". We could emulate something similar by setting the minimum cost to INT_MAX, which would result in the same minimum cost as the current algorithm, as we'd always go into the if condition at least once, except when we only have one column, and column_count thus equals 1. If column_count does equal 1, the condition in the loop would always be false, and we'd end up with a minimum of INT_MAX, which may lead to integer overflows later in the algorithm. For a column count of 1, we however do not even really need to go through the whole algorithm. A column count of 1 means that there's no possible assignments, and we can just zero out the column2row and row2column arrays, and return early from the function, while keeping the reduction transfer part of the function the same as it is currently. Another solution would be to just not call the 'compute_assignment()' function from the range diff code in this case, however it's better to make the compute_assignment function more robust, so future callers don't run into this potential problem. Note that the test only fails under valgrind on Linux, but the same command has been reported to segfault on Mac OS. [1]: Jonker, R., & Volgenant, A. (1987). A shortest augmenting path algorithm for dense and sparse linear assignment problems. Computing, 38(4), 325–340. Reported-by: ryenus <ryenus@gmail.com> Helped-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-14 09:10:26 -07:00
brian m. carlson	0de267b292	t0002: abstract away SHA-1 specific constants Adjust the test so that it computes variables for object IDs instead of using hard-coded hashes. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-13 14:15:24 -07:00
brian m. carlson	e483e1441a	t0000: update tests for SHA-256 Test t0000 tests the "basics of the basics" and as such, checks that we have various fixed hard-coded object IDs. The tests relying on these assertions have been marked with the SHA1 prerequisite, as they will obviously not function in their current form with SHA-256. Use the test_oid helper to update these assertions and provide values for both SHA-1 and SHA-256. These object IDs were synthesized using a set of scripts that created the objects for both SHA-1 and SHA-256 using the same method to ensure that they are indeed the correct values. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-13 14:15:24 -07:00
brian m. carlson	cdd1e17f87	t0000: use hash translation table If the hash we're using is 32 bytes in size, attempting to insert a 20-byte object name won't work. Since these are synthesized objects that are almost all zeros, look them up in a translation table. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-13 14:15:24 -07:00
brian m. carlson	2c02b110da	t: add test functions to translate hash-related values Add several test functions to make working with various hash-related values easier. Add test_oid_init, which loads common hash-related constants and placeholder object IDs from the newly added files in t/oid-info. Provide values for these constants for both SHA-1 and SHA-256. Add test_oid_cache, which accepts data on standard input in the form of hash-specific key-value pairs that can be looked up later, using the same format as the files in t/oid-info. Document this format in a t/oid-info/README directory so that it's easier to use in the future. Add test_oid, which is used to specify look up a per-hash value (produced on standard output) based on the key specified as its argument. Usually the data to be looked up will be a hash-related constant (such as the size of the hash in binary or hexadecimal), a well-known or placeholder object ID (such as the all-zeros object ID or one consisting of "deadbeef" repeated), or something similar. For these reasons, test_oid will usually be used within a command substitution. Consequently, redirect the error output to standard error, since otherwise it will not be displayed. Add test_detect_hash, which currently only detects SHA-1, and test_set_hash, which can be used to set a different hash algorithm for test purposes. In the future, test_detect_hash will learn to actually detect the hash depending on how the testsuite is to be run. Use the local keyword within these functions to avoid overwriting other shell variables. We have had a test balloon in place for a couple of releases to catch shells that don't have this keyword and have not received any reports of failure. Note that the varying usages of local used here are supported by all common open-source shells supporting the local keyword. Test these new functions as part of t0000, which also serves to demonstrate basic usage of them. In addition, add documentation on how to format the lookup data and how to use the test functions. Implement two basic lookup charts, one for common invalid or synthesized object IDs, and one for various facts about the hash function in use. Provide versions of the data for both SHA-1 and SHA-256. Since we use shell variables for storage, names used for lookup can currently consist only of shell identifier characters. If this is a problem in the future, we can hash the names before use. Improved-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-13 14:15:24 -07:00
Jonathan Tan	e68302011c	fetch-object: set exact_oid when fetching fetch_objects() currently does not set exact_oid in struct ref when invoking transport_fetch_refs(). If the server supports ref-in-want, fetch_pack() uses this field to determine whether a wanted ref should be requested as a "want-ref" line or a "want" line; without the setting of exact_oid, the wrong line will be sent. Set exact_oid, so that the correct line is sent. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-13 13:57:31 -07:00
Jonathan Tan	8708ca09a6	fetch-object: unify fetch_object[s] functions There are fetch_object() and fetch_objects() helpers in fetch-object.h; as the latter takes "struct oid_array", the former cannot be made into a thin wrapper around the latter without an extra allocation and set-up cost. Update fetch_objects() to take an array of "struct object_id" and number of elements in it as separate parameters, remove fetch_object(), and adjust all existing callers of these functions to use the new fetch_objects(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-13 13:56:19 -07:00
Elijah Newren	a3ec9eaf38	sequencer: fix --allow-empty-message behavior, make it smarter In commit `b00bf1c9a8` ("git-rebase: make --allow-empty-message the default", 2018-06-27), several arguments were given for transplanting empty commits without halting and asking the user for confirmation on each commit. These arguments were incomplete because the logic clearly assumed the only cases under consideration were transplanting of commits with empty messages (see the comment about "There are two sources for commits with empty messages). It didn't discuss or even consider rewords, squashes, etc. where the user is explicitly asked for a new commit message and provides an empty one. (My bad, I totally should have thought about that at the time, but just didn't.) Rewords and squashes are significantly different, though, as described by SZEDER: Let's suppose you start an interactive rebase, choose a commit to squash, save the instruction sheet, rebase fires up your editor, and then you notice that you mistakenly chose the wrong commit to squash. What do you do, how do you abort? Before [that commit] you could clear the commit message, exit the editor, and then rebase would say "Aborting commit due to empty commit message.", and you get to run 'git rebase --abort', and start over. But [since that commit, ...] saving the commit message as is would let rebase continue and create a bunch of unnecessary objects, and then you would have to use the reflog to return to the pre-rebase state. Also, he states: The instructions in the commit message template, which is shown for 'reword' and 'squash', too, still say... # Please enter the commit message for your changes. Lines starting # with '#' will be ignored, and an empty message aborts the commit. These are sound arguments that when editing commit messages during a sequencer operation, that if the commit message is empty then the operation should halt and ask the user to correct. The arguments in commit `b00bf1c9a8` (referenced above) still apply when transplanting previously created commits with empty commit messages, so the sequencer should not halt for those. Furthermore, all rationale so far applies equally for cherry-pick as for rebase. Therefore, make the code default to --allow-empty-message when transplanting an existing commit, and to default to halting when the user is asked to edit a commit message and provides an empty one -- for both rebase and cherry-pick. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-13 13:25:08 -07:00
Ævar Arnfjörð Bjarmason	371a655074	fsck: support comments & empty lines in skipList It's annoying not to be able to put comments and empty lines in the skipList, when e.g. keeping a big central list of commits to skip in /etc/gitconfig, which was my motivation for `1362df0d41` ("fetch: implement fetch.fsck.*", 2018-07-27). Implement that, and document what version of Git this was changed in, since this on-disk format can be expected to be used by multiple versions of git. There is no notable performance impact from this change, using the test setup described a couple of commits back: Test HEAD~ HEAD ---------------------------------------------------------------------------------------- 1450.3: fsck with 0 skipped bad commits 7.69(7.27+0.42) 7.86(7.48+0.37) +2.2% 1450.5: fsck with 1 skipped bad commits 7.69(7.30+0.38) 7.83(7.47+0.36) +1.8% 1450.7: fsck with 10 skipped bad commits 7.76(7.38+0.38) 7.79(7.38+0.41) +0.4% 1450.9: fsck with 100 skipped bad commits 7.76(7.38+0.38) 7.74(7.36+0.38) -0.3% 1450.11: fsck with 1000 skipped bad commits 7.71(7.30+0.41) 7.72(7.34+0.38) +0.1% 1450.13: fsck with 10000 skipped bad commits 7.74(7.34+0.40) 7.72(7.34+0.38) -0.3% 1450.15: fsck with 100000 skipped bad commits 7.75(7.40+0.35) 7.70(7.29+0.40) -0.6% 1450.17: fsck with 1000000 skipped bad commits 7.12(6.86+0.26) 7.13(6.87+0.26) +0.1% Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
René Scharfe	3b41fb0cb2	fsck: use oidset instead of oid_array for skipList Change the implementation of the skipList feature to use oidset instead of oid_array to store SHA-1s for later lookup. This list is parsed once on startup by fsck, fetch-pack or receive-pack depending on the *.skipList config in use. I.e. only once per invocation, but note that for "clone --recurse-submodules" each submodule will re-parse the list, in addition to the main project, and it will be re-parsed when checking .gitmodules blobs, see `fb16287719` ("fsck: check skiplist for object in fsck_blob()", 2018-06-27). Memory usage is a bit higher, but we don't need to keep track of the sort order anymore. Embed the oidset into struct fsck_options to make its ownership clear (no hidden sharing) and avoid unnecessary pointer indirection. The cumulative impact on performance of this & the preceding change, using the test setup described in the previous commit: Test HEAD~2 HEAD~ HEAD ---------------------------------------------------------------------------------------------------------------- 1450.3: fsck with 0 skipped bad commits 7.70(7.31+0.38) 7.72(7.33+0.38) +0.3% 7.70(7.30+0.40) +0.0% 1450.5: fsck with 1 skipped bad commits 7.84(7.47+0.37) 7.69(7.32+0.36) -1.9% 7.71(7.29+0.41) -1.7% 1450.7: fsck with 10 skipped bad commits 7.81(7.40+0.40) 7.94(7.57+0.36) +1.7% 7.92(7.55+0.37) +1.4% 1450.9: fsck with 100 skipped bad commits 7.81(7.42+0.38) 7.95(7.53+0.41) +1.8% 7.83(7.42+0.41) +0.3% 1450.11: fsck with 1000 skipped bad commits 7.99(7.62+0.36) 7.90(7.50+0.40) -1.1% 7.86(7.49+0.37) -1.6% 1450.13: fsck with 10000 skipped bad commits 7.98(7.57+0.40) 7.94(7.53+0.40) -0.5% 7.90(7.45+0.44) -1.0% 1450.15: fsck with 100000 skipped bad commits 7.97(7.57+0.39) 8.03(7.67+0.36) +0.8% 7.84(7.43+0.41) -1.6% 1450.17: fsck with 1000000 skipped bad commits 7.72(7.22+0.50) 7.28(7.07+0.20) -5.7% 7.13(6.87+0.25) -7.6% Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
René Scharfe	fb8952077d	fsck: use strbuf_getline() to read skiplist file The buffer is unlikely to contain a NUL character, so printing its contents using %s in a die() format is unsafe (detected with ASan). Use an idiomatic strbuf_getline() loop instead, which ensures the buffer is always NUL-terminated, supports CRLF files as well, accepts files without a newline after the last line, supports any hash length automatically, and is shorter. This fixes a bug where emitting an error about an invalid line on say line 1 would continue printing subsequent lines, and usually continue into uninitialized memory. The performance impact of this, on a CentOS 7 box with RedHat GCC 4.8.5-28: $ GIT_PERF_REPEAT_COUNT=5 GIT_PERF_MAKE_OPTS='-j56 CFLAGS="-O3"' ./run HEAD~ HEAD p1451-fsck-skip-list.sh Test HEAD~ HEAD ---------------------------------------------------------------------------------------- 1450.3: fsck with 0 skipped bad commits 7.75(7.39+0.35) 7.68(7.29+0.39) -0.9% 1450.5: fsck with 1 skipped bad commits 7.70(7.30+0.40) 7.80(7.42+0.37) +1.3% 1450.7: fsck with 10 skipped bad commits 7.77(7.37+0.40) 7.87(7.47+0.40) +1.3% 1450.9: fsck with 100 skipped bad commits 7.82(7.41+0.40) 7.88(7.43+0.44) +0.8% 1450.11: fsck with 1000 skipped bad commits 7.88(7.49+0.39) 7.84(7.43+0.40) -0.5% 1450.13: fsck with 10000 skipped bad commits 8.02(7.63+0.39) 8.07(7.67+0.39) +0.6% 1450.15: fsck with 100000 skipped bad commits 8.01(7.60+0.41) 8.08(7.70+0.38) +0.9% 1450.17: fsck with 1000000 skipped bad commits 7.60(7.10+0.50) 7.37(7.18+0.19) -3.0% Helped-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
René Scharfe	01e0d545ab	fsck: add a performance test for skipList Create a performance test to see how the skipList implementation performs. First we setup N bad commits, then we see how progressively working our way up to 0..N in increments of 10x does. I.e. the needle(s) in the haystack get progressively more numerous. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
Ævar Arnfjörð Bjarmason	6cb173b5b6	fsck: add a performance test Add a plain performance test for "fsck". This test will not be used to / referred to in any upcoming commit of mine in this series, but having a simple test for fsck performance is valuable, so let's add it while we're at it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
Ævar Arnfjörð Bjarmason	12b1c50a42	fsck: document that skipList input must be unabbreviated Abbreviating the SHA-1s in the skipList input has never worked, but the documentation hasn't unambiguously stated that this is an error, and there was no test for it. Let's fix both since it would be easy for some later refactoring e.g. switch to accidentally switch to a looser OID parsing function, causing the tests before this change to pass, but for older versions of git to be incompatible with the new skipList format. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
Ævar Arnfjörð Bjarmason	f706c42bab	fsck: document and test commented & empty line skipList input There is currently no comment syntax for the fsck.skipList, this isn't really by design, and it would be nice to have support for comments. Document that this doesn't work, and test for how this errors out. These tests reveal a current bug, if there's invalid input the output will emit some of the next line, and then go into uninitialized memory. This is fixed in a subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
Ævar Arnfjörð Bjarmason	58dc440b3c	fsck: document and test sorted skipList input Ever since the skipList support was first added in `cd94c6f91` ("fsck: git receive-pack: support excluding objects from fsck'ing", 2015-06-22) the documentation for the format has that the file is a sorted list of object names. Thus, anyone using the feature would have thought the list needed to be sorted. E.g. I recently in conjunction with my fetch.fsck.* implementation in `1362df0d41` ("fetch: implement fetch.fsck.*", 2018-07-27) wrote some code to ship a skipList, and went out of my way to sort it. Doing so seems intuitive, since it contains fixed-width records, and has no support for comments, so one might expect it to be binary searched in-place on-disk. However, as documented here this was never a requirement, so let's change the documentation. Since this is a file format change let's also document what was said about this in the past, so e.g. someone like myself reading the new docs can see this never needed to be sorted ("why do I have all this code to sort this thing..."). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
Ævar Arnfjörð Bjarmason	536a9ce80d	fsck tests: add a test for no skipList input The recent `65a836fa6b` ("fsck: add stress tests for fsck.skipList", 2018-07-27) added various stress tests for odd invocations of fsck.skipList, but didn't tests for some very simple ones, such as asserting that providing to skipList with a bad commit causes fsck to exit with a non-zero exit code. Add such a test. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
Ævar Arnfjörð Bjarmason	134b7327d0	fsck tests: setup of bogus commit object Several fsck tests used the exact same git-hash-object output, but had copy/pasted that part of the setup code. Let's instead do that setup once and use it in subsequent tests. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:46 -07:00
Elijah Newren	d345e9fbe7	update-ref: allow --no-deref with --stdin If passed both --no-deref and --stdin, update-ref would error out with a general usage message that did not at all suggest these options were incompatible. The manpage for update-ref did suggest through its synopsis line that --no-deref and --stdin were incompatible, but it sadly also incorrectly suggested that -d and --no-deref were incompatible. So the help around the --no-deref option is buggy in a few ways. The --stdin option did provide a different mechanism for avoiding dereferencing symbolic-refs: adding a line reading option no-deref before every other directive in the input. (Technically, if the user wants to do the extra work of first determining which refs they want to update or delete are symbolic, then they only need to put the extra "option no-deref" lines before the updates of those refs. But in some cases, that's more work than just adding the "option no-deref" before every other directive.) It's easier to allow the user to just pass --no-deref along with --stdin in order to tell update-ref that the user doesn't want any symbolic ref to be dereferenced. It also makes the update-ref documentation simpler. Implement that, and update the documentation to match. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-12 15:17:17 -07:00

... 4 5 6 7 8 ...

53336 Commits