Various small fixups to the "GIT_TRACE" facility.
* jk/trace-fixup:
trace: do not fall back to stderr
write_or_die: drop write_or_whine_pipe()
trace: disable key after write error
trace: correct variable name in write() error message
trace: cosmetic fixes for error messages
trace: use warning() for printing trace errors
trace: stop using write_or_whine_pipe()
trace: handle NULL argument in trace_disable()
"import-tars" fast-import script (in contrib/) used to ignore a
hardlink target and replaced it with an empty file, which has been
corrected to record the same blob as the other file the hardlink is
shared with.
* js/import-tars-hardlinks:
import-tars: support hard links
"git difftool <paths>..." started in a subdirectory failed to
interpret the paths relative to that directory, which has been
fixed.
* jk/difftool-in-subdir:
difftool: use Git::* functions instead of passing around state
difftool: avoid $GIT_DIR and $GIT_WORK_TREE
difftool: fix argument handling in subdirs
A not-so-recent rewrite of "git am" that started making internal
calls into the commit machinery had an unintended regression: no
matter how many seconds it took to apply many patches, the committer
timestamps of the resulting commits were all the same.
* jk/reset-ident-time-per-commit:
am: reset cached ident date for each patch
The `rebase` family of Git commands avoids applying patches that were
already integrated upstream. It does that by using the revision-walking
option that computes the patch IDs of the two sides of the rebase
(local-only patches vs upstream-only ones) and skipping those local
patches whose patch ID matches one of the upstream ones.
In many cases this causes unnecessary churn, as the set of paths
touched by a given commit would often already suffice to determine that
an upstream patch has no local equivalent.
This hurts performance in particular when there are a lot of upstream
patches, and/or large ones.
Therefore, let's introduce the concept of a "diff-header-only" patch ID,
compare those first, and only evaluate the "full" patch ID lazily.
Please note that in contrast to the "full" patch IDs, those
"diff-header-only" patch IDs are prone to collide with one another, as
adjacent commits frequently touch the very same files. Hence we now
have to be careful to allow multiple hash entries with the same hash.
We accomplish that by using the hashmap_add() function, which does not
even test for hash collisions. This also allows us to evaluate the full
patch ID lazily, i.e. only when we find commits with matching
diff-header-only patch IDs.
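For illustration, the lookup strategy then looks roughly like this (a
simplified sketch; the struct layout and the ensure_full_patch_id()
helper are made up for this example, not the exact code):

    struct patch_id {
        struct hashmap_entry ent;         /* hashed over the diff header only */
        unsigned char header_only_id[20];
        unsigned char full_id[20];        /* filled in lazily */
        int have_full_id;
        struct commit *commit;
    };

    static int same_patch(struct patch_id *a, struct patch_id *b)
    {
        if (hashcmp(a->header_only_id, b->header_only_id))
            return 0;                     /* cheap reject, no diff needed */
        ensure_full_patch_id(a);          /* hypothetical lazy computation */
        ensure_full_patch_id(b);
        return !hashcmp(a->full_id, b->full_id);
    }

Because several commits may share the same diff-header-only ID, entries
are inserted with hashmap_add(), and a lookup walks all entries with
that hash and runs the comparison above on each of them.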
We add a performance test that demonstrates ~1-6% improvement. In
practice this will depend on various factors such as how many upstream
changes and how big those changes are along with whether file system
caches are cold or warm. As Git's test suite has no way of catching
performance regressions, we also add a regression test that verifies
that the full patch ID computation is skipped when the diff-header-only
computation suffices.
Signed-off-by: Kevin Willford <kcwillford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", write_out_results() should return -1 instead of
calling exit().
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", write_out_one_result() should just return what
remove_file() and create_file() are returning instead of calling
exit().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", create_file() should just return what
add_conflicted_stages_file() and add_index_file() are returning
instead of calling exit().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", add_index_file() should return -1 instead of
calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", add_conflicted_stages_file() should return -1
instead of calling die().
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", remove_file() should return -1 instead of
calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", build_fake_ancestor() should return -1 instead
of calling die().
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", die_on_unsafe_path() should return a negative
integer instead of calling die(); while doing that, let's also rename
it to check_unsafe_path().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", gitdiff_*() functions should return -1 instead
of calling die().
A previous patch made it possible for gitdiff_*() functions to
return -1 in case of error. Let's take advantage of that to
make gitdiff_verify_name() return -1 on error, and to have
gitdiff_oldname() and gitdiff_newname() directly return
what gitdiff_verify_name() returns.
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The gitdiff_*() functions that are called as p->fn() in parse_git_header()
should return 1 instead of -1 when they see the end of the header or
unrecognized input, as these are not real errors; returning 1 just
instructs the parser to break out of its loop.
This makes it possible for gitdiff_*() functions to return -1 in case of a
real error. This will be done in a following patch.
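Schematically, the dispatch loop in parse_git_header() can then tell the
three outcomes apart like this (a simplified sketch of the convention,
not the literal hunk):

    int res = p->fn(line + oplen, patch);   /* one of the gitdiff_*() helpers */
    if (res < 0)
        return -1;   /* real error: propagate to the caller */
    if (res > 0)
        break;       /* end of header or unrecognized input: stop parsing */
    /* res == 0: header line consumed, keep scanning */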
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", parse_traditional_patch() should return -1
instead of calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To finish libifying the apply functionality, apply_all_patches() should not
die() or exit() in case of error, but return either 128 or 1, so that it
gives the same exit code as when die() or exit(1) is called. This way
scripts relying on the exit code don't need to be changed.
While doing that we must take care that file descriptors are properly closed
and, if needed, reset to a sensible value.
Also, according to the lockfile API, when finished with a lockfile, one
should either commit it or roll it back.
This is even more important now that the same lockfile can be passed
to init_apply_state() many times to be reused by a series of calls to
the apply lib functions.
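For example, the error path now has to make sure the lockfile is not
left dangling; a sketch of the idea (the field names on apply_state are
illustrative, not necessarily the real ones):

    if (res < 0) {                       /* error: do not leave the lock behind */
        if (state->newfd >= 0) {
            rollback_lock_file(state->lock_file);
            state->newfd = -1;
        }
        return (res == -128) ? 128 : 1;  /* preserve the old exit codes */
    }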
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we must make check_apply_state()
usable outside "builtin/apply.c".
Let's do that by moving it into "apply.c".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", check_apply_state() should return -1 instead of
calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of exit()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", init_apply_state() should return -1 instead of
calling exit().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we must make init_apply_state()
usable outside "builtin/apply.c".
Let's do that by moving it into a new "apply.c".
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in "builtin/apply.c", parse_ignorewhitespace_option() should return
-1 instead of calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in builtin/apply.c, parse_whitespace_option() should return -1 instead
of calling die().
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in builtin/apply.c, parse_single_patch() should return a negative
integer instead of calling die().
Let's do that by using error() and adjust the related test
cases accordingly.
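The shape of the conversion is simply the following (illustrative; the
message shown is a placeholder, not the literal hunk from this patch):

    /* before: aborts the whole process */
    die(_("some error message"));

    /* after: prints the same message and hands -1 back to the caller */
    return error(_("some error message"));

error() prints the message to stderr with an "error: " prefix and always
returns -1, which is what makes it a convenient drop-in replacement.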
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing or exit()ing.
To do that in a manner compatible with the rest of the error handling
in builtin/apply.c, parse_chunk() should return a negative integer
instead of calling die() or exit().
As parse_chunk() is called only by apply_patch(), which already
returns either -1 or -128 when an error happens, let's make it also
return -1 or -128.
This makes it compatible with what find_header() and parse_binary()
already return.
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing.
To do that in a manner compatible with the rest of the error handling
in builtin/apply.c, let's make find_header() return -128 instead of
calling die().
We could make it return -1, but unfortunately find_header() already
returns -1 when no header is found.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors to the
caller instead of die()ing. Let's do that by returning -1 instead of
die()ing in read_patch_file().
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we have to signal errors
to the caller instead of die()ing.
As a first step in this direction, let's make apply_patch() return
-1 or -128 in case of errors instead of dying. For now its only
caller, apply_all_patches(), will exit(128) when apply_patch()
returns -128 and exit(1) when it returns -1.
We exit() with code 128 because that was what die() was doing
and we want to keep the distinction between exiting with code 1
and exiting with code 128.
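In the caller this boils down to something like the following (a
sketch; the actual code also has to close file descriptors and clean up
before exiting):

    res = apply_patch(state, fd, arg, options);
    close(fd);
    if (res < 0)
        exit(res == -128 ? 128 : 1);   /* -128 is what die() used to do */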
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To libify `git apply` functionality we must make 'struct apply_state'
usable outside "builtin/apply.c".
Let's do that by creating a new "apply.h" and moving
'struct apply_state' there.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To prepare for some structs and constants being moved from
builtin/apply.c to apply.h, we should give them some more
specific names to avoid possible name collisions in the global
namespace.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit message is long and has lots of background and
numbers. The summary is: the current default of 250 doesn't
save much space, and costs CPU. It's not a good tradeoff.
Read on for details.
The "--aggressive" flag to git-gc does three things:
1. use "-f" to throw out existing deltas and recompute from
scratch
2. use "--window=250" to look harder for deltas
3. use "--depth=250" to make longer delta chains
Items (1) and (2) are good matches for an "aggressive"
repack. They ask the repack to do more computation work in
the hopes of getting a better pack. You pay the costs during
the repack, and other operations see only the benefit.
Item (3) is not so clear. Allowing longer chains means fewer
restrictions on the deltas, which means potentially finding
better ones and saving some space. But it also means that
operations which access the deltas have to follow longer
chains, which affects their performance. So it's a tradeoff,
and it's not clear that the tradeoff is even a good one.
The existing "250" numbers for "--aggressive" come
originally from this thread:
http://public-inbox.org/git/alpine.LFD.0.9999.0712060803430.13796@woody.linux-foundation.org/
where Linus says:
So when I said "--depth=250 --window=250", I chose those
numbers more as an example of extremely aggressive
packing, and I'm not at all sure that the end result is
necessarily wonderfully usable. It's going to save disk
space (and network bandwidth - the delta's will be re-used
for the network protocol too!), but there are definitely
downsides too, and using long delta chains may
simply not be worth it in practice.
There are some numbers in that thread, but they're mostly
focused on the improved window size, and measure the
improvement from --depth=250 and --window=250 together.
E.g.:
http://public-inbox.org/git/9e4733910712062006l651571f3w7f76ce64c6650dff@mail.gmail.com/
talks about the improved run-time of "git-blame", which
comes from the reduced pack size. But most of that reduction
is coming from --window=250, whereas most of the extra costs
come from --depth=250. There's a link in that thread showing
that increasing the depth beyond 50 doesn't seem to help
much with the size:
https://vcscompare.blogspot.com/2008/06/git-repack-parameters.html
but again, no discussion of the timing impact.
In an earlier thread from Ted Ts'o which discussed setting
the non-aggressive default (from 10 to 50):
http://public-inbox.org/git/20070509134958.GA21489%40thunk.org/
we have more numbers, with the conclusion that going past 50
does not help size much, and hurts the speed of normal
operations.
So from that, we might guess that 50 is actually a sweet
spot, even for aggressive packing, if we interpret "aggressive"
to mean "spend time now to make a better pack". It is not clear
that "--depth=250" actually produces a better pack. It may be
slightly _smaller_, but it carries a run-time penalty.
Here are some more recent timings I did to verify that. They
show three things:
- the size of the resulting pack (so disk saved to store,
bandwidth saved on clones/fetches)
- the cost of "rev-list --objects --all", which shows the
effect of the delta chains on trees (commits typically
don't delta, and the command doesn't touch the blobs at
all)
- the cost of "log -Sfoo", which will additionally access
each blob
All cases were repacked with "git repack -adf --depth=$d
--window=250" (so basically, what would happen if we tweaked
the "gc --aggressive" default depth).
The timings are all wall-clock best-of-3. The machine itself
has plenty of RAM compared to the repositories (which is
probably typical of most workstations these days), so we're
really measuring CPU usage, as the whole thing will be in
disk cache after the first run.
The core.deltaBaseCacheLimit is at its default of 96MiB.
It's possible that tweaking it would have some impact on the
tests, as some of them (especially "log -S" on a large repo)
are likely to overflow that. But bumping that carries a
run-time memory cost, so for these tests, I focused on what
we could do just with the on-disk pack tradeoffs.
Each test is done for four depths: 250 (the current value),
50 (the current default that tested well previously), 100
(to show something on the larger side, which previous tests
showed was not a good tradeoff), and 10 (the very old
default, which previous tests showed was worse than 50).
Here are the numbers for linux.git:
depth |  size |   %   | rev-list |   %    | log -Sfoo |   %
------+-------+-------+----------+--------+-----------+--------
  250 | 967MB |  n/a  | 48.159s  |  n/a   | 378.088s  |  n/a
  100 | 971MB | +0.4% | 41.471s  | -13.9% | 342.060s  |  -9.5%
   50 | 979MB | +1.2% | 37.778s  | -21.6% | 311.040s  | -17.7%
   10 | 1.1GB | +6.6% | 32.518s  | -32.5% | 279.890s  | -25.9%
and for git.git:
depth |  size |   %   | rev-list |   %    | log -Sfoo |   %
------+-------+-------+----------+--------+-----------+--------
  250 |  48MB |  n/a  |  2.215s  |  n/a   |  20.922s  |  n/a
  100 |  49MB | +0.5% |  2.140s  |  -3.4% |  17.736s  | -15.2%
   50 |  49MB | +1.7% |  2.099s  |  -5.2% |  15.418s  | -26.3%
   10 |  53MB | +9.3% |  2.001s  |  -9.7% |  12.677s  | -39.4%
You can see that the CPU savings for regular operations improve as we
decrease the depth. The savings are smaller for "rev-list" on a smaller
repository than they are for blob-accessing operations, or even for
rev-list on a larger repository. This may mean that a larger delta cache
would help (though setting core.deltaBaseCacheLimit by itself doesn't).
But we can also see that the space savings are not that great as the depth goes
higher. Saving 5-10% between 10 and 50 is probably worth the CPU tradeoff.
Saving 1% to go from 50 to 100, or another 0.5% to go from 100 to 250 is
probably not.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add lf_to_nul helper function to test-lib-functions.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update status manpage to include information about
porcelain v2 format.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expand porcelain v2 output to include branch and tracking
branch information. This includes the commit id, the branch,
the upstream branch, and the ahead and behind counts.
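For reference, with `git status --porcelain=v2 --branch` these show up
as header lines of the following shape (the values here are made up):

    # branch.oid <commit hash>
    # branch.head master
    # branch.upstream origin/master
    # branch.ab +3 -2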
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Print per-file information in porcelain v2 format.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Collect extra per-file data for porcelain V2 format.
The output of `git status --porcelain` leaves out many
details about the current status that clients might like
to have. This can force them to be less efficient as they
may need to launch secondary commands (and try to match
the logic within git) to accumulate this extra information.
For example, a GUI IDE might want the file mode to display
the correct icon for a changed item (without having to stat
it afterwards).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the original implementation of want_object_in_pack(), we
always looked for the object in every pack, so the order did
not matter for performance.
As of the last few patches, however, we can now often break
out of the loop early after finding the first instance, and
avoid looking in the other packs at all. In this case, pack
order can make a big difference, because we'd like to find
the objects by looking at as few packs as possible.
This patch switches us to the same packed_git_mru list that
is now used by normal object lookups.
Here are timings for p5303 on linux.git:
Test                      HEAD^                 HEAD
------------------------------------------------------------------------
5303.3: rev-list (1)       31.31(31.07+0.23)     31.28(31.00+0.27)  -0.1%
5303.4: repack (1)         40.35(38.84+2.60)     40.53(39.31+2.32)  +0.4%
5303.6: rev-list (50)      31.37(31.15+0.21)     31.41(31.16+0.24)  +0.1%
5303.7: repack (50)        58.25(68.54+2.03)     47.28(57.66+1.89) -18.8%
5303.9: rev-list (1000)    31.91(31.57+0.33)     31.93(31.64+0.28)  +0.1%
5303.10: repack (1000)    304.80(376.00+3.92)    87.21(159.54+2.84) -71.4%
The rev-list numbers are unchanged, which makes sense (they
are not exercising this code at all). The 50- and 1000-pack
repack cases show considerable improvement.
The single-pack repack case doesn't, of course; there's
nothing to improve. In fact, it gives us a baseline for how
fast we could possibly go. You can see that though rev-list
can approach the single-pack case even with 1000 packs,
repack doesn't. The reason is simple: the loop we are
optimizing is only part of what the repack is doing. After
the "counting" phase, we do delta compression, which is much
more expensive when there are multiple packs, because we
have fewer deltas we can reuse (you can also see that these
numbers come from a multicore machine; the CPU times are
much higher than the wall-clock times due to the delta
phase).
So the good news is that in cases with many packs, we used
to be dominated by the "counting" phase, and now we are
dominated by the delta compression (which is faster, and
which we have already parallelized).
Here are similar numbers for git.git:
Test                      HEAD^               HEAD
---------------------------------------------------------------------
5303.3: rev-list (1)       1.55(1.51+0.02)     1.54(1.53+0.00)   -0.6%
5303.4: repack (1)         1.82(1.80+0.08)     1.82(1.78+0.09)   +0.0%
5303.6: rev-list (50)      1.58(1.57+0.00)     1.58(1.56+0.01)   +0.0%
5303.7: repack (50)        2.50(3.12+0.07)     2.31(2.95+0.06)   -7.6%
5303.9: rev-list (1000)    2.22(2.20+0.02)     2.23(2.19+0.03)   +0.5%
5303.10: repack (1000)    10.47(16.78+0.22)    7.50(13.76+0.22) -28.4%
Not as impressive in terms of percentage, but still
measurable wins. If you look at the wall-clock time
improvements in the 1000-pack case, you can see that linux
improved by roughly 10x as many seconds as git. That's
because it has roughly 10x as many objects, and we'd expect
this improvement to scale linearly with the number of
objects (since the number of packs is kept constant). It's
just that the "counting" phase is a smaller percentage of
the total time spent for a git.git repack, and hence the
percentage win is smaller.
The implementation itself is a straightforward use of the
MRU code. We only bother marking a pack as used when we know
that we are able to break early out of the loop, for two
reasons:
1. If we can't break out early, it does no good; we have
to visit each pack anyway, so we might as well avoid
even the minor overhead of managing the cache order.
2. The mru_mark() function reorders the list, which would
screw up our traversal. So it is only safe to mark when
we are about to break out of the loop. We could record
the found pack and mark it after the loop finishes, of
course, but that's more complicated and it doesn't buy
us anything due to (1).
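The resulting loop then has roughly this shape (a simplified sketch of
want_object_in_pack(); "can_stop_early" stands in for the real
conditions that allow the early break):

    for (entry = packed_git_mru.head; entry; entry = entry->next) {
        struct packed_git *p = entry->item;
        off_t offset = find_pack_entry_one(sha1, p);

        if (offset) {
            /* ... remember the found pack and offset ... */
            if (can_stop_early) {
                mru_mark(&packed_git_mru, entry);
                break;   /* safe: we never touch the list again */
            }
        }
    }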
Note that this reordering does have a potential impact on
the final pack, as we store only a single "found" pack for
each object, even if it is present in multiple packs. In
principle, any copy is acceptable, as they all refer to the
same content. But in practice, they may differ in whether
they are stored as deltas, against which base, etc. This may
have an impact on delta reuse, and even the delta search
(since we skip pairs that were already in the same pack).
It's not clear whether this change of order would hurt or
even help average cases, though. The most likely reason to
have duplicate objects is from the completion of thin packs
(e.g., you have some objects in a "base" pack, then receive
several pushes; the packs you receive may be thin on the
wire, with deltas that refer to bases outside the pack, but
we complete them with duplicate base objects when indexing
them).
In such a case the current code would always find the thin
duplicates (because we currently walk the packs in reverse
chronological order), whereas with this patch some of those
duplicates would be found in the base pack instead.
In my tests repacking a real-world case of linux.git with
3600 thin-pack pushes (on top of a large "base" pack), the
resulting pack was about 0.04% larger with this patch. On
the other hand, because we were more likely to hit the base
pack, there were more opportunities for delta reuse, and we
had 50,000 fewer objects to examine in the delta search.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not allow cycles in the delta graph of a pack (i.e., A
is a delta of B which is a delta of A) for the obvious
reason that you cannot actually access any of the objects in
such a case.
There's a last-ditch attempt to notice cycles during the
write phase, during which we issue a warning to the user and
write one of the objects out in full. However, this is
"last-ditch" for two reasons:
1. By this time, it's too late to find another delta for
the object, so the resulting pack is larger than it
otherwise could be.
2. The warning is there because this is something that
_shouldn't_ ever happen. If it does, then either:
a. a pack we are reusing deltas from had its own
cycle
b. we are reusing deltas from multiple packs, and
we found a cycle among them (i.e., A is a delta of
B in one pack, but B is a delta of A in another,
and we choose to use both deltas).
c. there is a bug in the delta-search code
So this code serves as a final check that none of these
things has happened, warns the user, and prevents us
from writing a bogus pack.
Right now, (2b) should never happen because of the static
ordering of packs in want_object_in_pack(). If two objects
have a delta relationship, then they must be in the same
pack, and therefore we will find them from that same pack.
However, a future patch would like to change that static
ordering, which will make (2b) a common occurrence. In
preparation, we should be able to handle those kinds of
cycles better. This patch does so by introducing a
cycle-breaking step during the get_object_details() phase,
when we are deciding which deltas can be reused. That gives
us the chance to feed the objects into the delta search as
if the cycle did not exist.
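Conceptually, the new step walks each chain of reused deltas and drops
the delta that would close a loop; a simplified sketch of that idea
(the real code also has to remember which chains are already fully
processed so that each object is visited only once):

    for (cur = entry; cur; cur = cur->delta) {
        if (cur->dfs_state == DFS_ACTIVE) {
            /*
             * Following cur->delta again would lead back onto the
             * current chain: break the cycle here and let the normal
             * delta search find a new base for this object.
             */
            cur->delta = NULL;
            break;
        }
        cur->dfs_state = DFS_ACTIVE;
    }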
We'll leave the detection and warning in the write_object()
phase in place, as it still serves as a check for case (2c).
This does mean we will stop warning for (2a). That case is
caused by bogus input packs, and we ideally would warn the
user about it. However, since those cycles show up after
picking reusable deltas, they look the same as (2b) to us;
our new code will break the cycles early and the last-ditch
check will never see them.
We could do analysis on any cycles that we find to
distinguish the two cases (i.e., it is a bogus pack if and
only if every delta in the cycle is in the same pack), but
we don't need to. If there is a cycle inside a pack, we'll
run into problems not only reusing the delta, but accessing
the object data at all. So when we try to dig up the actual
size of the object, we'll hit that same cycle and kick in
our usual complain-and-try-another-source code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some code may have a pack/offset pair for an object, but
would like to look up more information. Using
sha1_object_info() is too heavy-weight; it starts from the
sha1 and has to find the pack again (so not only does it waste
time, it might not even find the same instance).
In some cases, this problem is solved by helpers like
get_size_from_delta(), which is used by pack-objects to take
a shortcut for objects whose packed representation has
already been found. But there's no similar function for
getting the object type, for instance. Rather than introduce
one, let's just make the whole packed_object_info() available.
It is smart enough to spend effort only on the items the
caller wants.
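A caller that already has the pack/offset pair can then ask for just
what it needs, e.g. (a sketch; "pack" and "offset" are assumed to be the
already-known pack and offset, and only the requested fields are filled):

    struct object_info oi = {NULL};
    enum object_type type;
    unsigned long size;

    oi.typep = &type;
    oi.sizep = &size;
    if (packed_object_info(pack, offset, &oi) < 0)
        return -1;   /* corrupt or otherwise unusable entry */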
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An all-zero initializer is fine for this struct, but because
the first element is a pointer, call sites need to know to
use "NULL" instead of "0". Otherwise some static checkers
like "sparse" will complain; see d099b71 (Fix some sparse
warnings, 2013-07-18) for example. So let's provide an
initializer to make this easier to get right.
But let's also comment that memset() to zero is explicitly
OK[1]. One of the callers embeds object_info in another
struct which is initialized via memset (expand_data in
builtin/cat-file.c). Since our subset of C doesn't allow
assignment from a compound literal, handling this in any
other way is awkward, so we'd like to keep the ability to
initialize by memset(). By documenting this property, it
should make anybody who wants to change the initializer
think twice before doing so.
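In practice that means both of these forms are fine (the initializer
added here is spelled OBJECT_INFO_INIT):

    /* on the stack: use the initializer */
    struct object_info oi = OBJECT_INFO_INIT;

    /*
     * embedded in a larger struct (e.g. expand_data in
     * builtin/cat-file.c): zeroing with memset() remains OK
     */
    memset(&data, 0, sizeof(data));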
There's one other caller of interest. In parse_sha1_header(),
we did not initialize the struct fully in the first place.
This turned out not to be a bug because the sub-function it
calls does not look at any other fields except the ones we
did initialize. But that assumption might not hold in the
future, so it's a dangerous construct. This patch switches
it to initializing the whole struct, which protects us
against unexpected reads of the other fields.
[1] Obviously using memset() to initialize a pointer
violates the C standard, but we long ago decided that it
was an acceptable tradeoff in the real world.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>