git-commit-vandalism/builtin
Derrick Stolee daa1acefc5 commit: integrate with sparse-index
Update 'git commit' to allow using the sparse-index in memory without
expanding to a full one. The only place that had an ensure_full_index()
call was in cache_tree_update(). The recursive algorithm for
update_one() was already updated in 2de37c536 (cache-tree: integrate
with sparse directory entries, 2021-03-03) to handle sparse directory
entries in the index.

Most of this change involves testing different command-line options that
allow specifying which on-disk changes should be included in the commit.
This includes no options (only take currently-staged changes), -a (take
all tracked changes), and --include (take a list of specific changes).
To simplify testing that these options do not expand the index, update
the test that previously verified that 'git status' does not expand the
index with a helper method, ensure_not_expanded().

This allows 'git commit' to operate much faster when the sparse-checkout
cone is much smaller than the full list of files at HEAD.

Here are the relevant lines from p2000-sparse-operations.sh:

Test                                      HEAD~1           HEAD
----------------------------------------------------------------------------------
2000.14: git commit -a -m A (full-v3)     0.35(0.26+0.06)  0.36(0.28+0.07) +2.9%
2000.15: git commit -a -m A (full-v4)     0.32(0.26+0.05)  0.34(0.28+0.06) +6.3%
2000.16: git commit -a -m A (sparse-v3)   0.63(0.59+0.06)  0.04(0.05+0.05) -93.7%
2000.17: git commit -a -m A (sparse-v4)   0.64(0.59+0.08)  0.04(0.04+0.04) -93.8%

It is important to compare the full-index case to the sparse-index case,
so the improvement for index version v4 is actually an 88% improvement in
this synthetic example.

In a real repository with over two million files at HEAD and 60,000
files in the sparse-checkout definition, the time for 'git commit -a'
went from 2.61 seconds to 134ms. I compared this to the result if the
index only contained the paths in the sparse-checkout definition and
found the theoretical optimum to be 120ms, so the out-of-cone paths only
add a 12% overhead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:05:53 -07:00
..
add.c Merge branch 'ow/no-dryrun-in-add-i' 2021-05-14 08:26:09 +09:00
am.c am: learn to process quoted lines that ends with CRLF 2021-05-10 15:06:22 +09:00
annotate.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
apply.c
archive.c
bisect--helper.c bisect--helper: use BISECT_TERMS in 'bisect skip' command 2021-04-30 09:56:42 +09:00
blame.c Merge branch 'rs/blame-optim' 2021-02-25 16:43:29 -08:00
branch.c ref-filter: reuse output buffer 2021-04-20 11:09:50 -07:00
bugreport.c builtin/bugreport: don't leak prefixed filename 2021-04-28 09:25:45 +09:00
bundle.c Merge branch 'bc/sha-256-part-3' 2020-08-11 18:04:11 -07:00
cat-file.c Merge branch 'cc/cat-file-usage-update' into master 2020-07-09 14:00:41 -07:00
check-attr.c
check-ignore.c Merge branch 'ah/plugleaks' 2021-05-07 12:47:41 +09:00
check-mailmap.c shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
check-ref-format.c
checkout--worker.c make_transient_cache_entry(): optionally alloc from mem_pool 2021-05-05 12:25:25 +09:00
checkout-index.c Merge branch 'mt/parallel-checkout-part-3' 2021-05-16 21:05:23 +09:00
checkout.c Merge branch 'mt/parallel-checkout-part-3' 2021-05-16 21:05:23 +09:00
clean.c Merge branch 'en/dir-traversal' 2021-05-20 08:54:59 +09:00
clone.c Merge branch 'jk/clone-clean-upon-transport-error' 2021-06-14 13:33:26 +09:00
column.c column, range-diff: downcase option description 2021-03-29 14:06:08 -07:00
commit-graph.c builtin/*: update usage format 2021-01-06 15:10:49 -08:00
commit-tree.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
commit.c commit: integrate with sparse-index 2021-07-14 15:05:53 -07:00
config.c config: unify code paths to get global config paths 2021-04-19 14:16:59 -07:00
count-objects.c
credential-cache--daemon.c unix-socket: add backlog size option to unix_stream_listen() 2021-03-15 14:32:51 -07:00
credential-cache.c unix-socket: disallow chdir() when creating unix domain sockets 2021-03-15 14:32:51 -07:00
credential-store.c crendential-store: use timeout when locking file 2020-11-25 12:30:18 -08:00
credential.c credential: load default config 2020-10-16 12:30:45 -07:00
describe.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
diff-files.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff-index.c diff-merges: move specific diff-index "-m" handling to diff-index 2021-05-21 09:24:14 +09:00
diff-tree.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
difftool.c Merge branch 'mt/parallel-checkout-part-3' 2021-05-16 21:05:23 +09:00
env--helper.c assert PARSE_OPT_NONEG in parse-options callbacks 2020-09-30 12:53:47 -07:00
fast-export.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
fast-import.c Use the final_oid_fn to finalize hashing of object IDs 2021-04-27 16:31:38 +09:00
fetch-pack.c connect, transport: encapsulate arg in struct 2021-02-05 13:49:54 -08:00
fetch.c fetch: improve grammar of "shallow roots" message 2021-05-20 13:36:33 +09:00
fmt-merge-msg.c
for-each-ref.c Merge branch 'ah/plugleaks' 2021-05-07 12:47:41 +09:00
for-each-repo.c for-each-repo: do nothing on empty config 2021-01-07 19:12:02 -08:00
fsck.c Merge branch 'ab/fsck-api-cleanup' 2021-06-02 07:34:27 +09:00
gc.c maintenance: fix two memory leaks 2021-05-12 07:00:45 +09:00
get-tar-commit-id.c
grep.c Merge branch 'bc/hash-transition-interop-part-1' 2021-05-10 16:59:46 +09:00
hash-object.c
help.c help: drop usage of 'common' and 'useful' for guides 2020-08-04 18:34:01 -07:00
index-pack.c Use the final_oid_fn to finalize hashing of object IDs 2021-04-27 16:31:38 +09:00
init-db.c Merge branch 'mt/init-template-userpath-fix' 2021-05-25 16:21:20 +09:00
interpret-trailers.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
log.c Merge branch 'so/log-diff-merge' 2021-04-30 13:50:26 +09:00
ls-files.c Merge branch 'en/dir-traversal' 2021-05-20 08:54:59 +09:00
ls-remote.c Merge branch 'ah/plugleaks' 2021-04-07 16:54:08 -07:00
ls-tree.c tree.h API: simplify read_tree_recursive() signature 2021-03-20 16:09:26 -07:00
mailinfo.c mailinfo: allow squelching quoted CRLF warning 2021-05-10 15:06:22 +09:00
mailsplit.c
merge-base.c
merge-file.c
merge-index.c merge-index: ensure full index 2021-04-14 13:47:21 -07:00
merge-ours.c
merge-recursive.c
merge-tree.c merge-base, xdiff: zero out xpparam_t structures 2020-10-20 12:53:26 -07:00
merge.c Merge branch 'ah/merge-usage-i18n-fix' 2021-06-10 12:04:23 +09:00
mktag.c fsck.c: add an fsck_set_msg_type() API that takes enums 2021-03-28 19:03:10 -07:00
mktree.c
multi-pack-index.c midx: allow marking a pack as preferred 2021-04-01 13:07:37 -07:00
mv.c git mv foo FOO ; git mv foo bar gave an assert 2021-03-03 17:07:12 -08:00
name-rev.c oid_pos(): access table through const pointers 2021-01-28 12:03:26 -08:00
notes.c use CALLOC_ARRAY 2021-03-13 16:00:09 -08:00
pack-objects.c Merge branch 'ab/pack-linkage-fix' 2021-05-27 12:36:58 +09:00
pack-redundant.c builtin/pack-redundant: avoid casting buffers to struct object_id 2021-04-27 16:31:38 +09:00
pack-refs.c
patch-id.c
prune-packed.c
prune.c Merge branch 'tb/shallow-cleanup' 2020-05-13 12:19:18 -07:00
pull.c pull: display default warning only when non-ff 2020-12-15 17:39:42 -08:00
push.c Merge branch 'jc/push-delete-nothing' 2021-02-25 16:43:33 -08:00
range-diff.c column, range-diff: downcase option description 2021-03-29 14:06:08 -07:00
read-tree.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
rebase.c Merge branch 'bc/hash-transition-interop-part-1' 2021-05-10 16:59:46 +09:00
receive-pack.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
reflog.c reflog expire --stale-fix: be generous about missing objects 2021-02-11 09:21:52 -08:00
remote-ext.c strvec: convert builtin/ callers away from argv_array name 2020-07-28 15:02:18 -07:00
remote-fd.c
remote.c Merge branch 'ah/plugleaks' 2021-04-07 16:54:08 -07:00
repack.c repack: avoid loosening promisor objects in partial clones 2021-04-28 13:36:13 +09:00
replace.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
rerere.c
reset.c reset: free instead of leaking unneeded ref 2021-03-14 15:57:59 -07:00
rev-list.c rev-list: allow filtering of provided items 2021-04-19 14:09:11 -07:00
rev-parse.c rev-parse: mark die() messages for translation 2021-05-17 18:39:53 +09:00
revert.c sequencer: fix edit handling for cherry-pick and revert messages 2021-03-31 14:10:50 -07:00
rm.c Merge branch 'ah/plugleaks' 2021-05-07 12:47:41 +09:00
send-pack.c push: parse and set flag for "--force-if-includes" 2020-10-03 09:59:19 -07:00
shortlog.c Merge branch 'ab/mailmap' 2021-01-25 14:19:19 -08:00
show-branch.c Merge branch 'jt/interpret-branch-name-fallback' 2020-09-09 13:53:09 -07:00
show-index.c builtin/show-index: set the algorithm for object IDs 2021-04-27 16:31:39 +09:00
show-ref.c refs: switch peel_ref() to peel_iterated_oid() 2021-01-21 15:51:31 -08:00
sparse-checkout.c Merge branch 'ds/sparse-index-protections' 2021-04-30 13:50:26 +09:00
stash.c Merge branch 'so/log-m-implies-p' 2021-06-14 13:33:27 +09:00
stripspace.c
submodule--helper.c Merge branch 'ah/submodule-helper-module-summary-parseopt' 2021-06-10 12:04:24 +09:00
symbolic-ref.c symbolic-ref: don't leak shortened refname in check_symref() 2021-03-14 15:57:59 -07:00
tag.c ref-filter: reuse output buffer 2021-04-20 11:09:50 -07:00
unpack-file.c
unpack-objects.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
update-index.c update-index: ensure full index 2021-04-14 13:47:29 -07:00
update-ref.c update-ref: disallow "start" for ongoing transactions 2020-11-16 13:44:01 -08:00
update-server-info.c
upload-archive.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
upload-pack.c
var.c
verify-commit.c
verify-pack.c Merge branch 'bc/sha-256-part-3' 2020-08-11 18:04:11 -07:00
verify-tag.c
worktree.c Merge branch 'en/dir-traversal' 2021-05-20 08:54:59 +09:00
write-tree.c