git-commit-vandalism/Documentation/config
Derrick Stolee ee1f0c242e read-cache: add index.skipHash config option
The previous change allowed skipping the hashing portion of the
hashwrite API, using it instead as a buffered write API. Disabling the
hashwrite can be particularly helpful when the write operation is in a
critical path.

One such critical path is the writing of the index. This operation is so
critical that the sparse index was created specifically to reduce the
size of the index to make these writes (and reads) faster.

This trade-off between file stability at rest and write-time performance
is not easy to balance. The index is an interesting case for a couple
of reasons:

1. Writes block users. Writing the index takes place in many user-
   blocking foreground operations. The speed improvement directly
   impacts their use. Other file formats are typically written in the
   background (commit-graph, multi-pack-index) or are super-critical to
   correctness (pack-files).

2. Index files are short lived. It is rare that a user leaves an index
   for a long time with many staged changes. Outside of staged changes,
   the index can be completely destroyed and rewritten with minimal
   impact to the user.

Following a similar approach to one used in the microsoft/git fork [1],
add a new config option (index.skipHash) that allows disabling this
hashing during the index write. The cost is that we can no longer
validate the contents for corruption-at-rest using the trailing hash.

[1] 21fed2d914

We load this config from the repository config given by istate->repo,
with a fallback to the_repository if it is not set.

While older Git versions will not recognize the null hash as a special
case, the file still conforms to the index format structurally. An
index written with this null hash can therefore still be read by older
Git versions.
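
As a rough illustration of why the structure is unchanged, the sketch
below appends the trailer to an already-serialized index body. It is
not Git's implementation: finalize_index and its arguments are
hypothetical names, the body is treated as opaque bytes, and a SHA-1
repository (20-byte trailer) is assumed.

```python
import hashlib

# Assumption: SHA-1 repository, so the trailer is 20 bytes long.
NULL_HASH = bytes(20)


def finalize_index(body: bytes, skip_hash: bool) -> bytes:
    """Append the trailing hash to a serialized index body (hypothetical helper)."""
    if skip_hash:
        # Hashing skipped: the trailer is all zeroes.
        trailer = NULL_HASH
    else:
        # Default behavior: the trailer is the hash of everything before it.
        trailer = hashlib.sha1(body).digest()
    return body + trailer
```

Either way the file ends with a trailer of the same length, so a reader
that does not verify the hash sees the same layout.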

The one exception is 'git fsck' which checks the hash of the index file.
This used to be a check on every index read, but was restricted to
'git fsck' in a33fc72fe9 (read-cache: force_verify_index_checksum,
2017-04-14) and released first in Git 2.13.0. Document the versions that
relaxed these restrictions, with the optimistic expectation that this
change will be included in Git 2.40.0.

Here, we disable this check if the trailing hash is all zeroes. We add a
warning to the config option that this may cause undesirable behavior
with older Git versions.
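
The relaxed check can be sketched as follows. This is a simplified
model rather than the code in read-cache.c: index_trailer_ok is a
hypothetical name, and a 20-byte SHA-1 trailer is assumed.

```python
import hashlib

# Assumption: SHA-1 repository, so the trailer is 20 bytes long.
HASH_LEN = 20


def index_trailer_ok(data: bytes) -> bool:
    """Return True if the trailing hash is acceptable (hypothetical helper)."""
    if len(data) < HASH_LEN:
        return False
    body, trailer = data[:-HASH_LEN], data[-HASH_LEN:]
    if trailer == bytes(HASH_LEN):
        # All-zero trailer: the writer skipped hashing, nothing to verify.
        return True
    # Otherwise the trailer must match the hash of the preceding bytes.
    return hashlib.sha1(body).digest() == trailer
```

A version without the all-zero special case would report a mismatch
for an index written with index.skipHash=true, which is the
undesirable behavior the warning refers to.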

As a quick comparison, I tested 'git update-index --force-write' with
and without index.skipHash=true on a copy of the Linux kernel
repository.

Benchmark 1: with hash
  Time (mean ± σ):      46.3 ms ±  13.8 ms    [User: 34.3 ms, System: 11.9 ms]
  Range (min … max):    34.3 ms …  79.1 ms    82 runs

Benchmark 2: without hash
  Time (mean ± σ):      26.0 ms ±   7.9 ms    [User: 11.8 ms, System: 14.2 ms]
  Range (min … max):    16.3 ms …  42.0 ms    69 runs

Summary
  'without hash' ran
    1.78 ± 0.76 times faster than 'with hash'
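
The summary figure can be reproduced from the two means and standard
deviations above. The sketch assumes the benchmark tool combines the
relative errors with first-order error propagation; the tool is not
named here, and speedup is a hypothetical helper.

```python
import math


def speedup(mean_slow, sd_slow, mean_fast, sd_fast):
    """Ratio of two means with a first-order propagated uncertainty."""
    ratio = mean_slow / mean_fast
    # Assumption: relative errors of the two runs add in quadrature.
    rel_err = math.sqrt((sd_slow / mean_slow) ** 2 + (sd_fast / mean_fast) ** 2)
    return ratio, ratio * rel_err


ratio, err = speedup(46.3, 13.8, 26.0, 7.9)
print(f"{ratio:.2f} ± {err:.2f}")  # prints "1.78 ± 0.76"
```

The wide uncertainty comes from the large run-to-run variation in both
benchmarks, not from the difference between the means.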

These performance benefits are substantial enough to let users opt in
to this feature, even with the potential confusion with older 'git
fsck' versions.

Test this new config option, both at a command-line level and within a
submodule. The verification is currently limited to confirming that
'git fsck' does not complain about the index. Future updates will make
this test more robust.

It is critical that this test is placed before the test_index_version
tests, since those tests obliterate the .git/config file and hence lose
the setting from GIT_TEST_DEFAULT_HASH, if set.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-07 07:46:14 +09:00
add.txt add -i: default to the built-in implementation 2021-12-01 14:34:43 -08:00
advice.txt Merge branch 'tk/ambiguous-fetch-refspec' 2022-04-04 10:56:24 -07:00
alias.txt config/alias.txt: document alias accepting non-command first word 2019-06-06 09:33:42 -07:00
am.txt
apply.txt
blame.txt blame: correct name of config option in docs 2021-06-28 10:05:13 -07:00
branch.txt push: default to single remote even when not named origin 2022-04-29 11:20:55 -07:00
browser.txt
bundle.txt bundle-uri: create base key-value pair parsing 2022-10-12 09:13:24 -07:00
checkout.txt parallel-checkout: add configuration options 2021-04-19 11:57:05 -07:00
clean.txt
clone.txt clone, submodule: pass partial clone filters to submodules 2022-02-09 15:38:36 -08:00
color.txt Merge branch 'hm/paint-hits-in-log-grep' 2021-11-01 13:48:08 -07:00
column.txt
commit.txt
commitgraph.txt commit-graph: use config to specify generation type 2021-02-25 15:10:41 -08:00
completion.txt
core.txt doc: use "commit-graph" hyphenation consistently 2022-10-30 19:58:40 -04:00
credential.txt crendential-store: use timeout when locking file 2020-11-25 12:30:18 -08:00
diff.txt difftool docs: de-duplicate configuration sections 2022-09-07 09:46:06 -07:00
difftool.txt difftool docs: de-duplicate configuration sections 2022-09-07 09:46:06 -07:00
extensions.txt Documentation: add extensions.worktreeConfig details 2022-02-08 09:49:20 -08:00
fastimport.txt
feature.txt config: let feature.experimental imply gc.cruftPacks=true 2022-10-26 14:39:31 -07:00
fetch.txt transfer doc: move fetch.credentialsInUrl to "transfer" config namespace 2022-06-15 11:40:11 -07:00
filter.txt
fmt-merge-msg.txt config/fmt-merge-msg.txt: drop space in quote 2020-09-27 14:22:41 -07:00
format.txt format-patch: learn format.forceInBodyFrom configuration variable 2022-08-29 14:39:13 -07:00
fsck.txt fsck: document msg-id 2022-10-25 15:44:18 -07:00
fsmonitor--daemon.txt fsmonitor: add documentation for allowRemote and socketDir options 2022-10-05 11:05:23 -07:00
gc.txt builtin/gc.c: conditionally avoid pruning objects via loose 2022-05-26 15:48:26 -07:00
gitcvs.txt
gitweb.txt
gpg.txt gpg docs: explain better use of ssh.defaultKeyCommand 2022-06-08 16:33:40 -07:00
grep.txt grep docs: de-duplicate configuration sections 2022-09-07 09:46:05 -07:00
gui.txt docs: use "character encoding" to refer to commit-object encoding 2021-08-27 12:45:45 -07:00
guitool.txt
help.txt help.c: help.autocorrect=prompt waits for user action 2021-08-14 11:20:49 -07:00
http.txt i18n: fix mismatched camelCase config variables 2022-06-17 10:38:26 -07:00
i18n.txt
imap.txt
includeif.txt config.txt: document include, includeIf 2022-07-17 14:23:42 -07:00
index.txt read-cache: add index.skipHash config option 2023-01-07 07:46:14 +09:00
init.txt clone: respect remote unborn HEAD 2021-02-05 13:49:55 -08:00
instaweb.txt
interactive.txt checkout: split part of it to new command 'restore' 2019-05-07 13:04:47 +09:00
log.txt diff-merges: clarify log.diffMerges documentation 2022-09-16 09:21:44 -07:00
lsrefs.txt docs: move protocol-related docs to man section 5 2022-08-04 14:12:23 -07:00
mailinfo.txt
mailmap.txt
maintenance.txt maintenance: incremental strategy runs pack-refs weekly 2021-02-09 23:09:29 -08:00
man.txt
merge.txt update documentation for new zdiff3 conflictStyle 2021-12-01 14:45:59 -08:00
mergetool.txt Merge branch 'nb/doc-mergetool-typofix' 2022-10-11 10:36:12 -07:00
notes.txt notes docs: de-duplicate and combine configuration sections 2022-09-07 09:46:06 -07:00
pack.txt Merge branch 'ac/bitmap-lookup-table' 2022-09-05 18:33:39 -07:00
pager.txt
pretty.txt
protocol.txt Sync with 2.37.4 2022-10-06 20:00:04 -04:00
pull.txt pull: remove support for --rebase=preserve 2021-09-07 21:45:32 -07:00
push.txt Doc: document push.recurseSubmodules=only 2022-11-14 16:55:50 -05:00
rebase.txt rebase: add rebase.updateRefs config option 2022-07-19 12:49:04 -07:00
receive.txt receive-pack: new config receive.procReceiveRefs 2020-08-27 12:47:47 -07:00
remote.txt docs: mention --refetch fetch option 2022-03-28 10:25:53 -07:00
remotes.txt
repack.txt builtin/repack.c: allow configuring cruft pack generation 2022-05-26 15:48:26 -07:00
rerere.txt
revert.txt revert: config documentation fixes 2022-06-27 08:37:36 -07:00
safe.txt setup.c: create safe.bareRepository 2022-07-14 15:08:29 -07:00
sendemail.txt send-email docs: de-duplicate configuration sections 2022-09-07 09:46:05 -07:00
sequencer.txt
showbranch.txt
sparse.txt repo_read_index: add config to expect files outside sparse patterns 2022-03-01 23:37:48 -08:00
splitindex.txt
ssh.txt
stash.txt stash: remove documentation for stash.useBuiltin 2022-01-27 18:00:37 -08:00
status.txt status: add status.aheadbehind setting 2019-06-21 09:35:00 -07:00
submodule.txt branch: add --recurse-submodules option for branch creation 2022-02-04 08:16:39 -08:00
tag.txt separate tar.* config to its own source file 2020-03-18 12:42:09 -07:00
tar.txt separate tar.* config to its own source file 2020-03-18 12:42:09 -07:00
trace2.txt doc: fix some typos 2021-01-04 11:27:48 -08:00
transfer.txt Documentation: fix various repeat word typos 2022-09-12 11:04:55 -07:00
uploadarchive.txt
uploadpack.txt Documentation: define protected configuration 2022-07-14 15:08:29 -07:00
url.txt
user.txt ssh signing: support non ssh-* keytypes 2021-11-19 09:05:25 -08:00
versionsort.txt
web.txt
worktree.txt