Introduce an optional configuration to allow the trailing hash that
protects the index file from bit flipping.
* ds/omit-trailing-hash-in-index:
features: feature.manyFiles implies fast index writes
test-lib-functions: add helper for trailing hash
read-cache: add index.skipHash config option
hashfile: allow skipping the hash function
Conditionally skip the pre-applypatch and applypatch-msg hooks when
applying patches with 'git am'.
* tr/am--no-verify:
am: allow passing --no-verify flag
Fix typos in code comments which repeat various words. Most of the
cases are simple in that they repeat a word that usually cannot be
repeated in a grammatically correct sentence. Just remove the
incorrectly duplicated word in these cases and rewrap text, if needed.
A tricky case is usage of "that that", which is sometimes grammatically
correct. However, an instance of this in "t7527-builtin-fsmonitor.sh"
doesn't need two words "that", because there is only one daemon being
discussed, so replace the second "that" with "the".
Reword code comment "entries exist on on-disk index" in function
update_one in file cache-tree.c, by replacing incorrect preposition "on"
with "in".
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The recent addition of the index.skipHash config option allows index
writes to speed up by skipping the hash computation for the trailing
checksum. This is particularly critical for repositories with many files
at HEAD, so add this config option to two cases where users in that
scenario may opt-in to such behavior:
1. The feature.manyFiles config option enables some options that are
helpful for repositories with many files at HEAD.
2. 'scalar register' and 'scalar reconfigure' set config options that
optimize for large repositories.
In both of these cases, set index.skipHash=true to gain this
speedup. Add tests that demonstrate the proper way that
index.skipHash=true can override feature.manyFiles=true.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change allowed skipping the hashing portion of the
hashwrite API, using it instead as a buffered write API. Disabling the
hashwrite can be particularly helpful when the write operation is in a
critical path.
One such critical path is the writing of the index. This operation is so
critical that the sparse index was created specifically to reduce the
size of the index to make these writes (and reads) faster.
This trade-off between file stability at rest and write-time performance
is not easy to balance. The index is an interesting case for a couple
reasons:
1. Writes block users. Writing the index takes place in many user-
blocking foreground operations. The speed improvement directly
impacts their use. Other file formats are typically written in the
background (commit-graph, multi-pack-index) or are super-critical to
correctness (pack-files).
2. Index files are short lived. It is rare that a user leaves an index
for a long time with many staged changes. Outside of staged changes,
the index can be completely destroyed and rewritten with minimal
impact to the user.
Following a similar approach to one used in the microsoft/git fork [1],
add a new config option (index.skipHash) that allows disabling this
hashing during the index write. The cost is that we can no longer
validate the contents for corruption-at-rest using the trailing hash.
[1] 21fed2d914
We load this config from the repository config given by istate->repo,
with a fallback to the_repository if it is not set.
While older Git versions will not recognize the null hash as a special
case, the file format itself is still being met in terms of its
structure. Using this null hash will still allow Git operations to
function across older versions.
The one exception is 'git fsck' which checks the hash of the index file.
This used to be a check on every index read, but was split out to just
the index in a33fc72fe9 (read-cache: force_verify_index_checksum,
2017-04-14) and released first in Git 2.13.0. Document the versions that
relaxed these restrictions, with the optimistic expectation that this
change will be included in Git 2.40.0.
Here, we disable this check if the trailing hash is all zeroes. We add a
warning to the config option that this may cause undesirable behavior
with older Git versions.
As a quick comparison, I tested 'git update-index --force-write' with
and without index.skipHash=true on a copy of the Linux kernel
repository.
Benchmark 1: with hash
Time (mean ± σ): 46.3 ms ± 13.8 ms [User: 34.3 ms, System: 11.9 ms]
Range (min … max): 34.3 ms … 79.1 ms 82 runs
Benchmark 2: without hash
Time (mean ± σ): 26.0 ms ± 7.9 ms [User: 11.8 ms, System: 14.2 ms]
Range (min … max): 16.3 ms … 42.0 ms 69 runs
Summary
'without hash' ran
1.78 ± 0.76 times faster than 'with hash'
These performance benefits are substantial enough to allow users the
ability to opt-in to this feature, even with the potential confusion
with older 'git fsck' versions.
Test this new config option, both at a command-line level and within a
submodule. The confirmation is currently limited to confirm that 'git
fsck' does not complain about the index. Future updates will make this
test more robust.
It is critical that this test is placed before the test_index_version
tests, since those tests obliterate the .git/config file and hence lose
the setting from GIT_TEST_DEFAULT_HASH, if set.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop using "git --super-prefix" and narrow the scope of its use to
the submodule--helper.
* ab/no-more-git-global-super-prefix:
read-tree: add "--super-prefix" option, eliminate global
submodule--helper: convert "{update,clone}" to their own "--super-prefix"
submodule--helper: convert "status" to its own "--super-prefix"
submodule--helper: convert "sync" to its own "--super-prefix"
submodule--helper: convert "foreach" to its own "--super-prefix"
submodule--helper: don't use global --super-prefix in "absorbgitdirs"
submodule.c & submodule--helper: pass along "super_prefix" param
read-tree + fetch tests: test failing "--super-prefix" interaction
submodule absorbgitdirs tests: add missing "Migrating git..." tests
'cat-file' gains mailmap support for its '--batch-check' and '-s'
options.
* sa/cat-file-mailmap--batch-check:
cat-file: add mailmap support to --batch-check option
cat-file: add mailmap support to -s option
The git-am --no-verify flag is analogous to the same flag passed to
git-commit. It bypasses the pre-applypatch and applypatch-msg hooks
if they are enabled.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch" learned to honor format.mboxrd even when sending
patches to the standard output stream,
* ew/format-patch-mboxrd:
format-patch: support format.mboxrd with --stdout
Bundle URIs part 4.
* ds/bundle-uri-4:
clone: unbundle the advertised bundles
bundle-uri: download bundles from an advertised list
bundle-uri: allow relative URLs in bundle lists
strbuf: introduce strbuf_strip_file_from_path()
bundle-uri: serve bundle.* keys from config
bundle-uri client: add helper for testing server
transport: rename got_remote_heads
bundle-uri client: add boolean transfer.bundleURI setting
clone: request the 'bundle-uri' command when available
t: create test harness for 'bundle-uri' command
protocol v2: add server-side "bundle-uri" skeleton
Just like "git var GIT_EDITOR" abstracts the complex logic to
choose which editor gets used behind it, "git var" now give support
to GIT_SEQUENCE_EDITOR.
* sa/git-var-sequence-editor:
var: add GIT_SEQUENCE_EDITOR variable
Update the description of --force to use '<start-point>' rather than
'<startpoint>' to match the spelling used everywhere else in the
git-branch documentation.
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--super-prefix" option to "git" was initially added in [1] for
use with "ls-files"[2], and shortly thereafter "submodule--helper"[3]
and "grep"[4]. It wasn't until [5] that "read-tree" made use of it.
At the time [5] made sense, but since then we've made "ls-files"
recurse in-process in [6], "grep" in [7], and finally
"submodule--helper" in the preceding commits.
Let's also remove it from "read-tree", which allows us to remove the
option to "git" itself.
We can do this because the only remaining user of it is the submodule
API, which will now invoke "read-tree" with its new "--super-prefix"
option. It will only do so when the "submodule_move_head()" function
is called.
That "submodule_move_head()" function was then only invoked by
"read-tree" itself, but now rather than setting an environment
variable to pass "--super-prefix" between cmd_read_tree() we:
- Set a new "super_prefix" in "struct unpack_trees_options". The
"super_prefixed()" function in "unpack-trees.c" added in [5] will now
use this, rather than get_super_prefix() looking up the environment
variable we set earlier in the same process.
- Add the same field to the "struct checkout", which is only needed to
ferry the "super_prefix" in the "struct unpack_trees_options" all the
way down to the "entry.c" callers of "submodule_move_head()".
Those calls which used the super prefix all originated in
"cmd_read_tree()". The only other caller is the "unlink_entry()"
caller in "builtin/checkout.c", which now passes a "NULL".
1. 74866d7579 (git: make super-prefix option, 2016-10-07)
2. e77aa336f1 (ls-files: optionally recurse into submodules, 2016-10-07)
3. 89c8626557 (submodule helper: support super prefix, 2016-12-08)
4. 0281e487fd (grep: optionally recurse into submodules, 2016-12-16)
5. 3d415425c7 (unpack-trees: support super-prefix option, 2017-01-17)
6. 188dce131f (ls-files: use repository object, 2017-06-22)
7. f9ee2fcdfa (grep: recurse in-process using 'struct repository', 2017-08-02)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mboxrd is a more robust output format when used with --stdout
and needs more exposure. Introducing this config knob lets
users choose the more robust format for all their --stdout
uses.
Relying on --pretty=mboxrd and including all of pretty-formats.txt
in the `git format-patch' documentation would likely be
confusing to users. Furthermore, this setting is useful across
multiple invocations. So introduce `format.mboxrd' as a boolean
configuration knob that changes the default --pretty=email format
to --pretty=mboxrd when (and only when) --stdout is in use.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The yet-to-be introduced client support for bundle-uri will always
fall back on a full clone, but we'd still like to be able to ignore a
server's bundle-uri advertisement entirely.
The new transfer.bundleURI config option defaults to 'false', but a user
can set it to 'true' to enable checking for bundle URIs from the origin
Git server using protocol v2.
Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a skeleton server-side implementation of a new "bundle-uri" command
to protocol v2. This will allow conforming clients to optionally seed
their initial clones or incremental fetches from URLs containing
"*.bundle" files created with "git bundle create".
This change only performs the basic boilerplate of advertising a new
protocol v2 capability. The new 'bundle-uri' capability allows a client
to request a list of bundles. Right now, the server only returns a flush
packet, which corresponds to an empty advertisement. The bundle.* config
namespace describes which key-value pairs will be communicated across
this interface in future updates.
The critical bit right now is that the new boolean
uploadPack.adverstiseBundleURIs config value signals whether or not this
capability should be advertised at all.
An earlier version of this patch [1] used a different transfer format
than the "key=value" pairs in the current implementation. The change was
made to unify the protocol v2 command with the bundle lists provided by
independent bundle servers. Further, the standard allows for the server
to advertise a URI that contains a bundle list. This allows users
automatically discovering bundle providers that are loosely associated
with the origin server, but without the origin server knowing exactly
which bundles are currently available.
[1] https://lore.kernel.org/git/RFC-patch-v2-01.13-2fc87ce092b-20220311T155841Z-avarab@gmail.com/
The very-deep headings needed to be modified to stop at level 4 due to
documentation build issues. These were not recognized in earlier builds
since the file was previously in the Documentation/technical/ directory
and was built in a different way. With its current location, the
heavily-nested details were causing build issues and they are now
replaced with a bulletted list of details.
Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though the cat-file command with `--batch-check` option does not
complain when `--use-mailmap` option is given, the latter option is
ignored. Compute the size of the object after replacing the idents and
report it instead.
In order to make `--batch-check` option honour the mailmap mechanism we
have to read the contents of the commit/tag object.
There were two ways to do it:
1. Make two calls to `oid_object_info_extended()`. If `--use-mailmap`
option is given, the first call will get us the type of the object
and second call will only be made if the object type is either a
commit or tag to get the contents of the object.
2. Make one call to `oid_object_info_extended()` to get the type of the
object. Then, if the object type is either of commit or tag, make a
call to `repo_read_object_file()` to read the contents of the object.
I benchmarked the following command with both the above approaches and
compared against the current implementation where `--use-mailmap`
option is ignored:
`git cat-file --use-mailmap --batch-all-objects --batch-check --buffer
--unordered`
The results can be summarized as follows:
Time (mean ± σ)
default 827.7 ms ± 104.8 ms
first approach 6.197 s ± 0.093 s
second approach 1.975 s ± 0.217 s
Since, the second approach is faster than the first one, I implemented
it in this patch.
The command git cat-file can now use the mailmap mechanism to replace
idents with canonical versions for commit and tag objects. There are
several options like `--batch`, `--batch-check` and `--batch-command`
that can be combined with `--use-mailmap`. But the documentation for
`--batch`, `--batch-check` and `--batch-command` doesn't say so. This
patch fixes that documentation.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even though the cat-file command with `-s` option does not complain when
`--use-mailmap` option is given, the latter option is ignored. Compute
the size of the object after replacing the idents and report it instead.
In order to make `-s` option honour the mailmap mechanism we have to
read the contents of the commit/tag object. Make use of the call to
`oid_object_info_extended()` to get the contents of the object and store
in `buf`. `buf` is later freed in the function.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The advice message given by "git status" when it takes long time to
enumerate untracked paths has been updated.
* rr/status-untracked-advice:
status: modernize git-status "slow untracked files" advice
The editor program used by Git when editing the sequencer "todo" file
is determined by examining a few environment variables and also
affected by configuration variables. Introduce "git var
GIT_SEQUENCE_EDITOR" to give users access to the final result of the
logic without having to know the exact details.
This is very similar in spirit to 44fcb497 (Teach git var about
GIT_EDITOR, 2009-11-11) that introduced "git var GIT_EDITOR".
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git var UNKNOWN_VARIABLE" and "git var VARIABLE" with the variable
given an empty value used to behave identically. Now the latter
just gives an empty output, while the former still gives an error
message.
* sa/git-var-empty:
var: allow GIT_EDITOR to return null
var: do not print usage() with a correct invocation
"merge-tree" learns a new `--merge-base` option.
* kz/merge-tree-merge-base:
docs: fix description of the `--merge-base` option
merge-tree.c: allow specifying the merge-base when --stdin is passed
merge-tree.c: add --merge-base=<commit> option
`git status` can be slow when there are a large number of
untracked files and directories since Git must search the entire
worktree to enumerate them. When it is too slow, Git prints
advice with the elapsed search time and a suggestion to disable
the search using the `-uno` option. This suggestion also carries
a warning that might scare off some users.
However, these days, `-uno` isn't the only option. Git can reduce
the time taken to enumerate untracked files by caching results from
previous `git status` invocations, when the `core.untrackedCache`
and `core.fsmonitor` features are enabled.
Update the `git status` man page to explain these configuration
options, and update the advice to provide more detail about the
current configuration and to refer to the updated documentation.
Signed-off-by: Rudy Rigot <rudy.rigot@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A GNU make pattern rule with multiple targets has always meant that
a single invocation of the recipe will build all the targets.
However in older versions of GNU make a recipe that did not really
build all the targets would be tolerated.
Starting with GNU make 4.4 this behavior is deprecated and pattern
rules are expected to generate files to match all the patterns.
If not all targets are created then GNU make will not consider any
target up to date and will re-run the recipe when it is run again.
Modify Documentation/Makefile to split the man page-creating pattern
rule into a separate pattern rule for each pattern.
Reported-by: Alexander Kanavin <alex.kanavin@gmail.com>
Signed-off-by: Paul Smith <psmith@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before, git-var could print usage() even if the command was invoked
correctly with a variable defined in git_vars -- provided that its
read() function returned NULL.
Now, we only print usage() only if it was called with a logical
variable that wasn't defined -- regardless of read().
Since we now know the variable is valid when we call read_var(), we
can avoid printing usage() here (and exiting with code 129) and
instead exit quietly with code 1. While exiting with a different code
can be a breaking change, it's far better than changing the exit
status more generally from 'failure' to 'success'.
Signed-off-by: Sean Allred <allred.sean@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git receive-pack" used to use all the local refs as the boundary for
checking connectivity of the data "git push" sent, but now it uses
only the refs that it advertised to the pusher. In a repository with
the .hideRefs configuration, this reduces the resources needed to
perform the check.
cf. <221028.86bkpw805n.gmgdl@evledraar.gmail.com>
cf. <xmqqr0yrizqm.fsf@gitster.g>
* ps/receive-use-only-advertised:
receive-pack: only use visible refs for connectivity check
rev-parse: add `--exclude-hidden=` option
revision: add new parameter to exclude hidden refs
revision: introduce struct to handle exclusions
revision: move together exclusion-related functions
refs: get rid of global list of hidden refs
refs: fix memory leak when parsing hideRefs config