Commit Graph

13929 Commits

Author SHA1 Message Date
Junio C Hamano
85c54ecc5f Merge branch 'sb/submodule-cleanup'
A few preliminary minor clean-ups in the area around submodules.

* sb/submodule-cleanup:
  builtin/submodule--helper: remove stray new line
  t7410: update to new style
2018-08-20 12:41:33 -07:00
Junio C Hamano
5a5c5e9565 Merge branch 'pw/rebase-i-merge-segv-fix'
"git rebase -i", when a 'merge <branch>' insn in its todo list
fails, segfaulted, which has been (minimally) corrected.

* pw/rebase-i-merge-segv-fix:
  rebase -i: fix SIGSEGV when 'merge <branch>' fails
  t3430: add conflicting commit
2018-08-20 12:41:33 -07:00
Junio C Hamano
36fd1e843b Merge branch 'pw/rebase-i-squash-number-fix'
When "git rebase -i" is told to squash two or more commits into
one, it labeled the log message for each commit with its number.
It correctly called the first one "1st commit", but the next one
was "commit #1", which was off-by-one.  This has been corrected.

* pw/rebase-i-squash-number-fix:
  rebase -i: fix numbering in squash message
2018-08-20 12:41:33 -07:00
Junio C Hamano
2a2c18f1c3 Merge branch 'sb/config-write-fix'
Recent update to "git config" broke updating variable in a
subsection, which has been corrected.

* sb/config-write-fix:
  git-config: document accidental multi-line setting in deprecated syntax
  config: fix case sensitive subsection names on writing
  t1300: document current behavior of setting options
2018-08-20 12:41:32 -07:00
Junio C Hamano
87aa1595e7 Merge branch 'ab/submodule-relative-url-tests'
Test updates.

* ab/submodule-relative-url-tests:
  submodule: add more exhaustive up-path testing
2018-08-20 12:41:32 -07:00
Junio C Hamano
36f0f344e7 Merge branch 'jt/repack-promisor-packs'
After a partial clone, repeated fetches from promisor remote would
have accumulated many packfiles marked with .promisor bit without
getting them coalesced into fewer packfiles, hurting performance.
"git repack" now learned to repack them.

* jt/repack-promisor-packs:
  repack: repack promisor objects if -a or -A is set
  repack: refactor setup of pack-objects cmd
2018-08-20 12:40:31 -07:00
Junio C Hamano
e72db08f15 Merge branch 'wc/make-funnynames-shared-lazy-prereq'
A test prerequisite defined by various test scripts with slightly
different semantics has been consolidated into a single copy and
made into a lazily defined one.

* wc/make-funnynames-shared-lazy-prereq:
  t: factor out FUNNYNAMES as shared lazy prereq
2018-08-20 11:33:55 -07:00
Junio C Hamano
4601516b41 Merge branch 'js/chain-lint-attrfix'
Test fix.

* js/chain-lint-attrfix:
  chainlint: fix for core.autocrlf=true
2018-08-20 11:33:54 -07:00
Junio C Hamano
81eab6871e Merge branch 'js/range-diff'
"git tbdiff" that lets us compare individual patches in two
iterations of a topic has been rewritten and made into a built-in
command.

* js/range-diff: (21 commits)
  range-diff: use dim/bold cues to improve dual color mode
  range-diff: make --dual-color the default mode
  range-diff: left-pad patch numbers
  completion: support `git range-diff`
  range-diff: populate the man page
  range-diff --dual-color: skip white-space warnings
  range-diff: offer to dual-color the diffs
  diff: add an internal option to dual-color diffs of diffs
  color: add the meta color GIT_COLOR_REVERSE
  range-diff: use color for the commit pairs
  range-diff: add tests
  range-diff: do not show "function names" in hunk headers
  range-diff: adjust the output of the commit pairs
  range-diff: suppress the diff headers
  range-diff: indent the diffs just like tbdiff
  range-diff: right-trim commit messages
  range-diff: also show the diff between patches
  range-diff: improve the order of the shown commits
  range-diff: first rudimentary implementation
  Introduce `range-diff` to compare iterations of a topic branch
  ...
2018-08-20 11:33:53 -07:00
Junio C Hamano
ace1f99cc8 Merge branch 'es/chain-lint-more'
Improve built-in facility to catch broken &&-chain in the tests.

* es/chain-lint-more:
  chainlint: add test of pathological case which triggered false positive
  chainlint: recognize multi-line quoted strings more robustly
  chainlint: let here-doc and multi-line string commence on same line
  chainlint: recognize multi-line $(...) when command cuddled with "$("
  chainlint: match 'quoted' here-doc tags
  chainlint: match arbitrary here-docs tags rather than hard-coded names
2018-08-20 11:33:53 -07:00
Junio C Hamano
a15bfa517d Merge branch 'sg/t5310-empty-input-fix'
Test fix.

* sg/t5310-empty-input-fix:
  t5310-pack-bitmaps: fix bogus 'pack-objects to file can use bitmap' test
2018-08-20 11:33:52 -07:00
Junio C Hamano
0c54cdaf65 Merge branch 'jk/for-each-object-iteration'
The API to iterate over all objects learned to optionally list
objects in the order they appear in packfiles, which helps locality
of access if the caller accesses these objects while as objects are
enumerated.

* jk/for-each-object-iteration:
  for_each_*_object: move declarations to object-store.h
  cat-file: use a single strbuf for all output
  cat-file: split batch "buf" into two variables
  cat-file: use oidset check-and-insert
  cat-file: support "unordered" output for --batch-all-objects
  cat-file: rename batch_{loose,packed}_object callbacks
  t1006: test cat-file --batch-all-objects with duplicates
  for_each_packed_object: support iterating in pack-order
  for_each_*_object: give more comprehensive docstrings
  for_each_*_object: take flag arguments as enum
  for_each_*_object: store flag definitions in a single location
2018-08-20 11:33:52 -07:00
Junio C Hamano
42a6274b62 Merge branch 'ab/fetch-tags-noclobber'
Test and doc clean-ups.

* ab/fetch-tags-noclobber:
  pull doc: fix a long-standing grammar error
  fetch tests: correct a comment "remove it" -> "remove them"
  push tests: assert re-pushing annotated tags
  push tests: add more testing for forced tag pushing
  push tests: fix logic error in "push" test assertion
  push tests: remove redundant 'git push' invocation
  fetch tests: change "Tag" test tag to "testTag"
2018-08-20 11:33:52 -07:00
Junio C Hamano
3bc484af74 Merge branch 'jt/commit-graph-per-object-store'
Test update.

* jt/commit-graph-per-object-store:
  t5318: avoid unnecessary command substitutions
2018-08-20 11:33:51 -07:00
Junio C Hamano
5dd54744b8 Merge branch 'ds/commit-graph-fsck'
Test fix.

* ds/commit-graph-fsck:
  t5318: use 'test_cmp_bin' to compare commit-graph files
2018-08-20 11:33:51 -07:00
Junio C Hamano
c5c2162a32 Merge branch 'jt/fetch-negotiator-skipping'
Test fix.

* jt/fetch-negotiator-skipping:
  t5552: suppress upload-pack trace output
2018-08-20 11:33:51 -07:00
Junio C Hamano
02c51a2fd8 Merge branch 'en/t7406-fixes'
Test fixes.

* en/t7406-fixes:
  t7406: avoid using test_must_fail for commands other than git
  t7406: prefer test_* helper functions to test -[feds]
  t7406: avoid having git commands upstream of a pipe
  t7406: simplify by using diff --name-only instead of diff --raw
  t7406: fix call that was failing for the wrong reason
2018-08-20 11:33:49 -07:00
Junio C Hamano
750eb11d8f Merge branch 'js/rebase-merges-exec-fix'
The "--exec" option to "git rebase --rebase-merges" placed the exec
commands at wrong places, which has been corrected.

* js/rebase-merges-exec-fix:
  rebase --exec: make it work with --rebase-merges
  t3430: demonstrate what -r, --autosquash & --exec should do
2018-08-20 11:33:48 -07:00
Junio C Hamano
14677d25ab Merge branch 'ab/test-must-be-empty-for-master'
Test updates.

* ab/test-must-be-empty-for-master:
  tests: make use of the test_must_be_empty function
2018-08-20 11:33:48 -07:00
Junio C Hamano
2c8c407d0a Merge branch 'ar/t4150-am-scissors-test-fix'
Test fix.

* ar/t4150-am-scissors-test-fix:
  t4150: fix broken test for am --scissors
2018-08-17 13:09:59 -07:00
Junio C Hamano
c757aa2d12 Merge branch 'js/pull-rebase-type-shorthand'
"git pull --rebase=interactive" learned "i" as a short-hand for
"interactive".

* js/pull-rebase-type-shorthand:
  pull --rebase=<type>: allow single-letter abbreviations for the type
2018-08-17 13:09:59 -07:00
Junio C Hamano
b576cf70b2 Merge branch 'en/t3031-title-fix'
Test fix.

* en/t3031-title-fix:
  t3031: update test description to mention desired behavior
2018-08-17 13:09:58 -07:00
Junio C Hamano
8ba8642bd5 Merge branch 'en/abort-df-conflict-fixes'
"git merge --abort" etc. did not clean things up properly when
there were conflicted entries in the index in certain order that
are involved in D/F conflicts.  This has been corrected.

* en/abort-df-conflict-fixes:
  read-cache: fix directory/file conflict handling in read_index_unmerged()
  t1015: demonstrate directory/file conflict recovery failures
2018-08-17 13:09:57 -07:00
Junio C Hamano
c5d276cb18 Merge branch 'mk/http-backend-content-length'
The http-backend (used for smart-http transport) used to slurp the
whole input until EOF, without paying attention to CONTENT_LENGTH
that is supplied in the environment and instead expecting the Web
server to close the input stream.  This has been fixed.

* mk/http-backend-content-length:
  t5562: avoid non-portable "export FOO=bar" construct
  http-backend: respect CONTENT_LENGTH for receive-pack
  http-backend: respect CONTENT_LENGTH as specified by rfc3875
  http-backend: cleanup writing to child process
2018-08-17 13:09:57 -07:00
Junio C Hamano
28dbabb5e0 Merge branch 'ab/fetch-nego'
Update to a few other topics around 'git fetch'.

* ab/fetch-nego:
  fetch doc: cross-link two new negotiation options
  negotiator: unknown fetch.negotiationAlgorithm should error out
2018-08-17 13:09:55 -07:00
Junio C Hamano
72c11b7e62 Merge branch 'jt/refspec-dwim-precedence-fix'
"git fetch $there refs/heads/s" ought to fetch the tip of the
branch 's', but when "refs/heads/refs/heads/s", i.e. a branch whose
name is "refs/heads/s" exists at the same time, fetched that one
instead by mistake.  This has been corrected to honor the usual
disambiguation rules for abbreviated refnames.

* jt/refspec-dwim-precedence-fix:
  remote: make refspec follow the same disambiguation rule as local refs
2018-08-17 13:09:55 -07:00
Junio C Hamano
60858f343a Merge branch 'jk/merge-subtree-heuristics'
The automatic tree-matching in "git merge -s subtree" was broken 5
years ago and nobody has noticed since then, which is now fixed.

* jk/merge-subtree-heuristics:
  score_trees(): fix iteration over trees with missing entries
2018-08-17 13:09:55 -07:00
Junio C Hamano
28bdd99065 Merge branch 'ab/test-must-be-empty'
Test updates.

* ab/test-must-be-empty:
  tests: make use of the test_must_be_empty function
2018-08-17 13:09:54 -07:00
Junio C Hamano
1bc505b476 Merge branch 'es/rebase-i-author-script-fix'
The "author-script" file "git rebase -i" creates got broken when
we started to move the command away from shell script, which is
getting fixed now.

* es/rebase-i-author-script-fix:
  sequencer: don't die() on bogus user-edited timestamp
  sequencer: fix "rebase -i --root" corrupting author header timestamp
  sequencer: fix "rebase -i --root" corrupting author header timezone
  sequencer: fix "rebase -i --root" corrupting author header
2018-08-17 13:09:54 -07:00
Junio C Hamano
f8ca71870a Merge branch 'ab/fsck-transfer-updates'
The test performed at the receiving end of "git push" to prevent
bad objects from entering repository can be customized via
receive.fsck.* configuration variables; we now have gained a
counterpart to do the same on the "git fetch" side, with
fetch.fsck.* configuration variables.

* ab/fsck-transfer-updates:
  fsck: test and document unknown fsck.<msg-id> values
  fsck: add stress tests for fsck.skipList
  fsck: test & document {fetch,receive}.fsck.* config fallback
  fetch: implement fetch.fsck.*
  transfer.fsckObjects tests: untangle confusing setup
  config doc: elaborate on fetch.fsckObjects security
  config doc: elaborate on what transfer.fsckObjects does
  config doc: unify the description of fsck.* and receive.fsck.*
  config doc: don't describe *.fetchObjects twice
  receive.fsck.<msg-id> tests: remove dead code
2018-08-17 13:09:54 -07:00
Stefan Beller
31158c7efc t7410: update to new style
While at it fix a typo (s/independed/independent) and
make sure git is not in a chain of pipes.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-16 10:42:29 -07:00
Phillip Wood
bc9238bb09 rebase -i: fix SIGSEGV when 'merge <branch>' fails
If a merge command in the todo list specifies just a branch to merge
with no -C/-c argument then item->commit is NULL. This means that if
there are merge conflicts error_with_patch() is passed a NULL commit
which causes a segmentation fault when make_patch() tries to look it up.

This commit implements a minimal fix which fixes the crash and allows
the user to successfully commit a conflict resolution with 'git rebase
--continue'. It does not write .git/rebase-merge/patch,
.git/rebase-merge/stopped-sha or update REBASE_HEAD. To sensibly get the
hashes of the merge parents would require refactoring do_merge() to
extract the code that parses the merge parents into a separate function
which error_with_patch() could then use to write the parents into the
stopped-sha file. To create meaningful output make_patch() and 'git
rebase --show-current-patch' would also need to be modified to diff the
merge parent and merge base in this case.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-16 08:54:50 -07:00
Phillip Wood
d54e189862 t3430: add conflicting commit
Move the creation of conflicting-G from a test to the setup so that it
can be used in subsequent tests without creating the kind of implicit
dependencies that plague t3404. While we're at it simplify the
arguments to the test_commit() call the creates the conflicting commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-16 08:52:58 -07:00
Junio C Hamano
b160b6e69d Merge branch 'jt/connectivity-check-after-unshallow'
"git fetch" sometimes failed to update the remote-tracking refs,
which has been corrected.

* jt/connectivity-check-after-unshallow:
  fetch-pack: unify ref in and out param
2018-08-15 15:08:28 -07:00
Junio C Hamano
dd4ab3eaaa Merge branch 'cb/p4-pre-submit-hook'
"git p4 submit" learns to ask its own pre-submit hook if it should
continue with submitting.

* cb/p4-pre-submit-hook:
  git-p4: add the `p4-pre-submit` hook
2018-08-15 15:08:27 -07:00
Junio C Hamano
e4095da40e Merge branch 'en/merge-recursive-skip-fix'
When the sparse checkout feature is in use, "git cherry-pick" and
other mergy operations lost the skip_worktree bit when a path that
is excluded from checkout requires content level merge, which is
resolved as the same as the HEAD version, without materializing the
merge result in the working tree, which made the path appear as
deleted.  This has been corrected by preserving the skip_worktree
bit (and not materializing the file in the working tree).

* en/merge-recursive-skip-fix:
  merge-recursive: preserve skip_worktree bit when necessary
  t3507: add a testcase showing failure with sparse checkout
2018-08-15 15:08:26 -07:00
Junio C Hamano
d6628c99fa Merge branch 'jt/tag-following-with-proto-v2-fix'
The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement.  "git
fetch $remote branch:branch" that asks tags that point into the
history leading to the "branch" automatically followed sent to
narrow prefix and broke the tag following, which has been fixed.

* jt/tag-following-with-proto-v2-fix:
  fetch: send "refs/tags/" prefix upon CLI refspecs
  t5702: test fetch with multiple refspecs at a time
2018-08-15 15:08:25 -07:00
Junio C Hamano
4bea8485e3 Merge branch 'nd/i18n'
Many more strings are prepared for l10n.

* nd/i18n: (23 commits)
  transport-helper.c: mark more strings for translation
  transport.c: mark more strings for translation
  sha1-file.c: mark more strings for translation
  sequencer.c: mark more strings for translation
  replace-object.c: mark more strings for translation
  refspec.c: mark more strings for translation
  refs.c: mark more strings for translation
  pkt-line.c: mark more strings for translation
  object.c: mark more strings for translation
  exec-cmd.c: mark more strings for translation
  environment.c: mark more strings for translation
  dir.c: mark more strings for translation
  convert.c: mark more strings for translation
  connect.c: mark more strings for translation
  config.c: mark more strings for translation
  commit-graph.c: mark more strings for translation
  builtin/replace.c: mark more strings for translation
  builtin/pack-objects.c: mark more strings for translation
  builtin/grep.c: mark strings for translation
  builtin/config.c: mark more strings for translation
  ...
2018-08-15 15:08:23 -07:00
Junio C Hamano
3ec5ebee15 Merge branch 'hs/gpgsm'
Teach "git tag -s" etc. a few configuration variables (gpg.format
that can be set to "openpgp" or "x509", and gpg.<format>.program
that is used to specify what program to use to deal with the format)
to allow x.509 certs with CMS via "gpgsm" to be used instead of
openpgp via "gnupg".

* hs/gpgsm:
  gpg-interface t: extend the existing GPG tests with GPGSM
  gpg-interface: introduce new signature format "x509" using gpgsm
  gpg-interface: introduce new config to select per gpg format program
  gpg-interface: do not hardcode the key string len anymore
  gpg-interface: introduce an abstraction for multiple gpg formats
  t/t7510: check the validation of the new config gpg.format
  gpg-interface: add new config to select how to sign a commit
2018-08-15 15:08:23 -07:00
Junio C Hamano
2d7a20258f Merge branch 'bw/clone-ref-prefixes'
The wire-protocol v2 relies on the client to send "ref prefixes" to
limit the bandwidth spent on the initial ref advertisement.  "git
clone" when learned to speak v2 forgot to do so, which has been
corrected.

* bw/clone-ref-prefixes:
  clone: send ref-prefixes when using protocol v2
2018-08-15 15:08:23 -07:00
Junio C Hamano
1689c22c1c Merge branch 'jk/core-use-replace-refs'
A new configuration variable core.usereplacerefs has been added,
primarily to help server installations that want to ignore the
replace mechanism altogether.

* jk/core-use-replace-refs:
  add core.usereplacerefs config option
  check_replace_refs: rename to read_replace_refs
  check_replace_refs: fix outdated comment
2018-08-15 15:08:23 -07:00
Junio C Hamano
a14a9bfc13 Merge branch 'jh/json-writer'
Preparatory code to later add json output for telemetry data.

* jh/json-writer:
  json_writer: new routines to create JSON data
2018-08-15 15:08:22 -07:00
Junio C Hamano
706b0b5e8d Merge branch 'es/diff-color-moved-fix'
One of the "diff --color-moved" mode "dimmed_zebra" that was named
in an unusual way has been deprecated and replaced by
"dimmed-zebra".

* es/diff-color-moved-fix:
  diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
2018-08-15 15:08:22 -07:00
Junio C Hamano
10639c395a Merge branch 'js/t7406-recursive-submodule-update-order-fix'
Test fix.

* js/t7406-recursive-submodule-update-order-fix:
  t7406: avoid failures solely due to timing issues
2018-08-15 15:08:21 -07:00
Junio C Hamano
1638a625ca Merge branch 'sg/fast-import-dump-refs-on-checkpoint-fix'
Test update.

* sg/fast-import-dump-refs-on-checkpoint-fix:
  t9300: wait for background fast-import process to die after killing it
2018-08-15 15:08:20 -07:00
Phillip Wood
dd2e36ebac rebase -i: fix numbering in squash message
Commit e12a7ef597 ("rebase -i: Handle "combination of <n> commits" with
GETTEXT_POISON", 2018-04-27) changed the way that individual commit
messages are labelled when squashing commits together. In doing so a
regression was introduced where the numbering of the messages is off by
one. This commit fixes that and adds a test for the numbering.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-15 10:50:24 -07:00
Johannes Schindelin
1ce2b452c6 chainlint: fix for core.autocrlf=true
The `chainlint` target compares actual output to expected output, where
the actual output is generated from files that are specifically checked
out with LF-only line endings. So the expected output needs to be
checked out with LF-only line endings, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-15 10:40:46 -07:00
Ævar Arnfjörð Bjarmason
2711b1ad5e submodule: add more exhaustive up-path testing
The tests added in 63e95beb08 ("submodule: port resolve_relative_url
from shell to C", 2016-04-15) didn't do a good job of testing various
up-path invocations where the up-path would bring us beyond even the
URL in question without emitting an error.

These results look nonsensical, but it's worth exhaustively testing
them before fixing any of this code, so we can see which of these
cases were changed.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14 12:55:17 -07:00
SZEDER Gábor
10c600172c t5310-pack-bitmaps: fix bogus 'pack-objects to file can use bitmap' test
The test 'pack-objects to file can use bitmap' added in 645c432d61
(pack-objects: use reachability bitmap index when generating
non-stdout pack, 2016-09-10) is silently buggy and doesn't check what
it's supposed to.

In 't5310-pack-bitmaps.sh', the 'list_packed_objects' helper function
does what its name implies by running:

  git show-index <"$1" | cut -d' ' -f2

The test in question invokes this function like this:

  list_packed_objects <packa-$packasha1.idx >packa.objects &&
  list_packed_objects <packb-$packbsha1.idx >packb.objects &&
  test_cmp packa.objects packb.objects

Note how these two callsites don't specify the name of the pack index
file as the function's parameter, but redirect the function's standard
input from it.  This triggers an error message from the shell, as it
has no filename to redirect from in the function, but this error is
ignored, because it happens upstream of a pipe.  Consequently, both
invocations produce empty 'pack{a,b}.objects' files, and the
subsequent 'test_cmp' happily finds those two empty files identical.

Fix these two 'list_packed_objects' invocations by specifying the pack
index files as parameters.  Furthermore, eliminate the pipe in that
function by replacing it with an &&-chained pair of commands using an
intermediate file, so a failure of 'git show-index' or the shell
redirection will fail the test.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14 08:55:30 -07:00
Jeff King
0750bb5b51 cat-file: support "unordered" output for --batch-all-objects
If you're going to access the contents of every object in a
packfile, it's generally much more efficient to do so in
pack order, rather than in hash order. That increases the
locality of access within the packfile, which in turn is
friendlier to the delta base cache, since the packfile puts
related deltas next to each other. By contrast, hash order
is effectively random, since the sha1 has no discernible
relationship to the content.

This patch introduces an "--unordered" option to cat-file
which iterates over packs in pack-order under the hood. You
can see the results when dumping all of the file content:

  $ time ./git cat-file --batch-all-objects --buffer --batch | wc -c
  6883195596

  real	0m44.491s
  user	0m42.902s
  sys	0m5.230s

  $ time ./git cat-file --unordered \
                        --batch-all-objects --buffer --batch | wc -c
  6883195596

  real	0m6.075s
  user	0m4.774s
  sys	0m3.548s

Same output, different order, way faster. The same speed-up
applies even if you end up accessing the object content in a
different process, like:

  git cat-file --batch-all-objects --buffer --batch-check |
  grep blob |
  git cat-file --batch='%(objectname) %(rest)' |
  wc -c

Adding "--unordered" to the first command drops the runtime
in git.git from 24s to 3.5s.

  Side note: there are actually further speedups available
  for doing it all in-process now. Since we are outputting
  the object content during the actual pack iteration, we
  know where to find the object and could skip the extra
  lookup done by oid_object_info(). This patch stops short
  of that optimization since the underlying API isn't ready
  for us to make those sorts of direct requests.

So if --unordered is so much better, why not make it the
default? Two reasons:

  1. We've promised in the documentation that --batch-all-objects
     outputs in hash order. Since cat-file is plumbing,
     people may be relying on that default, and we can't
     change it.

  2. It's actually _slower_ for some cases. We have to
     compute the pack revindex to walk in pack order. And
     our de-duplication step uses an oidset, rather than a
     sort-and-dedup, which can end up being more expensive.
     If we're just accessing the type and size of each
     object, for example, like:

       git cat-file --batch-all-objects --buffer --batch-check

     my best-of-five warm cache timings go from 900ms to
     1100ms using --unordered. Though it's possible in a
     cold-cache or under memory pressure that we could do
     better, since we'd have better locality within the
     packfile.

And one final question: why is it "--unordered" and not
"--pack-order"? The answer is again two-fold:

  1. "pack order" isn't a well-defined thing across the
     whole set of objects. We're hitting loose objects, as
     well as objects in multiple packs, and the only
     ordering we're promising is _within_ a single pack. The
     rest is apparently random.

  2. The point here is optimization. So we don't want to
     promise any particular ordering, but only to say that
     we will choose an ordering which is likely to be
     efficient for accessing the object content. That leaves
     the door open for further changes in the future without
     having to add another compatibility option.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 13:48:31 -07:00