git-commit-vandalism/builtin
Jeff King 5c5b29b459 cat-file: disable refs/replace with --batch-all-objects
When we're enumerating all objects in the object database, it doesn't
make sense to respect refs/replace. The point of this option is to
enumerate all of the objects in the database at a low level. By
definition we'd already show the replacement object's contents (under
its real oid), and showing those contents under another oid is almost
certainly working against what the user is trying to do.

Note that you could make the same argument for something like:

  git show-index <foo.idx |
  awk '{print $2}' |
  git cat-file --batch

but there we can't know in cat-file exactly what the user intended,
because we don't know the source of the input. They could be trying to
do low-level debugging, or they could be doing something more high-level
(e.g., imagine a porcelain built around cat-file for its object
accesses). So in those cases, we'll have to rely on the user specifying
"git --no-replace-objects" to tell us what to do.

One _could_ make an argument that "cat-file --batch" is sufficiently
low-level plumbing that it should not respect replace-objects at all
(and the caller should do any replacement if they want it).  But we have
been doing so for some time. The history is a little tangled:

  - looking back as far as v1.6.6, we would not respect replace refs for
    --batch-check, but would for --batch (because the former used
    sha1_object_info(), and the replace mechanism only affected actual
    object reads)

  - this discrepancy was made even weirder by 98e2092b50 (cat-file:
    teach --batch to stream blob objects, 2013-07-10), where we always
    output the header using the --batch-check code, and then printed the
    object separately. This could lead to "cat-file --batch" dying (when
    it notices the size or type changed for a non-blob object) or even
    producing bogus output (in streaming mode, we didn't notice that we
    wrote the wrong number of bytes).

  - that persisted until 1f7117ef7a (sha1_file: perform object
    replacement in sha1_object_info_extended(), 2013-12-11), which then
    respected replace refs for both forms.

So it has worked reliably this way for over 7 years, and we should make
sure it continues to do so. That could also be an argument that
--batch-all-objects should not change behavior (which this patch is
doing), but I really consider the current behavior to be an unintended
bug. It's a side effect of how the code is implemented (feeding the oids
back into oid_object_info() rather than looking at what we found while
reading the loose and packed object storage).

The implementation is straight-forward: we just disable the global
read_replace_refs flag when we're in --batch-all-objects mode. It would
perhaps be a little cleaner to change the flag we pass to
oid_object_info_extended(), but that's not enough. We also read objects
via read_object_file() and stream_blob_to_fd(). The former could switch
to its _extended() form, but the streaming code has no mechanism for
disabling replace refs. Setting the global flag works, and as a bonus,
it's impossible to have any "oops, we're sometimes replacing the object
and sometimes not" bugs in the output (like the ones caused by
98e2092b50 above).

The tests here cover the regular-input and --batch-all-objects cases,
for both --batch-check and --batch. There is a test in t6050 that covers
the regular-input case with --batch already, but this new one goes much
further in actually verifying the output (plus covering --batch-check
explicitly). This is perhaps a little overkill and the tests would be
simpler just covering --batch-check, but I wanted to make sure we're
checking that --batch output is consistent between the header and the
content. The global-flag technique used here makes that easy to get
right, but this is future-proofing us against regressions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08 15:45:14 -07:00
..
add.c dir.[ch]: replace dir_init() with DIR_INIT 2021-07-01 12:32:22 -07:00
am.c *.c static functions: add missing __attribute__((format)) 2021-07-13 15:20:20 -07:00
annotate.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
apply.c
archive.c
bisect--helper.c *.c static functions: add missing __attribute__((format)) 2021-07-13 15:20:20 -07:00
blame.c Merge branch 'rs/blame-optim' 2021-02-25 16:43:29 -08:00
branch.c ref-filter: reuse output buffer 2021-04-20 11:09:50 -07:00
bugreport.c builtin/bugreport: don't leak prefixed filename 2021-04-28 09:25:45 +09:00
bundle.c bundle: remove "ref_list" in favor of string-list.c API 2021-07-06 12:10:17 -07:00
cat-file.c cat-file: disable refs/replace with --batch-all-objects 2021-10-08 15:45:14 -07:00
check-attr.c
check-ignore.c dir.[ch]: replace dir_init() with DIR_INIT 2021-07-01 12:32:22 -07:00
check-mailmap.c shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
check-ref-format.c
checkout--worker.c builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints 2021-06-15 12:07:56 +09:00
checkout-index.c Merge branch 'mt/parallel-checkout-part-3' 2021-05-16 21:05:23 +09:00
checkout.c checkout: stop expanding sparse indexes 2021-07-14 15:05:53 -07:00
clean.c dir.[ch]: replace dir_init() with DIR_INIT 2021-07-01 12:32:22 -07:00
clone.c Merge branch 'jk/clone-clean-upon-transport-error' 2021-06-14 13:33:26 +09:00
column.c column, range-diff: downcase option description 2021-03-29 14:06:08 -07:00
commit-graph.c builtin/*: update usage format 2021-01-06 15:10:49 -08:00
commit-tree.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
commit.c Merge branch 'ds/commit-and-checkout-with-sparse-index' 2021-08-04 13:28:53 -07:00
config.c config: unify code paths to get global config paths 2021-04-19 14:16:59 -07:00
count-objects.c
credential-cache--daemon.c unix-socket: add backlog size option to unix_stream_listen() 2021-03-15 14:32:51 -07:00
credential-cache.c unix-socket: disallow chdir() when creating unix domain sockets 2021-03-15 14:32:51 -07:00
credential-store.c crendential-store: use timeout when locking file 2020-11-25 12:30:18 -08:00
credential.c credential: load default config 2020-10-16 12:30:45 -07:00
describe.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
diff-files.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff-index.c diff-merges: move specific diff-index "-m" handling to diff-index 2021-05-21 09:24:14 +09:00
diff-tree.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff.c Merge branch 'dl/diff-merge-base' 2021-07-28 13:17:59 -07:00
difftool.c Merge branch 'ab/cmd-foo-should-return' 2021-07-08 13:15:04 -07:00
env--helper.c assert PARSE_OPT_NONEG in parse-options callbacks 2020-09-30 12:53:47 -07:00
fast-export.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
fast-import.c Use the final_oid_fn to finalize hashing of object IDs 2021-04-27 16:31:38 +09:00
fetch-pack.c connect, transport: encapsulate arg in struct 2021-02-05 13:49:54 -08:00
fetch.c Merge branch 'ab/fetch-negotiate-segv-fix' 2021-07-16 17:42:48 -07:00
fmt-merge-msg.c Lib-ify fmt-merge-msg 2020-03-24 15:04:43 -07:00
for-each-ref.c Merge branch 'ah/plugleaks' 2021-05-07 12:47:41 +09:00
for-each-repo.c builtin/for-each-repo: remove unnecessary argv copy to plug leak 2021-07-26 12:19:20 -07:00
fsck.c Merge branch 'ab/fsck-api-cleanup' 2021-06-02 07:34:27 +09:00
gc.c maintenance: fix two memory leaks 2021-05-12 07:00:45 +09:00
get-tar-commit-id.c
grep.c dir.[ch]: replace dir_init() with DIR_INIT 2021-07-01 12:32:22 -07:00
hash-object.c
help.c help: convert git_cmd to page in one place 2021-07-06 13:09:20 -07:00
index-pack.c *.c static functions: don't forward-declare __attribute__ 2021-07-12 12:09:53 -07:00
init-db.c Merge branch 'mt/init-template-userpath-fix' 2021-05-25 16:21:20 +09:00
interpret-trailers.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
log.c Merge branch 'jk/log-decorate-optim' 2021-07-28 13:17:58 -07:00
ls-files.c dir.[ch]: replace dir_init() with DIR_INIT 2021-07-01 12:32:22 -07:00
ls-remote.c Merge branch 'ah/plugleaks' 2021-04-07 16:54:08 -07:00
ls-tree.c tree.h API: simplify read_tree_recursive() signature 2021-03-20 16:09:26 -07:00
mailinfo.c mailinfo: allow squelching quoted CRLF warning 2021-05-10 15:06:22 +09:00
mailsplit.c
merge-base.c rebase: --fork-point regression fix 2020-02-11 09:59:39 -08:00
merge-file.c
merge-index.c merge-index: ensure full index 2021-04-14 13:47:21 -07:00
merge-ours.c builtins + test helpers: use return instead of exit() in cmd_* 2021-06-09 09:15:58 +09:00
merge-recursive.c
merge-tree.c xdiff users: use designated initializers for out_line 2021-05-11 12:47:31 +09:00
merge.c Merge branch 'pb/merge-autostash-more' 2021-08-04 13:28:54 -07:00
mktag.c fsck.c: add an fsck_set_msg_type() API that takes enums 2021-03-28 19:03:10 -07:00
mktree.c builtins + test helpers: use return instead of exit() in cmd_* 2021-06-09 09:15:58 +09:00
multi-pack-index.c multi-pack-index: fix potential segfault without sub-command 2021-07-19 15:24:01 -07:00
mv.c builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv 2021-07-26 12:19:20 -07:00
name-rev.c oid_pos(): access table through const pointers 2021-01-28 12:03:26 -08:00
notes.c use CALLOC_ARRAY 2021-03-13 16:00:09 -08:00
pack-objects.c Merge branch 'ab/pack-linkage-fix' 2021-05-27 12:36:58 +09:00
pack-redundant.c builtin/pack-redundant: avoid casting buffers to struct object_id 2021-04-27 16:31:38 +09:00
pack-refs.c
patch-id.c patch-id: use oid_to_hex() to print multiple object IDs 2019-12-09 12:26:40 -08:00
prune-packed.c Lib-ify prune-packed 2020-03-24 15:04:44 -07:00
prune.c Merge branch 'tb/shallow-cleanup' 2020-05-13 12:19:18 -07:00
pull.c pull: trivial whitespace style fix 2021-06-19 16:36:17 +09:00
push.c push: don't get a full remote object 2021-06-02 10:12:03 +09:00
range-diff.c column, range-diff: downcase option description 2021-03-29 14:06:08 -07:00
read-tree.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
rebase.c builtin/rebase: fix options.strategy memory lifecycle 2021-07-26 12:19:21 -07:00
receive-pack.c *.c static functions: don't forward-declare __attribute__ 2021-07-12 12:09:53 -07:00
reflog.c reflog expire --stale-fix: be generous about missing objects 2021-02-11 09:21:52 -08:00
remote-ext.c strvec: convert builtin/ callers away from argv_array name 2020-07-28 15:02:18 -07:00
remote-fd.c
remote.c Merge branch 'ah/plugleaks' 2021-04-07 16:54:08 -07:00
repack.c repack: avoid loosening promisor objects in partial clones 2021-04-28 13:36:13 +09:00
replace.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
rerere.c xdiff users: use designated initializers for out_line 2021-05-11 12:47:31 +09:00
reset.c reset: free instead of leaking unneeded ref 2021-03-14 15:57:59 -07:00
rev-list.c rev-list: add option for --pretty=format without header 2021-07-12 10:12:31 -07:00
rev-parse.c rev-parse: mark die() messages for translation 2021-05-17 18:39:53 +09:00
revert.c sequencer: fix edit handling for cherry-pick and revert messages 2021-03-31 14:10:50 -07:00
rm.c Merge branch 'ah/plugleaks' 2021-05-07 12:47:41 +09:00
send-pack.c push: parse and set flag for "--force-if-includes" 2020-10-03 09:59:19 -07:00
shortlog.c Merge branch 'ab/mailmap' 2021-01-25 14:19:19 -08:00
show-branch.c show-branch: don't <COLOR></RESET> for space characters 2021-06-28 09:33:06 -07:00
show-index.c builtin/show-index: set the algorithm for object IDs 2021-04-27 16:31:39 +09:00
show-ref.c refs: switch peel_ref() to peel_iterated_oid() 2021-01-21 15:51:31 -08:00
sparse-checkout.c use fspathhash() everywhere 2021-07-30 12:14:27 -07:00
stash.c Merge branch 'ab/struct-init' 2021-07-16 17:42:53 -07:00
stripspace.c
submodule--helper.c Merge branch 'ah/plugleaks' 2021-08-04 13:28:52 -07:00
symbolic-ref.c symbolic-ref: don't leak shortened refname in check_symref() 2021-03-14 15:57:59 -07:00
tag.c ref-filter: reuse output buffer 2021-04-20 11:09:50 -07:00
unpack-file.c
unpack-objects.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
update-index.c update-index: ensure full index 2021-04-14 13:47:29 -07:00
update-ref.c update-ref: disallow "start" for ongoing transactions 2020-11-16 13:44:01 -08:00
update-server-info.c
upload-archive.c strvec: rename struct fields 2020-07-30 19:18:06 -07:00
upload-pack.c
var.c
verify-commit.c
verify-pack.c Merge branch 'bc/sha-256-part-3' 2020-08-11 18:04:11 -07:00
verify-tag.c
worktree.c worktree: teach add to accept --reason <string> with --lock 2021-07-15 13:30:59 -07:00
write-tree.c