git-commit-vandalism/t/t4064-diff-oidfind.sh
Jeff King 957876f17d combine-diff: handle --find-object in multitree code path
When doing combined diffs, we have two possible code paths:

  - a slower one which independently diffs against each parent, applies
    any filters, and then intersects the resulting paths

  - a faster one which walks all trees simultaneously

When the diff options specify that we must do certain filters, like
pickaxe, then we always use the slow path, since the pickaxe code only
knows how to handle filepairs, not the n-parent entries generated for
combined diffs.

But there are two problems with the slow path:

 1. It's slow. Running:

      git rev-list HEAD | git diff-tree --stdin -r -c

    in git.git takes ~3s on my machine. But adding "--find-object" to
    that increases it to ~6s, even though find-object itself should
    incur only a few extra oid comparisons. On linux.git, it's even
    worse: 35s versus 215s.

 2. It doesn't catch all cases where a particular path is interesting.
    Consider a merge with parent blobs X and Y for a particular path,
    and end result Z. That should be interesting according to "-c",
    because the result doesn't match either parent. And it should be
    interesting even with "--find-object=X", because "X" went away in
    the merge.

    But because we perform each pairwise diff independently, this
    confuses the intersection code. The change from X to Z is still
    interesting according to --find-object. But in the other parent we
    went from Y to Z; neither side is X, so the filter drops that pair
    and the diff appears empty! That causes the intersection code to
    think that parent didn't change the path, and thus it's not
    interesting for "-c".
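
To make the second problem concrete, here is a toy model in plain shell. It is only a sketch of the reasoning above, not git's actual code: the function names are made up for illustration, and single-letter strings stand in for blob ids.

```sh
#!/bin/sh
# Toy model only: "oids" are plain strings and both function names are
# hypothetical, chosen for this illustration.

# pairwise_interesting TARGET RESULT PARENT...
# Slow path: diff the result against each parent on its own, keep a pair
# only if it changed AND involves TARGET, then intersect across parents.
pairwise_interesting () {
	target=$1 result=$2
	shift 2
	for parent
	do
		# an unchanged pair never shows up in that parent's diff
		test "$parent" != "$result" || return 1
		# the per-pair filter drops pairs that do not involve TARGET
		test "$parent" = "$target" || test "$result" = "$target" || return 1
	done
	return 0
}

# multitree_interesting TARGET RESULT PARENT...
# n-way view: the result must differ from every parent (the "-c" rule),
# and TARGET must appear as the result or as any one of the parents.
multitree_interesting () {
	target=$1 result=$2
	shift 2
	found=no
	if test "$result" = "$target"
	then
		found=yes
	fi
	for parent
	do
		test "$parent" != "$result" || return 1
		if test "$parent" = "$target"
		then
			found=yes
		fi
	done
	test "$found" = yes
}

# The merge from the text: parents X and Y, result Z, searching for X.
if pairwise_interesting X Z X Y
then
	echo "pairwise: shown"
else
	echo "pairwise: missed"
fi
if multitree_interesting X Z X Y
then
	echo "multitree: shown"
else
	echo "multitree: missed"
fi
```

Run as-is, the pairwise model misses the merge and the n-way model shows it. In the pairwise model the Y-to-Z pair is dropped before the intersection ever sees it, while the n-way check looks at all parents at once.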

This patch fixes both by implementing --find-object for the multitree
code. It's a bit unfortunate that we have to duplicate some logic from
diffcore-pickaxe, but this is the best we can do for now. In an ideal
world, all of the diffcore code would stop thinking about filepairs and
start thinking about n-parent sets, and we could use the multitree walk
with all of it.

Until then, there are some leftover warts:

  - other pickaxe operations, like -S or -G, still suffer from both
    problems. These would be hard to adapt because they rely on having
    a "struct diff_filespec" for each path to look at content. And we'd
    need to define what an n-way "change" means in each case (probably
    easy for "-S", which can compare counts, but not so clear for -G,
    which is about grepping diffs).

  - other options besides --find-object may cause us to use the slow
    pairwise path, in which case we'll go back to producing a different
    (wrong) answer for the X/Y/Z case above.

We may be able to hack around these, but I think the ultimate solution
will be a larger rewrite of the diffcore code. For now, this patch
improves one specific case but leaves the rest.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 13:35:24 -07:00

#!/bin/sh

test_description='test finding specific blobs in the revision walking'
. ./test-lib.sh

test_expect_success 'setup ' '
	git commit --allow-empty -m "empty initial commit" &&

	echo "Hello, world!" >greeting &&
	git add greeting &&
	git commit -m "add the greeting blob" && # borrowed from Git from the Bottom Up
	git tag -m "the blob" greeting $(git rev-parse HEAD:greeting) &&

	echo asdf >unrelated &&
	git add unrelated &&
	git commit -m "unrelated history" &&

	git revert HEAD^ &&

	git commit --allow-empty -m "another unrelated commit"
'

test_expect_success 'find the greeting blob' '
	cat >expect <<-EOF &&
	Revert "add the greeting blob"
	add the greeting blob
	EOF

	git log --format=%s --find-object=greeting^{blob} >actual &&

	test_cmp expect actual
'

test_expect_success 'setup a tree' '
	mkdir a &&
	echo asdf >a/file &&
	git add a/file &&
	git commit -m "add a file in a subdirectory"
'

test_expect_success 'find a tree' '
	cat >expect <<-EOF &&
	add a file in a subdirectory
	EOF

	git log --format=%s -t --find-object=HEAD:a >actual &&

	test_cmp expect actual
'

test_expect_success 'setup a submodule' '
	test_create_repo sub &&
	test_commit -C sub sub &&
	git submodule add ./sub sub &&
	git commit -a -m "add sub"
'

test_expect_success 'find a submodule' '
	cat >expect <<-EOF &&
	add sub
	EOF

	git log --format=%s --find-object=HEAD:sub >actual &&

	test_cmp expect actual
'

test_expect_success 'set up merge tests' '
	test_commit base &&

	git checkout -b boring base^ &&
	echo boring >file &&
	git add file &&
	git commit -m boring &&

	git checkout -b interesting base^ &&
	echo interesting >file &&
	git add file &&
	git commit -m interesting &&

	blob=$(git rev-parse interesting:file)
'

test_expect_success 'detect merge which introduces blob' '
	git checkout -B merge base &&
	git merge --no-commit boring &&
	echo interesting >file &&
	git commit -am "introduce blob" &&
	git diff-tree --format=%s --find-object=$blob -c --name-status HEAD >actual &&
	cat >expect <<-\EOF &&
	introduce blob

	AM	file
	EOF
	test_cmp expect actual
'

test_expect_success 'detect merge which removes blob' '
	git checkout -B merge interesting &&
	git merge --no-commit base &&
	echo boring >file &&
	git commit -am "remove blob" &&
	git diff-tree --format=%s --find-object=$blob -c --name-status HEAD >actual &&
	cat >expect <<-\EOF &&
	remove blob

	MA	file
	EOF
	test_cmp expect actual
'

test_expect_success 'do not detect merge that does not touch blob' '
	git checkout -B merge interesting &&
	git merge -m "untouched blob" base &&
	git diff-tree --format=%s --find-object=$blob -c --name-status HEAD >actual &&
	cat >expect <<-\EOF &&
	untouched blob

	EOF
	test_cmp expect actual
'

test_done