git-commit-vandalism/t/t5312-prune-corruption.sh
Jeff King ea56c4e02f refs.c: drop curate_packed_refs
When we delete a ref, we have to rewrite the entire
packed-refs file. We take this opportunity to "curate" the
packed-refs file and drop any entries that are crufty or
broken.

Dropping broken entries (e.g., with bogus names, or ones
that point to missing objects) is actively a bad idea, as it
means that we lose any notion that the data was there in the
first place. Aside from the general hackiness that we might
lose any information about ref "foo" while deleting an
unrelated ref "bar", this may seriously hamper any attempts
by the user at recovering from the corruption in "foo".

They will lose the sha1 and name of "foo"; the exact pointer
may still be useful even if they recover missing objects
from a different copy of the repository. But worse, once the
ref is gone, there is no trace of the corruption. A
follow-up "git prune" may delete objects, even though it
would otherwise bail when seeing corruption.

We could just drop the "broken" bits from
curate_packed_refs, and continue to drop the "crufty" bits:
refs whose loose counterpart exists in the filesystem. This
is not wrong to do, and it does have the advantage that we
may write out a slightly smaller packed-refs file. But it
has two disadvantages:

  1. It is a potential source of races or mistakes with
     respect to these refs that are otherwise unrelated to
     the operation. To my knowledge, there aren't any active
     problems in this area, but it seems like an unnecessary
     risk.

  2. We have to spend time looking up the matching loose
     refs for every item in the packed-refs file. If you
     have a large number of packed refs that do not change,
     that outweighs the benefit from writing out a smaller
     packed-refs file (it doesn't get smaller, and you do a
     bunch of directory traversal to find that out).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-20 12:41:41 -07:00

115 lines
3.6 KiB
Bash
Executable File

#!/bin/sh
test_description='
Test pruning of repositories with minor corruptions. The goal
here is that we should always be erring on the side of safety. So
if we see, for example, a ref with a bogus name, it is OK either to
bail out or to proceed using it as a reachable tip, but it is _not_
OK to proceed as if it did not exist. Otherwise we might silently
delete objects that cannot be recovered.
'
. ./test-lib.sh
test_expect_success 'disable reflogs' '
git config core.logallrefupdates false &&
rm -rf .git/logs
'
test_expect_success 'create history reachable only from a bogus-named ref' '
test_tick && git commit --allow-empty -m master &&
base=$(git rev-parse HEAD) &&
test_tick && git commit --allow-empty -m bogus &&
bogus=$(git rev-parse HEAD) &&
git cat-file commit $bogus >saved &&
echo $bogus >.git/refs/heads/bogus..name &&
git reset --hard HEAD^
'
test_expect_success 'pruning does not drop bogus object' '
test_when_finished "git hash-object -w -t commit saved" &&
test_might_fail git prune --expire=now &&
verbose git cat-file -e $bogus
'
test_expect_success 'put bogus object into pack' '
git tag reachable $bogus &&
git repack -ad &&
git tag -d reachable &&
verbose git cat-file -e $bogus
'
test_expect_success 'destructive repack keeps packed object' '
test_might_fail git repack -Ad --unpack-unreachable=now &&
verbose git cat-file -e $bogus &&
test_might_fail git repack -ad &&
verbose git cat-file -e $bogus
'
# subsequent tests will have different corruptions
test_expect_success 'clean up bogus ref' '
rm .git/refs/heads/bogus..name
'
# We create two new objects here, "one" and "two". Our
# master branch points to "two", which is deleted,
# corrupting the repository. But we'd like to make sure
# that the otherwise unreachable "one" is not pruned
# (since it is the user's best bet for recovering
# from the corruption).
#
# Note that we also point HEAD somewhere besides "two",
# as we want to make sure we test the case where we
# pick up the reference to "two" by iterating the refs,
# not by resolving HEAD.
test_expect_success 'create history with missing tip commit' '
test_tick && git commit --allow-empty -m one &&
recoverable=$(git rev-parse HEAD) &&
git cat-file commit $recoverable >saved &&
test_tick && git commit --allow-empty -m two &&
missing=$(git rev-parse HEAD) &&
git checkout --detach $base &&
rm .git/objects/$(echo $missing | sed "s,..,&/,") &&
test_must_fail git cat-file -e $missing
'
test_expect_success 'pruning with a corrupted tip does not drop history' '
test_when_finished "git hash-object -w -t commit saved" &&
test_might_fail git prune --expire=now &&
verbose git cat-file -e $recoverable
'
test_expect_success 'pack-refs does not silently delete broken loose ref' '
git pack-refs --all --prune &&
echo $missing >expect &&
git rev-parse refs/heads/master >actual &&
test_cmp expect actual
'
# we do not want to count on running pack-refs to
# actually pack it, as it is perfectly reasonable to
# skip processing a broken ref
test_expect_success 'create packed-refs file with broken ref' '
rm -f .git/refs/heads/master &&
cat >.git/packed-refs <<-EOF &&
$missing refs/heads/master
$recoverable refs/heads/other
EOF
echo $missing >expect &&
git rev-parse refs/heads/master >actual &&
test_cmp expect actual
'
test_expect_success 'pack-refs does not silently delete broken packed ref' '
git pack-refs --all --prune &&
git rev-parse refs/heads/master >actual &&
test_cmp expect actual
'
test_expect_success 'pack-refs does not drop broken refs during deletion' '
git update-ref -d refs/heads/other &&
git rev-parse refs/heads/master >actual &&
test_cmp expect actual
'
test_done