p5551: add a script to test fetch pack-dir rescans

Since fetch often deals with object-ids we don't have (yet),
it's an easy mistake for it to use a function like
parse_object() that gives the correct result (e.g., NULL)
but does so very slowly (because after failing to find the
object, we re-scan the pack directory looking for new
packs).

The regular test suite won't catch this because the end
result is correct, but we would want to know about
performance regressions, too. Let's add a test to the
perf suite (t/perf).

Note that this uses a synthetic repository that has a large
number of packs. That's not ideal, as it means we're not
testing what "normal" users see (in fact, some of these
problems have existed for ages without anybody noticing
simply because a rescan on a normal repository just isn't
that expensive).

So what we're really looking for here is the spike you'd
notice in a pathological case (a lot of unknown objects
coming into a repo with a lot of packs). If that's fast,
then the normal cases should be, too.
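As a rough illustration of the pathological setup described above, the sketch below builds a throwaway repository holding many tiny packs. It is only a stand-in under stated assumptions: it is not the setup_many_packs helper the test uses, and the pack count of 20 and the blob contents are arbitrary.

```shell
#!/bin/sh
# Sketch only: create a scratch repository with many single-object packs.
# This is NOT the setup_many_packs helper from t/perf/lib-pack.sh; the
# count (20) and the blob contents are arbitrary illustrative choices.
set -e

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

i=1
while test "$i" -le 20
do
	# Write a unique loose blob, then put it into a pack of its own.
	oid=$(echo "blob $i" | git hash-object -w --stdin)
	echo "$oid" | git pack-objects -q .git/objects/pack/pack >/dev/null
	i=$((i + 1))
done

# Each unique blob yields a distinct pack file.
pack_count=$(ls .git/objects/pack/*.pack | wc -l | tr -d ' ')
echo "packs: $pack_count"
```

Every lookup of an object that is in none of these packs has to consult each pack index, which is why the cost of an unnecessary rescan grows with the number of packs.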

Note that the test also makes liberal use of $MODERN_GIT for
setup; some of these regressions go back a ways, and we
should be able to use it to find the problems there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King 2017-11-20 15:28:28 -05:00 committed by Junio C Hamano
parent 0a11e40275
commit 7893bf1720

t/perf/p5551-fetch-rescan.sh Executable file

@@ -0,0 +1,55 @@
#!/bin/sh

test_description='fetch performance with many packs

It is common for fetch to consider objects that we might not have, and it is an
easy mistake for the code to use a function like `parse_object` that might
give the correct _answer_ on such an object, but do so slowly (due to
re-scanning the pack directory for lookup failures).

The resulting performance drop can be hard to notice in a real repository, but
becomes quite large in a repository with a large number of packs. So this
test creates a more pathological case, since any mistakes would produce a more
noticeable slowdown.
'
. ./perf-lib.sh
. "$TEST_DIRECTORY"/perf/lib-pack.sh

test_expect_success 'create parent and child' '
	git init parent &&
	git clone parent child
'

test_expect_success 'create refs in the parent' '
	(
		cd parent &&
		git commit --allow-empty -m foo &&
		head=$(git rev-parse HEAD) &&
		test_seq 1000 |
		sed "s,.*,update refs/heads/& $head," |
		$MODERN_GIT update-ref --stdin
	)
'

test_expect_success 'create many packs in the child' '
	(
		cd child &&
		setup_many_packs
	)
'

test_perf 'fetch' '
	# start at the same state for each iteration
	obj=$($MODERN_GIT -C parent rev-parse HEAD) &&
	(
		cd child &&
		$MODERN_GIT for-each-ref --format="delete %(refname)" refs/remotes |
		$MODERN_GIT update-ref --stdin &&
		rm -vf .git/objects/$(echo $obj | sed "s|^..|&/|") &&
		git fetch
	)
'

test_done
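
The two sed expressions in the script are terse, so here is a small standalone sketch of what each one produces. The object id is a made-up value used only for illustration, and three ref names stand in for the 1000 generated by test_seq.

```shell
#!/bin/sh
# Hypothetical object id, for illustration only.
obj=0123456789abcdef0123456789abcdef01234567

# Insert a slash after the first two hex digits; this mirrors how the
# 'fetch' test derives the loose-object path it removes.
loose_path=.git/objects/$(echo "$obj" | sed "s|^..|&/|")
echo "$loose_path"

# Turn each input line N into an "update refs/heads/N <oid>" command,
# the form the script pipes into "update-ref --stdin".
commands=$(printf '%s\n' 1 2 3 | sed "s,.*,update refs/heads/& $obj,")
echo "$commands"
```

Removing the loose object and deleting the remote-tracking refs is what puts the child back into the same "has not fetched yet" state before each timed run.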