p5551: add a script to test fetch pack-dir rescans

Since fetch often deals with object-ids we don't have (yet), it's an easy mistake for it to use a function like parse_object() that gives the correct result (e.g., NULL) but does so very slowly (because after failing to find the object, we re-scan the pack directory looking for new packs). The regular test suite won't catch this because the end result is correct, but we would want to know about performance regressions, too. Let's add a test to the regression suite. Note that this uses a synthetic repository that has a large number of packs. That's not ideal, as it means we're not testing what "normal" users see (in fact, some of these problems have existed for ages without anybody noticing simply because a rescan on a normal repository just isn't that expensive). So what we're really looking for here is the spike you'd notice in a pathological case (a lot of unknown objects coming into a repo with a lot of packs). If that's fast, then the normal cases should be, too. Note that the test also makes liberal use of $MODERN_GIT for setup; some of these regressions go back a ways, and we should be able to use it to find the problems there. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-20 15:28:28 -05:00 · 2017-11-20 15:28:28 -05:00 · 7893bf1720
commit 7893bf1720
parent 0a11e40275
1 changed files with 55 additions and 0 deletions
--- a/t/perf/p5551-fetch-rescan.sh
+++ b/t/perf/p5551-fetch-rescan.sh
@ -0,0 +1,55 @@
+#!/bin/sh
+
+test_description='fetch performance with many packs
+
+It is common for fetch to consider objects that we might not have, and it is an
+easy mistake for the code to use a function like `parse_object` that might
+give the correct _answer_ on such an object, but do so slowly (due to
+re-scanning the pack directory for lookup failures).
+
+The resulting performance drop can be hard to notice in a real repository, but
+becomes quite large in a repository with a large number of packs. So this
+test creates a more pathological case, since any mistakes would produce a more
+noticeable slowdown.
+'
+. ./perf-lib.sh
+. "$TEST_DIRECTORY"/perf/lib-pack.sh
+
+test_expect_success 'create parent and child' '
+	git init parent &&
+	git clone parent child
+'
+
+
+test_expect_success 'create refs in the parent' '
+	(
+		cd parent &&
+		git commit --allow-empty -m foo &&
+		head=$(git rev-parse HEAD) &&
+		test_seq 1000 |
+		sed "s,.*,update refs/heads/& $head," |
+		$MODERN_GIT update-ref --stdin
+	)
+'
+
+test_expect_success 'create many packs in the child' '
+	(
+		cd child &&
+		setup_many_packs
+	)
+'
+
+test_perf 'fetch' '
+	# start at the same state for each iteration
+	obj=$($MODERN_GIT -C parent rev-parse HEAD) &&
+	(
+		cd child &&
+		$MODERN_GIT for-each-ref --format="delete %(refname)" refs/remotes |
+		$MODERN_GIT update-ref --stdin &&
+		rm -vf .git/objects/$(echo $obj | sed "s|^..|&/|") &&
+
+		git fetch
+	)
+'
+
+test_done