2013-12-21 15:00:42 +01:00
|
|
|
#!/bin/sh
|
|
|
|
|
|
|
|
test_description='Tests pack performance using bitmaps'
|
|
|
|
. ./perf-lib.sh
|
2021-08-31 22:52:46 +02:00
|
|
|
. "${TEST_DIRECTORY}/perf/lib-bitmap.sh"
|
2013-12-21 15:00:42 +01:00
|
|
|
|
|
|
|
test_perf_large_repo
|
|
|
|
|
|
|
|
# note that we do everything through config,
|
|
|
|
# since we want to be able to compare bitmap-aware
|
|
|
|
# git versus non-bitmap git
|
2014-06-10 22:20:30 +02:00
|
|
|
#
|
|
|
|
# We intentionally use the deprecated pack.writebitmaps
|
|
|
|
# config so that we can test against older versions of git.
|
2013-12-21 15:00:42 +01:00
|
|
|
test_expect_success 'setup bitmap config' '
|
pack-objects: default to writing bitmap hash-cache
Enabling pack.writebitmaphashcache should always be a performance win.
It costs only 4 bytes per object on disk, and the timings in ae4f07fbcc
(pack-bitmap: implement optional name_hash cache, 2013-12-21) show it
improving fetch and partial-bitmap clone times by 40-50%.
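To put the "4 bytes per object" cost in perspective, here is a rough back-of-the-envelope sketch. The object count is a made-up round number for illustration, not a measurement of any real repository.

```shell
#!/bin/sh
# Rough on-disk cost of the name-hash cache at 4 bytes per object.
objects=10000000        # hypothetical object count, NOT a measured value
bytes_per_object=4      # figure quoted in the commit message
cache_bytes=$(( objects * bytes_per_object ))
cache_mib=$(( cache_bytes / 1024 / 1024 ))
printf '%d objects -> %d bytes (about %d MiB)\n' \
	"$objects" "$cache_bytes" "$cache_mib"
```

Under that assumption, the cache adds a few tens of megabytes to the bitmap file.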
The only reason we didn't enable it by default at the time is that early
versions of JGit's bitmap reader complained about the presence of
optional header bits it didn't understand. But that was changed in
JGit's d2fa3987a (Use bitcheck to check for presence of OPT_FULL option,
2013-10-30), which made it into JGit v3.5.0 in late 2014.
So let's turn this option on by default. It's backwards-compatible with
all versions of Git, and if you are also using JGit on the same
repository, you'd only run into problems using a version that's almost 5
years old.
We'll drop the manual setting from all of our test scripts, including
perf tests. This isn't strictly necessary, but it has two advantages:
1. If the hash-cache ever stops being enabled by default, our perf
regression tests will notice.
2. We can use the modified perf tests to show off the behavior of an
otherwise unconfigured repo, as shown below.
These are the results of a few of the perf tests against linux.git that
showed interesting results. You can see the expected speedup in 5310.4,
which was noted in ae4f07fbcc. Curiously, 5310.8 did not improve (and
actually got slower), despite seeing the opposite in ae4f07fbcc.
I don't have an explanation for that.
The tests from p5311 did not exist back then, but do show improvements
(a smaller pack due to better deltas, which we found in less time).
Test                             HEAD^               HEAD                Change
-------------------------------------------------------------------------------------
5310.4: simulated fetch          7.39(22.70+0.25)    5.64(11.43+0.22)    -23.7%
5310.8: clone (partial bitmap)   18.45(24.83+1.19)   19.94(28.40+1.36)   +8.1%
5311.31: server (128 days)       0.41(1.13+0.05)     0.34(0.72+0.02)     -17.1%
5311.32: size (128 days)         7.4M                7.0M                -4.8%
5311.33: client (128 days)       1.33(1.49+0.06)     1.29(1.37+0.12)     -3.0%
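The percentage column appears to be the relative change of the first number in each cell (the elapsed time; the parenthesized pair looks like user+system CPU time) between the two revisions. A minimal sketch of that arithmetic follows. `pct_change` is a hypothetical helper written for this illustration and is not part of the perf suite.

```shell
#!/bin/sh
# pct_change OLD NEW: print the relative change from OLD to NEW as a signed
# percentage with one decimal place. Hypothetical helper, not part of t/perf.
pct_change () {
	awk -v old="$1" -v new="$2" \
		'BEGIN { printf "%+.1f%%\n", (new - old) / old * 100 }'
}

# Elapsed times taken from the table above.
pct_change 7.39 5.64     # 5310.4: simulated fetch        -> -23.7%
pct_change 18.45 19.94   # 5310.8: clone (partial bitmap) -> +8.1%
```

A negative value means the newer revision was faster.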
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-15 07:25:28 +01:00
|
|
|
git config pack.writebitmaps true
|
2013-12-21 15:00:42 +01:00
|
|
|
'
|
|
|
|
|
pack-bitmap: avoid traversal of objects referenced by uninteresting tag
When preparing the bitmap walk, we first establish the set of have
and want objects by iterating over the set of pending objects: if an
object is marked as uninteresting, it's declared as an object we already
have, otherwise as an object we want. These two sets are then used to
compute which transitively referenced objects we need to obtain.
One special case is tag objects: when a tag is requested, we
resolve it to its first non-tag object and add both the resolved object
and the tag itself to either the have or want set. Given that the
uninteresting-property always propagates to referenced objects, it is
clear that if the tag is uninteresting, so are its children and vice
versa. But we fail to propagate the flag, which effectively means that
referenced objects will always be interesting except for the case where
they have already been marked as uninteresting explicitly.
This mislabeling does not impact correctness: the referenced object now
ends up in our "wants" set, but since we later do an `AND NOT` of the
"wants" and "haves" bitmaps, it is clear that the result must be the same.
But we now start to needlessly traverse the tag's referenced objects in
case it is uninteresting, even though we know that each referenced
object will be uninteresting anyway. In the worst case, this can lead to
a complete graph walk just to establish that we do not care for any
object.
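The correctness argument above can be sketched with plain integer bitmasks. Everything here is a toy model: an 8-object repository with made-up bit patterns, not Git's actual bitmap format or code.

```shell
#!/bin/sh
# Toy 8-object repository: bit i set means "object i is reachable".
# All values are made up for illustration.
haves=$(( 0x36 ))        # objects reachable from the uninteresting tips
wants=$(( 0xF0 ))        # objects reachable from the requested tips
mislabeled=$(( 0x02 ))   # tag target wrongly treated as a want; already in haves

# Objects to report = wants AND NOT haves.
result=$(( wants & ~haves ))
result_mislabeled=$(( (wants | mislabeled) & ~haves ))

printf 'correct labels: 0x%02X\n' "$result"
printf 'mislabeled:     0x%02X\n' "$result_mislabeled"
```

Both lines print `0xC0`: the extra "want" bit is cleared by the `AND NOT`, so only the cost of computing the wants bitmap changes, not the result.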
Fix the issue by propagating the `UNINTERESTING` flag to pointees of tag
objects and add a benchmark with negative revisions to p5310. This shows
some nice performance benefits, tested with linux.git:
Test                                                          HEAD~                   HEAD                    Change
---------------------------------------------------------------------------------------------------------------
5310.3: repack to disk                                        193.18(181.46+16.42)    194.61(183.41+15.83)    +0.7%
5310.4: simulated clone                                       25.93(24.88+1.05)       25.81(24.73+1.08)       -0.5%
5310.5: simulated fetch                                       2.64(5.30+0.69)         2.59(5.16+0.65)         -1.9%
5310.6: pack to file (bitmap)                                 58.75(57.56+6.30)       58.29(57.61+5.73)       -0.8%
5310.7: rev-list (commits)                                    1.45(1.18+0.26)         1.46(1.22+0.24)         +0.7%
5310.8: rev-list (objects)                                    15.35(14.22+1.13)       15.30(14.23+1.07)       -0.3%
5310.9: rev-list with tag negated via --not --all (objects)   22.49(20.93+1.56)       0.11(0.09+0.01)         -99.5%
5310.10: rev-list with negative tag (objects)                 0.61(0.44+0.16)         0.51(0.35+0.16)         -16.4%
5310.11: rev-list count with blob:none                        12.15(11.19+0.96)       12.18(11.19+0.99)       +0.2%
5310.12: rev-list count with blob:limit=1k                    17.77(15.71+2.06)       17.75(15.63+2.12)       -0.1%
5310.13: rev-list count with tree:0                           1.69(1.31+0.38)         1.68(1.28+0.39)         -0.6%
5310.14: simulated partial clone                              20.14(19.15+0.98)       19.98(18.93+1.05)       -0.8%
5310.16: clone (partial bitmap)                               12.78(13.89+1.07)       12.72(13.99+1.01)       -0.5%
5310.17: pack to file (partial bitmap)                        42.07(45.44+2.72)       41.44(44.66+2.80)       -1.5%
5310.18: rev-list with tree filter (partial bitmap)           0.44(0.29+0.15)         0.46(0.32+0.14)         +4.5%
While most benchmarks are probably in the range of noise, the newly
added 5310.9 and 5310.10 benchmarks consistently perform better.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22 13:19:06 +01:00
|
|
|
# we need to create the tag up front such that it is covered by the repack and
|
|
|
|
# thus by generated bitmaps.
|
|
|
|
test_expect_success 'create tags' '
|
|
|
|
git tag --message="tag pointing to HEAD" perf-tag HEAD
|
|
|
|
'
|
|
|
|
|
2013-12-21 15:00:42 +01:00
|
|
|
test_perf 'repack to disk' '
|
|
|
|
git repack -ad
|
|
|
|
'
|
|
|
|
|
2021-08-31 22:52:46 +02:00
|
|
|
test_full_bitmap
|
2020-02-14 19:22:41 +01:00
|
|
|
|
2013-12-21 15:00:42 +01:00
|
|
|
test_expect_success 'create partial bitmap state' '
|
|
|
|
# pick a commit to represent the repo tip in the past
|
|
|
|
cutoff=$(git rev-list HEAD~100 -1) &&
|
|
|
|
orig_tip=$(git rev-parse HEAD) &&
|
|
|
|
|
|
|
|
# now kill off all of the refs and pretend we had
|
|
|
|
# just the one tip
|
2015-06-26 23:27:00 +02:00
|
|
|
rm -rf .git/logs .git/refs/* .git/packed-refs &&
|
|
|
|
git update-ref HEAD $cutoff &&
|
2013-12-21 15:00:42 +01:00
|
|
|
|
|
|
|
# and then repack, which will leave us with a nice
|
|
|
|
# big bitmap pack of the "old" history, and all of
|
|
|
|
# the new history will be loose, as if it had been pushed
|
|
|
|
# up incrementally and exploded via unpack-objects
|
2015-06-26 23:27:00 +02:00
|
|
|
git repack -Ad &&
|
2013-12-21 15:00:42 +01:00
|
|
|
|
|
|
|
# and now restore our original tip, as if the pushes
|
|
|
|
# had happened
|
|
|
|
git update-ref HEAD $orig_tip
|
|
|
|
'
|
|
|
|
|
2021-08-31 22:52:46 +02:00
|
|
|
test_partial_bitmap
|
pack-bitmap: pass object filter to fill-in traversal
Sometimes a bitmap traversal still has to walk some commits manually,
because those commits aren't included in the bitmap packfile (e.g., due
to a push or commit since the last full repack). If we're given an
object filter, we don't pass it down to this traversal. It's not
necessary for correctness because the bitmap code has its own filters to
post-process the bitmap result (which it must, to filter out the objects
that _are_ mentioned in the bitmapped packfile).
And with blob filters, there was no performance reason to pass along
those filters, either. The fill-in traversal could omit them from the
result, but it wouldn't save us any time to do so, since we'd still have
to walk each tree entry to see if it's a blob or not.
But now that we support tree filters, there's opportunity for savings. A
tree:depth=0 filter means we can avoid accessing trees entirely, since
we know we won't need them (or any of the subtrees or blobs they point to).
The new test in p5310 shows this off (the "partial bitmap" state is one
where HEAD~100 and its ancestors are all in a bitmapped pack, but
HEAD~100..HEAD are not). Here are the results (run against linux.git):
Test                                                  HEAD^             HEAD              Change
-------------------------------------------------------------------------------------------------
[...]
5310.16: rev-list with tree filter (partial bitmap)   0.19(0.17+0.02)   0.03(0.02+0.01)   -84.2%
The absolute number of savings isn't _huge_, but keep in mind that we
only omitted 100 first-parent links (in the version of linux.git here,
that's 894 actual commits). In a more pathological case, we might have a
much larger proportion of non-bitmapped commits. I didn't bother
creating such a case in the perf script because the setup is expensive,
and this is plenty to show the savings as a percentage.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-05 01:12:38 +02:00
|
|
|
|
2013-12-21 15:00:42 +01:00
|
|
|
test_done
|