Optimize "rev-list --use-bitmap-index --objects" corner case that
uses negative tags as the stopping points.
* ps/pack-bitmap-optim:
pack-bitmap: avoid traversal of objects referenced by uninteresting tag
"git diff-index" codepath has been taught to trust fsmonitor status
to reduce number of lstat() calls.
* nk/diff-index-fsmonitor:
fsmonitor: add perf test for git diff HEAD
fsmonitor: add assertion that fsmonitor is valid to check_removed
fsmonitor: skip lstat deletion check during git diff-index
"git repack" so far has been only capable of repacking everything
under the sun into a single pack (or split by size). A cleverer
strategy to reduce the cost of repacking a repository has been
introduced.
* tb/geometric-repack:
builtin/pack-objects.c: ignore missing links with --stdin-packs
builtin/repack.c: reword comment around pack-objects flags
builtin/repack.c: be more conservative with unsigned overflows
builtin/repack.c: assign pack split later
t7703: test --geometric repack with loose objects
builtin/repack.c: do not repack single packs with --geometric
builtin/repack.c: add '--geometric' option
packfile: add kept-pack cache for find_kept_pack_entry()
builtin/pack-objects.c: rewrite honor-pack-keep logic
p5303: measure time to repack with keep
p5303: add missing &&-chains
builtin/pack-objects.c: add '--stdin-packs' option
revision: learn '--no-kept-objects'
packfile: introduce 'find_kept_pack_entry()'
Perf test update to work better in secondary worktrees.
* jk/perf-in-worktrees:
t/perf: avoid copying worktree files from test repo
t/perf: handle worktrees as test repos
When preparing the bitmap walk, we first establish the set of of have
and want objects by iterating over the set of pending objects: if an
object is marked as uninteresting, it's declared as an object we already
have, otherwise as an object we want. These two sets are then used to
compute which transitively referenced objects we need to obtain.
One special case here are tag objects: when a tag is requested, we
resolve it to its first not-tag object and add both resolved objects as
well as the tag itself into either the have or want set. Given that the
uninteresting-property always propagates to referenced objects, it is
clear that if the tag is uninteresting, so are its children and vice
versa. But we fail to propagate the flag, which effectively means that
referenced objects will always be interesting except for the case where
they have already been marked as uninteresting explicitly.
This mislabeling does not impact correctness: we now have it in our
"wants" set, and given that we later do an `AND NOT` of the bitmaps of
"wants" and "haves" sets it is clear that the result must be the same.
But we now start to needlessly traverse the tag's referenced objects in
case it is uninteresting, even though we know that each referenced
object will be uninteresting anyway. In the worst case, this can lead to
a complete graph walk just to establish that we do not care for any
object.
Fix the issue by propagating the `UNINTERESTING` flag to pointees of tag
objects and add a benchmark with negative revisions to p5310. This shows
some nice performance benefits, tested with linux.git:
Test HEAD~ HEAD
---------------------------------------------------------------------------------------------------------------
5310.3: repack to disk 193.18(181.46+16.42) 194.61(183.41+15.83) +0.7%
5310.4: simulated clone 25.93(24.88+1.05) 25.81(24.73+1.08) -0.5%
5310.5: simulated fetch 2.64(5.30+0.69) 2.59(5.16+0.65) -1.9%
5310.6: pack to file (bitmap) 58.75(57.56+6.30) 58.29(57.61+5.73) -0.8%
5310.7: rev-list (commits) 1.45(1.18+0.26) 1.46(1.22+0.24) +0.7%
5310.8: rev-list (objects) 15.35(14.22+1.13) 15.30(14.23+1.07) -0.3%
5310.9: rev-list with tag negated via --not --all (objects) 22.49(20.93+1.56) 0.11(0.09+0.01) -99.5%
5310.10: rev-list with negative tag (objects) 0.61(0.44+0.16) 0.51(0.35+0.16) -16.4%
5310.11: rev-list count with blob:none 12.15(11.19+0.96) 12.18(11.19+0.99) +0.2%
5310.12: rev-list count with blob:limit=1k 17.77(15.71+2.06) 17.75(15.63+2.12) -0.1%
5310.13: rev-list count with tree:0 1.69(1.31+0.38) 1.68(1.28+0.39) -0.6%
5310.14: simulated partial clone 20.14(19.15+0.98) 19.98(18.93+1.05) -0.8%
5310.16: clone (partial bitmap) 12.78(13.89+1.07) 12.72(13.99+1.01) -0.5%
5310.17: pack to file (partial bitmap) 42.07(45.44+2.72) 41.44(44.66+2.80) -1.5%
5310.18: rev-list with tree filter (partial bitmap) 0.44(0.29+0.15) 0.46(0.32+0.14) +4.5%
While most benchmarks are probably in the range of noise, the newly
added 5310.9 and 5310.10 benchmarks consistenly perform better.
Signed-off-by: Patrick Steinhardt <ps@pks.im>.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Preliminary changes to fsmonitor integration.
* jh/fsmonitor-prework:
fsmonitor: refactor initialization of fsmonitor_last_update token
fsmonitor: allow all entries for a folder to be invalidated
fsmonitor: log FSMN token when reading and writing the index
fsmonitor: log invocation of FSMonitor hook to trace2
read-cache: log the number of scanned files to trace2
read-cache: log the number of lstat calls to trace2
preload-index: log the number of lstat calls to trace2
p7519: add trace logging during perf test
p7519: move watchman cleanup earlier in the test
p7519: fix watchman watch-list test on Windows
p7519: do not rely on "xargs -d" in test
When running the perf suite, we copy files from an existing $GIT_DIR to
a scratch repository to give us a realistic setup on which to operate.
Since the perf scripts themselves may modify the scratch repository, we
want to make sure we've scrubbed any references back to the original.
One existing example is that we avoid copying the file "commondir" at
the top-level of the repository. In a worktree git-dir (e.g.,
.git/worktrees/foo), that file contains the path to the parent
repository; copying it could mean ref updates in the scratch repository
affect the original.
But there are other files we should cover, too:
- "gitdir" in a worktree git-dir contains the path to the actual .git
file in the working tree. We _shouldn't_ end up looking at it at
all, since the lack of a "commondir" file means Git won't consider
this to be a worktree git-dir. But it's best to err on the safe
side.
- in a parent repository that contains worktrees, the
"$GIT_DIR/worktrees" directory will contain the git dirs for the
individual worktrees. Which will themselves contain commondir and
gitdir files that may reference the original repository. We should
likewise remove them.
Note that this does mean that the perf suite's scratch repositories
will never have any worktrees. That's OK; we don't have any perf tests
that are influenced by their presence. If we add any, they'd
probably want to create the worktrees themselves anyway.
This patch adds both paths to the set of omissions in
test_perf_copy_repo_contents(). Note that we won't get confused here by
matching arbitrary names like refs/heads/commondir. This list is always
matching top-level entries in $GIT_DIR (we rely on "cp -R" to do the
actual recursion).
Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The perf suite gets confused when test_perf_default_repo is pointed at a
worktree (which includes when it is run from within a worktree at all,
since the default is to use the current repository).
Here's an example:
$ git worktree add ~/foo
Preparing worktree (new branch 'foo')
HEAD is now at 328c109303 The eighth batch
$ cd ~/foo
$ make
[...build output...]
$ cd t/perf
$ ./p0000-perf-lib-sanity.sh -v -i
[...]
perf 1 - test_perf_default_repo works:
running:
foo=$(git rev-parse HEAD) &&
test_export foo
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
The problem is that we didn't copy all of the necessary files from the
source repository (in this case we got HEAD, but we have no refs!). We
discover the git-dir with "rev-parse --git-dir", but this points to the
worktree's partial repository in .../.git/worktrees/foo.
That partial repository has a "commondir" file which points to the main
repository, where the actual refs are stored, but we don't copy it. This
is the correct thing to do, though! If we did copy it, then our scratch
test repo would be pointing back to the original main repo, and any ref
updates we made in the tests would impact that original repo.
Instead, we need to either:
1. Make a scratch copy of the original main repo (in addition to the
worktree repo), and point the scratch worktree repo's commondir at
it. This preserves the original relationship, but it's doubtful any
script really cares (if they are testing worktree performance,
they'd probably make their own worktrees). And it's trickier to get
right.
2. Collapse the main and worktree repos into a single scratch repo.
This can be done by copying everything from both, preferring any
files from the worktree repo.
This patch does the second one. With this applied, the example above
results in p0000 running successfully.
Reported-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add two new tests to measure repack performance. Both tests split the
repository into synthetic "pushes", and then leave the remaining objects
in a big base pack.
The first new test marks an empty pack as "kept" and then passes
--honor-pack-keep to avoid including objects in it. That doesn't change
the resulting pack, but it does let us compare to the normal repack case
to see how much overhead we add to check whether objects are kept or
not.
The other test is of --stdin-packs, which gives us a sense of how that
number scales based on the number of packs we provide as input. In each
of those tests, the empty pack isn't considered, but the residual pack
(objects that were left over and not included in one of the synthetic
push packs) is marked as kept.
(Note that in the single-pack case of the --stdin-packs test, there is
nothing do since there are no non-excluded packs).
Here are some timings on a recent clone of the kernel:
5303.5: repack (1) 57.26(54.59+10.84)
5303.6: repack with kept (1) 57.33(54.80+10.51)
in the 50-pack case, things start to slow down:
5303.11: repack (50) 71.54(88.57+4.84)
5303.12: repack with kept (50) 85.12(102.05+4.94)
and by the time we hit 1,000 packs, things are substantially worse, even
though the resulting pack produced is the same:
5303.17: repack (1000) 216.87(490.79+14.57)
5303.18: repack with kept (1000) 665.63(938.87+15.76)
That's because the code paths around handling .keep files are known to
scale badly; they look in every single pack file to find each object.
Our solution to that was to notice that most repos don't have keep
files, and to make that case a fast path. But as soon as you add a
single .keep, that part of pack-objects slows down again (even if we
have fewer objects total to look at).
Likewise, the scaling is pretty extreme on --stdin-packs (but each
subsequent test is also being asked to do more work):
5303.7: repack with --stdin-packs (1) 0.01(0.01+0.00)
5303.13: repack with --stdin-packs (50) 3.53(12.07+0.24)
5303.19: repack with --stdin-packs (1000) 195.83(371.82+8.10)
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are in a helper function, so the usual chain-lint doesn't notice
them. This function is still not perfect, as it has some git invocations
on the left-hand-side of the pipe, but it's primary purpose is timing,
not finding bugs or correctness issues.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add optional trace logging to allow us to better compare performance of
various fsmonitor providers and compare results with non-fsmonitor runs.
Currently, this includes Trace2 logging, but may be extended to include
other trace targets, such as GIT_TRACE_FSMONITOR if desired.
Using this logging helped me explain an odd behavior on MacOS where the
kernel was dropping events and causing the hook to Watchman to timeout.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shutdown Watchman after the Watchman-based tests and before the block of
"no fsmonitor" tests.
This helps ensure that Watchman cannot affect the test results for the
other.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only use the final portion of the test trash directory file name
when verifying that Watchman was started.
On Windows and under the SDK, $GIT_WORKTREE is a cygwin-style
path with forward slashes and a "/c/" drive name. However
`watchman watch-list` reports a proper Windows-style pathname
with drive letters and backslashes. This causes the grep to
fail. Since we don't really care about the full pathname (and
we really don't want to bother with normalizaing them), just see
if the test-name portion of the path is found.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the test to use a more portable method to update the mtime on a
large number of files under version control.
The Mac version of xargs does not support the "-d" option.
Likewise, the "-0" and "--null" options are not portable.
Furthermore, use `test-tool chmtime` rather than `touch` to update the
mtime to ensure that it is actually updated (especially on file systems
with only whole second resolution).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some pretty-format specifiers do not need the data in commit object
(e.g. "%H"), but we were over-eager to load and parse it, which has
been made even lazier.
* jk/pretty-lazy-load-commit:
pretty: lazy-load commit data when expanding user-format
Using "1~5" isn't portable. Nobody seems to have noticed, since perhaps
people don't tend to run the perf suite on more exotic platforms. Still,
it's better to set a good example.
We can use:
perl -ne 'print if $. % 5 == 1'
instead. But we can further observe that perl does a good job of the
other parts of this pipeline, and fold the whole thing together.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we expand a user-format, we try to avoid work that isn't necessary
for the output. For instance, we don't bother parsing the commit header
until we know we need the author, subject, etc.
But we do always load the commit object's contents from disk, even if
the format doesn't require it (e.g., just "%H"). Traditionally this
didn't matter much, because we'd have loaded it as part of the traversal
anyway, and we'd typically have those bytes attached to the commit
struct (or these days, cached in a commit-slab).
But when we have a commit-graph, we might easily get to the point of
pretty-printing a commit without ever having looked at the actual object
contents. We should push off that load (and reencoding) until we're
certain that it's needed.
I think the results of p4205 show the advantage pretty clearly (we serve
parent and tree oids out of the commit struct itself, so they benefit as
well):
# using git.git as the test repo
Test HEAD^ HEAD
----------------------------------------------------------------------
4205.1: log with %H 0.40(0.39+0.01) 0.03(0.02+0.01) -92.5%
4205.2: log with %h 0.45(0.44+0.01) 0.09(0.09+0.00) -80.0%
4205.3: log with %T 0.40(0.39+0.00) 0.04(0.04+0.00) -90.0%
4205.4: log with %t 0.46(0.46+0.00) 0.09(0.08+0.01) -80.4%
4205.5: log with %P 0.39(0.39+0.00) 0.03(0.03+0.00) -92.3%
4205.6: log with %p 0.46(0.46+0.00) 0.10(0.09+0.00) -78.3%
4205.7: log with %h-%h-%h 0.52(0.51+0.01) 0.15(0.14+0.00) -71.2%
4205.8: log with %an-%ae-%s 0.42(0.41+0.00) 0.42(0.41+0.01) +0.0%
# using linux.git as the test repo
Test HEAD^ HEAD
----------------------------------------------------------------------
4205.1: log with %H 7.12(6.97+0.14) 0.76(0.65+0.11) -89.3%
4205.2: log with %h 7.35(7.19+0.16) 1.30(1.19+0.11) -82.3%
4205.3: log with %T 7.58(7.42+0.15) 1.02(0.94+0.08) -86.5%
4205.4: log with %t 8.05(7.89+0.15) 1.55(1.41+0.13) -80.7%
4205.5: log with %P 7.12(7.01+0.10) 0.76(0.69+0.07) -89.3%
4205.6: log with %p 7.38(7.27+0.10) 1.32(1.20+0.12) -82.1%
4205.7: log with %h-%h-%h 7.81(7.67+0.13) 1.79(1.67+0.12) -77.1%
4205.8: log with %an-%ae-%s 7.90(7.74+0.15) 7.81(7.66+0.15) -1.1%
I added the final test to show where we don't improve (the 1% there is
just lucky noise), but also as a regression test to make sure we're not
doing anything stupid like loading the commit multiple times when there
are several placeholders that need it.
Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
p7519 measures the performance of the fsmonitor code. To do this, it
uses the installed copy of Watchman. If Watchman isn't installed, a noop
integration script is installed in its place.
When in the latter mode, it is expected that the script should not write
a "last update token": in fact, it doesn't write anything at all since
the script is blank.
Commit 33226af42b (t/perf/fsmonitor: improve error message if typoing
hook name, 2020-10-26) made sure that running 'git update-index
--fsmonitor' did not write anything to stderr, but this is not the case
when using the empty Watchman script, since Git will complain that:
$ which watchman
watchman not found
$ cat .git/hooks/fsmonitor-empty
$ git -c core.fsmonitor=.git/hooks/fsmonitor-empty update-index --fsmonitor
warning: Empty last update token.
Prior to 33226af42b, the output wasn't checked at all, which allowed
this noop mode to work. But, 33226af42b breaks p7519 when running it
without a 'watchman(1)' on your system.
Handle this by only checking that the stderr is empty only when running
with a real watchman executable. Otherwise, assert that the error
message is the expected one when running in the noop mode.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test_export() has been self-recursive since its inception even though a
simple for-loop would have served just as well to append its arguments
to the `test_export_` variable separated by the pipe character "|".
Recently `test_export_` was changed instead to a space-separated list of
tokens to be exported, an operation which can be accomplished via a
single simple assignment, with no need for looping or recursion.
Therefore, simplify the implementation.
While at it, take advantage of the fact that variable names to be
exported are shell identifiers, thus won't be composed of special
characters or whitespace, thus simple a `$*` can be used rather than
magical `"$@"`.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
test_perf() runs each test in its own subshell which makes it difficult
to persist variables between tests. test_export() addresses this
shortcoming by grabbing the values of specified variables after a test
runs but before the subshell exits, and writes those values to a file
which is loaded into the environment of subsequent tests.
To grab the values to be persisted, test_export() pipes the output of
the shell's builtin `set` command through `sed` which plucks them out
using a regular expression along the lines of `s/^(var1|var2)/.../p`.
Unfortunately, though, this use of alternation is not portable. For
instance, BSD-lineage `sed` (including macOS `sed`) does not support it
in the default "basic regular expression" mode (BRE). It may be possible
to enable "extended regular expression" mode (ERE) in some cases with
`sed -E`, however, `-E` is neither portable nor part of POSIX.
Fortunately, alternation is unnecessary in this case and can easily be
avoided, so replace it with a series of simple expressions such as
`s/^var1/.../p;s/^var2/.../p`.
While at it, tighten the expressions so they match the variable names
exactly rather than matching prefixes (i.e. use `s/^var1=/.../p`).
If the requirements of test_export() become more complex in the future,
then an alternative would be to replace `sed` with `perl` which supports
alternation on all platforms, however, the simple elimination of
alternation via multiple `sed` expressions suffices for the present.
Reported-by: Sangeeta <sangunb09@gmail.com>
Diagnosed-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-ref --stdin" learns to take multiple transactions in a
single session.
* ps/update-ref-multi-transaction:
update-ref: disallow "start" for ongoing transactions
p1400: use `git-update-ref --stdin` to test multiple transactions
update-ref: allow creation of multiple transactions
t1400: avoid touching refs on filesystem
Simplify test and make error messages more clear here.
Per feedback from Junio in
33226af42b (t/perf/fsmonitor: improve error message if typoing hook
name, 2020-10-26)
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 0a0fbbe3ff (refs: remove lookup cache for
reference-transaction hook, 2020-08-25), a new benchmark was added to
p1400 which has the intention to exercise creation of multiple
transactions in a single process. As git-update-ref wasn't yet able to
create multiple transactions with a single run we instead used git-push.
As its non-atomic version creates a transaction per reference update,
this was the best approximation we could make at that point in time.
Now that `git-update-ref --stdin` supports creation of multiple
transactions, let's convert the benchmark to use that instead. It has
less overhead and it's also a lot clearer what the actual intention is.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This benchmark covers the git status time for a heavily
dirty directory - benchmarking fsmonitor's refresh
When running to compare our perl vs rs-git-fsmonitor - we see that
the perl script incurs significant overhead - further motivation
to provide a faster implementation within git.
7519.7: status (dirty) (fsmonitor=query-watchman) 10.05(7.78+1.56)
7519.20: status (dirty) (fsmonitor=rs-git-fsmonitor) 6.72(4.37+1.64)
7519.33: status (dirty) (fsmonitor=disabled) 5.62(4.24+2.03)
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This prepares for it being called multiple times when
testing different hooks
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is extremely verbose, printing >10K non-useful lines
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The full name is lengthy and makes it hard to read
Before:
7519.3: status (fsmonitor=/home/nipunn/src/server/.git/hooks/rs-git-fsmonitor) 0.02(0.01+0.00)
After
7519.3: status (fsmonitor=rs-git-fsmonitor) 0.03(0.02+0.00)
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There was much duplication here. Prepares for making
changes to the description.
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously - it would silently run the perf suite w/o using
fsmonitor - fsmonitor errors are not hard failures.
Now it errors loudly.
GIT_PERF_7519_FSMONITOR="$HOME/rs-git-fsmonitorr"
./p7519-fsmonitor.sh -i -v
fatal: cannot run /home/nipunn/rs-git-fsmonitorr:
No such file or directory
not ok 2 - setup for fsmonitor
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is only required to be set up once. This prepares for
testing multiple hooks in one invocation.
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for testing multiple fsmonitor hooks
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much of the benchmark code is redundant. This is
easier to understand and edit.
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perf tests have not been linted for some time.
They've grown some seq instead of test_seq. This
runs the existing lints on the perf tests as well.
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Results for the git-diff fsmonitor optimization
in patch in the parent-rev (using a 400k file repo to test)
As you can see here - git diff with fsmonitor running is
significantly better with this patch series (80% faster on my
workload)!
GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh
Test v2.29.0-rc1 this tree
-----------------------------------------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.46(0.82+0.64) 1.47(0.83+0.62) +0.7%
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.12+0.04) 0.17(0.12+0.05) +6.3%
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 1.37(0.76+0.60) +0.7%
7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.85(0.22+0.63) 0.14(0.10+0.05) -83.5%
7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.11+0.02) +8.3%
7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.04) 0.13(0.09+0.04) +8.3%
7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.07+0.06) +8.3%
7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.08+0.05) +8.3%
7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.09+0.05) 0.13(0.10+0.03) -7.1%
7519.12: status (fsmonitor=) 1.67(0.93+1.49) 1.67(0.99+1.42) +0.0%
7519.13: status -uno (fsmonitor=) 0.37(0.30+0.82) 0.37(0.33+0.79) +0.0%
7519.14: status -uall (fsmonitor=) 1.58(0.97+1.35) 1.57(0.86+1.45) -0.6%
7519.15: diff (fsmonitor=) 0.34(0.28+0.83) 0.34(0.27+0.83) +0.0%
7519.16: diff -- 0_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.08+0.02) +0.0%
7519.17: diff -- 10_files (fsmonitor=) 0.09(0.07+0.03) 0.09(0.06+0.05) +0.0%
7519.18: diff -- 100_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.06+0.04) +0.0%
7519.19: diff -- 1000_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.05+0.05) +0.0%
7519.20: diff -- 10000_files (fsmonitor=) 0.10(0.08+0.04) 0.10(0.06+0.05) +0.0%
I also added a benchmark for a tiny git diff workload w/ a pathspec.
I see an approximately .02 second overhead added w/ and w/o fsmonitor
From looking at these results, I suspected that refresh_fsmonitor
is already happening during git diff - independent of this patch
series' optimization. Confirmed that suspicion by breaking on
refresh_fsmonitor.
(gdb) bt [simplified]
0 refresh_fsmonitor at fsmonitor.c:176
1 ie_match_stat at read-cache.c:375
2 match_stat_with_submodule at diff-lib.c:237
4 builtin_diff_files at builtin/diff.c:260
5 cmd_diff at builtin/diff.c:541
6 run_builtin at git.c:450
7 handle_builtin at git.c:700
8 run_argv at git.c:767
9 cmd_main at git.c:898
10 main at common-main.c:52
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first git status would be inflated due to warming of
filesystem cache. This makes the results comparable.
Before
Test this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 2.52(1.59+1.56)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.18(0.12+0.06)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62)
7519.7: status (fsmonitor=) 0.69(0.52+0.90)
7519.8: status -uno (fsmonitor=) 0.37(0.28+0.81)
7519.9: status -uall (fsmonitor=) 1.53(0.93+1.32)
After
Test this tree
--------------------------------------------------------------------------------
7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.06)
7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.13+0.05)
7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.34(0.77+0.56)
7519.7: status (fsmonitor=) 0.70(0.53+0.90)
7519.8: status -uno (fsmonitor=) 0.37(0.32+0.78)
7519.9: status -uall (fsmonitor=) 1.55(1.01+1.25)
Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a logic to estimate how many objects are in the
repository, which is mean to run once per process invocation, but
it ran every time the estimated value was requested.
* jk/dont-count-existing-objects-twice:
packfile: actually set approximate_object_count_valid
The approximate_object_count() function tries to compute the count only
once per process. But ever since it was introduced in 8e3f52d778
(find_unique_abbrev: move logic out of get_short_sha1(), 2016-10-03), we
failed to actually set the "valid" flag, meaning we'd compute it fresh
on every call.
This turns out not to be _too_ bad, because we're only iterating through
the packed_git list, and not making any system calls. But since it may
get called for every abbreviated hash we output, even this can add up if
you have many packs.
Here are before-and-after timings for a new perf test which just asks
rev-list to abbreviate each commit hash (the test repo is linux.git,
with commit-graphs):
Test origin HEAD
----------------------------------------------------------------------------
5303.3: rev-list (1) 28.91(28.46+0.44) 29.03(28.65+0.38) +0.4%
5303.4: abbrev-commit (1) 1.18(1.06+0.11) 1.17(1.02+0.14) -0.8%
5303.7: rev-list (50) 28.95(28.56+0.38) 29.50(29.17+0.32) +1.9%
5303.8: abbrev-commit (50) 3.67(3.56+0.10) 3.57(3.42+0.15) -2.7%
5303.11: rev-list (1000) 30.34(29.89+0.43) 30.82(30.35+0.46) +1.6%
5303.12: abbrev-commit (1000) 86.82(86.52+0.29) 77.82(77.59+0.22) -10.4%
5303.15: load 10,000 packs 0.08(0.02+0.05) 0.08(0.02+0.06) +0.0%
It doesn't help at all when we have 1 pack (5303.4), but we get a 10%
speedup when there are 1000 packs (5303.12). That's a modest speedup for
a case that's already slow and we'd hope to avoid in general (note how
slow it is even after, because we have to look in each of those packs
for abbreviations). But it's a one-line change that clearly matches the
original intent, so it seems worth doing.
The included perf test may also be useful for keeping an eye on any
regressions in the overall abbreviation code.
Reported-by: Rasmus Villemoes <rv@rasmusvillemoes.dk>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>