sha1_file: use strbuf_add() instead of strbuf_addf()

Replace use of strbuf_addf() with strbuf_add() when enumerating
loose objects in for_each_file_in_obj_subdir(). Since we already
check the length and hex-values of the string before consuming
the path, we can prevent extra computation by using the lower-
level method.
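
As a rough illustration of the difference, the sketch below uses a small
hypothetical buffer type (struct buf and its helpers are stand-ins invented
for this example, not git's strbuf API). It contrasts a formatted append,
which must parse a format string and measure the name on every call, with a
plain append of a name whose length is already known.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable string buffer (not git's strbuf). */
struct buf {
	char *data;
	size_t len;
	size_t cap;
};

/* Ensure room for `extra` more bytes plus the trailing NUL. */
static void buf_grow(struct buf *b, size_t extra)
{
	size_t need = b->len + extra + 1;
	size_t cap = b->cap ? b->cap : 64;
	char *p;

	if (need <= b->cap)
		return;
	while (cap < need)
		cap *= 2;
	p = realloc(b->data, cap);
	if (!p)
		abort();
	b->data = p;
	b->cap = cap;
}

/* Plain append: the caller already knows the length, so this is a memcpy. */
static void buf_add(struct buf *b, const char *s, size_t n)
{
	buf_grow(b, n);
	memcpy(b->data + b->len, s, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/* Formatted append of "/<name>": the format string is parsed on each call. */
static void buf_addf_slash(struct buf *b, const char *name)
{
	int n = snprintf(NULL, 0, "/%s", name);

	if (n < 0)
		abort();
	buf_grow(b, (size_t)n);
	snprintf(b->data + b->len, (size_t)n + 1, "/%s", name);
	b->len += (size_t)n;
}

/* Truncate back to a shorter length, e.g. to reuse a directory prefix. */
static void buf_setlen(struct buf *b, size_t n)
{
	buf_grow(b, 0);
	if (n > b->len)
		abort();
	b->len = n;
	b->data[n] = '\0';
}
```

In a directory loop, the plain variant appends the '/' once before the loop,
records that length as the base, and then only resets to the base and copies
each entry name. Both variants produce the same path string.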

One consumer of for_each_file_in_obj_subdir() is the abbreviation
code. OID abbreviations use a cached list of loose objects (per
object subdirectory) to make repeated queries fast, but there is
significant cache load time when there are many loose objects.

Most repositories do not have many loose objects before repacking,
but in the GVFS case the repos can grow to have millions of loose
objects. Profiling 'git log' performance in GitForWindows on a
GVFS-enabled repo with ~2.5 million loose objects revealed 12% of
the CPU time was spent in strbuf_addf().

Add a new performance test to p4211-line-log.sh that is more
sensitive to this cache-loading. By limiting to 1000 commits, we
more closely resemble user wait time when reading history into a
pager.

For a copy of the Linux repo with two ~512 MB packfiles and ~572K
loose objects, running 'git log --oneline --parents --raw -1000'
had the following performance:

 HEAD~1            HEAD
----------------------------------------
 7.70(7.15+0.54)   7.44(7.09+0.29) -3.4%

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee 2017-12-04 09:06:03 -05:00 committed by Junio C Hamano
parent 1a4e40aa5d
commit 163ee5e635
2 changed files with 11 additions and 5 deletions

@@ -1903,7 +1903,6 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
 	origlen = path->len;
 	strbuf_complete(path, '/');
 	strbuf_addf(path, "%02x", subdir_nr);
-	baselen = path->len;
 
 	dir = opendir(path->buf);
 	if (!dir) {
@@ -1914,15 +1913,18 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
 	}
 
 	oid.hash[0] = subdir_nr;
+	strbuf_addch(path, '/');
+	baselen = path->len;
 
 	while ((de = readdir(dir))) {
+		size_t namelen;
 		if (is_dot_or_dotdot(de->d_name))
 			continue;
 
+		namelen = strlen(de->d_name);
 		strbuf_setlen(path, baselen);
-		strbuf_addf(path, "/%s", de->d_name);
-
-		if (strlen(de->d_name) == GIT_SHA1_HEXSZ - 2 &&
+		strbuf_add(path, de->d_name, namelen);
+		if (namelen == GIT_SHA1_HEXSZ - 2 &&
 		    !hex_to_bytes(oid.hash + 1, de->d_name,
 				  GIT_SHA1_RAWSZ - 1)) {
 			if (obj_cb) {
@@ -1941,7 +1943,7 @@ int for_each_file_in_obj_subdir(unsigned int subdir_nr,
 	}
 	closedir(dir);
 
-	strbuf_setlen(path, baselen);
+	strbuf_setlen(path, baselen - 1);
 	if (!r && subdir_cb)
 		r = subdir_cb(subdir_nr, path->buf, data);

@@ -35,4 +35,8 @@ test_perf 'git log --oneline --raw --parents' '
 	git log --oneline --raw --parents >/dev/null
 '
 
+test_perf 'git log --oneline --raw --parents -1000' '
+	git log --oneline --raw --parents -1000 >/dev/null
+'
+
 test_done