git-commit-vandalism/builtin
Taylor Blau c0c9d35e27 builtin/show-ref.c: avoid over-iterating with --heads, --tags
When `show-ref` is combined with the `--heads` or `--tags` options, it
can avoid iterating parts of a repository's references that it doesn't
care about.

But it doesn't take advantage of this potential optimization. When this
command was introduced back in 358ddb62cf (Add "git show-ref" builtin
command, 2006-09-15), `for_each_ref_in()` did exist. But since most
repositories don't have many (any?) references that aren't branches or
tags already, this makes little difference in practice.

Though for repositories with a large imbalance of branches and tags (or,
more likely in the case of server operators, many hidden references),
this can make quite a difference. Take, for example, a repository with
500,000 "hidden" references (all of the form "refs/__hidden__/N"), and
a single branch:

    git commit --allow-empty -m "base" &&
    seq 1 500000 | sed 's,\(.*\),create refs/__hidden__/\1 HEAD,' |
      git update-ref --stdin &&
    git pack-refs --all

Outputting the existence of that single branch currently takes on the
order of ~50ms on my machine. The vast majority of this time is wasted
iterating through references that we know we're going to discard.

Instead, teach `show-ref` that it can iterate just "refs/heads" and/or
"refs/tags" when given `--heads` and/or `--tags`, respectively. A few
small interesting things to note:

  - When given either option, we can avoid the general-purpose
    for_each_ref() call altogether, since we know that it won't give us
    any references that we wouldn't filter out already.

  - We can make two separate calls to `for_each_fullref_in()` (and
    avoid, say, the more specialized `for_each_fullref_in_prefixes()`,
    since we know that the set of references enumerated by each is
    disjoint, so we'll never see the same reference appear in both
    calls.

  - We have to use the "fullref" variant (instead of just
    `for_each_branch_ref()` and `for_each_tag_ref()`), since we expect
    fully-qualified reference names to appear in `show-ref`'s output.

When either of `heads_only` or `tags_only` is set, we can eliminate the
strcmp() calls in `builtin/show-ref.c::show_ref()` altogether, since we
know that `show_ref()` will never see a non-branch or tag reference.

Unfortunately, we can't use `for_each_fullref_in_prefixes()` to enhance
`show-ref`'s pattern matching, since `show-ref` patterns match on the
_suffix_ (e.g., the pattern "foo" shows "refs/heads/foo",
"refs/tags/foo", and etc, not "foo/*").

Nonetheless, in our synthetic example above, this provides a significant
speed-up ("git" is roughly v2.36, "git.compile" is this patch):

    $ hyperfine -N 'git show-ref --heads' 'git.compile show-ref --heads'
    Benchmark 1: git show-ref --heads
      Time (mean ± σ):      49.9 ms ±   6.2 ms    [User: 45.6 ms, System: 4.1 ms]
      Range (min … max):    46.1 ms …  73.6 ms    43 runs

    Benchmark 2: git.compile show-ref --heads
      Time (mean ± σ):       2.8 ms ±   0.4 ms    [User: 1.4 ms, System: 1.2 ms]
      Range (min … max):     1.3 ms …   5.6 ms    957 runs

    Summary
      'git.compile show-ref --heads' ran
       18.03 ± 3.38 times faster than 'git show-ref --heads'

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-06 09:56:42 -07:00
..
add.c add: remove support for git-legacy-stash 2022-01-27 18:00:15 -08:00
am.c Merge branch 'ab/date-mode-release' 2022-02-25 15:47:36 -08:00
annotate.c
apply.c
archive.c use xopen() to handle fatal open(2) failures 2021-08-25 14:39:08 -07:00
bisect--helper.c Merge branch 'ac/usage-string-fixups' 2022-03-06 21:25:32 -08:00
blame.c Merge branch 'ja/i18n-common-messages' 2022-02-25 15:47:35 -08:00
branch.c Merge branch 'gc/branch-recurse-submodules' 2022-02-18 13:53:29 -08:00
bugreport.c hook-list.h: add a generated list of hooks, like config-list.h 2021-09-27 09:44:54 -07:00
bundle.c bundle: call strvec_clear() on allocated strvec 2022-03-04 13:24:18 -08:00
cat-file.c Merge branch 'jc/cat-file-batch-default-format-optim' 2022-03-23 14:09:31 -07:00
check-attr.c
check-ignore.c dir.[ch]: replace dir_init() with DIR_INIT 2021-07-01 12:32:22 -07:00
check-mailmap.c shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
check-ref-format.c
checkout--worker.c pkt-line.[ch]: remove unused packet_read_line_buf() 2021-10-15 13:09:40 -07:00
checkout-index.c checkout-index: integrate with sparse index 2022-01-13 13:49:45 -08:00
checkout.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
clean.c Merge branch 'vd/sparse-clean-etc' 2022-02-17 16:25:05 -08:00
clone.c Merge branch 'ds/partial-bundles' 2022-03-21 15:14:24 -07:00
column.c column: fix parsing of the '--nl' option 2021-08-26 14:36:27 -07:00
commit-graph.c commit-graph: fix memory leak in misused string_list API 2022-03-04 13:24:18 -08:00
commit-tree.c use xopen() to handle fatal open(2) failures 2021-08-25 14:39:08 -07:00
commit.c hooks: fix an obscure TOCTOU "did we just run a hook?" race 2022-03-07 13:00:53 -08:00
config.c Merge branch 'mf/fix-type-in-config-h' 2022-03-16 17:53:07 -07:00
count-objects.c i18n: remove from i18n strings that do not hold translatable parts 2022-02-04 13:58:28 -08:00
credential-cache--daemon.c unix-socket: add backlog size option to unix_stream_listen() 2021-03-15 14:32:51 -07:00
credential-cache.c credential-cache: check for windows specific errors 2021-09-14 09:30:54 -07:00
credential-store.c Use a better name for the function interpolating paths 2021-07-26 12:17:16 -07:00
credential.c doc: fix git credential synopsis 2021-10-28 09:57:09 -07:00
describe.c i18n: turn even more messages into "cannot be used together" ones 2022-01-05 13:31:00 -08:00
diff-files.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff-index.c diff-index: restore -c/--cc options handling 2021-09-07 11:11:35 -07:00
diff-tree.c 2.36 gitk/diff-tree --stdin regression fix 2022-04-26 09:26:35 -07:00
diff.c builtin/diff.c: fix "git-diff" usage string typo 2022-02-02 11:30:53 -08:00
difftool.c i18n: factorize more 'incompatible options' messages 2022-02-04 13:58:28 -08:00
env--helper.c assert PARSE_OPT_NONEG in parse-options callbacks 2020-09-30 12:53:47 -07:00
fast-export.c Merge branch 'rs/fast-export-pathspec-fix' into maint 2022-05-05 14:36:25 -07:00
fast-import.c Merge branch 'ns/core-fsyncmethod' 2022-03-25 16:38:24 -07:00
fetch-pack.c Merge branch 'rc/fetch-refetch' 2022-04-04 10:56:23 -07:00
fetch.c Merge branch 'rc/fetch-refetch' 2022-04-04 10:56:23 -07:00
fmt-merge-msg.c merge: allow to pretend a merge is made into a different branch 2021-12-20 14:55:02 -08:00
for-each-ref.c for-each-ref: delay parsing of --sort=<atom> options 2021-10-20 14:33:07 -07:00
for-each-repo.c builtin/for-each-repo: remove unnecessary argv copy to plug leak 2021-07-26 12:19:20 -07:00
fsck.c run-command API users: use strvec_pushl(), not argv construction 2021-11-25 22:15:07 -08:00
fsmonitor--daemon.c fsmonitor--daemon: use a cookie file to sync with file system 2022-03-25 16:04:17 -07:00
gc.c builtin/gc.c: delete duplicate include 2022-03-13 22:23:16 +00:00
get-tar-commit-id.c
grep.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
hash-object.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
help.c Merge branch 'ab/help-fixes' 2022-03-09 13:38:24 -08:00
hook.c git hook run: add an --ignore-missing flag 2022-01-07 15:19:34 -08:00
index-pack.c Merge branch 'ns/core-fsyncmethod' 2022-03-25 16:38:24 -07:00
init-db.c i18n: refactor "foo and bar are mutually exclusive" 2022-01-05 13:29:23 -08:00
interpret-trailers.c
log.c Merge branch 'rs/format-patch-pathspec-fix' into maint 2022-05-05 14:36:25 -07:00
ls-files.c ls-files: support --recurse-submodules --stage 2022-02-23 16:41:55 -08:00
ls-remote.c ls-remote & transport API: release "struct transport_ls_refs_options" 2022-02-06 18:02:34 -08:00
ls-tree.c Merge branch 'tl/ls-tree-oid-only' 2022-04-06 15:21:59 -07:00
mailinfo.c mailinfo: allow squelching quoted CRLF warning 2021-05-10 15:06:22 +09:00
mailsplit.c am/apply: warn if we end up reading patches from terminal 2022-03-03 14:00:32 -08:00
merge-base.c merge-base: free() allocated "struct commit **" list 2022-03-04 13:24:17 -08:00
merge-file.c xdiff: implement a zealous diff3, or "zdiff3" 2021-12-01 14:45:58 -08:00
merge-index.c merge-index: ensure full index 2021-04-14 13:47:21 -07:00
merge-ours.c builtins + test helpers: use return instead of exit() in cmd_* 2021-06-09 09:15:58 +09:00
merge-recursive.c gettext API users: don't explicitly cast ngettext()'s "n" 2022-03-07 11:57:52 -08:00
merge-tree.c xdiff users: use designated initializers for out_line 2021-05-11 12:47:31 +09:00
merge.c hooks: fix an obscure TOCTOU "did we just run a hook?" race 2022-03-07 13:00:53 -08:00
mktag.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
mktree.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
multi-pack-index.c builtin/multi-pack-index.c: don't leak concatenated options 2021-10-28 15:32:14 -07:00
mv.c mv: refuse to move sparse paths 2021-09-28 10:31:02 -07:00
name-rev.c Merge branch 'rs/name-rev-fix-free-after-use' into maint 2022-05-05 14:36:24 -07:00
notes.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
pack-objects.c Merge branch 'ds/partial-bundle-more' 2022-04-04 10:56:21 -07:00
pack-redundant.c builtin/pack-redundant: avoid casting buffers to struct object_id 2021-04-27 16:31:38 +09:00
pack-refs.c
patch-id.c patch-id: fix scan_hunk_header on diffs with 1 line of before/after 2022-02-02 11:24:23 -08:00
prune-packed.c i18n: remove from i18n strings that do not hold translatable parts 2022-02-04 13:58:28 -08:00
prune.c Merge branch 'ns/tmp-objdir' 2022-01-03 16:24:15 -08:00
pull.c Merge branch 'ja/i18n-common-messages' 2022-02-25 15:47:35 -08:00
push.c i18n: factorize "invalid value" messages 2022-02-04 13:58:28 -08:00
range-diff.c column, range-diff: downcase option description 2021-03-29 14:06:08 -07:00
read-tree.c read-tree: make three-way merge sparse-aware 2022-03-01 12:36:01 -08:00
rebase.c rebase: set REF_HEAD_DETACH in checkout_up_to_date() 2022-03-18 09:48:53 -07:00
receive-pack.c Merge branch 'ab/string-list-count-in-size-t' 2022-03-16 17:53:09 -07:00
reflog.c reflog: fix 'show' subcommand's argv 2022-03-28 15:45:46 -07:00
remote-ext.c
remote-fd.c
remote.c Merge branch 'tb/rename-remote-progress' 2022-03-16 17:53:08 -07:00
repack.c repack: add config to skip updating server info 2022-03-14 22:25:13 +00:00
replace.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
rerere.c xdiff users: use designated initializers for out_line 2021-05-11 12:47:31 +09:00
reset.c reset: show --no-refresh in the short-help 2022-03-24 13:36:21 -07:00
rev-list.c Merge branch 'ds/partial-bundles' 2022-03-21 15:14:24 -07:00
rev-parse.c refs: drop "broken" flag from for_each_fullref_in() 2021-09-27 12:36:45 -07:00
revert.c Merge branch 'ds/mergies-with-sparse-index' 2021-09-20 15:20:45 -07:00
rm.c Merge branch 'ja/i18n-similar-messages' 2022-01-10 11:52:56 -08:00
send-pack.c i18n: factorize "invalid value" messages 2022-02-04 13:58:28 -08:00
shortlog.c string-list API: change "nr" and "alloc" to "size_t" 2022-03-07 12:02:04 -08:00
show-branch.c date API: create a date.h, split from cache.h 2022-02-16 09:40:00 -08:00
show-index.c builtin/show-index: set the algorithm for object IDs 2021-04-27 16:31:39 +09:00
show-ref.c builtin/show-ref.c: avoid over-iterating with --heads, --tags 2022-06-06 09:56:42 -07:00
sparse-checkout.c Merge branch 'ep/remove-duplicated-includes' 2022-03-23 14:09:30 -07:00
stash.c Merge branch 'vd/stash-silence-reset' 2022-03-30 18:01:10 -07:00
stripspace.c i18n: remove from i18n strings that do not hold translatable parts 2022-02-04 13:58:28 -08:00
submodule--helper.c Merge branch 'gc/submodule-update-part2' into maint 2022-05-05 14:36:24 -07:00
symbolic-ref.c symbolic-ref: don't leak shortened refname in check_symref() 2021-03-14 15:57:59 -07:00
tag.c Merge branch 'ab/object-file-api-updates' 2022-03-16 17:53:08 -07:00
unpack-file.c
unpack-objects.c object-file API: have hash_object_file() take "enum object_type" 2022-02-25 17:16:32 -08:00
update-index.c fsmonitor: config settings are repository-specific 2022-03-25 16:04:15 -07:00
update-ref.c update-ref: fix streaming of status updates 2021-09-03 11:35:15 -07:00
update-server-info.c i18n: remove from i18n strings that do not hold translatable parts 2022-02-04 13:58:28 -08:00
upload-archive.c upload-archive: use regular "struct child_process" pattern 2021-11-25 22:15:07 -08:00
upload-pack.c upload-pack: document and rename --advertise-refs 2021-08-05 08:59:37 -07:00
var.c var: add GIT_DEFAULT_BRANCH variable 2021-11-03 13:25:36 -07:00
verify-commit.c
verify-pack.c
verify-tag.c
worktree.c Merge branch 'pw/worktree-list-with-z' 2022-04-04 10:56:25 -07:00
write-tree.c