This just passes the filter-options struct to prepare_bitmap_walk().
Since the bitmap code doesn't actually support any filters yet, it will
fallback to the non-bitmap code if any --filter is specified. But this
lets us exercise that rejection code path, as well as getting us ready
to test filters via rev-list when we _do_ support them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently you can't use object filters with bitmaps, but we plan to
support at least some filters with bitmaps. Let's introduce some
infrastructure that will help us do that:
- prepare_bitmap_walk() now accepts a list_objects_filter_options
parameter (which can be NULL for no filtering; all the current
callers pass this)
- we'll bail early if the filter is incompatible with bitmaps (just as
we would if there were no bitmaps at all). Currently all filters are
incompatible.
- we'll filter the resulting bitmap; since there are no supported
filters yet, this is always a noop.
There should be no behavior change yet, but we'll support some actual
filters in a future patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since we added reachability bitmap support, we've been able to use
it with rev-list to get the full list of objects, like:
git rev-list --objects --use-bitmap-index --all
But you can't do so without --objects, since we weren't ready to just
show the commits. However, the internals of the bitmap code are mostly
ready for this: they avoid opening up trees when walking to fill in the
bitmaps. We just need to actually pass in the rev_info to
traverse_bitmap_commit_list() so it knows which types to bother
triggering our callback for.
For completeness, the perf test now covers both the existing --objects
case, as well as the new commits-only behavior (the objects one got way
faster when we introduced bitmaps, but obviously isn't improved now).
Here are numbers for linux.git:
Test HEAD^ HEAD
------------------------------------------------------------------------
5310.7: rev-list (commits) 8.29(8.10+0.19) 1.76(1.72+0.04) -78.8%
5310.8: rev-list (objects) 8.06(7.94+0.12) 8.14(7.94+0.13) +1.0%
That run was cheating a little, as I didn't have any commit-graph in the
repository, and we'd built it by default these days when running git-gc.
Here are numbers with a commit-graph:
Test HEAD^ HEAD
------------------------------------------------------------------------
5310.7: rev-list (commits) 0.70(0.58+0.12) 0.51(0.46+0.04) -27.1%
5310.8: rev-list (objects) 6.20(6.09+0.10) 6.27(6.16+0.11) +1.1%
Still an improvement, but a lot less impressive.
We could have the perf script remove any commit-graph to show the
out-sized effect, but it probably makes sense to leave it in what would
be a more typical setup.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prior commit taught "--count --objects" to work without bitmaps. We
should be able to get the same answer much more quickly with bitmaps.
Note that we punt on the max_count case here. This perhaps _could_ be
made to work if we find all of the boundary commits and treat them as
UNINTERESTING, subtracting them (and their reachable objects) from the
set we return. That implies an actual commit traversal, but we'd still
be faster due to avoiding opening up any trees. Given the complexity and
the fact that anyone is unlikely to want this, it makes sense to just
fall back to the non-bitmap case for now.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current behavior from "rev-list --count --objects" is nonsensical:
we enumerate all of the objects except commits, but then give a count of
commits. This wasn't planned, and is just what the code happens to do.
Instead, let's give the answer the user almost certainly wanted: the
full count of objects.
Note that there are more complicated cases around cherry-marking, etc.
We'll punt on those for now, but let the user know that we can't produce
an answer (rather than giving them something useless).
We'll test both the new feature as well as a vanilla --count of commits,
since that surprisingly doesn't seem to be covered in the existing
tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few operations in rev-list that are optimized for bitmaps.
Rather than having the code inline in cmd_rev_list(), let's move them
into helpers. This not only makes the flow of the main function simpler,
but it lets us replace the complex "can we do the optimization?"
conditionals with a series of early returns from the functions. That
also makes it easy to add comments explaining those conditions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rev-list has refused to use bitmaps with pathspec limiting since
c8a70d3509 (rev-list: disable --use-bitmap-index when pruning commits,
2015-07-01). But this is true not just for rev-list, but for anyone who
calls prepare_bitmap_walk(); the code isn't equipped to handle this
case. We never noticed because the only other callers would never pass
a pathspec limiter.
But let's push the check down into prepare_bitmap_walk() anyway. That's
a more logical place for it to live, as callers shouldn't need to know
the details (and must be prepared to fall back to a regular traversal
anyway, since there might not be bitmaps in the repository).
It would also prepare us for a day where this case _is_ handled, but
that's pretty unlikely. E.g., we could use bitmaps to generate the set
of commits, and then diff each commit to see if it matches the pathspec.
That would be slightly faster than a naive traversal that actually walks
the commits. But you'd probably do better still to make use of the newer
commit-graph feature to make walking the commits very cheap.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--use-bitmap-index" option is usually aspirational: if we have
bitmaps and the request can be fulfilled more quickly using them we'll
do so, but otherwise fall back to a non-bitmap traversal.
The exception is object filtering, which explicitly dies if the two
options are combined. Let's convert this to the usual fallback behavior.
This is a minor convenience for now (since the caller can easily know
that --filter and --use-bitmap-index don't combine), but will become
much more useful as we start to support _some_ filters with bitmaps, but
not others.
The test infrastructure here is bigger than necessary for checking this
one small feature. But it will serve as the basis for more filtering
bitmap tests in future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 5d9324e0f4, reversing
changes made to c58ae96fc4.
The topic turns out to be too buggy for real use.
cf. <f2fe7437-8a48-3315-4d3f-8d51fe4bb8f1@gmail.com>
The code to write split commit-graph file(s) upon fetching computed
bogus value for the parameter used in splitting the resulting
files, which has been corrected.
* ds/commit-graph-set-size-mult:
commit-graph: prefer default size_mult when given zero
"git sparse-checkout list" subcommand learned to give its output in
a more concise form when the "cone" mode is in effect.
* ds/sparse-list-in-cone-mode:
sparse-checkout: document interactions with submodules
sparse-checkout: list directories in cone mode
In 50f26bd ("fetch: add fetch.writeCommitGraph config setting",
2019-09-02), the fetch builtin added the capability to write a
commit-graph using the "--split" feature. This feature creates
multiple commit-graph files, and those can merge based on a set
of "split options" including a size multiple. The default size
multiple is 2, which intends to provide a log_2 N depth of the
commit-graph chain where N is the number of commits.
However, I noticed during dogfooding that my commit-graph chains
were becoming quite large when left only to builds by 'git fetch'.
It turns out that in split_graph_merge_strategy(), we default the
size_mult variable to 2 except we override it with the context's
split_opts if they exist. In builtin/fetch.c, we create such a
split_opts, but do not populate it with values.
This problem is due to two failures:
1. It is unclear that we can add the flag COMMIT_GRAPH_WRITE_SPLIT
with a NULL split_opts.
2. If we have a non-NULL split_opts, then we override the default
values even if a zero value is given.
Correct both of these issues. First, do not override size_mult when
the options provide a zero value. Second, stop creating a split_opts
in the fetch builtin.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase --signoff" stopped working when the command was written
in C, which has been corrected.
* en/rebase-signoff-fix:
rebase: fix saving of --signoff state for am-based rebases
When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set'
command takes a list of directories as input, then creates an ordered
list of sparse-checkout patterns such that those directories are
recursively included and all sibling entries along the parent directories
are also included. Listing the patterns is less user-friendly than the
directories themselves.
In cone mode, and as long as the patterns match the expected cone-mode
pattern types, change the output of 'git sparse-checkout list' to only
show the directories that created the patterns.
With this change, the following piped commands would not change the
working directory:
git sparse-checkout list | git sparse-checkout set --stdin
The only time this would not work is if core.sparseCheckoutCone is
true, but the sparse-checkout file contains patterns that do not
match the expected pattern types for cone mode.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The effort to move "git-add--interactive" to C continues.
* js/add-p-in-c:
built-in add -p: show helpful hint when nothing can be staged
built-in add -p: only show the applicable parts of the help text
built-in add -p: implement the 'q' ("quit") command
built-in add -p: implement the '/' ("search regex") command
built-in add -p: implement the 'g' ("goto") command
built-in add -p: implement hunk editing
strbuf: add a helper function to call the editor "on an strbuf"
built-in add -p: coalesce hunks after splitting them
built-in add -p: implement the hunk splitting feature
built-in add -p: show different prompts for mode changes and deletions
built-in app -p: allow selecting a mode change as a "hunk"
built-in add -p: handle deleted empty files
built-in add -p: support multi-file diffs
built-in add -p: offer a helpful error message when hunk navigation failed
built-in add -p: color the prompt and the help text
built-in add -p: adjust hunk headers as needed
built-in add -p: show colored hunks by default
built-in add -i: wire up the new C code for the `patch` command
built-in add -i: start implementing the `patch` functionality in C
Code cleanup.
* rs/ref-read-cleanup:
remote: pass NULL to read_ref_full() because object ID is not needed
refs: pass NULL to refs_read_ref_full() because object ID is not needed
Management of sparsely checked-out working tree has gained a
dedicated "sparse-checkout" command.
* ds/sparse-cone: (21 commits)
sparse-checkout: improve OS ls compatibility
sparse-checkout: respect core.ignoreCase in cone mode
sparse-checkout: check for dirty status
sparse-checkout: update working directory in-process for 'init'
sparse-checkout: cone mode should not interact with .gitignore
sparse-checkout: write using lockfile
sparse-checkout: use in-process update for disable subcommand
sparse-checkout: update working directory in-process
sparse-checkout: sanitize for nested folders
unpack-trees: add progress to clear_ce_flags()
unpack-trees: hash less in cone mode
sparse-checkout: init and set in cone mode
sparse-checkout: use hashmaps for cone patterns
sparse-checkout: add 'cone' mode
trace2: add region in clear_ce_flags
sparse-checkout: create 'disable' subcommand
sparse-checkout: add '--stdin' option to set subcommand
sparse-checkout: 'set' subcommand
clone: add --sparse mode
sparse-checkout: create 'init' subcommand
...
Redo "git name-rev" to avoid recursive calls.
* sg/name-rev-wo-recursion:
name-rev: cleanup name_ref()
name-rev: eliminate recursion in name_rev()
name-rev: use 'name->tip_name' instead of 'tip_name'
name-rev: drop name_rev()'s 'generation' and 'distance' parameters
name-rev: restructure creating/updating 'struct rev_name' instances
name-rev: restructure parsing commits and applying date cutoff
name-rev: pull out deref handling from the recursion
name-rev: extract creating/updating a 'struct name_rev' into a helper
t6120: add a test to cover inner conditions in 'git name-rev's name_rev()
name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation
name-rev: avoid unnecessary cast in name_ref()
name-rev: use strbuf_strip_suffix() in get_rev_name()
t6120-describe: modernize the 'check_describe' helper
t6120-describe: correct test repo history graph in comment
"git format-patch" can take a set of configured format.notes values
to specify which notes refs to use in the log message part of the
output. The behaviour of this was not consistent with multiple
--notes command line options, which has been corrected.
* dl/format-patch-notes-config-fixup:
notes.h: fix typos in comment
notes: break set_display_notes() into smaller functions
config/format.txt: clarify behavior of multiple format.notes
format-patch: move git_config() before repo_init_revisions()
format-patch: use --notes behavior for format.notes
notes: extract logic into set_display_notes()
notes: create init_display_notes() helper
notes: rename to load_display_notes()
A few more commands learned the "--pathspec-from-file" command line
option.
* am/pathspec-f-f-checkout:
checkout, restore: support the --pathspec-from-file option
doc: restore: synchronize <pathspec> description
doc: checkout: synchronize <pathspec> description
doc: checkout: fix broken text reference
doc: checkout: remove duplicate synopsis
add: support the --pathspec-from-file option
cmd_add: prepare for next patch
An earlier series to teach "--pathspec-from-file" to "git commit"
forgot to make the option incompatible with "--all", which has been
corrected.
* am/pathspec-from-file:
commit: forbid --pathspec-from-file --all
This was an error introduced in the conversion from shell in commit
21853626ea ("built-in rebase: call `git am` directly", 2019-01-18),
which was noticed by a random browsing of the code.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I forgot this in my previous patch `--pathspec-from-file` for
`git commit` [1]. When both `--pathspec-from-file` and `--all` were
specified, `--all` took precedence and `--pathspec-from-file` was
ignored. Before `--pathspec-from-file` was implemented, this case was
prevented by this check in `parse_and_validate_options()` :
die(_("paths '%s ...' with -a does not make sense"), argv[0]);
It is unfortunate that these two cases are disconnected. This came as
result of how the code was laid out before my patches, where `pathspec`
is parsed outside of `parse_and_validate_options()`. This branch is
already full of refactoring patches and I did not dare to go for another
one.
Fix by mirroring `die()` for `--pathspec-from-file` as well.
[1] Commit e440fc58 ("commit: support the --pathspec-from-file option" 2019-11-19)
Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a pointer that points at a constant string,
just give name directly to the constant string; this way, we
do not have to allocate a pointer variable in addition to
the string we want to use.
Let's convert `need_bad_and_good_revision_warning` and
`need_bisect_start_warning` char pointers to char arrays.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* dl/range-diff-with-notes:
range-diff: clear `other_arg` at end of function
range-diff: mark pointers as const
t3206: fix incorrect test name
"git rebase" did not work well when format.useAutoBase
configuration variable is set, which has been corrected.
* dl/rebase-with-autobase:
rebase: fix format.useAutoBase breakage
format-patch: teach --no-base
t4014: use test_config()
format-patch: fix indentation
t3400: demonstrate failure with format.useAutoBase
Reduce unnecessary reading of state variables back from the disk
during sequencer operation.
* ag/sequencer-todo-updates:
sequencer: directly call pick_commits() from complete_action()
rebase: fill `squash_onto' in get_replay_opts()
sequencer: move the code writing total_nr on the disk to a new function
sequencer: update `done_nr' when skipping commands in a todo list
sequencer: update `total_nr' when adding an item to a todo list
In the previous steps, we re-implemented the main loop of `git add -i`
in C, and most of the commands.
Notably, we left out the actual functionality of `patch`, as the
relevant code makes up more than half of `git-add--interactive.perl`,
and is actually pretty independent of the rest of the commands.
With this commit, we start to tackle that `patch` part. For better
separation of concerns, we keep the code in a separate file,
`add-patch.c`. The new code is still guarded behind the
`add.interactive.useBuiltin` config setting, and for the moment,
it can only be called via `git add -p`.
The actual functionality follows the original implementation of
5cde71d64a (git-add --interactive, 2006-12-10), but not too closely
(for example, we use string offsets rather than copying strings around,
and after seeing whether the `k` and `j` commands are applicable, in the
C version we remember which previous/next hunk was undecided, and use it
rather than looking again when the user asked to jump).
As a further deviation from that commit, We also use a comma instead of
a slash to separate the available commands in the prompt, as the current
version of the Perl script does this, and we also add a line about the
question mark ("print help") to the help text.
While it is tempting to use this conversion of `git add -p` as an excuse
to work on `apply_all_patches()` so that it does _not_ want to read a
file from `stdin` or from a file, but accepts, say, an `strbuf` instead,
we will refrain from this particular rabbit hole at this stage.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a user uses the sparse-checkout feature in cone mode, they
add patterns using "git sparse-checkout set <dir1> <dir2> ..."
or by using "--stdin" to provide the directories line-by-line over
stdin. This behaviour naturally looks a lot like the way a user
would type "git add <dir1> <dir2> ..."
If core.ignoreCase is enabled, then "git add" will match the input
using a case-insensitive match. Do the same for the sparse-checkout
feature.
Perform case-insensitive checks while updating the skip-worktree
bits during unpack_trees(). This is done by changing the hash
algorithm and hashmap comparison methods to optionally use case-
insensitive methods.
When this is enabled, there is a small performance cost in the
hashing algorithm. To tease out the worst possible case, the
following was run on a repo with a deep directory structure:
git ls-tree -d -r --name-only HEAD |
git sparse-checkout set --stdin
The 'set' command was timed with core.ignoreCase disabled or
enabled. For the repo with a deep history, the numbers were
core.ignoreCase=false: 62s
core.ignoreCase=true: 74s (+19.3%)
For reproducibility, the equivalent test on the Linux kernel
repository had these numbers:
core.ignoreCase=false: 3.1s
core.ignoreCase=true: 3.6s (+16%)
Now, this is not an entirely fair comparison, as most users
will define their sparse cone using more shallow directories,
and the performance improvement from eb42feca97 ("unpack-trees:
hash less in cone mode" 2019-11-21) can remove most of the
hash cost. For a more realistic test, drop the "-r" from the
ls-tree command to store only the first-level directories.
In that case, the Linux kernel repository takes 0.2-0.25s in
each case, and the deep repository takes one second, plus or
minus 0.05s, in each case.
Thus, we _can_ demonstrate a cost to this change, but it is
unlikely to matter to any reasonable sparse-checkout cone.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 8164c961e1 (format-patch: use --notes behavior for format.notes,
2019-12-09), we introduced set_display_notes() which was a monolithic
function with three mutually exclusive branches. Break the function up
into three small and simple functions that each are only responsible for
one task.
This family of functions accepts an `int *show_notes` instead of
returning a value suitable for assignment to `show_notes`. This is for
two reasons. First of all, this guarantees that the external
`show_notes` variable changes in lockstep with the
`struct display_notes_opt`. Second, this prompts future developers to be
careful about doing something meaningful with this value. In fact, a
NULL check is intentionally omitted because causing a segfault here
would tell the future developer that they are meant to use the value for
something meaningful.
One alternative was making the family of functions accept a
`struct rev_info *` instead of the `struct display_notes_opt *`, since
the former contains the `show_notes` field as well. This does not work
because we have to call git_config() before repo_init_revisions().
However, if we had a `struct rev_info`, we'd need to initialize it before
it gets assigned values from git_config(). As a result, we break the
circular dependency by having standalone `int show_notes` and
`struct display_notes_opt notes_opt` variables which temporarily hold
values from git_config() until the values are copied over to `rev`.
To implement this change, we need to get a pointer to
`rev_info::show_notes`. Unfortunately, this is not possible with
bitfields and only direct-assignment is possible. Change
`rev_info::show_notes` to a non-bitfield int so that we can get its
address.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_ref_full() wraps refs_read_ref_full(), which in turn wraps
refs_resolve_ref_unsafe(), which handles a NULL oid pointer of callers
not interested in the resolved object ID. Make use of that feature to
document that mv() is such a caller.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 5e82c3dd22 (bisect--helper: `bisect_reset` shell function in C,
2019-01-02), the `git bisect reset` subcommand was ported to C. When the
call to `git checkout` failed, an error message was reported to the
user.
However, this error message used the `strbuf` that had just been
released already. Let's switch that around: first use it, then release
it.
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hide lower-level verify_signed-buffer() API as a pure helper to
implement the public check_signature() function, in order to
encourage new callers to use the correct and more strict
validation.
* hi/gpg-use-check-signature:
gpg-interface: prefer check_signature() for GPG verification
The interaction between "git clone --recurse-submodules" and
alternate object store was ill-designed. The documentation and
code have been taught to make more clear recommendations when the
users see failures.
* jt/clone-recursesub-ref-advise:
submodule--helper: advise on fatal alternate error
Doc: explain submodule.alternateErrorStrategy
"git rebase -i" learned a few options that are known by "git
rebase" proper.
* ra/rebase-i-more-options:
rebase -i: finishing touches to --reset-author-date
rebase: add --reset-author-date
rebase -i: support --ignore-date
sequencer: rename amend_author to author_to_rename
rebase -i: support --committer-date-is-author-date
sequencer: allow callers of read_author_script() to ignore fields
rebase -i: add --ignore-whitespace flag
A few commands learned to take the pathspec from the
standard input or a named file, instead of taking it as the command
line arguments.
* am/pathspec-from-file:
commit: support the --pathspec-from-file option
doc: commit: synchronize <pathspec> description
reset: support the `--pathspec-from-file` option
doc: reset: synchronize <pathspec> description
pathspec: add new function to parse file
parse-options.h: add new options `--pathspec-from-file`, `--pathspec-file-nul`
In 13cdf78094 (format-patch: teach format.notes config option,
2019-05-16), the order in which git_config() and repo_init_revisions()
were swapped so that `rev.notes_opt` would be initialized before
git_config() was called. This is problematic, however, as git_config()
should generally be called before repo_init_revisions().
Break this circular dependency by creating `show_notes` and `notes_opt`
which git_config() reads into. Then, copy these values over to
`rev.show_notes` and `rev.notes_opt` after repo_init_revisions() is
called.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we had multiple `format.notes` config values where we had `<ref1>`,
`false`, `<ref2>` (in that order), then we would print out the notes for
both `<ref1>` and `<ref2>`. This doesn't make sense, however, since we
parse the config in a top-down manner and a `false` should be able to
override previous configurations, just like how `--no-notes` will
override previous `--notes`.
Duplicate the logic that handles the `--[no-]notes[=]` option to
`format.notes` for consistency. As a result, when parsing the config
from top to bottom, `format.notes = true` will behave like `--notes`,
`format.notes = <ref>` will behave like `--notes=<ref>` and
`format.notes = false` will behave like `--no-notes`.
This change isn't strictly backwards compatible but since it is an edge
case where a sane user would not mix notes refs with `false` and this
feature is relatively new (released only in v2.23.0), this change should
be harmless.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the function comment, init_display_notes() was supposed to
"Load the notes machinery for displaying several notes trees." Rename
this function to load_display_notes() so that its use is more accurately
represented.
This is done because, in a future commit, we will reuse the name
init_display_notes().
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier patches in this series moved a couple of conditions from the
recursive name_rev() function into its caller name_ref(), for no other
reason than to make eliminating the recursion a bit easier to follow.
Since the previous patch name_rev() is not recursive anymore, so let's
move all those conditions back into name_rev().
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name_rev() function calls itself recursively for each interesting
parent of the commit it got as parameter, and, consequently, it can
segfault when processing a deep history if it exhausts the available
stack space. E.g. running 'git name-rev --all' and 'git name-rev
HEAD~100000' in the gcc, gecko-dev, llvm, and WebKit repositories
results in segfaults on my machine ('ulimit -s' reports 8192kB of
stack size limit), and nowadays the former segfaults in the Linux repo
as well (it reached the necessasry depth sometime between v5.3-rc4 and
-rc5).
Eliminate the recursion by inserting the interesting parents into a
LIFO 'prio_queue' [1] and iterating until the queue becomes empty.
Note that the parent commits must be added in reverse order to the
LIFO 'prio_queue', so their relative order is preserved during
processing, i.e. the first parent should come out first from the
queue, because otherwise performance greatly suffers on mergy
histories [2].
The stacksize-limited test 'name-rev works in a deep repo' in
't6120-describe.sh' demonstrated this issue and expected failure. Now
the recursion is gone, so flip it to expect success. Also gone are
the dmesg entries logging the segfault of that segfaulting 'git
name-rev' process on every execution of the test suite.
Note that this slightly changes the order of lines in the output of
'git name-rev --all', usually swapping two lines every 35 lines in
git.git or every 150 lines in linux.git. This shouldn't matter in
practice, because the output has always been unordered anyway.
This patch is best viewed with '--ignore-all-space'.
[1] Early versions of this patch used a 'commit_list', resulting in
~15% performance penalty for 'git name-rev --all' in 'linux.git',
presumably because of the memory allocation and release for each
insertion and removal. Using a LIFO 'prio_queue' has basically no
effect on performance.
[2] We prefer shorter names, i.e. 'v0.1~234' is preferred over
'v0.1^2~5', meaning that usually following the first parent of a
merge results in the best name for its ancestors. So when later
we follow the remaining parent(s) of a merge, and reach an already
named commit, then we usually find that we can't give that commit
a better name, and thus we don't have to visit any of its
ancestors again.
OTOH, if we were to follow the Nth parent of the merge first, then
the name of all its ancestors would include a corresponding '^N'.
Those are not the best names for those commits, so when later we
reach an already named commit following the first parent of that
merge, then we would have to update the name of that commit and
the names of all of its ancestors as well. Consequently, we would
have to visit many commits several times, resulting in a
significant slowdown.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Following the previous patches in this series we can get the value of
'name_rev()'s 'tip_name' parameter from the 'struct rev_name'
associated with the commit as well.
So let's use 'name->tip_name' instead, which makes the patch
eliminating the recursion of name_rev() a bit easier to follow.
Note that at this point we could drop the 'tip_name' parameter as
well, but that parameter will be necessary later, after the recursion
is eliminated.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
flush_current_id() prints the hexadecimal representation of two object
IDs. When the code was added in f97672225b (Add "git-patch-id" program
to generate patch ID's., 2005-06-23), sha1_to_hex() had only a single
internal static buffer, so the result of one invocation had to be stored
in a local buffer.
Since dcb3450fd8 (sha1_to_hex() usage cleanup, 2006-05-03) it rotates
through four buffers, which allows to print up to four object IDs at the
same time. 1a876a69af (patch-id: convert to use struct object_id,
2015-03-13) replaced sha1_to_hex() with oid_to_hex(), which has the same
feature. Use it to simplify the code.
Signed-off-by: René Scharfe <l.s.r@web.de>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>