We assemble an array of strings in a custom struct,
NULL-terminate the result, and then pass it to
parse_pathspec().
But then we never free the array or the individual strings
(nor can we do the latter, as they are heap-allocated when
they come from stdin but not when they come from the
passed-in argv).
Let's swap this out for an argv_array. It does the same
thing with fewer lines of code, and it's safe to call
argv_array_clear() at the end to avoid a memory leak.
Reported-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git gc" and friends when multiple worktrees are used off of a
single repository did not consider the index and per-worktree refs
of other worktrees as the root for reachability traversal, making
objects that are in use only in other worktrees to be subject to
garbage collection.
* nd/prune-in-worktree:
refs.c: reindent get_submodule_ref_store()
refs.c: remove fallback-to-main-store code get_submodule_ref_store()
rev-list: expose and document --single-worktree
revision.c: --reflog add HEAD reflog from all worktrees
files-backend: make reflog iterator go through per-worktree reflog
revision.c: --all adds HEAD from all worktrees
refs: remove dead for_each_*_submodule()
refs.c: move for_each_remote_ref_submodule() to submodule.c
revision.c: use refs_for_each*() instead of for_each_*_submodule()
refs: add refs_head_ref()
refs: move submodule slash stripping code to get_submodule_ref_store
refs.c: refactor get_submodule_ref_store(), share common free block
revision.c: --indexed-objects add objects from all worktrees
revision.c: refactor add_index_objects_to_pending()
refs.c: use is_dir_sep() in resolve_gitlink_ref()
revision.h: new flag in struct rev_info wrt. worktree-related refs
"git log --tag=no-such-tag" showed log starting from HEAD, which
has been fixed---it now shows nothing.
* jk/rev-list-empty-input:
revision: do not fallback to default when rev_input_given is set
rev-list: don't show usage when we see empty ref patterns
revision: add rev_input_given flag
t6018: flesh out empty input/output rev-list tests
Note that add_other_reflogs_to_pending() is a bit inefficient, since
it scans reflog for all refs of each worktree, including shared refs,
so the shared ref's reflog is scanned over and over again.
We could update refs API to pass "per-worktree only" flag to avoid
that. But long term we should be able to obtain a "per-worktree only"
ref store and would need to revert the changes in reflog iteration
API. So let's just wait until then.
add_reflogs_to_pending() is called by reachable.c so by default "git
prune" will examine reflog from all worktrees.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless single_worktree is set, --all now adds HEAD from all worktrees.
Since reachable.c code does not use setup_revisions(), we need to call
other_head_refs_submodule() explicitly there to have the same effect on
"git prune", so that we won't accidentally delete objects needed by some
other HEADs.
A new FIXME is added because we would need something like
int refs_other_head_refs(struct ref_store *, each_ref_fn, cb_data);
in addition to other_head_refs() to handle it, which might require
int get_submodule_worktrees(const char *submodule, int flags);
It could be a separate topic to reduce the scope of this one.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the result of single_worktree flag never being set (no way to up
until now). To get objects from current index only, set single_worktree.
The other add_index_objects_to_pending's caller is mark_reachable_objects()
(e.g. "git prune") which also mark objects from all indexes.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core code is factored out and take 'struct index_state *' instead so
that we can reuse it to add objects from index files other than .git/index
in the next patch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Numerous bugs in walking of reflogs via "log -g" and friends have
been fixed.
* jk/reflog-walk:
reflog-walk: apply --since/--until to reflog dates
reflog-walk: stop using fake parents
rev-list: check reflog_info before showing usage
get_revision_1(): replace do-while with an early return
log: do not free parents when walking reflog
log: clarify comment about reflog cycles
revision: disallow reflog walking with revs->limited
t1414: document some reflog-walk oddities
"git log --tag=no-such-tag" showed log starting from HEAD, which
has been fixed---it now shows nothing.
* jk/rev-list-empty-input:
revision: do not fallback to default when rev_input_given is set
rev-list: don't show usage when we see empty ref patterns
revision: add rev_input_given flag
t6018: flesh out empty input/output rev-list tests
Numerous bugs in walking of reflogs via "log -g" and friends have
been fixed.
* jk/reflog-walk:
reflog-walk: apply --since/--until to reflog dates
reflog-walk: stop using fake parents
rev-list: check reflog_info before showing usage
get_revision_1(): replace do-while with an early return
log: do not free parents when walking reflog
log: clarify comment about reflog cycles
revision: disallow reflog walking with revs->limited
t1414: document some reflog-walk oddities
Conversion from uchar[20] to struct object_id continues.
* bc/object-id:
sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ
sha1_name: convert GET_SHA1* flags to GET_OID*
sha1_name: convert get_sha1* to get_oid*
Convert remaining callers of get_sha1 to get_oid.
builtin/unpack-file: convert to struct object_id
bisect: convert bisect_checkout to struct object_id
builtin/update_ref: convert to struct object_id
sequencer: convert to struct object_id
remote: convert struct push_cas to struct object_id
submodule: convert submodule config lookup to use object_id
builtin/merge-tree: convert remaining caller of get_sha1 to object_id
builtin/fsck: convert remaining caller of get_sha1 to object_id
If revs->def is set (as it is in "git log") and there are no
pending objects after parsing the user's input, then we show
whatever is in "def". But if the user _did_ ask for some
input that just happened to be empty (e.g., "--glob" that
does not match anything), showing the default revision is
confusing. We should just show nothing, as that is what the
user's request yielded.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally a caller that invokes setup_revisions() has to
check rev.pending to see if anything was actually queued for
the traversal. But they can't tell the difference between
two cases:
1. The user gave us no tip from which to start a
traversal.
2. The user tried to give us tips via --glob, --all, etc,
but their patterns ended up being empty.
Let's set a flag in the rev_info struct that callers can use
to tell the difference. We can set this from the
init_all_refs_cb() function. That's a little funny because
it's not exactly about initializing the "cb" struct itself.
But that function is the common setup place for doing
pattern traversals that is used by --glob, --all, etc.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the flags for get_oid_with_context and friends to use "OID"
instead of "SHA1" in their names.
This transform was made by running the following one-liner on the
affected files:
perl -pi -e 's/GET_SHA1/GET_OID/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanup.
* ab/grep-lose-opt-regflags:
grep: remove redundant REG_NEWLINE when compiling fixed regex
grep: remove regflags from the public grep_opt API
grep: remove redundant and verbose re-assignments to 0
grep: remove redundant "fixed" field re-assignment to 0
grep: adjust a redundant grep pattern type assignment
grep: remove redundant double assignment to 0
When doing a reflog walk, we use the commit's date to
do any date limiting. In earlier versions of Git, this could
lead to nonsense results, since a skipped commit would
truncate the traversal. So a sequence like:
git commit ...
git checkout week-old-branch
git checkout -
git log -g --since=1.day.ago
would stop at the week-old-branch, even though the "git
commit" entry further back is still interesting.
As of the prior commit, which uses a parent-less traversal
of the reflog, you get the whole reflog minus any commits
whose dates do not match the specified options. This is
arguably useful, as you could scan the reflogs for commits
that originated in a certain range.
But more likely a user doing a reflog walk wants to limit
based on the reflog entries themselves. You can simulate
--until with:
git log -g @{1.day.ago}
but there's no way to ask Git to traverse only back to a
certain date. E.g.:
# show me reflog entries from the past day
git log -g --since=1.day.ago
This patch teaches the revision machinery to prefer the
reflog entry dates to the commit dates when doing a reflog
walk. Technically this is a change in behavior that affects
plumbing, but the previous behavior was so buggy that it's
unlikely anyone was relying on it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog-walk system works by putting a ref's tip into the
pending queue, and then "traversing" the reflog by
pretending that the parent of each commit is the previous
reflog entry.
This causes a number of user-visible oddities, as documented
in t1414 (and the commit message which introduced it). We
can fix all of them in one go by replacing the fake-reflog
system with a much simpler one: just keeping a list of
reflogs to show, and walking through them entry by entry.
The implementation is fairly straight-forward, but there are
a few items to note:
1. We obviously must skip calling add_parents_to_list()
when we are traversing reflogs, since we do not want to
walk the original parents at all. As a result, we must call
try_to_simplify_commit() ourselves.
There are other parts of add_parents_to_list() we skip,
as well, but none of them should matter for a reflog
traversal:
- We do not allow UNINTERESTING commits, nor
symmetric ranges (and we bail when these are used
with "-g").
- Using --source makes no sense, since we aren't
traversing. The reflog selector shows the same
information with more detail.
- Using --first-parent is still sensible, since you
may want to see the first-parent diff for each
entry. But since we're not traversing, we don't
need to cull the parent list here.
2. Since we now just walk the reflog entries themselves,
rather than starting with the ref tip, we now look at
the "new" field of each entry rather than the "old"
(i.e., we are showing entries, not faking parents).
This removes all of the tricky logic around skipping
past root commits.
But note that we have no way to show an entry with the
null sha1 in its "new" field (because such a commit
obviously does not exist). Normally this would not
happen, since we delete reflogs along with refs, but
there is one special case. When we rename the currently
checked out branch, we write two reflog entries into
the HEAD log: one where the commit goes away, and
another where it comes back.
Prior to this commit, we show both entries with
identical reflog messages. After this commit, we show
only the "comes back" entry. See the update in t3200
which demonstrates this.
Arguably either is fine, as the whole double-entry
thing is a bit hacky in the first place. And until a
recent fix, we truncated the traversal in such a case
anyway, which was _definitely_ wrong.
3. We show individual reflogs in order, but choose which
reflog to show at each stage based on which has the
most recent timestamp. This interleaves the output
from multiple reflogs based on date order, which is
probably what you'd want with limiting like "-n 30".
Note that the implementation aims for simplicity. It
does a linear walk over the reflog queue for each
commit it pulls, which may perform badly if you
interleave an enormous number of reflogs. That seems
like an unlikely use case; if we did want to handle it,
we could probably keep a priority queue of reflogs,
ordered by the timestamp of their current tip entry.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_revision_1() function tries to avoid entering its
main loop at all when there are no commits to look at. But
it's perfectly safe to call pop_commit() on an empty list
(in which case it will return NULL). Switching to an early
return from the loop lets us skip repeating the loop
condition before we enter the do-while. That will get more
important when we start pulling reflog-walk commits from a
source besides the revs->commits queue, as that condition
will get much more complicated.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog-walk code doesn't work with limit_list(). That
function traverses down the real history graph, not the fake
reflog history that get_revision() returns. So it's not
going to actually examine all of the commits we're going to
show, because we'd add them to the pending list only during
the actual traversal.
In practice this limitation doesn't really matter, because
the options that require list-limiting generally need
UNINTERESTING endpoints or symmetric ranges, which already
are forbidden for reflog walks. Still, there are likely some
corner cases that would behave oddly. We're better off to
warn the user that we can't fulfill their request than to
generate potentially wrong output.
This will also make it easier to refactor the reflog-walking
code, because it eliminates a whole area of corner cases
we'd have to consider (that already don't work anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor calls to the grep machinery to always pass opt.ignore_case &
opt.extended_regexp_option instead of setting the equivalent regflags
bits.
The bug fixed when making -i work with -P in commit 9e3cbc59d5 ("log:
make --regexp-ignore-case work with --perl-regexp", 2017-05-20) was
really just plastering over the code smell which this change fixes.
The reason for adding the extensive commentary here is that I
discovered some subtle complexity in implementing this that really
should be called out explicitly to future readers.
Before this change we'd rely on the difference between
`extended_regexp_option` and `regflags` to serve as a membrane between
our preliminary parsing of grep.extendedRegexp and grep.patternType,
and what we decided to do internally.
Now that those two are the same thing, it's necessary to unset
`extended_regexp_option` just before we commit in cases where both of
those config variables are set. See 84befcd0a4 ("grep: add a
grep.patternType configuration setting", 2012-08-03) for the code and
documentation related to that.
The explanation of why the if/else branches in
grep_commit_pattern_type() are ordered the way they are exists in that
commit message, but I think it's worth calling this subtlety out
explicitly with a comment for future readers.
Even though grep_commit_pattern_type() is the only caller of
grep_set_pattern_type_option() it's simpler to reset the
extended_regexp_option flag in the latter, since 2/3 branches in the
former would otherwise need to reset it, this way we can do it in one
place.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bugfix for a topic that is (only) in 'master'.
* mh/packed-ref-store-prep:
for_each_bisect_ref(): don't trim refnames
lock_packed_refs(): fix cache validity check
Remove the unused wildopts placeholder struct from being passed to all
wildmatch() invocations, or rather remove all the boilerplate NULL
parameters.
This parameter was added back in commit 9b3497cab9 ("wildmatch: rename
constants and update prototype", 2013-01-01) as a placeholder for
future use. Over 4 years later nothing has made use of it, let's just
remove it. It can be added in the future if we find some reason to
start using such a parameter.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* sg/revision-parser-skip-prefix:
revision.c: use skip_prefix() in handle_revision_pseudo_opt()
revision.c: use skip_prefix() in handle_revision_opt()
revision.c: stricter parsing of '--early-output'
revision.c: stricter parsing of '--no-{min,max}-parents'
revision.h: turn rev_info.early_output back into an unsigned int
Update "perl-compatible regular expression" support to enable JIT
and also allow linking with the newer PCRE v2 library.
* ab/pcre-v2:
grep: add support for PCRE v2
grep: un-break building with PCRE >= 8.32 without --enable-jit
grep: un-break building with PCRE < 8.20
grep: un-break building with PCRE < 8.32
grep: add support for the PCRE v1 JIT API
log: add -P as a synonym for --perl-regexp
grep: skip pthreads overhead when using one thread
grep: don't redundantly compile throwaway patterns under threading
`for_each_bisect_ref()` is called by `for_each_bad_bisect_ref()` with
a term "bad". This used to make it call `for_each_ref_in_submodule()`
with a prefix "refs/bisect/bad". But the latter is the name of the
reference that is being sought, so the empty string was being passed
to the callback as the trimmed refname. Moreover, this questionable
practice was turned into an error by
b9c8e7f2fb prefix_ref_iterator: don't trim too much, 2017-05-22
It makes more sense (and agrees better with the documentation of
`--bisect`) for the callers to receive the full reference names. So
* Add a new function, `for_each_fullref_in_submodule()`, to the refs
API. This plugs a gap in the existing functionality, analogous to
`for_each_fullref_in()` but accepting a `submodule` argument.
* Change `for_each_bad_bisect_ref()` to call the new function rather
than `for_each_ref_in_submodule()`.
* Add a test.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of starts_with() and a bunch of magic numbers.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of starts_with() and a bunch of magic numbers.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The parsing of '--early-output' with or without its optional integer
argument allowed bogus options like '--early-output-foobarbaz' to slip
through and be ignored.
Fix it by parsing '--early-output' in the same way as other options
with an optional argument are parsed. Furthermore, use strtoul_ui()
to parse the optional integer argument and to refuse negative numbers.
While at it, use skip_prefix() instead of starts_with() and magic
numbers.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two options are parsed using starts_with(), allowing things like
'git log --no-min-parents-foobarbaz' to succeed.
Use strcmp() instead.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internal implementation of "git grep" has seen some clean-up.
* ab/grep-preparatory-cleanup: (31 commits)
grep: assert that threading is enabled when calling grep_{lock,unlock}
grep: given --threads with NO_PTHREADS=YesPlease, warn
pack-objects: fix buggy warning about threads
pack-objects & index-pack: add test for --threads warning
test-lib: add a PTHREADS prerequisite
grep: move is_fixed() earlier to avoid forward declaration
grep: change internal *pcre* variable & function names to be *pcre1*
grep: change the internal PCRE macro names to be PCRE1
grep: factor test for \0 in grep patterns into a function
grep: remove redundant regflags assignments
grep: catch a missing enum in switch statement
perf: add a comparison test of log --grep regex engines with -F
perf: add a comparison test of log --grep regex engines
perf: add a comparison test of grep regex engines with -F
perf: add a comparison test of grep regex engines
perf: emit progress output when unpacking & building
perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
grep: add tests to fix blind spots with \0 patterns
grep: prepare for testing binary regexes containing rx metacharacters
grep: add a test helper function for less verbose -f \0 tests
...
The result from "git diff" that compares two blobs, e.g. "git diff
$commit1:$path $commit2:$path", used to be shown with the full
object name as given on the command line, but it is more natural to
use the $path in the output and use it to look up .gitattributes.
* jk/diff-blob:
diff: use blob path for blob/file diffs
diff: use pending "path" if it is available
diff: use the word "path" instead of "name" for blobs
diff: pass whole pending entry in blobinfo
handle_revision_arg: record paths for pending objects
handle_revision_arg: record modes for "a..b" endpoints
t4063: add tests of direct blob diffs
get_sha1_with_context: dynamically allocate oc->path
get_sha1_with_context: always initialize oc->symlink_path
sha1_name: consistently refer to object_context as "oc"
handle_revision_arg: add handle_dotdot() helper
handle_revision_arg: hoist ".." check out of range parsing
handle_revision_arg: stop using "dotdot" as a generic pointer
handle_revision_arg: simplify commit reference lookups
handle_revision_arg: reset "dotdot" consistently
Convert diff_change to take a struct object_id. In addition convert the
function pointer type 'change_fn_t' to also take a struct object_id.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert diff_addremove to take a struct object_id. In addtion convert
the function pointer type 'add_remove_fn_t' to also take a struct
object_id.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tag objects, which are not reachable from any ref, that point at
missing objects were mishandled by "git gc" and friends (they
should silently be ignored instead)
* jk/ignore-broken-tags-when-ignoring-missing-links:
revision.c: ignore broken tags with ignore_missing_links
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (53 commits)
object: convert parse_object* to take struct object_id
tree: convert parse_tree_indirect to struct object_id
sequencer: convert do_recursive_merge to struct object_id
diff-lib: convert do_diff_cache to struct object_id
builtin/ls-tree: convert to struct object_id
merge: convert checkout_fast_forward to struct object_id
sequencer: convert fast_forward_to to struct object_id
builtin/ls-files: convert overlay_tree_on_cache to object_id
builtin/read-tree: convert to struct object_id
sha1_name: convert internals of peel_onion to object_id
upload-pack: convert remaining parse_object callers to object_id
revision: convert remaining parse_object callers to object_id
revision: rename add_pending_sha1 to add_pending_oid
http-push: convert process_ls_object and descendants to object_id
refs/files-backend: convert many internals to struct object_id
refs: convert struct ref_update to use struct object_id
ref-filter: convert some static functions to struct object_id
Convert struct ref_array_item to struct object_id
Convert the verify_pack callback to struct object_id
Convert lookup_tag to struct object_id
...
Add a short -P option as a synonym for the longer --perl-regexp, for
consistency with the options the corresponding grep invocations
accept.
This was intentionally omitted in commit 727b6fc3ed ("log --grep:
accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified
future use.
Make it consistent with "grep" rather than to keep it open for future
use, and to avoid the confusion of -P meaning different things for
grep & log, as is the case with the -G option.
As noted in the aforementioned commit the --basic-regexp option can't
have a corresponding -G argument, as the log command already uses that
for -G<regex>.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the revision parser sees an argument like tree:path, we
parse it down to the correct blob (or tree), but throw away
the "path" portion. Let's ask get_sha1_with_context() to
record it, and pass it along in the pending array.
This will let programs like git-diff which rely on the
revision-parser show more accurate paths.
Note that the implementation is a little tricky; we have to
make sure we free oc.path in all code paths. For handle_dotdot(),
we can piggy-back on the existing cleanup-wrapper pattern.
The real work happens in handle_dotdot_1(), but the
handle_dotdot() wrapper makes sure that the path is freed no
matter how we exit the function (and for that reason we make
sure that the object_context struct is zero'd, so if we fail
to even get to the get_sha1_with_context() call, we just end
up calling free(NULL)).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "a..b" revision syntax was designed to handle commits,
so it doesn't bother to record any mode we find while
traversing a "tree:path" endpoint. These days "git diff" can
diff blobs using either "a:path..b:path" (with dots) or
"a:path b:path" (without), but the two behave
inconsistently, as the with-dots version fails to notice the
mode.
Let's teach the dot-dot range parser to record modes; it
doesn't cost us anything, and it makes this case work.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The handle_revision_arg function is rather long, and a big
chunk of it is handling the range operators. Let's pull that
out to a separate helper. While we're doing so, we can clean
up a few of the rough edges that made the flow hard to
follow:
- instead of manually restoring *dotdot (that we overwrote
with a NUL), do the real work in a sub-helper, which
makes it clear that the munge/restore lines are a
matched pair
- eliminate a goto which wasn't actually used for control
flow, but only to avoid duplicating a few lines
(instead, those lines are pushed into another helper
function)
- use early returns instead of deep nesting
- consistently name all variables for the left-hand side
of the range as "a" (rather than "this" or "from") and
the right-hand side as "b" (rather than "next", or using
the unadorned "sha1" or "flags" from the main function).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 003c84f6d (specifying ranges: we did not mean to make
".." an empty set, 2011-05-02), we treat the argument ".."
specially. We detect it by noticing that both sides of the
range are empty, and that this is a non-symmetric two-dot
range.
While correct, this makes the code overly complicated. We
can just detect ".." up front before we try to do further
parsing. This avoids having to de-munge the NUL from dotdot,
and lets us eliminate an extra const array (which we needed
only to do direct pointer comparisons).
It also removes the one code path from the range-parsing
conditional that requires us to return -1. That will make it
simpler to pull the dotdot parsing out into its own
function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The handle_revision_arg() function has a "dotdot" variable
that it uses to find a ".." or "..." in the argument. If we
don't find one, we look for other marks, like "^!". But we
just keep re-using the "dotdot" variable, which is
confusing.
Let's introduce a separate "mark" variable that can be used
for these other marks. They still reuse the same variable,
but at least the name is no longer actively misleading.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "dotdot" range parser avoids calling
lookup_commit_reference() if we are directly fed two
commits. But its casts are unnecessarily complex; that
function will just return a commit we pass into it.
Just calling the function all the time is much simpler, and
doesn't do any significant extra work (the object is already
parsed, and deref_tag() on a non-tag is a noop; we do incur
one extra lookup_object() call, but that's fairly trivial).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are parsing a range like "a..b", we write a
temporary NUL over the first ".", so that we can access the
names "a" and "b" as C strings. But our restoration of the
original "." is done at inconsistent times, which can lead
to confusing results.
For most calls, we restore the "." after we resolve the
names, but before we call verify_non_filename(). This means
that when we later call add_pending_object(), the name for
the left-hand "a" has been re-expanded to "a..b". You can
see this with:
git log --source a...b
where "b" will be correctly marked with "b", but "a" will be
marked with "a...b". Likewise with "a..b" (though you need
to use --boundary to even see "a" at all in that case).
To top off the confusion, when the REVARG_CANNOT_BE_FILENAME
flag is set, we skip the non-filename check, and leave the
NUL in place.
That means we do report the correct name for "a" in the
pending array. But some code paths try to show the whole
"a...b" name in error messages, and these erroneously show
only "a" instead of "a...b". E.g.:
$ git cherry-pick HEAD:foo...HEAD:foo
error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit
error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit
fatal: Invalid symmetric difference expression HEAD:foo
(That last message should be "HEAD:foo...HEAD:foo"; I used
cherry-pick because it passes the CANNOT_BE_FILENAME flag).
As an interesting side note, cherry-pick actually looks at
and re-resolves the arguments from the pending->name fields.
So it would have been visibly broken by the first bug, but
the effect was canceled out by the second one.
This patch makes the whole function consistent by re-writing
the NUL immediately after calling verify_non_filename(), and
then restoring the "." as appropriate in some error-printing
and early-return code paths.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* bc/object-id: (53 commits)
object: convert parse_object* to take struct object_id
tree: convert parse_tree_indirect to struct object_id
sequencer: convert do_recursive_merge to struct object_id
diff-lib: convert do_diff_cache to struct object_id
builtin/ls-tree: convert to struct object_id
merge: convert checkout_fast_forward to struct object_id
sequencer: convert fast_forward_to to struct object_id
builtin/ls-files: convert overlay_tree_on_cache to object_id
builtin/read-tree: convert to struct object_id
sha1_name: convert internals of peel_onion to object_id
upload-pack: convert remaining parse_object callers to object_id
revision: convert remaining parse_object callers to object_id
revision: rename add_pending_sha1 to add_pending_oid
http-push: convert process_ls_object and descendants to object_id
refs/files-backend: convert many internals to struct object_id
refs: convert struct ref_update to use struct object_id
ref-filter: convert some static functions to struct object_id
Convert struct ref_array_item to struct object_id
Convert the verify_pack callback to struct object_id
Convert lookup_tag to struct object_id
...
Make the --regexp-ignore-case option work with --perl-regexp. This
never worked, and there was no test for this. Fix the bug and add a
test.
When PCRE support was added in commit 63e7e9d8b6 ("git-grep: Learn
PCRE", 2011-05-09) compile_pcre_regexp() would only check
opt->ignore_case, but when the --perl-regexp option was added in
commit 727b6fc3ed ("log --grep: accept --basic-regexp and
--perl-regexp", 2012-10-03) the code didn't set the opt->ignore_case.
Change the test suite to test for -i and --invert-regexp with
basic/extended/perl patterns in addition to fixed, which was the only
patternType that was tested for before in combination with those
options.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When peeling a tag for prepare_revision_walk(), we do not
respect the ignore_missing_links flag. This can lead to a
bogus error when pack-objects walks the possibly-broken
unreachable-but-recent part of the object graph.
The other link-following all happens via traverse_commit_list(),
which explains why this case was missed. And our tests
covered only broken links from commits. Let's be more
comprehensive and cover broken tree entries (which do work)
and tags (which shows off this bug).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename this function and convert it to take a pointer to struct
object_id.
This is a prerequisite for converting get_reference, which is needed to
convert parse_object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the lookup_tree function to take a pointer to struct object_id.
The commit was created with manual changes to tree.c, tree.h, and
object.c, plus the following semantic patch:
@@
@@
- lookup_tree(EMPTY_TREE_SHA1_BIN)
+ lookup_tree(&empty_tree_oid)
@@
expression E1;
@@
- lookup_tree(E1.hash)
+ lookup_tree(&E1)
@@
expression E1;
@@
- lookup_tree(E1->hash)
+ lookup_tree(E1)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert lookup_blob to take a pointer to struct object_id.
The commit was created with manual changes to blob.c and blob.h, plus
the following semantic patch:
@@
expression E1;
@@
- lookup_blob(E1.hash)
+ lookup_blob(&E1)
@@
expression E1;
@@
- lookup_blob(E1->hash)
+ lookup_blob(E1)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert lookup_commit, lookup_commit_or_die,
lookup_commit_reference, and lookup_commit_reference_gently to take
struct object_id arguments.
Introduce a temporary in parse_object buffer in order to convert this
function. This is required since in order to convert parse_object and
parse_object_buffer, lookup_commit_reference_gently and
lookup_commit_or_die would need to be converted. Not introducing a
temporary would therefore require that lookup_commit_or_die take a
struct object_id *, but lookup_commit would take unsigned char *,
leaving a confusing and hard-to-use interface.
parse_object_buffer will lose this temporary in a later patch.
This commit was created with manual changes to commit.c, commit.h, and
object.c, plus the following semantic patch:
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1.hash, E2)
+ lookup_commit_reference_gently(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1->hash, E2)
+ lookup_commit_reference_gently(E1, E2)
@@
expression E1;
@@
- lookup_commit_reference(E1.hash)
+ lookup_commit_reference(&E1)
@@
expression E1;
@@
- lookup_commit_reference(E1->hash)
+ lookup_commit_reference(E1)
@@
expression E1;
@@
- lookup_commit(E1.hash)
+ lookup_commit(&E1)
@@
expression E1;
@@
- lookup_commit(E1->hash)
+ lookup_commit(E1)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1.hash, E2)
+ lookup_commit_or_die(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1->hash, E2)
+ lookup_commit_or_die(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a caller of lookup_commit_or_die, which we will convert later
on.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the sha1 member of struct cache_tree to struct object_id by
changing the definition and applying the following semantic patch, plus
the standard object_id transforms:
@@
struct cache_tree E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct cache_tree *E1;
@@
- E1->sha1
+ E1->oid.hash
Fix up one reference to active_cache_tree which was not automatically
caught by Coccinelle. These changes are prerequisites for converting
parse_object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interpret_branch_name() function converts names like
@{-1} and @{upstream} into branch names. The expanded ref
names are not fully qualified, and may be outside of the
refs/heads/ namespace (e.g., "@" expands to "HEAD", and
"@{upstream}" is likely to be in "refs/remotes/").
This is OK for callers like dwim_ref() which are primarily
interested in resolving the resulting name, no matter where
it is. But callers like "git branch" treat the result as a
branch name in refs/heads/. When we expand to a ref outside
that namespace, the results are very confusing (e.g., "git
branch @" tries to create refs/heads/HEAD, which is
nonsense).
Callers can't know from the returned string how the
expansion happened (e.g., did the user really ask for a
branch named "HEAD", or did we do a bogus expansion?). One
fix would be to return some out-parameters describing the
types of expansion that occurred. This has the benefit that
the caller can generate precise error messages ("I
understood @{upstream} to mean origin/master, but that is a
remote tracking branch, so you cannot create it as a local
name").
However, out-parameters make the function interface somewhat
cumbersome. Instead, let's do the opposite: let the caller
tell us which elements to expand. That's easier to pass in,
and none of the callers give more precise error messages
than "@{upstream} isn't a valid branch name" anyway (which
should be sufficient).
The strbuf_branchname() function needs a similar parameter,
as most of the callers access interpret_branch_name()
through it.
We can break the callers down into two groups:
1. Callers that are happy with any kind of ref in the
result. We pass "0" here, so they continue to work
without restrictions. This includes merge_name(),
the reflog handling in add_pending_object_with_path(),
and substitute_branch_name(). This last is what powers
dwim_ref().
2. Callers that have funny corner cases (mostly in
git-branch and git-checkout). These need to make use of
the new parameter, but I've left them as "0" in this
patch, and will address them individually in follow-on
patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make each_reflog_ent_fn take two struct object_id pointers instead of
two pointers to unsigned char. Convert the various callbacks to use
struct object_id as well. Also, rename fsck_handle_reflog_sha1 to
fsck_handle_reflog_oid.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log rev^..rev" is an often-used revision range specification
to show what was done on a side branch merged at rev. This has
gained a short-hand "rev^-1". In general "rev^-$n" is the same as
"^rev^$n rev", i.e. what has happened on other branches while the
history leading to nth parent was looking the other way.
* vn/revision-shorthand-for-side-branch-log:
revision: new rev^-n shorthand for rev^n..rev
"git log rev^..rev" is commonly used to show all work done on and merged
from a side branch. This patch introduces a shorthand "rev^-" for this
and additionally allows "rev^-$n" to mean "reachable from rev, excluding
what is reachable from the nth parent of rev". For example, for a
two-parent merge, you can use rev^-2 to get the set of commits which were
made to the main branch while the topic branch was prepared.
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:
@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git rebase" tries to compare set of changes on the updated
upstream and our own branch, it computes patch-id for all of these
changes and attempts to find matches. This has been optimized by
lazily computing the full patch-id (which is expensive) to be
compared only for changes that touch the same set of paths.
* kw/patch-ids-optim:
rebase: avoid computing unnecessary patch IDs
patch-ids: add flag to create the diff patch id using header only data
patch-ids: replace the seen indicator with a commit pointer
patch-ids: stop using a hand-rolled hashmap implementation
"git -c grep.patternType=extended log --basic-regexp" misbehaved
because the internal API to access the grep machinery was not
designed well.
* jc/grep-commandline-vs-configuration:
grep: further simplify setting the pattern type
The cherry_pick_list was looping through the original side checking the
seen indicator and setting the cherry_flag on the commit. If we save
off the commit in the patch_id we can set the cherry_flag on the correct
commit when running through the other side when a patch_id match is found.
Signed-off-by: Kevin Willford <kcwillford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When c5c31d33 (grep: move pattern-type bits support to top-level
grep.[ch], 2012-10-03) introduced grep_commit_pattern_type() helper
function, the intention was to allow the users of grep API to having
to fiddle only with .pattern_type_option (which can be set to "fixed",
"basic", "extended", and "pcre"), and then immediately before compiling
the pattern strings for use, call grep_commit_pattern_type() to have
it prepare various bits in the grep_opt structure (like .fixed,
.regflags, etc.).
However, grep_set_pattern_type_option() helper function the grep API
internally uses were left as an external function by mistake. This
function shouldn't have been made callable by the users of the API.
Later when the grep API was used in revision traversal machinery,
the caller then mistakenly started calling the function around
34a4ae55 (log --grep: use the same helper to set -E/-F options as
"git grep", 2012-10-03), instead of setting the .pattern_type_option
field and letting the grep_commit_pattern_type() to take care of the
details.
This caused an unnecessary bug that made a configured
grep.patternType take precedence over the command line options
(e.g. --basic-regexp, --fixed-strings) in "git log" family of
commands.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log" learns log.showSignature configuration variable, and a
command line option "--no-show-signature" to countermand it.
* mj/log-show-signature-conf:
log: add log.showSignature configuration variable
log: add "--no-show-signature" command line option
t4202: refactor test
If an user creates an alias with "--show-signature" early in command
line, e.g.
[alias] logss = log --show-signature
then there is no way to countermand it through command line.
Teach git-log and related commands about "--no-show-signature" command
line option. This will make "git logss --no-show-signature" run
without showing GPG signature.
Signed-off-by: Mehul Jain <mehul.jain2029@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function takes a pointer to a pathspec structure, and releases
the resources held by it, but does not free() the structure itself.
Such a function should be called "clear", not "free".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move from unsigned char[20] to struct object_id continues.
* bc/object-id:
match-trees: convert several leaf functions to use struct object_id
tree-walk: convert tree_entry_extract() to use struct object_id
struct name_entry: use struct object_id instead of unsigned char sha1[20]
match-trees: convert shift_tree() and shift_tree_by() to use object_id
test-match-trees: convert to use struct object_id
sha1-name: introduce a get_oid() function
When "git log" shows the log message indented by 4-spaces, the
remainder of a line after a HT does not align in the way the author
originally intended. The command now expands tabs by default in
such a case, and allows the users to override it with a new option,
'--no-expand-tabs'.
* lt/pretty-expand-tabs:
pretty: test --expand-tabs
pretty: allow tweaking tabwidth in --expand-tabs
pretty: enable --expand-tabs by default for selected pretty formats
pretty: expand tabs in indented logs to make things line up properly
When the local convention of the project is to use tab width that is
not 8, it may make sense to allow "git log --expand-tabs=<n>" to
tweak the output to match it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log --pretty={medium,full,fuller}" and "git log" by default
prepend 4 spaces to the log message, so it makes sense to enable
the new "expand-tabs" facility by default for these formats.
Add --no-expand-tabs option to override the new default.
The change alone breaks a test in t4201 that runs "git shortlog"
on the output from "git log", and expects that the output from
"git log" does not do such a tab expansion. Adjust the test to
explicitly disable expand-tabs with --no-expand-tabs.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A commit log message sometimes tries to line things up using tabs,
assuming fixed-width font with the standard 8-place tab settings.
Viewing such a commit however does not work well in "git log", as
we indent the lines by prefixing 4 spaces in front of them.
This should all line up:
Column 1 Column 2
-------- --------
A B
ABCD EFGH
SPACES Instead of Tabs
Even with multi-byte UTF8 characters:
Column 1 Column 2
-------- --------
Ä B
åäö 100
A Møøse once bit my sister..
Tab-expand the lines in "git log --expand-tabs" output before
prefixing 4 spaces.
This is based on the patch by Linus Torvalds, but at this step, we
require an explicit command line option to enable the behaviour.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we find a blob at "a/b/c", we currently pass this to
our show_object_fn callbacks as two components: "a/b/" and
"c". Callbacks which want the full value then call
path_name(), which concatenates the two. But this is an
inefficient interface; the path is a strbuf, and we could
simply append "c" to it temporarily, then roll back the
length, without creating a new copy.
So we could improve this by teaching the callsites of
path_name() this trick (and there are only 3). But we can
also notice that no callback actually cares about the
broken-down representation, and simply pass each callback
the full path "a/b/c" as a string. The callback code becomes
even simpler, then, as we do not have to worry about freeing
an allocated buffer, nor rolling back our modification to
the strbuf.
This is theoretically less efficient, as some callbacks
would not bother to format the final path component. But in
practice this is not measurable. Since we use the same
strbuf over and over, our work to grow it is amortized, and
we really only pay to memcpy a few bytes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we left name_path as a thin wrapper
around a strbuf. This patch drops it entirely. As a result,
every show_object_fn callback needs to be adjusted. However,
none of their code needs to be changed at all, because the
only use was to pass it to path_name(), which now handles
the bare strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct name_path" data is examined in only two places:
we generate it in process_tree(), and we convert it to a
single string in path_name(). Everyone else just passes it
through to those functions.
We can further note that process_tree() already keeps a
single strbuf with the leading tree path, for use with
tree_entry_interesting().
Instead of building a separate name_path linked list, let's
just use the one we already build in "base". This reduces
the amount of code (especially tricky code in path_name()
which did not check for integer overflows caused by deep
or large pathnames).
It is also more efficient in some instances. Any time we
were using tree_entry_interesting, we were building up the
strbuf anyway, so this is an immediate and obvious win
there. In cases where we were not, we trade off storing
"pathname/" in a strbuf on the heap for each level of the
path, instead of two pointers and an int on the stack (with
one pointer into the tree object). On a 64-bit system, the
latter is 20 bytes; so if path components are less than that
on average, this has lower peak memory usage. In practice
it probably doesn't matter either way; we are already
holding in memory all of the tree objects leading up to each
pathname, and for normal-depth pathnames, we are only
talking about hundreds of bytes.
This patch leaves "struct name_path" as a thin wrapper
around the strbuf, to avoid disrupting callbacks. We should
fix them, but leaving it out makes this diff easier to view.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git rev-list" shows an object with its associated path
name, it does so by walking the name_path linked list and
printing each component (stopping at any embedded NULs or
newlines).
We'd like to eventually get rid of name_path entirely in
favor of a single buffer, and dropping this custom printing
code is part of that. As a first step, let's use path_name()
to format the list into a single buffer, and print that.
This is strictly less efficient than the original, but it's
a temporary step in the refactoring; our end game will be to
get the fully formatted name in the first place.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update various codepaths to avoid manually-counted malloc().
* jk/tighten-alloc: (22 commits)
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
tree-diff: catch integer overflow in combine_diff_path allocation
...
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we find a blob at "a/b/c", we currently pass this to
our show_object_fn callbacks as two components: "a/b/" and
"c". Callbacks which want the full value then call
path_name(), which concatenates the two. But this is an
inefficient interface; the path is a strbuf, and we could
simply append "c" to it temporarily, then roll back the
length, without creating a new copy.
So we could improve this by teaching the callsites of
path_name() this trick (and there are only 3). But we can
also notice that no callback actually cares about the
broken-down representation, and simply pass each callback
the full path "a/b/c" as a string. The callback code becomes
even simpler, then, as we do not have to worry about freeing
an allocated buffer, nor rolling back our modification to
the strbuf.
This is theoretically less efficient, as some callbacks
would not bother to format the final path component. But in
practice this is not measurable. Since we use the same
strbuf over and over, our work to grow it is amortized, and
we really only pay to memcpy a few bytes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, we left name_path as a thin wrapper
around a strbuf. This patch drops it entirely. As a result,
every show_object_fn callback needs to be adjusted. However,
none of their code needs to be changed at all, because the
only use was to pass it to path_name(), which now handles
the bare strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "struct name_path" data is examined in only two places:
we generate it in process_tree(), and we convert it to a
single string in path_name(). Everyone else just passes it
through to those functions.
We can further note that process_tree() already keeps a
single strbuf with the leading tree path, for use with
tree_entry_interesting().
Instead of building a separate name_path linked list, let's
just use the one we already build in "base". This reduces
the amount of code (especially tricky code in path_name()
which did not check for integer overflows caused by deep
or large pathnames).
It is also more efficient in some instances. Any time we
were using tree_entry_interesting, we were building up the
strbuf anyway, so this is an immediate and obvious win
there. In cases where we were not, we trade off storing
"pathname/" in a strbuf on the heap for each level of the
path, instead of two pointers and an int on the stack (with
one pointer into the tree object). On a 64-bit system, the
latter is 20 bytes; so if path components are less than that
on average, this has lower peak memory usage. In practice
it probably doesn't matter either way; we are already
holding in memory all of the tree objects leading up to each
pathname, and for normal-depth pathnames, we are only
talking about hundreds of bytes.
This patch leaves "struct name_path" as a thin wrapper
around the strbuf, to avoid disrupting callbacks. We should
fix them, but leaving it out makes this diff easier to view.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git rev-list" shows an object with its associated path
name, it does so by walking the name_path linked list and
printing each component (stopping at any embedded NULs or
newlines).
We'd like to eventually get rid of name_path entirely in
favor of a single buffer, and dropping this custom printing
code is part of that. As a first step, let's use path_name()
to format the list into a single buffer, and print that.
This is strictly less efficient than the original, but it's
a temporary step in the refactoring; our end game will be to
get the fully formatted name in the first place.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few options of "git diff" did not work well when the command was
run from a subdirectory.
* nd/diff-with-path-params:
diff: make -O and --output work in subdirectory
diff-no-index: do not take a redundant prefix argument
A few options of "git diff" did not work well when the command was
run from a subdirectory.
* nd/diff-with-path-params:
diff: make -O and --output work in subdirectory
diff-no-index: do not take a redundant prefix argument
Reading with getwholeline() and manually stripping the terminating
'\n' would leave CR at the end of the line if the input comes from
a DOS editor.
Constrasting this with the other changes around "--stdin" in this
series, one may realize that the way "log" family of commands read
the paths with "--stdin" looks inconsistent and sloppy. It does not
allow us to C-quote a textual input, neither does it accept records
that are NUL-terminated. These are unfortunately way too late to
fix X-<.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
History traversal with "git log --source" that starts with an
annotated tag failed to report the tag as "source", due to an
old regression in the command line parser back in v2.2 days.
* jk/pending-keep-tag-name:
revision.c: propagate tag names from pending array
History traversal with "git log --source" that starts with an
annotated tag failed to report the tag as "source", due to an
old regression in the command line parser back in v2.2 days.
* jk/pending-keep-tag-name:
revision.c: propagate tag names from pending array
When we unwrap a tag to find its commit for a traversal, we
do not propagate the "name" field of the tag in the pending
array (i.e., the ref name the user gave us in the first
place) to the commit (instead, we use an empty string). This
means that "git log --source" will never show the tag-name
for commits we reach through it.
This was broken in 2073949 (traverse_commit_list: support
pending blobs/trees with paths, 2014-10-15). That commit
tried to be careful and avoid propagating the path
information for a tag (which would be nonsensical) to trees
and blobs. But it should not have cut off the "name" field,
which should carry forward to children.
Note that this does mean that the "name" field will carry
forward to blobs and trees, too. Whereas prior to 2073949,
we always gave them an empty string. This is the right thing
to do, but in practice no callers probably use it (since now
we have an explicit separate "path" field, which was the
point of 2073949).
We add tests here not only for the broken case, but also a
basic sanity test of "log --source" in general, which did
not have any coverage in the test suite.
Reported-by: Raymundo <gypark@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mark_tree_uninteresting() has code to handle the case where it gets
passed a NULL pointer in its 'tree' parameter, but the function had
'object = &tree->object' assignment before checking if tree is
NULL. This gives a compiler an excuse to declare that tree will
never be NULL and apply a wrong optimization. Avoid it.
* sn/null-pointer-arith-in-mark-tree-uninteresting:
revision.c: fix possible null pointer arithmetic