Commit Graph

635 Commits

Author SHA1 Message Date
Jeff King
7fa3c2ad6d revision: replace "struct cmdline_pathspec" with argv_array
We assemble an array of strings in a custom struct,
NULL-terminate the result, and then pass it to
parse_pathspec().

But then we never free the array or the individual strings
(nor can we do the latter, as they are heap-allocated when
they come from stdin but not when they come from the
passed-in argv).

Let's swap this out for an argv_array. It does the same
thing with fewer lines of code, and it's safe to call
argv_array_clear() at the end to avoid a memory leak.

Reported-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-21 13:09:46 +09:00
Junio C Hamano
8a044c7f1d Merge branch 'nd/prune-in-worktree'
"git gc" and friends when multiple worktrees are used off of a
single repository did not consider the index and per-worktree refs
of other worktrees as the root for reachability traversal, making
objects that are in use only in other worktrees to be subject to
garbage collection.

* nd/prune-in-worktree:
  refs.c: reindent get_submodule_ref_store()
  refs.c: remove fallback-to-main-store code get_submodule_ref_store()
  rev-list: expose and document --single-worktree
  revision.c: --reflog add HEAD reflog from all worktrees
  files-backend: make reflog iterator go through per-worktree reflog
  revision.c: --all adds HEAD from all worktrees
  refs: remove dead for_each_*_submodule()
  refs.c: move for_each_remote_ref_submodule() to submodule.c
  revision.c: use refs_for_each*() instead of for_each_*_submodule()
  refs: add refs_head_ref()
  refs: move submodule slash stripping code to get_submodule_ref_store
  refs.c: refactor get_submodule_ref_store(), share common free block
  revision.c: --indexed-objects add objects from all worktrees
  revision.c: refactor add_index_objects_to_pending()
  refs.c: use is_dir_sep() in resolve_gitlink_ref()
  revision.h: new flag in struct rev_info wrt. worktree-related refs
2017-09-19 10:47:53 +09:00
Nguyễn Thái Ngọc Duy
32619f99f9 rev-list: expose and document --single-worktree
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 14:58:50 -07:00
Nguyễn Thái Ngọc Duy
acd9544a8f revision.c: --reflog add HEAD reflog from all worktrees
Note that add_other_reflogs_to_pending() is a bit inefficient, since
it scans reflog for all refs of each worktree, including shared refs,
so the shared ref's reflog is scanned over and over again.

We could update refs API to pass "per-worktree only" flag to avoid
that. But long term we should be able to obtain a "per-worktree only"
ref store and would need to revert the changes in reflog iteration
API. So let's just wait until then.

add_reflogs_to_pending() is called by reachable.c so by default "git
prune" will examine reflog from all worktrees.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 14:58:47 -07:00
Nguyễn Thái Ngọc Duy
d0c39a49cc revision.c: --all adds HEAD from all worktrees
Unless single_worktree is set, --all now adds HEAD from all worktrees.

Since reachable.c code does not use setup_revisions(), we need to call
other_head_refs_submodule() explicitly there to have the same effect on
"git prune", so that we won't accidentally delete objects needed by some
other HEADs.

A new FIXME is added because we would need something like

    int refs_other_head_refs(struct ref_store *, each_ref_fn, cb_data);

in addition to other_head_refs() to handle it, which might require

    int get_submodule_worktrees(const char *submodule, int flags);

It could be a separate topic to reduce the scope of this one.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 14:56:43 -07:00
Nguyễn Thái Ngọc Duy
073cf63c52 revision.c: use refs_for_each*() instead of for_each_*_submodule()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 14:52:33 -07:00
Nguyễn Thái Ngọc Duy
be489d02d2 revision.c: --indexed-objects add objects from all worktrees
This is the result of single_worktree flag never being set (no way to up
until now). To get objects from current index only, set single_worktree.

The other add_index_objects_to_pending's caller is mark_reachable_objects()
(e.g. "git prune") which also mark objects from all indexes.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 14:44:41 -07:00
Nguyễn Thái Ngọc Duy
6c3d818154 revision.c: refactor add_index_objects_to_pending()
The core code is factored out and take 'struct index_state *' instead so
that we can reuse it to add objects from index files other than .git/index
in the next patch.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24 14:42:45 -07:00
Jonathan Tan
150e3001d0 pack: move has_sha1_pack()
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:07 -07:00
Junio C Hamano
8fbaf0b13b Merge branch 'jk/rev-list-empty-input'
"git log --tag=no-such-tag" showed log starting from HEAD, which
has been fixed---it now shows nothing.

* jk/rev-list-empty-input:
  revision: do not fallback to default when rev_input_given is set
  rev-list: don't show usage when we see empty ref patterns
  revision: add rev_input_given flag
  t6018: flesh out empty input/output rev-list tests
2017-08-11 13:27:07 -07:00
Junio C Hamano
3ab01ac3f7 Merge branch 'jk/reflog-walk'
Numerous bugs in walking of reflogs via "log -g" and friends have
been fixed.

* jk/reflog-walk:
  reflog-walk: apply --since/--until to reflog dates
  reflog-walk: stop using fake parents
  rev-list: check reflog_info before showing usage
  get_revision_1(): replace do-while with an early return
  log: do not free parents when walking reflog
  log: clarify comment about reflog cycles
  revision: disallow reflog walking with revs->limited
  t1414: document some reflog-walk oddities
2017-08-11 13:27:01 -07:00
Junio C Hamano
df422678a8 Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id:
  sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ
  sha1_name: convert GET_SHA1* flags to GET_OID*
  sha1_name: convert get_sha1* to get_oid*
  Convert remaining callers of get_sha1 to get_oid.
  builtin/unpack-file: convert to struct object_id
  bisect: convert bisect_checkout to struct object_id
  builtin/update_ref: convert to struct object_id
  sequencer: convert to struct object_id
  remote: convert struct push_cas to struct object_id
  submodule: convert submodule config lookup to use object_id
  builtin/merge-tree: convert remaining caller of get_sha1 to object_id
  builtin/fsck: convert remaining caller of get_sha1 to object_id
2017-08-11 13:26:55 -07:00
Jeff King
5d34d1ac06 revision: do not fallback to default when rev_input_given is set
If revs->def is set (as it is in "git log") and there are no
pending objects after parsing the user's input, then we show
whatever is in "def". But if the user _did_ ask for some
input that just happened to be empty (e.g., "--glob" that
does not match anything), showing the default revision is
confusing. We should just show nothing, as that is what the
user's request yielded.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-02 15:45:22 -07:00
Jeff King
7ba826290a revision: add rev_input_given flag
Normally a caller that invokes setup_revisions() has to
check rev.pending to see if anything was actually queued for
the traversal. But they can't tell the difference between
two cases:

  1. The user gave us no tip from which to start a
     traversal.

  2. The user tried to give us tips via --glob, --all, etc,
     but their patterns ended up being empty.

Let's set a flag in the rev_info struct that callers can use
to tell the difference.  We can set this from the
init_all_refs_cb() function.  That's a little funny because
it's not exactly about initializing the "cb" struct itself.
But that function is the common setup place for doing
pattern traversals that is used by --glob, --all, etc.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-02 15:45:20 -07:00
brian m. carlson
321c89bf5f sha1_name: convert GET_SHA1* flags to GET_OID*
Convert the flags for get_oid_with_context and friends to use "OID"
instead of "SHA1" in their names.

This transform was made by running the following one-liner on the
affected files:

  perl -pi -e 's/GET_SHA1/GET_OID/g'

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 13:54:51 -07:00
brian m. carlson
e82caf384b sha1_name: convert get_sha1* to get_oid*
Now that all the callers of get_sha1 directly or indirectly use struct
object_id, rename the functions starting with get_sha1 to start with
get_oid.  Convert the internals in sha1_name.c to use struct object_id
as well, and eliminate explicit length checks where possible.  Convert a
use of 40 in get_oid_basic to GIT_SHA1_HEXSZ.

Outside of sha1_name.c and cache.h, this transition was made with the
following semantic patch:

@@
expression E1, E2;
@@
- get_sha1(E1, E2.hash)
+ get_oid(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1(E1, E2->hash)
+ get_oid(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_committish(E1, E2.hash)
+ get_oid_committish(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_committish(E1, E2->hash)
+ get_oid_committish(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_treeish(E1, E2.hash)
+ get_oid_treeish(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_treeish(E1, E2->hash)
+ get_oid_treeish(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_commit(E1, E2.hash)
+ get_oid_commit(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_commit(E1, E2->hash)
+ get_oid_commit(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_tree(E1, E2.hash)
+ get_oid_tree(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_tree(E1, E2->hash)
+ get_oid_tree(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_blob(E1, E2.hash)
+ get_oid_blob(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_blob(E1, E2->hash)
+ get_oid_blob(E1, E2)

@@
expression E1, E2, E3, E4;
@@
- get_sha1_with_context(E1, E2, E3.hash, E4)
+ get_oid_with_context(E1, E2, &E3, E4)

@@
expression E1, E2, E3, E4;
@@
- get_sha1_with_context(E1, E2, E3->hash, E4)
+ get_oid_with_context(E1, E2, E3, E4)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 13:54:51 -07:00
Junio C Hamano
eac97b438c Merge branch 'ab/grep-lose-opt-regflags'
Code cleanup.

* ab/grep-lose-opt-regflags:
  grep: remove redundant REG_NEWLINE when compiling fixed regex
  grep: remove regflags from the public grep_opt API
  grep: remove redundant and verbose re-assignments to 0
  grep: remove redundant "fixed" field re-assignment to 0
  grep: adjust a redundant grep pattern type assignment
  grep: remove redundant double assignment to 0
2017-07-13 16:14:54 -07:00
Junio C Hamano
0c6435a4d6 Merge branch 'ab/wildmatch'
Minor code cleanup.

* ab/wildmatch:
  wildmatch: remove unused wildopts parameter
2017-07-10 13:42:51 -07:00
Jeff King
de239446b6 reflog-walk: apply --since/--until to reflog dates
When doing a reflog walk, we use the commit's date to
do any date limiting. In earlier versions of Git, this could
lead to nonsense results, since a skipped commit would
truncate the traversal. So a sequence like:

  git commit ...
  git checkout week-old-branch
  git checkout -
  git log -g --since=1.day.ago

would stop at the week-old-branch, even though the "git
commit" entry further back is still interesting.

As of the prior commit, which uses a parent-less traversal
of the reflog, you get the whole reflog minus any commits
whose dates do not match the specified options. This is
arguably useful, as you could scan the reflogs for commits
that originated in a certain range.

But more likely a user doing a reflog walk wants to limit
based on the reflog entries themselves. You can simulate
--until with:

  git log -g @{1.day.ago}

but there's no way to ask Git to traverse only back to a
certain date. E.g.:

  # show me reflog entries from the past day
  git log -g --since=1.day.ago

This patch teaches the revision machinery to prefer the
reflog entry dates to the commit dates when doing a reflog
walk. Technically this is a change in behavior that affects
plumbing, but the previous behavior was so buggy that it's
unlikely anyone was relying on it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-09 10:00:48 -07:00
Jeff King
d08565bf2d reflog-walk: stop using fake parents
The reflog-walk system works by putting a ref's tip into the
pending queue, and then "traversing" the reflog by
pretending that the parent of each commit is the previous
reflog entry.

This causes a number of user-visible oddities, as documented
in t1414 (and the commit message which introduced it). We
can fix all of them in one go by replacing the fake-reflog
system with a much simpler one: just keeping a list of
reflogs to show, and walking through them entry by entry.

The implementation is fairly straight-forward, but there are
a few items to note:

  1. We obviously must skip calling add_parents_to_list()
     when we are traversing reflogs, since we do not want to
     walk the original parents at all.  As a result, we must call
     try_to_simplify_commit() ourselves.

     There are other parts of add_parents_to_list() we skip,
     as well, but none of them should matter for a reflog
     traversal:

       -  We do not allow UNINTERESTING commits, nor
          symmetric ranges (and we bail when these are used
          with "-g").

       - Using --source makes no sense, since we aren't
         traversing. The reflog selector shows the same
         information with more detail.

       - Using --first-parent is still sensible, since you
         may want to see the first-parent diff for each
         entry. But since we're not traversing, we don't
         need to cull the parent list here.

  2. Since we now just walk the reflog entries themselves,
     rather than starting with the ref tip, we now look at
     the "new" field of each entry rather than the "old"
     (i.e., we are showing entries, not faking parents).
     This removes all of the tricky logic around skipping
     past root commits.

     But note that we have no way to show an entry with the
     null sha1 in its "new" field (because such a commit
     obviously does not exist). Normally this would not
     happen, since we delete reflogs along with refs, but
     there is one special case. When we rename the currently
     checked out branch, we write two reflog entries into
     the HEAD log: one where the commit goes away, and
     another where it comes back.

     Prior to this commit, we show both entries with
     identical reflog messages. After this commit, we show
     only the "comes back" entry. See the update in t3200
     which demonstrates this.

     Arguably either is fine, as the whole double-entry
     thing is a bit hacky in the first place. And until a
     recent fix, we truncated the traversal in such a case
     anyway, which was _definitely_ wrong.

  3. We show individual reflogs in order, but choose which
     reflog to show at each stage based on which has the
     most recent timestamp.  This interleaves the output
     from multiple reflogs based on date order, which is
     probably what you'd want with limiting like "-n 30".

     Note that the implementation aims for simplicity. It
     does a linear walk over the reflog queue for each
     commit it pulls, which may perform badly if you
     interleave an enormous number of reflogs. That seems
     like an unlikely use case; if we did want to handle it,
     we could probably keep a priority queue of reflogs,
     ordered by the timestamp of their current tip entry.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-09 10:00:48 -07:00
Jeff King
7c2f08aa7a get_revision_1(): replace do-while with an early return
The get_revision_1() function tries to avoid entering its
main loop at all when there are no commits to look at. But
it's perfectly safe to call pop_commit() on an empty list
(in which case it will return NULL). Switching to an early
return from the loop lets us skip repeating the loop
condition before we enter the do-while. That will get more
important when we start pulling reflog-walk commits from a
source besides the revs->commits queue, as that condition
will get much more complicated.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-09 10:00:48 -07:00
Jeff King
82fd0f4a4b revision: disallow reflog walking with revs->limited
The reflog-walk code doesn't work with limit_list().  That
function traverses down the real history graph, not the fake
reflog history that get_revision() returns. So it's not
going to actually examine all of the commits we're going to
show, because we'd add them to the pending list only during
the actual traversal.

In practice this limitation doesn't really matter, because
the options that require list-limiting generally need
UNINTERESTING endpoints or symmetric ranges, which already
are forbidden for reflog walks. Still, there are likely some
corner cases that would behave oddly. We're better off to
warn the user that we can't fulfill their request than to
generate potentially wrong output.

This will also make it easier to refactor the reflog-walking
code, because it eliminates a whole area of corner cases
we'd have to consider (that already don't work anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-07 10:04:53 -07:00
Ævar Arnfjörð Bjarmason
07a3d41173 grep: remove regflags from the public grep_opt API
Refactor calls to the grep machinery to always pass opt.ignore_case &
opt.extended_regexp_option instead of setting the equivalent regflags
bits.

The bug fixed when making -i work with -P in commit 9e3cbc59d5 ("log:
make --regexp-ignore-case work with --perl-regexp", 2017-05-20) was
really just plastering over the code smell which this change fixes.

The reason for adding the extensive commentary here is that I
discovered some subtle complexity in implementing this that really
should be called out explicitly to future readers.

Before this change we'd rely on the difference between
`extended_regexp_option` and `regflags` to serve as a membrane between
our preliminary parsing of grep.extendedRegexp and grep.patternType,
and what we decided to do internally.

Now that those two are the same thing, it's necessary to unset
`extended_regexp_option` just before we commit in cases where both of
those config variables are set. See 84befcd0a4 ("grep: add a
grep.patternType configuration setting", 2012-08-03) for the code and
documentation related to that.

The explanation of why the if/else branches in
grep_commit_pattern_type() are ordered the way they are exists in that
commit message, but I think it's worth calling this subtlety out
explicitly with a comment for future readers.

Even though grep_commit_pattern_type() is the only caller of
grep_set_pattern_type_option() it's simpler to reset the
extended_regexp_option flag in the latter, since 2/3 branches in the
former would otherwise need to reset it, this way we can do it in one
place.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 10:06:24 -07:00
Junio C Hamano
5c83d850d0 Merge branch 'mh/packed-ref-store-prep'
Bugfix for a topic that is (only) in 'master'.

* mh/packed-ref-store-prep:
  for_each_bisect_ref(): don't trim refnames
  lock_packed_refs(): fix cache validity check
2017-06-26 14:09:29 -07:00
Ævar Arnfjörð Bjarmason
55d3426929 wildmatch: remove unused wildopts parameter
Remove the unused wildopts placeholder struct from being passed to all
wildmatch() invocations, or rather remove all the boilerplate NULL
parameters.

This parameter was added back in commit 9b3497cab9 ("wildmatch: rename
constants and update prototype", 2013-01-01) as a placeholder for
future use. Over 4 years later nothing has made use of it, let's just
remove it. It can be added in the future if we find some reason to
start using such a parameter.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 18:27:07 -07:00
Junio C Hamano
e77d58a94f Merge branch 'sg/revision-parser-skip-prefix'
Code clean-up.

* sg/revision-parser-skip-prefix:
  revision.c: use skip_prefix() in handle_revision_pseudo_opt()
  revision.c: use skip_prefix() in handle_revision_opt()
  revision.c: stricter parsing of '--early-output'
  revision.c: stricter parsing of '--no-{min,max}-parents'
  revision.h: turn rev_info.early_output back into an unsigned int
2017-06-22 14:15:23 -07:00
Junio C Hamano
a6f38c109b Merge branch 'bw/object-id'
Conversion from uchar[20] to struct object_id continues.

* bw/object-id: (33 commits)
  diff: rename diff_fill_sha1_info to diff_fill_oid_info
  diffcore-rename: use is_empty_blob_oid
  tree-diff: convert path_appendnew to object_id
  tree-diff: convert diff_tree_paths to struct object_id
  tree-diff: convert try_to_follow_renames to struct object_id
  builtin/diff-tree: cleanup references to sha1
  diff-tree: convert diff_tree_sha1 to struct object_id
  notes-merge: convert write_note_to_worktree to struct object_id
  notes-merge: convert verify_notes_filepair to struct object_id
  notes-merge: convert find_notes_merge_pair_ps to struct object_id
  notes-merge: convert merge_from_diffs to struct object_id
  notes-merge: convert notes_merge* to struct object_id
  tree-diff: convert diff_root_tree_sha1 to struct object_id
  combine-diff: convert find_paths_* to struct object_id
  combine-diff: convert diff_tree_combined to struct object_id
  diff: convert diff_flush_patch_id to struct object_id
  patch-ids: convert to struct object_id
  diff: finish conversion for prepare_temp_file to struct object_id
  diff: convert reuse_worktree_file to struct object_id
  diff: convert fill_filespec to struct object_id
  ...
2017-06-19 12:38:44 -07:00
Junio C Hamano
ae7e4d4fed Merge branch 'ab/pcre-v2'
Update "perl-compatible regular expression" support to enable JIT
and also allow linking with the newer PCRE v2 library.

* ab/pcre-v2:
  grep: add support for PCRE v2
  grep: un-break building with PCRE >= 8.32 without --enable-jit
  grep: un-break building with PCRE < 8.20
  grep: un-break building with PCRE < 8.32
  grep: add support for the PCRE v1 JIT API
  log: add -P as a synonym for --perl-regexp
  grep: skip pthreads overhead when using one thread
  grep: don't redundantly compile throwaway patterns under threading
2017-06-19 12:38:43 -07:00
Michael Haggerty
03df567fbf for_each_bisect_ref(): don't trim refnames
`for_each_bisect_ref()` is called by `for_each_bad_bisect_ref()` with
a term "bad". This used to make it call `for_each_ref_in_submodule()`
with a prefix "refs/bisect/bad". But the latter is the name of the
reference that is being sought, so the empty string was being passed
to the callback as the trimmed refname. Moreover, this questionable
practice was turned into an error by

    b9c8e7f2fb prefix_ref_iterator: don't trim too much, 2017-05-22

It makes more sense (and agrees better with the documentation of
`--bisect`) for the callers to receive the full reference names. So

* Add a new function, `for_each_fullref_in_submodule()`, to the refs
  API. This plugs a gap in the existing functionality, analogous to
  `for_each_fullref_in()` but accepting a `submodule` argument.

* Change `for_each_bad_bisect_ref()` to call the new function rather
  than `for_each_ref_in_submodule()`.

* Add a test.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-18 22:13:42 -07:00
SZEDER Gábor
8b1d9136e1 revision.c: use skip_prefix() in handle_revision_pseudo_opt()
Instead of starts_with() and a bunch of magic numbers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-12 13:39:52 -07:00
SZEDER Gábor
479b3d9785 revision.c: use skip_prefix() in handle_revision_opt()
Instead of starts_with() and a bunch of magic numbers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-12 13:39:49 -07:00
SZEDER Gábor
dffc651ed1 revision.c: stricter parsing of '--early-output'
The parsing of '--early-output' with or without its optional integer
argument allowed bogus options like '--early-output-foobarbaz' to slip
through and be ignored.

Fix it by parsing '--early-output' in the same way as other options
with an optional argument are parsed.  Furthermore, use strtoul_ui()
to parse the optional integer argument and to refuse negative numbers.

While at it, use skip_prefix() instead of starts_with() and magic
numbers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-12 13:39:46 -07:00
SZEDER Gábor
9ada7aee19 revision.c: stricter parsing of '--no-{min,max}-parents'
These two options are parsed using starts_with(), allowing things like
'git log --no-min-parents-foobarbaz' to succeed.

Use strcmp() instead.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-12 13:39:43 -07:00
Brandon Williams
66f414f885 diff-tree: convert diff_tree_sha1 to struct object_id
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-05 11:23:58 +09:00
Junio C Hamano
36dcb57337 Merge branch 'ab/grep-preparatory-cleanup'
The internal implementation of "git grep" has seen some clean-up.

* ab/grep-preparatory-cleanup: (31 commits)
  grep: assert that threading is enabled when calling grep_{lock,unlock}
  grep: given --threads with NO_PTHREADS=YesPlease, warn
  pack-objects: fix buggy warning about threads
  pack-objects & index-pack: add test for --threads warning
  test-lib: add a PTHREADS prerequisite
  grep: move is_fixed() earlier to avoid forward declaration
  grep: change internal *pcre* variable & function names to be *pcre1*
  grep: change the internal PCRE macro names to be PCRE1
  grep: factor test for \0 in grep patterns into a function
  grep: remove redundant regflags assignments
  grep: catch a missing enum in switch statement
  perf: add a comparison test of log --grep regex engines with -F
  perf: add a comparison test of log --grep regex engines
  perf: add a comparison test of grep regex engines with -F
  perf: add a comparison test of grep regex engines
  perf: emit progress output when unpacking & building
  perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
  grep: add tests to fix blind spots with \0 patterns
  grep: prepare for testing binary regexes containing rx metacharacters
  grep: add a test helper function for less verbose -f \0 tests
  ...
2017-06-02 15:06:06 +09:00
Junio C Hamano
7ef0d04738 Merge branch 'jk/diff-blob'
The result from "git diff" that compares two blobs, e.g. "git diff
$commit1:$path $commit2:$path", used to be shown with the full
object name as given on the command line, but it is more natural to
use the $path in the output and use it to look up .gitattributes.

* jk/diff-blob:
  diff: use blob path for blob/file diffs
  diff: use pending "path" if it is available
  diff: use the word "path" instead of "name" for blobs
  diff: pass whole pending entry in blobinfo
  handle_revision_arg: record paths for pending objects
  handle_revision_arg: record modes for "a..b" endpoints
  t4063: add tests of direct blob diffs
  get_sha1_with_context: dynamically allocate oc->path
  get_sha1_with_context: always initialize oc->symlink_path
  sha1_name: consistently refer to object_context as "oc"
  handle_revision_arg: add handle_dotdot() helper
  handle_revision_arg: hoist ".." check out of range parsing
  handle_revision_arg: stop using "dotdot" as a generic pointer
  handle_revision_arg: simplify commit reference lookups
  handle_revision_arg: reset "dotdot" consistently
2017-06-02 15:06:05 +09:00
Brandon Williams
94a0097a41 diff: convert diff_change to struct object_id
Convert diff_change to take a struct object_id.  In addition convert the
function pointer type 'change_fn_t' to also take a struct object_id.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:07 +09:00
Brandon Williams
c26022ea8f diff: convert diff_addremove to struct object_id
Convert diff_addremove to take a struct object_id.  In addtion convert
the function pointer type 'add_remove_fn_t' to also take a struct
object_id.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:07 +09:00
brian m. carlson
fb61e4d3ab notes: convert format_display_notes to struct object_id
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:06 +09:00
Junio C Hamano
b42b41b75a Merge branch 'jk/ignore-broken-tags-when-ignoring-missing-links'
Tag objects, which are not reachable from any ref, that point at
missing objects were mishandled by "git gc" and friends (they
should silently be ignored instead)

* jk/ignore-broken-tags-when-ignoring-missing-links:
  revision.c: ignore broken tags with ignore_missing_links
2017-05-29 12:34:54 +09:00
Junio C Hamano
6b526ced6f Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (53 commits)
  object: convert parse_object* to take struct object_id
  tree: convert parse_tree_indirect to struct object_id
  sequencer: convert do_recursive_merge to struct object_id
  diff-lib: convert do_diff_cache to struct object_id
  builtin/ls-tree: convert to struct object_id
  merge: convert checkout_fast_forward to struct object_id
  sequencer: convert fast_forward_to to struct object_id
  builtin/ls-files: convert overlay_tree_on_cache to object_id
  builtin/read-tree: convert to struct object_id
  sha1_name: convert internals of peel_onion to object_id
  upload-pack: convert remaining parse_object callers to object_id
  revision: convert remaining parse_object callers to object_id
  revision: rename add_pending_sha1 to add_pending_oid
  http-push: convert process_ls_object and descendants to object_id
  refs/files-backend: convert many internals to struct object_id
  refs: convert struct ref_update to use struct object_id
  ref-filter: convert some static functions to struct object_id
  Convert struct ref_array_item to struct object_id
  Convert the verify_pack callback to struct object_id
  Convert lookup_tag to struct object_id
  ...
2017-05-29 12:34:43 +09:00
Ævar Arnfjörð Bjarmason
7531a2dd87 log: add -P as a synonym for --perl-regexp
Add a short -P option as a synonym for the longer --perl-regexp, for
consistency with the options the corresponding grep invocations
accept.

This was intentionally omitted in commit 727b6fc3ed ("log --grep:
accept --basic-regexp and --perl-regexp", 2012-10-03) for unspecified
future use.

Make it consistent with "grep" rather than to keep it open for future
use, and to avoid the confusion of -P meaning different things for
grep & log, as is the case with the -G option.

As noted in the aforementioned commit the --basic-regexp option can't
have a corresponding -G argument, as the log command already uses that
for -G<regex>.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:59:05 +09:00
Jeff King
18f1ad7639 handle_revision_arg: record paths for pending objects
If the revision parser sees an argument like tree:path, we
parse it down to the correct blob (or tree), but throw away
the "path" portion. Let's ask get_sha1_with_context() to
record it, and pass it along in the pending array.

This will let programs like git-diff which rely on the
revision-parser show more accurate paths.

Note that the implementation is a little tricky; we have to
make sure we free oc.path in all code paths. For handle_dotdot(),
we can piggy-back on the existing cleanup-wrapper pattern.
The real work happens in handle_dotdot_1(), but the
handle_dotdot() wrapper makes sure that the path is freed no
matter how we exit the function (and for that reason we make
sure that the object_context struct is zero'd, so if we fail
to even get to the get_sha1_with_context() call, we just end
up calling free(NULL)).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 10:59:27 +09:00
Jeff King
101dd4de16 handle_revision_arg: record modes for "a..b" endpoints
The "a..b" revision syntax was designed to handle commits,
so it doesn't bother to record any mode we find while
traversing a "tree:path" endpoint. These days "git diff" can
diff blobs using either "a:path..b:path" (with dots) or
"a:path b:path" (without), but the two behave
inconsistently, as the with-dots version fails to notice the
mode.

Let's teach the dot-dot range parser to record modes; it
doesn't cost us anything, and it makes this case work.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 10:59:27 +09:00
Jeff King
62faad5aa5 handle_revision_arg: add handle_dotdot() helper
The handle_revision_arg function is rather long, and a big
chunk of it is handling the range operators. Let's pull that
out to a separate helper. While we're doing so, we can clean
up a few of the rough edges that made the flow hard to
follow:

  - instead of manually restoring *dotdot (that we overwrote
    with a NUL), do the real work in a sub-helper, which
    makes it clear that the munge/restore lines are a
    matched pair

  - eliminate a goto which wasn't actually used for control
    flow, but only to avoid duplicating a few lines
    (instead, those lines are pushed into another helper
    function)

  - use early returns instead of deep nesting

  - consistently name all variables for the left-hand side
    of the range as "a" (rather than "this" or "from") and
    the right-hand side as "b" (rather than "next", or using
    the unadorned "sha1" or "flags" from the main function).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 10:59:27 +09:00
Jeff King
d89797feff handle_revision_arg: hoist ".." check out of range parsing
Since 003c84f6d (specifying ranges: we did not mean to make
".." an empty set, 2011-05-02), we treat the argument ".."
specially. We detect it by noticing that both sides of the
range are empty, and that this is a non-symmetric two-dot
range.

While correct, this makes the code overly complicated. We
can just detect ".." up front before we try to do further
parsing. This avoids having to de-munge the NUL from dotdot,
and lets us eliminate an extra const array (which we needed
only to do direct pointer comparisons).

It also removes the one code path from the range-parsing
conditional that requires us to return -1. That will make it
simpler to pull the dotdot parsing out into its own
function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 10:59:27 +09:00
Jeff King
f632dedd8d handle_revision_arg: stop using "dotdot" as a generic pointer
The handle_revision_arg() function has a "dotdot" variable
that it uses to find a ".." or "..." in the argument. If we
don't find one, we look for other marks, like "^!". But we
just keep re-using the "dotdot" variable, which is
confusing.

Let's introduce a separate "mark" variable that can be used
for these other marks. They still reuse the same variable,
but at least the name is no longer actively misleading.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 10:59:27 +09:00
Jeff King
1d6c93817b handle_revision_arg: simplify commit reference lookups
The "dotdot" range parser avoids calling
lookup_commit_reference() if we are directly fed two
commits. But its casts are unnecessarily complex; that
function will just return a commit we pass into it.

Just calling the function all the time is much simpler, and
doesn't do any significant extra work (the object is already
parsed, and deref_tag() on a non-tag is a noop; we do incur
one extra lookup_object() call, but that's fairly trivial).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 10:59:27 +09:00
Jeff King
ed79b2cf03 handle_revision_arg: reset "dotdot" consistently
When we are parsing a range like "a..b", we write a
temporary NUL over the first ".", so that we can access the
names "a" and "b" as C strings. But our restoration of the
original "." is done at inconsistent times, which can lead
to confusing results.

For most calls, we restore the "." after we resolve the
names, but before we call verify_non_filename().  This means
that when we later call add_pending_object(), the name for
the left-hand "a" has been re-expanded to "a..b". You can
see this with:

  git log --source a...b

where "b" will be correctly marked with "b", but "a" will be
marked with "a...b". Likewise with "a..b" (though you need
to use --boundary to even see "a" at all in that case).

To top off the confusion, when the REVARG_CANNOT_BE_FILENAME
flag is set, we skip the non-filename check, and leave the
NUL in place.

That means we do report the correct name for "a" in the
pending array. But some code paths try to show the whole
"a...b" name in error messages, and these erroneously show
only "a" instead of "a...b". E.g.:

  $ git cherry-pick HEAD:foo...HEAD:foo
  error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit
  error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit
  fatal: Invalid symmetric difference expression HEAD:foo

(That last message should be "HEAD:foo...HEAD:foo"; I used
cherry-pick because it passes the CANNOT_BE_FILENAME flag).

As an interesting side note, cherry-pick actually looks at
and re-resolves the arguments from the pending->name fields.
So it would have been visibly broken by the first bug, but
the effect was canceled out by the second one.

This patch makes the whole function consistent by re-writing
the NUL immediately after calling verify_non_filename(), and
then restoring the "." as appropriate in some error-printing
and early-return code paths.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 10:59:03 +09:00
Junio C Hamano
ca7b2ab07d Merge branch 'bc/object-id'
* bc/object-id: (53 commits)
  object: convert parse_object* to take struct object_id
  tree: convert parse_tree_indirect to struct object_id
  sequencer: convert do_recursive_merge to struct object_id
  diff-lib: convert do_diff_cache to struct object_id
  builtin/ls-tree: convert to struct object_id
  merge: convert checkout_fast_forward to struct object_id
  sequencer: convert fast_forward_to to struct object_id
  builtin/ls-files: convert overlay_tree_on_cache to object_id
  builtin/read-tree: convert to struct object_id
  sha1_name: convert internals of peel_onion to object_id
  upload-pack: convert remaining parse_object callers to object_id
  revision: convert remaining parse_object callers to object_id
  revision: rename add_pending_sha1 to add_pending_oid
  http-push: convert process_ls_object and descendants to object_id
  refs/files-backend: convert many internals to struct object_id
  refs: convert struct ref_update to use struct object_id
  ref-filter: convert some static functions to struct object_id
  Convert struct ref_array_item to struct object_id
  Convert the verify_pack callback to struct object_id
  Convert lookup_tag to struct object_id
  ...
2017-05-23 14:29:19 +09:00