Currently, list-objects.c incorrectly treats all root trees of commits
as USER_GIVEN. Also, it would be easier to mark objects that are
non-user-given instead of user-given, since the places in the code
where we access an object through a reference are more obvious than
the places where we access an object that was given by the user.
Resolve these two problems by introducing a flag NOT_USER_GIVEN that
marks blobs and trees that are non-user-given, replacing USER_GIVEN.
(Only blobs and trees are marked because this mark is only used when
filtering objects, and filtering of other types of objects is not
supported yet.)
This fixes a bug in that git rev-list behaved differently from git
pack-objects. pack-objects would *not* filter objects given explicitly
on the command line and rev-list would filter. This was because the two
commands used a different function to add objects to the rev_info
struct. This seems to have been an oversight, and pack-objects has the
correct behavior, so I added a test to make sure that rev-list now
behaves properly.
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new variant repo_diff_setup() is added that takes 'struct repository *'
and diff_setup() becomes a thin macro around it that is protected by
NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to NO_THE_INDEX_....
The plan is these macros will always be defined for all library files
and the macros are only accessible in builtin/
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
spatch transformation to replace boolean uses of !hashcmp() to
newly introduced oideq() is added, and applied, to regain
performance lost due to support of multiple hash algorithms.
* jk/cocci:
show_dirstat: simplify same-content check
read-cache: use oideq() in ce_compare functions
convert hashmap comparison functions to oideq()
convert "hashcmp() != 0" to "!hasheq()"
convert "oidcmp() != 0" to "!oideq()"
convert "hashcmp() == 0" to hasheq()
convert "oidcmp() == 0" to oideq()
introduce hasheq() and oideq()
coccinelle: use <...> for function exclusion
The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.
* ds/reachable:
commit-reach: correct accidental #include of C file
commit-reach: use can_all_from_reach
commit-reach: make can_all_from_reach... linear
commit-reach: replace ref_newer logic
test-reach: test commit_contains
test-reach: test can_all_from_reach_with_flags
test-reach: test reduce_heads
test-reach: test get_merge_bases_many
test-reach: test is_descendant_of
test-reach: test in_merge_bases
test-reach: create new test tool for ref_newer
commit-reach: move can_all_from_reach_with_flags
upload-pack: generalize commit date cutoff
upload-pack: refactor ok_to_give_up()
upload-pack: make reachable() more generic
commit-reach: move commit_contains from ref-filter
commit-reach: move ref_newer from remote.c
commit.h: remove method declarations
commit-reach: move walk methods from commit.c
"git rev-list --stdin </dev/null" used to be an error; it now shows
no output without an error. "git rev-list --stdin --default HEAD"
still falls back to the given default when nothing is given on the
standard input.
* jk/rev-list-stdin-noop-is-ok:
rev-list: make empty --stdin not an error
Using the more restrictive oideq() should, in the long run,
give the compiler more opportunities to optimize these
callsites. For now, this conversion should be a complete
noop with respect to the generated code.
The result is also perhaps a little more readable, as it
avoids the "zero is equal" idiom. Since it's so prevalent in
C, I think seasoned programmers tend not to even notice it
anymore, but it can sometimes make for awkward double
negations (e.g., we can drop a few !!oidcmp() instances
here).
This patch was generated almost entirely by the included
coccinelle patch. This mechanical conversion should be
completely safe, because we check explicitly for cases where
oidcmp() is compared to 0, which is what oideq() is doing
under the hood. Note that we don't have to catch "!oidcmp()"
separately; coccinelle's standard isomorphisms make sure the
two are treated equivalently.
I say "almost" because I did hand-edit the coccinelle output
to fix up a few style violations (it mostly keeps the
original formatting, but sometimes unwraps long lines).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we originally did the series that contains 7ba826290a
(revision: add rev_input_given flag, 2017-08-02) the intent
was that "git rev-list --stdin </dev/null" would similarly
become a successful noop. However, an attempt at the time to
do that did not work[1]. The problem is that rev_input_given
serves two roles:
- it tells rev-list.c that it should not error out
- it tells revision.c that it should not have the "default"
ref kick (e.g., "HEAD" in "git log")
We want to trigger the former, but not the latter. This is
technically possible with a single flag, if we set the flag
only after revision.c's revs->def check. But this introduces
a rather subtle ordering dependency.
Instead, let's keep two flags: one to denote when we got
actual input (which triggers both roles) and one for when we
read stdin (which triggers only the first).
This does mean a caller interested in the first role has to
check both flags, but there's only one such caller. And any
future callers might want to make the distinction anyway
(e.g., if they care less about erroring out, and more about
whether revision.c soaked up our stdin).
In fact, we already keep such a flag internally in
revision.c for this purpose, so this is really just exposing
that to the caller (and the old function-local flag can go
away in favor of our new one).
[1] https://public-inbox.org/git/20170802223416.gwiezhbuxbdmbjzx@sigill.intra.peff.net/
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the match_patchspec API and friends take an index_state instead
of assuming the_index in dir.c. All external call sites are converted
blindly to keep the patch simple and retain current behavior.
Individual call sites may receive further updates to use the right
index instead of the_index.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The lazy clone support had a few places where missing but promised
objects were not correctly tolerated, which have been fixed.
* jt/tags-to-promised-blobs-fix:
tag: don't warn if target is missing but promised
revision: tolerate promised targets of tags
Partial clone support of "git clone" has been updated to correctly
validate the objects it receives from the other side. The server
side has been corrected to send objects that are directly
requested, even if they may match the filtering criteria (e.g. when
doing a "lazy blob" partial clone).
* jt/partial-clone-fsck-connectivity:
clone: check connectivity even if clone is partial
upload-pack: send refs' objects despite "filter"
These methods are now declared in commit-reach.h. Remove them from
commit.h and add new include statements in all files that require these
declarations.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-grafts:
commit: allow lookup_commit_graft to handle arbitrary repositories
commit: allow prepare_commit_graft to handle arbitrary repositories
shallow: migrate shallow information into the object parser
path.c: migrate global git_path_* to take a repository argument
cache: convert get_graft_file to handle arbitrary repositories
commit: convert read_graft_file to handle arbitrary repositories
commit: convert register_commit_graft to handle arbitrary repositories
commit: convert commit_graft_pos() to handle arbitrary repositories
shallow: add repository argument to is_repository_shallow
shallow: add repository argument to check_shallow_file_for_update
shallow: add repository argument to register_shallow
shallow: add repository argument to set_alternate_shallow_file
commit: add repository argument to lookup_commit_graft
commit: add repository argument to prepare_commit_graft
commit: add repository argument to read_graft_file
commit: add repository argument to register_commit_graft
commit: add repository argument to commit_graft_pos
object: move grafts to object parser
object-store: move object access functions to object-store.h
In handle_commit(), it is fatal for an annotated tag to point to a
non-existent object. --exclude-promisor-objects should relax this rule
and allow non-existent objects that are promisor objects, but this is
not the case. Update handle_commit() to tolerate this situation.
This was observed when cloning from a repository with an annotated tag
pointing to a blob. The test included in this patch demonstrates this
case.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A filter line in a request to upload-pack filters out objects regardless
of whether they are directly referenced by a "want" line or not. This
means that cloning with "--filter=blob:none" (or another filter that
excludes blobs) from a repository with at least one ref pointing to a
blob (for example, the Git repository itself) results in output like the
following:
error: missing object referenced by 'refs/tags/junio-gpg-pub'
and if that particular blob is not referenced by a fetched tree, the
resulting clone fails fsck because there is no object from the remote to
vouch that the missing object is a promisor object.
Update both the protocol and the upload-pack implementation to include
all explicitly specified "want" objects in the packfile regardless of
the filter specification.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow callers of lookup_commit_reference
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of lookup_tree
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of lookup_blob
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the callers of parse_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The in-core "commit" object had an all-purpose "void *util" field,
which was tricky to use especially in library-ish part of the
code. All of the existing uses of the field has been migrated to a
more dedicated "commit-slab" mechanism and the field is eliminated.
* nd/commit-util-to-slab:
commit.h: delete 'util' field in struct commit
merge: use commit-slab in merge remote desc instead of commit->util
log: use commit-slab in prepare_bases() instead of commit->util
show-branch: note about its object flags usage
show-branch: use commit-slab for commit-name instead of commit->util
name-rev: use commit-slab for rev-name instead of commit->util
bisect.c: use commit-slab for commit weight instead of commit->util
revision.c: use commit-slab for show_source
sequencer.c: use commit-slab to associate todo items to commits
sequencer.c: use commit-slab to mark seen commits
shallow.c: use commit-slab for commit depth instead of commit->util
describe: use commit-slab for commit names instead of commit->util
blame: use commit-slab for blame suspects instead of commit->util
commit-slab: support shared commit-slab
commit-slab.h: code split
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (42 commits)
merge-one-file: compute empty blob object ID
add--interactive: compute the empty tree value
Update shell scripts to compute empty tree object ID
sha1_file: only expose empty object constants through git_hash_algo
dir: use the_hash_algo for empty blob object ID
sequencer: use the_hash_algo for empty tree object ID
cache-tree: use is_empty_tree_oid
sha1_file: convert cached object code to struct object_id
builtin/reset: convert use of EMPTY_TREE_SHA1_BIN
builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
submodule: convert several uses of EMPTY_TREE_SHA1_HEX
sequencer: convert one use of EMPTY_TREE_SHA1_HEX
merge: convert empty tree constant to the_hash_algo
builtin/merge: switch tree functions to use object_id
builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo
sha1-file: add functions for hex empty tree and blob OIDs
builtin/receive-pack: avoid hard-coded constants for push certs
diff: specify abbreviation size in terms of the_hash_algo
upload-pack: replace use of several hard-coded constants
...
Code clean-up to turn history traversal more robust in a
semi-corrupt repository.
* jk/unavailable-can-be-missing:
mark_parents_uninteresting(): avoid most allocation
mark_parents_uninteresting(): replace list with stack
mark_parents_uninteresting(): drop missing object check
mark_tree_contents_uninteresting(): drop missing object check
Developer support update, by using BUG() macro instead of die() to
mark codepaths that should not happen more clearly.
* js/use-bug-macro:
BUG_exit_code: fix sparse "symbol not declared" warning
Convert remaining die*(BUG) messages
Replace all die("BUG: ...") calls by BUG() ones
run-command: use BUG() to report bugs, not die()
test-tool: help verifying BUG() code paths
The code has been taught to use the duplicated information stored
in the commit-graph file to learn the tree object name for a commit
to avoid opening and parsing the commit object when it makes sense
to do so.
* ds/lazy-load-trees:
coccinelle: avoid wrong transformation suggestions from commit.cocci
commit-graph: lazy-load trees for commits
treewide: replace maybe_tree with accessor methods
commit: create get_commit_tree() method
treewide: rename tree to maybe_tree
Instead of relying on commit->util to store the source string, let the
user provide a commit-slab to store the source strings in.
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should make these functions easier to find and cache.h less
overwhelming to read.
In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file
As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise. It would be better to #include wherever
identifiers from the header are used. That can happen later
when we have better tooling for it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 941ba8db57 (Eliminate recursion in setting/clearing
marks in commit list, 2012-01-14) used a clever double-loop
to avoid allocations for single-parent chains of history.
However, it did so only when following parents of parents
(which was an uncommon case), and _always_ incurred at least
one allocation to populate the list of pending parents in
the first place.
We can turn this into zero-allocation in the common case by
iterating directly over the initial parent list, and then
following up on any pending items we might have discovered.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The mark_parents_uninteresting() function uses two loops:
an outer one to process our queue of pending parents, and an
inner one to process first-parent chains. This is a clever
optimization from 941ba8db57 (Eliminate recursion in
setting/clearing marks in commit list, 2012-01-14) to limit
the number of linked-list allocations when following
single-parent chains.
Unfortunately, this makes the result a little hard to read.
Let's replace the list with a stack. Then we don't have to
worry about doing this double-loop optimization, as we'll
just reuse the top element of the stack as we pop/push.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We allow UNINTERESTING objects in a traversal to be
unavailable. As part of this, mark_parents_uninteresting()
checks whether we have a particular uninteresting parent; if
not, we will mark it as "parsed" so that later code skips
it.
This code is redundant and even a little bit harmful, so
let's drop it.
It's redundant because when our parse_object() call in
add_parents_to_list() fails, we already quietly skip
UNINTERESTING parents. This redundancy is a historical
artifact. The mark_parents_uninteresting() protection is
from 454fbbcde3 (git-rev-list: allow missing objects when
the parent is marked UNINTERESTING, 2005-07-10). Much later,
aeeae1b771 (revision traversal: allow UNINTERESTING objects
to be missing, 2009-01-27) covered more cases by making the
actual parse more gentle.
As an aside, even if this weren't redundant, it would be
insufficient. The gentle parsing handles both missing and
corrupted objects, whereas the has_object_file() check
we're getting rid of covers only missing ones.
And the code we're dropping is harmful for two reasons:
1. We spend extra time on the object lookup, even though
we don't actually need the information at this point
(and will just repeat that lookup later when we parse
for the common case that we _do_ have the object).
2. It "lies" about the commit by setting the parsed flag,
even though we didn't load any useful data into the
struct. This shouldn't matter for the UNINTERESTING
case, but we may later clear our flags and do another
traversal in the same process. While pretty unlikely,
it's possible that we could then look at the same
commit without the UNINTERESTING flag, in which case
we'd produce the wrong result (we'd think it's a commit
with no parents, when in fact we should probably die
due to the missing object).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's generally acceptable for UNINTERESTING objects in a
traversal to be unavailable (e.g., see aeeae1b771). When
marking trees UNINTERESTING, we access the object database
twice: once to check if the object is missing (and return
quietly if it is), and then again to actually parse it.
We can instead just try to parse; if that fails, we can then
return quietly. That halves the effort we spend on locating
the object.
Note that this isn't _exactly_ the same as the original
behavior, as the parse failure could be due to other
problems than a missing object: it could be corrupted, in
which case the original code would have died. But the new
behavior is arguably better, as it covers the object being
unavailable for any reason. We'll also still issue a warning
to stderr in such a case.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).
The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.
Let's just convert all remaining ones in one fell swoop.
This trick was performed by this invocation:
sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace two uses of the hard-coded constant 40 with references to
the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert this function to take a pointer to struct object_id and rename
it has_object_pack for consistency with has_object_file.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a repository argument to allow the get_main_ref_store caller
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.
As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of making trees load lazily, create a Coccinelle
script (contrib/coccinelle/commit.cocci) to ensure that all
references to the 'maybe_tree' member of struct commit are either
mutations or accesses through get_commit_tree() or
get_commit_tree_oid().
Apply the Coccinelle script to create the rest of the patch.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the commit-graph file to walk commit history removes the large
cost of parsing commits during the walk. This exposes a performance
issue: lookup_tree() takes a large portion of the computation time,
even when Git never uses those trees.
In anticipation of lazy-loading these trees, rename the 'tree' member
of struct commit to 'maybe_tree'. This serves two purposes: it hints
at the future role of possibly being NULL even if the commit has a
valid tree, and it allows for unambiguous transformation from simple
member access (i.e. commit->maybe_tree) to method access.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In mark_parents_uninteresting(), we check for the existence of an
object file to see if we should treat a commit as parsed. The result
is to set the "parsed" bit on the commit.
Modify the condition to only check has_object_file() if the result
would change the parsed bit.
When a local branch is different from its upstream ref, "git status"
will compute ahead/behind counts. This uses paint_down_to_common()
and hits mark_parents_uninteresting(). On a copy of the Linux repo
with a local instance of "master" behind the remote branch
"origin/master" by ~60,000 commits, we find the performance of
"git status" went from 1.42 seconds to 1.32 seconds, for a relative
difference of -7.0%.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This was an undocumented debugging aid that does not seem to
have come in handy in the past decade, judging from its lack
of mentions on the mailing list.
Let's drop it in the name of simplicity. This is morally a
revert of 3131b71301 (Add "--show-all" revision walker flag
for debugging, 2008-02-09), but note that I did leave in the
mapping of UNINTERESTING to "^" in get_revision_mark(). I
don't think this would be possible to trigger with the
current code, but it's the only sensible marker.
We'll skip the usual deprecation period because this was
explicitly a debugging aid that was never documented.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
API clean-up around revision traversal.
* rs/lose-leak-pending:
commit: remove unused function clear_commit_marks_for_object_array()
revision: remove the unused flag leak_pending
checkout: avoid using the rev_info flag leak_pending
bundle: avoid using the rev_info flag leak_pending
bisect: avoid using the rev_info flag leak_pending
object: add clear_commit_marks_all()
ref-filter: use clear_commit_marks_many() in do_merge_filter()
commit: use clear_commit_marks_many() in remove_redundant()
commit: avoid allocation in clear_commit_marks_many()
The split-index mode had a few corner case bugs fixed.
* tg/split-index-fixes:
travis: run tests with GIT_TEST_SPLIT_INDEX
split-index: don't write cache tree with null oid entries
read-cache: fix reading the shared index for other repos
In preparation for implementing narrow/partial clone, the machinery
for checking object connectivity used by gc and fsck has been
taught that a missing object is OK when it is referenced by a
packfile specially marked as coming from trusted repository that
promises to make them available on-demand and lazily.
* jh/fsck-promisors:
gc: do not repack promisor packfiles
rev-list: support termination at promisor objects
sha1_file: support lazily fetching missing objects
introduce fetch-object: fetch one promisor object
index-pack: refactor writing of .keep files
fsck: support promisor objects as CLI argument
fsck: support referenced promisor objects
fsck: support refs pointing to promisor objects
fsck: introduce partialclone extension
extension.partialclone: introduce partial clone extension
"diff" family of commands learned "--find-object=<object-id>" option
to limit the findings to changes that involve the named object.
* sb/diff-blobfind-pickaxe:
diff: use HAS_MULTI_BITS instead of counting bits manually
diff: properly error out when combining multiple pickaxe options
diffcore: add a pickaxe option to find a specific blob
diff: introduce DIFF_PICKAXE_KINDS_MASK
diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit
diff.h: make pickaxe_opts an unsigned bit field
API clean-up around revision traversal.
* rs/lose-leak-pending:
commit: remove unused function clear_commit_marks_for_object_array()
revision: remove the unused flag leak_pending
checkout: avoid using the rev_info flag leak_pending
bundle: avoid using the rev_info flag leak_pending
bisect: avoid using the rev_info flag leak_pending
object: add clear_commit_marks_all()
ref-filter: use clear_commit_marks_many() in do_merge_filter()
commit: use clear_commit_marks_many() in remove_redundant()
commit: avoid allocation in clear_commit_marks_many()
read_index_from() takes a path argument for the location of the index
file. For reading the shared index in split index mode however it just
ignores that path argument, and reads it from the gitdir of the current
repository.
This works as long as an index in the_repository is read. Once that
changes, such as when we read the index of a submodule, or of a
different working tree than the current one, the gitdir of
the_repository will no longer contain the appropriate shared index,
and git will fail to read it.
For example t3007-ls-files-recurse-submodules.sh was broken with
GIT_TEST_SPLIT_INDEX set in 188dce131f ("ls-files: use repository
object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also
broken in a similar manner, probably by introducing struct repository
there, although I didn't track down the exact commit for that.
be489d02d2 ("revision.c: --indexed-objects add objects from all
worktrees", 2017-08-23) breaks with split index mode in a similar
manner, not erroring out when it can't read the index, but instead
carrying on with pruning, without taking the index of the worktree into
account.
Fix this by passing an additional gitdir parameter to read_index_from,
to indicate where it should look for and read the shared index from.
read_cache_from() defaults to using the gitdir of the_repository. As it
is mostly a convenience macro, having to pass get_git_dir() for every
call seems overkill, and if necessary users can have more control by
using read_index_from().
Helped-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes users are given a hash of an object and they want to
identify it further (ex.: Use verify-pack to find the largest blobs,
but what are these? or [1])
One might be tempted to extend git-describe to also work with blobs,
such that `git describe <blob-id>` gives a description as
'<commit-ish>:<path>'. This was implemented at [2]; as seen by the sheer
number of responses (>110), it turns out this is tricky to get right.
The hard part to get right is picking the correct 'commit-ish' as that
could be the commit that (re-)introduced the blob or the blob that
removed the blob; the blob could exist in different branches.
Junio hinted at a different approach of solving this problem, which this
patch implements. Teach the diff machinery another flag for restricting
the information to what is shown. For example:
$ ./git log --oneline --find-object=v2.0.0:Makefile
b2feb64309 Revert the whole "ask curl-config" topic for now
47fbfded53 i18n: only extract comments marked with "TRANSLATORS:"
we observe that the Makefile as shipped with 2.0 was appeared in
v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b. The
reason why these commits both occur prior to v2.0.0 are evil
merges that are not found using this new mechanism.
[1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob
[2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently the check whether to perform pickaxing is done via checking
`diffopt->pickaxe`, which contains the command line argument that we
want to pickaxe for. Soon we'll introduce a new type of pickaxing, that
will not store anything in the `.pickaxe` field, so let's migrate the
check to be dependent on pickaxe_opts.
It is not enough to just replace the check for pickaxe by pickaxe_opts,
because flags might be set, but pickaxing was not requested ('-i').
To cope with that, introduce a mask to check only for the bits indicating
the modes of operation.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently flags for pickaxing are found in different places. Unify the
flags into the `pickaxe_opts` field, which will contain any pickaxe related
flags.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git describe" was taught to dig trees deeper to find a
<commit-ish>:<path> that refers to a given blob object.
* sb/describe-blob:
builtin/describe.c: describe a blob
builtin/describe.c: factor out describe_commit
builtin/describe.c: print debug statements earlier
builtin/describe.c: rename `oid` to avoid variable shadowing
revision.h: introduce blob/tree walking in order of the commits
list-objects.c: factor out traverse_trees_and_blobs
t6120: fix typo in test name
Teach rev-list to support termination of an object traversal at any
object from a promisor remote (whether one that the local repo also has,
or one that the local repo knows about because it has another promisor
object that references it).
This will be used subsequently in gc and in the connectivity check used
by fetch.
For efficiency, if an object is referenced by a promisor object, and is
in the local repo only as a non-promisor object, object traversal will
not stop there. This is to avoid building the list of promisor object
references.
(In list-objects.c, the case where obj is NULL in process_blob() and
process_tree() do not need to be changed because those happen only when
there is a conflict between the expected type and the existing object.
If the object doesn't exist, an object will be synthesized, which is
fine.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `log --decorate` is used, git will decorate commits with all
available refs. While in most cases this may give the desired effect,
under some conditions it can lead to excessively verbose output.
Introduce two command line options, `--decorate-refs=<pattern>` and
`--decorate-refs-exclude=<pattern>` to allow the user to select which
refs are used in decoration.
When "--decorate-refs=<pattern>" is given, only the refs that match the
pattern are used in decoration. The refs that match the pattern when
"--decorate-refs-exclude=<pattern>" is given, are never used in
decoration.
These options follow the same convention for mixing negative and
positive patterns across the system, assuming that the inclusive default
is to match all refs available.
(1) if there is no positive pattern given, pretend as if an
inclusive default positive pattern was given;
(2) for each candidate, reject it if it matches no positive
pattern, or if it matches any one of the negative patterns.
The rules for what is considered a match are slightly different from the
rules used elsewhere.
Commands like `log --glob` assume a trailing '/*' when glob chars are
not present in the pattern. This makes it difficult to specify a single
ref. On the other hand, commands like `describe --match --all` allow
specifying exact refs, but do not have the convenience of allowing
"shorthand refs" like 'refs/heads' or 'heads' to refer to
'refs/heads/*'.
The commands introduced in this patch consider a match if:
(a) the pattern contains globs chars,
and regular pattern matching returns a match.
(b) the pattern does not contain glob chars,
and ref '<pattern>' exists, or if ref exists under '<pattern>/'
This allows both behaviours (allowing single refs and shorthand refs)
yet remaining compatible with existent commands.
Helped-by: Kevin Daudt <me@ikke.info>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rafael Ascensão <rafa.almas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functionality to list tree objects in the order they were seen
while traversing the commits will be used in one of the next commits,
where we teach `git describe` to describe not only commits, but blobs, too.
The change in list-objects.c is rather minimal as we'll be re-using
the infrastructure put in place of the revision walking machinery. For
example one could expect that add_pending_tree is not called, but rather
commit->tree is directly passed to the tree traversal function. This
however requires a lot more code than just emptying the queue containing
trees after each commit.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A single-word "unsigned flags" in the diff options is being split
into a structure with many bitfields.
* bw/diff-opt-impl-to-bitfields:
diff: make struct diff_flags members lowercase
diff: remove DIFF_OPT_CLR macro
diff: remove DIFF_OPT_SET macro
diff: remove DIFF_OPT_TST macro
diff: remove touched flags
diff: add flag to indicate textconv was set via cmdline
diff: convert flags to be stored in bitfields
add, reset: use DIFF_OPT_SET macro to set a diff flag
Pathspec-limited revision traversal was taught not to keep finding
unneeded differences once it knows two trees are different inside
given pathspec.
* jk/revision-pruning-optim:
revision: quit pruning diff more quickly when possible
Remove the `DIFF_OPT_CLR` macro and instead set the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_CLR(&E, fld)
+ E.flags.fld = 0
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_CLR(ptr, fld)
+ ptr->flags.fld = 0
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `DIFF_OPT_SET` macro and instead set the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_SET(&E, fld)
+ E.flags.fld = 1
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_SET(ptr, fld)
+ ptr->flags.fld = 1
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `DIFF_OPT_TST` macro and instead access the flags directly.
This conversion is done using the following semantic patch:
@@
expression E;
identifier fld;
@@
- DIFF_OPT_TST(&E, fld)
+ E.flags.fld
@@
type T;
T *ptr;
identifier fld;
@@
- DIFF_OPT_TST(ptr, fld)
+ ptr->flags.fld
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the revision traversal machinery is given a pathspec,
we must compute the parent-diff for each commit to determine
which ones are TREESAME. We set the QUICK diff flag to avoid
looking at more entries than we need; we really just care
whether there are any changes at all.
But there is one case where we want to know a bit more: if
--remove-empty is set, we care about finding cases where the
change consists only of added entries (in which case we may
prune the parent in try_to_simplify_commit()). To cover that
case, our file_add_remove() callback does not quit the diff
upon seeing an added entry; it keeps looking for other types
of entries.
But this means when --remove-empty is not set (and it is not
by default), we compute more of the diff than is necessary.
You can see this in a pathological case where a commit adds
a very large number of entries, and we limit based on a
broad pathspec. E.g.:
perl -e '
chomp(my $blob = `git hash-object -w --stdin </dev/null`);
for my $a (1..1000) {
for my $b (1..1000) {
print "100644 $blob\t$a/$b\n";
}
}
' | git update-index --index-info
git commit -qm add
git rev-list HEAD -- .
This case takes about 100ms now, but after this patch only
needs 6ms. That's not a huge improvement, but it's easy to
get and it protects us against even more pathological cases
(e.g., going from 1 million to 10 million files would take
ten times as long with the current code, but not increase at
all after this patch).
This is reported to minorly speed-up pathspec limiting in
real world repositories (like the 100-million-file Windows
repository), but probably won't make a noticeable difference
outside of pathological setups.
This patch actually covers the case without --remove-empty,
and the case where we see only deletions. See the in-code
comment for details.
Note that we have to add a new member to the diff_options
struct so that our callback can see the value of
revs->remove_empty_trees. This callback parameter could be
passed to the "add_remove" and "change" callbacks, but
there's not much point. They already receive the
diff_options struct, and doing it this way avoids having to
update the function signature of the other callbacks
(arguably the format_callback and output_prefix functions
could benefit from the same simplification).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* rs/resolve-ref-optional-result:
refs: pass NULL to resolve_ref_unsafe() if hash is not needed
refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed
refs: make sha1 output parameter of refs_resolve_ref_unsafe() optional
This allows us to get rid of some write-only variables, among them seven
SHA1 buffers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We assemble an array of strings in a custom struct,
NULL-terminate the result, and then pass it to
parse_pathspec().
But then we never free the array or the individual strings
(nor can we do the latter, as they are heap-allocated when
they come from stdin but not when they come from the
passed-in argv).
Let's swap this out for an argv_array. It does the same
thing with fewer lines of code, and it's safe to call
argv_array_clear() at the end to avoid a memory leak.
Reported-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git gc" and friends when multiple worktrees are used off of a
single repository did not consider the index and per-worktree refs
of other worktrees as the root for reachability traversal, making
objects that are in use only in other worktrees to be subject to
garbage collection.
* nd/prune-in-worktree:
refs.c: reindent get_submodule_ref_store()
refs.c: remove fallback-to-main-store code get_submodule_ref_store()
rev-list: expose and document --single-worktree
revision.c: --reflog add HEAD reflog from all worktrees
files-backend: make reflog iterator go through per-worktree reflog
revision.c: --all adds HEAD from all worktrees
refs: remove dead for_each_*_submodule()
refs.c: move for_each_remote_ref_submodule() to submodule.c
revision.c: use refs_for_each*() instead of for_each_*_submodule()
refs: add refs_head_ref()
refs: move submodule slash stripping code to get_submodule_ref_store
refs.c: refactor get_submodule_ref_store(), share common free block
revision.c: --indexed-objects add objects from all worktrees
revision.c: refactor add_index_objects_to_pending()
refs.c: use is_dir_sep() in resolve_gitlink_ref()
revision.h: new flag in struct rev_info wrt. worktree-related refs
"git log --tag=no-such-tag" showed log starting from HEAD, which
has been fixed---it now shows nothing.
* jk/rev-list-empty-input:
revision: do not fallback to default when rev_input_given is set
rev-list: don't show usage when we see empty ref patterns
revision: add rev_input_given flag
t6018: flesh out empty input/output rev-list tests
Note that add_other_reflogs_to_pending() is a bit inefficient, since
it scans reflog for all refs of each worktree, including shared refs,
so the shared ref's reflog is scanned over and over again.
We could update refs API to pass "per-worktree only" flag to avoid
that. But long term we should be able to obtain a "per-worktree only"
ref store and would need to revert the changes in reflog iteration
API. So let's just wait until then.
add_reflogs_to_pending() is called by reachable.c so by default "git
prune" will examine reflog from all worktrees.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless single_worktree is set, --all now adds HEAD from all worktrees.
Since reachable.c code does not use setup_revisions(), we need to call
other_head_refs_submodule() explicitly there to have the same effect on
"git prune", so that we won't accidentally delete objects needed by some
other HEADs.
A new FIXME is added because we would need something like
int refs_other_head_refs(struct ref_store *, each_ref_fn, cb_data);
in addition to other_head_refs() to handle it, which might require
int get_submodule_worktrees(const char *submodule, int flags);
It could be a separate topic to reduce the scope of this one.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the result of single_worktree flag never being set (no way to up
until now). To get objects from current index only, set single_worktree.
The other add_index_objects_to_pending's caller is mark_reachable_objects()
(e.g. "git prune") which also mark objects from all indexes.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core code is factored out and take 'struct index_state *' instead so
that we can reuse it to add objects from index files other than .git/index
in the next patch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Numerous bugs in walking of reflogs via "log -g" and friends have
been fixed.
* jk/reflog-walk:
reflog-walk: apply --since/--until to reflog dates
reflog-walk: stop using fake parents
rev-list: check reflog_info before showing usage
get_revision_1(): replace do-while with an early return
log: do not free parents when walking reflog
log: clarify comment about reflog cycles
revision: disallow reflog walking with revs->limited
t1414: document some reflog-walk oddities
"git log --tag=no-such-tag" showed log starting from HEAD, which
has been fixed---it now shows nothing.
* jk/rev-list-empty-input:
revision: do not fallback to default when rev_input_given is set
rev-list: don't show usage when we see empty ref patterns
revision: add rev_input_given flag
t6018: flesh out empty input/output rev-list tests
Numerous bugs in walking of reflogs via "log -g" and friends have
been fixed.
* jk/reflog-walk:
reflog-walk: apply --since/--until to reflog dates
reflog-walk: stop using fake parents
rev-list: check reflog_info before showing usage
get_revision_1(): replace do-while with an early return
log: do not free parents when walking reflog
log: clarify comment about reflog cycles
revision: disallow reflog walking with revs->limited
t1414: document some reflog-walk oddities
Conversion from uchar[20] to struct object_id continues.
* bc/object-id:
sha1_name: convert uses of 40 to GIT_SHA1_HEXSZ
sha1_name: convert GET_SHA1* flags to GET_OID*
sha1_name: convert get_sha1* to get_oid*
Convert remaining callers of get_sha1 to get_oid.
builtin/unpack-file: convert to struct object_id
bisect: convert bisect_checkout to struct object_id
builtin/update_ref: convert to struct object_id
sequencer: convert to struct object_id
remote: convert struct push_cas to struct object_id
submodule: convert submodule config lookup to use object_id
builtin/merge-tree: convert remaining caller of get_sha1 to object_id
builtin/fsck: convert remaining caller of get_sha1 to object_id
If revs->def is set (as it is in "git log") and there are no
pending objects after parsing the user's input, then we show
whatever is in "def". But if the user _did_ ask for some
input that just happened to be empty (e.g., "--glob" that
does not match anything), showing the default revision is
confusing. We should just show nothing, as that is what the
user's request yielded.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally a caller that invokes setup_revisions() has to
check rev.pending to see if anything was actually queued for
the traversal. But they can't tell the difference between
two cases:
1. The user gave us no tip from which to start a
traversal.
2. The user tried to give us tips via --glob, --all, etc,
but their patterns ended up being empty.
Let's set a flag in the rev_info struct that callers can use
to tell the difference. We can set this from the
init_all_refs_cb() function. That's a little funny because
it's not exactly about initializing the "cb" struct itself.
But that function is the common setup place for doing
pattern traversals that is used by --glob, --all, etc.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the flags for get_oid_with_context and friends to use "OID"
instead of "SHA1" in their names.
This transform was made by running the following one-liner on the
affected files:
perl -pi -e 's/GET_SHA1/GET_OID/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanup.
* ab/grep-lose-opt-regflags:
grep: remove redundant REG_NEWLINE when compiling fixed regex
grep: remove regflags from the public grep_opt API
grep: remove redundant and verbose re-assignments to 0
grep: remove redundant "fixed" field re-assignment to 0
grep: adjust a redundant grep pattern type assignment
grep: remove redundant double assignment to 0
When doing a reflog walk, we use the commit's date to
do any date limiting. In earlier versions of Git, this could
lead to nonsense results, since a skipped commit would
truncate the traversal. So a sequence like:
git commit ...
git checkout week-old-branch
git checkout -
git log -g --since=1.day.ago
would stop at the week-old-branch, even though the "git
commit" entry further back is still interesting.
As of the prior commit, which uses a parent-less traversal
of the reflog, you get the whole reflog minus any commits
whose dates do not match the specified options. This is
arguably useful, as you could scan the reflogs for commits
that originated in a certain range.
But more likely a user doing a reflog walk wants to limit
based on the reflog entries themselves. You can simulate
--until with:
git log -g @{1.day.ago}
but there's no way to ask Git to traverse only back to a
certain date. E.g.:
# show me reflog entries from the past day
git log -g --since=1.day.ago
This patch teaches the revision machinery to prefer the
reflog entry dates to the commit dates when doing a reflog
walk. Technically this is a change in behavior that affects
plumbing, but the previous behavior was so buggy that it's
unlikely anyone was relying on it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog-walk system works by putting a ref's tip into the
pending queue, and then "traversing" the reflog by
pretending that the parent of each commit is the previous
reflog entry.
This causes a number of user-visible oddities, as documented
in t1414 (and the commit message which introduced it). We
can fix all of them in one go by replacing the fake-reflog
system with a much simpler one: just keeping a list of
reflogs to show, and walking through them entry by entry.
The implementation is fairly straight-forward, but there are
a few items to note:
1. We obviously must skip calling add_parents_to_list()
when we are traversing reflogs, since we do not want to
walk the original parents at all. As a result, we must call
try_to_simplify_commit() ourselves.
There are other parts of add_parents_to_list() we skip,
as well, but none of them should matter for a reflog
traversal:
- We do not allow UNINTERESTING commits, nor
symmetric ranges (and we bail when these are used
with "-g").
- Using --source makes no sense, since we aren't
traversing. The reflog selector shows the same
information with more detail.
- Using --first-parent is still sensible, since you
may want to see the first-parent diff for each
entry. But since we're not traversing, we don't
need to cull the parent list here.
2. Since we now just walk the reflog entries themselves,
rather than starting with the ref tip, we now look at
the "new" field of each entry rather than the "old"
(i.e., we are showing entries, not faking parents).
This removes all of the tricky logic around skipping
past root commits.
But note that we have no way to show an entry with the
null sha1 in its "new" field (because such a commit
obviously does not exist). Normally this would not
happen, since we delete reflogs along with refs, but
there is one special case. When we rename the currently
checked out branch, we write two reflog entries into
the HEAD log: one where the commit goes away, and
another where it comes back.
Prior to this commit, we show both entries with
identical reflog messages. After this commit, we show
only the "comes back" entry. See the update in t3200
which demonstrates this.
Arguably either is fine, as the whole double-entry
thing is a bit hacky in the first place. And until a
recent fix, we truncated the traversal in such a case
anyway, which was _definitely_ wrong.
3. We show individual reflogs in order, but choose which
reflog to show at each stage based on which has the
most recent timestamp. This interleaves the output
from multiple reflogs based on date order, which is
probably what you'd want with limiting like "-n 30".
Note that the implementation aims for simplicity. It
does a linear walk over the reflog queue for each
commit it pulls, which may perform badly if you
interleave an enormous number of reflogs. That seems
like an unlikely use case; if we did want to handle it,
we could probably keep a priority queue of reflogs,
ordered by the timestamp of their current tip entry.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_revision_1() function tries to avoid entering its
main loop at all when there are no commits to look at. But
it's perfectly safe to call pop_commit() on an empty list
(in which case it will return NULL). Switching to an early
return from the loop lets us skip repeating the loop
condition before we enter the do-while. That will get more
important when we start pulling reflog-walk commits from a
source besides the revs->commits queue, as that condition
will get much more complicated.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog-walk code doesn't work with limit_list(). That
function traverses down the real history graph, not the fake
reflog history that get_revision() returns. So it's not
going to actually examine all of the commits we're going to
show, because we'd add them to the pending list only during
the actual traversal.
In practice this limitation doesn't really matter, because
the options that require list-limiting generally need
UNINTERESTING endpoints or symmetric ranges, which already
are forbidden for reflog walks. Still, there are likely some
corner cases that would behave oddly. We're better off to
warn the user that we can't fulfill their request than to
generate potentially wrong output.
This will also make it easier to refactor the reflog-walking
code, because it eliminates a whole area of corner cases
we'd have to consider (that already don't work anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor calls to the grep machinery to always pass opt.ignore_case &
opt.extended_regexp_option instead of setting the equivalent regflags
bits.
The bug fixed when making -i work with -P in commit 9e3cbc59d5 ("log:
make --regexp-ignore-case work with --perl-regexp", 2017-05-20) was
really just plastering over the code smell which this change fixes.
The reason for adding the extensive commentary here is that I
discovered some subtle complexity in implementing this that really
should be called out explicitly to future readers.
Before this change we'd rely on the difference between
`extended_regexp_option` and `regflags` to serve as a membrane between
our preliminary parsing of grep.extendedRegexp and grep.patternType,
and what we decided to do internally.
Now that those two are the same thing, it's necessary to unset
`extended_regexp_option` just before we commit in cases where both of
those config variables are set. See 84befcd0a4 ("grep: add a
grep.patternType configuration setting", 2012-08-03) for the code and
documentation related to that.
The explanation of why the if/else branches in
grep_commit_pattern_type() are ordered the way they are exists in that
commit message, but I think it's worth calling this subtlety out
explicitly with a comment for future readers.
Even though grep_commit_pattern_type() is the only caller of
grep_set_pattern_type_option() it's simpler to reset the
extended_regexp_option flag in the latter, since 2/3 branches in the
former would otherwise need to reset it, this way we can do it in one
place.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bugfix for a topic that is (only) in 'master'.
* mh/packed-ref-store-prep:
for_each_bisect_ref(): don't trim refnames
lock_packed_refs(): fix cache validity check
Remove the unused wildopts placeholder struct from being passed to all
wildmatch() invocations, or rather remove all the boilerplate NULL
parameters.
This parameter was added back in commit 9b3497cab9 ("wildmatch: rename
constants and update prototype", 2013-01-01) as a placeholder for
future use. Over 4 years later nothing has made use of it, let's just
remove it. It can be added in the future if we find some reason to
start using such a parameter.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* sg/revision-parser-skip-prefix:
revision.c: use skip_prefix() in handle_revision_pseudo_opt()
revision.c: use skip_prefix() in handle_revision_opt()
revision.c: stricter parsing of '--early-output'
revision.c: stricter parsing of '--no-{min,max}-parents'
revision.h: turn rev_info.early_output back into an unsigned int