Reimplement commit 4b7f53da on top of the new simplify-merges
infrastructure, tightening the condition to only consider root parents;
the original version incorrectly dropped parents that were TREESAME to
anything.
Original log message follows.
The merge simplification rule stated in 6546b59 (revision traversal:
show full history with merge simplification, 2008-07-31) still
treated merge commits too specially. Namely, in a history with this
shape:
---o---o---M
/
x---x---x
where three 'x' were on a history completely unrelated to the main
history 'o' and do not touch any of the paths we are following, we
still said that after simplifying all of the parents of M, 'x'
(which is the leftmost 'x' that rightmost 'x simplifies down to) and
'o' (which would be the last commit on the main history that touches
the paths we are following) are independent from each other, and
both need to be kept.
That is incorrect; when the side branch 'x' never touches the paths,
it should be removed to allow M to simplify down to the last commit
on the main history that touches the paths.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When simplifying an odd merge, such as one that used "-s ours", we may
find ourselves TREESAME to apparently redundant parents. Prevent
simplify_merges() from removing every TREESAME parent; if this would
happen reinstate the first TREESAME parent - the one that the default
log would have followed.
This avoids producing a totally disjoint history from the default log
when the default log is a better explanation of the end result, and aids
visualisation of odd merges.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
History simplification previously always treated merges as TREESAME
if they were TREESAME to any parent.
While this was consistent with the default behaviour, this could be
extremely unhelpful when searching detailed history, and could not be
overridden. For example, if a merge had ignored a change, as if by "-s
ours", then:
git log -m -p --full-history -Schange file
would successfully locate "change"'s addition but would not locate the
merge that resolved against it.
Futher, simplify_merges could drop the actual parent that a commit
was TREESAME to, leaving it as a normal commit marked TREESAME that
isn't actually TREESAME to its remaining parent.
Now redefine a commit's TREESAME flag to be true only if a commit is
TREESAME to _all_ of its parents. This doesn't affect either the default
simplify_history behaviour (because partially TREESAME merges are turned
into normal commits), or full-history with parent rewriting (because all
merges are output). But it does affect other modes. The clearest
difference is that --full-history will show more merges - sufficient to
ensure that -m -p --full-history log searches can really explain every
change to the file, including those changes' ultimate fate in merges.
Also modify simplify_merges to recalculate TREESAME after removing
a parent. This is achieved by storing per-parent TREESAME flags on the
initial scan, so the combined flag can be easily recomputed.
This fixes some t6111 failures, but creates a couple of new ones -
we are now showing some merges that don't need to be shown.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation assures users that "A...B" is defined as "A B --not
$(git merge-base --all A B)". This wasn't in fact quite true, because
the calculated merge bases were not sent to add_rev_cmdline().
The main effect of this was that although
git rev-list --ancestry-path A B --not $(git merge-base --all A B)
worked, the simpler form
git rev-list --ancestry-path A...B
failed with a "no bottom commits" error.
Other potential users of bottom commits could also be affected by this
problem, if they examine revs->cmdline_info; I came across the issue in
my proposed history traversal refinements series.
So ensure that the calculated merge bases are sent to add_rev_cmdline(),
flagged with new 'whence' enum value REV_CMD_MERGE_BASE.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pretty-printing body of the commit that is stored in non UTF-8
encoding did not work well. The early part of this series fixes
it. And then it adds %C(auto) specifier that turns the coloring on
when we are emitting to the terminal, and adds column-aligning
format directives.
* nd/pretty-formats:
pretty: support %>> that steal trailing spaces
pretty: support truncating in %>, %< and %><
pretty: support padding placeholders, %< %> and %><
pretty: add %C(auto) for auto-coloring
pretty: split color parsing into a separate function
pretty: two phase conversion for non utf-8 commits
utf8.c: add reencode_string_len() that can handle NULs in string
utf8.c: add utf8_strnwidth() with the ability to skip ansi sequences
utf8.c: move display_mode_esc_sequence_len() for use by other functions
pretty: share code between format_decoration and show_decorations
pretty-formats.txt: wrap long lines
pretty: get the correct encoding for --pretty:format=%e
pretty: save commit encoding from logmsg_reencode if the caller needs it
A fix to a long-standing issue in the command line parser for
revisions, which was triggered by mv/sequence-pick-error-diag topic.
* tr/copy-revisions-from-stdin:
read_revisions_from_stdin: make copies for handle_revision_arg
The commit encoding is parsed by logmsg_reencode, there's no need for
the caller to re-parse it again. The reencoded message now has the new
encoding, not the original one. The caller would need to read commit
object again before parsing.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_revisions_from_stdin() has passed pointers to its read buffer
down to handle_revision_arg() since its inception way back in 42cabc3
(Teach rev-list an option to read revs from the standard input.,
2006-09-05). Even back then, this was a bug: through
add_pending_object, the argument was recorded in the object_array's
'name' field.
Fix it by making a copy whenever read_revisions_from_stdin() passes an
argument down the callchain. The other caller runs handle_revision_arg()
on argv[], where it would be redundant to make a copy.
Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Kevin Bracey reports that the change regresses a case shown in the
user manual.
Trading one fix with another breakage is not worth it. Just keep
the test to document the existing breakage, and revert the change
for now.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.8.1:
Start preparing for 1.8.1.6
git-tag(1): we tag HEAD by default
Fix revision walk for commits with the same dates
t2003: work around path mangling issue on Windows
pack-refs: add fully-peeled trait
pack-refs: write peeled entry for non-tags
use parse_object_or_die instead of die("bad object")
avoid segfaults on parse_object failure
entry: fix filter lookup
t2003: modernize style
name-hash.c: fix endless loop with core.ignorecase=true
Allow the revision "slop" code to look deeper while commits with
exactly the same timestamps come next to each other (which can
often happen after a large "am" and "rebase" session).
* kk/revwalk-slop-too-many-commit-within-a-second:
Fix revision walk for commits with the same dates
The --simplify-merges logic did not cull irrelevant parents from a
merge that is otherwise not interesting with respect to the paths
we are following.
This touches a fairly core part of the revision traversal
infrastructure; even though I think this change is correct, please
report immediately if you find any unintended side effect.
* jc/remove-treesame-parent-in-simplify-merges:
simplify-merges: drop merge from irrelevant side branch
This is a rewrite of much of Bo's work, mainly in an effort to split
it into smaller, easier to understand routines.
The algorithm is built around the struct range_set, which encodes a
series of line ranges as intervals [a,b). This is used in two
contexts:
* A set of lines we are tracking (which will change as we dig through
history).
* To encode diffs, as pairs of ranges.
The main routine is range_set_map_across_diff(). It processes the
diff between a commit C and some parent P. It determines which diff
hunks are relevant to the ranges tracked in C, and computes the new
ranges for P.
The algorithm is then simply to process history in topological order
from newest to oldest, computing ranges and (partial) diffs. At
branch points, we need to merge the ranges we are watching. We will
find that many commits do not affect the chosen ranges, and mark them
TREESAME (in addition to those already filtered by pathspec limiting).
Another pass of history simplification then gets rid of such commits.
This is wired as an extra filtering pass in the log machinery. This
currently only reduces code duplication, but should allow for other
simplifications and options to be used.
Finally, we hook a diff printer into the output chain. Ideally we
would wire directly into the diff logic, to optionally use features
like word diff. However, that will require some major reworking of
the diff chain, so we completely replace the output with our own diff
for now.
As this was a GSoC project, and has quite some history by now, many
people have helped. In no particular order, thanks go to
Jakub Narebski <jnareb@gmail.com>
Jens Lehmann <Jens.Lehmann@web.de>
Jonathan Nieder <jrnieder@gmail.com>
Junio C Hamano <gitster@pobox.com>
Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Will Palmer <wmpalmer@gmail.com>
Apologies to everyone I forgot.
Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function rewrite_one is used to rewrite a single
parent of the current commit, and is used by rewrite_parents
to rewrite all the parents.
Decouple the dependence between them by making rewrite_one
a callback function that is passed to rewrite_parents. Then
export rewrite_parents for reuse by the line history browser.
We will use this function in line-log.c.
Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Logic in still_interesting function allows to stop the commits
traversing if the oldest processed commit is not older then the
youngest commit on the list to process and the list contains only
commits marked as not interesting ones. It can be premature when dealing
with a set of coequal commits. For example git rev-list A^! --not B
provides wrong answer if all commits in the range A..B had the same
commit time and there are more then 7 of them.
To fix this problem the relevant part of the logic in still_interesting
is changed to: the walk can be stopped if the oldest processed commit is
younger then the youngest commit on the list to processed.
Signed-off-by: Kacper Kornet <draenog@pld-linux.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you run "git log --grep=foo", we will run your regex on
the literal bytes of the commit message. This can provide
confusing results if the commit message is not in the same
encoding as your grep expression (or worse, you have commits
in multiple encodings, in which case your regex would need
to be written to match either encoding). On top of this, we
might also be grepping in the commit's notes, which are
already re-encoded, potentially leading to grepping in a
buffer with mixed encodings concatenated. This is insanity,
but most people never noticed, because their terminal and
their commit encodings all match.
Instead, let's massage the to-be-grepped commit into a
standardized encoding. There is not much point in adding a
flag for "this is the encoding I expect my grep pattern to
match"; the only sane choice is for it to use the log output
encoding. That is presumably what the user's terminal is
using, and it means that the patterns found by the grep will
match the output produced by git.
As a bonus, this fixes a potential segfault in commit_match
when commit->buffer is NULL, as we now build on logmsg_reencode,
which handles reading the commit buffer from disk if
necessary. The segfault can be triggered with:
git commit -m 'text1' --allow-empty
git commit -m 'text2' --allow-empty
git log --graph --no-walk --grep 'text2'
which arguably does not make any sense (--graph inherently
wants a connected history, and by --no-walk the command line
is telling us to show discrete points in history without
connectivity), and we probably should forbid the
combination, but that is a separate issue.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The merge simplification rule stated in 6546b59 (revision traversal:
show full history with merge simplification, 2008-07-31) still
treated merge commits too specially. Namely, in a history with this
shape:
---o---o---M
/
x---x---x
where three 'x' were on a history completely unrelated to the main
history 'o' and do not touch any of the paths we are following, we
still said that after simplifying all of the parents of M, 'x'
(which is the leftmost 'x' that rightmost 'x simplifies down to) and
'o' (which would be the last commit on the main history that touches
the paths we are following) are independent from each other, and
both need to be kept.
That is incorrect; when the side branch 'x' never touches the paths,
it should be removed to allow M to simplify down to the last commit
on the main history that touches the paths.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we taught the commit_match() mechanism to pay attention to the
new --use-mailmap option, we started to unconditionally copy the
commit object to a temporary buffer, just in case we need the author
and committer lines updated via the mailmap mechanism, and rewrite
author and committer using the mailmap.
It turns out that this has a rather unpleasant performance
implications. In the linux kernel repository, running
$ git log --author='Junio C Hamano' --pretty=short >/dev/null
under /usr/bin/time, with and without --use-mailmap (the .mailmap
file is 118 entries long, the particular author does not appear in
it), cost (with warm cache):
[without --use-mailmap]
5.42user 0.26system 0:05.70elapsed 99%CPU (0avgtext+0avgdata 2005936maxresident)k
0inputs+0outputs (0major+137669minor)pagefaults 0swaps
[with --use-mailmap]
6.47user 0.30system 0:06.78elapsed 99%CPU (0avgtext+0avgdata 2006288maxresident)k
0inputs+0outputs (0major+137692minor)pagefaults 0swaps
which incurs about 20% overhead. The command is doing extra work,
so the extra cost may be justified.
But it is inexcusable to pay the cost when we do not need
author/committer match. In the same repository,
$ git log --grep='fix menuconfig on debian lenny' --pretty=short >/dev/null
shows very similar numbers as the above:
[without --use-mailmap]
5.32user 0.30system 0:05.63elapsed 99%CPU (0avgtext+0avgdata 2005984maxresident)k
0inputs+0outputs (0major+137672minor)pagefaults 0swaps
[with --use-mailmap]
6.64user 0.24system 0:06.89elapsed 99%CPU (0avgtext+0avgdata 2006320maxresident)k
0inputs+0outputs (0major+137694minor)pagefaults 0swaps
The latter case is an unnecessary performance regression. We may
want to _show_ the result with mailmap applied, but we do not have
to copy and rewrite the author/committer of all commits we try to
match if we do not query for these fields.
Trivially optimize this performace regression by limiting the
rewrites for only when we are matching with author/committer fields.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently you can use mailmap to display log authors and committers
but you can't use the mailmap to find commits with mapped values.
This commit allows you to run:
git log --use-mailmap --author mapped_name_or_email
git log --use-mailmap --committer mapped_name_or_email
Of course it only works if the --use-mailmap option is used.
The new name and email are copied only when necessary.
Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Emit the notes attached to the commit in "format-patch --notes"
output after three-dashes.
* jc/prettier-pretty-note:
format-patch: add a blank line between notes and diffstat
Doc User-Manual: Patch cover letter, three dashes, and --notes
Doc format-patch: clarify --notes use case
Doc notes: Include the format-patch --notes option
Doc SubmittingPatches: Mention --notes option after "cover letter"
Documentation: decribe format-patch --notes
format-patch --notes: show notes after three-dashes
format-patch: append --signature after notes
pretty_print_commit(): do not append notes message
pretty: prepare notes message at a centralized place
format_note(): simplify API
pretty: remove reencode_commit_message()
We either stuff the notes message without modification for %N
userformat, or format it for human consumption. Using two bits
is an overkill that does not benefit anybody.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we added the "--perl-regexp" option (or "-P") to "git grep", we
should have done the same for the commands in the "git log" family,
but somehow we forgot to do so. This corrects it, but we will
reserve the short-and-sweet "-P" option for something else for now.
Also introduce the "--basic-regexp" option for completeness, so that
the "last one wins" principle can be used to defeat an earlier -E
option, e.g. "git log -E --basic-regexp --grep='<bre>'". Note that
it cannot have the short "-G" option as the option is to grep in the
patch text in the context of "log" family.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command line option parser for "git log -F -E --grep='<ere>'"
did not flip the "fixed" bit, violating the general "last option
wins" principle among conflicting options.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using the hand-rolled initialization sequence,
use grep_init() to populate the necessary bits. This opens
the door to allow the calling commands to optionally read
grep.* configuration variables via git_config() if they
want to.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-log-grep-all-match-1:
grep.c: make two symbols really file-scope static this time
t7810-grep: test --all-match with multiple --grep and --author options
t7810-grep: test interaction of multiple --grep and --author options
t7810-grep: test multiple --author with --all-match
t7810-grep: test multiple --grep with and without --all-match
t7810-grep: bring log --grep tests in common form
grep.c: mark private file-scope symbols as static
log: document use of multiple commit limiting options
log --grep/--author: honor --all-match honored for multiple --grep patterns
grep: show --debug output only once
grep: teach --debug option to dump the parse tree
Notes are shown after commit body. From user perspective it looks
pretty much like commit body and they may assume --grep would search
in that part too.
Make it so.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to --author/--committer which filters commits by author and
committer header fields. --grep-reflog adds a fake "reflog" header to
commit and a grep filter to search on that line.
All rules to --author/--committer apply except no timestamp stripping.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a long-standing bug in "git log --grep" when multiple "--grep"
are used together with "--all-match" and "--author" or "--committer".
* jc/maint-log-grep-all-match:
t7810-grep: test --all-match with multiple --grep and --author options
t7810-grep: test interaction of multiple --grep and --author options
t7810-grep: test multiple --author with --all-match
t7810-grep: test multiple --grep with and without --all-match
t7810-grep: bring log --grep tests in common form
grep.c: mark private file-scope symbols as static
log: document use of multiple commit limiting options
log --grep/--author: honor --all-match honored for multiple --grep patterns
grep: show --debug output only once
grep: teach --debug option to dump the parse tree
* mz/cherry-pick-cmdline-order:
cherry-pick/revert: respect order of revisions to pick
demonstrate broken 'git cherry-pick three one two'
teach log --no-walk=unsorted, which avoids sorting
Our "grep" allows complex boolean expressions to be formed to match
each individual line with operators like --and, '(', ')' and --not.
Introduce the "--debug" option to show the parse tree to help people
who want to debug and enhance it.
Also "log" learns "--grep-debug" option to do the same. The command
line parser to the log family is a lot more limited than the general
"git grep" parser, but it has special handling for header matching
(e.g. "--author"), and a parse tree is valuable when working on it.
Note that "--all-match" is *not* any individual node in the parse
tree. It is an instruction to the evaluator to check all the nodes
in the top-level backbone have matched and reject a document as
non-matching otherwise.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log .." errored out saying it is both rev range and a path when
there is no disambiguating "--" is on the command line. Update the
command line parser to interpret ".." as a path in such a case.
* jc/dotdot-is-parent-directory:
specifying ranges: we did not mean to make ".." an empty set
"git cherry-pick A C B" used to replay changes in A and then B and
then C if these three commits had committer timestamps in that
order, which is not what the user who said "A C B" naturally expects.
* mz/cherry-pick-cmdline-order:
cherry-pick/revert: respect order of revisions to pick
demonstrate broken 'git cherry-pick three one two'
teach log --no-walk=unsorted, which avoids sorting
* maint-1.7.11:
Almost 1.7.11.6
gitweb: URL-decode $my_url/$my_uri when stripping PATH_INFO
rebase -i: use full onto sha1 in reflog
sh-setup: protect from exported IFS
receive-pack: do not leak output from auto-gc to standard output
t/t5400: demonstrate breakage caused by informational message from prune
setup: clarify error messages for file/revisions ambiguity
send-email: improve RFC2047 quote parsing
fsck: detect null sha1 in tree entries
do not write null sha1s to on-disk index
diff: do not use null sha1 as a sentinel value
"git diff" had a confusion between taking data from a path in the
working tree and taking data from an object that happens to have
name 0{40} recorded in a tree.
* jk/maint-null-in-trees:
fsck: detect null sha1 in tree entries
do not write null sha1s to on-disk index
diff: do not use null sha1 as a sentinel value
"git log .." errored out saying it is both rev range and a path when
there is no disambiguating "--" is on the command line. Update the
command line parser to interpret ".." as a path in such a case.
* jc/dotdot-is-parent-directory:
specifying ranges: we did not mean to make ".." an empty set
When 'git log' is passed the --no-walk option, no revision walk takes
place, naturally. Perhaps somewhat surprisingly, however, the provided
revisions still get sorted by commit date. So e.g 'git log --no-walk
HEAD HEAD~1' and 'git log --no-walk HEAD~1 HEAD' give the same result
(unless the two revisions share the commit date, in which case they
will retain the order given on the command line). As the commit that
introduced --no-walk (8e64006 (Teach revision machinery about
--no-walk, 2007-07-24)) points out, the sorting is intentional, to
allow things like
git log --abbrev-commit --pretty=oneline --decorate --all --no-walk
to show all refs in order by commit date.
But there are also other cases where the sorting is not wanted, such
as
<command producing revisions in order> |
git log --oneline --no-walk --stdin
To accomodate both cases, leave the decision of whether or not to sort
up to the caller, by allowing --no-walk={sorted,unsorted}, defaulting
to 'sorted' for backward-compatibility reasons.
Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not want a link to 0{40} object stored anywhere in our objects.
* jk/maint-null-in-trees:
fsck: detect null sha1 in tree entries
do not write null sha1s to on-disk index
diff: do not use null sha1 as a sentinel value
Either end of revision range operator can be omitted to default to HEAD,
as in "origin.." (what did I do since I forked) or "..origin" (what did
they do since I forked). But the current parser interprets ".." as an
empty range "HEAD..HEAD", and worse yet, because ".." does exist on the
filesystem, we get this annoying output:
$ cd Documentation/howto
$ git log .. ;# give me recent commits that touch Documentation/ area.
fatal: ambiguous argument '..': both revision and filename
Use '--' to separate filenames from revisions
Surely we could say "git log ../" or even "git log -- .." to disambiguate,
but we shouldn't have to.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff_setup_done() has historically returned an error code, but lost
the last nonzero return in 943d5b7 (allow diff.renamelimit to be set
regardless of -M/-C, 2006-08-09). The callers were in a pretty
confused state: some actually checked for the return code, and some
did not.
Let it return void, and patch all callers to take this into account.
This conveniently also gets rid of a handful of different(!) error
messages that could never be triggered anyway.
Note that the function can still die().
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff code represents paths using the diff_filespec
struct. This struct has a sha1 to represent the sha1 of the
content at that path, as well as a sha1_valid member which
indicates whether its sha1 field is actually useful. If
sha1_valid is not true, then the filespec represents a
working tree file (e.g., for the no-index case, or for when
the index is not up-to-date).
The diff_filespec is only used internally, though. At the
interfaces to the diff subsystem, callers feed the sha1
directly, and we create a diff_filespec from it. It's at
that point that we look at the sha1 and decide whether it is
valid or not; callers may pass the null sha1 as a sentinel
value to indicate that it is not.
We should not typically see the null sha1 coming from any
other source (e.g., in the index itself, or from a tree).
However, a corrupt tree might have a null sha1, which would
cause "diff --patch" to accidentally diff the working tree
version of a file instead of treating it as a blob.
This patch extends the edges of the diff interface to accept
a "sha1_valid" flag whenever we accept a sha1, and to use
that flag when creating a filespec. In some cases, this
means passing the flag through several layers, making the
code change larger than would be desirable.
One alternative would be to simply die() upon seeing
corrupted trees with null sha1s. However, this fix more
directly addresses the problem (while bogus sha1s in a tree
are probably a bad thing, it is really the sentinel
confusion sending us down the wrong code path that is what
makes it devastating). And it means that git is more capable
of examining and debugging these corrupted trees. For
example, you can still "diff --raw" such a tree to find out
when the bogus entry was introduced; you just cannot do a
"--patch" diff (just as you could not with any other
corrupted tree, as we do not have any content to diff).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git log -n 1 -- rarely-touched-path" was spending unnecessary
cycles after showing the first change to find the next one, only to
discard it.
* jk/revision-walk-stop-at-max-count:
revision: avoid work after --max-count is reached
Teaches the object name parser things like a "git describe" output
is always a commit object, "A" in "git log A" must be a committish,
and "A" and "B" in "git log A...B" both must be committish, etc., to
prolong the lifetime of abbreviated object names.
* jc/sha1-name-more: (27 commits)
t1512: match the "other" object names
t1512: ignore whitespaces in wc -l output
rev-parse --disambiguate=<prefix>
rev-parse: A and B in "rev-parse A..B" refer to committish
reset: the command takes committish
commit-tree: the command wants a tree and commits
apply: --build-fake-ancestor expects blobs
sha1_name.c: add support for disambiguating other types
revision.c: the "log" family, except for "show", takes committish
revision.c: allow handle_revision_arg() to take other flags
sha1_name.c: introduce get_sha1_committish()
sha1_name.c: teach lookup context to get_sha1_with_context()
sha1_name.c: many short names can only be committish
sha1_name.c: get_sha1_1() takes lookup flags
sha1_name.c: get_describe_name() by definition groks only commits
sha1_name.c: teach get_short_sha1() a commit-only option
sha1_name.c: allow get_short_sha1() to take other flags
get_sha1(): fix error status regression
sha1_name.c: restructure disambiguation of short names
sha1_name.c: correct misnamed "canonical" and "res"
...
During a revision traversal in which --max-count has been
specified, we decrement a counter for each revision returned
by get_revision. When it hits 0, we typically return NULL
(the exception being if we still have boundary commits to
show).
However, before we check the counter, we call get_revision_1
to get the next commit. This might involve looking at a
large number of commits if we have restricted the traversal
(e.g., we might traverse until we find the next commit whose
diff actually matches a pathspec).
There's no need to make this get_revision_1 call when our
counter runs out. If we are not in --boundary mode, we will
just throw away the result and immediately return NULL. If
we are in --boundary mode, then we will still throw away the
result, and then start showing the boundary commits.
However, as git_revision_1 does not impact the boundary
list, it should not have an impact.
In most cases, avoiding this work will not be especially
noticeable. However, in some cases, it can make a big
difference:
[before]
$ time git rev-list -1 origin Documentation/RelNotes/1.7.11.2.txt
8d141a1d56
real 0m0.301s
user 0m0.280s
sys 0m0.016s
[after]
$ time git rev-list -1 origin Documentation/RelNotes/1.7.11.2.txt
8d141a1d56
real 0m0.010s
user 0m0.008s
sys 0m0.000s
Note that the output is produced almost instantaneously in
the first case, and then git uselessly spends a long time
looking for the next commit to touch that file (but there
isn't one, and we traverse all the way down to the roots).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git log" gets "--simplify-merges/by-decoration" together with
"--first-parent", the combination of these options makes the
simplification logic to use in-core commit objects that haven't been
examined for relevance, either producing incorrect result or taking
too long to produce any output. Teach the simplification logic to
ignore commits that the first-parent traversal logic ignored when
both are in effect to work around the issue.
* jc/rev-list-simplify-merges-first-parent:
revision: ignore side parents while running simplify-merges
revision: note the lack of free() in simplify_merges()
revision: "simplify" options imply topo-order sort
"git diff COPYING HEAD:COPYING" gave a nonsense error message that
claimed that the treeish HEAD did not have COPYING in it.
* mm/verify-filename-fix:
verify_filename(): ask the caller to chose the kind of diagnosis
sha1_name: do not trigger detailed diagnosis for file arguments
Add a field to setup_revision_opt structure and allow these callers
to tell the setup_revisions command parsing machinery that short SHA1
it encounters are meant to name committish.
This step does not go all the way to connect the setup_revisions()
to sha1_name.c yet.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing "cant_be_filename" that tells the function that the
caller knows the arg is not a path (hence it does not have to be
checked for absense of the file whose name matches it) is made into
a bit in the flag word.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many callers know that the user meant to name a committish by
syntactical positions where the object name appears. Calling this
function allows the machinery to disambiguate shorter-than-unique
abbreviated object names between committish and others.
Note that this does NOT error out when the named object is not a
committish. It is merely to give a hint to the disambiguation
machinery.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function takes user input string and returns the object name
(binary SHA-1) with mode bits and path when the object was looked
up in a tree.
Additionally give hints to help disambiguation of abbreviated object
names when the caller knows what it is looking for.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are only two callers, and they will benefit from being able to
pass disambiguation hints to underlying get_sha1_with_context() API
once it happens.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "--simplify-merges/by-decoration" is given together with
"--first-parent" to "git log", the combination of these options
makes the simplification logic to use in-core commit objects that
haven't been examined for relevance, either producing incorrect
result or taking too long to produce any output. Teach the
simplification logic to ignore commits that the first-parent
traversal logic ignored when both are in effect to work around the
issue.
verify_filename() can be called in two different contexts. Either we
just tried to interpret a string as an object name, and it fails, so
we try looking for a working tree file (i.e. we finished looking at
revs that come earlier on the command line, and the next argument
must be a pathname), or we _know_ that we are looking for a
pathname, and shouldn't even try interpreting the string as an
object name.
For example, with this change, we get:
$ git log COPYING HEAD:inexistant
fatal: HEAD:inexistant: no such path in the working tree.
Use '-- <path>...' to specify paths that do not exist locally.
$ git log HEAD:inexistant
fatal: Path 'inexistant' does not exist in 'HEAD'
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The simplify_merges() function needs to look at all history chain to
find the closest ancestor that is relevant after the simplification,
but after --first-parent traversal, side parents haven't been marked
for relevance (they are irrelevant by definition due to the nature
of first-parent-only traversal) nor culled from the parents list of
resulting commits.
We cannot simply remove these side parents from the parents list, as
the output phase still wants to see the parents. Instead, teach
simplify_one() and its callees to ignore the later parents.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Among the three similar-looking loops that walk singly linked
commit_list, the first one is only peeking and the same list is
later used for real work. Leave a comment not to mistakenly
free its elements there.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code internally runs sort_in_topo_order() already; it is more clear
to spell it out in the option parsing phase, instead of adding a special
case in simplify_merges() function.
There is no need for "commit_list_reverse()" function that only invites
inefficient code.
By René Scharfe
* rs/commit-list-append:
commit: remove commit_list_reverse()
revision: append to list instead of insert and reverse
sequencer: export commit_list_append()
The command line parser choked "git cherry-pick $name" when $name can be
both revision name and a pathname, even though $name can never be a path
in the context of the command.
The issue the patch addresses is real, but the way it is implemented felt
unnecessarily invasive a bit. It may be cleaner for this caller to add
the "--" to the end of the argv_array it passes to setup_revisions().
By Clemens Buchacher
* cb/cherry-pick-rev-path-confusion:
cherry-pick: do not expect file arguments
By using commit_list_insert(), we added new items to the top of the
list and, since this is not the order we want, reversed it afterwards.
Simplify this process by adding new items at the bottom instead,
getting rid of the reversal step.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git push --recurse-submodules" learns to optionally look into the
histories of submodules bound to the superproject and push them out.
By Heiko Voigt
* hv/submodule-recurse-push:
push: teach --recurse-submodules the on-demand option
Refactor submodule push check to use string list instead of integer
Teach revision walking machinery to walk multiple times sequencially
Setting up a revision traversal with many starting points was inefficient
as these were placed in a date-order priority queue one-by-one.
By René Scharfe (3) and Junio C Hamano (1)
* rs/commit-list-sort-in-batch:
mergesort: rename it to llist_mergesort()
revision: insert unsorted, then sort in prepare_revision_walk()
commit: use mergesort() in commit_list_sort_by_date()
add mergesort() for linked lists
If a commit-ish passed to cherry-pick or revert happens to have a file
of the same name, git complains that the argument is ambiguous and
advises to use '--'. To make things worse, the '--' argument is removed
by parse_options, und so passing '--' has no effect.
Instead, always interpret cherry-pick/revert arguments as revisions.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Speed up prepare_revision_walk() by adding commits without sorting
to the commit_list and at the end sort the list in one go. Thanks
to mergesort() working behind the scenes, this is a lot faster for
large numbers of commits than the current insert sort.
Also introduce and use commit_list_reverse(), to keep the ordering
of commits sharing the same commit date unchanged. That's because
commit_list_insert_by_date() sorts commits with descending date,
but adds later entries with the same date entries last, while
commit_list_insert() always inserts entries at the top. The
following commit_list_sort_by_date() keeps the order of entries
sharing the same date.
Jeff's test case, in a repo with lots of refs, was to run:
# make a new commit on top of HEAD, but not yet referenced
sha1=`git commit-tree HEAD^{tree} -p HEAD </dev/null`
# now do the same "connected" test that receive-pack would do
git rev-list --objects $sha1 --not --all
With a git.git with a ref for each revision, master needs (best of
five):
real 0m2.210s
user 0m2.188s
sys 0m0.016s
And with this patch:
real 0m0.480s
user 0m0.456s
sys 0m0.020s
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously it was not possible to iterate revisions twice using the
revision walking api. We add a reset_revision_walk() which clears the
used flags. This allows us to do multiple sequencial revision walks.
We add the appropriate calls to the existing submodule machinery doing
revision walks. This is done to avoid surprises if future code wants to
call these functions more than once during the processes lifetime.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By Junio C Hamano (2) and Ramsay Jones (1)
* jc/pickaxe-ignore-case:
ctype.c: Fix a sparse warning
pickaxe: allow -i to search in patch case-insensitively
grep: use static trans-case table
"git log -S<string>" is a useful way to find the last commit in the
codebase that touched the <string>. As it was designed to be used by a
porcelain script to dig the history starting from a block of text that
appear in the starting commit, it never had to look for anything but an
exact match.
When used by an end user who wants to look for the last commit that
removed a string (e.g. name of a variable) that he vaguely remembers,
however, it is useful to support case insensitive match.
When given the "--regexp-ignore-case" (or "-i") option, which originally
was designed to affect case sensitivity of the search done in the commit
log part, e.g. "log --grep", the matches made with -S/-G pickaxe search is
done case insensitively now.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/grep-binary-attribute:
grep: pre-load userdiff drivers when threaded
grep: load file data after checking binary-ness
grep: respect diff attributes for binary-ness
grep: cache userdiff_driver in grep_source
grep: drop grep_buffer's "name" parameter
convert git-grep to use grep_source interface
grep: refactor the concept of "grep source" into an object
grep: move sha1-reading mutex into low-level code
grep: make locking flag global
* jk/grep-binary-attribute:
grep: pre-load userdiff drivers when threaded
grep: load file data after checking binary-ness
grep: respect diff attributes for binary-ness
grep: cache userdiff_driver in grep_source
grep: drop grep_buffer's "name" parameter
convert git-grep to use grep_source interface
grep: refactor the concept of "grep source" into an object
grep: move sha1-reading mutex into low-level code
grep: make locking flag global
Before the grep_source interface existed, grep_buffer was
used by two types of callers:
1. Ones which pulled a file into a buffer, and then wanted
to supply the file's name for the output (i.e.,
git grep).
2. Ones which really just wanted to grep a buffer (i.e.,
git log --grep).
Callers in set (1) should now be using grep_source. Callers
in set (2) always pass NULL for the "name" parameter of
grep_buffer. We can therefore get rid of this now-useless
parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/index-pack-no-recurse:
index-pack: eliminate unlimited recursion in get_base_data()
index-pack: eliminate recursion in find_unresolved_deltas
Eliminate recursion in setting/clearing marks in commit list
In a topic branch workflow, you often want to find the latest commit that
merged a side branch that touched a particular area of the system, so that
a new topic branch to work on that area can be forked from that commit.
For example, I wanted to find an appropriate fork-point to queue Luke's
changes related to git-p4 in contrib/fast-import/.
"git log --first-parent" traverses the first-parent chain, and "-m --stat"
shows the list of paths touched by commits including merge commits. We
could ask the question this way:
# What is the latest commit that touched that path?
$ git log --first-parent --oneline -m --stat master |
sed -e '/^ contrib\/fast-import\/git-p4 /q' | tail
The above finds that 8cbfc11 (Merge branch 'pw/p4-view-updates',
2012-01-06) was such a commit.
But a more natural way to spell this question is this:
$ git log --first-parent --oneline -m --stat -1 master -- \
contrib/fast-import/git-p4
Unfortunately, this does not work. It finds ecb7cf9 (git-p4: rewrite view
handling, 2012-01-02). This commit is a part of the merged topic branch
and is _not_ on the first-parent path from the 'master':
$ git show-branch 8cbfc11ecb7cf9
! [8cbfc11] Merge branch 'pw/p4-view-updates'
! [ecb7cf9] git-p4: rewrite view handling
--
- [8cbfc11] Merge branch 'pw/p4-view-updates'
+ [8cbfc11^2] git-p4: view spec documentation
++ [ecb7cf9] git-p4: rewrite view handling
The problem is caused by the merge simplification logic when it inspects
the merge commit 8cbfc11. In this case, the history leading to the tip of
'master' did not touch git-p4 since 'pw/p4-view-updates' topic forked, and
the result of the merge is simply a copy from the tip of the topic branch
in the view limited by the given pathspec. The merge simplification logic
discards the history on the mainline side of the merge, and pretends as if
the sole parent of the merge is its second parent, i.e. the tip of the
topic. While this simplification is correct in the general case, it is at
least surprising if not outright wrong when the user explicitly asked to
show the first-parent history.
Here is an attempt to fix this issue, by not allowing us to compare the
merge result with anything but the first parent when --first-parent is in
effect, to avoid the history traversal veering off to the side branch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recursion in a DAG is generally a bad idea because it could be very
deep. Be defensive and avoid recursion in mark_parents_uninteresting()
and clear_commit_marks().
mark_parents_uninteresting() learns a trick from clear_commit_marks()
to avoid malloc() in (dominant) single-parent case.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This teaches the "log" family of commands to pass the GPG signature in the
commit objects to "gpg --verify" via the verify_signed_buffer() interface
used to verify signed tag objects. E.g.
$ git show --show-signature -s HEAD
shows GPG output in the header part of the output.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rs/pending:
commit: factor out clear_commit_marks_for_object_array
checkout: use leak_pending flag
bundle: use leak_pending flag
bisect: use leak_pending flag
revision: add leak_pending flag
checkout: use add_pending_{object,sha1} in orphan check
revision: factor out add_pending_sha1
checkout: check for "Previous HEAD" notice in t2020
Conflicts:
builtin/checkout.c
revision.c
* nd/maint-autofix-tag-in-head:
Accept tags in HEAD or MERGE_HEAD
merge: remove global variable head[]
merge: use return value of resolve_ref() to determine if HEAD is invalid
merge: keep stash[] a local variable
Conflicts:
builtin/merge.c
* jc/fetch-verify:
fetch: verify we have everything we need before updating our ref
rev-list --verify-object
list-objects: pass callback data to show_objects()
* bk/ancestry-path:
t6019: avoid refname collision on case-insensitive systems
revision: do not include sibling history in --ancestry-path output
revision: keep track of the end-user input from the command line
rev-list: Demonstrate breakage with --ancestry-path --all
The new flag leak_pending in struct rev_info can be used to prevent
prepare_revision_walk from freeing the list of pending objects. It
will still forget about them, so it really is leaked. This behaviour
may look weird at first, but it can be useful if the pointer to the
list is saved before calling prepare_revision_walk.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is a combination of the static get_reference and
add_pending_object. It can be used to easily queue objects by hash.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HEAD and MERGE_HEAD (among other branch tips) should never hold a
tag. That can only be caused by broken tools and is cumbersome to fix
by an end user with:
$ git update-ref HEAD $(git rev-parse HEAD^{commit})
which may look like a magic to a new person.
Be easy, warn users (so broken tools can be fixed if they bother to
report) and move on.
Be robust, if the given SHA-1 cannot be resolved to a commit object,
die (therefore return value is always valid).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Often we want to verify everything reachable from a given set of commits
are present in our repository and connected without a gap to the tips of
our refs. We used to do this for this purpose:
$ rev-list --objects $commits_to_be_tested --not --all
Even though this is good enough for catching missing commits and trees,
we show the object name but do not verify their existence, let alone their
well-formedness, for the blob objects at the leaf level.
Add a new "--verify-object" option so that we can catch missing and broken
blobs as well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the commit specified as the bottom of the commit range has a direct
parent that has another child commit that contributed to the resulting
history, "rev-list --ancestry-path" was confused and listed that side
history as well, due to the command line parser subtlety corrected by the
previous commit.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Given a complex set of revision specifiers on the command line, it is too
late to look at the flags of the objects in the initial traversal list at
the beginning of limit_list() in order to determine what the objects the
end-user explicitly listed on the command line were. The process to move
objects from the pending array to the traversal list may have marked
objects that are not mentioned as UNINTERESTING, when handle_commit()
marked the parents of UNINTERESTING commits mentioned on the command line
by calling mark_parents_uninteresting().
This made "rev-list --ancestry-path ^A ..." to mistakenly list commits
that are descendants of A's parents but that are not descendants of A
itself, as ^A from the command line causes A and its parents marked as
UNINTERESTING before coming to limit_list(), and we try to enumerate the
commits that are descendants of these commits that are UNINTERESTING
before we start walking the history.
It actually is too late even if we inspected the pending object array
before calling prepare_revision_walk(), as some of the same objects might
have been mentioned twice, once as positive and another time as negative.
The "rev-list --some-option A --not --all" command may want to notice,
even if the resulting set is empty, that the user showed some interest in
"A" and do something special about it.
Prepare a separate array to keep track of what syntactic element was used
to cause each object to appear in the pending array from the command line,
and populate it as setup_revisions() parses the command line.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allocating and then immediately freeing temporary memory a million times
when listing a million objects is distasteful.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two copies of traverse_commit_list callback that show the object
name followed by pathname the object was found, to produce output similar
to "rev-list --objects".
Unify them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/notes-batch-removal:
show: --ignore-missing
notes remove: --stdin reads from the standard input
notes remove: --ignore-missing
notes remove: allow removing more than one
* jc/magic-pathspec:
setup.c: Fix some "symbol not declared" sparse warnings
t3703: Skip tests using directory name ":" on Windows
revision.c: leave a note for "a lone :" enhancement
t3703, t4208: add test cases for magic pathspec
rev/path disambiguation: further restrict "misspelled index entry" diag
fix overslow :/no-such-string-ever-existed diagnostics
fix overstrict :<path> diagnosis
grep: use get_pathspec() correctly
pathspec: drop "lone : means no pathspec" from get_pathspec()
Revert "magic pathspec: add ":(icase)path" to match case insensitively"
magic pathspec: add ":(icase)path" to match case insensitively
magic pathspec: futureproof shorthand form
magic pathspec: add tentative ":/path/from/top/level" pathspec support
Instead of barfing, simply ignore bad object names seen in the
input. This is useful when reading from "git notes list" output
that may refer to objects that have already been garbage collected.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add log.abbrevCommit config variable as a convenience for users who
often use --abbrev-commit with git log and friends. Allow the option
to be overridden with --no-abbrev-commit. Per 635530a2fc and 4f62c2bc57,
the config variable is ignored when log is given "--pretty=raw".
(Also, a drive-by spelling correction in git log's short help.)
Signed-off-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jn/setup-revisions-glob-and-friends-passthru:
revisions: allow --glob and friends in parse_options-enabled commands
revisions: split out handle_revision_pseudo_opt function
Update the fix for 1.7.5 maintenance track.
* jc/maint-1.7.4-pathspec-stdin-and-cmdline:
setup_revisions(): take pathspec from command line and --stdin correctly
If we later add a command in the log family that by default limit
its operation to the current subdirectory, we would need to resurrect
the "a lone ':' on the command line means no pathspec whatsoever".
Now the codepath was cleaned up, we can do so in one place. Leave a
note to mark where it is for later generations.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the fix for 1.7.4 maintenance track.
* jc/maint-1.6.6-pathspec-stdin-and-cmdline:
setup_revisions(): take pathspec from command line and --stdin correctly
When the command line has "--" disambiguator, we take the remainder of
argv[] as "prune_data", but when --stdin is given at the same time,
we need to append to the existing prune_data and end up attempting to
realloc(3) it. That would not work.
Fix it by consistently using append_prune_data() throughout the input
processing. Also avoid counting the number of existing paths in the
function over and over again.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As v1.6.0-rc2~42 (2008-07-31) explains, even pseudo-options like --not
and --glob that need to be parsed in order with revisions should be
marked handled by handle_revision_opt to avoid an error when
parse_revision_opt callers like "git shortlog" encounter them.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As v1.6.0-rc2~42 (Allow "non-option" revision options in
parse_option-enabled commands, 2008-07-31) explains, options handled
by setup_revisions fall into two categories:
1. global options like --topo-order handled by parse_revision_opt,
which can take detached arguments and can be parsed in advance;
2. pseudo-options that must be parsed in order with their revision
counterparts, like --not and --all.
The global options are taken care of by handle_revision_opt; the
pseudo-options are currently in a deeply indented portion of
setup_revisions. Give them their own function for easier reading.
The only goal is to make setup_revisions easier to read straight
through. No functional change intended.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With most command line options, later instances of an option
override earlier ones. With cumulative options like
"--notes", however, there is no way to say "forget the
--notes I gave you before".
Let's have --no-notes trigger this forgetting, so that:
git log --notes=foo --no-notes --notes=bar
will show only the "bar" notes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have --show-notes, but it has a few shortcomings:
1. Using --show-notes=<ref> implies that we should also
show the default notes. Which means you also need to
use --no-standard-notes if you want to suppress them.
2. It is negated by --no-notes, which doesn't match.
3. It's too long to type. :)
This patch introduces --notes, which behaves exactly like
--show-notes, except that using "--notes=<ref>" does not
imply showing the default notes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is in preparation for more notes-related revision
command-line options.
The "suppress_default_notes" option is renamed to
"use_default_notes", and is now a tri-state with values less
than one indicating "not set". If the value is "not set",
then we show default refs if and only if no other refs were
given.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need to use an extra pointer, which just ends up
leaking memory. The fact that the list is empty tells us the
same thing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No need to do it ourselves when there is a library function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mg/rev-list-n-parents:
tests: avoid nonportable {foo,bar} glob
rev-list --min-parents,--max-parents: doc, test and completion
revision.c: introduce --min-parents and --max-parents options
t6009: use test_commit() from test-lib.sh
Introduce --min-parents and --max-parents options which limit the
revisions to those commits which have at least (or at most) that many
commits, where negative arguments for --max-parents= denote infinity
(i.e. no upper limit).
In particular:
--max-parents=1 is the same as --no-merges;
--min-parents=2 is the same as --merges;
--max-parents=0 shows only roots; and
--min-parents=3 shows only octopus merges
Using --min-parents=n and --max-parents=m with n>m gives you what you ask
for (i.e. nothing) for obvious reasons, just like when you give --merges
(show only merge commits) and --no-merges (show only non-merge commits) at
the same time.
Also, introduce --no-min-parents and --no-max-parents to do the obvious
thing for convenience.
We compute the number of parents only when we limit by that, so there
is no performance impact when there are no limiters.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mg/rev-list-one-side-only:
git-log: put space after commit mark
t6007: test rev-list --cherry
log --cherry: a synonym
rev-list: documentation and test for --cherry-mark
revision.c: introduce --cherry-mark
rev-list/log: factor out revision mark generation
rev-list: --left/right-only are mutually exclusive
rev-list: documentation and test for --left/right-only
t6007: Make sure we test --cherry-pick
revlist.c: introduce --left/right-only for unsymmetric picking
Currently, commit marks (left, right, boundary, cherry) are output right
before the commit sha1, which makes it difficult to copy sha1s. Sample
output for "git log --oneline --cherry":
=049c269 t6007: test rev-list --cherry
Change this to
= 049c269 t6007: test rev-list --cherry
which matches exactly the current output of "git log --graph".
Leave "git rev-list" output as is (no space) so that they do not break.
Adjust "git-svn" which uses "git log --pretty=raw --boundary".
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the porcelain level, because by definition there are many more contributors
than integrators, it makes sense to give a handy short-hand for --right-only
used with --cherry-mark and --no-merges. Make it so.
In other words, this provides "git cherry with rev-list interface".
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
for marking those commits which "--cherry-pick" would drop.
The marker for those commits is '=' because '-' denotes a boundary
commit already, even though 'git cherry' uses it.
Nonequivalent commits are denoted '+' unless '--left-right' is used.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, we have identical code for generating revision marks ('<',
'>', '-') in 5 places.
Factor out the code to a single function get_revision_mark() for easier
maintenance and extensibility.
Note that the check for !!revs in graph.c (which gets removed
effectively by this patch) is superfluous.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing "--cherry-pick" does not work with unsymmetric ranges
(A..B) for obvious reasons.
Introduce "--left-only" and "--right-only" which limit the output to
commits on the respective sides of a symmetric range (i.e. only "<"
resp. ">" commits as per "--left-right").
This is especially useful for things like
git log --cherry-pick --right-only @{u}...
which is much more flexible (and descriptive) than
git cherry @{u} | sed -ne 's/^+ //p'
and potentially more useful than
git log --cherry-pick @{u}...
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add commit_list prefix to insert_by_date function and to sort_by_date,
so it's clear that these functions refer to commit_list structure.
Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog-walking mechanism is based on the regular
revision traversal. We just rewrite the parents of each
commit in fake_reflog_parent to point to the commit in the
next reflog entry instead of the real parents.
However, the regular revision traversal tries not to show
the same commit twice, and so sets the SHOWN flag on each
commit it shows. In a reflog, however, we may want to see
the same commit more than once if it appears in the reflog
multiple times (which easily happens, for example, if you do
a reset to a prior state).
The fake_reflog_parent function takes care of this by
clearing flags, including SHOWN. Unfortunately, it does so
at the very end of the function, and it is possible to
return early from the function if there is no fake parent to
set up (e.g., because we are at the very first reflog entry
on the branch). In such a case the flag is not cleared, and
the entry is skipped by the revision traversal machinery as
already shown.
You can see this by walking the log of a ref which is set to
its very first commit more than once (the test below shows
such a situation). In this case the reflog walk will fail to
show the entry for the initial creation of the ref.
We don't want to simply move the flag-clearing to the top of
the function; we want to make sure flags set during the
fake-parent installation are also cleared. Instead, let's
hoist the flag-clearing out of the fake_reflog_parent
function entirely. It's not really about fake parents
anyway, and the only caller is the get_revision machinery.
Reported-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mm/shortopt-detached:
log: parse separate option for --glob
log: parse separate options like git log --grep foo
diff: parse separate options --stat-width n, --stat-name-width n
diff: split off a function for --stat-* option parsing
diff: parse separate options like -S foo
Conflicts:
revision.c
By passing the path to a submodule in opt->submodule, the function can
be used to walk history in the named submodule repository, instead of
the toplevel repository.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jp/string-list-api-cleanup:
string_list: Fix argument order for string_list_append
string_list: Fix argument order for string_list_lookup
string_list: Fix argument order for string_list_insert_at_index
string_list: Fix argument order for string_list_insert
string_list: Fix argument order for for_each_string_list
string_list: Fix argument order for print_string_list
Update the definition and callers of string_list_append to use the
string_list as the first argument. This helps make the string_list
API easier to use by being more consistent.
Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/rev-list-ancestry-path:
revision: Turn off history simplification in --ancestry-path mode
revision: Fix typo in --ancestry-path error message
Documentation/rev-list-options.txt: Explain --ancestry-path
Documentation/rev-list-options.txt: Fix missing line in example history graph
revision: --ancestry-path
Add a --count option that, instead of actually listing the commits,
merely counts them.
This is mostly geared towards script use, and to this end it acts
specially when used with --left-right: it outputs the left and right
counts separately. Previously, scripts would have to run a shell loop
or small inline script over to achieve the same. (Without
--left-right, a simple |wc -l does the job.)
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using --ancestry-path together with history simplification (typically
triggered by path limiting), history simplification would get in the way of
--ancestry-path by prematurely removing the parent links between commits on
which the ancestry path calculations are made.
This patch disables this history simplification when --ancestry-path is
enabled. This is similar to what e.g. --full-history already does.
The patch also includes a simple testcase verifying that --ancestry-path
works together with path limiting.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To show the last two commits with one command, one might try
1) git show -s master~2..
2) git show -s ^master~2 master
3) git show -s master^ master
4) git show -s -2 master
Choice (3) works because both commits are listed on the command line.
Choices (1) and (2) have worked ever since v1.6.4-rc~3 (Make 'git
show' more useful, 2009-07-13) disabled --no-walk in this case because
there is no other useful meaning for them to have. Unfortunately, (4)
does not work: it outputs only one commit, because --no-walk stays on.
So disable --no-walk in this case so ‘git show’ and future ‘git
cherry-pick’ can behave as expected.
As a side effect, this unfortunately changes the meaning of
‘git log --oneline --decorate --no-walk -5 --all’: instead of listing
five refs, after this patch that command would list the five most
recent commits.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX
5.1 fails to compile git.
enum style is inconsistent already, with some enums declared on one
line, some over 3 lines with the enum values all on the middle line,
sometimes with 1 enum value per line... and independently of that the
trailing comma is sometimes present and other times absent, often
mixing with/without trailing comma styles in a single file, and
sometimes in consecutive enum declarations.
Clearly, omitting the comma is the more portable style, and this patch
changes all enum declarations to use the portable omitted dangling
comma style consistently.
Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"rev-list A..H" computes the set of commits that are ancestors of H, but
excludes the ones that are ancestors of A. This is useful to see what
happened to the history leading to H since A, in the sense that "what does
H have that did not exist in A" (e.g. when you have a choice to update to
H from A).
x---x---A---B---C <-- topic
/ \
x---x---x---o---o---o---o---M---D---E---F---G <-- dev
/ \
x---o---o---o---o---o---o---o---o---o---o---o---N---H <-- master
The result in the above example would be the commits marked with caps
letters (except for A itself, of course), and the ones marked with 'o'.
When you want to find out what commits in H are contaminated with the bug
introduced by A and need fixing, however, you might want to view only the
subset of "A..B" that are actually descendants of A, i.e. excluding the
ones marked with 'o'. Introduce a new option --ancestry-path to compute
this set with "rev-list --ancestry-path A..B".
Note that in practice, you would build a fix immediately on top of A and
"git branch --contains A" will give the names of branches that you would
need to merge the fix into (i.e. topic, dev and master), so this may not
be worth paying the extra cost of postprocessing.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tr/notes-display:
git-notes(1): add a section about the meaning of history
notes: track whether notes_trees were changed at all
notes: add shorthand --ref to override GIT_NOTES_REF
commit --amend: copy notes to the new commit
rebase: support automatic notes copying
notes: implement helpers needed for note copying during rewrite
notes: implement 'git notes copy --stdin'
rebase -i: invoke post-rewrite hook
rebase: invoke post-rewrite hook
commit --amend: invoke post-rewrite hook
Documentation: document post-rewrite hook
Support showing notes from more than one notes tree
test-lib: unset GIT_NOTES_REF to stop it from influencing tests
Conflicts:
git-am.sh
refs.c
* pb/log-first-parent-p-m:
show --first-parent/-m: do not default to --cc
show -c: show patch text
revision: introduce setup_revision_opt
t4013: add tests for log -p -m --first-parent
git log -p -m: document -m and honor --first-parent
If a revision is specified, it happens not to have any commits, don't
use the default revision. By doing so, surprising and undesired
behavior can happen, such as showing the reflog for HEAD when a branch
was specified.
[jc: squashed a test from René]
Signed-off-by: Dave Olszewski <cxreg@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With this patch, you can set notes.displayRef to a glob that points at
your favourite notes refs, e.g.,
[notes]
displayRef = refs/notes/*
Then git-log and friends will show notes from all trees.
Thanks to Junio C Hamano for lots of feedback, which greatly
influenced the design of the entire series and this commit in
particular.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, "show" defaulted to "show --cc" (dense combined patch), but
asking for combined patch with "show -c" didn't turn the patch output
format on; the placement of this logic in setup_revisions() dates back to
cd2bdc5 (Common option parsing for "git log --diff" and friends,
2006-04-14).
This unfortunately cannot be done as a trivial change of "if dense
combined is asked, default to patch format" done in setup_revisions() to
"if any combined is asked, default to patch format", as "diff-tree -c"
needs to default to raw, while "diff-tree --cc" needs to default to patch,
and they share the codepath. These command specific defaults are now
handled in the new "tweak" callback that can be customized by individual
command implementations.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
So far the last parameter to setup_revisions() was to specify the default
ref when the command line did not give any (typically "HEAD"). This changes
it to take a pointer to a structure so that we can add other information without
touching too many codepaths in later patches.
There is no functionality change.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --cherry-pick logic starts by counting the commits on each side,
so that it can filter away commits on the bigger one. However, so
far it missed an opportunity for optimization: it doesn't need to do
any work if either side is empty.
This in particular helps the common use-case 'git rebase -i HEAD~$n':
it internally uses --cherry-pick, but since HEAD~$n is a direct
ancestor the left side is always empty.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Historically, any grep filter in "git log" family of commands were taken
as restricting to commits with any of the words in the commit log message.
However, the user almost always want to find commits "done by this person
on that topic". With "--all-match" option, a series of grep patterns can
be turned into a requirement that all of them must produce a match, but
that makes it impossible to ask for "done by me, on either this or that"
with:
log --author=me --committer=him --grep=this --grep=that
because it will require both "this" and "that" to appear.
Change the "header" parser of grep library to treat the headers specially,
and parse it as:
(all-match-OR (HEADER-AUTHOR me)
(HEADER-COMMITTER him)
(OR
(PATTERN this)
(PATTERN that) ) )
Even though the "log" command line parser doesn't give direct access to
the extended grep syntax to group terms with parentheses, this change will
cover the majority of the case the users would want.
This incidentally revealed that one test in t7002 was bogus. It ran:
log --author=Thor --grep=Thu --format='%s'
and expected (wrongly) "Thu" to match "Thursday" in the author/committer
date, but that would never match, as the timestamp in raw commit buffer
does not have the name of the day-of-the-week.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jl/submodule-diff:
Performance optimization for detection of modified submodules
git status: Show uncommitted submodule changes too when enabled
Teach diff that modified submodule directory is dirty
Show submodules as modified when they contain a dirty work tree
Giving "Notes" information in the default output format of "log" and
"show" is a sensible progress (the user has asked for it by having the
notes), but for some commands (e.g. "format-patch") spewing notes into the
formatted commit log message without being asked is too aggressive.
Enable notes output only for "log", "show", "whatchanged" by default and
only when the user didn't ask any specific --pretty/--format from the
command line; users can explicitly override this default with --show-notes
and --no-notes option.
Parts of tests are taken from Jeff King's fix.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since local branch, tags and remote tracking branch namespaces are
most often used, add shortcut notations for globbing those in
manner similar to --glob option.
With this, one can express the "what I have but origin doesn't?"
as:
'git log --branches --not --remotes=origin'
Original-idea-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add --glob=<glob-pattern> option to rev-parse and everything that
accepts its options. This option matches all refs that match given
shell glob pattern (complete with some DWIM logic).
Example:
'git log --branches --not --glob=remotes/origin'
To show what you have that origin doesn't.
Signed-off-by: Ilari Liusvaara <ilari.liusvaara@elisanet.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the worst case is_submodule_modified() got called three times for
each submodule. The information we got from scanning the whole
submodule tree the first time can be reused instead.
New parameters have been added to diff_change() and diff_addremove(),
the information is stored in a new member of struct diff_filespec. Its
value is then reused instead of calling is_submodule_modified() again.
When no explicit "-dirty" is needed in the output the call to
is_submodule_modified() is not necessary when the submodules HEAD
already disagrees with the ref of the superproject, as this alone
marks it as modified. To achieve that, get_stat_data() got an extra
argument.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/1.7.0-diff-whitespace-only-status:
diff.c: fix typoes in comments
Make test case number unique
diff: Rename QUIET internal option to QUICK
diff: change semantics of "ignore whitespace" options
Conflicts:
diff.h
* jc/log-stdin:
Add trivial tests for --stdin option to log family
Make --stdin option to "log" family read also pathspecs
setup_revisions(): do not call get_pathspec() too early
Teach --stdin option to "log" family
read_revision_from_stdin(): use strbuf
Conflicts:
revision.c
Similar to the command line arguments, after giving zero or more revs, you can
feed a line "--" and then feed pathspecs one at a time.
With this
(
echo ^maint
echo --
echo Documentation
) | git log --stat --oneline --stdin master -- t
lists commits that touch Documentation/ or t/ between maint and master.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is necessary because we will later allow pathspecs to be fed from the
standard input, and pathspecs taken from the command line (and converted
via get_pathspec() already) in revs->prune_data too early gets in the way
when we want to append from the standard input.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the logic to read revs from standard input that rev-list knows about
from it to revision machinery, so that all the users of setup_revisions()
can feed the list of revs from the standard input when "--stdin" is used
on the command line.
Allow some users of the revision machinery that want different semantics
from the "--stdin" option to disable it by setting an option in the
rev_info structure.
This also cleans up the kludge made to bundle.c via cut and paste.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is so 2005 (and Linus ;-) to have a fixed 1000-byte buffer that
reads from the user. Let's use strbuf to unlimit the input length.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I personally use "git bisect visualize" all the time when I bisect, but it
turns out that that is not a very flexible model. Sometimes I want to do
bisection based on all commits (no pathname limiting), but then visualize
the current bisection tree with just a few pathnames because I _suspect_
those pathnames are involved in the problem but am not totally sure about
them.
And at other times, I want to use other revision parsing logic, none of
which is available with "git bisect visualize".
So this adds "--bisect" as a revision parsing argument, and as a result it
just works with all the normal logging tools. So now I can just do
gitk --bisect --simplify-by-decoration filename-here
etc.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we show a reflog, we have two ways of naming the entry:
by sequence number (e.g., HEAD@{0}) or by date (e.g.,
HEAD@{10 minutes ago}). There is no explicit option to set
one or the other, but we guess based on whether or not the
user has provided us with a date format, showing them the
date version if they have done so, and the sequence number
otherwise.
This usually made sense if the use did something like "git
log -g --date=relative". However, it didn't make much sense
if the user set the date format using the log.date config
variable; in that case, all of their reflogs would end up as
dates.
This patch records the source of the date format and only
triggers the date-based view if --date= was given on the
command line.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Previously, graph_is_interesting() did not behave quite the same way as
the code in get_revision(). As a result, it would sometimes think
commits were uninteresting, even though get_revision() would return
them. This resulted in incorrect lines in the graph output.
This change creates a get_commit_action() function, which
graph_is_interesting() and simplify_commit() both now use to determine
if a commit will be shown. It is identical to the old simplify_commit()
behavior, except that it never calls rewrite_parents().
This problem was reported by Santi Béjar. The following command
would exhibit the problem before, but now works correctly:
git log --graph --simplify-by-decoration --oneline v1.6.3.3
Previously git graph did not display the output for this command
correctly between f29ac4f and 66996ec, among other places.
Signed-off-by: Adam Simpkins <simpkins@facebook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit de435ac0 changed the behavior of --decorate from printing the
full ref (e.g., "refs/heads/master") to a shorter, more human-readable
version (e.g., just "master"). While this is nice for human readers,
external tools using the output from "git log" may prefer the full
version.
This patch introduces an extension to --decorate to allow the caller to
specify either the short or the full versions.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option "QUIET" primarily meant "find if we have _any_ difference as
quick as possible and report", which means we often do not even have to
look at blobs if we know the trees are different by looking at the higher
level (e.g. "diff-tree A B"). As a side effect, because there is no point
showing one change that we happened to have found first, it also enables
NO_OUTPUT and EXIT_WITH_STATUS options, making the end result look quiet.
Rename the internal option to QUICK to reflect this better; it also makes
grepping the source tree much easier, as there are other kinds of QUIET
option everywhere.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For some reason, I ended up doing
git show HEAD~5..
as an odd way of asking for a log. I realize I should just have used "git
log", but at the same time it does make perfect conceptual sense. After
all, you _could_ have done
git show HEAD HEAD~1 HEAD~2 HEAD~3 HEAD~4
and saying "git show HEAD~5.." is pretty natural. It's not like "git show"
only ever showed a single commit (or other object) before either! So
conceptually, giving a commit range is a very sensible operation, even
though you'd traditionally have used "git log" for that.
However, doing that currently results in an error
fatal: object ranges do not make sense when not walking revisions
which admittedly _also_ makes perfect sense - from an internal git
implementation standpoint in 'revision.c'.
However, I think that asking to show a range makes sense to a user, while
saying "object ranges no not make sense when not walking revisions" only
makes sense to a git developer.
So on the whole, of the two different "makes perfect sense" behaviors, I
think I originally picked the wrong one. And quite frankly, I don't really
see anybody actually _depending_ on that error case. So why not change it?
So rather than error out, just turn that non-walking error case into a
"silently turn on walking" instead.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I do various statistics on git, and one of the things I look at is merges,
because they are often interesting events to count ("how many merges vs
how much 'real development'" kind of statistics). And you can do it with
some fairly straightforward scripting, ie
git rev-list --parents HEAD |
grep ' .* ' |
git diff-tree --always -s --pretty=oneline --stdin |
less -S
will do it.
But I finally got irritated with the fact that we can skip merges with
'--no-merges', but we can't do the trivial reverse operation.
So this just adds a '--merges' flag that _only_ shows merges. Now you can
do the above with just a
git log --merges --pretty=oneline
which is a lot simpler. It also means that we automatically get a lot of
statistics for free, eg
git shortlog -ns --merges
does exactly what you'd want it to do.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This simplifies the logic of rev_compare_tree() by removing a special
case.
It does so by turning the special case of finding a diff to be "all new
files" into a more generic case of "all new" vs "all removed" vs "mixed
changes", so now the code is actually more powerful and more generic, and
the added symmetry actually makes it simpler too.
This makes no changes to any existing behavior, but apart from the
simplification it does make it possible to some day care about whether all
changes were just deletions if we want to. Which we may well want to for
merge handling.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lt/pack-object-memuse:
show_object(): push path_name() call further down
process_{tree,blob}: show objects without buffering
Conflicts:
builtin-pack-objects.c
builtin-rev-list.c
list-objects.c
list-objects.h
upload-pack.c
In particular, pushing the "path_name()" call _into_ the show() function
would seem to allow
- more clarity into who "owns" the name (ie now when we free the name in
the show_object callback, it's because we generated it ourselves by
calling path_name())
- not calling path_name() at all, either because we don't care about the
name in the first place, or because we are actually happy walking the
linked list of "struct name_path *" and the last component.
Now, I didn't do that latter optimization, because it would require some
more coding, but especially looking at "builtin-pack-objects.c", we really
don't even want the whole pathname, we really would be better off with the
list of path components.
Why? We use that name for two things:
- add_preferred_base_object(), which actually _wants_ to traverse the
path, and now does it by looking for '/' characters!
- for 'name_hash()', which only cares about the last 16 characters of a
name, so again, generating the full name seems to be just unnecessary
work.
Anyway, so I didn't look any closer at those things, but it did convince
me that the "show_object()" calling convention was crazy, and we're
actually better off doing _less_ in list-objects.c, and giving people
access to the internal data structures so that they can decide whether
they want to generate a path-name or not.
This patch does that, and then for people who did use the name (even if
they might do something more clever in the future), it just does the
straightforward "name = path_name(path, component); .. free(name);" thing.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Here's a less trivial thing, and slightly more dubious one.
I was looking at that "struct object_array objects", and wondering why we
do that. I have honestly totally forgotten. Why not just call the "show()"
function as we encounter the objects? Rather than add the objects to the
object_array, and then at the very end going through the array and doing a
'show' on all, just do things more incrementally.
Now, there are possible downsides to this:
- the "buffer using object_array" _can_ in theory result in at least
better I-cache usage (two tight loops rather than one more spread out
one). I don't think this is a real issue, but in theory..
- this _does_ change the order of the objects printed. Instead of doing a
"process_tree(revs, commit->tree, &objects, NULL, "");" in the loop
over the commits (which puts all the root trees _first_ in the object
list, this patch just adds them to the list of pending objects, and
then we'll traverse them in that order (and thus show each root tree
object together with the objects we discover under it)
I _think_ the new ordering actually makes more sense, but the object
ordering is actually a subtle thing when it comes to packing
efficiency, so any change in order is going to have implications for
packing. Good or bad, I dunno.
- There may be some reason why we did it that odd way with the object
array, that I have simply forgotten.
Anyway, now that we don't buffer up the objects before showing them
that may actually result in lower memory usage during that whole
traverse_commit_list() phase.
This is seriously not very deeply tested. It makes sense to me, it seems
to pass all the tests, it looks ok, but...
Does anybody remember why we did that "object_array" thing? It used to be
an "object_list" a long long time ago, but got changed into the array due
to better memory usage patterns (those linked lists of obejcts are
horrible from a memory allocation standpoint). But I wonder why we didn't
do this back then. Maybe there's a reason for it.
Or maybe there _used_ to be a reason, and no longer is.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-1.6.0-keep-pack:
pack-objects: don't loosen objects available in alternate or kept packs
t7700: demonstrate repack flaw which may loosen objects unnecessarily
Remove --kept-pack-only option and associated infrastructure
pack-objects: only repack or loosen objects residing in "local" packs
git-repack.sh: don't use --kept-pack-only option to pack-objects
t7700-repack: add two new tests demonstrating repacking flaws
Conflicts:
t/t7700-repack.sh
This option to pack-objects/rev-list was created to improve the -A and -a
options of repack. It was found to be lacking in that it did not provide
the ability to differentiate between local and non-local kept packs, and
found to be unnecessary since objects residing in local kept packs can be
filtered out by the --honor-pack-keep option.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-1.6.0-keep-pack:
is_kept_pack(): final clean-up
Simplify is_kept_pack()
Consolidate ignore_packed logic more
has_sha1_kept_pack(): take "struct rev_info"
has_sha1_pack(): refactor "pretend these packs do not exist" interface
git-repack: resist stray environment variable
Now is_kept_pack() is just a member lookup into a structure, we can write
it as such.
Also rewrite the sole caller of has_sha1_kept_pack() to switch on the
criteria the callee uses (namely, revs->kept_pack_only) between calling
has_sha1_kept_pack() and has_sha1_pack(), so that these two callees do not
have to take a pointer to struct rev_info as an argument.
This removes the header file dependency issue temporarily introduced by
the earlier commit, so we revert changes associated to that as well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This removes --unpacked=<packfile> parameter from the revision parser, and
rewrites its use in git-repack to pass a single --kept-pack-only option
instead.
The new --kept-pack-only option means just that. When this option is
given, is_kept_pack() that used to say "not on the --unpacked=<packfile>
list" now says "the packfile has corresponding .keep file".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Its "ignore_packed" parameter always comes from struct rev_info. This
patch makes the function take a pointer to the surrounding structure, so
that the refactoring in the next patch becomes easier to review.
There is an unfortunate header file dependency and the easiest workaround
is to temporarily move the function declaration from cache.h to
revision.h; this will be moved back to cache.h once the function loses
this "ignore_packed" parameter altogether in the later part of the
series.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the callers of this function except only one pass NULL to its last
parameter, ignore_packed.
Introduce has_sha1_kept_pack() function that has the function signature
and the semantics of this function, and convert the sole caller that does
not pass NULL to call this new function.
All other callers and has_sha1_pack() lose the ignore_packed parameter.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two are often used together but are too long to type.
Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some people prefer to call the pretty-print styles "format", and get
annoyed to see "git log --format=short" fail. Introduce it as a synonym
to --pretty so that both can be used.
Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cc0e6c5 (Handle return code of parse_commit in revision machinery,
2007-05-04) attempted to tighten error checking in the revision machinery,
but it wasn't enough. When get_revision_1() was asked for the next commit
to return, it tries to read and simplify the parents of the commit to be
returned, but an error while doing so was silently ignored and reported as
a truncated history to the caller instead.
This resulted in an early end of "git log" output or a pack that lacks
older commits from "git pack-objects", without any error indication in the
exit status from these commands, even though the underlying parse_commit()
issues an error message to the end user.
Note that the codepath in add_parents_list() that paints parents of an
UNINTERESTING commit UNINTERESTING silently ignores the error when
parse_commit() fails; this is deliberate and in line with aeeae1b
(revision traversal: allow UNINTERESTING objects to be missing,
2009-01-27).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the existing codepaths were meant to treat missing uninteresting
objects to be a silently ignored non-error, but there were a few places
in handle_commit() and add_parents_to_list(), which are two key functions
in the revision traversal machinery, that cared:
- When a tag refers to an object that we do not have, we barfed. We
ignore such a tag if it is painted as UNINTERESTING with this change.
- When digging deeper into the ancestry chain of a commit that is already
painted as UNINTERESTING, in order to paint its parents UNINTERESTING,
we barfed if parse_parent() for a parent commit object failed. We can
ignore such a parent commit object.
Signed-off-by: Junio C Hamano <gitster@pobox.com>