This basically does a few things that are sadly somewhat interdependent,
and nontrivial to split out:
- get rid of "struct log_tree_opt"
The fields in "log_tree_opt" are moved into "struct rev_info", and all
users of log_tree_opt are changed to use the rev_info struct instead.
- add the parsing for the log_tree_opt arguments to "setup_revisions()"
- make setup_revisions() set a flag (revs->diff) if the diff-related
arguments were used. This allows "git log" to decide whether it wants
to show diffs or not.
- make setup_revisions() also initialize the diffopt part of rev_info
(which we had from before, but we just didn't initialize it)
- make setup_revisions() do all the "finishing touches" on it all (it will
do the proper flag combination logic, and call "diff_setup_done()")
Now, that was the easy and straightforward part.
The slightly more involved part is that some of the programs that want to
use the new-and-improved rev_info parsing don't actually want _commits_,
they may want tree-ish arguments instead. That meant that I had to change
setup_revisions() to parse the arguments not into the "revs->commits" list,
but into the "revs->pending_objects" list.
Then, when we do "prepare_revision_walk()", we walk that list, and create
the sorted commit list from there.
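As a rough illustration of the two-stage idea above (collect everything as
pending objects, then build the sorted commit list), here is a minimal,
self-contained sketch. The types and the names prepare_walk() and
insert_by_date() are hypothetical stand-ins for this example, not the actual
definitions in "revision.c".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for git's object types; not the real definitions. */
enum toy_type { TOY_COMMIT, TOY_TREE, TOY_BLOB };

struct object {
	enum toy_type type;
	unsigned long date;	/* only meaningful for commits */
	const char *name;
};

struct object_list {
	struct object *item;
	struct object_list *next;
};

/* Insert a node so that the list stays sorted newest-first. */
void insert_by_date(struct object_list *node, struct object_list **list)
{
	struct object_list **pp = list;

	while (*pp && (*pp)->item->date > node->item->date)
		pp = &(*pp)->next;
	node->next = *pp;
	*pp = node;
}

/*
 * Walk the pending list: commits are unlinked and moved into the
 * date-sorted commit list, everything else (trees, blobs) stays
 * pending for the caller to deal with.
 */
void prepare_walk(struct object_list **pending, struct object_list **commits)
{
	struct object_list **pp = pending;

	assert(pending && commits);
	while (*pp) {
		struct object_list *node = *pp;

		if (node->item->type == TOY_COMMIT) {
			*pp = node->next;	/* unlink from pending */
			insert_by_date(node, commits);
		} else {
			pp = &node->next;
		}
	}
}
```

The point of the sketch is only that argument parsing never needs to know
about commit walking: it fills one list, and the walk setup sorts it out
later.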
This actually cleaned some stuff up, but it's the less obvious part of the
patch, and re-organized the "revision.c" logic somewhat. It actually paves
the way for splitting argument parsing _entirely_ out of "revision.c",
since now the argument parsing really is totally independent of the commit
walking: that didn't use to be true, since there was lots of overlap with
get_commit_reference() handling etc. Now the _only_ overlap is the shared
(and trivial) "add_pending_object()" thing.
However, I didn't do that file split, just because I wanted the diff
itself to be smaller, and show the actual changes more clearly. If this
gets accepted, I'll do further cleanups then - that includes the file
split, but also using the new infrastructure to do a nicer "git diff" etc.
Even in this form, it actually ends up removing more lines than it adds.
It's nice to note how simple and straightforward this makes the built-in
"git log" command, even though it continues to support all the diff flags
too. It doesn't get much simpler than this.
I think this is worth merging soonish, because it does allow for future
cleanup and even more sharing of code. However, it obviously touches
"revision.c", which is subtle. I've tested that it passes all the tests we
have, and it passes my "looks sane" detector, but somebody else should
also give it a good look-over.
[jc: squashed the original and three "oops this too" updates, with
another fix-up.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes things that include revision.h build again.
Blame is also built, but I am not sure how well it works (or how
well it worked to begin with) -- it was relying on tree-diff to
be using whatever pathspec was used the last time, which smells
a bit suspicious.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The way tree-diff was set up assumed we would use only one set
of pathspecs during the entire life of the program. Move the
pathspec-related static variables out to the diff_options structure
so that we can filter commits with one set of paths while showing
the actual diffs using a different set of paths.
I suspect this breaks blame.c, and makes "git log paths..."
default to --full-diff; the latter is dealt with in
the next commit.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The parent rewriting feature caused us to create the whole history in one
go, and then simplify it later, because of how rewrite_parents() had been
written. However, with a little tweaking, it's perfectly possible to do
even that one incrementally.
Right now, this doesn't really matter much, because every user of
"--parents" will probably generally _also_ use "--topo-order", which will
cause the old non-incremental behaviour anyway. However, I'm hopeful that
we could make even the topological sort incremental, or at least
_partially_ so (for example, make it incremental up to the first merge).
In the meantime, this at least moves things in the right direction, and
removes a strange special case.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This replaces occurrences of "blob", "commit", "tag", and "tree",
where they're really used as type specifiers, which we already
have defined global constants for.
Signed-off-by: Peter Eriksen <s022018@student.dtu.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This just moves code around to consolidate the part that sets
revs->limited to one place based on various flags.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now, using --unpacked without limit_list() does not make much
sense, but this is parallel to the earlier --max-age fix.
Signed-off-by: Junio C Hamano <junkio@cox.net>
What ends up not working very well at all is the combination of
"--topo-order" and the output filter in get_revision. It will
return NULL when we see the first commit out of date-order, even
if we have other commits coming.
So we really should do the "past the date order" thing in
get_revision() only if we have _not_ done it already in
limit_list().
Something like this.
The easiest way to test this is with just
gitk --since=3.days.ago
on the kernel tree. Without this patch, it tends to be pretty obviously
broken.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes git-rev-list able to do path-limiting without having to parse
all of history before it starts showing the results.
This makes things like "git log -- pathname" much more pleasant to use.
This is actually a pretty small patch, and the biggest part of it is
purely cleanups (turning the "goto next" statements into "continue"), but
it's conceptually a lot bigger than it looks.
What it does is that if you do a path-limited revision list, and you do
_not_ ask for pseudo-parenthood information, it won't do all the
path-limiting up-front, but instead do it incrementally in
"get_revision()".
This is an absolutely huge deal for anything like "git log -- <pathname>",
but also for some things that we don't do yet - like the "find where
things changed" logic I've described elsewhere, where we want to find the
previous revision that changed a file.
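To make the incremental path limiting described above concrete, here is a
small sketch over a toy linear history. Everything in it (toy_commit,
toy_walk, toy_get_revision(), the "touches_path" bit standing in for a tree
diff) is hypothetical and only meant to show the shape of the loop, not the
real "get_revision()".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy commit in a linear history. */
struct toy_commit {
	int id;
	int touches_path;	/* stand-in for a tree diff against the parent */
	struct toy_commit *parent;
};

struct toy_walk {
	struct toy_commit *next;	/* next commit to examine */
	int examined;			/* how many commits were inspected so far */
};

/*
 * Return the next commit that touches the path, inspecting only as
 * much history as is needed to produce that one result.
 */
struct toy_commit *toy_get_revision(struct toy_walk *w)
{
	assert(w != NULL);
	while (w->next) {
		struct toy_commit *c = w->next;

		w->next = c->parent;
		w->examined++;
		if (!c->touches_path)
			continue;	/* filtered lazily, at output time */
		return c;
	}
	return NULL;
}
```

The "examined" counter is there just to show that the first result comes
back after looking at one commit rather than the whole history.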
The reason I put "RFC" in the subject line is that while I've validated it
various ways, like doing
git-rev-list HEAD -- drivers/char/ | md5sum
before-and-after on the kernel archive, it's "git-rev-list" after all. In
other words, it's that really really subtle and complex central piece of
software. So while I think this is important and should go in asap, I also
think it should get lots of testing and eyeballs looking at the code.
Btw, don't even bother testing this with the git archive. git itself is so
small that parsing the whole revision history for it takes about a second
even with path limiting. The thing that _really_ shows this off is doing
git log drivers/
on the kernel archive, or even better, on the _historic_ kernel archive.
With this change, the response is instantaneous (although seeking to the
end of the result will obviously take as long as it ever did). Before this
change, the command would think about the result for tens of seconds - or
even minutes, in the case of the bigger old kernel archive - before
starting to output the results.
NOTE NOTE NOTE! Using path limiting with things like "gitk", which uses
the "--parents" flag to actually generate a pseudo-history of the
resulting commits won't actually see the improvement in interactivity,
since that forces git-rev-list to do the whole-history thing after all.
MAYBE we can fix that too at some point, but I won't promise anything.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Not only do we do it in both rev-list.c and git.c, the revision walking
code will soon want to know whether we should rewrite parenthood
information or not.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Marco reported that
$ git rev-list --boundary --topo-order --parents 5aa44d5..ab57c8d
misses these two boundary commits.
c649657501eb38cc689e
Indeed, we can see that gitk shows these two commits at the
bottom, because the --boundary code failed to output them.
The code did not check to avoid pushing the same uninteresting
commit twice to the result list. I am not sure why this fixes
the reported problem, but this seems to fix it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The earlier change to make "..B" mean "HEAD..B" (aka ^HEAD B)
has a constness gotcha that GCC complains about. Fix it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
For consistency reasons, we should probably allow that to be written as
just "..branch", the same way we can write "branch.." to mean "everything
in HEAD but not in branch".
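A minimal sketch of the defaulting rule, assuming a hypothetical helper
toy_parse_range() that splits "A..B" into caller-provided buffers. It is not
the parser in "revision.c", and it makes no attempt to handle any other
notation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "A..B" into 'from' and 'to' (each a buffer of 'len' bytes).
 * An empty side defaults to "HEAD", so "..B" means "HEAD..B" and
 * "A.." means "A..HEAD".  Returns 1 for a range, 0 if 'arg' has no
 * ".." or the buffers are too small.
 */
int toy_parse_range(const char *arg, char *from, char *to, size_t len)
{
	const char *dots;
	size_t n;

	assert(arg && from && to);
	dots = strstr(arg, "..");
	if (!dots)
		return 0;
	n = (size_t)(dots - arg);
	if (len < sizeof("HEAD") || n >= len || strlen(dots + 2) >= len)
		return 0;
	memcpy(from, arg, n);
	from[n] = '\0';
	strcpy(to, dots + 2);
	if (!*from)
		strcpy(from, "HEAD");
	if (!*to)
		strcpy(to, "HEAD");
	return 1;
}
```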
Signed-off-by: Junio C Hamano <junkio@cox.net>
With the new --boundary flag, the output from rev-list includes
the UNINTERESTING commits at the boundary, which are usually not
shown. Their object names are prefixed with '-'.
For example, with this graph:
          C   side
         /
    A---B---D master
You would get something like this:
$ git rev-list --boundary --header --parents side..master
D B
tree D^{tree}
parent B
... log message for commit D here ...
\0-B A
tree B^{tree}
parent A
... log message for commit B here ...
\0
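The first line of each record in the example above could be produced along
these lines. This is only a sketch: toy_commit, TOY_BOUNDARY and
toy_format() are made-up names, and a single parent is assumed for
simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_BOUNDARY 1u	/* hypothetical flag bit for a boundary commit */

struct toy_commit {
	const char *name;
	const char *parent;
	unsigned flags;
};

/*
 * Format "<name> <parent>" in the style of --parents output, with a
 * leading '-' when the commit is a boundary commit.  Returns what
 * snprintf() returns.
 */
int toy_format(const struct toy_commit *c, char *buf, size_t len)
{
	assert(c && buf);
	return snprintf(buf, len, "%s%s %s",
			(c->flags & TOY_BOUNDARY) ? "-" : "",
			c->name, c->parent);
}
```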
Signed-off-by: Junio C Hamano <junkio@cox.net>
When passing in a pathname pattern without the "--" separator on the
command line, we verify that the pathnames in question exist. However,
there were two bugs in that verification:
- git-rev-parse would only check the first pathname, and silently allow
any invalid subsequent pathname, whether it existed or not (which
defeats the purpose of the check, and is also inconsistent with what
git-rev-list actually does)
- git-rev-list (and "git log" etc) would check each filename, but if the
check failed, it would print the error using the first one, i.e.:
[torvalds@g5 git]$ git log Makefile bad-file
fatal: 'Makefile': No such file or directory
instead of saying that it's 'bad-file' that doesn't exist.
This fixes both bugs.
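The intended behaviour (check every pathname, and name the one that actually
failed) can be sketched as below. The helper first_missing_path() is
hypothetical and assumes a POSIX system with lstat(); the real check may
differ in detail.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Check every entry of a NULL-terminated path list.  Returns NULL if
 * all of them exist, otherwise the first path that does not, so the
 * caller can name the right one in its error message.
 */
const char *first_missing_path(const char **paths)
{
	struct stat st;

	assert(paths != NULL);
	for (; *paths; paths++)
		if (lstat(*paths, &st) < 0)
			return *paths;
	return NULL;
}
```

A caller would then report the returned string, not paths[0], when the
check fails.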
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Marco Costalba reports that --remove-empty omits the commit that
created paths we are interested in. try_to_simplify_commit()
logic was dropping a parent we introduced those paths against,
which I think is not what we meant. Instead, this makes such a
parent parentless.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Marco Costalba reports that --remove-empty omits the commit that
created paths we are interested in. try_to_simplify_commit()
logic was dropping a parent we introduced those paths against,
which I think is not what we meant. Instead, this marks such a
parent uninteresting. The traversal does not go beyond that
parent as advertised, but we still say that the current commit
changed things from that parent.
Signed-off-by: Junio C Hamano <junkio@cox.net>
prune_fn in the rev_info structure is called in place of
try_to_simplify_commit. This makes it possible to do rename tracking
with a custom try_to_simplify_commit-like function.
This commit also introduces init_revisions which initialises the rev_info
structure with default values.
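The callback arrangement can be pictured roughly as follows. This is a
sketch under assumptions: the toy_* names, the struct layout and the choice
of default are invented for the example and do not mirror the real rev_info
structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_commit {
	int id;
	int touches_path;
};

struct toy_rev_info;
typedef void (*toy_prune_fn)(struct toy_rev_info *revs, struct toy_commit *c);

/* Hypothetical, heavily reduced stand-in for rev_info. */
struct toy_rev_info {
	toy_prune_fn prune_fn;	/* called for each commit being considered */
	int pruned;		/* number of commits the callback dropped */
};

/* Default behaviour: drop commits that do not touch the path. */
void toy_default_prune(struct toy_rev_info *revs, struct toy_commit *c)
{
	if (!c->touches_path)
		revs->pruned++;
}

/* A custom callback, e.g. for a caller with its own simplification rules. */
void toy_keep_all(struct toy_rev_info *revs, struct toy_commit *c)
{
	(void)revs;
	(void)c;
}

/* Fill the structure with default values. */
void toy_init_revisions(struct toy_rev_info *revs)
{
	assert(revs != NULL);
	memset(revs, 0, sizeof(*revs));
	revs->prune_fn = toy_default_prune;
}

/* The walker calls whatever callback is installed. */
void toy_visit(struct toy_rev_info *revs, struct toy_commit *c)
{
	if (revs->prune_fn)
		revs->prune_fn(revs, c);
}
```

A caller that wants different simplification only has to assign its own
function to prune_fn after initialisation.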
Signed-off-by: Fredrik Kuivinen <freku045@student.liu.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When git-rev-list (and git-log) collapsed the ancestry chain to
commits that touch the specified paths, we failed to inspect and
notice tree changes when we were about to hit an uninteresting
parent. This caused "git rev-list since.. -- file" to
always show the child commit after the lower bound, even if it
did not touch the file. This commit fixes it.
Thanks to Catalin for reporting this.
See also:
461cf59f89
Signed-off-by: Junio C Hamano <junkio@cox.net>
revision.c:make_parents_uninteresting() is exponential with the number
of merges in the tree. That's fine -- unless some other part of git
already has pulled the whole commit tree into memory ...
Signed-off-by: Junio C Hamano <junkio@cox.net>
This moves the handling of max-count shorthand from the internal
implementation of "git log" to setup_revisions() so other users
of setup_revisions() can use it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Well, assuming breaking --merge-order is fine, here's a patch (on top of
the other ones) that makes
git log <filename>
actually work, as far as I can tell.
I didn't add the logic for --before/--after flags, but that should be
pretty trivial, and is independent of this anyway.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This actually moves the "meat" of the revision walking from rev-list.c
to the new library code in revision.h. It introduces the new functions
void prepare_revision_walk(struct rev_info *revs);
struct commit *get_revision(struct rev_info *revs);
to prepare and then walk the revisions that we have.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes the rewrite easier to validate in that the revision flag
parsing and walking parts are now all in the rev_info structure.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This really just splits things up partially, and creates the
interface to set things up by parsing the command line.
No real code changes so far, although the parsing of filenames is a bit
stricter. In particular, if there is a "--", then we do not accept any
filenames before it, and if there isn't any "--", then we check that _all_
paths listed are valid, not just the first one.
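The stricter rule can be sketched as a function that decides where the
paths start. All names here are hypothetical, and the two predicates are
hard-coded stand-ins (a real implementation would resolve revisions and
check the filesystem).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*toy_pred)(const char *arg);

/* Hard-coded stand-ins for "is this a revision?" and "does this path exist?". */
int toy_is_rev(const char *arg)
{
	return !strcmp(arg, "HEAD") || !strcmp(arg, "master");
}

int toy_path_exists(const char *arg)
{
	return !strcmp(arg, "Makefile");
}

/*
 * Return the index of the first path argument (argc if there is none),
 * or -1 on error with *bad pointing at the offending argument.
 */
int toy_split_args(int argc, const char **argv,
		   toy_pred is_rev, toy_pred path_exists, const char **bad)
{
	int i, j, dashdash = -1;

	assert(argv && bad);
	for (i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "--")) {
			dashdash = i;
			break;
		}
	}
	if (dashdash >= 0) {
		/* With "--": no filenames are accepted before it. */
		for (i = 0; i < dashdash; i++) {
			if (!is_rev(argv[i])) {
				*bad = argv[i];
				return -1;
			}
		}
		return dashdash + 1;
	}
	/* Without "--": revisions first, then _all_ remaining paths must be valid. */
	for (i = 0; i < argc && is_rev(argv[i]); i++)
		;
	for (j = i; j < argc; j++) {
		if (!path_exists(argv[j])) {
			*bad = argv[j];
			return -1;
		}
	}
	return i;
}
```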
The new argument parsing automatically also gives us "--default" and
"--not" handling as in git-rev-parse.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>