Set up for better tree diff optimizations

This is mainly just a cleanup patch, and sets up for later changes where
the tree-diff.c "interesting()" function can return more than just a
yes/no value.

In particular, it should be quite possible to say "no subsequent entries
in this tree can possibly be interesting any more", and thus allow the
callers to short-circuit the tree entirely.

In fact, changing the callers to do so is trivial, and is really all this
patch really does, because changing "interesting()" itself to say that
nothing further is going to be interesting is definitely more complicated,
considering that we may have arbitrary pathspecs.

But in cleaning up the callers, this actually fixes a potential small
performance issue in diff_tree(): if the second tree has a lot of
uninterestign crud in it, we would keep on doing the "is it interesting?"
check on the first tree for each uninteresting entry in the second one.

The answer is obviously not going to change, so that was just not helping.
The new code is clearer and simpler and avoids this issue entirely.

I also renamed "interesting()" to "tree_entry_interesting()", because I
got frustrated by the fact that

 - we actually had *another* function called "interesting()" in another
   file, and I couldn't tell from the profiles which one was the one that
   mattered more.

 - when rewriting it to return a ternary value, you can't just do

	if (interesting(...))
		...

   any more, but want to assign the return value to a local variable. The
   name of choice for that variable would normally be "interesting", so
   I just wanted to make the function name be more specific, and avoid
   that whole issue (even though I then didn't choose that name for either
   of the users, just to avoid confusion in the patch itself ;)

In other words, this doesn't really change anything, but I think it's a
good thing to do, and if somebody comes along and writes the logic for
"yeah, none of the pathspecs you have are interesting", we now support
that trivially.

It could easily be a meaningful optimization for things like "blame",
where there's just one pathspec, and stopping when you've seen it would
allow you to avoid about 50% of the tree traversals on average.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This commit is contained in:
Linus Torvalds 2007-03-18 15:18:30 -07:00 committed by Junio C Hamano
parent c711a214c1
commit 5d86501742

View File

@ -66,7 +66,15 @@ static int compare_tree_entry(struct tree_desc *t1, struct tree_desc *t2, const
return 0;
}
static int interesting(struct tree_desc *desc, const char *base, int baselen, struct diff_options *opt)
/*
* Is a tree entry interesting given the pathspec we have?
*
* Return:
* - positive for yes
* - zero for no
* - negative for "no, and no subsequent entries will be either"
*/
static int tree_entry_interesting(struct tree_desc *desc, const char *base, int baselen, struct diff_options *opt)
{
const char *path;
const unsigned char *sha1;
@ -123,7 +131,10 @@ static int interesting(struct tree_desc *desc, const char *base, int baselen, st
static void show_tree(struct diff_options *opt, const char *prefix, struct tree_desc *desc, const char *base, int baselen)
{
while (desc->size) {
if (interesting(desc, base, baselen, opt))
int show = tree_entry_interesting(desc, base, baselen, opt);
if (show < 0)
break;
if (show)
show_entry(opt, prefix, desc, base, baselen);
update_tree_entry(desc);
}
@ -158,22 +169,35 @@ static void show_entry(struct diff_options *opt, const char *prefix, struct tree
}
}
static void skip_uninteresting(struct tree_desc *t, const char *base, int baselen, struct diff_options *opt)
{
while (t->size) {
int show = tree_entry_interesting(t, base, baselen, opt);
if (!show) {
update_tree_entry(t);
continue;
}
/* Skip it all? */
if (show < 0)
t->size = 0;
return;
}
}
int diff_tree(struct tree_desc *t1, struct tree_desc *t2, const char *base, struct diff_options *opt)
{
int baselen = strlen(base);
while (t1->size | t2->size) {
for (;;) {
if (opt->quiet && opt->has_changes)
break;
if (opt->nr_paths && t1->size && !interesting(t1, base, baselen, opt)) {
update_tree_entry(t1);
continue;
}
if (opt->nr_paths && t2->size && !interesting(t2, base, baselen, opt)) {
update_tree_entry(t2);
continue;
if (opt->nr_paths) {
skip_uninteresting(t1, base, baselen, opt);
skip_uninteresting(t2, base, baselen, opt);
}
if (!t1->size) {
if (!t2->size)
break;
show_entry(opt, "+", t2, base, baselen);
update_tree_entry(t2);
continue;