Commit Graph

59 Commits

Author SHA1 Message Date
Linus Torvalds
fe5f51ce27 Optimize common case of git-rev-list
I took a look at webgit, and it looks like at least for the "projects"
page, the most common operation ends up being basically

	git-rev-list --header --parents --max-count=1 HEAD

Now, the thing is, the way "git-rev-list" works, it always keeps on
popping the parents and parsing them in order to build the list of
parents, and it turns out that even though we just want a single commit,
git-rev-list will invariably look up _three_ generations of commits.

It will parse:
 - the commit we want (it obviously needs this)
 - its parent(s) as part of the "pop_most_recent_commit()" logic
 - it will then pop one of the parents before it notices that it doesn't
   need any more
 - and as part of popping the parent, it will parse the grandparent (again
   due to "pop_most_recent_commit()").

Now, I've strace'd it, and it really is pretty efficient on the whole, but
if things aren't nicely cached, and with long-latency IO, doing those two
extra objects (at a minimum - if the parent is a merge it will be more) is
just wasted time, and potentially a lot of it.

So here's a quick special-case for the trivial case of "just one commit,
and no date-limits or other special rules".

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 18:41:28 -07:00
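
A minimal Python sketch of the extra parsing that fe5f51ce27 describes, assuming a toy linear history. `HISTORY`, `general_walk` and `single_commit_fast_path` are hypothetical names used for illustration only; this is a simplified model of the behaviour in the message, not the code in rev-list.c.

```python
# Hypothetical toy history: each commit maps to its list of parents.
HISTORY = {"c3": ["c2"], "c2": ["c1"], "c1": ["c0"], "c0": []}


def general_walk(head, max_count, parsed):
    """Model of the general loop: popping a commit parses its parents."""
    parsed.append(head)                  # the commit we want
    queue, out = [head], []
    while queue:
        commit = queue.pop(0)
        for parent in HISTORY[commit]:   # popping parses the parents
            parsed.append(parent)
            queue.append(parent)
        if len(out) == max_count:        # limit is noticed only after the pop
            break
        out.append(commit)
    return out


def single_commit_fast_path(head, parsed):
    """Model of the special case: parse the one commit and stop."""
    parsed.append(head)
    return [head]


general_parsed, fast_parsed = [], []
general_walk("c3", 1, general_parsed)
single_commit_fast_path("c3", fast_parsed)
print(len(general_parsed), "commits parsed by the general loop")
print(len(fast_parsed), "commit parsed by the fast path")
```

In this model the general loop touches three generations for a single requested commit, while the fast path touches only the commit itself.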
Junio C Hamano
e091eb9325 upload-pack: Do not choke on too many heads request.
Cloning from a repository with more than 256 refs (heads and tags
included) will choke, because upload-pack has a built-in limit of
feeding not more than MAX_NEEDS (currently 256) heads to underlying
git-rev-list.  This is a problem when cloning a repository with many
tags, like http://www.linux-mips.org/pub/scm/linux.git, which has 290+
tags.

This commit introduces a new flag, --all, to git-rev-list, to include
all refs in the repository.  Updated upload-pack detects requests that
ask for more than MAX_NEEDS refs, and sends everything back instead.

We probably want to tweak the definitions of MAX_NEEDS and
MAX_HAS, but that is a separate topic.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-05 14:49:54 -07:00
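
The fallback in e091eb9325 can be sketched as follows, assuming the limit of 256 quoted in the message. `rev_list_args` is a hypothetical helper, not a function in upload-pack, and the real code builds its argument vector differently.

```python
MAX_NEEDS = 256  # limit quoted in the commit message


def rev_list_args(wanted_refs):
    """Return the ref arguments to hand to git-rev-list.

    When the request names more refs than MAX_NEEDS, give up on listing
    them individually and ask for everything with --all instead.
    """
    if len(wanted_refs) > MAX_NEEDS:
        return ["--all"]
    return list(wanted_refs)


# A repository with 290 tags exceeds the limit and falls back to --all.
many_tags = ["tag-%d" % i for i in range(290)]
print(rev_list_args(many_tags))
print(rev_list_args(["master", "v1.0"]))
```

The trade-off is that an oversized request receives more than it asked for, which the message accepts as preferable to failing the clone.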
Junio C Hamano
c807f77194 Fix minor DOS in rev-list.
A carefully crafted pathname can be used to disrupt downstream git-pack-objects
that uses 'git-rev-list --objects' output.  Prevent this.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-02 17:29:21 -07:00
Linus Torvalds
27cfe2e2dc Make time-based commit filtering work with topological ordering.
The trick is to consider the time-based filtering a limiter, the same way
we do for release ranges.

That means that the time-based filtering runs _before_ the topological
sorting, which makes it meaningful again. It also simplifies the code
logic.

This makes "gitk" useful with time ranges.

[ Second version: --merge-order now unaffected by the re-org ]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-20 18:11:34 -07:00
Linus Torvalds
2a7055ae98 [PATCH] Fix "git-rev-list" revision range parsing
There were two bugs in there:
 - if the range didn't end up working, we restored the '.' character in
   the wrong place.
 - an empty end-of-range should be interpreted as HEAD.

See rev-parse.c for the reference implementation of this.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-17 11:57:50 -07:00
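
A rough Python sketch of the range rule stated in 2a7055ae98, assuming "since..until" expands to an excluded start and an included end. `parse_range` is a hypothetical name; the reference implementation the message points to lives in rev-parse.c and is not reproduced here.

```python
def parse_range(arg):
    """Expand 'since..until' into ['^since', 'until'].

    Returns None when the argument is not a range. An empty end of
    range is interpreted as HEAD, as the commit message requires.
    """
    since, sep, until = arg.partition("..")
    if not sep:
        return None
    return ["^" + since, until or "HEAD"]


print(parse_range("v2.6.12..HEAD"))
print(parse_range("v2.6.12.."))
print(parse_range("HEAD"))
```

Because the argument is split without being modified, there is no '.' character to restore when the argument turns out not to be a range, which sidesteps the first bug the message mentions.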
Linus Torvalds
8805ccac40 [PATCH] Avoid building object ref lists when not needed
The object parsing code builds a generic "this object references that
object" because doing a full connectivity check for fsck requires it.

However, nothing else really needs it, and it's quite expensive for
git-rev-list that can have tons of objects in flight.

So, exactly like the commit buffer save thing, add a global flag to
disable it, and use it in git-rev-list.

Before:

	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
	12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+26718minor)pagefaults 0swaps
	59124

After this change:

	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
	10.33user 0.18system 0:10.54elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+18509minor)pagefaults 0swaps
	59124

and note how the number of pages touched by git-rev-list for this
particular object list has shrunk from 26,718 (104 MB) to 18,509 (72 MB).

Calculating the total object difference between two git revisions is still
clearly the most expensive git operation (both in memory and CPU time),
but it's now less than 40% of what it used to be.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-16 15:32:23 -07:00
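
The page-count arithmetic in 8805ccac40 can be checked with a short calculation, assuming 4 KiB pages (the page size is not stated in the message). `pages_to_mib` is a hypothetical helper, and the 47947 figure is the "before" count quoted in the earlier commit 5bdbaaa4e9.

```python
PAGE_SIZE = 4096  # bytes; assumed 4 KiB pages


def pages_to_mib(pages):
    """Convert a minor-fault page count into mebibytes."""
    return pages * PAGE_SIZE / (1024 * 1024)


before, after, original = 26718, 18509, 47947
print(int(pages_to_mib(before)), "MB before")
print(int(pages_to_mib(after)), "MB after")
print("fraction of the original footprint: %.2f" % (after / original))
```

Under that assumption the numbers agree with the message: roughly 104 MB and 72 MB, and the final footprint is below 40% of the 47947-page starting point.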
Linus Torvalds
b0d8923ec0 [PATCH] Improve git-rev-list memory usage further
This avoids keeping tree entries around, and frees them as it traverses
the list. This avoids building up a huge memory footprint just for these
small but very common allocations.

Before:

	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
	11.65user 0.38system 0:12.65elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+42934minor)pagefaults 0swaps
	59124

After:

	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
	12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+26718minor)pagefaults 0swaps
	59124

Note how the minor fault numbers - which end up being how many pages we
needed to map - go down from 42934 (167 MB) to 26718 (104 MB).  That is:

Before:
	42934 minor pagefaults

After:

	26718 minor pagefaults

This is all in _addition_ to the previous fixes.  It used to be
~48,000 pagefaults.

That's still a honking big memory footprint, but it's about half of what
it was just a day or two ago (and this is the object list for a pretty big
update - almost 60,000 objects. Smaller updates need less memory).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-16 15:19:07 -07:00
Linus Torvalds
5bdbaaa4e9 [PATCH] Re-organize "git-rev-list --objects" logic
The logic to calculate the full object list used to be very inter-twined
with the logic that looked up the commits.

For no good reason - it's actually a lot simpler to just do that logic
as a separate pass.

This improves performance a bit, and uses slightly less memory in my
tests, but more importantly it makes the code simpler to work with and
follow what it does.

The performance win is less than I had hoped for, but I get:

Before:

	[torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
	13.64user 0.42system 0:14.13elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+47947minor)pagefaults 0swaps
	58945

After:

	[torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
	11.80user 0.36system 0:12.16elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+42684minor)pagefaults 0swaps
	58945

ie it improved by 2 seconds, and took 5000+ fewer pages (hey, that's
20MB out of 174MB to go). And got the same number of objects (in theory,
the more expensive one might find some more shared objects to avoid. In
practice it obviously doesn't).

I know how to make it use _lots_ less memory, which will probably speed it
up. But that's for another time, and I'd prefer to see this go in first.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-15 16:19:44 -07:00
Linus Torvalds
60ab26de99 [PATCH] Avoid wasting memory in git-rev-list
As pointed out on the list, git-rev-list can use a lot of memory.

One low-hanging fruit is to free the commit buffer for commits that we
parse. By default, parse_commit() will save away the buffer, since a lot
of cases do want it, and re-reading it continually would be unnecessary.
However, in many cases the buffer isn't actually necessary and saving it
just wastes memory.

We could just free the buffer ourselves, but especially in git-rev-list,
we actually end up using the helper functions that automatically add
parent commits to the commit lists, so we don't actually control the
commit parsing directly.

Instead, just make this behaviour of "parse_commit()" a global flag.
Maybe this is a bit tasteless, but it's very simple, and it makes a
noticeable difference in memory usage.

Before the change:

	[torvalds@g5 linux]$ /usr/bin/time git-rev-list v2.6.12..HEAD > /dev/null
	0.26user 0.02system 0:00.28elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+3714minor)pagefaults 0swaps

after the change:

	[torvalds@g5 linux]$ /usr/bin/time git-rev-list v2.6.12..HEAD > /dev/null
	0.26user 0.00system 0:00.27elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+2433minor)pagefaults 0swaps

note how the minor faults have decreased from 3714 pages to 2433 pages.
That's all due to the fewer anonymous pages allocated to hold the commit
buffers and their metadata.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-15 14:57:52 -07:00
Pavel Roskin
d998a0895f [PATCH] Fix "prefix" mixup in git-rev-list
Recent changes in git have broken cg-log.  git-rev-list no longer
prints "commit" in front of commit hashes.  It turns out a local
"prefix" variable in main() shadows a file-scoped "prefix" variable.

The patch removes the local "prefix" variable since its value is never
used (in the intended way, that is).  The call to
setup_git_directory() is kept since it has useful side effects.

The file-scoped "prefix" variable is renamed to "commit_prefix" just
in case someone reintroduces "prefix" to hold the return value of
setup_git_directory().

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-24 16:50:16 -07:00
Linus Torvalds
90e1848113 Make "git-rev-list" work within subdirectories
This trivial patch makes "git-rev-list" able to handle not being in
the top-level directory.  This magically also makes "git-whatchanged"
do the right thing.

Trivial scripting fix to make sure that "git log" also works.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-23 12:43:57 -07:00
Sergey Vlasov
7f1335c74c [PATCH] git-rev-list: avoid crash on broken repository
When following tags, check for parse_object() success and error out
properly instead of segfaulting.

Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-19 13:07:43 -07:00
Junio C Hamano
d87449c553 Introduce --pretty=oneline format.
This introduces --pretty=oneline to git-rev-tree and
git-rev-list commands to show only the first line of the commit
message, without frills. 

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-09 22:28:23 -07:00
Johannes Schindelin
76cd8eb619 [PATCH] add *--no-merges* flag to suppress display of merge commits
As requested by Junio (who suggested --single-parents-only, but this
could forget a no-parent root).

Also, adds a few missing options to the usage string.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-09 22:28:21 -07:00
Junio C Hamano
1215879cdc Teach rev-list since..til notation.
The King Penguin says:

    Now, for extra bonus points, maybe you should make "git-rev-list" also
    understand the "rev..rev" format (which you can't do with just the
    get_sha1() interface, since it expands into more).

The faithful servant makes it so.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-05 01:25:31 -07:00
Petr Baudis
dd53c7ab29 [PATCH] Support for NO_OPENSSL
Support for completely OpenSSL-less builds. FSF considers distributing GPL
binaries with OpenSSL linked in as a legal problem so this is trouble
e.g. for Debian, or some people might not want to install OpenSSL
anyway. If you

	make NO_OPENSSL=1

you get a completely OpenSSL-less build, disabling --merge-order and using
Mozilla's SHA1 implementation.

Ported from Cogito.

Signed-off-by: Petr Baudis <pasky@ucw.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-29 17:21:52 -07:00
Linus Torvalds
6c3b84c81c [PATCH] Fix interesting git-rev-list corner case
This corner-case was triggered by a kernel commit that was not in date
order, due to a misconfigured time zone that made the commit appear three
hours older than it was.

That caused git-rev-list to traverse the commit tree in a non-obvious
order, and made it parse several of the _parents_ of the misplaced commit
before it actually parsed the commit itself. That's fine, but it meant
that the grandparents of the commit didn't get marked uninteresting,
because they had been reached through an "interesting" branch.

The reason was that "mark_parents_uninteresting()" (which is supposed to
mark all existing parents as being uninteresting - duh) didn't actually
traverse more than one level down the parent chain.

NORMALLY this is fine, since with the date-based traversal order,
grandparents won't ever even have been looked at before their parents (so
traversing the chain down isn't needed, because the next time around when
we pick out the parent we'll mark _its_ parents uninteresting), but since
we'd gotten out of order, we'd already seen the parent and thus never got
around to mark the grandparents.

Anyway, the fix is simple. Just traverse parent chains recursively.
Normally the chain won't even exist (since the parent hasn't been parsed
yet), so this is not actually going to trigger except in this strange
corner-case.

Add a comment to the simple one-liner, since this was a bit subtle, and I
had to really think things through to understand how it could happen.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-29 17:21:46 -07:00
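
A minimal Python sketch of the recursive marking described in 6c3b84c81c. The `Commit` class and its fields are hypothetical stand-ins for the C structures; only the idea of following already-parsed parent chains is taken from the message.

```python
class Commit:
    """Toy commit: parents stay empty until the commit has been parsed."""

    def __init__(self, name):
        self.name = name
        self.parents = []
        self.uninteresting = False


def mark_parents_uninteresting(commit):
    """Mark every already-known ancestor as uninteresting.

    Normally a parent has not been parsed yet, so its parent list is
    empty and the recursion stops after one level.
    """
    for parent in commit.parents:
        parent.uninteresting = True
        mark_parents_uninteresting(parent)


# Out-of-order case: the parent was parsed first via an interesting
# branch, so its own parent (the grandparent) is already linked in.
grandparent = Commit("grandparent")
parent = Commit("parent")
parent.parents = [grandparent]
misplaced = Commit("misplaced")
misplaced.parents = [parent]

mark_parents_uninteresting(misplaced)
print(parent.uninteresting, grandparent.uninteresting)
```

A single-level version would flag only `parent` here and leave `grandparent` looking interesting, which is the corner case the commit fixes.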
Junio C Hamano
54c6870ebf Typofix: usage strings fix.
The *_usage strings should not start with "usage: ", since the
usage() function gives its own.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-27 11:53:49 -07:00
Linus Torvalds
4311d328fe Be more aggressive about marking trees uninteresting
We'll mark all the trees at the edges (as deep as we had to go to
realize that we have all the commits needed) as uninteresting.
Otherwise we'll occasionally list a lot of objects that were actually
available at the edge in a commit that we just never ended up parsing
because we could determine early that we had all relevant commits.

NOTE! The object listing is still just a _heuristic_.  It's guaranteed
to list a superset of the actual new objects, but there might be the
occasional old object in the list, just because the commit that
referenced it was much further back in the history.

For example, let's say that a recent commit is a revert of part of the
tree to much older state: since we didn't walk _that_ far back in the
commit history tree to list the commits necessary, git-rev-list will
never have marked the old objects uninteresting, and we'll end up
listing them as "new".

That's ok.
2005-07-23 10:01:49 -07:00
Junio C Hamano
013aab8265 [PATCH] Dereference tag repeatedly until we get a non-tag.
When we allow a tag object in place of a commit object, we only
dereferenced the given tag once, which causes a tag that points at a tag
that points at a commit to be rejected.  Instead, dereference tag
repeatedly until we get a non-tag.

This patch makes changes to two functions:

 - commit.c::lookup_commit_reference() is used by merge-base,
   rev-tree and rev-parse to convert user supplied SHA1 to that of
   a commit.
 - rev-list uses its own get_commit_reference() to do the same.

Dereferencing tags this way helps both of these uses.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-11 10:13:09 -07:00
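
The repeated dereferencing in 013aab8265 amounts to a small loop. The sketch below assumes a toy object table; `OBJECTS` and `deref_tag` are hypothetical names and do not mirror the signatures of lookup_commit_reference() or get_commit_reference().

```python
# Hypothetical object store: name -> (object type, what a tag points at).
OBJECTS = {
    "tag-of-tag": ("tag", "v1.0"),
    "v1.0": ("tag", "abc123"),
    "abc123": ("commit", None),
}


def deref_tag(name):
    """Follow tag objects until a non-tag object is reached."""
    kind, target = OBJECTS[name]
    while kind == "tag":
        name = target
        kind, target = OBJECTS[name]
    return name, kind


print(deref_tag("tag-of-tag"))
```

Dereferencing only once would stop at `v1.0`, which is still a tag, and the caller would reject it as not being a commit.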
Linus Torvalds
454fbbcde3 git-rev-list: allow missing objects when the parent is marked UNINTERESTING
We still want the "top-most" uninteresting object to exist, so that we
know that we have reached it.
2005-07-10 15:09:46 -07:00
Jon Seymour
a7336ae514 [PATCH] Ensure list insertion method does not depend on position of --merge-order argument
This change ensures that git-rev-list --merge-order produces the same result
irrespective of what position the --merge-order argument appears in the argument
list.

Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-06 18:03:53 -07:00
Linus Torvalds
960cea2dd1 git-rev-list: remove the DUPCHECK logic, use SEEN instead
That's what we should have done in the first place, since it not only
avoids another unnecessary flag, it also protects the commits from
showing up as duplicates later when they show up as parents of another
commit (in the pop_most_recent_commit() path).

This will hopefully also fix --topo-sort.
2005-07-06 16:52:49 -07:00
Linus Torvalds
e6c3505b44 Make sure we generate the whole commit list before trying to sort it topologically
This was my cherry-picking merge bug.  But topo-order still shows strange
behaviour with multiple heads, so keep gitk using --merge-order for now.
2005-07-06 10:51:43 -07:00
Jon Seymour
d2775a817a [PATCH] Tidy up - slight simplification of rev-list.c
This patch implements a small tidy up of rev-list.c to reduce
(but not eliminate) the amount of ugliness associated
with the merge_order flag.

Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-06 10:28:02 -07:00
Linus Torvalds
d2d02a4906 Add "--topo-order" flag to use new topological sort
2005-07-06 10:25:04 -07:00
Linus Torvalds
bce6286670 Remove insane overlapping bit ranges from epoch.c
..and move the DUPCHECK to rev-list.c since both the merge-order and the
upcoming topo-sort get confused by dups.
2005-07-06 09:56:16 -07:00
Linus Torvalds
7e21c29b56 Clean up commit insertion in git-rev-list
Jon wants the commits in a different order for merge-order.
2005-07-06 09:38:06 -07:00
Linus Torvalds
f755494cec Make "insert_by_date()" match "commit_list_insert()"
Same argument order, same return type.  This allows us to use a function
pointer to choose one over the other.
2005-07-06 09:31:17 -07:00
Linus Torvalds
12ba7eaf1d Remove unnecessary usage of strncmp() in git-rev-list arg parsing.
Not only is it unnecessary, it incorrectly allows extraneous characters
at the end of the argument.

Junio noticed the --merge-order thing, and Jon points out that if we fix
that one, we should fix --show-breaks too.
2005-07-05 12:12:50 -07:00
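
The problem fixed in 12ba7eaf1d is the difference between a prefix comparison and a full comparison. A small Python analogue, with hypothetical function names standing in for the two C idioms:

```python
def matches_prefix_only(arg, option):
    """Analogue of strncmp(arg, option, strlen(option)) == 0."""
    return arg[:len(option)] == option


def matches_exactly(arg, option):
    """Analogue of strcmp(arg, option) == 0."""
    return arg == option


arg = "--merge-orderXYZ"
print(matches_prefix_only(arg, "--merge-order"))
print(matches_exactly(arg, "--merge-order"))
```

The prefix form accepts the malformed argument because it never looks past the length of the option name; the exact form rejects it.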
Linus Torvalds
7a662e896b git-rev-list: make sure the output is sorted by recency
We didn't sort the refs by date, so if you had multiple refs, the end
result would not be properly sorted.
2005-07-04 16:49:37 -07:00
Linus Torvalds
7620d39fcb Make rev-list flush the stdio buffers after each rev.
We'd rather get the revisions in a slow but timely manner than
have to wait for them.
2005-07-04 16:36:48 -07:00
Linus Torvalds
12d2a18780 "git rev-list --unpacked" shows only unpacked commits
More infrastructure to do efficient incremental packs.
2005-07-03 13:29:54 -07:00
Linus Torvalds
960bba0d8c Add "--all" flag to rev-parse that shows all refs
And make git-rev-list just silently ignore non-commit refs if we're not
asking for all objects.
2005-07-03 13:07:52 -07:00
Linus Torvalds
6da4016aea Fix sparse warnings.
Mainly making a lot of local functions and variables be marked "static",
but there was a "zero as NULL" warning in there too.
2005-07-03 10:10:45 -07:00
Linus Torvalds
36f8d17445 Teach git-rev-list about non-commit objects
Now you can give git-rev-list tags, trees and blobs, and it will do the
proper reachability for them all. Knock wood.

Of course, you need the "--objects" flag to do anything but plain
commits.
2005-06-29 11:30:24 -07:00
Linus Torvalds
3c90f03d32 Prepare git-rev-list for tracking tag objects too
We want to be able to just say "give a difference between these
objects", rather than limiting it to commits only.  This isn't there
yet, but it sets things up to be a bit easier.
2005-06-29 10:40:14 -07:00
Linus Torvalds
9b66ec0474 Add "--pretty=full" format that also shows committer.
Also move the common implementation of parsing the --pretty argument
format into commit.c rather than having duplicates in diff-tree.c and
rev-list.c.
2005-06-26 17:50:46 -07:00
Linus Torvalds
9ce43d1c90 Ooh. Make git-rev-list --objects associate a name with objects.
The name isn't unique, it's just the first name that object is reached
through, so it's really nothing more than a hint.
2005-06-26 15:26:05 -07:00
Linus Torvalds
9de48752fe git-rev-list: add option to list all objects (not just commits)
When you do

	git-rev-list --objects $(git-rev-parse HEAD^..HEAD)

it now lists not only the "commit difference" between the parent of HEAD
and HEAD itself (which is normally just the parent, but in the case of a
merge will be all the newly merged commits), but also all the new tree
and blob objects that weren't in the original.

NOTE! It doesn't walk all the way to the root, so it doesn't do a full
object search in the full old history.  Instead, it will only look as
far back in the history as it needs to resolve the commits.  Thus, if
the commit reverts a blob (or tree) back to a state much further back in
history, we may end up listing some blobs (or trees) as "new" even
though they exist further back.

Regardless, the list of objects will be a superset (usually exact) list
of objects needed to go from the beginning commit to ending commit.

As a particularly obvious special case,

	git-rev-list --objects HEAD

will end up listing every single object that is reachable from the HEAD
commit.

Side note: the objects are sorted by "recency", with commits first.
2005-06-24 22:56:58 -07:00
Jon Seymour
5e749e259b [PATCH] Fix for --merge-order, --max-age interaction issue
This patch fixes a problem reported by Paul Mackerras regarding the interaction
of the --merge-order and --max-age switches of git-rev-list.

This patch applies to the current Linus HEAD. A cleaner fix for the same problem
in my current HEAD will follow later.

With this change, --merge-order produces the same result as no --merge-order
on the linux-2.6 git repository, to wit:

$> git-rev-list --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 | wc -l
655
$> git-rev-list --merge-order --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 | wc -l
655

Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-19 20:13:18 -07:00
Jon Seymour
51b1e1713b [PATCH] Prevent git-rev-list without --merge-order producing duplicates in output
If b is reachable from a, then:

	 git-rev-list a b

would print one of the commits twice.

This patch fixes that problem. A previous patch fixed it for the
--merge-order switch.

Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-19 20:13:18 -07:00
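
One way to picture the duplicate problem in 51b1e1713b is a date-ordered walk that remembers which commits it has already queued. The sketch below is an illustration under assumed data: `HISTORY` and `rev_list` are hypothetical, and a Python set and heap stand in for the flag bits and commit list used by the C code.

```python
import heapq

# Hypothetical history: name -> (commit date, parents). b is reachable from a.
HISTORY = {"a": (3, ["b"]), "b": (2, ["c"]), "c": (1, [])}


def rev_list(*heads):
    """List commits newest first, queueing each commit at most once."""
    seen = set()
    queue = []

    def push(name):
        if name not in seen:          # skip anything already queued
            seen.add(name)
            heapq.heappush(queue, (-HISTORY[name][0], name))

    for head in heads:
        push(head)
    out = []
    while queue:
        _, name = heapq.heappop(queue)
        out.append(name)
        for parent in HISTORY[name][1]:
            push(parent)
    return out


print(rev_list("a", "b"))
```

Without the `seen` check, `b` would be queued once as a command-line argument and again as a parent of `a`, and so would be printed twice.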
Linus Torvalds
3d958064e0 Avoid warning about function without return.
Strangely, this warning only shows up when not compiling with "-O2",
which is why I didn't see it originally.
2005-06-18 20:02:49 -07:00
Linus Torvalds
8b3a1e056f git-rev-list: add "--bisect" flag to find the "halfway" point
This is useful for doing binary searching for problems.  You start with
a known good and known bad point, and you then test the "halfway" point
in between:

	git-rev-list --bisect bad ^good

and you test that.  If that one tests good, you now still have a known
bad case, but two known good points, and you can bisect again:

	git-rev-list --bisect bad ^good1 ^good2

and test that point.  If that point is bad, you now use that as your
known-bad starting point:

	git-rev-list --bisect newbad ^good1 ^good2

and basically at every iteration you shrink your list of commits by
half: you're binary searching for the point where the troubles started,
even though there isn't a nice linear ordering.
2005-06-17 22:54:50 -07:00
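
A minimal Python sketch of the "halfway" idea in 8b3a1e056f, assuming the halfway commit is the one whose candidate ancestors split the candidate set most evenly. `ancestors` and `bisect` are hypothetical helpers, and the exact selection rule in git-rev-list may differ in detail.

```python
def ancestors(commit, parents, stop):
    """Commits reachable from `commit` (inclusive), not entering `stop`."""
    found, todo = set(), [commit]
    while todo:
        c = todo.pop()
        if c in found or c in stop:
            continue
        found.add(c)
        todo.extend(parents[c])
    return found


def bisect(bad, goods, parents):
    """Pick a commit that splits the bad-but-not-good commits roughly in half."""
    excluded = set()
    for good in goods:
        excluded |= ancestors(good, parents, set())
    candidates = ancestors(bad, parents, excluded)
    total = len(candidates)

    def balance(c):
        n = len(ancestors(c, parents, excluded))
        return min(n, total - n)      # best when n is near total / 2

    return max(sorted(candidates), key=balance)


# Toy linear history c0 <- c1 <- ... <- c5, with c0 good and c5 bad.
PARENTS = {"c0": [], "c1": ["c0"], "c2": ["c1"],
           "c3": ["c2"], "c4": ["c3"], "c5": ["c4"]}
print(bisect("c5", ["c0"], PARENTS))
```

Testing the chosen commit and then re-running with it as either a new good point or the new bad point shrinks the candidate set by about half each time, and the same reachability counting works when the history is not linear.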
Petr Baudis
17ebe977d7 [PATCH] Tidy up some rev-list-related stuff
This patch tidies up the git-rev-list documentation and epoch.c, which
are in severe clash with the unwritten coding style now, and quite
unreadable.

It also fixes up compile failures with older compilers due to variable
declarations after code.

The patch mostly wraps lines before or on the 80th column, removes
plenty of superfluous empty lines and changes comments from // to /* */.

Signed-off-by: Petr Baudis <pasky@ucw.cz>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-08 15:59:09 -07:00
jon@blackcubes.dyndns.org
a3437b8c26 [PATCH] Modify git-rev-list to linearise the commit history in merge order.
This patch linearises the GIT commit history graph into merge order
which is defined by invariants specified in Documentation/git-rev-list.txt.

The linearisation produced by this patch is superior in an objective sense
to that produced by the existing git-rev-list implementation in that
the linearisation produced is guaranteed to have the minimum number of
discontinuities, where a discontinuity is defined as an adjacent pair of
commits in the output list which are not related in a direct child-parent
relationship.

With this patch a graph like this:

	a4 ---
	| \   \
	|  b4 |
	|/ |  |
	a3 |  |
	|  |  |
	a2 |  |
	|  |  c3
	|  |  |
	|  |  c2
	|  b3 |
	|  | /|
	|  b2 |
	|  |  c1
	|  | /
	|  b1
	a1 |
	|  |
	a0 |
	| /
	root

Sorts like this:

	= a4
	| c3
	| c2
	| c1
	^ b4
	| b3
	| b2
	| b1
	^ a3
	| a2
	| a1
	| a0
	= root

Instead of this:

	= a4
	| c3
	^ b4
	| a3
	^ c2
	^ b3
	^ a2
	^ b2
	^ c1
	^ a1
	^ b1
	^ a0
	= root

A test script, t/t6000-rev-list.sh, includes a test which demonstrates
that the linearisation produced by --merge-order has fewer discontinuities
than the linearisation produced by git-rev-list without the --merge-order
flag specified. To see this, do the following:

	cd t
	./t6000-rev-list.sh
	cd trash
	cat actual-default-order
	cat actual-merge-order

The existing behaviour of git-rev-list is preserved, by default. To obtain
the modified behaviour, specify --merge-order or --merge-order --show-breaks
on the command line.

This version of the patch has been tested on the git repository and also on the linux-2.6
repository and has reasonable performance on both - ~50-100% slower than the original algorithm.

This version of the patch has incorporated a functional equivalent of Linus' output limiting
algorithm into the merge-order algorithm itself. This operates per the notes associated
with Linus' commit 337cb3fb8d.

This version has incorporated Linus' feedback regarding proposed changes to rev-list.c.
(see: [PATCH] Factor out filtering in rev-list.c)

This version has improved the way sort_first_epoch marks commits as uninteresting.

For more details about this change, refer to Documentation/git-rev-list.txt
and http://blackcubes.dyndns.org/epoch/.

Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-06 09:07:26 -07:00
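
The discontinuity count that a3437b8c26 uses to compare the two orderings can be reproduced with a short Python sketch. The `PARENTS` table is my reading of the ASCII graph in the message and should be treated as an assumption; `discontinuities` is a hypothetical helper that applies the definition given there (an adjacent pair that is not in a direct child-parent relationship).

```python
# Parent lists as read off the ASCII graph above (an interpretation).
PARENTS = {
    "a4": ["a3", "b4", "c3"],
    "b4": ["a3", "b3"],
    "a3": ["a2"], "a2": ["a1"], "a1": ["a0"], "a0": ["root"],
    "c3": ["c2"], "c2": ["b2", "c1"], "c1": ["b1"],
    "b3": ["b2"], "b2": ["b1"], "b1": ["root"],
    "root": [],
}


def discontinuities(order):
    """Count adjacent pairs where the second is not a parent of the first."""
    return sum(1 for child, nxt in zip(order, order[1:])
               if nxt not in PARENTS[child])


MERGE_ORDER = "a4 c3 c2 c1 b4 b3 b2 b1 a3 a2 a1 a0 root".split()
DEFAULT_ORDER = "a4 c3 b4 a3 c2 b3 a2 b2 c1 a1 b1 a0 root".split()

print("merge order:", discontinuities(MERGE_ORDER))
print("default order:", discontinuities(DEFAULT_ORDER))
```

With this reading of the graph the counts line up with the "^" markers in the two listings: two breaks for the merge-order output and nine for the default output.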
Linus Torvalds
000182eacf pretty_print_commit: add different formats
You can ask to print out "raw" format (full headers, full body),
"medium" format (author and date, full body) or "short" format
(author only, condensed body).

Use "git-rev-list --pretty=short HEAD | less -S" for an example.
2005-06-05 09:02:03 -07:00
Linus Torvalds
337cb3fb8d git-rev-list: allow arbitrary head selections, use git-rev-tree syntax
This makes git-rev-list use the same command line syntax to mark the
commits as git-rev-tree does, and instead of just allowing a start and
end commit, it allows an arbitrary list of "interesting" and "uninteresting"
commits.

For example, imagine that you had three branches (a, b and c) that you
are interested in, but you don't want to see stuff that already exists
in another person's three releases (x, y and z). You can do

	git-rev-list a b c ^x ^y ^z

(order doesn't matter, btw - feel free to put the uninteresting ones
first or otherwise switch them around), and it will show all the
commits that are reachable from a/b/c but not reachable from x/y/z.

The old syntax "git-rev-list start end" would now be written as
"git-rev-list start ^end", or "git-rev-list ^end start".

There's no limit to the number of heads you can specify (unlike
git-rev-tree, which can handle a maximum of 16 heads).
2005-06-04 14:38:28 -07:00
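
The selection rule in 337cb3fb8d is a set difference over reachability. A minimal Python sketch, assuming a toy history; `reachable` and `rev_list` are hypothetical helpers, and the result is returned as an unordered set, whereas the real tool prints commits in order.

```python
def reachable(heads, parents):
    """All commits reachable from any of `heads`, heads included."""
    found, todo = set(), list(heads)
    while todo:
        c = todo.pop()
        if c not in found:
            found.add(c)
            todo.extend(parents[c])
    return found


def rev_list(args, parents):
    """Commits reachable from the plain args but not from the ^-prefixed ones."""
    interesting = [a for a in args if not a.startswith("^")]
    uninteresting = [a[1:] for a in args if a.startswith("^")]
    return reachable(interesting, parents) - reachable(uninteresting, parents)


# Toy history: branches a and b both build on release x.
PARENTS = {"root": [], "x": ["root"], "a": ["x"], "b": ["x"]}
print(sorted(rev_list(["a", "b", "^x"], PARENTS)))
print(sorted(rev_list(["^x", "b", "a"], PARENTS)))
```

Both calls give the same result, which reflects the message's point that argument order does not matter.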
Linus Torvalds
3b42a63cb5 git-rev-list: split out commit limiting from main() too.
Ok, now I'm happier.
2005-06-02 09:25:44 -07:00
Linus Torvalds
81f2bb1f54 git-rev-list: factor out the commit printing from "main()"
Functions that do many things are bad. We should basically
just parse the arguments in main(). We're not quite there
yet, but it's a step in the right direction.
2005-06-02 09:19:53 -07:00