Commit Graph

19247 Commits

Author SHA1 Message Date
Junio C Hamano
79778e4696 git-show-branch: Fix off-by-one error.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-23 01:19:48 -07:00
Linus Torvalds
1b9e059d35 git-rev-list: add "--dense" flag
This is what the recent git-rev-list changes have all been gearing up for.

When we use a path filter to git-rev-list, the new "--dense" flag asks
git-rev-list to compress the history so that it _only_ contains commits
that change files in the path filter.  It also rewrites the parent
information so that tools like "gitk" will see the result as a dense
history tree.

For example, on the current kernel archive:

	[torvalds@g5 linux]$ git-rev-list HEAD | wc -l
	9904
	[torvalds@g5 linux]$ git-rev-list HEAD -- kernel | wc -l
	5442
	[torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel | wc -l
	356

which shows that while we have almost ten thousand commits, we can prune
down the work to slightly more than half by only following the merges
that are interesting. But further, we can then compress the history to
just 356 entries that actually make changes to the kernel subdirectory.

To see this in action, try something like

	gitk --dense -- gitk

to see just the history that affects gitk.  Or, to show that true
parallel development still remains parallel, do

	gitk --dense -- daemon.c

which shows some parallel commits in the current git tree.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-22 22:49:52 -07:00
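
The "--dense" compression described above can be sketched on a toy history. This is not the git-rev-list code: the dict-based commit model, the exact-match path lookup, and the name dense_history() are all assumptions made for illustration. The sketch keeps only commits whose filtered paths differ from every parent and rewrites each kept commit's parents to the nearest kept ancestors.

```python
def dense_history(head, commits, paths):
    """Return {commit: rewritten_parents} for commits that change `paths`.

    `commits` maps an id to {"parents": [...], "tree": {path: content}}.
    This is a toy model, not git's data structures.
    """

    def state(c):
        # Content of the filtered paths in this commit.
        return tuple(commits[c]["tree"].get(p) for p in paths)

    def followed_parents(c):
        # If some parent is identical on the filtered paths, follow only it.
        parents = commits[c]["parents"]
        same = [p for p in parents if state(p) == state(c)]
        return same[:1] if same else list(parents)

    def interesting(c):
        # A commit is kept when it differs from all of its parents.
        parents = commits[c]["parents"]
        if not parents:
            return any(v is not None for v in state(c))
        return all(state(p) != state(c) for p in parents)

    def nearest_interesting(c):
        # Skip over commits that do not touch the filtered paths.
        while not interesting(c):
            nxt = followed_parents(c)
            if not nxt:
                return None
            c = nxt[0]
        return c

    result = {}
    start = nearest_interesting(head)
    stack = [start] if start is not None else []
    while stack:
        c = stack.pop()
        if c in result:
            continue
        new_parents = []
        for p in followed_parents(c):
            q = nearest_interesting(p)
            if q is not None and q not in new_parents:
                new_parents.append(q)
        result[c] = new_parents
        stack.extend(new_parents)
    return result


# Hypothetical four-commit linear history; only A and C touch "kernel".
EXAMPLE = {
    "A": {"parents": [], "tree": {"kernel": 1, "doc": 1}},
    "B": {"parents": ["A"], "tree": {"kernel": 1, "doc": 2}},
    "C": {"parents": ["B"], "tree": {"kernel": 2, "doc": 2}},
    "D": {"parents": ["C"], "tree": {"kernel": 2, "doc": 3}},
}
```

With this data, dense_history("D", EXAMPLE, ["kernel"]) keeps C and A and records A as the rewritten parent of C, which is the kind of result a viewer such as gitk would draw as a dense tree.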
Linus Torvalds
cf4845441c Teach git-rev-list to follow just a specified set of files
This is the first cut at a git-rev-list that knows to ignore commits that
don't change a certain file (or set of files).

NOTE! For now it only prunes _merge_ commits, and follows the parent where
there are no differences in the set of files specified. In the long run,
I'd like to make it re-write the straight-line history too, but for now
the merge simplification is much more fundamentally important (the
rewriting of straight-line history is largely a separate simplification
phase, but the merge simplification needs to happen early if we want to
optimize away unnecessary commit parsing).

If all parents of a merge change some of the files, the merge is left as
is, so the end result is in no way guaranteed to be a linear history, but
it will often be a lot /more/ linear than the full tree, since it prunes
out parents that didn't matter for that set of files.

As an example from the current kernel:

	[torvalds@g5 linux]$ git-rev-list HEAD | wc -l
	9885
	[torvalds@g5 linux]$ git-rev-list HEAD -- Makefile | wc -l
	4084
	[torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb | wc -l
	5206

and you can also use 'gitk' to more visually see the pruning of the
history tree, with something like

	gitk -- drivers/usb

showing a simplified history that tries to follow the first parent in a
merge that is the parent that fully defines drivers/usb/.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-22 22:49:52 -07:00
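
The merge pruning described above can be sketched as follows. This is a toy model under stated assumptions, not the actual git-rev-list code: commits are plain dicts, paths are matched by exact name rather than by prefix, and simplify_parents() and walk() are hypothetical names. A merge follows a single parent when that parent is identical on the given paths; otherwise all parents are kept.

```python
def path_state(tree, paths):
    """Content of the filtered paths in one commit's tree (toy model)."""
    return tuple(tree.get(p) for p in paths)


def simplify_parents(commit, commits, paths):
    """Parents to follow from `commit` when only `paths` matter."""
    parents = commits[commit]["parents"]
    if len(parents) < 2:
        return list(parents)  # only merges are pruned in this first cut
    mine = path_state(commits[commit]["tree"], paths)
    for parent in parents:
        if path_state(commits[parent]["tree"], paths) == mine:
            return [parent]  # this parent fully defines the paths
    return list(parents)  # every parent changed something: keep the merge


def walk(head, commits, paths):
    """List the commits visited from `head` after merge simplification."""
    seen, order, stack = set(), [], [head]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        stack.extend(reversed(simplify_parents(commit, commits, paths)))
    return order


# Hypothetical history: B changes drivers/usb, C changes Makefile, M merges.
EXAMPLE = {
    "A": {"parents": [], "tree": {"Makefile": 1, "drivers/usb": 1}},
    "B": {"parents": ["A"], "tree": {"Makefile": 1, "drivers/usb": 2}},
    "C": {"parents": ["A"], "tree": {"Makefile": 2, "drivers/usb": 1}},
    "M": {"parents": ["B", "C"], "tree": {"Makefile": 2, "drivers/usb": 2}},
}
```

Here walk("M", EXAMPLE, ["drivers/usb"]) visits M, B, A and never looks at C, while filtering on both paths keeps the merge intact because M differs from each parent.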
Linus Torvalds
ac1b3d1248 Split up tree diff functions into tree-diff.c library
This makes the tree diff functionality independent of the "git-diff-tree"
program, by splitting the core functionality up into a library file.

This will be needed for when we teach git-rev-list to only follow a
specified set of pathnames, rather than the global revision history.

Most of it is a fairly straightforward code move, but it also involves
some calling convention cleanup, and moving some of the static variables
from diff-tree.c into the options structure.

The actual tree change callback routines also become parameterized by the
diff_options structure, allowing the library functionality to do something
other than just show the diff on stdout.

Right now the only user of this functionality remains git-diff-tree
itself.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-22 22:49:51 -07:00
Junio C Hamano
4f692b1978 Allow git-merge not to commit.
Martin Langhoff wants to use git-merge from outside git-pull and wants
to do further processing; for this, he wants git-merge not to commit
even when it cleanly merges.  I think other script writers would want
something like that as well, so here it is.

Instead of the "merge commit message" parameter (which usually is made
for you by "git-pull" which calls this command), you pass an empty
string to it.  Then it will not update your HEAD -- you can do whatever
you want with the resulting index file, which contains the merge results.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-22 04:45:15 -07:00
Junio C Hamano
6b32884a09 upload-pack: Increase MAX_HAS.
A later round will further improve fetch-pack so that it does not send
useless "have" lines, but in the meantime, increase MAX_HAS to help
upload-pack find more common commits, as discussed on the list.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-22 02:28:27 -07:00
Junio C Hamano
05625af32e Fix malformatted git-am documentation.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-21 20:57:34 -07:00
Nick Hengeveld
7b9ae53ea3 [PATCH 3/3] Allow running requests to finish after a pull error
Allow running requests to finish after a pull error

Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-21 19:20:18 -07:00
Nick Hengeveld
f7eb290fa0 [PATCH 2/3] Switched back to loading alternates as needed
Switched back to loading alternates as needed

Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-21 19:20:18 -07:00
Nick Hengeveld
f1a906a387 [PATCH 1/3] Clean up CURL handles in unused request slots
Clean up CURL handles in unused request slots

Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-21 19:20:17 -07:00
Junio C Hamano
4ae22d96fe Merge branch 'fixes' 2005-10-20 23:21:50 -07:00
Junio C Hamano
f6804930ca Merge branch 'fixes' 2005-10-20 23:19:47 -07:00
Junio C Hamano
a935c39727 daemon.c: remove trailing whitespace.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 23:19:36 -07:00
H. Peter Anvin
147a1ab035 Fix git-daemon argument-parsing bug
Fix stupid bug in parsing the --init-timeout option.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 22:56:34 -07:00
H. Peter Anvin
54e31a205c Fix git-daemon argument-parsing bug
Fix stupid bug in parsing the --init-timeout option.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 22:46:03 -07:00
Petr Baudis
2707da9c08 Update git-daemon's documentation wrt. new options
New options --timeout, --init-timeout, --export-all and whitelist support
were added to git-daemon, but no one bothered to also add the proper
documentation. This patch aims to fix that.

Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 22:32:08 -07:00
Junio C Hamano
baa720f501 Finish git-am documentation.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 22:32:07 -07:00
Petr Baudis
42e2cba204 Brief documentation for the mysterious git-am script
The git-am script is nowhere called and nowhere (including itself)
explained, and the name isn't helpful either. For those like me who will
wonder what it is about, add a documentation stub for it to the
documentation.

I probably got something wrong and I don't feel like investigating all the
options - this is just kind of "emergency" docs.

Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 22:32:07 -07:00
Linus Torvalds
a08b650594 git-rev-parse: pass on "--" flag when required
If rev-parse output includes both flags and files, we should pass on any
"--" marker we see, so that the end result can also tell the difference
between a flag and a filename that begins with '-'.

[jc: merged a later one-liner update from Linus]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 22:32:07 -07:00
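
Why the "--" marker has to survive can be shown with a small sketch. It is only an illustration of the ambiguity, not git-rev-parse itself; classify() and reassemble() are hypothetical helpers, and the rule "anything starting with '-' is a flag" is a simplification.

```python
def classify(argv):
    """Split arguments into flags, revisions and files (toy rules)."""
    flags, revs, files = [], [], []
    seen_dashdash = False
    for arg in argv:
        if seen_dashdash:
            files.append(arg)  # everything after "--" is a filename
        elif arg == "--":
            seen_dashdash = True
        elif arg.startswith("-"):
            flags.append(arg)
        else:
            revs.append(arg)
    return flags, revs, files


def reassemble(flags, revs, files):
    """Emit the arguments again, keeping "--" when flags and files coexist."""
    out = flags + revs
    if flags and files:
        out.append("--")
    return out + files
```

Feeding ["--max-count=1", "HEAD", "--", "-odd-name.c"] through classify() and reassemble() gives back the same list, so a later consumer still sees "-odd-name.c" as a file. Drop the marker and the same consumer would take it for a flag.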
Petr Baudis
adc3dbca1a Use sensible domain name (the DNS one) when guessing ident information
Currently, the code would use getdomainname() call, which however returns
something usually unset and not necessarily related at all to the DNS
domain name (it seems to be mostly some scary NIS/YP thing).

This patch changes the code to actually use the DNS domain name, which is
also what tends to be used in emails, and we aim at emails with our ident
code.

Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 22:32:07 -07:00
Johannes Schindelin
4eba0f3763 Make git-cherry-pick in target "all"
Since git-cherry-pick is simply a copy of git-revert, it can be created
before installing (so that it can be used without installing, too).

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 22:32:07 -07:00
Junio C Hamano
2c674191d5 Fix missing exports in git-am
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 22:31:56 -07:00
Jens Axboe
7872e05567 git-daemon poll() spinning out of control
With the '0' timeout given to poll, it returns instantly without any
events on my system, causing git-daemon to consume all the CPU time. Use
-1 as the timeout so poll() only returns on EINTR or when events are
actually available.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-20 21:26:31 -07:00
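
The difference between the two timeouts can be reproduced with Python's select.poll, which wraps the same system call on platforms that provide it (not Windows). This is a sketch, not the git-daemon code; wait_readable() and demo() are hypothetical names. A timeout of 0 returns at once even when nothing is ready, so a loop around it spins; a negative timeout blocks until an event arrives.

```python
import os
import select


def wait_readable(fd, timeout_ms):
    """Poll one descriptor for readability; returns a list of (fd, event)."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return poller.poll(timeout_ms)


def demo():
    """Show both behaviours on a pipe created just for this example."""
    read_end, write_end = os.pipe()
    try:
        # Timeout 0: returns immediately with no events (the busy-loop case).
        idle = wait_readable(read_end, 0)
        os.write(write_end, b"x")
        # Timeout -1: blocks until data is available; here it already is.
        ready = wait_readable(read_end, -1)
        return idle, ready
    finally:
        os.close(read_end)
        os.close(write_end)
```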
Junio C Hamano
bfadbeddd1 Merge /pub/scm/git/git to recover lost side branch
Sorry for the mistake of rewinding something already pushed out.
This recovers the side branch lost by that mistake, specifically
ea5a65a599 commit.

Signed-off-by: Junio C Hamano <junio@hera.kernel.org>
2005-10-20 17:06:15 -07:00
Junio C Hamano
6e1c6c103c Make sure we barf on ref^{type} failure.
Martin Langhoff noticed that ref^0 barfed correctly when we did not
have the commit in a broken repository, but ref^{commit} didn't.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 22:49:31 -07:00
Junio C Hamano
f1f0a2be9f Be more careful tangling object chains while marking commits.
Also Johannes noticed we use parse_object to look up if we know that
object already -- we should just ask the in-core object registry with
lookup_object() for that.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 21:55:49 -07:00
Junio C Hamano
d6a73596e7 git-fetch/push/pull: documentation.
The documentation was lazily sharing the argument description across these
commands.

Lazy may be a way of life, but that does not justify confusing others ;-).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 21:25:39 -07:00
Junio C Hamano
4dab94d52e Do not feed rev-list an invalid SHA1 expression.
The previous round to optimize fetch-pack has a small bug that
feeds SHA1^ ("parent commit") before making sure SHA1 is
actually a commit (or a tag that eventually dereferences to a
commit).  Also it did not help culling the known-to-be-common
parents if the common one was a merge.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 21:04:53 -07:00
Johannes Schindelin
0a8944dd48 [PATCH] Do not send "want" lines for complete objects
It was all good and well to check if all remote refs are complete (local
refs or descendants thereof), but we can just as easily use the same
information to avoid sending "want" lines just for the complete objects in
the case that not all remote refs are complete (or their names differ).

Also, git-fetch-pack does not have to ask for descendants of remote refs
which are complete (for now, git-rev-list is told to ignore only the first
parent). That change also eliminates a code path where a popen()ed handle
was not pclose()ed.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 16:14:34 -07:00
Junio C Hamano
d6a461e177 count-objects: squelch error from find on sparse object directory.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 15:01:50 -07:00
H. Peter Anvin
b7080d8516 git-daemon: timeout, eliminate double DWIM
It turns out that not only did git-daemon do DWIM, but git-upload-pack
does as well.  This is bad; security checks have to be performed *after*
canonicalization, not before.

Additionally, the current git-daemon can be trivially DoSed by spewing
SYNs at the target port.

This patch adds a --strict option to git-upload-pack to disable all
DWIM, a --timeout option to git-daemon and git-upload-pack, and an
--init-timeout option to git-daemon (which is typically set to a much
lower value, since the initial request should come immediately from the
client.)

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 14:44:43 -07:00
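
The point about ordering (canonicalize first, check second) can be illustrated with a generic sketch. It does not reproduce the daemon's DWIM logic: the allowed root is made up, and posixpath.normpath only normalizes lexically, so it does not resolve symlinks the way a real server would need to.

```python
import posixpath

# Hypothetical whitelist, for illustration only.
ALLOWED_ROOTS = ["/pub/scm"]


def naive_is_allowed(path, roots=ALLOWED_ROOTS):
    """Wrong order: checks the raw request before canonicalization."""
    return any(path.startswith(root + "/") for root in roots)


def is_allowed(path, roots=ALLOWED_ROOTS):
    """Right order: canonicalize first, then run the security check."""
    canon = posixpath.normpath(path)
    return any(canon == root or canon.startswith(root + "/") for root in roots)
```

A request for "/pub/scm/../../etc/passwd" passes the naive check because the raw string starts with an allowed root, but is rejected once the path has been normalized to "/etc/passwd".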
Junio C Hamano
76e712f1b3 git-clone: always keep pack sent from remote (documentation).
This adjusts the documentation.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 14:43:43 -07:00
Junio C Hamano
e1c7ada6dd git-clone: always keep pack sent from remote.
This deprecates --keep and -q flags and always keeps the pack
sent from the remote site.  Corresponding configuration
variables are also removed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 14:43:43 -07:00
Junio C Hamano
49bb805e69 Do not ask for objects known to be complete.
On top of the optimization by Linus not to ask for refs that already match, we
can walk our refs and not issue "want" for things that are known to be
reachable from them.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 14:27:02 -07:00
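
The idea can be sketched as a reachability walk over local history. This is an illustration under assumptions, not the fetch-pack code: parents_of is a hypothetical dict holding only commits present locally, and the function names are made up.

```python
def reachable_from(tips, parents_of):
    """All locally known commits reachable from the given tips."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen or commit not in parents_of:
            continue  # already visited, or not present locally
        seen.add(commit)
        stack.extend(parents_of[commit])
    return seen


def wants(remote_refs, local_refs, parents_of):
    """Object names worth a "want" line: those not already complete."""
    complete = reachable_from(local_refs.values(), parents_of)
    result = []
    for sha in remote_refs.values():
        if sha not in complete and sha not in result:
            result.append(sha)
    return result


# Hypothetical data: local history a <- b <- c, with master at c.
PARENTS = {"a": [], "b": ["a"], "c": ["b"]}
LOCAL = {"master": "c"}
REMOTE = {"master": "d", "maint": "b", "topic": "d"}
```

With this data only "d" needs a "want": "b" is an ancestor of the local master, so it is known to be complete even though no local ref carries the name "maint".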
Nick Hengeveld
e0004e286c Support for HTTP transfer timeouts based on transfer speed
Add configuration settings to abort HTTP requests if the transfer rate
drops below a threshold for a specified length of time.  Environment
variables override config file settings.

Signed-off-by: Nick Hengeveld <nickh@reactrix.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 14:27:01 -07:00
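
A sketch of the two pieces described above: settings lookup with the environment taking precedence, and the abort decision. The commit message does not name the settings; the keys and variable names used here (http.lowspeedlimit, http.lowspeedtime, GIT_HTTP_LOW_SPEED_LIMIT, GIT_HTTP_LOW_SPEED_TIME) are assumptions based on git's later documentation, and the one-sample-per-second rate list is a simplification.

```python
import os


def low_speed_settings(config, environ=None):
    """Return (limit_bytes_per_sec, seconds); environment beats config."""
    environ = os.environ if environ is None else environ

    def pick(env_name, config_key):
        raw = environ.get(env_name, config.get(config_key))
        return int(raw) if raw is not None else None

    return (
        pick("GIT_HTTP_LOW_SPEED_LIMIT", "http.lowspeedlimit"),
        pick("GIT_HTTP_LOW_SPEED_TIME", "http.lowspeedtime"),
    )


def should_abort(rates, limit, seconds):
    """True once the rate stays below `limit` for `seconds` samples in a row.

    `rates` holds one transfer-rate sample per second (an assumption).
    """
    if not limit or not seconds:
        return False  # feature disabled when either setting is unset or zero
    slow_run = 0
    for rate in rates:
        slow_run = slow_run + 1 if rate < limit else 0
        if slow_run >= seconds:
            return True
    return False
```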
H. Peter Anvin
960deccb26 git-daemon: timeout, eliminate double DWIM
It turns out that not only did git-daemon do DWIM, but git-upload-pack
does as well.  This is bad; security checks have to be performed *after*
canonicalization, not before.

Additionally, the current git-daemon can be trivially DoSed by spewing
SYNs at the target port.

This patch adds a --strict option to git-upload-pack to disable all
DWIM, a --timeout option to git-daemon and git-upload-pack, and an
--init-timeout option to git-daemon (which is typically set to a much
lower value, since the initial request should come immediately from the
client.)

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 14:27:01 -07:00
Junio C Hamano
c9ed27b9e8 GIT 0.99.8f
Yes I said 0.99.8e was the last maintenance release for 0.99.8, but it
turns out that there was another backport necessary after git-daemon
was unleashed on kernel.org servers.

Contains the following since 0.99.8e:

H. Peter Anvin:
      revised^2: git-daemon extra paranoia, and path DWIM

Johannes Schindelin:
      Fix cvsimport warning when called without --no-cvs-direct

Junio C Hamano:
      Do not ask for objects known to be complete.

Linus Torvalds:
      git-fetch-pack: avoid unnecessary zero packing
      Optimize common case of git-rev-list

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 02:31:27 -07:00
Johannes Schindelin
750a09a7de Fix cvsimport warning when called without --no-cvs-direct
Perl was warning that $opt_p was undefined in that case.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 00:01:02 -07:00
Junio C Hamano
acfcb8dfa4 Do not ask for objects known to be complete.
On top of the optimization by Linus not to ask for refs that already match, we
can walk our refs and not issue "want" for things that are known to be
reachable from them.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 00:01:01 -07:00
Linus Torvalds
0910e8cab8 Optimize common case of git-rev-list
I took a look at webgit, and it looks like at least for the "projects"
page, the most common operation ends up being basically

	git-rev-list --header --parents --max-count=1 HEAD

Now, the thing is, the way "git-rev-list" works, it always keeps on
popping the parents and parsing them in order to build the list of
parents, and it turns out that even though we just want a single commit,
git-rev-list will invariably look up _three_ generations of commits.

It will parse:
 - the commit we want (it obviously needs this)
 - its parent(s) as part of the "pop_most_recent_commit()" logic
 - it will then pop one of the parents before it notices that it doesn't
   need any more
 - and as part of popping the parent, it will parse the grandparent (again
   due to "pop_most_recent_commit()").

Now, I've strace'd it, and it really is pretty efficient on the whole, but
if things aren't nicely cached, and with long-latency IO, doing those two
extra objects (at a minimum - if the parent is a merge it will be more) is
just wasted time, and potentially a lot of it.

So here's a quick special-case for the trivial case of "just one commit,
and no date-limits or other special rules".

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 00:01:01 -07:00
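
The three-generation effect described above can be reproduced in a toy model. This is not the rev-list source: the dict-based commits, the integer dates, and the walker functions are assumptions chosen so that popping a commit parses its parents, as the message describes for pop_most_recent_commit().

```python
def pop_most_recent(pending, commits, parsed):
    """Remove the newest pending commit and parse (queue) its parents."""
    commit = max(pending, key=lambda c: commits[c]["date"])
    pending.remove(commit)
    for parent in commits[commit]["parents"]:
        if parent not in parsed:
            parsed.add(parent)  # stands in for reading the object from disk
            pending.append(parent)
    return commit


def walk_naive(head, commits, max_count):
    """Generic walker: notices the limit only after popping one commit more."""
    parsed = {head}
    pending = [head]
    shown = []
    while pending:
        commit = pop_most_recent(pending, commits, parsed)
        if len(shown) >= max_count:
            break
        shown.append(commit)
    return shown, parsed


def walk_special_cased(head, commits, max_count):
    """Same result, with a shortcut for the trivial single-commit case."""
    if max_count == 1:
        return [head], {head}
    return walk_naive(head, commits, max_count)


# Hypothetical linear history A <- B <- C <- D.
EXAMPLE = {
    "A": {"date": 1, "parents": []},
    "B": {"date": 2, "parents": ["A"]},
    "C": {"date": 3, "parents": ["B"]},
    "D": {"date": 4, "parents": ["C"]},
}
```

Asking the naive walker for one commit from D shows D but parses D, C and B; the special-cased walker shows the same commit and parses only D.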
H. Peter Anvin
e51fd86ab3 revised^2: git-daemon extra paranoia, and path DWIM
This patch adds some extra paranoia to the git-daemon filename test.  In
particular, it now rejects pathnames containing //; it also adds a
redundant test for pathname absoluteness (belts and suspenders.)

A single / at the end of the path is still permitted, however, and the
.git and /.git append DWIM stuff is now handled in an integrated manner,
which means the resulting path will always be subjected to pathname checks.

[jc: backported to 0.99.8 maintenance branch]

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-19 00:01:01 -07:00
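
The checks named above (absolute path, no "//", a single trailing "/" tolerated, DWIM candidates subjected to the same checks) can be sketched as follows. The candidate order and the injected exists() callback are assumptions for illustration; this is not the daemon's code.

```python
def path_ok(path):
    """Pathname checks: must be absolute and must not contain '//'."""
    if not path.startswith("/"):
        return False
    if "//" in path:
        return False
    return True


def dwim_candidates(path):
    """Paths to try for a request, after dropping one trailing '/'."""
    base = path[:-1] if len(path) > 1 and path.endswith("/") else path
    # The order here is an assumption, not taken from the source.
    return [base, base + ".git", base + "/.git"]


def resolve(path, exists):
    """Return the first acceptable existing candidate, or None.

    `exists` is a caller-supplied predicate (hypothetical), so the sketch
    does not touch the filesystem.
    """
    if not path_ok(path):
        return None
    for candidate in dwim_candidates(path):
        # Every DWIM result goes through the same pathname checks.
        if path_ok(candidate) and exists(candidate):
            return candidate
    return None
```

For example, with a repository set containing only "/pub/scm/git/git.git", both "/pub/scm/git/git" and "/pub/scm/git/git/" resolve to it, while "/pub//scm/git/git" and a relative path are refused before any lookup.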
Linus Torvalds
844ac7f818 git-fetch-pack: avoid unnecessary zero packing
If everything is up-to-date locally, we don't need to even ask for a
pack-file from the remote, or try to unpack it.

This is especially important for tags - since the pack-file common commit
logic is based purely on the commit history, it will never be able to find
a common tag, and will thus always end up re-fetching them.

Especially notably, if the tag points to a non-commit (eg a tagged tree),
the pack-file would be unnecessarily big, just because it cannot find any
most recent common point between commits for pruning.

Short-circuiting the case where we already have that reference means that
we avoid a lot of these in the common case.

NOTE! This only matches remote ref names against the same local name,
which works well for tags, but is not as generic as it could be. If we
ever need to, we could match against _any_ local ref (if we have it, we
have it), but this "match against same name" is simpler and more
efficient, and covers the common case.

Renaming of refs is common for branch heads, but since those are always
commits, the pack-file generation can optimize that case.

In some cases we might still end up fetching pack-files unnecessarily, but
this at least avoids the re-fetching of tags over and over if you use a
regular

	git fetch --tags ...

which was the main reason behind the change.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 23:52:08 -07:00
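
The same-name short-circuit can be sketched in a few lines. The dicts and the name refs_to_fetch() are hypothetical; the real code works on the ref list exchanged with the remote.

```python
def refs_to_fetch(remote_refs, local_refs):
    """Remote refs whose same-named local ref is missing or different."""
    return {
        name: sha
        for name, sha in remote_refs.items()
        if local_refs.get(name) != sha
    }


def need_pack(remote_refs, local_refs):
    """False when everything is up to date, so no pack has to be requested."""
    return bool(refs_to_fetch(remote_refs, local_refs))
```

Because the comparison is by name only, a tag that already exists locally under the same name is skipped, while a branch head stored under a different local name is still requested, as the NOTE above points out.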
Junio C Hamano
ea5a65a599 Do not ask for objects known to be complete.
On top of the optimization by Linus not to ask for refs that already match, we
can walk our refs and not issue "want" for things that are known to be
reachable from them.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 18:42:19 -07:00
Junio C Hamano
f8765797a4 Even when overwriting tags, report if they are changed or not.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 18:42:14 -07:00
Linus Torvalds
fe5f51ce27 Optimize common case of git-rev-list
I took a look at webgit, and it looks like at least for the "projects"
page, the most common operation ends up being basically

	git-rev-list --header --parents --max-count=1 HEAD

Now, the thing is, the way "git-rev-list" works, it always keeps on
popping the parents and parsing them in order to build the list of
parents, and it turns out that even though we just want a single commit,
git-rev-list will invariably look up _three_ generations of commits.

It will parse:
 - the commit we want (it obviously needs this)
 - its parent(s) as part of the "pop_most_recent_commit()" logic
 - it will then pop one of the parents before it notices that it doesn't
   need any more
 - and as part of popping the parent, it will parse the grandparent (again
   due to "pop_most_recent_commit()").

Now, I've strace'd it, and it really is pretty efficient on the whole, but
if things aren't nicely cached, and with long-latency IO, doing those two
extra objects (at a minimum - if the parent is a merge it will be more) is
just wasted time, and potentially a lot of it.

So here's a quick special-case for the trivial case of "just one commit,
and no date-limits or other special rules".

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 18:41:28 -07:00
H. Peter Anvin
3e04c62daa revised^2: git-daemon extra paranoia, and path DWIM
This patch adds some extra paranoia to the git-daemon filename test.  In
particular, it now rejects pathnames containing //; it also adds a
redundant test for pathname absoluteness (belts and suspenders.)

A single / at the end of the path is still permitted, however, and the
.git and /.git append DWIM stuff is now handled in an integrated manner,
which means the resulting path will always be subjected to pathname checks.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 18:26:52 -07:00
Kay Sievers
5b6dcc3fde v248 2005-10-19 03:24:27 +02:00
Kay Sievers
11044297b2 add Expires: +1d header to commit and commitdiff pages
Signed-off-by: Kay Sievers <kay.sievers@suse.de>
2005-10-19 03:18:45 +02:00
Junio C Hamano
5e5f8091e5 Remove unused include.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 15:41:43 -07:00
Linus Torvalds
2759cbc774 git-fetch-pack: avoid unnecessary zero packing
If everything is up-to-date locally, we don't need to even ask for a
pack-file from the remote, or try to unpack it.

This is especially important for tags - since the pack-file common commit
logic is based purely on the commit history, it will never be able to find
a common tag, and will thus always end up re-fetching them.

Especially notably, if the tag points to a non-commit (eg a tagged tree),
the pack-file would be unnecessarily big, just because it cannot find any
most recent common point between commits for pruning.

Short-circuiting the case where we already have that reference means that
we avoid a lot of these in the common case.

NOTE! This only matches remote ref names against the same local name,
which works well for tags, but is not as generic as it could be. If we
ever need to, we could match against _any_ local ref (if we have it, we
have it), but this "match against same name" is simpler and more
efficient, and covers the common case.

Renaming of refs is common for branch heads, but since those are always
commits, the pack-file generation can optimize that case.

In some cases we might still end up fetching pack-files unnecessarily, but
this at least avoids the re-fetching of tags over and over if you use a
regular

	git fetch --tags ...

which was the main reason behind the change.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 11:35:17 -07:00