Commit Graph

3338 Commits

Author SHA1 Message Date
Junio C Hamano
1d6b38cc76 pack-objects: use full pathname to help hashing with "thin" pack.
This applies the same hashing algorithm to the "preferred base
tree" objects and the incoming pathnames, to group the same
files from different revs together, while spreading apart files
with the same basename in different directories.
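
As a rough illustration of the idea (not the function this commit adds), a
path-aware hash can keep the basename hash in the upper bits and the
directory hash in the lower bits.  The names hash_bytes and path_hash, the
multiplier 31 and the 16/16 bit split are all assumptions made for this
sketch.

```c
#include <string.h>

/* Hypothetical helper: simple multiplicative hash over n bytes. */
static unsigned hash_bytes(const char *s, size_t n)
{
	unsigned h = 0;

	while (n--)
		h = h * 31 + (unsigned char)*s++;
	return h;
}

/*
 * Illustrative only: basename hash in the upper 16 bits, directory
 * hash in the lower 16 bits.  The same path always gets the same
 * value; equal basenames in different directories sort next to each
 * other but do not collide.
 */
unsigned path_hash(const char *path)
{
	const char *slash = strrchr(path, '/');
	const char *base = slash ? slash + 1 : path;
	size_t dirlen = slash ? (size_t)(slash - path) : 0;
	unsigned name_hash = hash_bytes(base, strlen(base)) & 0xffff;
	unsigned dir_hash = hash_bytes(path, dirlen) & 0xffff;

	return (name_hash << 16) | dir_hash;
}
```

With this layout, sorting by hash value keeps every revision of a/Makefile
together and places b/Makefile close by rather than on top of it.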

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-22 23:07:20 -08:00
Junio C Hamano
b925410d10 pack-objects: thin pack micro-optimization.
Since we sort objects by type, hash, preferredness and then
size, after we have a delta against preferred base, there is no
point trying a delta with non-preferred base.  This seems to
save expensive calls to diff-delta and it also seems to save the
output space as well.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-22 21:45:45 -08:00
Junio C Hamano
b19696c2e7 Use thin pack transfer in "git fetch".
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-20 00:38:39 -08:00
Junio C Hamano
a79a276360 Add git-push --thin.
Maybe we would want to make this default before it graduates to
the master branch, but in the meantime to help testing things,
this allows you to say "git push --thin destination".

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-20 00:09:41 -08:00
Junio C Hamano
2245be3e7a send-pack --thin: use "thin pack" delta transfer.
The new flag loosens the usual "self containedness" requirement
of packfiles, and sends deltified representation of objects when
we know the other side has the base objects needed to unpack
them.  This helps reduce the transfer size.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-19 22:28:04 -08:00
Junio C Hamano
7a979d99ba Thin pack - create packfile with missing delta base.
This goes together with "rev-list --object-edge" change, to feed
pack-objects a list of edge commits in addition to the usual
object list.  Upon seeing such a list, pack-objects loosens the
usual "self contained delta" constraints, and can produce delta
against blobs and trees contained in the edge commits without
storing the delta base objects themselves.

The resulting packfile is not usable in .git/objects/pack, but
is a good way to implement "delta-only" transfer.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-19 22:27:39 -08:00
Junio C Hamano
e4c9327a77 pack-objects: avoid delta chains that are too long.
This tries to rework the solution for the excess delta chain
problem. An earlier commit worked around it ``cheaply'', but
repeated repacking risks unbounded growth of delta chains.

This version counts the length of the delta chain we are reusing
from the existing pack, and makes sure a base object that has a
sufficiently long delta chain does not get deltified.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 21:48:48 -08:00
Junio C Hamano
cec2be76d9 git-repack: allow passing a couple of flags to pack-objects.
A new flag -q makes underlying pack-objects less chatty.
A new flag -f forces delta to be recomputed from scratch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 02:11:38 -08:00
Junio C Hamano
ca5381d43e pack-objects: finishing touches.
This introduces --no-reuse-delta option to disable reusing of
existing delta, which is a large part of the optimization
introduced by this series.  This may become necessary if
repeated repacking makes delta chain too long.  With this, the
output of the command becomes identical to that of the older
implementation.  But the performance suffers greatly.

It still allows reusing non-deltified representations; there is
no point uncompressing and recompressing the whole text.

It also adds a couple more statistics to the output, while
squelching them under the -q flag, which the last round forgot to do.

  $ time old-git-pack-objects --stdout >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects....................
  real    12m8.530s       user    11m1.450s       sys     0m57.920s
  $ time git-pack-objects --stdout >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects.....................
  Total 184141, written 184141 (delta 138297), reused 178833 (delta 134081)
  real    0m59.549s       user    0m56.670s       sys     0m2.400s
  $ time git-pack-objects --stdout --no-reuse-delta >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects.....................
  Total 184141, written 184141 (delta 134833), reused 47904 (delta 0)
  real    11m13.830s      user    9m45.240s       sys     0m44.330s

There is one remaining issue when --no-reuse-delta option is not
used.  It can create delta chains that are deeper than specified.

    A<--B<--C<--D   E   F   G

Suppose we have a delta chain A to D (A is stored in full either
in a pack or as a loose object. B is depth1 delta relative to A,
C is depth2 delta relative to B...) with loose objects E, F, G.
And we are going to pack all of them.

B, C and D are left as delta against A, B and C respectively.
So A, E, F, and G are examined for deltification, and let's say
we decided to keep E expanded, and store the rest as deltas like
this:

    E<--F<--G<--A

Oops.  We ended up making D a bit too deep, didn't we?  B, C and
D form a chain on top of A!

This is because we did not know what the final depth of A would
be, when we checked objects and decided to keep the existing
delta.  Unfortunately, deferring the decision until just before
the deltification is not an option.  To be able to make B, C,
and D candidates for deltification with the rest, we need to
know the type and final unexpanded size of them, but the major
part of the optimization comes from the fact that we do not read
the delta data to do so -- getting the final size is quite an
expensive operation.

To prevent this from happening, we should keep A from being
deltified.  But how would we tell that, cheaply?

To do this most precisely, after check_object() runs, each
object that is used as the base object of some existing delta
needs to be marked with the maximum depth of the objects we
decided to keep deltified (in this case, D is depth 3 relative
to A, so if no other delta chain that is longer than 3 based on
A exists, mark A with 3).  Then when attempting to deltify A, we
would take that number into account to see if the final delta
chain that leads to D becomes too deep.

However, this is a bit cumbersome to compute, so we would cheat
and reduce the maximum depth for A arbitrarily to depth/4 in
this implementation.
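
The two approaches can be sketched side by side.  This is an illustration
under assumptions, not the code in this commit: struct obj, its fields and
the three function names are hypothetical, and chain_above doubles as the
"is used as a base" flag for the cheap variant.

```c
#include <stddef.h>

/* Hypothetical object record; field names are illustrative. */
struct obj {
	struct obj *delta_base;  /* base of the reused delta, NULL if stored in full */
	unsigned chain_above;    /* longest reused chain stacked on this object */
};

/*
 * Precise marking: walk from every reused delta up to its root and
 * record, at each base passed, the longest chain that sits on it.
 */
void mark_chain_depth(struct obj **objs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		unsigned depth = 0;

		for (struct obj *o = objs[i]; o->delta_base; o = o->delta_base) {
			depth++;
			if (o->delta_base->chain_above < depth)
				o->delta_base->chain_above = depth;
		}
	}
}

/* Precise budget: depth still available when deltifying this object. */
unsigned depth_budget(const struct obj *o, unsigned max_depth)
{
	return o->chain_above >= max_depth ? 0 : max_depth - o->chain_above;
}

/* Cheap budget: any object used as a base gets a quarter of the depth. */
unsigned cheap_budget(const struct obj *o, unsigned max_depth)
{
	return o->chain_above ? max_depth / 4 : max_depth;
}
```

For the A<--B<--C<--D chain above and a maximum depth of 10, the precise
budget for A is 7 and the cheap one is 2, while D keeps the full 10 either
way.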

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 02:11:38 -08:00
Junio C Hamano
a49dd05fd0 pack-objects: reuse data from existing packs.
When generating a new pack, notice if the objects we need are
already in existing packs.  If an object is stored deltified,
and its base object is also what we are going to pack, then
reuse the existing deltified representation unconditionally,
bypassing all the expensive find_deltas() and try_deltas()
calls.

Also, notice if what we are going to write out exactly matches
what is already in an existing pack (either deltified or just
compressed).  In such a case, we can just copy it instead of
going through the usual uncompressing & recompressing cycle.
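
The decision described in the two paragraphs above might look roughly like
the following.  This is only a sketch: struct entry, enum reuse and
reuse_kind are made-up names, and the real code has to inspect the pack
data itself to learn these facts.

```c
#include <stdbool.h>
#include <stddef.h>

enum reuse { REUSE_NONE, REUSE_DELTA, REUSE_WHOLE };

/* Hypothetical per-object record; fields are assumptions for this sketch. */
struct entry {
	bool in_pack;                    /* already stored in an existing pack */
	const struct entry *delta_base;  /* base in that pack, NULL if not a delta */
	bool to_be_packed;               /* part of the pack being generated */
};

enum reuse reuse_kind(const struct entry *e)
{
	if (!e->in_pack)
		return REUSE_NONE;   /* loose object: compress as usual */
	if (!e->delta_base)
		return REUSE_WHOLE;  /* copy the compressed bytes as they are */
	if (e->delta_base->to_be_packed)
		return REUSE_DELTA;  /* base travels with us: keep the delta */
	return REUSE_NONE;           /* base would be missing: redo the work */
}
```

Only the REUSE_NONE case has to go through the expensive delta search or
the uncompress and recompress cycle.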

Without this patch, in linux-2.6 repository with about 1500
loose objects and a single mega pack:

    $ git-rev-list --objects v2.6.16-rc3 >RL
    $ wc -l RL
    184141 RL
    $ time git-pack-objects p <RL
    Generating pack...
    Done counting 184141 objects.
    Packing 184141 objects....................
    a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2

    real    12m4.323s
    user    11m2.560s
    sys     0m55.950s

With this patch, the same input:

    $ time ../git.junio/git-pack-objects q <RL
    Generating pack...
    Done counting 184141 objects.
    Packing 184141 objects.....................
    a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2
    Total 184141, written 184141, reused 182441

    real    1m2.608s
    user    0m55.090s
    sys     0m1.830s

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 02:11:38 -08:00
Aneesh Kumar
8cb711c8a5 Add contrib/gitview from Aneesh.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 02:10:31 -08:00
Eric Wong
defc649229 git-svn: ensure fetch always works chronologically.
We run svn log against a URL without a working copy for the first fetch,
so we end up with a log that's sorted from highest to lowest.  That's bad, we
always want lowest to highest.  Just default to --revision 0:HEAD now if
-r isn't specified for the first fetch.

Also sort the revisions after we get them just in case somebody
accidentally reverses the argument to --revision for whatever reason.

Thanks again to Emmanuel Guerin for helping me find this.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 01:01:24 -08:00
Eric Wong
1c6bbbf37b git-svn: fix revision order when XML::Simple is not loaded
Thanks to Emmanuel Guerin for finding the bug.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 01:01:20 -08:00
Eric Wong
3397f9df53 Introducing contrib/git-svn.
2006-02-16 01:56:43 -08:00
Fernando J. Pereda
b6e56eca8a Allow building Git in systems without iconv
Systems using some uClibc versions do not properly support
iconv stuff. This patch allows Git to be built on those
systems by passing NO_ICONV=YesPlease to make. The only
drawback is mailinfo won't do charset conversion in those
systems.

Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-16 01:42:58 -08:00
Junio C Hamano
be97bd1b88 Merge branch 'jc/add'
* jc/add:
  Detect misspelled pathspec to git-add
2006-02-15 19:42:15 -08:00
Junio C Hamano
5f906b1c34 Merge fixes up to 1.2.1
2006-02-15 19:39:21 -08:00
Josef Weidendorfer
babfaf8dee More useful/hinting error messages in git-checkout
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 19:14:04 -08:00
Fernando J. Pereda
6c5c62f340 Print an error if cloning a http repo and NO_CURL is set
If Git is compiled with NO_CURL=YesPlease and one tries to
clone a http repository, git-clone tries to call the curl
binary. This trivial patch prints an error instead in such a
situation.

Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 19:14:01 -08:00
Junio C Hamano
f8f135c9ba packed objects: minor cleanup
The delta depth is unsigned.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 13:03:27 -08:00
Junio C Hamano
45e48120bb Detect misspelled pathspec to git-add
This is in the same spirit as an earlier patch for git-commit.
It does an extra ls-files to avoid complaining when a fully
tracked directory name is given on the command line (otherwise
--others restriction would say the pathspec does not match).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 01:56:55 -08:00
Junio C Hamano
6becd7da87 ls-files --error-unmatch pathspec error reporting fix.
Earlier patch mistakenly used prefix_len when it meant
prefix_offset.  The latter is to strip the leading directories
when run from a subdirectory.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 01:10:13 -08:00
Junio C Hamano
e8a1a11d4e Merge branch 'kh/svn'
* kh/svn:
  git-svnimport: -r adds svn revision number to commit messages
2006-02-14 17:51:50 -08:00
Junio C Hamano
756e3ee0c6 Merge branch 'jc/commit'
* jc/commit:
  commit: detect misspelled pathspec while making a partial commit.
  combine-diff: diff-files fix (#2)
  combine-diff: diff-files fix.
2006-02-14 17:51:02 -08:00
Junio C Hamano
9b6c66e05c Merge branch 'jc/rebase'
* jc/rebase:
  rebase: allow a hook to refuse rebasing.
2006-02-14 17:49:00 -08:00
Junio C Hamano
709fb393ca Merge branch 'ra/email'
* ra/email:
  send-email: Add --cc
  send-email: Add some options for controlling how addresses are automatically added to the cc: list.
2006-02-14 17:46:41 -08:00
Junio C Hamano
504fe714fe checkout: fix dirty-file display.
When we refused to switch branches, we incorrectly showed
differences from the branch we would have switched to.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-14 16:05:57 -08:00
Junio C Hamano
bba319b5ce commit: detect misspelled pathspec while making a partial commit.
When you say "git commit Documentaiton" to make partial commit
for the files only in that directory, we did not detect that as
a misspelled pathname and attempted to commit index without
change.  If nothing matched, there is no harm done, but if the
index gets modified otherwise by having another valid pathspec
or after an explicit update-index, a user will not notice
without paying attention to the "git status" preview.

This introduces --error-unmatch option to ls-files, and uses it
to detect this common user error.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-14 14:48:22 -08:00
Karl Hasselström
0a48a344c6 git-svnimport: -r adds svn revision number to commit messages
New -r flag for prepending the corresponding Subversion revision
number to each commit message.

Signed-off-by: Karl Hasselström <kha@treskal.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-14 01:30:43 -08:00
Junio C Hamano
9ece7169a4 combine-diff: diff-files fix (#2)
The raw format "git-diff-files -c" to show unmerged state forgot
to initialize the status fields from parents, causing NUL
characters to be emitted.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-14 01:11:42 -08:00
Junio C Hamano
6a9b87972f Merge some proposed fixes
Conflicts:

	Documentation/git-commit.txt - taking the post 1.2.0 semantics.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-13 23:34:58 -08:00
Junio C Hamano
057f98eda1 Merge branch 'pb/bisect'
* pb/bisect:
  Properly git-bisect reset after bisecting from non-master head
2006-02-13 23:26:53 -08:00
Junio C Hamano
713a11fceb combine-diff: diff-files fix.
When showing a conflicted merge from index stages and working
tree file, we did not fetch the mode from the working tree,
and mistook that as a deleted file.  Also if the manual
resolution (or automated resolution by git rerere) ended up
taking either parent's version, we did not show _anything_ for
that path.  Either was quite bad and confusing.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-13 23:07:04 -08:00
Fredrik Kuivinen
3654638513 s/SHELL/SHELL_PATH/ in Makefile
With the current Makefile we don't use the shell chosen by the
platform specific defines when we invoke GIT-VERSION-GEN.

Signed-off-by: Fredrik Kuivinen <freku045@student.liu.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-13 22:13:22 -08:00
Junio C Hamano
4631c0035d bisect: remove BISECT_NAMES after done.
I noticed that we forgot to clean this file and kept it that
way, while trying to help with Andrew's bisect problem.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-13 21:55:27 -08:00
Junio C Hamano
41ac06c7a3 Documentation: git-ls-files asciidocco.
Noticed by Jon Nelson.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-13 21:52:10 -08:00
Ryan Anderson
da140f8bbf send-email: Add --cc
Since Junio used this in an example, and I've personally tried to use it, I
suppose the option should actually exist.

Signed-off-by: Ryan Anderson <ryan@michonline.com>
2006-02-13 03:32:10 -05:00
Junio C Hamano
64491e1ea9 Documentation: git-commit in 1.2.X series defaults to --include.
The documentation was mistakenly describing the --only semantics to
be default.  The 1.2.0 release and its maintenance series 1.2.X will
keep the traditional --include semantics as the default.  Clarify the
situation.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-13 00:32:10 -08:00
Ryan Anderson
a985d595ad send-email: Add some options for controlling how addresses are automatically added to the cc: list.
Signed-off-by: Ryan Anderson <ryan@michonline.com>
2006-02-13 03:32:01 -05:00
Junio C Hamano
9a111c91b0 rebase: allow a hook to refuse rebasing.
This lets a hook interfere with a rebase and help prevent certain
branches from being rebased by mistake.  A sample hook shows
how to prevent rebasing a topic branch that has already been
merged into the publish branch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-13 00:17:33 -08:00
Junio C Hamano
4170a19587 git-commit: Now --only semantics is the default.
This changes the "git commit paths..." to default to --only
semantics from traditional --include semantics, as agreed on the
list.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 23:55:07 -08:00
Junio C Hamano
bd9ca0baff GIT 1.2.0
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 13:14:53 -08:00
Junio C Hamano
4bbdfab766 Fix "test: unexpected operator" on bsd
This fixes the same issue that a previous fix by Alex Riesen addressed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 13:13:33 -08:00
Petr Baudis
810255fd12 Properly git-bisect reset after bisecting from non-master head
git-bisect reset without an argument would return to master even
if the bisecting started at a non-master branch. This patch makes
it save the original branch name to .git/head-name and restore it
afterwards.

This is also compatible with Cogito and cg-seek, so cg-status will
show that we are seeked on the bisect branch and cg-reset will
properly restore the original branch.

git-bisect start will refuse to work if it is not on a bisect but
.git/head-name exists; this is to protect against conflicts with
other seeking tools.

Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 13:07:02 -08:00
Junio C Hamano
c5e09c1fbe git-commit: show dirtiness including index.
Earlier, when we switched a branch we used diff-files to show
paths that are dirty in the working tree.  But we allow switching
branches with updated index ("read-tree -m -u $old $new" works that
way), and only showing paths that have differences in the working
tree but not paths that are different in index was confusing.

This shows both as modified from the top commit of the branch we
just have switched to.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 13:05:53 -08:00
Junio C Hamano
024701f1d8 Make pack-objects chattier.
You could give -q to squelch it, but currently no tool does it.
This would make 'git clone host:repo here' over ssh not silent
again.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 13:01:54 -08:00
Alex Riesen
0dbc4e89bb avoid echo -e, there are systems where it does not work
FreeBSD 4.11 being one example: the built-in echo doesn't have -e,
and the installed /bin/echo does not do "-e" either.
"printf" works, lacking just "\e" and "\xAB".

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 11:36:19 -08:00
Alex Riesen
ef1af9d9af fix "test: 2: unexpected operator" on bsd
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 11:36:17 -08:00
Linus Torvalds
d7ee090d0d Fix object re-hashing
The hashed object lookup had a subtle bug in re-hashing: it did

	for (i = 0; i < count; i++)
		if (objs[i]) {
			.. rehash ..

where "count" was the old hash count. On the face of it, this looks obvious,
since it clearly re-hashes all the old objects.

However, it's wrong.

If the last old hash entry before re-hashing was in use (or became in use
by the re-hashing), then the re-hashing could have inserted an object
into the hash entries with idx >= count due to overflow. When we then
rehash the last old entry, that old entry might become empty, which means
that the overflow entries should be re-hashed again.

In other words, the loop has to be fixed to traverse the whole
array, rather than just the old count.

(There's room for a slight optimization: instead of counting all the way
up, we can break when we see the first empty slot that is above the old
"count". At that point we know we don't have any collisions that we might
have to fix up any more. This patch only does the trivial fix.)
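
The hazard above is specific to re-hashing in place.  A different way to
sidestep it, shown here as a sketch and not as what this patch does, is to
reinsert every entry into a freshly allocated array.  The table layout, the
function names and the use of nonzero unsigned keys (0 marks an empty slot)
are all assumptions made for the example.

```c
#include <stdlib.h>

/* Hypothetical open-addressing table of nonzero keys; 0 marks an empty slot. */
struct table {
	unsigned *slots;
	size_t size;  /* number of slots, 0 before the first insert */
	size_t used;  /* number of occupied slots */
};

/* Linear probing: return the slot holding key, or the first empty slot. */
static size_t probe(const unsigned *slots, size_t size, unsigned key)
{
	size_t i = key % size;

	while (slots[i] && slots[i] != key)
		i = (i + 1) % size;
	return i;
}

/* Grow by reinserting every old entry into a zeroed array; 0 on success. */
static int grow(struct table *t)
{
	size_t new_size = t->size < 16 ? 32 : t->size * 2;
	unsigned *fresh = calloc(new_size, sizeof(*fresh));

	if (!fresh)
		return -1;
	for (size_t i = 0; i < t->size; i++)
		if (t->slots[i])
			fresh[probe(fresh, new_size, t->slots[i])] = t->slots[i];
	free(t->slots);
	t->slots = fresh;
	t->size = new_size;
	return 0;
}

/* Insert a nonzero key, keeping the table less than half full. */
int table_insert(struct table *t, unsigned key)
{
	if (t->used * 2 >= t->size && grow(t) < 0)
		return -1;
	size_t i = probe(t->slots, t->size, key);
	if (!t->slots[i]) {
		t->slots[i] = key;
		t->used++;
	}
	return 0;
}

/* Look up a nonzero key. */
int table_contains(const struct table *t, unsigned key)
{
	return t->size && t->slots[probe(t->slots, t->size, key)] == key;
}
```

Because the new array starts out empty, no entry can be stranded behind a
slot that is vacated later, at the cost of holding both arrays in memory
while growing.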

[jc: with trivial fix on trivial fix]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 11:24:50 -08:00
Junio C Hamano
2b796360ac hashtable-based objects: minimum fixups.
Calling hashtable_index from find_object before objs is created
would result in division by zero failure.  Avoid it.

Also the given object name may not be aligned suitably for
unsigned int; avoid dereferencing a cast pointer.
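
Both fixes can be illustrated in a few lines.  This is a sketch under
assumptions, not the patched function: slot_for is a made-up name, and
returning -1 for an unallocated table is just one way to signal "nothing
to look up".

```c
#include <string.h>

/*
 * Hypothetical slot computation for a table of `size` slots keyed by
 * an object name.  Returns -1 while the table has no slots yet, so
 * the modulo below never divides by zero.
 */
int slot_for(const unsigned char *sha1, unsigned size)
{
	unsigned int key;

	if (!size)
		return -1;
	/*
	 * memcpy() instead of *(unsigned int *)sha1: the name may start
	 * at an address that is not aligned for unsigned int.
	 */
	memcpy(&key, sha1, sizeof(key));
	return (int)(key % size);
}
```

A caller would treat a negative return as "not found" without touching
the table.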

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-12 05:12:39 -08:00