Commit Graph

3435 Commits

Author SHA1 Message Date
Junio C Hamano
e4c9327a77 pack-objects: avoid delta chains that are too long.
This tries to rework the solution for the excess delta chain
problem. An earlier commit worked around it ``cheaply'', but
repeated repacking risks unbounded growth of delta chains.

This version counts the length of the delta chain we are reusing
from the existing pack, and makes sure a base object that has
a sufficiently long delta chain does not get deltified.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 21:48:48 -08:00
Junio C Hamano
9b1320a99e Merge branch 'js/portable'
* js/portable:
  Support Irix
  Optionally support old diffs
  Fix cpio call
  SubmittingPatches: note on whitespaces
  Add a README for gitview
  Add contrib/README.
  git-tag: -l to list tags (usability).
2006-02-17 17:34:51 -08:00
Junio C Hamano
0f4aa3993d Merge branch 'fix'
* fix:
  Document --short and --git-dir in git-rev-parse(1)
  git-rev-parse: Fix --short= option parsing
  Prevent git-upload-pack segfault if object cannot be found
  Abstract test_create_repo out for use in tests.
  Trap exit to clean up created directory if clone fails.
2006-02-17 17:34:31 -08:00
Jonas Fonseca
735d80b3bf Document --short and --git-dir in git-rev-parse(1)
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2006-02-17 17:33:12 -08:00
Jonas Fonseca
44de0da4f9 git-rev-parse: Fix --short= option parsing
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
2006-02-17 17:33:11 -08:00
Johannes Schindelin
289c4b36e3 Support Irix
Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 16:32:43 -08:00
Johannes Schindelin
5b5d4d9e1b Optionally support old diffs
Some versions of diff do not correctly detect a missing new-line at the end
of the file under certain circumstances.

When defining NO_ACCURATE_DIFF, work around this bug.

Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 16:32:41 -08:00
Johannes Schindelin
8e1618f961 Fix cpio call
To some cpio's, -a and -m options are mutually exclusive. Use only -m.

Signed-off-by: Johannes E. Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 16:30:57 -08:00
Carl Worth
b5b16990f8 Prevent git-upload-pack segfault if object cannot be found
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 16:20:51 -08:00
Carl Worth
eedf8f97e5 Abstract test_create_repo out for use in tests.
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 16:16:53 -08:00
Carl Worth
41ff7a1076 Trap exit to clean up created directory if clone fails.
Signed-off-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 16:16:49 -08:00
Junio C Hamano
45d2b286ac SubmittingPatches: note on whitespaces
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 16:15:26 -08:00
Aneesh Kumar K.V
020e3c1ee6 Add a README for gitview
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 13:34:13 -08:00
Junio C Hamano
0c0fab2da4 Add contrib/README.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 13:33:14 -08:00
Junio C Hamano
b867c7c23a git-tag: -l to list tags (usability).
git-tag -l lists all tags, and git-tag -l <pattern> filters the
result with <pattern>.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 04:04:39 -08:00
Junio C Hamano
07e8ab9be9 Merge branch 'jc/pack-reuse'
* jc/pack-reuse:
  git-repack: allow passing a couple of flags to pack-objects.
  pack-objects: finishing touches.
  pack-objects: reuse data from existing packs.
  Add contrib/gitview from Aneesh.
  git-svn: ensure fetch always works chronologically.
  git-svn: fix revision order when XML::Simple is not loaded
  Introducing contrib/git-svn.
  Allow building Git in systems without iconv
2006-02-17 02:12:19 -08:00
Junio C Hamano
cec2be76d9 git-repack: allow passing a couple of flags to pack-objects.
A new flag -q makes underlying pack-objects less chatty.
A new flag -f forces delta to be recomputed from scratch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 02:11:38 -08:00
Junio C Hamano
ca5381d43e pack-objects: finishing touches.
This introduces --no-reuse-delta option to disable reusing of
existing delta, which is a large part of the optimization
introduced by this series.  This may become necessary if
repeated repacking makes delta chain too long.  With this, the
output of the command becomes identical to that of the older
implementation.  But the performance suffers greatly.

It still allows reusing non-deltified representations; there is
no point uncompressing and recompressing the whole text.

It also adds a couple more statistics output, while squelching
it under -q flag, which the last round forgot to do.

  $ time old-git-pack-objects --stdout >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects....................
  real    12m8.530s       user    11m1.450s       sys     0m57.920s
  $ time git-pack-objects --stdout >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects.....................
  Total 184141, written 184141 (delta 138297), reused 178833 (delta 134081)
  real    0m59.549s       user    0m56.670s       sys     0m2.400s
  $ time git-pack-objects --stdout --no-reuse-delta >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects.....................
  Total 184141, written 184141 (delta 134833), reused 47904 (delta 0)
  real    11m13.830s      user    9m45.240s       sys     0m44.330s

There is one remaining issue when --no-reuse-delta option is not
used.  It can create delta chains that are deeper than specified.

    A<--B<--C<--D   E   F   G

Suppose we have a delta chain A to D (A is stored in full either
in a pack or as a loose object. B is depth1 delta relative to A,
C is depth2 delta relative to B...) with loose objects E, F, G.
And we are going to pack all of them.

B, C and D are left as delta against A, B and C respectively.
So A, E, F, and G are examined for deltification, and let's say
we decided to keep E expanded, and store the rest as deltas like
this:

    E<--F<--G<--A

Oops.  We ended up making D a bit too deep, didn't we?  B, C and
D form a chain on top of A!
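
The depth arithmetic in the example above can be checked with a small sketch. The `base_of` mapping and the `delta_depth` helper are hypothetical names used only for illustration; they are not part of pack-objects.

```python
# Hypothetical model: each object maps to its delta base,
# or to None when it is stored in full.
def delta_depth(obj, base_of):
    """Count the delta hops from obj back to an object stored in full."""
    depth = 0
    while base_of[obj] is not None:
        obj = base_of[obj]
        depth += 1
    return depth

# The reused chain A<--B<--C<--D stacked on the newly
# computed chain E<--F<--G<--A.
base_of = {
    "E": None, "F": "E", "G": "F", "A": "G",
    "B": "A", "C": "B", "D": "C",
}
depths = {name: delta_depth(name, base_of) for name in base_of}
```

Under this model A ends up three hops from E, and D sits three more hops on top of A, so a depth limit checked only while deltifying A says nothing about D.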

This is because we did not know what the final depth of A would
be, when we checked objects and decided to keep the existing
delta.  Unfortunately, deferring the decision until just before
the deltification is not an option.  To be able to make B, C,
and D candidates for deltification with the rest, we need to
know the type and final unexpanded size of them, but the major
part of the optimization comes from the fact that we do not read
the delta data to do so -- getting the final size is quite an
expensive operation.

To prevent this from happening, we should keep A from being
deltified.  But how would we tell that, cheaply?

To do this most precisely, after check_object() runs, each
object that is used as the base object of some existing delta
needs to be marked with the maximum depth of the objects we
decided to keep deltified (in this case, D is depth 3 relative
to A, so if no other delta chain that is longer than 3 based on
A exists, mark A with 3).  Then when attempting to deltify A, we
would take that number into account to see if the final delta
chain that leads to D becomes too deep.

However, this is a bit cumbersome to compute, so we would cheat
and reduce the maximum depth for A arbitrarily to depth/4 in
this implementation.
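
A minimal sketch of the two strategies described above, assuming the reused deltas are available as a hypothetical `reused_base_of` mapping (object to base). The function names and the integer division are assumptions made for illustration, not the code in pack-objects.

```python
def reused_chain_depths(reused_base_of):
    """For each base of a reused delta, the longest reused chain below it."""
    below = {}
    for obj in reused_base_of:
        hops, cur = 0, obj
        # Walk up the reused chain, recording how far below each base obj sits.
        while cur in reused_base_of:
            cur = reused_base_of[cur]
            hops += 1
            below[cur] = max(below.get(cur, 0), hops)
    return below

def allowed_depth_precise(obj, max_depth, below):
    """Depth budget left for obj after accounting for the chain reused below it."""
    return max_depth - below.get(obj, 0)

def allowed_depth_cheap(obj, max_depth, below):
    """The shortcut: any base of a reused delta is limited to depth/4."""
    return max_depth // 4 if obj in below else max_depth

reused = {"B": "A", "C": "B", "D": "C"}
below = reused_chain_depths(reused)
```

With a limit of 10, the precise variant would let A sit at depth 7 at most, while the shortcut caps it at 2; objects such as E that are not the base of any reused delta keep the full limit.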

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 02:11:38 -08:00
Junio C Hamano
a49dd05fd0 pack-objects: reuse data from existing packs.
When generating a new pack, notice if we have already needed
objects in existing packs.  If an object is stored deltified,
and its base object is also what we are going to pack, then
reuse the existing deltified representation unconditionally,
bypassing all the expensive find_deltas() and try_deltas()
calls.

Also, notice if what we are going to write out exactly matches
what is already in an existing pack (either deltified or just
compressed).  In such a case, we can just copy it instead of
going through the usual uncompressing & recompressing cycle.
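
The decision described in the two paragraphs above might be sketched as follows. The `stored` mapping, the `plan_object` function and its return labels are hypothetical; the real code works on pack entries rather than dictionaries.

```python
def plan_object(name, stored, to_pack):
    """Decide how one object could be written into the new pack.

    stored maps an object name to ("delta", base) or ("full", None) for
    objects found in an existing pack; loose objects are simply absent.
    to_pack is the set of object names going into the new pack.
    """
    entry = stored.get(name)
    if entry is None:
        return "compute"              # loose object: take the normal path
    kind, base = entry
    if kind == "delta":
        # Reuse the delta only when its base is also being packed.
        return "reuse-delta" if base in to_pack else "compute"
    # Stored in full: the compressed bytes can be copied as-is
    # if the object is not deltified in the new pack.
    return "copy-if-undeltified"

stored = {"A": ("full", None), "B": ("delta", "A"), "X": ("delta", "Y")}
to_pack = {"A", "B", "X", "E"}
plan = {name: plan_object(name, stored, to_pack) for name in sorted(to_pack)}
```

Here X falls back to the normal path because its base Y is not part of the new pack.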

Without this patch, in linux-2.6 repository with about 1500
loose objects and a single mega pack:

    $ git-rev-list --objects v2.6.16-rc3 >RL
    $ wc -l RL
    184141 RL
    $ time git-pack-objects p <RL
    Generating pack...
    Done counting 184141 objects.
    Packing 184141 objects....................
    a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2

    real    12m4.323s
    user    11m2.560s
    sys     0m55.950s

With this patch, the same input:

    $ time ../git.junio/git-pack-objects q <RL
    Generating pack...
    Done counting 184141 objects.
    Packing 184141 objects.....................
    a1fc7b3e537fcb9b3c46b7505df859f0a11e79d2
    Total 184141, written 184141, reused 182441

    real    1m2.608s
    user    0m55.090s
    sys     0m1.830s

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 02:11:38 -08:00
Aneesh Kumar
8cb711c8a5 Add contrib/gitview from Aneesh.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 02:10:31 -08:00
Eric Wong
defc649229 git-svn: ensure fetch always works chronologically.
We run svn log against a URL without a working copy for the first fetch,
so we end up with a log that's sorted from highest to lowest.  That's bad; we
always want lowest to highest.  Just default to --revision 0:HEAD now if
-r isn't specified for the first fetch.

Also sort the revisions after we get them just in case somebody
accidentally reverses the argument to --revision for whatever reason.

Thanks again to Emmanuel Guerin for helping me find this.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 01:01:24 -08:00
Eric Wong
1c6bbbf37b git-svn: fix revision order when XML::Simple is not loaded
Thanks to Emmanuel Guerin for finding the bug.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 01:01:20 -08:00
Junio C Hamano
9101625d9f Merge branch 'lt/merge-tree'
* lt/merge-tree:
  git-merge-tree: generalize the "traverse <n> trees in sync" functionality
  Handling large files with GIT
  Handling large files with GIT
2006-02-16 01:57:39 -08:00
Junio C Hamano
b3466cd8e2 Merge branch 'jc/topo'
* jc/topo:
  topo-order: make --date-order optional.
2006-02-16 01:57:33 -08:00
Eric Wong
3397f9df53 Introducing contrib/git-svn.
2006-02-16 01:56:43 -08:00
Fernando J. Pereda
b6e56eca8a Allow building Git in systems without iconv
Systems using some uClibc versions do not properly support
iconv stuff. This patch allows Git to be built on those
systems by passing NO_ICONV=YesPlease to make. The only
drawback is mailinfo won't do charset conversion in those
systems.

Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-16 01:42:58 -08:00
Linus Torvalds
164dcb97f0 git-merge-tree: generalize the "traverse <n> trees in sync" functionality
It's actually very useful for other things too. Notably, we could do the
combined diff a lot more efficiently with this.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 23:39:11 -08:00
Linus Torvalds
01df529722 Handling large files with GIT
On Tue, 14 Feb 2006, Linus Torvalds wrote:
>
> Here, btw, is the trivial diff to turn my previous "tree-resolve" into a
> "resolve tree relative to the current branch".

Gaah. It was trivial, and it happened to work fine for my test-case, but
when I started looking at not doing that extremely aggressive subdirectory
merging, that showed a few other issues...

So in case people want to try, here's a third patch. Oh, and it's against
my _original_ patch, not incremental to the middle one (ie both patches two
and three are against patch #1, it's not a nice series).

Now I'm really done, and won't be sending out any more patches today.
Sorry for the noise.

		Linus

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 23:35:40 -08:00
Linus Torvalds
492e0759bf Handling large files with GIT
On Tue, 14 Feb 2006, Junio C Hamano wrote:

> Linus Torvalds <torvalds@osdl.org> writes:
>
> > If somebody is interested in making the "lots of filename changes" case go
> > fast, I'd be more than happy to walk them through what they'd need to
> > change. I'm just not horribly motivated to do it myself. Hint, hint.
>
> In case anybody is wondering, I share the same feeling.  I
> cannot say I'd be "more than happy to" clean up potential
> breakages during the development of such changes, but if the
> change eventually would help certain use cases, I can be
> persuaded to help debugging such a mess ;-).

Actually, I got interested in seeing how hard this is, and wrote a simple
first cut at doing a tree-optimized merger.

Let me shout a bit first:

  THIS IS WORKING CODE, BUT BE CAREFUL: IT'S A TECHNOLOGY DEMONSTRATION
  RATHER THAN THE FINAL PRODUCT!

With that out of the way, let me describe what this does (and then describe
the missing parts).

This is basically a three-way merge that works entirely on the "tree"
level, rather than on the index. A lot of the _concepts_ are the same,
though, and if you're familiar with the results of an index merge, some of
the output will make more sense.

You give it three trees: the base tree (tree 0), and the two branches to
be merged (tree 1 and tree 2 respectively). It will then walk these three
trees, and resolve them as it goes along.

The interesting part is:
 - it can resolve whole sub-directories in one go, without actually even
   looking recursively at them. A whole subdirectory will resolve the same
   way as any individual files will (although that may need some
   modification, see later).
 - if it has a "content conflict", for subdirectories that means "try to
   do a recursive tree merge", while for non-subdirectories it's just a
   content conflict and we'll output the stage 1/2/3 information.
 - a successful merge will output a single stage 0 ("merged") entry,
   potentially for a whole subdirectory.
 - it outputs all the resolve information on stdout, so something like the
   recursive resolver can pretty easily parse it all.
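
The behaviour listed above can be sketched roughly as follows. The `store` layout, the `(kind, object_id)` entries and the `merge_trees` generator are assumptions made for illustration; this is not the git-merge-tree code. Entries are compared by id only, which is what lets a whole subdirectory resolve without being opened.

```python
def merge_trees(store, base, ours, theirs, prefix=""):
    """Walk three trees in sync, yielding (stage, path, entry) records.

    store maps a tree id to {name: (kind, object_id)}; a tree id of None
    stands for a tree that does not exist on that side.
    """
    trees = [store[t] if t is not None else {} for t in (base, ours, theirs)]
    for name in sorted(set().union(*trees)):
        b, o, t = (tree.get(name) for tree in trees)
        path = prefix + name
        if o == t:
            resolved = o              # both sides agree
        elif o == b:
            resolved = t              # only "theirs" changed
        elif t == b:
            resolved = o              # only "ours" changed
        else:
            if all(e is None or e[0] == "tree" for e in (b, o, t)):
                # Conflicting subdirectories: try a recursive tree merge.
                ids = [e[1] if e else None for e in (b, o, t)]
                yield from merge_trees(store, *ids, prefix=path + "/")
            else:
                # Content conflict: report stages 1/2/3.
                for stage, e in enumerate((b, o, t), start=1):
                    if e is not None:
                        yield (stage, path, e)
            continue
        if resolved is not None:      # None means the path was deleted
            yield (0, path, resolved)

store = {
    "t0": {"a.c": ("blob", "1"), "doc": ("tree", "d0"), "lib": ("tree", "l0")},
    "t1": {"a.c": ("blob", "2"), "doc": ("tree", "d0"), "lib": ("tree", "l1")},
    "t2": {"a.c": ("blob", "3"), "doc": ("tree", "d2"), "lib": ("tree", "l2")},
    "l0": {"x": ("blob", "10"), "y": ("blob", "20")},
    "l1": {"x": ("blob", "11"), "y": ("blob", "20")},
    "l2": {"x": ("blob", "10"), "y": ("blob", "21")},
}
result = list(merge_trees(store, "t0", "t1", "t2"))
```

Note that the trees `d0` and `d2` are deliberately absent from `store`: `doc` resolves to a single stage 0 entry without either one being read.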

Now, the caveats:
 - we probably need to be more careful about subdirectory resolves. The
   trivial case (both branches have the exact same subdirectory) is a
   trivial resolve, but the other cases ("branch1 matches base, branch2 is
   different" probably can't be silently just resolved to the "branch2"
   subdirectory state, since it might involve renames into - or out of -
   that subdirectory)
 - we do not track the current index file at all, so this does not do the
   "check that index matches branch1" logic that the three-way merge in
   git-read-tree does. The theory is that we'd do a full three-way merge
   (ignoring the index and working directory), and then to update the
   working tree, we'd do a two-way "git-read-tree branch1->result"
 - I didn't actually make it do all the trivial resolve cases that
   git-read-tree does. It's a technology demonstration.

Finally (a more serious caveat):
 - doing things through stdout may end up being so expensive that we'd
   need to do something else. In particular, it's likely that I should
   not actually output the "merge results", but instead output a "merge
   results as they _differ_ from branch1"

However, I think this patch is already interesting enough that people who
are interested in merging trees might want to look at it. Please keep in
mind that tech _demo_ part, and in particular, keep in mind the final
"serious caveat" part.

In many ways, the really _interesting_ part of a merge is not the result,
but how it _changes_ the branch we're merging into. That's particularly
important as it should hopefully also mean that the output size for any
reasonable case is minimal (and tracks what we actually need to do to the
current state to create the final result).

The code very much is organized so that doing the result as a "diff
against branch1" should be quite easy/possible. I was actually going to do
it, but I decided that it probably makes the output harder to read. I
dunno.

Anyway, let's think about this kind of approach.. Note how the code itself
is actually quite small and short, although it's probably pretty "dense".

As an interesting test-case, I'd suggest this merge in the kernel:

	git-merge-tree $(git-merge-base 4cbf876 7d2babc) 4cbf876 7d2babc

which resolves beautifully (there are no actual file-level conflicts), and
you can look at the output of that command to start thinking about what
it does.

The interesting part (perhaps) is that timing that command for me shows
that it takes all of 0.004 seconds.. (the git-merge-base thing takes
considerably more ;)

The point is, we _can_ do the actual merge part really really quickly.

		Linus

PS. Final note: when I say that it is "WORKING CODE", that is obviously by
my standards. IOW, I tested it once and it gave reasonable results - so it
must be perfect.

Whether it works for anybody else, or indeed for any other test-case, is
not my problem ;)

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 23:35:40 -08:00
Junio C Hamano
4c8725f16a topo-order: make --date-order optional.
This adds --date-order to rev-list; it is similar to topo order
in the sense that no parent comes before all of its children,
but otherwise things are still ordered in the commit timestamp
order.
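
One way to picture this ordering is a topological walk that always picks the newest commit whose children have all been shown. The sketch below assumes a hypothetical `commits` mapping of id to `(timestamp, parents)` and is not the rev-list implementation.

```python
import heapq

def date_order(commits):
    """Return commit ids newest-first, never a parent before all its children.

    commits maps an id to (timestamp, [parent ids]).
    """
    pending_children = {c: 0 for c in commits}
    for _, parents in commits.values():
        for p in parents:
            if p in pending_children:
                pending_children[p] += 1
    # Start from commits that have no children; the heap keeps newest on top.
    heap = [(-commits[c][0], c) for c, n in pending_children.items() if n == 0]
    heapq.heapify(heap)
    out = []
    while heap:
        _, c = heapq.heappop(heap)
        out.append(c)
        for p in commits[c][1]:
            if p in pending_children:
                pending_children[p] -= 1
                if pending_children[p] == 0:
                    heapq.heappush(heap, (-commits[p][0], p))
    return out

history = {
    "root": (100, []),
    "a1": (110, ["root"]), "a2": (140, ["a1"]),
    "b1": (120, ["root"]), "b2": (130, ["b1"]),
    "m": (150, ["a2", "b2"]),
}
order = date_order(history)
skewed = date_order({"p": (60, []), "c": (50, ["p"])})
```

The two side branches come out interleaved by timestamp, and in the `skewed` case the child is still shown before its parent even though the parent carries the later timestamp.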

The same flag is also added to show-branch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 22:12:06 -08:00
Junio C Hamano
bf0a25560b Merge master to get fixes up to 1.2.1
2006-02-15 19:45:03 -08:00
Junio C Hamano
be97bd1b88 Merge branch 'jc/add'
* jc/add:
  Detect misspelled pathspec to git-add
2006-02-15 19:42:15 -08:00
Junio C Hamano
5f906b1c34 Merge fixes up to 1.2.1
2006-02-15 19:39:21 -08:00
Josef Weidendorfer
babfaf8dee More useful/hinting error messages in git-checkout
Signed-off-by: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 19:14:04 -08:00
Fernando J. Pereda
6c5c62f340 Print an error if cloning a http repo and NO_CURL is set
If Git is compiled with NO_CURL=YesPlease and one tries to
clone a http repository, git-clone tries to call the curl
binary. This trivial patch prints an error instead in such
a situation.

Signed-off-by: Fernando J. Pereda <ferdy@gentoo.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 19:14:01 -08:00
Junio C Hamano
f8f135c9ba packed objects: minor cleanup
The delta depth is unsigned.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 13:03:27 -08:00
Junio C Hamano
abd54c2c39 Merge branch 'jc/add'
* jc/add:
  Detect misspelled pathspec to git-add
  ls-files --error-unmatch pathspec error reporting fix.
2006-02-15 01:58:26 -08:00
Junio C Hamano
45e48120bb Detect misspelled pathspec to git-add
This is in the same spirit as an earlier patch for git-commit.
It does an extra ls-files to avoid complaining when a fully
tracked directory name is given on the command line (otherwise
--others restriction would say the pathspec does not match).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 01:56:55 -08:00
Junio C Hamano
6becd7da87 ls-files --error-unmatch pathspec error reporting fix.
Earlier patch mistakenly used prefix_len when it meant
prefix_offset.  The latter is to strip the leading directories
when run from a subdirectory.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-15 01:10:13 -08:00
Junio C Hamano
cfba73c842 Merge branch 'jc/rebase-limit'
* jc/rebase-limit:
  rebase: allow rebasing onto different base.
2006-02-14 17:56:48 -08:00
Junio C Hamano
29cd1fa451 Merge branch 'fix'
* fix:
  checkout: fix dirty-file display.
2006-02-14 17:56:07 -08:00
Junio C Hamano
becb6a658c Merge branch 'master'
* master:
  Merge branch 'kh/svn'
  git-svnimport: -r adds svn revision number to commit messages
  Merge branch 'jc/commit'
  commit: detect misspelled pathspec while making a partial commit.
  combine-diff: diff-files fix (#2)
  combine-diff: diff-files fix.
  Merge branch 'jc/rebase'
  Merge branch 'ra/email'
2006-02-14 17:56:02 -08:00
Junio C Hamano
e8a1a11d4e Merge branch 'kh/svn'
* kh/svn:
  git-svnimport: -r adds svn revision number to commit messages
2006-02-14 17:51:50 -08:00
Junio C Hamano
756e3ee0c6 Merge branch 'jc/commit'
* jc/commit:
  commit: detect misspelled pathspec while making a partial commit.
  combine-diff: diff-files fix (#2)
  combine-diff: diff-files fix.
2006-02-14 17:51:02 -08:00
Junio C Hamano
9b6c66e05c Merge branch 'jc/rebase'
* jc/rebase:
  rebase: allow a hook to refuse rebasing.
2006-02-14 17:49:00 -08:00
Junio C Hamano
709fb393ca Merge branch 'ra/email'
* ra/email:
  send-email: Add --cc
  send-email: Add some options for controlling how addresses are automatically added to the cc: list.
2006-02-14 17:46:41 -08:00
Junio C Hamano
e646c9c8c0 rebase: allow rebasing onto different base.
This allows you to rewrite history a bit more flexibly, by
separating the other branch name and new branch point.  By
default, the new branch point is the same as the tip of the
other branch as before, but you can specify where you graft the
rebased branch onto.

When you have this ancestry graph:

          A---B---C topic
         /
    D---E---F---G master

	$ git rebase --onto master~1 master topic

would rewrite the history to look like this:

              A'--B'--C' topic
             /
    D---E---F---G master
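
The effect of the command on the ancestry graph can be modelled with a small sketch. It assumes single-parent history and uses hypothetical helpers (`ancestors`, `rebase_onto`); it only illustrates which commits get replayed and where, not how git-rebase is written.

```python
def ancestors(tip, parent_of):
    """List tip and all of its ancestors, newest first (linear history only)."""
    seen = []
    while tip is not None:
        seen.append(tip)
        tip = parent_of[tip]
    return seen

def rebase_onto(parent_of, onto, upstream, branch):
    """Replay the commits in upstream..branch on top of onto.

    Returns the parent map of the rewritten commits and the new branch tip.
    """
    skip = set(ancestors(upstream, parent_of))
    to_replay = [c for c in ancestors(branch, parent_of) if c not in skip]
    to_replay.reverse()               # oldest commit is replayed first
    rewritten, tip = {}, onto
    for c in to_replay:
        new = c + "'"                 # rewritten copy of c
        rewritten[new] = tip
        tip = new
    return rewritten, tip

parent_of = {"D": None, "E": "D", "F": "E", "G": "F",
             "A": "E", "B": "A", "C": "B"}
# master is G, so master~1 is F.
rewritten, new_tip = rebase_onto(parent_of, onto="F", upstream="G", branch="C")
```

The upstream argument only selects which commits to replay (A, B and C here); the graft point comes from `onto`.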

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-14 16:10:49 -08:00
Junio C Hamano
504fe714fe checkout: fix dirty-file display.
When we refused to switch branches, we incorrectly showed
differences from the branch we would have switched to.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-14 16:05:57 -08:00
Junio C Hamano
bba319b5ce commit: detect misspelled pathspec while making a partial commit.
When you say "git commit Documentaiton" to make a partial commit
for the files only in that directory, we did not detect that as
a misspelled pathname and attempted to commit index without
change.  If nothing matched, there is no harm done, but if the
index gets modified otherwise by having another valid pathspec
or after an explicit update-index, a user will not notice
without paying attention to the "git status" preview.

This introduces --error-unmatch option to ls-files, and uses it
to detect this common user error.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-14 14:48:22 -08:00
Karl Hasselström
0a48a344c6 git-svnimport: -r adds svn revision number to commit messages
New -r flag for prepending the corresponding Subversion revision
number to each commit message.

Signed-off-by: Karl Hasselström <kha@treskal.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-14 01:30:43 -08:00