Commit Graph

12 Commits

Author SHA1 Message Date
Junio C Hamano
ca5381d43e pack-objects: finishing touches.
This introduces --no-reuse-delta option to disable reusing of
existing delta, which is a large part of the optimization
introduced by this series.  This may become necessary if
repeated repacking makes delta chain too long.  With this, the
output of the command becomes identical to that of the older
implementation.  But the performance suffers greatly.

It still allows reusing non-deltified representations; there is
no point uncompressing and recompressing the whole text.

It also adds a couple more statistics output, while squelching
it under -q flag, which the last round forgot to do.

  $ time old-git-pack-objects --stdout >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects....................
  real    12m8.530s       user    11m1.450s       sys     0m57.920s
  $ time git-pack-objects --stdout >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects.....................
  Total 184141, written 184141 (delta 138297), reused 178833 (delta 134081)
  real    0m59.549s       user    0m56.670s       sys     0m2.400s
  $ time git-pack-objects --stdout --no-reuse-delta >/dev/null <RL
  Generating pack...
  Done counting 184141 objects.
  Packing 184141 objects.....................
  Total 184141, written 184141 (delta 134833), reused 47904 (delta 0)
  real    11m13.830s      user    9m45.240s       sys     0m44.330s

There is one remaining issue when --no-reuse-delta option is not
used.  It can create delta chains that are deeper than specified.

    A<--B<--C<--D   E   F   G

Suppose we have a delta chain A to D (A is stored in full either
in a pack or as a loose object. B is depth1 delta relative to A,
C is depth2 delta relative to B...) with loose objects E, F, G.
And we are going to pack all of them.

B, C and D are left as delta against A, B and C respectively.
So A, E, F, and G are examined for deltification, and let's say
we decided to keep E expanded, and store the rest as deltas like
this:

    E<--F<--G<--A

Oops.  We ended up making D a bit too deep, didn't we?  B, C and
D form a chain on top of A!

This is because we did not know what the final depth of A would
be, when we checked objects and decided to keep the existing
delta.  Unfortunately, deferring the decision until just before
the deltification is not an option.  To be able to make B, C,
and D candidates for deltification with the rest, we need to
know the type and final unexpanded size of them, but the major
part of the optimization comes from the fact that we do not read
the delta data to do so -- getting the final size is quite an
expensive operation.

To prevent this from happening, we should keep A from being
deltified.  But how would we tell that, cheaply?

To do this most precisely, after check_object() runs, each
object that is used as the base object of some existing delta
needs to be marked with the maximum depth of the objects we
decided to keep deltified (in this case, D is depth 3 relative
to A, so if no other delta chain that is longer than 3 based on
A exists, mark A with 3).  Then when attempting to deltify A, we
would take that number into account to see if the final delta
chain that leads to D becomes too deep.

However, this is a bit cumbersome to compute, so we would cheat
and reduce the maximum depth for A arbitrarily to depth/4 in
this implementation.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-17 02:11:38 -08:00
Junio C Hamano
89438677ab Documentation: spell.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-29 01:32:56 -08:00
Nikolai Weibull
63ae26f87a Document the --non-empty command-line option to git-pack-objects.
This provides (minimal) documentation for the --non-empty command-line
option to the pack-objects command.

Signed-off-by: Nikolai Weibull <nikolai@bitwi.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-08 15:50:13 -08:00
Junio C Hamano
12ea5bea58 Update git-pack-objects documentation.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-30 17:28:03 -08:00
Christian Meder
72e9340cfd Convert usage of GIT and Git into git
Convert usage of GIT and Git into git.

Signed-off-by: Christian Meder <chris@absolutegiganten.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-10 16:01:31 -07:00
Junio C Hamano
98438bd0e8 Remove the version tags from the manpages
Signed-off-by: Christian Meder <chris@absolutegiganten.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-10 14:49:52 -07:00
Jonas Fonseca
df8baa42fe [PATCH] Random documentation fixes
The fixes focuses on improving the HTML output. Most noteworthy:

 - Fix the Makefile to also make various *.html files depend on
   included files.

 - Consistently use 'NOTE: ...' instead of '[ ... ]' for additional
   info.

 - Fix ending '::' for description lists in OPTION section etc.

 - Fix paragraphs in description lists ending up as preformated text.

 - Always use listingblocks (preformatted text wrapped in lines with -----)
   for examples that span empty lines, so they are put in only one HTML
   block.

 - Use '1.' instead of '(1)' for numbered lists.

 - Fix linking to other GIT docs.

 - git-rev-list.txt: put option descriptions in an OPTION section.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-03 13:23:47 -07:00
Sergey Vlasov
a7154e916c [PATCH] Documentation: Update all files to use the new gitlink: macro
The replacement was performed automatically by these commands:

	perl -pi -e 's/link:(git.+)\.html\[\1\]/gitlink:$1\[1\]/g' \
		README Documentation/*.txt
	perl -pi -e 's/link:git\.html\[git\]/gitlink:git\[7\]/g' \
		README Documentation/*.txt

Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-20 15:07:52 -07:00
Junio C Hamano
215a7ad1ef Big tool rename.
As promised, this is the "big tool rename" patch.  The primary differences
since 0.99.6 are:

  (1) git-*-script are no more.  The commands installed do not
      have any such suffix so users do not have to remember if
      something is implemented as a shell script or not.

  (2) Many command names with 'cache' in them are renamed with
      'index' if that is what they mean.

There are backward compatibility symblic links so that you and
Porcelains can keep using the old names, but the backward
compatibility support  is expected to be removed in the near
future.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-07 17:45:20 -07:00
Junio C Hamano
e31bb3bb93 [PATCH] Add documentation for git repack and git-prune-packed.
[jc: the patch forgot to update the main git.txt documentation,
making all these new documentation practically no-op, so I added
a minimum attempt linking them from there.]

Signed-off-by: Ryan Anderson <ryan@michonline.com>
2005-08-15 15:48:47 -07:00
Johannes Schindelin
2c6e477195 [PATCH] Assorted documentation patches
[jc: Johannes spent time and effort to see how consistent our
use of terminilogy is, and as a byproduct made these corrections
not related to the terminology unification.  I really appreciate
it.]

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-05 23:07:00 -07:00
Junio C Hamano
5f40520f8c [PATCH] Documentation: packed GIT support commands.
This adds documentation for creating packed archives, inspecting,
validating them, and unpacking them.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 08:54:31 -07:00