Add config variables pack.compression and core.loosecompression ,
and switch --compression=level to pack-objects.
Loose objects will be compressed using core.loosecompression if set,
else core.compression if set, else Z_BEST_SPEED.
Packed objects will be compressed using --compression=level if seen,
else pack.compression if set, else core.compression if set,
else Z_DEFAULT_COMPRESSION. This is the "pack compression level".
Loose objects added to a pack undeltified will be recompressed
to the pack compression level if it is unequal to the current
loose compression level by the preceding rules, or if the loose
object was written while core.legacyheaders = true. Newly
deltified loose objects are always compressed to the current
pack compression level.
Previously packed objects added to a pack are recompressed
to the current pack compression level exactly when their
deltification status changes, since the previous pack data
cannot be reused.
In either case, the --no-reuse-object switch from the first
patch below will always force recompression to the current pack
compression level, instead of assuming the pack compression level
hasn't changed and pack data can be reused when possible.
This applies on top of the following patches from Nicolas Pitre:
[PATCH] allow for undeltified objects not to be reused
[PATCH] make "repack -f" imply "pack-objects --no-reuse-object"
Signed-off-by: Dana L. How <danahow@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Currently non deltified object data is always reused when possible.
This means that any change to core.compression has no effect on those
objects as they don't get recompressed when repacking them.
Let's add a --no-reuse-object flag to git-repack in order to force
recompression of all objects when desired.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds documentation for --progress and --all-progress, remove a
duplicate --progress handling and make usage string more readable.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Currently git-push displays progress status for the local packing of
objects to send, but nothing once it starts to push it over the
connection. Having progress status in that later case is especially
nice when pushing lots of objects over a slow network link.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Documentation for pack-objects seems to be out of date in this regard.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* np/pack:
add the capability for index-pack to read from a stream
index-pack: compare only the first 20-bytes of the key.
git-repack: repo.usedeltabaseoffset
pack-objects: document --delta-base-offset option
allow delta data reuse even if base object is a preferred base
zap a debug remnant
let the GIT native protocol use offsets to delta base when possible
make pack data reuse compatible with both delta types
make git-pack-objects able to create deltas with offset to base
teach git-index-pack about deltas with offset to base
teach git-unpack-objects about deltas with offset to base
introduce delta objects with offset to base
Currently, you actually have to read the source to find out the
default values. While at it, fix two typos and suggest that these
options actually take a parameter in git-pack-objects.txt.
Signed-off-by: Dennis Stosberg <dennis@stosberg.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This introduces --no-reuse-delta option to disable reusing of
existing delta, which is a large part of the optimization
introduced by this series. This may become necessary if
repeated repacking makes delta chain too long. With this, the
output of the command becomes identical to that of the older
implementation. But the performance suffers greatly.
It still allows reusing non-deltified representations; there is
no point uncompressing and recompressing the whole text.
It also adds a couple more statistics output, while squelching
it under -q flag, which the last round forgot to do.
$ time old-git-pack-objects --stdout >/dev/null <RL
Generating pack...
Done counting 184141 objects.
Packing 184141 objects....................
real 12m8.530s user 11m1.450s sys 0m57.920s
$ time git-pack-objects --stdout >/dev/null <RL
Generating pack...
Done counting 184141 objects.
Packing 184141 objects.....................
Total 184141, written 184141 (delta 138297), reused 178833 (delta 134081)
real 0m59.549s user 0m56.670s sys 0m2.400s
$ time git-pack-objects --stdout --no-reuse-delta >/dev/null <RL
Generating pack...
Done counting 184141 objects.
Packing 184141 objects.....................
Total 184141, written 184141 (delta 134833), reused 47904 (delta 0)
real 11m13.830s user 9m45.240s sys 0m44.330s
There is one remaining issue when --no-reuse-delta option is not
used. It can create delta chains that are deeper than specified.
A<--B<--C<--D E F G
Suppose we have a delta chain A to D (A is stored in full either
in a pack or as a loose object. B is depth1 delta relative to A,
C is depth2 delta relative to B...) with loose objects E, F, G.
And we are going to pack all of them.
B, C and D are left as delta against A, B and C respectively.
So A, E, F, and G are examined for deltification, and let's say
we decided to keep E expanded, and store the rest as deltas like
this:
E<--F<--G<--A
Oops. We ended up making D a bit too deep, didn't we? B, C and
D form a chain on top of A!
This is because we did not know what the final depth of A would
be, when we checked objects and decided to keep the existing
delta. Unfortunately, deferring the decision until just before
the deltification is not an option. To be able to make B, C,
and D candidates for deltification with the rest, we need to
know the type and final unexpanded size of them, but the major
part of the optimization comes from the fact that we do not read
the delta data to do so -- getting the final size is quite an
expensive operation.
To prevent this from happening, we should keep A from being
deltified. But how would we tell that, cheaply?
To do this most precisely, after check_object() runs, each
object that is used as the base object of some existing delta
needs to be marked with the maximum depth of the objects we
decided to keep deltified (in this case, D is depth 3 relative
to A, so if no other delta chain that is longer than 3 based on
A exists, mark A with 3). Then when attempting to deltify A, we
would take that number into account to see if the final delta
chain that leads to D becomes too deep.
However, this is a bit cumbersome to compute, so we would cheat
and reduce the maximum depth for A arbitrarily to depth/4 in
this implementation.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This provides (minimal) documentation for the --non-empty command-line
option to the pack-objects command.
Signed-off-by: Nikolai Weibull <nikolai@bitwi.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The fixes focuses on improving the HTML output. Most noteworthy:
- Fix the Makefile to also make various *.html files depend on
included files.
- Consistently use 'NOTE: ...' instead of '[ ... ]' for additional
info.
- Fix ending '::' for description lists in OPTION section etc.
- Fix paragraphs in description lists ending up as preformated text.
- Always use listingblocks (preformatted text wrapped in lines with -----)
for examples that span empty lines, so they are put in only one HTML
block.
- Use '1.' instead of '(1)' for numbered lists.
- Fix linking to other GIT docs.
- git-rev-list.txt: put option descriptions in an OPTION section.
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The replacement was performed automatically by these commands:
perl -pi -e 's/link:(git.+)\.html\[\1\]/gitlink:$1\[1\]/g' \
README Documentation/*.txt
perl -pi -e 's/link:git\.html\[git\]/gitlink:git\[7\]/g' \
README Documentation/*.txt
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Junio C Hamano <junkio@cox.net>
As promised, this is the "big tool rename" patch. The primary differences
since 0.99.6 are:
(1) git-*-script are no more. The commands installed do not
have any such suffix so users do not have to remember if
something is implemented as a shell script or not.
(2) Many command names with 'cache' in them are renamed with
'index' if that is what they mean.
There are backward compatibility symblic links so that you and
Porcelains can keep using the old names, but the backward
compatibility support is expected to be removed in the near
future.
Signed-off-by: Junio C Hamano <junkio@cox.net>
[jc: the patch forgot to update the main git.txt documentation,
making all these new documentation practically no-op, so I added
a minimum attempt linking them from there.]
Signed-off-by: Ryan Anderson <ryan@michonline.com>
[jc: Johannes spent time and effort to see how consistent our
use of terminilogy is, and as a byproduct made these corrections
not related to the terminology unification. I really appreciate
it.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds documentation for creating packed archives, inspecting,
validating them, and unpacking them.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>