* bc/repack:
Documentation/git-repack.txt: document new -A behaviour
let pack-objects do the writing of unreachable objects as loose objects
add a force_object_loose() function
builtin-gc.c: deprecate --prune, it now really has no effect
git-gc: always use -A when manually repacking
repack: modify behavior of -A option to leave unreferenced objects unpacked
Conflicts:
builtin-pack-objects.c
Commit ccc1297226 changed the behavior
of 'git repack -A' so unreachable objects are stored as loose objects.
However it did so in a naive and inefficient way by making packs
about to be deleted inaccessible and feeding their content through
'git unpack-objects'. While this works, there are major flaws with
this approach:
- It is unacceptably sloooooooooooooow.
In the Linux kernel repository with no actual unreachable objects,
doing 'git repack -A -d' before:
real 2m33.220s
user 2m21.675s
sys 0m3.510s
And with this change:
real 0m36.849s
user 0m24.365s
sys 0m1.950s
For reference, here's the timing for 'git repack -a -d':
real 0m35.816s
user 0m22.571s
sys 0m2.011s
This is explained by the fact that 'git unpack-objects' was used to
unpack _every_ object even if (almost) 100% of them were thrown away.
- There is a black out period.
Between the removal of the .idx file for the redundant pack and the
completion of its unpacking, the unreachable objects become completely
inaccessible. This is not a big issue as we're talking about unreachable
objects, but some consistency is always good.
- There is no way to easily set a sensible mtime for the newly created
unreachable loose objects.
So, while having a command called "pack-objects" to perform object
unpacking looks really odd, this is probably the best compromise to be
able to solve the above issues in an efficient way.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous behavior of the -A option was to retain any previously
packed objects which had become unreferenced, and place them into the newly
created pack file. Since git-gc, when run automatically with the --auto
option, calls repack with the -A option, this had the effect of retaining
unreferenced packed objects indefinitely. To avoid this scenario, the
user was required to run git-gc with the little known --prune option or
to manually run repack with the -a option.
This patch changes the behavior of the -A option so that unreferenced
objects that exist in any pack file being replaced will be unpacked into
the repository. The unreferenced loose objects can then be garbage collected
by git-gc (i.e. git-prune) based on the gc.pruneExpire setting.
Also add new tests for checking whether unreferenced objects which were
previously packed are properly left in the repository unpacked after
repacking.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 5715d0b (Migrate git-repack.sh to use git-rev-parse --parseopt,
2007-11-04), parsing of the '-n' command line option was accidentally lost
when git-repack.sh was migrated to use git-rev-parse --parseopt. This adds
it back.
Signed-off-by: A Large Angry SCM <gitzilla@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Discussion on the list tonight came to the conclusion that showing
the name of the packfile we just created during git-repack is not
a very useful message for any end-user. For the really technical
folk who need to have the name of the newest packfile they can use
something such as `ls -t .git/objects/pack | head -2` to find the
most recently created packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* jc/autogc:
git-gc --auto: run "repack -A -d -l" as necessary.
git-gc --auto: restructure the way "repack" command line is built.
git-gc --auto: protect ourselves from accumulated cruft
git-gc --auto: add documentation.
git-gc --auto: move threshold check to need_to_gc() function.
repack -A -d: use --keep-unreachable when repacking
pack-objects --keep-unreachable
Export matches_pack_name() and fix its return value
Invoke "git gc --auto" from commit, merge, am and rebase.
Implement git gc --auto
A lot of shell scripts contained stuff starting with
while case "$#" in 0) break ;; esac
and similar. I consider breaking out of the condition instead of the
body of the loop ugly, and the implied "true" value of the
non-matching case is not really obvious to humans at first glance. It
happens not to be obvious to some BSD shells, either, but that's
because they are not POSIX-compliant. In most cases, this has been
replaced by a straight condition using "test". "case" has the
advantage of being faster than "test" on vintage shells where "test"
is not a builtin. Since none of them is likely to run the git
scripts, anyway, the added readability should be worth the change.
A few loops have had their termination condition expressed
differently.
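
For illustration, the two loop styles can be put side by side. This is
only a sketch: parse_old and parse_new are hypothetical helpers that
collect their arguments, not code taken from any of the converted
scripts.

```shell
#!/bin/sh
# Hypothetical helpers, used only to contrast the two loop styles.

# Old style: the loop condition is a "case" that breaks out when no
# arguments remain; the non-matching branch implicitly yields "true".
parse_old() {
	seen=
	while case "$#" in 0) break ;; esac
	do
		seen="$seen<$1>"
		shift
	done
	echo "$seen"
}

# New style: the same loop with an explicit "test" condition.
parse_new() {
	seen=
	while test $# != 0
	do
		seen="$seen<$1>"
		shift
	done
	echo "$seen"
}
```

Both functions walk the argument list in the same way; only the form of
the termination condition differs.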
Signed-off-by: David Kastrup <dak@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The packfile portion of the "remove redundant" code
near the bottom of git-repack.sh is broken when
pack splitting occurs. Particularly since this is
the only place where we automatically delete packfiles,
make sure it works properly for all cases, old or new.
Signed-off-by: Dana L. How <danahow@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add --max-pack-size parsing and usage messages.
Upgrade git-repack.sh to handle multiple packfile names,
and build packfiles in GIT_OBJECT_DIRECTORY not GIT_DIR.
Update documentation.
Signed-off-by: Dana L. How <danahow@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Recomputing delta is much more expensive than recompressing
anyway, and when the user says 'repack -f', it is a sign that
the user is willing to spend CPU cycles.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Steven Grimm noticed that git-repack's verbosity is inconsistent
because pack-objects is chatty and prune-packed is not. This
makes the latter a bit more chatty and gives -q option to
squelch it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds a new option --reflog to pack-objects and revision
machinery; do not bother documenting it for now, since this is
only useful for local repacking.
When the option is passed, objects reachable from reflog entries
are marked as interesting while computing the set of objects to
pack.
Signed-off-by: Junio C Hamano <junkio@cox.net>
During `git repack -a -d` only repack objects which are loose or
which reside in an active (a non-kept) pack. This allows the user
to keep large packs as-is without continuous repacking and can be
very helpful on large repositories. It should also help us resolve
a race condition between `git repack -a -d` and the new pack store
functionality in `git-receive-pack`.
Kept packs are those which have a corresponding .keep file in
$GIT_OBJECT_DIRECTORY/pack. That is pack-X.pack will be kept
(not repacked and not deleted) if pack-X.keep exists in the same
directory when `git repack -a -d` starts.
Currently this feature is not documented and there is no user
interface to keep an existing pack.
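
A minimal sketch of the selection rule described above, assuming the
pack directory is passed in as an argument. list_active_packs is a
hypothetical helper written for this note, not a function in
git-repack.sh.

```shell
#!/bin/sh
# Sketch only: print the pack-X.pack names in a pack directory that do
# not have a matching pack-X.keep next to them.
list_active_packs() {
	packdir=$1
	for pack in "$packdir"/pack-*.pack
	do
		# An unmatched glob stays literal; skip it.
		test -f "$pack" || continue
		base=${pack%.pack}
		# A .keep file marks the pack as kept: leave it alone.
		test -f "$base.keep" && continue
		echo "${pack##*/}"
	done
}
```

Calling it with $GIT_OBJECT_DIRECTORY/pack would list the packs that
are candidates for repacking and deletion under this rule.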
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* np/pack:
add the capability for index-pack to read from a stream
index-pack: compare only the first 20-bytes of the key.
git-repack: repo.usedeltabaseoffset
pack-objects: document --delta-base-offset option
allow delta data reuse even if base object is a preferred base
zap a debug remnant
let the GIT native protocol use offsets to delta base when possible
make pack data reuse compatible with both delta types
make git-pack-objects able to create deltas with offset to base
teach git-index-pack about deltas with offset to base
teach git-unpack-objects about deltas with offset to base
introduce delta objects with offset to base
When configuration variable `repack.UseDeltaBaseOffset` is set
for the repository, the command passes `--delta-base-offset`
option to `git-pack-objects`; this typically results in slightly
smaller packs, but the generated packs are incompatible with
versions of git older than (and including) v1.4.3.
We will make it default to true sometime in the future, but not
for a while.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now that we explicitly create all tmpfiles below $GIT_DIR, there's no reason
to care about which directory we're in.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Avoid failing when cwd is !writable by writing the
packfiles in $GIT_DIR, which is more in line with other commands.
Without this, git-repack was failing when run from crontab
by non-root user accounts. For large repositories, this
also makes the mv operation a lot cheaper, and avoids leaving
temp packfiles around the fs upon failure.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes the following warning:
git-repack: line 42: cd: .git/objects/pack: No such file or directory
This happens only when git-repack -a is run without any packs in the
repository.
Signed-off-by: Matthias Kestenholz <matthias@spinlock.ch>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We are trying to catch error condition of git-rev-list and cause
the downstream pack-objects to barf, but if you run rev-list
with anything that mucks with its stderr (such as GIT_TRACE),
any stderr output would cause the pipeline to fail.
[jc: originally from Matthias Lederhofer, with a reworded error message.]
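
One way to get this behaviour is to leave stderr alone and have a
failing producer inject a line that the consumer is certain to reject.
The sketch below shows that technique with hypothetical stand-ins:
"producer" plays the role of git-rev-list and "consumer" the role of
pack-objects. The names and the message text are assumptions, not the
wording used in the script.

```shell
#!/bin/sh
# Sketch only: producer and consumer are hypothetical stand-ins.

# Writes one object name to stdout and some tracing noise to stderr.
# Fails when its first argument is "fail".
producer() {
	echo "trace: starting" >&2
	echo "0123456789abcdef0123456789abcdef01234567"
	test "$1" != fail
}

# Accepts only lines made of 40 hex digits; anything else is fatal.
consumer() {
	while read -r line
	do
		case "$line" in
		*[!0-9a-f]*) echo "bad input: $line" >&2; return 1 ;;
		esac
		test ${#line} -eq 40 || { echo "bad input: $line" >&2; return 1; }
	done
}

# stderr is not merged into the pipe; only a real producer failure
# adds a line that the consumer will refuse.
run_pipeline() {
	{ producer "$1" || echo "producer died with exit code $?"; } | consumer
}
```

With this arrangement, output on stderr no longer affects the result of
the pipeline, while a non-zero exit from the producer still does.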
Signed-off-by: Junio C Hamano <junkio@cox.net>
After a clone, packfiles are read-only by default and "mv" to
replace the pack with a new one goes interactive, asking if the
user wants to replace it. If one is successfully moved and the
other is not, the pack and its idx would become out-of-sync and
corrupt the repository.
Recovering is straightforward -- it is just a matter of
finding the remaining .tmp-pack-* and making sure they are both
moved -- but we should be extra careful not to do something so
alarming to the users.
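
A sketch of moving the pair without prompting, assuming the temporary
files are named .tmp-pack-<name>.pack and .tmp-pack-<name>.idx.
install_pack and its arguments are hypothetical and only illustrate the
idea; this is not the code in git-repack.sh.

```shell
#!/bin/sh
# Sketch only: move both halves of a freshly written pack into place.
# "mv -f" replaces a read-only destination without asking, and the
# function stops at the first failure so the caller can notice it.
install_pack() {
	name=$1
	tmpdir=$2
	packdir=$3
	for ext in pack idx
	do
		mv -f "$tmpdir/.tmp-pack-$name.$ext" \
			"$packdir/pack-$name.$ext" || return 1
	done
}
```

If the function fails after the first move, the leftover .tmp-pack-*
file is still in the temporary directory and can be moved by hand as
described above.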
Signed-off-by: Junio C Hamano <junkio@cox.net>
git-repack was passing the -q along to pack-objects but ignoring it
itself. Correct the oversight.
Signed-off-by: Martin Langhoff <martin@catalyst.net.nz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* fix:
repack: honor -d even when no new pack was created
clone: keep --reference even with -l -s
repo-config: document what value_regexp does a bit more clearly.
Release config lock if the regex is invalid
core-tutorial.txt: escape asterisk
If all objects are reachable via an alternate object store then we
still have to remove all obsolete local packs.
Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
.. but don't even bother documenting it. I don't think any normal person
is supposed to ever really care, but it simplifies testing when you want
to use the "git repack" wrapper rather than forcing you to use the core
programs (which already do support the window/depth arguments, of course).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Use git-rev-list's --all instead of git-rev-parse's to keep from
hitting the shell's argument list length limits when repacking
with lots of tags.
Signed-off-by: Junio C Hamano <junkio@cox.net>
A new flag -q makes underlying pack-objects less chatty.
A new flag -f forces delta to be recomputed from scratch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now all the users of this script detect its exit status and die,
complaining that it is outside git repository. So move the code
that dies from all callers to git-sh-setup script.
Signed-off-by: Junio C Hamano <junkio@cox.net>
In a corrupt repository, git-repack produces a pack that does not
contain needed objects without complaining, and the result of this
combined with -d flag can be very painful -- e.g. a lossage of one
tree object can lead to lossage of blobs reachable only through that
tree.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
With "-a", redundant pack removal is trivial, and otherwise
redundant pack removal is pointless; do not call
git-pack-redundant from this script.
Signed-off-by: Junio C Hamano <junkio@cox.net>
No point in running git-pack-redundant if we already know
which packs are redundant.
Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch changes git-pack-redundant so that packfiles
in alternate object directories also are considered when
deciding which objects are redundant.
This functionality is controlled by the flag '--alt-odb'.
Also convert the other flags to the long form, and update
docs and git-repack accordingly.
Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch renames git-pack-intersect to git-pack-redundant
as suggested by Petr Baudis. The new name reflects what the
program does, rather than how it does it.
Also fix a small argument parsing bug.
Signed-off-by: Lukas Sandström <lukass@etek.chalmers.se>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The git philosophy when it comes to disk accesses is "Laugh in the face of
danger".
Notably, since we never modify an existing object, we don't really care
that deeply about flushing things to disk, since even if the machine
crashes in the middle of a git operation, you can never really have lost
any old work. At most, you'd need to figure out the proper heads (which
git-fsck-objects can do for you) and re-do the operation.
However, there are two exceptions to this: pruning and repacking. Those
operations will actually _delete_ old objects that they know about in
other ways (ie that they just repacked, or that they have found in other
places).
However, since they actually modify old state, we should thus be a bit
more careful about them. If the machine crashes and the duplicate new
objects haven't been flushed to disk, you can actually be in trouble.
This is trivially stupid about it by calling "sync" before removing the
objects. Not very smart, but we're talking about special operations that
are usually done once a week if that.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This uses the new "--local" flag to git-pack-objects. It currently only
makes a difference together with "-a", since a normal incremental repack
won't pack any packed objects at all (whether local or remote).
Eventually, it might end up skipping any objects that aren't local to
the current object directory, but for now it only knows to skip packed
objects.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Using "git repack -a -d" can destroy your git archive if you use it
twice in succession, because the new pack can be called the same as
the old pack. Found by Linus.
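
The guard needed here can be sketched as follows: when removing old
packs, skip the one whose name matches the pack that was just written.
remove_old_packs is a hypothetical helper for illustration, and it
assumes packs are stored as pack-<name>.pack and pack-<name>.idx.

```shell
#!/bin/sh
# Sketch only: delete every pack in a directory except the one named
# by the second argument, so a repack that reproduces an existing name
# does not delete its own result.
remove_old_packs() {
	packdir=$1
	newname=$2
	for idx in "$packdir"/pack-*.idx
	do
		# An unmatched glob stays literal; skip it.
		test -f "$idx" || continue
		base=${idx%.idx}
		case "${base##*/}" in
		"pack-$newname")
			;;	# the pack just written; keep it
		*)
			rm -f "$base.pack" "$base.idx"
			;;
		esac
	done
}
```

Running it twice in a row with the same name leaves that pack in place
both times.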
Signed-off-by: Junio C Hamano <junkio@cox.net>
As promised, this is the "big tool rename" patch. The primary differences
since 0.99.6 are:
(1) git-*-script are no more. The commands installed do not
have any such suffix so users do not have to remember if
something is implemented as a shell script or not.
(2) Many command names with 'cache' in them are renamed with
'index' if that is what they mean.
There are backward compatibility symbolic links so that you and
Porcelains can keep using the old names, but the backward
compatibility support is expected to be removed in the near
future.
Signed-off-by: Junio C Hamano <junkio@cox.net>