git-gc(1)
=========

NAME
----
git-gc - Clean up unnecessary files and optimize the local repository


SYNOPSIS
--------
[verse]
'git gc' [--aggressive] [--auto] [--quiet] [--prune=<date> | --no-prune] [--force] [--keep-largest-pack]

DESCRIPTION
-----------
Runs a number of housekeeping tasks within the current repository,
such as compressing file revisions (to reduce disk space and increase
performance), removing unreachable objects which may have been
created from prior invocations of 'git add', packing refs, pruning
reflog, rerere metadata or stale working trees. May also update ancillary
indexes such as the commit-graph.

When common porcelain operations that create objects are run, they
will check whether the repository has grown substantially since the
last maintenance, and if so run `git gc` automatically. See `gc.auto`
below for how to disable this behavior.

Running `git gc` manually should only be needed when adding objects to
a repository without regularly running such porcelain commands, to do
a one-off repository optimization, or e.g. to clean up a suboptimal
mass-import. See the "PACKFILE OPTIMIZATION" section in
linkgit:git-fast-import[1] for more details on the import case.
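
As a rough illustration of a one-off manual run, the sketch below builds
a throwaway repository (the directory, file name and identity are
placeholders, not part of this manual) and uses `git count-objects -v`
to compare loose-object and pack counts before and after `git gc`.

```shell
#!/bin/sh
# Sketch: observe the effect of a manual `git gc` on a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# A few commits, so that several loose objects accumulate.
for i in 1 2 3; do
    echo "revision $i" >file.txt
    git add file.txt
    git commit -q -m "revision $i"
done

# "count:" is the number of loose objects, "packs:" the number of packs.
loose_before=$(git count-objects -v | sed -n 's/^count: //p')
git gc --quiet
loose_after=$(git count-objects -v | sed -n 's/^count: //p')
packs_after=$(git count-objects -v | sed -n 's/^packs: //p')

echo "loose objects: $loose_before -> $loose_after, packs: $packs_after"
```

In a repository where every object is reachable, the loose objects
should end up consolidated into a single pack.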

OPTIONS
-------

--aggressive::
Usually 'git gc' runs very quickly while providing good disk
space utilization and performance. This option will cause
'git gc' to more aggressively optimize the repository at the expense
of taking much more time. The effects of this optimization are
mostly persistent. See the "AGGRESSIVE" section below for details.

--auto::
With this option, 'git gc' checks whether any housekeeping is
required; if not, it exits without performing any work.
+
See the `gc.auto` option in the "CONFIGURATION" section below for how
this heuristic works.
+
Once housekeeping is triggered by exceeding the limits of
configuration options such as `gc.auto` and `gc.autoPackLimit`, all
other housekeeping tasks (e.g. rerere, working trees, reflog...) will
be performed as well.

--[no-]cruft::
When expiring unreachable objects, pack them separately into a
cruft pack instead of storing them as loose objects. `--cruft`
is on by default.

--prune=<date>::
Prune loose objects older than `<date>` (default is 2 weeks ago,
overridable by the config variable `gc.pruneExpire`).
`--prune=now` prunes loose objects regardless of their age and
increases the risk of corruption if another process is writing to
the repository concurrently; see "NOTES" below. `--prune` is on by
default.

--no-prune::
Do not prune any loose objects.

--quiet::
Suppress all progress reports.

--force::
Force `git gc` to run even if there may be another `git gc`
instance running on this repository.

--keep-largest-pack::
All packs except the largest non-cruft pack, any packs marked
with a `.keep` file, and any cruft pack(s) are consolidated into
a single pack. When this option is used, `gc.bigPackThreshold`
is ignored.
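
The difference between the default grace period and `--prune=now` can
be sketched as follows. The scratch repository and the orphaned blob are
made up for illustration; the script assumes nothing else is writing to
the repository while it runs.

```shell
#!/bin/sh
# Sketch: an unreferenced object survives a default gc but not --prune=now.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo "kept" >file.txt
git add file.txt
git commit -q -m "initial"

# Write a blob that no ref, index entry or reflog points at.
orphan=$(echo "never referenced" | git hash-object -w --stdin)

# With the default expiry the object is too recent to be pruned.
git gc --quiet
if git cat-file -e "$orphan" 2>/dev/null; then
    after_default=present
else
    after_default=gone
fi

# --prune=now removes unreachable objects regardless of their age.
git gc --quiet --prune=now
if git cat-file -e "$orphan" 2>/dev/null; then
    after_now=present
else
    after_now=gone
fi

echo "default gc: $after_default, --prune=now: $after_now"
```

Whether the surviving object sits in a cruft pack or stays loose after
the first run depends on whether `--cruft` is in effect; either way it
remains readable until it is pruned.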

AGGRESSIVE
----------

When the `--aggressive` option is supplied, linkgit:git-repack[1] will
be invoked with the `-f` flag, which in turn will pass
`--no-reuse-delta` to linkgit:git-pack-objects[1]. This will throw
away any existing deltas and re-compute them, at the expense of
spending much more time on the repacking.

The effects of this are mostly persistent, e.g. when packs and loose
objects are coalesced into another pack the existing deltas in
that pack might get re-used, but there are also various cases where we
might pick a sub-optimal delta from a newer pack instead.

Furthermore, supplying `--aggressive` will tweak the `--depth` and
`--window` options passed to linkgit:git-repack[1]. See the
`gc.aggressiveDepth` and `gc.aggressiveWindow` settings below. By
using a larger window size we're more likely to find more optimal
deltas.

It's probably not worth it to use this option on a given repository
without running tailored performance benchmarks on it. It takes a lot
more time, and the resulting space/delta optimization may or may not
be worth it. Not using this at all is the right trade-off for most
users and their repositories.
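
A minimal way to start such a comparison is sketched below. The scratch
repository and the window and depth values are purely illustrative
assumptions; a real benchmark would use your own repository and also
measure run time and the performance of later operations.

```shell
#!/bin/sh
# Sketch: compare pack size after a default gc and an aggressive gc.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Build a small history of similar revisions so there is something to delta.
i=1
while [ "$i" -le 20 ]; do
    echo "line $i" >>data.txt
    git add data.txt
    git commit -q -m "revision $i"
    i=$((i + 1))
done

git gc --quiet
size_default=$(git count-objects -v | sed -n 's/^size-pack: //p')

# Illustrative values only; pick them based on your own measurements.
git config gc.aggressiveWindow 100
git config gc.aggressiveDepth 20
git gc --quiet --aggressive
size_aggressive=$(git count-objects -v | sed -n 's/^size-pack: //p')
packs=$(git count-objects -v | sed -n 's/^packs: //p')

echo "pack size (KiB): default=$size_default aggressive=$size_aggressive"
```

On a history this small the two sizes may well be identical, which is
itself a hint that `--aggressive` is not buying anything there.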

CONFIGURATION
-------------

include::includes/cmd-config-section-all.txt[]

include::config/gc.txt[]

NOTES
-----

'git gc' tries very hard not to delete objects that are referenced
anywhere in your repository. In particular, it will keep not only
objects referenced by your current set of branches and tags, but also
objects referenced by the index, remote-tracking branches, reflogs
(which may reference commits in branches that were later amended or
rewound), and anything else in the refs/* namespace. Note that a note
(of the kind created by 'git notes') attached to an object does not
contribute to keeping the object alive. If you are expecting some
objects to be deleted and they aren't, check all of those locations
and decide whether it makes sense in your case to remove those
references.

On the other hand, when 'git gc' runs concurrently with another process,
there is a risk of it deleting an object that the other process is using
but hasn't created a reference to. This may just cause the other process
to fail or may corrupt the repository if the other process later adds a
reference to the deleted object. Git has two features that significantly
mitigate this problem:

. Any object with modification time newer than the `--prune` date is kept,
  along with everything reachable from it.

. Most operations that add an object to the database update the
  modification time of the object if it is already present so that #1
  applies.

However, these features fall short of a complete solution, so users who
run commands concurrently have to live with some risk of corruption (which
seems to be low in practice).
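
The role of reflogs in keeping objects alive can be sketched as below.
The scratch repository is hypothetical, and expiring every reflog entry
with `git reflog expire --expire=now --all` discards recovery
information, so treat this as a demonstration rather than a routine
recommendation.

```shell
#!/bin/sh
# Sketch: an amended commit stays alive until its reflog entries are gone.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo "content" >file.txt
git add file.txt
git commit -q -m "original"

# After the amend, only the reflogs still mention the old commit.
old=$(git rev-parse HEAD)
git commit -q --amend -m "amended"

git gc --quiet --prune=now
if git cat-file -e "$old" 2>/dev/null; then
    with_reflog=present
else
    with_reflog=gone
fi

# Drop all reflog entries, then collect again.
git reflog expire --expire=now --all
git gc --quiet --prune=now
if git cat-file -e "$old" 2>/dev/null; then
    without_reflog=present
else
    without_reflog=gone
fi

echo "with reflog: $with_reflog, without reflog: $without_reflog"
```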

HOOKS
-----

The 'git gc --auto' command will run the 'pre-auto-gc' hook. See
linkgit:githooks[5] for more information.
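
As a sketch of what such a hook might look like: per
linkgit:githooks[5], a non-zero exit status from 'pre-auto-gc' makes
`git gc --auto` abort. The marker-file policy and the `no-auto-gc` file
name below are invented for this example. The script installs the hook
in a scratch repository and runs it directly to show both outcomes.

```shell
#!/bin/sh
# Sketch: a pre-auto-gc hook that vetoes automatic gc while a marker exists.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p .git/hooks

cat >.git/hooks/pre-auto-gc <<'EOF'
#!/bin/sh
# Hypothetical policy: skip automatic gc while the marker file exists.
marker="$(git rev-parse --git-dir)/no-auto-gc"
if [ -e "$marker" ]; then
    echo "pre-auto-gc: marker present, skipping automatic gc" >&2
    exit 1
fi
exit 0
EOF
chmod +x .git/hooks/pre-auto-gc

# Run the hook by hand: first without the marker, then with it.
if .git/hooks/pre-auto-gc; then without_marker=allowed; else without_marker=blocked; fi
touch .git/no-auto-gc
if .git/hooks/pre-auto-gc 2>/dev/null; then with_marker=allowed; else with_marker=blocked; fi

echo "without marker: $without_marker, with marker: $with_marker"
```

The hook is only consulted when `git gc --auto` has decided that
housekeeping is needed, so running `git gc --auto` in a tiny repository
will usually not invoke it at all.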

SEE ALSO
--------
linkgit:git-prune[1]
linkgit:git-reflog[1]
linkgit:git-repack[1]
linkgit:git-rerere[1]

GIT
---
Part of the linkgit:git[1] suite