git-commit-vandalism/Documentation/git-gc.txt

169 lines
5.8 KiB
Plaintext
Raw Normal View History

git-gc(1)
=========
NAME
----
git-gc - Cleanup unnecessary files and optimize the local repository
SYNOPSIS
--------
[verse]
'git gc' [--aggressive] [--auto] [--quiet] [--prune=<date> | --no-prune] [--force] [--keep-largest-pack]
DESCRIPTION
-----------
Runs a number of housekeeping tasks within the current repository,
such as compressing file revisions (to reduce disk space and increase
performance), removing unreachable objects which may have been
created from prior invocations of 'git add', packing refs, pruning
reflog, rerere metadata or stale working trees. May also update ancillary
indexes such as the commit-graph.
When common porcelain operations that create objects are run, they
will check whether the repository has grown substantially since the
last maintenance, and if so run `git gc` automatically. See `gc.auto`
below for how to disable this behavior.
Running `git gc` manually should only be needed when adding objects to
a repository without regularly running such porcelain commands, to do
a one-off repository optimization, or e.g. to clean up a suboptimal
mass-import. See the "PACKFILE OPTIMIZATION" section in
linkgit:git-fast-import[1] for more details on the import case.
OPTIONS
-------
--aggressive::
Usually 'git gc' runs very quickly while providing good disk
space utilization and performance. This option will cause
'git gc' to more aggressively optimize the repository at the expense
of taking much more time. The effects of this optimization are
mostly persistent. See the "AGGRESSIVE" section below for details.
--auto::
With this option, 'git gc' checks whether any housekeeping is
required; if not, it exits without performing any work.
+
See the `gc.auto` option in the "CONFIGURATION" section below for how
this heuristic works.
+
Once housekeeping is triggered by exceeding the limits of
configuration options such as `gc.auto` and `gc.autoPackLimit`, all
other housekeeping tasks (e.g. rerere, working trees, reflog...) will
be performed as well.
--cruft::
When expiring unreachable objects, pack them separately into a
cruft pack instead of storing the loose objects as loose
objects.
--prune=<date>::
Prune loose objects older than date (default is 2 weeks ago,
overridable by the config variable `gc.pruneExpire`).
--prune=now prunes loose objects regardless of their age and
increases the risk of corruption if another process is writing to
the repository concurrently; see "NOTES" below. --prune is on by
default.
--no-prune::
Do not prune any loose objects.
--quiet::
Suppress all progress reports.
--force::
Force `git gc` to run even if there may be another `git gc`
instance running on this repository.
--keep-largest-pack::
All packs except the largest pack and those marked with a
`.keep` files are consolidated into a single pack. When this
option is used, `gc.bigPackThreshold` is ignored.
AGGRESSIVE
----------
When the `--aggressive` option is supplied, linkgit:git-repack[1] will
be invoked with the `-f` flag, which in turn will pass
`--no-reuse-delta` to linkgit:git-pack-objects[1]. This will throw
away any existing deltas and re-compute them, at the expense of
spending much more time on the repacking.
The effects of this are mostly persistent, e.g. when packs and loose
objects are coalesced into one another pack the existing deltas in
that pack might get re-used, but there are also various cases where we
might pick a sub-optimal delta from a newer pack instead.
Furthermore, supplying `--aggressive` will tweak the `--depth` and
`--window` options passed to linkgit:git-repack[1]. See the
`gc.aggressiveDepth` and `gc.aggressiveWindow` settings below. By
using a larger window size we're more likely to find more optimal
deltas.
It's probably not worth it to use this option on a given repository
without running tailored performance benchmarks on it. It takes a lot
more time, and the resulting space/delta optimization may or may not
be worth it. Not using this at all is the right trade-off for most
users and their repositories.
CONFIGURATION
-------------
The below documentation is the same as what's found in
linkgit:git-config[1]:
include::config/gc.txt[]
NOTES
-----
'git gc' tries very hard not to delete objects that are referenced
Recommend git-filter-repo instead of git-filter-branch filter-branch suffers from a deluge of disguised dangers that disfigure history rewrites (i.e. deviate from the deliberate changes). Many of these problems are unobtrusive and can easily go undiscovered until the new repository is in use. This can result in problems ranging from an even messier history than what led folks to filter-branch in the first place, to data loss or corruption. These issues cannot be backward compatibly fixed, so add a warning to both filter-branch and its manpage recommending that another tool (such as filter-repo) be used instead. Also, update other manpages that referenced filter-branch. Several of these needed updates even if we could continue recommending filter-branch, either due to implying that something was unique to filter-branch when it applied more generally to all history rewriting tools (e.g. BFG, reposurgeon, fast-import, filter-repo), or because something about filter-branch was used as an example despite other more commonly known examples now existing. Reword these sections to fix these issues and to avoid recommending filter-branch. Finally, remove the section explaining BFG Repo Cleaner as an alternative to filter-branch. I feel somewhat bad about this, especially since I feel like I learned so much from BFG that I put to good use in filter-repo (which is much more than I can say for filter-branch), but keeping that section presented a few problems: * In order to recommend that people quit using filter-branch, we need to provide them a recomendation for something else to use that can handle all the same types of rewrites. To my knowledge, filter-repo is the only such tool. So it needs to be mentioned. * I don't want to give conflicting recommendations to users * If we recommend two tools, we shouldn't expect users to learn both and pick which one to use; we should explain which problems one can solve that the other can't or when one is much faster than the other. * BFG and filter-repo have similar performance * All filtering types that BFG can do, filter-repo can also do. In fact, filter-repo comes with a reimplementation of BFG named bfg-ish which provides the same user-interface as BFG but with several bugfixes and new features that are hard to implement in BFG due to its technical underpinnings. While I could still mention both tools, it seems like I would need to provide some kind of comparison and I would ultimately just say that filter-repo can do everything BFG can, so ultimately it seems that it is just better to remove that section altogether. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05 00:32:38 +02:00
anywhere in your repository. In particular, it will keep not only
objects referenced by your current set of branches and tags, but also
objects referenced by the index, remote-tracking branches, reflogs
(which may reference commits in branches that were later amended or
rewound), and anything else in the refs/* namespace. Note that a note
(of the kind created by 'git notes') attached to an object does not
contribute in keeping the object alive. If you are expecting some
objects to be deleted and they aren't, check all of those locations
and decide whether it makes sense in your case to remove those
references.
On the other hand, when 'git gc' runs concurrently with another process,
there is a risk of it deleting an object that the other process is using
but hasn't created a reference to. This may just cause the other process
to fail or may corrupt the repository if the other process later adds a
reference to the deleted object. Git has two features that significantly
mitigate this problem:
. Any object with modification time newer than the `--prune` date is kept,
along with everything reachable from it.
. Most operations that add an object to the database update the
modification time of the object if it is already present so that #1
applies.
However, these features fall short of a complete solution, so users who
run commands concurrently have to live with some risk of corruption (which
seems to be low in practice).
HOOKS
-----
The 'git gc --auto' command will run the 'pre-auto-gc' hook. See
linkgit:githooks[5] for more information.
SEE ALSO
--------
linkgit:git-prune[1]
linkgit:git-reflog[1]
linkgit:git-repack[1]
linkgit:git-rerere[1]
GIT
---
Part of the linkgit:git[1] suite