git-multi-pack-index(1)
=======================

NAME
----
git-multi-pack-index - Write and verify multi-pack-indexes


SYNOPSIS
--------
[verse]
'git multi-pack-index' [--object-dir=<dir>] [--[no-]bitmap] <sub-command>

DESCRIPTION
-----------
Write or verify a multi-pack-index (MIDX) file.

OPTIONS
-------
--object-dir=<dir>::
	Use given directory for the location of Git objects. We check
	`<dir>/pack/multi-pack-index` for the current MIDX file, and
	`<dir>/pack` for the pack-files to index.
+
`<dir>` must be an alternate of the current repository.

--[no-]progress::
	Turn progress on/off explicitly. If neither is specified, progress is
	shown if standard error is connected to a terminal. Supported by
	sub-commands `write`, `verify`, `expire`, and `repack`.

The following sub-commands are available:

write::
	Write a new MIDX file. The following options are available for
	the `write` sub-command:
+
--
	--preferred-pack=<pack>::
		Optionally specify the tie-breaking pack used when
		multiple packs contain the same object. `<pack>` must
		contain at least one object. If not given, ties are
		broken in favor of the pack with the lowest mtime.

	--[no-]bitmap::
		Control whether or not a multi-pack bitmap is written.

	--stdin-packs::
		Write a multi-pack index containing only the set of
		line-delimited pack index basenames provided over stdin.
	--refs-snapshot=<path>::
		With `--bitmap`, optionally specify a file which
		contains a "refs snapshot" taken prior to repacking.
+
A reference snapshot is composed of line-delimited OIDs corresponding to
the reference tips, usually taken by `git repack` prior to generating a
new pack. A line may optionally start with a `+` character to indicate
that the reference which corresponds to that OID is "preferred" (see
linkgit:git-config[1]'s `pack.preferBitmapTips`).
+
The file given at `<path>` is expected to be readable, and can contain
duplicates. (If an OID is listed more than once, it is marked as
preferred if at least one instance of it begins with the special `+`
marker.)
--

verify::
	Verify the contents of the MIDX file.

expire::
	Delete the pack-files that are tracked by the MIDX file, but
	have no objects referenced by the MIDX (with the exception of
	`.keep` packs and cruft packs). Rewrite the MIDX file afterward
	to remove all references to these pack-files.

repack::
	Create a new pack-file containing objects in small pack-files
	referenced by the multi-pack-index. If the size given by the
	`--batch-size=<size>` argument is zero, then create a pack
	containing all objects referenced by the multi-pack-index. For
	a non-zero batch size, select the pack-files by examining packs
	from oldest to newest, computing the "expected size" by counting
	the number of objects in the pack referenced by the
	multi-pack-index, then dividing by the total number of objects in
	the pack and multiplying by the pack size. We select packs with
	expected size below the batch size until the set of packs has
	total expected size at least the batch size, or all pack-files
	are considered. If only one pack-file is selected, then do
	nothing. If a new pack-file is created, rewrite the
	multi-pack-index to reference the new pack-file. A later run of
	'git multi-pack-index expire' will delete the pack-files that
	were part of this batch.
+
If `repack.packKeptObjects` is `false`, then any pack-files with an
associated `.keep` file will not be selected for the batch to repack.


EXAMPLES
--------
* Write a MIDX file for the packfiles in the current `.git` directory.
+
-----------------------------------------------
$ git multi-pack-index write
-----------------------------------------------
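* Write a MIDX file covering only a chosen set of packs by feeding their
index basenames to `--stdin-packs`. This is a minimal sketch, assuming
the packs live in the default `.git/objects/pack` directory;
`list_pack_indexes` is a hypothetical helper, not part of Git, and the
pipeline that calls Git is left as a comment.
+
```shell
# list_pack_indexes DIR: print the basename of every pack index in DIR,
# one per line. Hypothetical helper for illustration only.
list_pack_indexes () {
	for idx in "$1"/pack-*.idx
	do
		# Skip the unexpanded glob when DIR holds no pack indexes.
		test -e "$idx" || continue
		basename "$idx"
	done
}

# Assumed usage, run from the top of a repository:
# list_pack_indexes .git/objects/pack |
#	git multi-pack-index write --stdin-packs
```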

* Write a MIDX file for the packfiles in the current `.git` directory with a
corresponding bitmap.
+
-------------------------------------------------------------
$ git multi-pack-index write --preferred-pack=<pack> --bitmap
-------------------------------------------------------------
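* Build a refs snapshot for `write --bitmap --refs-snapshot=<path>`. This
is a sketch of the file format only (one OID per line, with a leading `+`
for preferred tips). `make_refs_snapshot` and the file name
`snapshot.txt` are hypothetical, and matching on a single ref prefix is
an assumed simplification of `pack.preferBitmapTips`.
+
```shell
# make_refs_snapshot PREFIX: read "<oid> <refname>" lines on stdin and
# print one OID per line, adding a leading "+" for refs under PREFIX.
# Hypothetical helper; PREFIX only approximates pack.preferBitmapTips.
make_refs_snapshot () {
	awk -v prefix="$1" '
		# A non-empty prefix match marks the tip as preferred.
		prefix != "" && index($2, prefix) == 1 { print "+" $1; next }
		{ print $1 }
	'
}

# Assumed usage (snapshot.txt is an arbitrary file name):
# git for-each-ref --format='%(objectname) %(refname)' refs/heads/ |
#	make_refs_snapshot refs/heads/main >snapshot.txt
# git multi-pack-index write --bitmap --refs-snapshot=snapshot.txt
```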

* Write a MIDX file for the packfiles in an alternate object store.
+
-----------------------------------------------
$ git multi-pack-index --object-dir <alt> write
-----------------------------------------------

* Verify the MIDX file for the packfiles in the current `.git` directory.
+
-----------------------------------------------
$ git multi-pack-index verify
-----------------------------------------------
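* Illustrate how `repack` chooses a batch for a non-zero `--batch-size`.
This is only a sketch of the selection rule described under `repack`,
not Git's implementation: `select_batch` and its input format
(`<name> <referenced-objects> <total-objects> <pack-size>` per line,
oldest pack first) are hypothetical.
+
```shell
# select_batch BATCH_SIZE: read "<name> <referenced> <total> <size>" lines
# (oldest pack first) and print the packs that would form the batch.
# Hypothetical helper for illustration; assumes BATCH_SIZE is non-zero.
# A caller would do nothing when fewer than two packs are printed.
select_batch () {
	awk -v batch="$1" '
		$3 == 0 { next }                   # skip packs with no objects
		{
			# referenced objects / total objects * pack size
			expected = int($2 * $4 / $3)
			if (expected >= batch)
				next               # not below the batch size
			print $1
			total += expected
			if (total >= batch)
				exit               # batch is large enough
		}
	'
}

# The real commands, for comparison:
# git multi-pack-index repack --batch-size=<size>
# git multi-pack-index expire
```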


SEE ALSO
--------
See link:technical/multi-pack-index.html[The Multi-Pack-Index Design
Document] and linkgit:gitformat-pack[5] for more information on the
multi-pack-index feature and its file format.


GIT
---
Part of the linkgit:git[1] suite