Merge branch 'tb/reverse-midx'
An on-disk reverse-index to map the in-pack location of an object back to its object name across multiple packfiles is introduced. * tb/reverse-midx: midx.c: improve cache locality in midx_pack_order_cmp() pack-revindex: write multi-pack reverse indexes pack-write.c: extract 'write_rev_file_order' pack-revindex: read multi-pack reverse indexes Documentation/technical: describe multi-pack reverse indexes midx: make some functions non-static midx: keep track of the checksum midx: don't free midx_name early midx: allow marking a pack as preferred t/helper/test-read-midx.c: add '--show-objects' builtin/multi-pack-index.c: display usage on unrecognized command builtin/multi-pack-index.c: don't enter bogus cmd_mode builtin/multi-pack-index.c: split sub-commands builtin/multi-pack-index.c: define common usage with a macro builtin/multi-pack-index.c: don't handle 'progress' separately builtin/multi-pack-index.c: inline 'flags' with options
This commit is contained in:
commit
e6b971fcf5
@ -9,7 +9,8 @@ git-multi-pack-index - Write and verify multi-pack-indexes
|
|||||||
SYNOPSIS
|
SYNOPSIS
|
||||||
--------
|
--------
|
||||||
[verse]
|
[verse]
|
||||||
'git multi-pack-index' [--object-dir=<dir>] [--[no-]progress] <subcommand>
|
'git multi-pack-index' [--object-dir=<dir>] [--[no-]progress]
|
||||||
|
[--preferred-pack=<pack>] <subcommand>
|
||||||
|
|
||||||
DESCRIPTION
|
DESCRIPTION
|
||||||
-----------
|
-----------
|
||||||
@ -30,7 +31,16 @@ OPTIONS
|
|||||||
The following subcommands are available:
|
The following subcommands are available:
|
||||||
|
|
||||||
write::
|
write::
|
||||||
Write a new MIDX file.
|
Write a new MIDX file. The following options are available for
|
||||||
|
the `write` sub-command:
|
||||||
|
+
|
||||||
|
--
|
||||||
|
--preferred-pack=<pack>::
|
||||||
|
Optionally specify the tie-breaking pack used when
|
||||||
|
multiple packs contain the same object. If not given,
|
||||||
|
ties are broken in favor of the pack with the lowest
|
||||||
|
mtime.
|
||||||
|
--
|
||||||
|
|
||||||
verify::
|
verify::
|
||||||
Verify the contents of the MIDX file.
|
Verify the contents of the MIDX file.
|
||||||
|
@ -43,8 +43,9 @@ Design Details
|
|||||||
a change in format.
|
a change in format.
|
||||||
|
|
||||||
- The MIDX keeps only one record per object ID. If an object appears
|
- The MIDX keeps only one record per object ID. If an object appears
|
||||||
in multiple packfiles, then the MIDX selects the copy in the most-
|
in multiple packfiles, then the MIDX selects the copy in the
|
||||||
recently modified packfile.
|
preferred packfile, otherwise selecting from the most-recently
|
||||||
|
modified packfile.
|
||||||
|
|
||||||
- If there exist packfiles in the pack directory not registered in
|
- If there exist packfiles in the pack directory not registered in
|
||||||
the MIDX, then those packfiles are loaded into the `packed_git`
|
the MIDX, then those packfiles are loaded into the `packed_git`
|
||||||
|
@ -379,3 +379,86 @@ CHUNK DATA:
|
|||||||
TRAILER:
|
TRAILER:
|
||||||
|
|
||||||
Index checksum of the above contents.
|
Index checksum of the above contents.
|
||||||
|
|
||||||
|
== multi-pack-index reverse indexes
|
||||||
|
|
||||||
|
Similar to the pack-based reverse index, the multi-pack index can also
|
||||||
|
be used to generate a reverse index.
|
||||||
|
|
||||||
|
Instead of mapping between offset, pack-, and index position, this
|
||||||
|
reverse index maps between an object's position within the MIDX, and
|
||||||
|
that object's position within a pseudo-pack that the MIDX describes
|
||||||
|
(i.e., the ith entry of the multi-pack reverse index holds the MIDX
|
||||||
|
position of ith object in pseudo-pack order).
|
||||||
|
|
||||||
|
To clarify the difference between these orderings, consider a multi-pack
|
||||||
|
reachability bitmap (which does not yet exist, but is what we are
|
||||||
|
building towards here). Each bit needs to correspond to an object in the
|
||||||
|
MIDX, and so we need an efficient mapping from bit position to MIDX
|
||||||
|
position.
|
||||||
|
|
||||||
|
One solution is to let bits occupy the same position in the oid-sorted
|
||||||
|
index stored by the MIDX. But because oids are effectively random, their
|
||||||
|
resulting reachability bitmaps would have no locality, and thus compress
|
||||||
|
poorly. (This is the reason that single-pack bitmaps use the pack
|
||||||
|
ordering, and not the .idx ordering, for the same purpose.)
|
||||||
|
|
||||||
|
So we'd like to define an ordering for the whole MIDX based around
|
||||||
|
pack ordering, which has far better locality (and thus compresses more
|
||||||
|
efficiently). We can think of a pseudo-pack created by the concatenation
|
||||||
|
of all of the packs in the MIDX. E.g., if we had a MIDX with three packs
|
||||||
|
(a, b, c), with 10, 15, and 20 objects respectively, we can imagine an
|
||||||
|
ordering of the objects like:
|
||||||
|
|
||||||
|
|a,0|a,1|...|a,9|b,0|b,1|...|b,14|c,0|c,1|...|c,19|
|
||||||
|
|
||||||
|
where the ordering of the packs is defined by the MIDX's pack list,
|
||||||
|
and then the ordering of objects within each pack is the same as the
|
||||||
|
order in the actual packfile.
|
||||||
|
|
||||||
|
Given the list of packs and their counts of objects, you can
|
||||||
|
naïvely reconstruct that pseudo-pack ordering (e.g., the object at
|
||||||
|
position 27 must be (c,1) because packs "a" and "b" consumed 25 of the
|
||||||
|
slots). But there's a catch. Objects may be duplicated between packs, in
|
||||||
|
which case the MIDX only stores one pointer to the object (and thus we'd
|
||||||
|
want only one slot in the bitmap).
|
||||||
|
|
||||||
|
Callers could handle duplicates themselves by reading objects in order
|
||||||
|
of their bit-position, but that's linear in the number of objects, and
|
||||||
|
much too expensive for ordinary bitmap lookups. Building a reverse index
|
||||||
|
solves this, since it is the logical inverse of the index, and that
|
||||||
|
index has already removed duplicates. But, building a reverse index on
|
||||||
|
the fly can be expensive. Since we already have an on-disk format for
|
||||||
|
pack-based reverse indexes, let's reuse it for the MIDX's pseudo-pack,
|
||||||
|
too.
|
||||||
|
|
||||||
|
Objects from the MIDX are ordered as follows to string together the
|
||||||
|
pseudo-pack. Let `pack(o)` return the pack from which `o` was selected
|
||||||
|
by the MIDX, and define an ordering of packs based on their numeric ID
|
||||||
|
(as stored by the MIDX). Let `offset(o)` return the object offset of `o`
|
||||||
|
within `pack(o)`. Then, compare `o1` and `o2` as follows:
|
||||||
|
|
||||||
|
- If one of `pack(o1)` and `pack(o2)` is preferred and the other
|
||||||
|
is not, then the preferred one sorts first.
|
||||||
|
+
|
||||||
|
(This is a detail that allows the MIDX bitmap to determine which
|
||||||
|
pack should be used by the pack-reuse mechanism, since it can ask
|
||||||
|
the MIDX for the pack containing the object at bit position 0).
|
||||||
|
|
||||||
|
- If `pack(o1) ≠ pack(o2)`, then sort the two objects in descending
|
||||||
|
order based on the pack ID.
|
||||||
|
|
||||||
|
- Otherwise, `pack(o1) = pack(o2)`, and the objects are sorted in
|
||||||
|
pack-order (i.e., `o1` sorts ahead of `o2` exactly when `offset(o1)
|
||||||
|
< offset(o2)`).
|
||||||
|
|
||||||
|
In short, a MIDX's pseudo-pack is the de-duplicated concatenation of
|
||||||
|
objects in packs stored by the MIDX, laid out in pack order, and the
|
||||||
|
packs arranged in MIDX order (with the preferred pack coming first).
|
||||||
|
|
||||||
|
Finally, note that the MIDX's reverse index is not stored as a chunk in
|
||||||
|
the multi-pack-index itself. This is done because the reverse index
|
||||||
|
includes the checksum of the pack or MIDX to which it belongs, which
|
||||||
|
makes it impossible to write in the MIDX. To avoid races when rewriting
|
||||||
|
the MIDX, a MIDX reverse index includes the MIDX's checksum in its
|
||||||
|
filename (e.g., `multi-pack-index-xyz.rev`).
|
||||||
|
@ -4,67 +4,181 @@
|
|||||||
#include "parse-options.h"
|
#include "parse-options.h"
|
||||||
#include "midx.h"
|
#include "midx.h"
|
||||||
#include "trace2.h"
|
#include "trace2.h"
|
||||||
|
#include "object-store.h"
|
||||||
|
|
||||||
|
#define BUILTIN_MIDX_WRITE_USAGE \
|
||||||
|
N_("git multi-pack-index [<options>] write [--preferred-pack=<pack>]")
|
||||||
|
|
||||||
|
#define BUILTIN_MIDX_VERIFY_USAGE \
|
||||||
|
N_("git multi-pack-index [<options>] verify")
|
||||||
|
|
||||||
|
#define BUILTIN_MIDX_EXPIRE_USAGE \
|
||||||
|
N_("git multi-pack-index [<options>] expire")
|
||||||
|
|
||||||
|
#define BUILTIN_MIDX_REPACK_USAGE \
|
||||||
|
N_("git multi-pack-index [<options>] repack [--batch-size=<size>]")
|
||||||
|
|
||||||
|
static char const * const builtin_multi_pack_index_write_usage[] = {
|
||||||
|
BUILTIN_MIDX_WRITE_USAGE,
|
||||||
|
NULL
|
||||||
|
};
|
||||||
|
static char const * const builtin_multi_pack_index_verify_usage[] = {
|
||||||
|
BUILTIN_MIDX_VERIFY_USAGE,
|
||||||
|
NULL
|
||||||
|
};
|
||||||
|
static char const * const builtin_multi_pack_index_expire_usage[] = {
|
||||||
|
BUILTIN_MIDX_EXPIRE_USAGE,
|
||||||
|
NULL
|
||||||
|
};
|
||||||
|
static char const * const builtin_multi_pack_index_repack_usage[] = {
|
||||||
|
BUILTIN_MIDX_REPACK_USAGE,
|
||||||
|
NULL
|
||||||
|
};
|
||||||
static char const * const builtin_multi_pack_index_usage[] = {
|
static char const * const builtin_multi_pack_index_usage[] = {
|
||||||
N_("git multi-pack-index [<options>] (write|verify|expire|repack --batch-size=<size>)"),
|
BUILTIN_MIDX_WRITE_USAGE,
|
||||||
|
BUILTIN_MIDX_VERIFY_USAGE,
|
||||||
|
BUILTIN_MIDX_EXPIRE_USAGE,
|
||||||
|
BUILTIN_MIDX_REPACK_USAGE,
|
||||||
NULL
|
NULL
|
||||||
};
|
};
|
||||||
|
|
||||||
static struct opts_multi_pack_index {
|
static struct opts_multi_pack_index {
|
||||||
const char *object_dir;
|
const char *object_dir;
|
||||||
|
const char *preferred_pack;
|
||||||
unsigned long batch_size;
|
unsigned long batch_size;
|
||||||
int progress;
|
unsigned flags;
|
||||||
} opts;
|
} opts;
|
||||||
|
|
||||||
int cmd_multi_pack_index(int argc, const char **argv,
|
static struct option common_opts[] = {
|
||||||
const char *prefix)
|
OPT_FILENAME(0, "object-dir", &opts.object_dir,
|
||||||
{
|
N_("object directory containing set of packfile and pack-index pairs")),
|
||||||
unsigned flags = 0;
|
OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
|
||||||
|
OPT_END(),
|
||||||
|
};
|
||||||
|
|
||||||
static struct option builtin_multi_pack_index_options[] = {
|
static struct option *add_common_options(struct option *prev)
|
||||||
OPT_FILENAME(0, "object-dir", &opts.object_dir,
|
{
|
||||||
N_("object directory containing set of packfile and pack-index pairs")),
|
return parse_options_concat(common_opts, prev);
|
||||||
OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
|
}
|
||||||
|
|
||||||
|
static int cmd_multi_pack_index_write(int argc, const char **argv)
|
||||||
|
{
|
||||||
|
struct option *options;
|
||||||
|
static struct option builtin_multi_pack_index_write_options[] = {
|
||||||
|
OPT_STRING(0, "preferred-pack", &opts.preferred_pack,
|
||||||
|
N_("preferred-pack"),
|
||||||
|
N_("pack for reuse when computing a multi-pack bitmap")),
|
||||||
|
OPT_END(),
|
||||||
|
};
|
||||||
|
|
||||||
|
options = add_common_options(builtin_multi_pack_index_write_options);
|
||||||
|
|
||||||
|
trace2_cmd_mode(argv[0]);
|
||||||
|
|
||||||
|
argc = parse_options(argc, argv, NULL,
|
||||||
|
options, builtin_multi_pack_index_write_usage,
|
||||||
|
PARSE_OPT_KEEP_UNKNOWN);
|
||||||
|
if (argc)
|
||||||
|
usage_with_options(builtin_multi_pack_index_write_usage,
|
||||||
|
options);
|
||||||
|
|
||||||
|
FREE_AND_NULL(options);
|
||||||
|
|
||||||
|
return write_midx_file(opts.object_dir, opts.preferred_pack,
|
||||||
|
opts.flags);
|
||||||
|
}
|
||||||
|
|
||||||
|
static int cmd_multi_pack_index_verify(int argc, const char **argv)
|
||||||
|
{
|
||||||
|
struct option *options = common_opts;
|
||||||
|
|
||||||
|
trace2_cmd_mode(argv[0]);
|
||||||
|
|
||||||
|
argc = parse_options(argc, argv, NULL,
|
||||||
|
options, builtin_multi_pack_index_verify_usage,
|
||||||
|
PARSE_OPT_KEEP_UNKNOWN);
|
||||||
|
if (argc)
|
||||||
|
usage_with_options(builtin_multi_pack_index_verify_usage,
|
||||||
|
options);
|
||||||
|
|
||||||
|
return verify_midx_file(the_repository, opts.object_dir, opts.flags);
|
||||||
|
}
|
||||||
|
|
||||||
|
static int cmd_multi_pack_index_expire(int argc, const char **argv)
|
||||||
|
{
|
||||||
|
struct option *options = common_opts;
|
||||||
|
|
||||||
|
trace2_cmd_mode(argv[0]);
|
||||||
|
|
||||||
|
argc = parse_options(argc, argv, NULL,
|
||||||
|
options, builtin_multi_pack_index_expire_usage,
|
||||||
|
PARSE_OPT_KEEP_UNKNOWN);
|
||||||
|
if (argc)
|
||||||
|
usage_with_options(builtin_multi_pack_index_expire_usage,
|
||||||
|
options);
|
||||||
|
|
||||||
|
return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
|
||||||
|
}
|
||||||
|
|
||||||
|
static int cmd_multi_pack_index_repack(int argc, const char **argv)
|
||||||
|
{
|
||||||
|
struct option *options;
|
||||||
|
static struct option builtin_multi_pack_index_repack_options[] = {
|
||||||
OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
|
OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
|
||||||
N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
|
N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
|
||||||
OPT_END(),
|
OPT_END(),
|
||||||
};
|
};
|
||||||
|
|
||||||
git_config(git_default_config, NULL);
|
options = add_common_options(builtin_multi_pack_index_repack_options);
|
||||||
|
|
||||||
opts.progress = isatty(2);
|
|
||||||
argc = parse_options(argc, argv, prefix,
|
|
||||||
builtin_multi_pack_index_options,
|
|
||||||
builtin_multi_pack_index_usage, 0);
|
|
||||||
|
|
||||||
if (!opts.object_dir)
|
|
||||||
opts.object_dir = get_object_directory();
|
|
||||||
if (opts.progress)
|
|
||||||
flags |= MIDX_PROGRESS;
|
|
||||||
|
|
||||||
if (argc == 0)
|
|
||||||
usage_with_options(builtin_multi_pack_index_usage,
|
|
||||||
builtin_multi_pack_index_options);
|
|
||||||
|
|
||||||
if (argc > 1) {
|
|
||||||
die(_("too many arguments"));
|
|
||||||
return 1;
|
|
||||||
}
|
|
||||||
|
|
||||||
trace2_cmd_mode(argv[0]);
|
trace2_cmd_mode(argv[0]);
|
||||||
|
|
||||||
if (!strcmp(argv[0], "repack"))
|
argc = parse_options(argc, argv, NULL,
|
||||||
return midx_repack(the_repository, opts.object_dir,
|
options,
|
||||||
(size_t)opts.batch_size, flags);
|
builtin_multi_pack_index_repack_usage,
|
||||||
if (opts.batch_size)
|
PARSE_OPT_KEEP_UNKNOWN);
|
||||||
die(_("--batch-size option is only for 'repack' subcommand"));
|
if (argc)
|
||||||
|
usage_with_options(builtin_multi_pack_index_repack_usage,
|
||||||
|
options);
|
||||||
|
|
||||||
if (!strcmp(argv[0], "write"))
|
FREE_AND_NULL(options);
|
||||||
return write_midx_file(opts.object_dir, flags);
|
|
||||||
if (!strcmp(argv[0], "verify"))
|
|
||||||
return verify_midx_file(the_repository, opts.object_dir, flags);
|
|
||||||
if (!strcmp(argv[0], "expire"))
|
|
||||||
return expire_midx_packs(the_repository, opts.object_dir, flags);
|
|
||||||
|
|
||||||
die(_("unrecognized subcommand: %s"), argv[0]);
|
return midx_repack(the_repository, opts.object_dir,
|
||||||
|
(size_t)opts.batch_size, opts.flags);
|
||||||
|
}
|
||||||
|
|
||||||
|
int cmd_multi_pack_index(int argc, const char **argv,
|
||||||
|
const char *prefix)
|
||||||
|
{
|
||||||
|
struct option *builtin_multi_pack_index_options = common_opts;
|
||||||
|
|
||||||
|
git_config(git_default_config, NULL);
|
||||||
|
|
||||||
|
if (isatty(2))
|
||||||
|
opts.flags |= MIDX_PROGRESS;
|
||||||
|
argc = parse_options(argc, argv, prefix,
|
||||||
|
builtin_multi_pack_index_options,
|
||||||
|
builtin_multi_pack_index_usage,
|
||||||
|
PARSE_OPT_STOP_AT_NON_OPTION);
|
||||||
|
|
||||||
|
if (!opts.object_dir)
|
||||||
|
opts.object_dir = get_object_directory();
|
||||||
|
|
||||||
|
if (argc == 0)
|
||||||
|
goto usage;
|
||||||
|
|
||||||
|
if (!strcmp(argv[0], "repack"))
|
||||||
|
return cmd_multi_pack_index_repack(argc, argv);
|
||||||
|
else if (!strcmp(argv[0], "write"))
|
||||||
|
return cmd_multi_pack_index_write(argc, argv);
|
||||||
|
else if (!strcmp(argv[0], "verify"))
|
||||||
|
return cmd_multi_pack_index_verify(argc, argv);
|
||||||
|
else if (!strcmp(argv[0], "expire"))
|
||||||
|
return cmd_multi_pack_index_expire(argc, argv);
|
||||||
|
else {
|
||||||
|
usage:
|
||||||
|
error(_("unrecognized subcommand: %s"), argv[0]);
|
||||||
|
usage_with_options(builtin_multi_pack_index_usage,
|
||||||
|
builtin_multi_pack_index_options);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
@ -721,7 +721,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
|
|||||||
remove_temporary_files();
|
remove_temporary_files();
|
||||||
|
|
||||||
if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0))
|
if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0))
|
||||||
write_midx_file(get_object_directory(), 0);
|
write_midx_file(get_object_directory(), NULL, 0);
|
||||||
|
|
||||||
string_list_clear(&names, 0);
|
string_list_clear(&names, 0);
|
||||||
string_list_clear(&rollback, 0);
|
string_list_clear(&rollback, 0);
|
||||||
|
219
midx.c
219
midx.c
@ -12,6 +12,7 @@
|
|||||||
#include "run-command.h"
|
#include "run-command.h"
|
||||||
#include "repository.h"
|
#include "repository.h"
|
||||||
#include "chunk-format.h"
|
#include "chunk-format.h"
|
||||||
|
#include "pack.h"
|
||||||
|
|
||||||
#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
|
#define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
|
||||||
#define MIDX_VERSION 1
|
#define MIDX_VERSION 1
|
||||||
@ -47,11 +48,22 @@ static uint8_t oid_version(void)
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static const unsigned char *get_midx_checksum(struct multi_pack_index *m)
|
||||||
|
{
|
||||||
|
return m->data + m->data_len - the_hash_algo->rawsz;
|
||||||
|
}
|
||||||
|
|
||||||
static char *get_midx_filename(const char *object_dir)
|
static char *get_midx_filename(const char *object_dir)
|
||||||
{
|
{
|
||||||
return xstrfmt("%s/pack/multi-pack-index", object_dir);
|
return xstrfmt("%s/pack/multi-pack-index", object_dir);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
char *get_midx_rev_filename(struct multi_pack_index *m)
|
||||||
|
{
|
||||||
|
return xstrfmt("%s/pack/multi-pack-index-%s.rev",
|
||||||
|
m->object_dir, hash_to_hex(get_midx_checksum(m)));
|
||||||
|
}
|
||||||
|
|
||||||
static int midx_read_oid_fanout(const unsigned char *chunk_start,
|
static int midx_read_oid_fanout(const unsigned char *chunk_start,
|
||||||
size_t chunk_size, void *data)
|
size_t chunk_size, void *data)
|
||||||
{
|
{
|
||||||
@ -239,7 +251,7 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid,
|
|||||||
return oid;
|
return oid;
|
||||||
}
|
}
|
||||||
|
|
||||||
static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos)
|
off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos)
|
||||||
{
|
{
|
||||||
const unsigned char *offset_data;
|
const unsigned char *offset_data;
|
||||||
uint32_t offset32;
|
uint32_t offset32;
|
||||||
@ -258,7 +270,7 @@ static off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos)
|
|||||||
return offset32;
|
return offset32;
|
||||||
}
|
}
|
||||||
|
|
||||||
static uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos)
|
uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos)
|
||||||
{
|
{
|
||||||
return get_be32(m->chunk_object_offsets +
|
return get_be32(m->chunk_object_offsets +
|
||||||
(off_t)pos * MIDX_CHUNK_OFFSET_WIDTH);
|
(off_t)pos * MIDX_CHUNK_OFFSET_WIDTH);
|
||||||
@ -431,6 +443,14 @@ static int pack_info_compare(const void *_a, const void *_b)
|
|||||||
return strcmp(a->pack_name, b->pack_name);
|
return strcmp(a->pack_name, b->pack_name);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static int idx_or_pack_name_cmp(const void *_va, const void *_vb)
|
||||||
|
{
|
||||||
|
const char *pack_name = _va;
|
||||||
|
const struct pack_info *compar = _vb;
|
||||||
|
|
||||||
|
return cmp_idx_or_pack_name(pack_name, compar->pack_name);
|
||||||
|
}
|
||||||
|
|
||||||
struct write_midx_context {
|
struct write_midx_context {
|
||||||
struct pack_info *info;
|
struct pack_info *info;
|
||||||
uint32_t nr;
|
uint32_t nr;
|
||||||
@ -443,8 +463,11 @@ struct write_midx_context {
|
|||||||
uint32_t entries_nr;
|
uint32_t entries_nr;
|
||||||
|
|
||||||
uint32_t *pack_perm;
|
uint32_t *pack_perm;
|
||||||
|
uint32_t *pack_order;
|
||||||
unsigned large_offsets_needed:1;
|
unsigned large_offsets_needed:1;
|
||||||
uint32_t num_large_offsets;
|
uint32_t num_large_offsets;
|
||||||
|
|
||||||
|
int preferred_pack_idx;
|
||||||
};
|
};
|
||||||
|
|
||||||
static void add_pack_to_midx(const char *full_path, size_t full_path_len,
|
static void add_pack_to_midx(const char *full_path, size_t full_path_len,
|
||||||
@ -489,6 +512,7 @@ struct pack_midx_entry {
|
|||||||
uint32_t pack_int_id;
|
uint32_t pack_int_id;
|
||||||
time_t pack_mtime;
|
time_t pack_mtime;
|
||||||
uint64_t offset;
|
uint64_t offset;
|
||||||
|
unsigned preferred : 1;
|
||||||
};
|
};
|
||||||
|
|
||||||
static int midx_oid_compare(const void *_a, const void *_b)
|
static int midx_oid_compare(const void *_a, const void *_b)
|
||||||
@ -500,6 +524,12 @@ static int midx_oid_compare(const void *_a, const void *_b)
|
|||||||
if (cmp)
|
if (cmp)
|
||||||
return cmp;
|
return cmp;
|
||||||
|
|
||||||
|
/* Sort objects in a preferred pack first when multiple copies exist. */
|
||||||
|
if (a->preferred > b->preferred)
|
||||||
|
return -1;
|
||||||
|
if (a->preferred < b->preferred)
|
||||||
|
return 1;
|
||||||
|
|
||||||
if (a->pack_mtime > b->pack_mtime)
|
if (a->pack_mtime > b->pack_mtime)
|
||||||
return -1;
|
return -1;
|
||||||
else if (a->pack_mtime < b->pack_mtime)
|
else if (a->pack_mtime < b->pack_mtime)
|
||||||
@ -527,7 +557,8 @@ static int nth_midxed_pack_midx_entry(struct multi_pack_index *m,
|
|||||||
static void fill_pack_entry(uint32_t pack_int_id,
|
static void fill_pack_entry(uint32_t pack_int_id,
|
||||||
struct packed_git *p,
|
struct packed_git *p,
|
||||||
uint32_t cur_object,
|
uint32_t cur_object,
|
||||||
struct pack_midx_entry *entry)
|
struct pack_midx_entry *entry,
|
||||||
|
int preferred)
|
||||||
{
|
{
|
||||||
if (nth_packed_object_id(&entry->oid, p, cur_object) < 0)
|
if (nth_packed_object_id(&entry->oid, p, cur_object) < 0)
|
||||||
die(_("failed to locate object %d in packfile"), cur_object);
|
die(_("failed to locate object %d in packfile"), cur_object);
|
||||||
@ -536,6 +567,7 @@ static void fill_pack_entry(uint32_t pack_int_id,
|
|||||||
entry->pack_mtime = p->mtime;
|
entry->pack_mtime = p->mtime;
|
||||||
|
|
||||||
entry->offset = nth_packed_object_offset(p, cur_object);
|
entry->offset = nth_packed_object_offset(p, cur_object);
|
||||||
|
entry->preferred = !!preferred;
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
@ -552,7 +584,8 @@ static void fill_pack_entry(uint32_t pack_int_id,
|
|||||||
static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
|
static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
|
||||||
struct pack_info *info,
|
struct pack_info *info,
|
||||||
uint32_t nr_packs,
|
uint32_t nr_packs,
|
||||||
uint32_t *nr_objects)
|
uint32_t *nr_objects,
|
||||||
|
int preferred_pack)
|
||||||
{
|
{
|
||||||
uint32_t cur_fanout, cur_pack, cur_object;
|
uint32_t cur_fanout, cur_pack, cur_object;
|
||||||
uint32_t alloc_fanout, alloc_objects, total_objects = 0;
|
uint32_t alloc_fanout, alloc_objects, total_objects = 0;
|
||||||
@ -589,12 +622,17 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
|
|||||||
nth_midxed_pack_midx_entry(m,
|
nth_midxed_pack_midx_entry(m,
|
||||||
&entries_by_fanout[nr_fanout],
|
&entries_by_fanout[nr_fanout],
|
||||||
cur_object);
|
cur_object);
|
||||||
|
if (nth_midxed_pack_int_id(m, cur_object) == preferred_pack)
|
||||||
|
entries_by_fanout[nr_fanout].preferred = 1;
|
||||||
|
else
|
||||||
|
entries_by_fanout[nr_fanout].preferred = 0;
|
||||||
nr_fanout++;
|
nr_fanout++;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) {
|
for (cur_pack = start_pack; cur_pack < nr_packs; cur_pack++) {
|
||||||
uint32_t start = 0, end;
|
uint32_t start = 0, end;
|
||||||
|
int preferred = cur_pack == preferred_pack;
|
||||||
|
|
||||||
if (cur_fanout)
|
if (cur_fanout)
|
||||||
start = get_pack_fanout(info[cur_pack].p, cur_fanout - 1);
|
start = get_pack_fanout(info[cur_pack].p, cur_fanout - 1);
|
||||||
@ -602,7 +640,11 @@ static struct pack_midx_entry *get_sorted_entries(struct multi_pack_index *m,
|
|||||||
|
|
||||||
for (cur_object = start; cur_object < end; cur_object++) {
|
for (cur_object = start; cur_object < end; cur_object++) {
|
||||||
ALLOC_GROW(entries_by_fanout, nr_fanout + 1, alloc_fanout);
|
ALLOC_GROW(entries_by_fanout, nr_fanout + 1, alloc_fanout);
|
||||||
fill_pack_entry(cur_pack, info[cur_pack].p, cur_object, &entries_by_fanout[nr_fanout]);
|
fill_pack_entry(cur_pack,
|
||||||
|
info[cur_pack].p,
|
||||||
|
cur_object,
|
||||||
|
&entries_by_fanout[nr_fanout],
|
||||||
|
preferred);
|
||||||
nr_fanout++;
|
nr_fanout++;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -776,10 +818,80 @@ static int write_midx_large_offsets(struct hashfile *f,
|
|||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
struct midx_pack_order_data {
|
||||||
|
uint32_t nr;
|
||||||
|
uint32_t pack;
|
||||||
|
off_t offset;
|
||||||
|
};
|
||||||
|
|
||||||
|
static int midx_pack_order_cmp(const void *va, const void *vb)
|
||||||
|
{
|
||||||
|
const struct midx_pack_order_data *a = va, *b = vb;
|
||||||
|
if (a->pack < b->pack)
|
||||||
|
return -1;
|
||||||
|
else if (a->pack > b->pack)
|
||||||
|
return 1;
|
||||||
|
else if (a->offset < b->offset)
|
||||||
|
return -1;
|
||||||
|
else if (a->offset > b->offset)
|
||||||
|
return 1;
|
||||||
|
else
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
static uint32_t *midx_pack_order(struct write_midx_context *ctx)
|
||||||
|
{
|
||||||
|
struct midx_pack_order_data *data;
|
||||||
|
uint32_t *pack_order;
|
||||||
|
uint32_t i;
|
||||||
|
|
||||||
|
ALLOC_ARRAY(data, ctx->entries_nr);
|
||||||
|
for (i = 0; i < ctx->entries_nr; i++) {
|
||||||
|
struct pack_midx_entry *e = &ctx->entries[i];
|
||||||
|
data[i].nr = i;
|
||||||
|
data[i].pack = ctx->pack_perm[e->pack_int_id];
|
||||||
|
if (!e->preferred)
|
||||||
|
data[i].pack |= (1U << 31);
|
||||||
|
data[i].offset = e->offset;
|
||||||
|
}
|
||||||
|
|
||||||
|
QSORT(data, ctx->entries_nr, midx_pack_order_cmp);
|
||||||
|
|
||||||
|
ALLOC_ARRAY(pack_order, ctx->entries_nr);
|
||||||
|
for (i = 0; i < ctx->entries_nr; i++)
|
||||||
|
pack_order[i] = data[i].nr;
|
||||||
|
free(data);
|
||||||
|
|
||||||
|
return pack_order;
|
||||||
|
}
|
||||||
|
|
||||||
|
static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
|
||||||
|
struct write_midx_context *ctx)
|
||||||
|
{
|
||||||
|
struct strbuf buf = STRBUF_INIT;
|
||||||
|
const char *tmp_file;
|
||||||
|
|
||||||
|
strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash));
|
||||||
|
|
||||||
|
tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr,
|
||||||
|
midx_hash, WRITE_REV);
|
||||||
|
|
||||||
|
if (finalize_object_file(tmp_file, buf.buf))
|
||||||
|
die(_("cannot store reverse index file"));
|
||||||
|
|
||||||
|
strbuf_release(&buf);
|
||||||
|
}
|
||||||
|
|
||||||
|
static void clear_midx_files_ext(struct repository *r, const char *ext,
|
||||||
|
unsigned char *keep_hash);
|
||||||
|
|
||||||
static int write_midx_internal(const char *object_dir, struct multi_pack_index *m,
|
static int write_midx_internal(const char *object_dir, struct multi_pack_index *m,
|
||||||
struct string_list *packs_to_drop, unsigned flags)
|
struct string_list *packs_to_drop,
|
||||||
|
const char *preferred_pack_name,
|
||||||
|
unsigned flags)
|
||||||
{
|
{
|
||||||
char *midx_name;
|
char *midx_name;
|
||||||
|
unsigned char midx_hash[GIT_MAX_RAWSZ];
|
||||||
uint32_t i;
|
uint32_t i;
|
||||||
struct hashfile *f = NULL;
|
struct hashfile *f = NULL;
|
||||||
struct lock_file lk;
|
struct lock_file lk;
|
||||||
@ -828,7 +940,19 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
|
|||||||
if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop)
|
if (ctx.m && ctx.nr == ctx.m->num_packs && !packs_to_drop)
|
||||||
goto cleanup;
|
goto cleanup;
|
||||||
|
|
||||||
ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr);
|
ctx.preferred_pack_idx = -1;
|
||||||
|
if (preferred_pack_name) {
|
||||||
|
for (i = 0; i < ctx.nr; i++) {
|
||||||
|
if (!cmp_idx_or_pack_name(preferred_pack_name,
|
||||||
|
ctx.info[i].pack_name)) {
|
||||||
|
ctx.preferred_pack_idx = i;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
ctx.entries = get_sorted_entries(ctx.m, ctx.info, ctx.nr, &ctx.entries_nr,
|
||||||
|
ctx.preferred_pack_idx);
|
||||||
|
|
||||||
ctx.large_offsets_needed = 0;
|
ctx.large_offsets_needed = 0;
|
||||||
for (i = 0; i < ctx.entries_nr; i++) {
|
for (i = 0; i < ctx.entries_nr; i++) {
|
||||||
@ -889,13 +1013,30 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
|
|||||||
pack_name_concat_len += strlen(ctx.info[i].pack_name) + 1;
|
pack_name_concat_len += strlen(ctx.info[i].pack_name) + 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* Check that the preferred pack wasn't expired (if given). */
|
||||||
|
if (preferred_pack_name) {
|
||||||
|
struct pack_info *preferred = bsearch(preferred_pack_name,
|
||||||
|
ctx.info, ctx.nr,
|
||||||
|
sizeof(*ctx.info),
|
||||||
|
idx_or_pack_name_cmp);
|
||||||
|
|
||||||
|
if (!preferred)
|
||||||
|
warning(_("unknown preferred pack: '%s'"),
|
||||||
|
preferred_pack_name);
|
||||||
|
else {
|
||||||
|
uint32_t perm = ctx.pack_perm[preferred->orig_pack_int_id];
|
||||||
|
if (perm == PACK_EXPIRED)
|
||||||
|
warning(_("preferred pack '%s' is expired"),
|
||||||
|
preferred_pack_name);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT)
|
if (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT)
|
||||||
pack_name_concat_len += MIDX_CHUNK_ALIGNMENT -
|
pack_name_concat_len += MIDX_CHUNK_ALIGNMENT -
|
||||||
(pack_name_concat_len % MIDX_CHUNK_ALIGNMENT);
|
(pack_name_concat_len % MIDX_CHUNK_ALIGNMENT);
|
||||||
|
|
||||||
hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR);
|
hold_lock_file_for_update(&lk, midx_name, LOCK_DIE_ON_ERROR);
|
||||||
f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk));
|
f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk));
|
||||||
FREE_AND_NULL(midx_name);
|
|
||||||
|
|
||||||
if (ctx.m)
|
if (ctx.m)
|
||||||
close_midx(ctx.m);
|
close_midx(ctx.m);
|
||||||
@ -927,8 +1068,16 @@ static int write_midx_internal(const char *object_dir, struct multi_pack_index *
|
|||||||
write_midx_header(f, get_num_chunks(cf), ctx.nr - dropped_packs);
|
write_midx_header(f, get_num_chunks(cf), ctx.nr - dropped_packs);
|
||||||
write_chunkfile(cf, &ctx);
|
write_chunkfile(cf, &ctx);
|
||||||
|
|
||||||
finalize_hashfile(f, NULL, CSUM_FSYNC | CSUM_HASH_IN_STREAM);
|
finalize_hashfile(f, midx_hash, CSUM_FSYNC | CSUM_HASH_IN_STREAM);
|
||||||
free_chunkfile(cf);
|
free_chunkfile(cf);
|
||||||
|
|
||||||
|
if (flags & MIDX_WRITE_REV_INDEX)
|
||||||
|
ctx.pack_order = midx_pack_order(&ctx);
|
||||||
|
|
||||||
|
if (flags & MIDX_WRITE_REV_INDEX)
|
||||||
|
write_midx_reverse_index(midx_name, midx_hash, &ctx);
|
||||||
|
clear_midx_files_ext(the_repository, ".rev", midx_hash);
|
||||||
|
|
||||||
commit_lock_file(&lk);
|
commit_lock_file(&lk);
|
||||||
|
|
||||||
cleanup:
|
cleanup:
|
||||||
@ -943,13 +1092,55 @@ cleanup:
|
|||||||
free(ctx.info);
|
free(ctx.info);
|
||||||
free(ctx.entries);
|
free(ctx.entries);
|
||||||
free(ctx.pack_perm);
|
free(ctx.pack_perm);
|
||||||
|
free(ctx.pack_order);
|
||||||
free(midx_name);
|
free(midx_name);
|
||||||
return result;
|
return result;
|
||||||
}
|
}
|
||||||
|
|
||||||
int write_midx_file(const char *object_dir, unsigned flags)
|
int write_midx_file(const char *object_dir,
|
||||||
|
const char *preferred_pack_name,
|
||||||
|
unsigned flags)
|
||||||
{
|
{
|
||||||
return write_midx_internal(object_dir, NULL, NULL, flags);
|
return write_midx_internal(object_dir, NULL, NULL, preferred_pack_name,
|
||||||
|
flags);
|
||||||
|
}
|
||||||
|
|
||||||
|
struct clear_midx_data {
|
||||||
|
char *keep;
|
||||||
|
const char *ext;
|
||||||
|
};
|
||||||
|
|
||||||
|
static void clear_midx_file_ext(const char *full_path, size_t full_path_len,
|
||||||
|
const char *file_name, void *_data)
|
||||||
|
{
|
||||||
|
struct clear_midx_data *data = _data;
|
||||||
|
|
||||||
|
if (!(starts_with(file_name, "multi-pack-index-") &&
|
||||||
|
ends_with(file_name, data->ext)))
|
||||||
|
return;
|
||||||
|
if (data->keep && !strcmp(data->keep, file_name))
|
||||||
|
return;
|
||||||
|
|
||||||
|
if (unlink(full_path))
|
||||||
|
die_errno(_("failed to remove %s"), full_path);
|
||||||
|
}
|
||||||
|
|
||||||
|
static void clear_midx_files_ext(struct repository *r, const char *ext,
|
||||||
|
unsigned char *keep_hash)
|
||||||
|
{
|
||||||
|
struct clear_midx_data data;
|
||||||
|
memset(&data, 0, sizeof(struct clear_midx_data));
|
||||||
|
|
||||||
|
if (keep_hash)
|
||||||
|
data.keep = xstrfmt("multi-pack-index-%s%s",
|
||||||
|
hash_to_hex(keep_hash), ext);
|
||||||
|
data.ext = ext;
|
||||||
|
|
||||||
|
for_each_file_in_pack_dir(r->objects->odb->path,
|
||||||
|
clear_midx_file_ext,
|
||||||
|
&data);
|
||||||
|
|
||||||
|
free(data.keep);
|
||||||
}
|
}
|
||||||
|
|
||||||
void clear_midx_file(struct repository *r)
|
void clear_midx_file(struct repository *r)
|
||||||
@ -964,6 +1155,8 @@ void clear_midx_file(struct repository *r)
|
|||||||
if (remove_path(midx))
|
if (remove_path(midx))
|
||||||
die(_("failed to clear multi-pack-index at %s"), midx);
|
die(_("failed to clear multi-pack-index at %s"), midx);
|
||||||
|
|
||||||
|
clear_midx_files_ext(r, ".rev", NULL);
|
||||||
|
|
||||||
free(midx);
|
free(midx);
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -1184,7 +1377,7 @@ int expire_midx_packs(struct repository *r, const char *object_dir, unsigned fla
|
|||||||
free(count);
|
free(count);
|
||||||
|
|
||||||
if (packs_to_drop.nr)
|
if (packs_to_drop.nr)
|
||||||
result = write_midx_internal(object_dir, m, &packs_to_drop, flags);
|
result = write_midx_internal(object_dir, m, &packs_to_drop, NULL, flags);
|
||||||
|
|
||||||
string_list_clear(&packs_to_drop, 0);
|
string_list_clear(&packs_to_drop, 0);
|
||||||
return result;
|
return result;
|
||||||
@ -1373,7 +1566,7 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size,
|
|||||||
goto cleanup;
|
goto cleanup;
|
||||||
}
|
}
|
||||||
|
|
||||||
result = write_midx_internal(object_dir, m, NULL, flags);
|
result = write_midx_internal(object_dir, m, NULL, NULL, flags);
|
||||||
m = NULL;
|
m = NULL;
|
||||||
|
|
||||||
cleanup:
|
cleanup:
|
||||||
|
11
midx.h
11
midx.h
@ -15,6 +15,10 @@ struct multi_pack_index {
|
|||||||
const unsigned char *data;
|
const unsigned char *data;
|
||||||
size_t data_len;
|
size_t data_len;
|
||||||
|
|
||||||
|
const uint32_t *revindex_data;
|
||||||
|
const uint32_t *revindex_map;
|
||||||
|
size_t revindex_len;
|
||||||
|
|
||||||
uint32_t signature;
|
uint32_t signature;
|
||||||
unsigned char version;
|
unsigned char version;
|
||||||
unsigned char hash_len;
|
unsigned char hash_len;
|
||||||
@ -36,10 +40,15 @@ struct multi_pack_index {
|
|||||||
};
|
};
|
||||||
|
|
||||||
#define MIDX_PROGRESS (1 << 0)
|
#define MIDX_PROGRESS (1 << 0)
|
||||||
|
#define MIDX_WRITE_REV_INDEX (1 << 1)
|
||||||
|
|
||||||
|
char *get_midx_rev_filename(struct multi_pack_index *m);
|
||||||
|
|
||||||
struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
|
struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
|
||||||
int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);
|
int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);
|
||||||
int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);
|
int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);
|
||||||
|
off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos);
|
||||||
|
uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos);
|
||||||
struct object_id *nth_midxed_object_oid(struct object_id *oid,
|
struct object_id *nth_midxed_object_oid(struct object_id *oid,
|
||||||
struct multi_pack_index *m,
|
struct multi_pack_index *m,
|
||||||
uint32_t n);
|
uint32_t n);
|
||||||
@ -47,7 +56,7 @@ int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pa
|
|||||||
int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
|
int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
|
||||||
int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
|
int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
|
||||||
|
|
||||||
int write_midx_file(const char *object_dir, unsigned flags);
|
int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
|
||||||
void clear_midx_file(struct repository *r);
|
void clear_midx_file(struct repository *r);
|
||||||
int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
|
int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
|
||||||
int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
|
int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
|
||||||
|
126
pack-revindex.c
126
pack-revindex.c
@ -3,6 +3,7 @@
|
|||||||
#include "object-store.h"
|
#include "object-store.h"
|
||||||
#include "packfile.h"
|
#include "packfile.h"
|
||||||
#include "config.h"
|
#include "config.h"
|
||||||
|
#include "midx.h"
|
||||||
|
|
||||||
struct revindex_entry {
|
struct revindex_entry {
|
||||||
off_t offset;
|
off_t offset;
|
||||||
@ -293,6 +294,43 @@ int load_pack_revindex(struct packed_git *p)
|
|||||||
return -1;
|
return -1;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
int load_midx_revindex(struct multi_pack_index *m)
|
||||||
|
{
|
||||||
|
char *revindex_name;
|
||||||
|
int ret;
|
||||||
|
if (m->revindex_data)
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
revindex_name = get_midx_rev_filename(m);
|
||||||
|
|
||||||
|
ret = load_revindex_from_disk(revindex_name,
|
||||||
|
m->num_objects,
|
||||||
|
&m->revindex_map,
|
||||||
|
&m->revindex_len);
|
||||||
|
if (ret)
|
||||||
|
goto cleanup;
|
||||||
|
|
||||||
|
m->revindex_data = (const uint32_t *)((const char *)m->revindex_map + RIDX_HEADER_SIZE);
|
||||||
|
|
||||||
|
cleanup:
|
||||||
|
free(revindex_name);
|
||||||
|
return ret;
|
||||||
|
}
|
||||||
|
|
||||||
|
int close_midx_revindex(struct multi_pack_index *m)
|
||||||
|
{
|
||||||
|
if (!m || !m->revindex_map)
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
munmap((void*)m->revindex_map, m->revindex_len);
|
||||||
|
|
||||||
|
m->revindex_map = NULL;
|
||||||
|
m->revindex_data = NULL;
|
||||||
|
m->revindex_len = 0;
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos)
|
int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos)
|
||||||
{
|
{
|
||||||
unsigned lo, hi;
|
unsigned lo, hi;
|
||||||
@ -347,3 +385,91 @@ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos)
|
|||||||
else
|
else
|
||||||
return nth_packed_object_offset(p, pack_pos_to_index(p, pos));
|
return nth_packed_object_offset(p, pack_pos_to_index(p, pos));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos)
|
||||||
|
{
|
||||||
|
if (!m->revindex_data)
|
||||||
|
BUG("pack_pos_to_midx: reverse index not yet loaded");
|
||||||
|
if (m->num_objects <= pos)
|
||||||
|
BUG("pack_pos_to_midx: out-of-bounds object at %"PRIu32, pos);
|
||||||
|
return get_be32(m->revindex_data + pos);
|
||||||
|
}
|
||||||
|
|
||||||
|
struct midx_pack_key {
|
||||||
|
uint32_t pack;
|
||||||
|
off_t offset;
|
||||||
|
|
||||||
|
uint32_t preferred_pack;
|
||||||
|
struct multi_pack_index *midx;
|
||||||
|
};
|
||||||
|
|
||||||
|
static int midx_pack_order_cmp(const void *va, const void *vb)
|
||||||
|
{
|
||||||
|
const struct midx_pack_key *key = va;
|
||||||
|
struct multi_pack_index *midx = key->midx;
|
||||||
|
|
||||||
|
uint32_t versus = pack_pos_to_midx(midx, (uint32_t*)vb - (const uint32_t *)midx->revindex_data);
|
||||||
|
uint32_t versus_pack = nth_midxed_pack_int_id(midx, versus);
|
||||||
|
off_t versus_offset;
|
||||||
|
|
||||||
|
uint32_t key_preferred = key->pack == key->preferred_pack;
|
||||||
|
uint32_t versus_preferred = versus_pack == key->preferred_pack;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* First, compare the preferred-ness, noting that the preferred pack
|
||||||
|
* comes first.
|
||||||
|
*/
|
||||||
|
if (key_preferred && !versus_preferred)
|
||||||
|
return -1;
|
||||||
|
else if (!key_preferred && versus_preferred)
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
/* Then, break ties first by comparing the pack IDs. */
|
||||||
|
if (key->pack < versus_pack)
|
||||||
|
return -1;
|
||||||
|
else if (key->pack > versus_pack)
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
/* Finally, break ties by comparing offsets within a pack. */
|
||||||
|
versus_offset = nth_midxed_offset(midx, versus);
|
||||||
|
if (key->offset < versus_offset)
|
||||||
|
return -1;
|
||||||
|
else if (key->offset > versus_offset)
|
||||||
|
return 1;
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
int midx_to_pack_pos(struct multi_pack_index *m, uint32_t at, uint32_t *pos)
|
||||||
|
{
|
||||||
|
struct midx_pack_key key;
|
||||||
|
uint32_t *found;
|
||||||
|
|
||||||
|
if (!m->revindex_data)
|
||||||
|
BUG("midx_to_pack_pos: reverse index not yet loaded");
|
||||||
|
if (m->num_objects <= at)
|
||||||
|
BUG("midx_to_pack_pos: out-of-bounds object at %"PRIu32, at);
|
||||||
|
|
||||||
|
key.pack = nth_midxed_pack_int_id(m, at);
|
||||||
|
key.offset = nth_midxed_offset(m, at);
|
||||||
|
key.midx = m;
|
||||||
|
/*
|
||||||
|
* The preferred pack sorts first, so determine its identifier by
|
||||||
|
* looking at the first object in pseudo-pack order.
|
||||||
|
*
|
||||||
|
* Note that if no --preferred-pack is explicitly given when writing a
|
||||||
|
* multi-pack index, then whichever pack has the lowest identifier
|
||||||
|
* implicitly is preferred (and includes all its objects, since ties are
|
||||||
|
* broken first by pack identifier).
|
||||||
|
*/
|
||||||
|
key.preferred_pack = nth_midxed_pack_int_id(m, pack_pos_to_midx(m, 0));
|
||||||
|
|
||||||
|
found = bsearch(&key, m->revindex_data, m->num_objects,
|
||||||
|
sizeof(*m->revindex_data), midx_pack_order_cmp);
|
||||||
|
|
||||||
|
if (!found)
|
||||||
|
return error("bad offset for revindex");
|
||||||
|
|
||||||
|
*pos = found - m->revindex_data;
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
@ -14,6 +14,20 @@
|
|||||||
*
|
*
|
||||||
* - offset: the byte offset within the .pack file at which the object contents
|
* - offset: the byte offset within the .pack file at which the object contents
|
||||||
* can be found
|
* can be found
|
||||||
|
*
|
||||||
|
* The revindex can also be used with a multi-pack index (MIDX). In this
|
||||||
|
* setting:
|
||||||
|
*
|
||||||
|
* - index position refers to an object's numeric position within the MIDX
|
||||||
|
*
|
||||||
|
* - pack position refers to an object's position within a non-existent pack
|
||||||
|
* described by the MIDX. The pack structure is described in
|
||||||
|
* Documentation/technical/pack-format.txt.
|
||||||
|
*
|
||||||
|
* It is effectively a concatanation of all packs in the MIDX (ordered by
|
||||||
|
* their numeric ID within the MIDX) in their original order within each
|
||||||
|
* pack), removing duplicates, and placing the preferred pack (if any)
|
||||||
|
* first.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
|
||||||
@ -24,6 +38,7 @@
|
|||||||
#define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY"
|
#define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY"
|
||||||
|
|
||||||
struct packed_git;
|
struct packed_git;
|
||||||
|
struct multi_pack_index;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* load_pack_revindex populates the revindex's internal data-structures for the
|
* load_pack_revindex populates the revindex's internal data-structures for the
|
||||||
@ -34,6 +49,22 @@ struct packed_git;
|
|||||||
*/
|
*/
|
||||||
int load_pack_revindex(struct packed_git *p);
|
int load_pack_revindex(struct packed_git *p);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* load_midx_revindex loads the '.rev' file corresponding to the given
|
||||||
|
* multi-pack index by mmap-ing it and assigning pointers in the
|
||||||
|
* multi_pack_index to point at it.
|
||||||
|
*
|
||||||
|
* A negative number is returned on error.
|
||||||
|
*/
|
||||||
|
int load_midx_revindex(struct multi_pack_index *m);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Frees resources associated with a multi-pack reverse index.
|
||||||
|
*
|
||||||
|
* A negative number is returned on error.
|
||||||
|
*/
|
||||||
|
int close_midx_revindex(struct multi_pack_index *m);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* offset_to_pack_pos converts an object offset to a pack position. This
|
* offset_to_pack_pos converts an object offset to a pack position. This
|
||||||
* function returns zero on success, and a negative number otherwise. The
|
* function returns zero on success, and a negative number otherwise. The
|
||||||
@ -71,4 +102,26 @@ uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos);
|
|||||||
*/
|
*/
|
||||||
off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos);
|
off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* pack_pos_to_midx converts the object at position "pos" within the MIDX
|
||||||
|
* pseudo-pack into a MIDX position.
|
||||||
|
*
|
||||||
|
* If the reverse index has not yet been loaded, or the position is out of
|
||||||
|
* bounds, this function aborts.
|
||||||
|
*
|
||||||
|
* This function runs in time O(log N) with the number of objects in the MIDX.
|
||||||
|
*/
|
||||||
|
uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* midx_to_pack_pos converts from the MIDX-relative position at "at" to the
|
||||||
|
* corresponding pack position.
|
||||||
|
*
|
||||||
|
* If the reverse index has not yet been loaded, or the position is out of
|
||||||
|
* bounds, this function aborts.
|
||||||
|
*
|
||||||
|
* This function runs in constant time.
|
||||||
|
*/
|
||||||
|
int midx_to_pack_pos(struct multi_pack_index *midx, uint32_t at, uint32_t *pos);
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
36
pack-write.c
36
pack-write.c
@ -201,21 +201,12 @@ static void write_rev_header(struct hashfile *f)
|
|||||||
}
|
}
|
||||||
|
|
||||||
static void write_rev_index_positions(struct hashfile *f,
|
static void write_rev_index_positions(struct hashfile *f,
|
||||||
struct pack_idx_entry **objects,
|
uint32_t *pack_order,
|
||||||
uint32_t nr_objects)
|
uint32_t nr_objects)
|
||||||
{
|
{
|
||||||
uint32_t *pack_order;
|
|
||||||
uint32_t i;
|
uint32_t i;
|
||||||
|
|
||||||
ALLOC_ARRAY(pack_order, nr_objects);
|
|
||||||
for (i = 0; i < nr_objects; i++)
|
|
||||||
pack_order[i] = i;
|
|
||||||
QSORT_S(pack_order, nr_objects, pack_order_cmp, objects);
|
|
||||||
|
|
||||||
for (i = 0; i < nr_objects; i++)
|
for (i = 0; i < nr_objects; i++)
|
||||||
hashwrite_be32(f, pack_order[i]);
|
hashwrite_be32(f, pack_order[i]);
|
||||||
|
|
||||||
free(pack_order);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
static void write_rev_trailer(struct hashfile *f, const unsigned char *hash)
|
static void write_rev_trailer(struct hashfile *f, const unsigned char *hash)
|
||||||
@ -228,6 +219,29 @@ const char *write_rev_file(const char *rev_name,
|
|||||||
uint32_t nr_objects,
|
uint32_t nr_objects,
|
||||||
const unsigned char *hash,
|
const unsigned char *hash,
|
||||||
unsigned flags)
|
unsigned flags)
|
||||||
|
{
|
||||||
|
uint32_t *pack_order;
|
||||||
|
uint32_t i;
|
||||||
|
const char *ret;
|
||||||
|
|
||||||
|
ALLOC_ARRAY(pack_order, nr_objects);
|
||||||
|
for (i = 0; i < nr_objects; i++)
|
||||||
|
pack_order[i] = i;
|
||||||
|
QSORT_S(pack_order, nr_objects, pack_order_cmp, objects);
|
||||||
|
|
||||||
|
ret = write_rev_file_order(rev_name, pack_order, nr_objects, hash,
|
||||||
|
flags);
|
||||||
|
|
||||||
|
free(pack_order);
|
||||||
|
|
||||||
|
return ret;
|
||||||
|
}
|
||||||
|
|
||||||
|
const char *write_rev_file_order(const char *rev_name,
|
||||||
|
uint32_t *pack_order,
|
||||||
|
uint32_t nr_objects,
|
||||||
|
const unsigned char *hash,
|
||||||
|
unsigned flags)
|
||||||
{
|
{
|
||||||
struct hashfile *f;
|
struct hashfile *f;
|
||||||
int fd;
|
int fd;
|
||||||
@ -262,7 +276,7 @@ const char *write_rev_file(const char *rev_name,
|
|||||||
|
|
||||||
write_rev_header(f);
|
write_rev_header(f);
|
||||||
|
|
||||||
write_rev_index_positions(f, objects, nr_objects);
|
write_rev_index_positions(f, pack_order, nr_objects);
|
||||||
write_rev_trailer(f, hash);
|
write_rev_trailer(f, hash);
|
||||||
|
|
||||||
if (rev_name && adjust_shared_perm(rev_name) < 0)
|
if (rev_name && adjust_shared_perm(rev_name) < 0)
|
||||||
|
1
pack.h
1
pack.h
@ -94,6 +94,7 @@ struct ref;
|
|||||||
void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
|
void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought);
|
||||||
|
|
||||||
const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
|
const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
|
||||||
|
const char *write_rev_file_order(const char *rev_name, uint32_t *pack_order, uint32_t nr_objects, const unsigned char *hash, unsigned flags);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* The "hdr" output buffer should be at least this big, which will handle sizes
|
* The "hdr" output buffer should be at least this big, which will handle sizes
|
||||||
|
@ -862,6 +862,9 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
|
|||||||
|
|
||||||
if (!strcmp(file_name, "multi-pack-index"))
|
if (!strcmp(file_name, "multi-pack-index"))
|
||||||
return;
|
return;
|
||||||
|
if (starts_with(file_name, "multi-pack-index") &&
|
||||||
|
ends_with(file_name, ".rev"))
|
||||||
|
return;
|
||||||
if (ends_with(file_name, ".idx") ||
|
if (ends_with(file_name, ".idx") ||
|
||||||
ends_with(file_name, ".rev") ||
|
ends_with(file_name, ".rev") ||
|
||||||
ends_with(file_name, ".pack") ||
|
ends_with(file_name, ".pack") ||
|
||||||
|
@ -4,7 +4,7 @@
|
|||||||
#include "repository.h"
|
#include "repository.h"
|
||||||
#include "object-store.h"
|
#include "object-store.h"
|
||||||
|
|
||||||
static int read_midx_file(const char *object_dir)
|
static int read_midx_file(const char *object_dir, int show_objects)
|
||||||
{
|
{
|
||||||
uint32_t i;
|
uint32_t i;
|
||||||
struct multi_pack_index *m;
|
struct multi_pack_index *m;
|
||||||
@ -43,13 +43,29 @@ static int read_midx_file(const char *object_dir)
|
|||||||
|
|
||||||
printf("object-dir: %s\n", m->object_dir);
|
printf("object-dir: %s\n", m->object_dir);
|
||||||
|
|
||||||
|
if (show_objects) {
|
||||||
|
struct object_id oid;
|
||||||
|
struct pack_entry e;
|
||||||
|
|
||||||
|
for (i = 0; i < m->num_objects; i++) {
|
||||||
|
nth_midxed_object_oid(&oid, m, i);
|
||||||
|
fill_midx_entry(the_repository, &oid, &e, m);
|
||||||
|
|
||||||
|
printf("%s %"PRIu64"\t%s\n",
|
||||||
|
oid_to_hex(&oid), e.offset, e.p->pack_name);
|
||||||
|
}
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
int cmd__read_midx(int argc, const char **argv)
|
int cmd__read_midx(int argc, const char **argv)
|
||||||
{
|
{
|
||||||
if (argc != 2)
|
if (!(argc == 2 || argc == 3))
|
||||||
usage("read-midx <object-dir>");
|
usage("read-midx [--show-objects] <object-dir>");
|
||||||
|
|
||||||
return read_midx_file(argv[1]);
|
if (!strcmp(argv[1], "--show-objects"))
|
||||||
|
return read_midx_file(argv[2], 1);
|
||||||
|
return read_midx_file(argv[1], 0);
|
||||||
}
|
}
|
||||||
|
@ -234,6 +234,48 @@ test_expect_success 'warn on improper hash version' '
|
|||||||
)
|
)
|
||||||
'
|
'
|
||||||
|
|
||||||
|
test_expect_success 'midx picks objects from preferred pack' '
|
||||||
|
test_when_finished rm -rf preferred.git &&
|
||||||
|
git init --bare preferred.git &&
|
||||||
|
(
|
||||||
|
cd preferred.git &&
|
||||||
|
|
||||||
|
a=$(echo "a" | git hash-object -w --stdin) &&
|
||||||
|
b=$(echo "b" | git hash-object -w --stdin) &&
|
||||||
|
c=$(echo "c" | git hash-object -w --stdin) &&
|
||||||
|
|
||||||
|
# Set up two packs, duplicating the object "B" at different
|
||||||
|
# offsets.
|
||||||
|
#
|
||||||
|
# Note that the "BC" pack (the one we choose as preferred) sorts
|
||||||
|
# lexically after the "AB" pack, meaning that omitting the
|
||||||
|
# --preferred-pack argument would cause this test to fail (since
|
||||||
|
# the MIDX code would select the copy of "b" in the "AB" pack).
|
||||||
|
git pack-objects objects/pack/test-AB <<-EOF &&
|
||||||
|
$a
|
||||||
|
$b
|
||||||
|
EOF
|
||||||
|
bc=$(git pack-objects objects/pack/test-BC <<-EOF
|
||||||
|
$b
|
||||||
|
$c
|
||||||
|
EOF
|
||||||
|
) &&
|
||||||
|
|
||||||
|
git multi-pack-index --object-dir=objects \
|
||||||
|
write --preferred-pack=test-BC-$bc.idx 2>err &&
|
||||||
|
test_must_be_empty err &&
|
||||||
|
|
||||||
|
test-tool read-midx --show-objects objects >out &&
|
||||||
|
|
||||||
|
ofs=$(git show-index <objects/pack/test-BC-$bc.idx | grep $b |
|
||||||
|
cut -d" " -f1) &&
|
||||||
|
printf "%s %s\tobjects/pack/test-BC-%s.pack\n" \
|
||||||
|
"$b" "$ofs" "$bc" >expect &&
|
||||||
|
grep ^$b out >actual &&
|
||||||
|
|
||||||
|
test_cmp expect actual
|
||||||
|
)
|
||||||
|
'
|
||||||
|
|
||||||
test_expect_success 'verify multi-pack-index success' '
|
test_expect_success 'verify multi-pack-index success' '
|
||||||
git multi-pack-index verify --object-dir=$objdir
|
git multi-pack-index verify --object-dir=$objdir
|
||||||
|
Loading…
Reference in New Issue
Block a user