2018-10-18 00:13:26 +02:00
|
|
|
#ifndef MIDX_H
|
|
|
|
#define MIDX_H
|
2018-07-12 21:39:21 +02:00
|
|
|
|
2018-07-12 21:39:33 +02:00
|
|
|
#include "repository.h"
|
|
|
|
|
2018-09-19 02:13:36 +02:00
|
|
|
struct object_id;
|
|
|
|
struct pack_entry;
|
2019-04-29 18:18:55 +02:00
|
|
|
struct repository;
|
2018-09-19 02:13:36 +02:00
|
|
|
|
2018-10-12 19:34:20 +02:00
|
|
|
#define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX"
|
|
|
|
|
2018-07-12 21:39:23 +02:00
|
|
|
struct multi_pack_index {
|
2018-07-12 21:39:33 +02:00
|
|
|
struct multi_pack_index *next;
|
|
|
|
|
2018-07-12 21:39:23 +02:00
|
|
|
const unsigned char *data;
|
|
|
|
size_t data_len;
|
|
|
|
|
pack-revindex: read multi-pack reverse indexes
Implement reading for multi-pack reverse indexes, as described in the
previous patch.
Note that these functions don't yet have any callers, and won't until
multi-pack reachability bitmaps are introduced in a later patch series.
In the meantime, this patch implements some of the infrastructure
necessary to support multi-pack bitmaps.
There are three new functions exposed by the revindex API:
- load_midx_revindex(): loads the reverse index corresponding to the
given multi-pack index.
- midx_to_pack_pos() and pack_pos_to_midx(): these convert between the
multi-pack index and pseudo-pack order.
load_midx_revindex() and pack_pos_to_midx() are both relatively
straightforward.
load_midx_revindex() needs a few functions to be exposed from the midx
API: one to get the checksum of a midx, and another to get the .rev's
filename. Similar to recent changes in the packed_git struct, three new
fields are added to the multi_pack_index struct: one to keep track of
the size, one to keep track of the mmap'd pointer, and another to point
past the header and at the reverse index's data.
pack_pos_to_midx() simply reads the corresponding entry out of the
table.
midx_to_pack_pos() is the trickiest, since it needs to find an object's
position in the pseudo-pack order, but that order can only be recovered
in the .rev file itself. This mapping can be implemented with a binary
search, but note that the thing we're binary searching over isn't an
array of values, but rather a permuted order of those values.
So, when comparing two items, it's helpful to keep in mind the
difference. Instead of a traditional binary search, where you are
comparing two things directly, here we're comparing a (pack, offset)
tuple with an index into the multi-pack index. That index describes
another (pack, offset) tuple, and it is _those_ two tuples that are
compared.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
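
The lookup in both directions, as described in the commit message above, can be sketched roughly as follows. This is a simplified illustration, not the code in pack-revindex.c: `struct entry`, `entry_cmp()`, `pos_to_index()` and `index_to_pos()` are hypothetical names, the reverse index is a plain host-order array rather than the on-disk .rev data, and the pseudo-pack order is assumed to be a plain (pack, offset) sort with no preferred-pack handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one object's location: which pack, and where. */
struct entry {
	uint32_t pack_id;
	uint64_t offset;
};

/* Order entries by (pack, offset); assumed here to be the pseudo-pack key. */
static int entry_cmp(const struct entry *a, const struct entry *b)
{
	if (a->pack_id != b->pack_id)
		return a->pack_id < b->pack_id ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Table lookup: pseudo-pack position -> index into the object table. */
static uint32_t pos_to_index(const uint32_t *rev, uint32_t pos)
{
	return rev[pos];
}

/*
 * Binary search: index into the object table -> pseudo-pack position.
 * The search walks positions in `rev`, but every probe is first resolved
 * through `rev` into `objs`, and it is the two (pack, offset) tuples that
 * get compared. Returns 0 and sets *pos on success, -1 if nothing matches.
 */
static int index_to_pos(const struct entry *objs, const uint32_t *rev,
			uint32_t nr, uint32_t at, uint32_t *pos)
{
	uint32_t lo = 0, hi = nr;

	assert(at < nr);
	while (lo < hi) {
		uint32_t mi = lo + (hi - lo) / 2;
		int cmp = entry_cmp(&objs[at], &objs[rev[mi]]);

		if (!cmp) {
			*pos = mi;
			return 0;
		}
		if (cmp < 0)
			hi = mi;
		else
			lo = mi + 1;
	}
	return -1;
}
```

The point of the sketch is that `rev` itself is never sorted by value; it is a permutation that is sorted only once its entries are looked up in `objs`, which is why the comparison needs both arrays.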
2021-03-30 17:04:26 +02:00
|
|
|
const uint32_t *revindex_data;
|
|
|
|
const uint32_t *revindex_map;
|
|
|
|
size_t revindex_len;
|
|
|
|
|
2018-07-12 21:39:23 +02:00
|
|
|
uint32_t signature;
|
|
|
|
unsigned char version;
|
|
|
|
unsigned char hash_len;
|
|
|
|
unsigned char num_chunks;
|
|
|
|
uint32_t num_packs;
|
|
|
|
uint32_t num_objects;
|
|
|
|
|
2018-08-20 18:51:55 +02:00
|
|
|
int local;
|
|
|
|
|
2018-07-12 21:39:27 +02:00
|
|
|
const unsigned char *chunk_pack_names;
|
2018-07-12 21:39:31 +02:00
|
|
|
const uint32_t *chunk_oid_fanout;
|
2018-07-12 21:39:30 +02:00
|
|
|
const unsigned char *chunk_oid_lookup;
|
2018-07-12 21:39:32 +02:00
|
|
|
const unsigned char *chunk_object_offsets;
|
|
|
|
const unsigned char *chunk_large_offsets;
|
2018-07-12 21:39:27 +02:00
|
|
|
|
2018-07-12 21:39:28 +02:00
|
|
|
const char **pack_names;
|
2018-07-12 21:39:34 +02:00
|
|
|
struct packed_git **packs;
|
2018-07-12 21:39:23 +02:00
|
|
|
char object_dir[FLEX_ARRAY];
|
|
|
|
};
|
|
|
|
|
2019-10-21 20:39:58 +02:00
|
|
|
#define MIDX_PROGRESS (1 << 0)
|
pack-revindex: write multi-pack reverse indexes
Implement the writing half of multi-pack reverse indexes. This is
nothing more than the format described a few patches ago, with a new set
of helper functions that will be used to clear out stale .rev files
corresponding to old MIDXs.
Unfortunately, a comparison function very similar to the one implemented
recently in pack-revindex.c is reimplemented here, this time accepting a
MIDX-internal type. An effort to DRY these up would create more
indirection and overhead than is necessary, so it isn't pursued here.
Currently, there are no callers which pass the MIDX_WRITE_REV_INDEX
flag, meaning that this is all dead code. But, that won't be the case
for long, since subsequent patches will introduce the multi-pack bitmap,
which will begin passing this flag.
(In midx.c:write_midx_internal(), the two adjacent if statements share a
conditional, but are written separately since the first one will
eventually also handle the MIDX_WRITE_BITMAP flag, which does not yet
exist.)
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
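
A minimal sketch of how the flag bits might be combined and tested by a caller. The two macros are copied from this header; `wants_rev_index()` is a hypothetical helper used only for illustration and is not part of the midx API.

```c
#define MIDX_PROGRESS        (1 << 0)
#define MIDX_WRITE_REV_INDEX (1 << 1)

/* Hypothetical helper: returns 1 if the caller asked for a .rev file. */
static int wants_rev_index(unsigned flags)
{
	return !!(flags & MIDX_WRITE_REV_INDEX);
}
```

Because each flag occupies its own bit, a caller can request both behaviors at once with `MIDX_PROGRESS | MIDX_WRITE_REV_INDEX` and each consumer only tests the bit it cares about.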
2021-03-30 17:04:32 +02:00
|
|
|
#define MIDX_WRITE_REV_INDEX (1 << 1)
|
2019-10-21 20:39:58 +02:00
|
|
|
|
2021-03-30 17:04:26 +02:00
|
|
|
char *get_midx_rev_filename(struct multi_pack_index *m);
|
|
|
|
|
2018-08-20 18:51:55 +02:00
|
|
|
struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
|
2019-04-29 18:18:55 +02:00
|
|
|
int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);
|
2018-07-12 21:39:34 +02:00
|
|
|
int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);
|
2021-03-30 17:04:20 +02:00
|
|
|
off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos);
|
|
|
|
uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos);
|
2018-07-12 21:39:35 +02:00
|
|
|
struct object_id *nth_midxed_object_oid(struct object_id *oid,
|
|
|
|
struct multi_pack_index *m,
|
|
|
|
uint32_t n);
|
2019-04-29 18:18:55 +02:00
|
|
|
int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m);
|
midx: check both pack and index names for containment
A midx file (and the struct we parse from it) contains a list of all of
the covered packfiles, mentioned by their ".idx" names (e.g.,
"pack-1234.idx", etc). And thus calls to midx_contains_pack() expect
callers to provide the idx name.
This works for most of the calls, but the one in open_packed_git_1()
tries to feed a packed_git->pack_name, which is the ".pack" name,
meaning we'll never find a match (even if the pack is covered by the
midx).
We can fix this by converting the ".pack" to ".idx" in the caller.
However, that requires allocating a new string. Instead, let's make
midx_contains_pack() a bit friendlier, and allow it to take _either_ the
.pack or .idx variant.
All cleverness in the matching code is credited to René. Bugs are mine.
There's no test here, because while this does fix _a_ bug, it's masked
by another bug in that same caller. That will be covered (with a test)
in the next patch.
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
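
The idea of matching a pack by either its ".pack" or ".idx" name can be illustrated with a simple equality check. This is only a sketch under stated assumptions: `base_len()` and `same_pack()` are hypothetical names, and the actual matching code in midx.c is an ordered comparison suited to searching the sorted name list, which this sketch does not attempt to reproduce.

```c
#include <string.h>

/* Length of `name` without a trailing ".idx" or ".pack", if present. */
static size_t base_len(const char *name)
{
	size_t len = strlen(name);

	if (len >= 4 && !strcmp(name + len - 4, ".idx"))
		return len - 4;
	if (len >= 5 && !strcmp(name + len - 5, ".pack"))
		return len - 5;
	return len;
}

/* Return 1 if both names refer to the same pack, whatever the extension. */
static int same_pack(const char *idx_or_pack_name, const char *idx_name)
{
	size_t a = base_len(idx_or_pack_name);
	size_t b = base_len(idx_name);

	return a == b && !memcmp(idx_or_pack_name, idx_name, a);
}
```

Like the approach the commit message describes, this avoids allocating a converted copy of the name: both strings are compared in place up to the extension.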
2019-04-05 20:06:04 +02:00
|
|
|
int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
|
2018-08-20 18:51:55 +02:00
|
|
|
int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
|
2018-07-12 21:39:23 +02:00
|
|
|
|
2021-03-30 17:04:11 +02:00
|
|
|
int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
|
2018-10-12 19:34:19 +02:00
|
|
|
void clear_midx_file(struct repository *r);
|
2019-10-21 20:39:58 +02:00
|
|
|
int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
|
|
|
|
int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
|
|
|
|
int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags);
|
2018-07-12 21:39:21 +02:00
|
|
|
|
2018-10-12 19:34:19 +02:00
|
|
|
void close_midx(struct multi_pack_index *m);
|
2018-07-12 21:39:21 +02:00
|
|
|
|
|
|
|
#endif
|