From 2f4ba2a867f0390f139b622dbafcab766cb88e80 Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 25 Jan 2021 18:37:14 -0500 Subject: [PATCH 01/10] packfile: prepare for the existence of '*.rev' files MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Specify the format of the on-disk reverse index 'pack-*.rev' file, as well as prepare the code for the existence of such files. The reverse index maps from pack relative positions (i.e., an index into the array of object which is sorted by their offsets within the packfile) to their position within the 'pack-*.idx' file. Today, this is done by building up a list of (off_t, uint32_t) tuples for each object (the off_t corresponding to that object's offset, and the uint32_t corresponding to its position in the index). To convert between pack and index position quickly, this array of tuples is radix sorted based on its offset. This has two major drawbacks: First, the in-memory cost scales linearly with the number of objects in a pack. Each 'struct revindex_entry' is sizeof(off_t) + sizeof(uint32_t) + padding bytes for a total of 16. To observe this, force Git to load the reverse index by, for e.g., running 'git cat-file --batch-check="%(objectsize:disk)"'. When asking for a single object in a fresh clone of the kernel, Git needs to allocate 120+ MB of memory in order to hold the reverse index in memory. Second, the cost to sort also scales with the size of the pack. Luckily, this is a linear function since 'load_pack_revindex()' uses a radix sort, but this cost still must be paid once per pack per process. As an example, it takes ~60x longer to print the _size_ of an object as it does to print that entire object's _contents_: Benchmark #1: git.compile cat-file --batch /dev/null Time (mean ± σ): 22.6 ms ± 0.5 ms [User: 2.4 ms, System: 7.9 ms] Range (min … max): 21.4 ms … 23.5 ms 41 runs Benchmark #2: git.compile cat-file --batch /dev/null Time (mean ± σ): 17.2 ms ± 0.7 ms [User: 2.8 ms, System: 5.5 ms] Range (min … max): 15.6 ms … 18.2 ms 45 runs (Numbers taken in the kernel after cheating and using the next patch to generate a reverse index). There are a couple of approaches to improve cold cache performance not pursued here: - We could include the object offsets in the reverse index format. Predictably, this does result in fewer page faults, but it triples the size of the file, while simultaneously duplicating a ton of data already available in the .idx file. (This was the original way I implemented the format, and it did show `--batch-check='%(objectsize:disk)'` winning out against `--batch`.) On the other hand, this increase in size also results in a large block-cache footprint, which could potentially hurt other workloads. - We could store the mapping from pack to index position in more cache-friendly way, like constructing a binary search tree from the table and writing the values in breadth-first order. This would result in much better locality, but the price you pay is trading O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you can no longer directly index the table). So, neither of these approaches are taken here. (Thankfully, the format is versioned, so we are free to pursue these in the future.) But, cold cache performance likely isn't interesting outside of one-off cases like asking for the size of an object directly. In real-world usage, Git is often performing many operations in the revindex (i.e., asking about many objects rather than a single one). The trade-off is worth it, since we will avoid the vast majority of the cost of generating the revindex that the extra pointer chase will look like noise in the following patch's benchmarks. This patch describes the format and prepares callers (like in pack-revindex.c) to be able to read *.rev files once they exist. An implementation of the writer will appear in the next patch, and callers will gradually begin to start using the writer in the patches that follow after that. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- Documentation/technical/pack-format.txt | 20 ++++ builtin/repack.c | 1 + object-store.h | 3 + pack-revindex.c | 144 ++++++++++++++++++++++-- pack-revindex.h | 10 +- packfile.c | 13 ++- packfile.h | 1 + tmp-objdir.c | 6 +- 8 files changed, 184 insertions(+), 14 deletions(-) diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 96d2fc589f..8833b71c8b 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -274,6 +274,26 @@ Pack file entry: <+ Index checksum of all of the above. +== pack-*.rev files have the format: + + - A 4-byte magic number '0x52494458' ('RIDX'). + + - A 4-byte version identifier (= 1). + + - A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256). + + - A table of index positions (one per packed object, num_objects in + total, each a 4-byte unsigned integer in network order), sorted by + their corresponding offsets in the packfile. + + - A trailer, containing a: + + checksum of the corresponding packfile, and + + a checksum of all of the above. + +All 4-byte numbers are in network order. + == multi-pack-index (MIDX) files have the following format: The multi-pack-index files refer to multiple pack-files and loose objects. diff --git a/builtin/repack.c b/builtin/repack.c index 2158b48f4c..01440de2d5 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -209,6 +209,7 @@ static struct { } exts[] = { {".pack"}, {".idx"}, + {".rev", 1}, {".bitmap", 1}, {".promisor", 1}, }; diff --git a/object-store.h b/object-store.h index c4fc9dd74e..541dab0858 100644 --- a/object-store.h +++ b/object-store.h @@ -85,6 +85,9 @@ struct packed_git { multi_pack_index:1; unsigned char hash[GIT_MAX_RAWSZ]; struct revindex_entry *revindex; + const uint32_t *revindex_data; + const uint32_t *revindex_map; + size_t revindex_size; /* something like ".git/objects/pack/xxxxx.pack" */ char pack_name[FLEX_ARRAY]; /* more */ }; diff --git a/pack-revindex.c b/pack-revindex.c index 5e69bc7372..a174fa5388 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -164,14 +164,128 @@ static void create_pack_revindex(struct packed_git *p) sort_revindex(p->revindex, num_ent, p->pack_size); } +static int create_pack_revindex_in_memory(struct packed_git *p) +{ + if (open_pack_index(p)) + return -1; + create_pack_revindex(p); + return 0; +} + +static char *pack_revindex_filename(struct packed_git *p) +{ + size_t len; + if (!strip_suffix(p->pack_name, ".pack", &len)) + BUG("pack_name does not end in .pack"); + return xstrfmt("%.*s.rev", (int)len, p->pack_name); +} + +#define RIDX_HEADER_SIZE (12) +#define RIDX_MIN_SIZE (RIDX_HEADER_SIZE + (2 * the_hash_algo->rawsz)) + +struct revindex_header { + uint32_t signature; + uint32_t version; + uint32_t hash_id; +}; + +static int load_revindex_from_disk(char *revindex_name, + uint32_t num_objects, + const uint32_t **data_p, size_t *len_p) +{ + int fd, ret = 0; + struct stat st; + void *data = NULL; + size_t revindex_size; + struct revindex_header *hdr; + + fd = git_open(revindex_name); + + if (fd < 0) { + ret = -1; + goto cleanup; + } + if (fstat(fd, &st)) { + ret = error_errno(_("failed to read %s"), revindex_name); + goto cleanup; + } + + revindex_size = xsize_t(st.st_size); + + if (revindex_size < RIDX_MIN_SIZE) { + ret = error(_("reverse-index file %s is too small"), revindex_name); + goto cleanup; + } + + if (revindex_size - RIDX_MIN_SIZE != st_mult(sizeof(uint32_t), num_objects)) { + ret = error(_("reverse-index file %s is corrupt"), revindex_name); + goto cleanup; + } + + data = xmmap(NULL, revindex_size, PROT_READ, MAP_PRIVATE, fd, 0); + hdr = data; + + if (ntohl(hdr->signature) != RIDX_SIGNATURE) { + ret = error(_("reverse-index file %s has unknown signature"), revindex_name); + goto cleanup; + } + if (ntohl(hdr->version) != 1) { + ret = error(_("reverse-index file %s has unsupported version %"PRIu32), + revindex_name, ntohl(hdr->version)); + goto cleanup; + } + if (!(ntohl(hdr->hash_id) == 1 || ntohl(hdr->hash_id) == 2)) { + ret = error(_("reverse-index file %s has unsupported hash id %"PRIu32), + revindex_name, ntohl(hdr->hash_id)); + goto cleanup; + } + +cleanup: + if (ret) { + if (data) + munmap(data, revindex_size); + } else { + *len_p = revindex_size; + *data_p = (const uint32_t *)data; + } + + close(fd); + return ret; +} + +static int load_pack_revindex_from_disk(struct packed_git *p) +{ + char *revindex_name; + int ret; + if (open_pack_index(p)) + return -1; + + revindex_name = pack_revindex_filename(p); + + ret = load_revindex_from_disk(revindex_name, + p->num_objects, + &p->revindex_map, + &p->revindex_size); + if (ret) + goto cleanup; + + p->revindex_data = (const uint32_t *)((const char *)p->revindex_map + RIDX_HEADER_SIZE); + +cleanup: + free(revindex_name); + return ret; +} + int load_pack_revindex(struct packed_git *p) { - if (!p->revindex) { - if (open_pack_index(p)) - return -1; - create_pack_revindex(p); - } - return 0; + if (p->revindex || p->revindex_data) + return 0; + + if (!load_pack_revindex_from_disk(p)) + return 0; + else if (!create_pack_revindex_in_memory(p)) + return 0; + return -1; } int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) @@ -203,18 +317,28 @@ int offset_to_pack_pos(struct packed_git *p, off_t ofs, uint32_t *pos) uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos) { - if (!p->revindex) + if (!(p->revindex || p->revindex_data)) BUG("pack_pos_to_index: reverse index not yet loaded"); if (p->num_objects <= pos) BUG("pack_pos_to_index: out-of-bounds object at %"PRIu32, pos); - return p->revindex[pos].nr; + + if (p->revindex) + return p->revindex[pos].nr; + else + return get_be32(p->revindex_data + pos); } off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos) { - if (!p->revindex) + if (!(p->revindex || p->revindex_data)) BUG("pack_pos_to_index: reverse index not yet loaded"); if (p->num_objects < pos) BUG("pack_pos_to_offset: out-of-bounds object at %"PRIu32, pos); - return p->revindex[pos].offset; + + if (p->revindex) + return p->revindex[pos].offset; + else if (pos == p->num_objects) + return p->pack_size - the_hash_algo->rawsz; + else + return nth_packed_object_offset(p, pack_pos_to_index(p, pos)); } diff --git a/pack-revindex.h b/pack-revindex.h index 6e0320b08b..61b2f3ab75 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -16,11 +16,17 @@ * can be found */ +#define RIDX_SIGNATURE 0x52494458 /* "RIDX" */ +#define RIDX_VERSION 1 + struct packed_git; /* * load_pack_revindex populates the revindex's internal data-structures for the * given pack, returning zero on success and a negative value otherwise. + * + * If a '.rev' file is present it is mmap'd, and pointers are assigned into it + * (instead of using the in-memory variant). */ int load_pack_revindex(struct packed_git *p); @@ -55,7 +61,9 @@ uint32_t pack_pos_to_index(struct packed_git *p, uint32_t pos); * If the reverse index has not yet been loaded, or the position is out of * bounds, this function aborts. * - * This function runs in constant time. + * This function runs in constant time under both in-memory and on-disk reverse + * indexes, but an additional step is taken to consult the corresponding .idx + * file when using the on-disk format. */ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos); diff --git a/packfile.c b/packfile.c index 4b938b4372..1fec12ac5f 100644 --- a/packfile.c +++ b/packfile.c @@ -324,11 +324,21 @@ void close_pack_index(struct packed_git *p) } } +void close_pack_revindex(struct packed_git *p) { + if (!p->revindex_map) + return; + + munmap((void *)p->revindex_map, p->revindex_size); + p->revindex_map = NULL; + p->revindex_data = NULL; +} + void close_pack(struct packed_git *p) { close_pack_windows(p); close_pack_fd(p); close_pack_index(p); + close_pack_revindex(p); } void close_object_store(struct raw_object_store *o) @@ -351,7 +361,7 @@ void close_object_store(struct raw_object_store *o) void unlink_pack_path(const char *pack_name, int force_delete) { - static const char *exts[] = {".pack", ".idx", ".keep", ".bitmap", ".promisor"}; + static const char *exts[] = {".pack", ".idx", ".rev", ".keep", ".bitmap", ".promisor"}; int i; struct strbuf buf = STRBUF_INIT; size_t plen; @@ -853,6 +863,7 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!strcmp(file_name, "multi-pack-index")) return; if (ends_with(file_name, ".idx") || + ends_with(file_name, ".rev") || ends_with(file_name, ".pack") || ends_with(file_name, ".bitmap") || ends_with(file_name, ".keep") || diff --git a/packfile.h b/packfile.h index a58fc738e0..4cfec9e8d3 100644 --- a/packfile.h +++ b/packfile.h @@ -90,6 +90,7 @@ uint32_t get_pack_fanout(struct packed_git *p, uint32_t value); unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *); void close_pack_windows(struct packed_git *); +void close_pack_revindex(struct packed_git *); void close_pack(struct packed_git *); void close_object_store(struct raw_object_store *o); void unuse_pack(struct pack_window **); diff --git a/tmp-objdir.c b/tmp-objdir.c index 42ed4db5d3..b8d880e362 100644 --- a/tmp-objdir.c +++ b/tmp-objdir.c @@ -185,9 +185,11 @@ static int pack_copy_priority(const char *name) return 1; if (ends_with(name, ".pack")) return 2; - if (ends_with(name, ".idx")) + if (ends_with(name, ".rev")) return 3; - return 4; + if (ends_with(name, ".idx")) + return 4; + return 5; } static int pack_copy_cmp(const char *a, const char *b) From 8ef50d9958f7be21e38026433d30f72521b4de47 Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 25 Jan 2021 18:37:18 -0500 Subject: [PATCH 02/10] pack-write.c: prepare to write 'pack-*.rev' files This patch prepares for callers to be able to write reverse index files to disk. It adds the necessary machinery to write a format-compliant .rev file from within 'write_rev_file()', which is called from 'finish_tmp_packfile()'. Similar to the process by which the reverse index is computed in memory, these new paths also have to sort a list of objects by their offsets within a packfile. These new paths use a qsort() (as opposed to a radix sort), since our specialized radix sort requires a full revindex_entry struct per object, which is more memory than we need to allocate. The qsort is obviously slower, but the theoretical slowdown would require a repository with a large amount of objects, likely implying that the time spent in, say, pack-objects during a repack would dominate the overall runtime. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- pack-write.c | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++- pack.h | 4 ++ 2 files changed, 123 insertions(+), 1 deletion(-) diff --git a/pack-write.c b/pack-write.c index e9bb3fd949..680c36755d 100644 --- a/pack-write.c +++ b/pack-write.c @@ -167,6 +167,113 @@ const char *write_idx_file(const char *index_name, struct pack_idx_entry **objec return index_name; } +static int pack_order_cmp(const void *va, const void *vb, void *ctx) +{ + struct pack_idx_entry **objects = ctx; + + off_t oa = objects[*(uint32_t*)va]->offset; + off_t ob = objects[*(uint32_t*)vb]->offset; + + if (oa < ob) + return -1; + if (oa > ob) + return 1; + return 0; +} + +static void write_rev_header(struct hashfile *f) +{ + uint32_t oid_version; + switch (hash_algo_by_ptr(the_hash_algo)) { + case GIT_HASH_SHA1: + oid_version = 1; + break; + case GIT_HASH_SHA256: + oid_version = 2; + break; + default: + die("write_rev_header: unknown hash version"); + } + + hashwrite_be32(f, RIDX_SIGNATURE); + hashwrite_be32(f, RIDX_VERSION); + hashwrite_be32(f, oid_version); +} + +static void write_rev_index_positions(struct hashfile *f, + struct pack_idx_entry **objects, + uint32_t nr_objects) +{ + uint32_t *pack_order; + uint32_t i; + + ALLOC_ARRAY(pack_order, nr_objects); + for (i = 0; i < nr_objects; i++) + pack_order[i] = i; + QSORT_S(pack_order, nr_objects, pack_order_cmp, objects); + + for (i = 0; i < nr_objects; i++) + hashwrite_be32(f, pack_order[i]); + + free(pack_order); +} + +static void write_rev_trailer(struct hashfile *f, const unsigned char *hash) +{ + hashwrite(f, hash, the_hash_algo->rawsz); +} + +const char *write_rev_file(const char *rev_name, + struct pack_idx_entry **objects, + uint32_t nr_objects, + const unsigned char *hash, + unsigned flags) +{ + struct hashfile *f; + int fd; + + if ((flags & WRITE_REV) && (flags & WRITE_REV_VERIFY)) + die(_("cannot both write and verify reverse index")); + + if (flags & WRITE_REV) { + if (!rev_name) { + struct strbuf tmp_file = STRBUF_INIT; + fd = odb_mkstemp(&tmp_file, "pack/tmp_rev_XXXXXX"); + rev_name = strbuf_detach(&tmp_file, NULL); + } else { + unlink(rev_name); + fd = open(rev_name, O_CREAT|O_EXCL|O_WRONLY, 0600); + if (fd < 0) + die_errno("unable to create '%s'", rev_name); + } + f = hashfd(fd, rev_name); + } else if (flags & WRITE_REV_VERIFY) { + struct stat statbuf; + if (stat(rev_name, &statbuf)) { + if (errno == ENOENT) { + /* .rev files are optional */ + return NULL; + } else + die_errno(_("could not stat: %s"), rev_name); + } + f = hashfd_check(rev_name); + } else + return NULL; + + write_rev_header(f); + + write_rev_index_positions(f, objects, nr_objects); + write_rev_trailer(f, hash); + + if (rev_name && adjust_shared_perm(rev_name) < 0) + die(_("failed to make %s readable"), rev_name); + + finalize_hashfile(f, NULL, CSUM_HASH_IN_STREAM | CSUM_CLOSE | + ((flags & WRITE_IDX_VERIFY) ? 0 : CSUM_FSYNC)); + + return rev_name; +} + off_t write_pack_header(struct hashfile *f, uint32_t nr_entries) { struct pack_header hdr; @@ -342,7 +449,7 @@ void finish_tmp_packfile(struct strbuf *name_buffer, struct pack_idx_option *pack_idx_opts, unsigned char hash[]) { - const char *idx_tmp_name; + const char *idx_tmp_name, *rev_tmp_name = NULL; int basename_len = name_buffer->len; if (adjust_shared_perm(pack_tmp_name)) @@ -353,6 +460,9 @@ void finish_tmp_packfile(struct strbuf *name_buffer, if (adjust_shared_perm(idx_tmp_name)) die_errno("unable to make temporary index file readable"); + rev_tmp_name = write_rev_file(NULL, written_list, nr_written, hash, + pack_idx_opts->flags); + strbuf_addf(name_buffer, "%s.pack", hash_to_hex(hash)); if (rename(pack_tmp_name, name_buffer->buf)) @@ -366,6 +476,14 @@ void finish_tmp_packfile(struct strbuf *name_buffer, strbuf_setlen(name_buffer, basename_len); + if (rev_tmp_name) { + strbuf_addf(name_buffer, "%s.rev", hash_to_hex(hash)); + if (rename(rev_tmp_name, name_buffer->buf)) + die_errno("unable to rename temporary reverse-index file"); + } + + strbuf_setlen(name_buffer, basename_len); + free((void *)idx_tmp_name); } diff --git a/pack.h b/pack.h index 9ae640f417..afdcf8f5c7 100644 --- a/pack.h +++ b/pack.h @@ -42,6 +42,8 @@ struct pack_idx_option { /* flag bits */ #define WRITE_IDX_VERIFY 01 /* verify only, do not write the idx file */ #define WRITE_IDX_STRICT 02 +#define WRITE_REV 04 +#define WRITE_REV_VERIFY 010 uint32_t version; uint32_t off32_limit; @@ -91,6 +93,8 @@ struct ref; void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought); +const char *write_rev_file(const char *rev_name, struct pack_idx_entry **objects, uint32_t nr_objects, const unsigned char *hash, unsigned flags); + /* * The "hdr" output buffer should be at least this big, which will handle sizes * up to 2^67. From 84d544943c27a1540826730b6a8f588200c65fe6 Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 25 Jan 2021 18:37:22 -0500 Subject: [PATCH 03/10] builtin/index-pack.c: allow stripping arbitrary extensions To derive the filename for a .idx file, 'git index-pack' uses derive_filename() to strip the '.pack' suffix and add the new suffix. Prepare for stripping off suffixes other than '.pack' by making the suffix to strip a parameter of derive_filename(). In order to make this consistent with the "suffix" parameter which does not begin with a ".", an additional check in derive_filename. Suggested-by: Jeff King Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- builtin/index-pack.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 557bd2f348..c758f3b8e9 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1436,15 +1436,15 @@ static void fix_unresolved_deltas(struct hashfile *f) free(sorted_by_pos); } -static const char *derive_filename(const char *pack_name, const char *suffix, - struct strbuf *buf) +static const char *derive_filename(const char *pack_name, const char *strip, + const char *suffix, struct strbuf *buf) { size_t len; - if (!strip_suffix(pack_name, ".pack", &len)) - die(_("packfile name '%s' does not end with '.pack'"), - pack_name); + if (!strip_suffix(pack_name, strip, &len) || !len || + pack_name[len - 1] != '.') + die(_("packfile name '%s' does not end with '.%s'"), + pack_name, strip); strbuf_add(buf, pack_name, len); - strbuf_addch(buf, '.'); strbuf_addstr(buf, suffix); return buf->buf; } @@ -1459,7 +1459,7 @@ static void write_special_file(const char *suffix, const char *msg, int msg_len = strlen(msg); if (pack_name) - filename = derive_filename(pack_name, suffix, &name_buf); + filename = derive_filename(pack_name, "pack", suffix, &name_buf); else filename = odb_pack_name(&name_buf, hash, suffix); @@ -1824,7 +1824,7 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (from_stdin && hash_algo) die(_("--object-format cannot be used with --stdin")); if (!index_name && pack_name) - index_name = derive_filename(pack_name, "idx", &index_name_buf); + index_name = derive_filename(pack_name, "pack", "idx", &index_name_buf); if (verify) { if (!index_name) From e37d0b8730b67b83c3a7cc20ed3a45cf7f0aa7ed Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 25 Jan 2021 18:37:26 -0500 Subject: [PATCH 04/10] builtin/index-pack.c: write reverse indexes Teach 'git index-pack' to optionally write and verify reverse index with '--[no-]rev-index', as well as respecting the 'pack.writeReverseIndex' configuration option. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- Documentation/git-index-pack.txt | 18 +++++--- builtin/index-pack.c | 50 ++++++++++++++++++++-- t/t5325-reverse-index.sh | 71 ++++++++++++++++++++++++++++++++ 3 files changed, 131 insertions(+), 8 deletions(-) create mode 100755 t/t5325-reverse-index.sh diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt index af0c26232c..69ba904d44 100644 --- a/Documentation/git-index-pack.txt +++ b/Documentation/git-index-pack.txt @@ -9,17 +9,18 @@ git-index-pack - Build pack index file for an existing packed archive SYNOPSIS -------- [verse] -'git index-pack' [-v] [-o ] +'git index-pack' [-v] [-o ] [--[no-]rev-index] 'git index-pack' --stdin [--fix-thin] [--keep] [-v] [-o ] - [] + [--[no-]rev-index] [] DESCRIPTION ----------- Reads a packed archive (.pack) from the specified file, and -builds a pack index file (.idx) for it. The packed archive -together with the pack index can then be placed in the -objects/pack/ directory of a Git repository. +builds a pack index file (.idx) for it. Optionally writes a +reverse-index (.rev) for the specified pack. The packed +archive together with the pack index can then be placed in +the objects/pack/ directory of a Git repository. OPTIONS @@ -35,6 +36,13 @@ OPTIONS fails if the name of packed archive does not end with .pack). +--[no-]rev-index:: + When this flag is provided, generate a reverse index + (a `.rev` file) corresponding to the given pack. If + `--verify` is given, ensure that the existing + reverse index is correct. Takes precedence over + `pack.writeReverseIndex`. + --stdin:: When this flag is provided, the pack is read from stdin instead and a copy is then written to . If diff --git a/builtin/index-pack.c b/builtin/index-pack.c index c758f3b8e9..d5cd665b98 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -17,7 +17,7 @@ #include "promisor-remote.h" static const char index_pack_usage[] = -"git index-pack [-v] [-o ] [--keep | --keep=] [--verify] [--strict] ( | --stdin [--fix-thin] [])"; +"git index-pack [-v] [-o ] [--keep | --keep=] [--[no-]rev-index] [--verify] [--strict] ( | --stdin [--fix-thin] [])"; struct object_entry { struct pack_idx_entry idx; @@ -1484,12 +1484,14 @@ static void write_special_file(const char *suffix, const char *msg, static void final(const char *final_pack_name, const char *curr_pack_name, const char *final_index_name, const char *curr_index_name, + const char *final_rev_index_name, const char *curr_rev_index_name, const char *keep_msg, const char *promisor_msg, unsigned char *hash) { const char *report = "pack"; struct strbuf pack_name = STRBUF_INIT; struct strbuf index_name = STRBUF_INIT; + struct strbuf rev_index_name = STRBUF_INIT; int err; if (!from_stdin) { @@ -1524,6 +1526,16 @@ static void final(const char *final_pack_name, const char *curr_pack_name, } else chmod(final_index_name, 0444); + if (curr_rev_index_name) { + if (final_rev_index_name != curr_rev_index_name) { + if (!final_rev_index_name) + final_rev_index_name = odb_pack_name(&rev_index_name, hash, "rev"); + if (finalize_object_file(curr_rev_index_name, final_rev_index_name)) + die(_("cannot store reverse index file")); + } else + chmod(final_rev_index_name, 0444); + } + if (do_fsck_object) { struct packed_git *p; p = add_packed_git(final_index_name, strlen(final_index_name), 0); @@ -1553,6 +1565,7 @@ static void final(const char *final_pack_name, const char *curr_pack_name, } } + strbuf_release(&rev_index_name); strbuf_release(&index_name); strbuf_release(&pack_name); } @@ -1578,6 +1591,12 @@ static int git_index_pack_config(const char *k, const char *v, void *cb) } return 0; } + if (!strcmp(k, "pack.writereverseindex")) { + if (git_config_bool(k, v)) + opts->flags |= WRITE_REV; + else + opts->flags &= ~WRITE_REV; + } return git_default_config(k, v, cb); } @@ -1695,12 +1714,14 @@ static void show_pack_info(int stat_only) int cmd_index_pack(int argc, const char **argv, const char *prefix) { - int i, fix_thin_pack = 0, verify = 0, stat_only = 0; + int i, fix_thin_pack = 0, verify = 0, stat_only = 0, rev_index; const char *curr_index; - const char *index_name = NULL, *pack_name = NULL; + const char *curr_rev_index = NULL; + const char *index_name = NULL, *pack_name = NULL, *rev_index_name = NULL; const char *keep_msg = NULL; const char *promisor_msg = NULL; struct strbuf index_name_buf = STRBUF_INIT; + struct strbuf rev_index_name_buf = STRBUF_INIT; struct pack_idx_entry **idx_objects; struct pack_idx_option opts; unsigned char pack_hash[GIT_MAX_RAWSZ]; @@ -1727,6 +1748,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (prefix && chdir(prefix)) die(_("Cannot come back to cwd")); + rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); + for (i = 1; i < argc; i++) { const char *arg = argv[i]; @@ -1805,6 +1828,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (hash_algo == GIT_HASH_UNKNOWN) die(_("unknown hash algorithm '%s'"), arg); repo_set_hash_algo(the_repository, hash_algo); + } else if (!strcmp(arg, "--rev-index")) { + rev_index = 1; + } else if (!strcmp(arg, "--no-rev-index")) { + rev_index = 0; } else usage(index_pack_usage); continue; @@ -1826,6 +1853,15 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (!index_name && pack_name) index_name = derive_filename(pack_name, "pack", "idx", &index_name_buf); + opts.flags &= ~(WRITE_REV | WRITE_REV_VERIFY); + if (rev_index) { + opts.flags |= verify ? WRITE_REV_VERIFY : WRITE_REV; + if (index_name) + rev_index_name = derive_filename(index_name, + "idx", "rev", + &rev_index_name_buf); + } + if (verify) { if (!index_name) die(_("--verify with no packfile name given")); @@ -1878,11 +1914,16 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) for (i = 0; i < nr_objects; i++) idx_objects[i] = &objects[i].idx; curr_index = write_idx_file(index_name, idx_objects, nr_objects, &opts, pack_hash); + if (rev_index) + curr_rev_index = write_rev_file(rev_index_name, idx_objects, + nr_objects, pack_hash, + opts.flags); free(idx_objects); if (!verify) final(pack_name, curr_pack, index_name, curr_index, + rev_index_name, curr_rev_index, keep_msg, promisor_msg, pack_hash); else @@ -1893,10 +1934,13 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) free(objects); strbuf_release(&index_name_buf); + strbuf_release(&rev_index_name_buf); if (pack_name == NULL) free((void *) curr_pack); if (index_name == NULL) free((void *) curr_index); + if (rev_index_name == NULL) + free((void *) curr_rev_index); /* * Let the caller know this pack is not self contained diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh new file mode 100755 index 0000000000..2dae213126 --- /dev/null +++ b/t/t5325-reverse-index.sh @@ -0,0 +1,71 @@ +#!/bin/sh + +test_description='on-disk reverse index' +. ./test-lib.sh + +packdir=.git/objects/pack + +test_expect_success 'setup' ' + test_commit base && + + pack=$(git pack-objects --all $packdir/pack) && + rev=$packdir/pack-$pack.rev && + + test_path_is_missing $rev +' + +test_index_pack () { + rm -f $rev && + conf=$1 && + shift && + # remove the index since Windows won't overwrite an existing file + rm $packdir/pack-$pack.idx && + git -c pack.writeReverseIndex=$conf index-pack "$@" \ + $packdir/pack-$pack.pack +} + +test_expect_success 'index-pack with pack.writeReverseIndex' ' + test_index_pack "" && + test_path_is_missing $rev && + + test_index_pack false && + test_path_is_missing $rev && + + test_index_pack true && + test_path_is_file $rev +' + +test_expect_success 'index-pack with --[no-]rev-index' ' + for conf in "" true false + do + test_index_pack "$conf" --rev-index && + test_path_exists $rev && + + test_index_pack "$conf" --no-rev-index && + test_path_is_missing $rev + done +' + +test_expect_success 'index-pack can verify reverse indexes' ' + test_when_finished "rm -f $rev" && + test_index_pack true && + + test_path_is_file $rev && + git index-pack --rev-index --verify $packdir/pack-$pack.pack && + + # Intentionally corrupt the reverse index. + chmod u+w $rev && + printf "xxxx" | dd of=$rev bs=1 count=4 conv=notrunc && + + test_must_fail git index-pack --rev-index --verify \ + $packdir/pack-$pack.pack 2>err && + grep "validation error" err +' + +test_expect_success 'index-pack infers reverse index name with -o' ' + git index-pack --rev-index -o other.idx $packdir/pack-$pack.pack && + test_path_is_file other.idx && + test_path_is_file other.rev +' + +test_done From c97733435aae270bbef9bd4d179edca678774375 Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 25 Jan 2021 18:37:30 -0500 Subject: [PATCH 05/10] builtin/pack-objects.c: respect 'pack.writeReverseIndex' Now that we have an implementation that can write the new reverse index format, enable writing a .rev file in 'git pack-objects' by consulting the pack.writeReverseIndex configuration variable. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- builtin/pack-objects.c | 7 +++++++ t/t5325-reverse-index.sh | 13 +++++++++++++ 2 files changed, 20 insertions(+) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 5b0c4489e2..d784569200 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2955,6 +2955,13 @@ static int git_pack_config(const char *k, const char *v, void *cb) pack_idx_opts.version); return 0; } + if (!strcmp(k, "pack.writereverseindex")) { + if (git_config_bool(k, v)) + pack_idx_opts.flags |= WRITE_REV; + else + pack_idx_opts.flags &= ~WRITE_REV; + return 0; + } if (!strcmp(k, "uploadpack.blobpackfileuri")) { struct configured_exclusion *ex = xmalloc(sizeof(*ex)); const char *oid_end, *pack_end; diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index 2dae213126..87040263b7 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -68,4 +68,17 @@ test_expect_success 'index-pack infers reverse index name with -o' ' test_path_is_file other.rev ' +test_expect_success 'pack-objects respects pack.writeReverseIndex' ' + test_when_finished "rm -fr pack-1-*" && + + git -c pack.writeReverseIndex= pack-objects --all pack-1 && + test_path_is_missing pack-1-*.rev && + + git -c pack.writeReverseIndex=false pack-objects --all pack-1 && + test_path_is_missing pack-1-*.rev && + + git -c pack.writeReverseIndex=true pack-objects --all pack-1 && + test_path_is_file pack-1-*.rev +' + test_done From 1615c567b8cf65725f4aee5206ad5a04273794dd Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 25 Jan 2021 18:37:34 -0500 Subject: [PATCH 06/10] Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' Now that the pack.writeReverseIndex configuration is respected in both 'git index-pack' and 'git pack-objects' (and therefore, all of their callers), we can safely advertise it for use in the git-config manual. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- Documentation/config/pack.txt | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt index 837f1b1679..3da4ea98e2 100644 --- a/Documentation/config/pack.txt +++ b/Documentation/config/pack.txt @@ -133,3 +133,10 @@ pack.writeBitmapHashCache:: between an older, bitmapped pack and objects that have been pushed since the last gc). The downside is that it consumes 4 bytes per object of disk space. Defaults to true. + +pack.writeReverseIndex:: + When true, git will write a corresponding .rev file (see: + link:../technical/pack-format.html[Documentation/technical/pack-format.txt]) + for each new packfile that it writes in all places except for + linkgit:git-fast-import[1] and in the bulk checkin mechanism. + Defaults to false. From 35a8a3547a841b031b14cd759cf53996bca89baf Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 25 Jan 2021 18:37:38 -0500 Subject: [PATCH 07/10] t: prepare for GIT_TEST_WRITE_REV_INDEX In the next patch, we'll add support for unconditionally enabling the 'pack.writeReverseIndex' setting with a new GIT_TEST_WRITE_REV_INDEX environment variable. This causes a little bit of fallout with tests that, for example, compare the list of files in the pack directory being unprepared to see .rev files in its output. Those locations can be cleaned up to look for specific file extensions, rather than take everything in the pack directory (for instance) and then grep out unwanted items. Once the pack.writeReverseIndex option has been thoroughly tested, we will default it to 'true', removing GIT_TEST_WRITE_REV_INDEX, and making it possible to revert this patch. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- t/t5319-multi-pack-index.sh | 5 +++-- t/t5325-reverse-index.sh | 4 ++++ t/t5604-clone-reference.sh | 2 +- t/t5702-protocol-v2.sh | 12 ++++++++---- t/t6500-gc.sh | 6 +++--- t/t9300-fast-import.sh | 5 ++++- 6 files changed, 23 insertions(+), 11 deletions(-) diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 297de502a9..2fc3aadbd1 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -710,8 +710,9 @@ test_expect_success 'expire respects .keep files' ' PACKA=$(ls .git/objects/pack/a-pack*\.pack | sed s/\.pack\$//) && touch $PACKA.keep && git multi-pack-index expire && - ls -S .git/objects/pack/a-pack* | grep $PACKA >a-pack-files && - test_line_count = 3 a-pack-files && + test_path_is_file $PACKA.idx && + test_path_is_file $PACKA.keep && + test_path_is_file $PACKA.pack && test-tool read-midx .git/objects | grep idx >midx-list && test_line_count = 2 midx-list ) diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index 87040263b7..be452bb343 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -3,6 +3,10 @@ test_description='on-disk reverse index' . ./test-lib.sh +# The below tests want control over the 'pack.writeReverseIndex' setting +# themselves to assert various combinations of it with other options. +sane_unset GIT_TEST_WRITE_REV_INDEX + packdir=.git/objects/pack test_expect_success 'setup' ' diff --git a/t/t5604-clone-reference.sh b/t/t5604-clone-reference.sh index 5d682706ae..e845d621f6 100755 --- a/t/t5604-clone-reference.sh +++ b/t/t5604-clone-reference.sh @@ -329,7 +329,7 @@ test_expect_success SYMLINKS 'clone repo with symlinked or unknown files at obje for raw in $(ls T*.raw) do sed -e "s!/../!/Y/!; s![0-9a-f]\{38,\}!Z!" -e "/commit-graph/d" \ - -e "/multi-pack-index/d" <$raw >$raw.de-sha-1 && + -e "/multi-pack-index/d" -e "/rev/d" <$raw >$raw.de-sha-1 && sort $raw.de-sha-1 >$raw.de-sha || return 1 done && diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh index 3d994e0b1b..e8f0b4a299 100755 --- a/t/t5702-protocol-v2.sh +++ b/t/t5702-protocol-v2.sh @@ -851,8 +851,10 @@ test_expect_success 'part of packfile response provided as URI' ' test -f h2found && # Ensure that there are exactly 6 files (3 .pack and 3 .idx). - ls http_child/.git/objects/pack/* >filelist && - test_line_count = 6 filelist + ls http_child/.git/objects/pack/*.pack >packlist && + ls http_child/.git/objects/pack/*.idx >idxlist && + test_line_count = 3 idxlist && + test_line_count = 3 packlist ' test_expect_success 'fetching with valid packfile URI but invalid hash fails' ' @@ -905,8 +907,10 @@ test_expect_success 'packfile-uri with transfer.fsckobjects' ' clone "$HTTPD_URL/smart/http_parent" http_child && # Ensure that there are exactly 4 files (2 .pack and 2 .idx). - ls http_child/.git/objects/pack/* >filelist && - test_line_count = 4 filelist + ls http_child/.git/objects/pack/*.pack >packlist && + ls http_child/.git/objects/pack/*.idx >idxlist && + test_line_count = 2 idxlist && + test_line_count = 2 packlist ' test_expect_success 'packfile-uri with transfer.fsckobjects fails on bad object' ' diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh index 4a3b8f48ac..f76586f808 100755 --- a/t/t6500-gc.sh +++ b/t/t6500-gc.sh @@ -106,17 +106,17 @@ test_expect_success 'auto gc with too many loose objects does not attempt to cre test_commit "$(test_oid obj2)" && # Our first gc will create a pack; our second will create a second pack git gc --auto && - ls .git/objects/pack | sort >existing_packs && + ls .git/objects/pack/pack-*.pack | sort >existing_packs && test_commit "$(test_oid obj3)" && test_commit "$(test_oid obj4)" && git gc --auto 2>err && test_i18ngrep ! "^warning:" err && - ls .git/objects/pack/ | sort >post_packs && + ls .git/objects/pack/pack-*.pack | sort >post_packs && comm -1 -3 existing_packs post_packs >new && comm -2 -3 existing_packs post_packs >del && test_line_count = 0 del && # No packs are deleted - test_line_count = 2 new # There is one new pack and its .idx + test_line_count = 1 new # There is one new pack ' test_expect_success 'gc --no-quiet' ' diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 3d17e932a0..8f1caf8025 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -1632,7 +1632,10 @@ test_expect_success 'O: blank lines not necessary after other commands' ' INPUT_END git fast-import packlist && + ls -la .git/objects/pack/pack-*.pack >idxlist && + test_line_count = 4 idxlist && + test_line_count = 4 packlist && test $(git rev-parse refs/tags/O3-2nd) = $(git rev-parse O3^) && git log --reverse --pretty=oneline O3 | sed s/^.*z// >actual && test_cmp expect actual From e8c58f894b4d5369779e06ff7dc03fb0706c930a Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 25 Jan 2021 18:37:42 -0500 Subject: [PATCH 08/10] t: support GIT_TEST_WRITE_REV_INDEX Add a new option that unconditionally enables the pack.writeReverseIndex setting in order to run the whole test suite in a mode that generates on-disk reverse indexes. Additionally, enable this mode in the second run of tests under linux-gcc in 'ci/run-build-and-tests.sh'. Once on-disk reverse indexes are proven out over several releases, we can change the default value of that configuration to 'true', and drop this patch. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- builtin/index-pack.c | 5 ++++- builtin/pack-objects.c | 2 ++ ci/run-build-and-tests.sh | 1 + pack-revindex.h | 3 +++ t/README | 3 +++ 5 files changed, 13 insertions(+), 1 deletion(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index d5cd665b98..54f74c4874 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1748,7 +1748,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) if (prefix && chdir(prefix)) die(_("Cannot come back to cwd")); - rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); + if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) + rev_index = 1; + else + rev_index = !!(opts.flags & (WRITE_REV_VERIFY | WRITE_REV)); for (i = 1; i < argc; i++) { const char *arg = argv[i]; diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d784569200..24df0c98f7 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3601,6 +3601,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix) reset_pack_idx_option(&pack_idx_opts); git_config(git_pack_config, NULL); + if (git_env_bool(GIT_TEST_WRITE_REV_INDEX, 0)) + pack_idx_opts.flags |= WRITE_REV; progress = isatty(2); argc = parse_options(argc, argv, prefix, pack_objects_options, diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 50e0b90073..a66b5e8c75 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -24,6 +24,7 @@ linux-gcc) export GIT_TEST_MULTI_PACK_INDEX=1 export GIT_TEST_ADD_I_USE_BUILTIN=1 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master + export GIT_TEST_WRITE_REV_INDEX=1 make test ;; linux-clang) diff --git a/pack-revindex.h b/pack-revindex.h index 61b2f3ab75..d1a0595e89 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -16,9 +16,12 @@ * can be found */ + #define RIDX_SIGNATURE 0x52494458 /* "RIDX" */ #define RIDX_VERSION 1 +#define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX" + struct packed_git; /* diff --git a/t/README b/t/README index c730a70770..0f97a51640 100644 --- a/t/README +++ b/t/README @@ -439,6 +439,9 @@ GIT_TEST_DEFAULT_HASH= specifies which hash algorithm to use in the test scripts. Recognized values for are "sha1" and "sha256". +GIT_TEST_WRITE_REV_INDEX=, when true enables the +'pack.writeReverseIndex' setting. + Naming Tests ------------ From ec8e7760ac38ef426f87ec738454e574be53a00e Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Mon, 25 Jan 2021 18:37:46 -0500 Subject: [PATCH 09/10] pack-revindex: ensure that on-disk reverse indexes are given precedence When an on-disk reverse index exists, there is no need to generate one in memory. In fact, doing so can be slow, and require large amounts of the heap. Let's make sure that we treat the on-disk reverse index with precedence (i.e., that when it exists, we don't bother trying to generate an equivalent one in memory) by teaching Git how to conditionally die() when generating a reverse index in memory. Then, add a test to ensure that when (a) an on-disk reverse index exists, and (b) when setting GIT_TEST_REV_INDEX_DIE_IN_MEMORY, that we do not die, implying that we read from the on-disk one. Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- pack-revindex.c | 4 ++++ pack-revindex.h | 1 + t/t5325-reverse-index.sh | 9 +++++++++ 3 files changed, 14 insertions(+) diff --git a/pack-revindex.c b/pack-revindex.c index a174fa5388..83fe4de773 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -2,6 +2,7 @@ #include "pack-revindex.h" #include "object-store.h" #include "packfile.h" +#include "config.h" struct revindex_entry { off_t offset; @@ -166,6 +167,9 @@ static void create_pack_revindex(struct packed_git *p) static int create_pack_revindex_in_memory(struct packed_git *p) { + if (git_env_bool(GIT_TEST_REV_INDEX_DIE_IN_MEMORY, 0)) + die("dying as requested by '%s'", + GIT_TEST_REV_INDEX_DIE_IN_MEMORY); if (open_pack_index(p)) return -1; create_pack_revindex(p); diff --git a/pack-revindex.h b/pack-revindex.h index d1a0595e89..ba7c82c125 100644 --- a/pack-revindex.h +++ b/pack-revindex.h @@ -21,6 +21,7 @@ #define RIDX_VERSION 1 #define GIT_TEST_WRITE_REV_INDEX "GIT_TEST_WRITE_REV_INDEX" +#define GIT_TEST_REV_INDEX_DIE_IN_MEMORY "GIT_TEST_REV_INDEX_DIE_IN_MEMORY" struct packed_git; diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index be452bb343..a344b18d7e 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -85,4 +85,13 @@ test_expect_success 'pack-objects respects pack.writeReverseIndex' ' test_path_is_file pack-1-*.rev ' +test_expect_success 'reverse index is not generated when available on disk' ' + test_index_pack true && + test_path_is_file $rev && + + git rev-parse HEAD >tip && + GIT_TEST_REV_INDEX_DIE_IN_MEMORY=1 git cat-file \ + --batch-check="%(objectsize:disk)" Date: Thu, 28 Jan 2021 20:32:02 -0500 Subject: [PATCH 10/10] t5325: check both on-disk and in-memory reverse index Right now, the test suite can be run with 'GIT_TEST_WRITE_REV_INDEX=1' in the environment, which causes all operations which write a pack to also write a .rev file. To prepare for when that eventually becomes the default, we should continue to test the in-memory reverse index, too, in order to avoid losing existing coverage. Unfortunately, explicit existing coverage is rather sparse, so only a basic test is added that compares the result of git rev-list --objects --no-object-names --all | git cat-file --batch-check='%(objectsize:disk) %(objectname)' with and without an on-disk reverse index. Suggested-by: Jeff King Helped-by: Jeff King Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- t/t5325-reverse-index.sh | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh index a344b18d7e..da453f68d6 100755 --- a/t/t5325-reverse-index.sh +++ b/t/t5325-reverse-index.sh @@ -94,4 +94,27 @@ test_expect_success 'reverse index is not generated when available on disk' ' --batch-check="%(objectsize:disk)" objects && + + git -c pack.writeReverseIndex=false repack -ad && + test_path_is_missing $packdir/pack-*.rev && + git cat-file --batch-check="%(objectsize:disk) %(objectname)" \ + in-core && + + git -c pack.writeReverseIndex=true repack -ad && + test_path_is_file $packdir/pack-*.rev && + git cat-file --batch-check="%(objectsize:disk) %(objectname)" \ + on-disk && + + test_cmp on-disk in-core + ) +' test_done