commit-graph: fix corrupt upgrade from generation v1 to v2
The previous commit demonstrates a bug where a commit-graph using generation v2 could enter a state where one of the GDA2 values has its most-significant bit set (indicating that its value should be read from the extended offset table in the GDO2 chunk) without having a GDO2 chunk to read from. This results in the following error message being displayed to the caller: fatal: commit-graph requires overflow generation data but has none This bug arises in the following scenario: - We decide to write a commit-graph using generation number v2, and decide (correctly) that no GDO2 chunk is necessary (e.g., because all of the commiter date offsets are no larger than 2^31-1). - The v2 generation numbers are stored in the `->generation` member of the commit slab holding `struct commit_graph_data`'s. - Later on, `load_commit_graph_info()` is called, overwriting the v2 generation data in the aforementioned slab with any existing v1 generation data. Then, when the commit-graph code goes to write the GDA2 chunk via `write_graph_chunk_generation_data()`, we use the overwritten generation v1 data in a place where we expect to use a v2 generation number: offset = commit_graph_data_at(c)->generation - c->date; ...because `commit_graph_data_at(c)->generation` used to hold the v2 generation data, but it was overwritten to contain the v1 generation number via `load_commit_graph_info()`. If the `offset` computation above overflows the v2 generation number max, then `write_graph_chunk_generation_data()` will update its count of large offsets and write the marker accordingly: if (offset > GENERATION_NUMBER_V2_OFFSET_MAX) { offset = CORRECTED_COMMIT_DATE_OFFSET_OVERFLOW | num_generation_data_overflows; num_generation_data_overflows++; } and reads will look for the GDO2 chunk containing the overflowing v2 generation number, *after* the commit-graph code decided that no such chunk was necessary. The main problem is that the slab containing `struct commit_graph_data` has a dual purpose. It is used to hold data that we are about to write to disk while generating a commit-graph, as well as hold data that was read from an existing commit-graph. When the two mix, namely when the result of reading the commit-graph has a side-effect that mixes poorly with an in-progress commit-graph write, we end up with corrupt data. A complete fix might be to introduce a new slab that is used exclusively for writing, and gate access between the two slabs based on context provided by the caller (e.g., whether this computation is part of a "read" or "write" operation). But a more minimal fix addresses the only known path which overwrites the slab data, which is `compute_bloom_filters()` -> `get_or_compute_bloom_filter()` -> `load_commit_graph_info()` -> `fill_commit_graph_info()` by avoiding the last call which clobbers the data altogether. This path only needs to learn the graph position of a given commit so that it can be used in `load_bloom_filter_from_graph()`. By replacing the last steps of the above with one that records the graph position into a temporary variable which is then used to load the existing Bloom data, we eliminate the clobbering, removing the corruption. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
parent
7805360b7a
commit
9550f6c16a
10
bloom.c
10
bloom.c
@ -30,10 +30,9 @@ static inline unsigned char get_bitmask(uint32_t pos)
|
||||
|
||||
static int load_bloom_filter_from_graph(struct commit_graph *g,
|
||||
struct bloom_filter *filter,
|
||||
struct commit *c)
|
||||
uint32_t graph_pos)
|
||||
{
|
||||
uint32_t lex_pos, start_index, end_index;
|
||||
uint32_t graph_pos = commit_graph_position(c);
|
||||
|
||||
while (graph_pos < g->num_commits_in_base)
|
||||
g = g->base_graph;
|
||||
@ -203,9 +202,10 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
|
||||
filter = bloom_filter_slab_at(&bloom_filters, c);
|
||||
|
||||
if (!filter->data) {
|
||||
load_commit_graph_info(r, c);
|
||||
if (commit_graph_position(c) != COMMIT_NOT_FROM_GRAPH)
|
||||
load_bloom_filter_from_graph(r->objects->commit_graph, filter, c);
|
||||
uint32_t graph_pos;
|
||||
if (repo_find_commit_pos_in_graph(r, c, &graph_pos))
|
||||
load_bloom_filter_from_graph(r->objects->commit_graph,
|
||||
filter, graph_pos);
|
||||
}
|
||||
|
||||
if (filter->data && filter->len)
|
||||
|
@ -811,7 +811,7 @@ test_expect_success 'set up and verify repo with generation data overflow chunk'
|
||||
|
||||
graph_git_behavior 'generation data overflow chunk repo' repo left right
|
||||
|
||||
test_expect_failure 'overflow during generation version upgrade' '
|
||||
test_expect_success 'overflow during generation version upgrade' '
|
||||
git init overflow-v2-upgrade &&
|
||||
(
|
||||
cd overflow-v2-upgrade &&
|
||||
|
Loading…
Reference in New Issue
Block a user