commit-graph write: don't die if the existing graph is corrupt

When the commit-graph is written we end up calling
parse_commit(). This will in turn invoke code that'll consult the
existing commit-graph about the commit, if the graph is corrupted we
die.

We thus get into a state where a failing "commit-graph verify" can't
be followed-up with a "commit-graph write" if core.commitGraph=true is
set, the graph either needs to be manually removed to proceed, or
core.commitGraph needs to be set to "false".

Change the "commit-graph write" codepath to use a new
parse_commit_no_graph() helper instead of parse_commit() to avoid
this. The latter will call repo_parse_commit_internal() with
use_commit_graph=1 as seen in 177722b344 ("commit: integrate commit
graph with commit parsing", 2018-04-10).

Not using the old graph at all slows down the writing of the new graph
by some small amount, but is a sensible way to prevent an error in the
existing commit-graph from spreading.

Just fixing the current issue would be likely to result in code that's
inadvertently broken in the future. New code might use the
commit-graph at a distance. To detect such cases introduce a
"GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD" setting used when we do our
corruption tests, and test that a "write/verify" combo works after
every one of our current test cases where we now detect commit-graph
corruption.

Some of the code changes here might be strictly unnecessary, e.g. I
was unable to find cases where the parse_commit() called from
write_graph_chunk_data() didn't exit early due to
"item->object.parsed" being true in
repo_parse_commit_internal() (before the use_commit_graph=1 has any
effect). But let's also convert those cases for good measure, we do
not have exhaustive tests for all possible types of commit-graph
corruption.

This might need to be re-visited if we learn to write the commit-graph
incrementally, but probably not. Hopefully we'll just start by finding
out what commits we have in total, then read the old graph(s) to see
what they cover, and finally write a new graph file with everything
that's missing. In that case the new graph writing code just needs to
continue to use e.g. a parse_commit() that doesn't consult the
existing commit-graphs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Ævar Arnfjörð Bjarmason 2019-03-25 13:08:33 +01:00 committed by Junio C Hamano
parent 7b8ce9c673
commit 43d3561805
4 changed files with 23 additions and 5 deletions

View File

@ -311,6 +311,10 @@ static int prepare_commit_graph(struct repository *r)
struct object_directory *odb; struct object_directory *odb;
int config_value; int config_value;
if (git_env_bool(GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD, 0))
die("dying as requested by the '%s' variable on commit-graph load!",
GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD);
if (r->objects->commit_graph_attempted) if (r->objects->commit_graph_attempted)
return !!r->objects->commit_graph; return !!r->objects->commit_graph;
r->objects->commit_graph_attempted = 1; r->objects->commit_graph_attempted = 1;
@ -575,7 +579,7 @@ static void write_graph_chunk_data(struct hashfile *f, int hash_len,
uint32_t packedDate[2]; uint32_t packedDate[2];
display_progress(progress, ++*progress_cnt); display_progress(progress, ++*progress_cnt);
parse_commit(*list); parse_commit_no_graph(*list);
hashwrite(f, get_commit_tree_oid(*list)->hash, hash_len); hashwrite(f, get_commit_tree_oid(*list)->hash, hash_len);
parent = (*list)->parents; parent = (*list)->parents;
@ -772,7 +776,7 @@ static void close_reachable(struct packed_oid_list *oids, int report_progress)
display_progress(progress, i + 1); display_progress(progress, i + 1);
commit = lookup_commit(the_repository, &oids->list[i]); commit = lookup_commit(the_repository, &oids->list[i]);
if (commit && !parse_commit(commit)) if (commit && !parse_commit_no_graph(commit))
add_missing_parents(oids, commit); add_missing_parents(oids, commit);
} }
stop_progress(&progress); stop_progress(&progress);
@ -1021,7 +1025,7 @@ void write_commit_graph(const char *obj_dir,
continue; continue;
commits.list[commits.nr] = lookup_commit(the_repository, &oids.list[i]); commits.list[commits.nr] = lookup_commit(the_repository, &oids.list[i]);
parse_commit(commits.list[commits.nr]); parse_commit_no_graph(commits.list[commits.nr]);
for (parent = commits.list[commits.nr]->parents; for (parent = commits.list[commits.nr]->parents;
parent; parent = parent->next) parent; parent = parent->next)

View File

@ -7,6 +7,7 @@
#include "cache.h" #include "cache.h"
#define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH" #define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH"
#define GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD "GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD"
struct commit; struct commit;

View File

@ -89,6 +89,12 @@ static inline int repo_parse_commit(struct repository *r, struct commit *item)
{ {
return repo_parse_commit_gently(r, item, 0); return repo_parse_commit_gently(r, item, 0);
} }
static inline int parse_commit_no_graph(struct commit *commit)
{
return repo_parse_commit_internal(the_repository, commit, 0, 0);
}
#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS #ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS
#define parse_commit_internal(item, quiet, use) repo_parse_commit_internal(the_repository, item, quiet, use) #define parse_commit_internal(item, quiet, use) repo_parse_commit_internal(the_repository, item, quiet, use)
#define parse_commit_gently(item, quiet) repo_parse_commit_gently(the_repository, item, quiet) #define parse_commit_gently(item, quiet) repo_parse_commit_gently(the_repository, item, quiet)

View File

@ -377,7 +377,13 @@ corrupt_graph_verify() {
test_must_fail git commit-graph verify 2>test_err && test_must_fail git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err && grep -v "^+" test_err >err &&
test_i18ngrep "$grepstr" err && test_i18ngrep "$grepstr" err &&
git status --short if test "$2" != "no-copy"
then
cp $objdir/info/commit-graph commit-graph-pre-write-test
fi &&
git status --short &&
GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD=true git commit-graph write &&
git commit-graph verify
} }
# usage: corrupt_graph_and_verify <position> <data> <string> [<zero_pos>] # usage: corrupt_graph_and_verify <position> <data> <string> [<zero_pos>]
@ -403,7 +409,7 @@ corrupt_graph_and_verify() {
test_expect_success POSIXPERM,SANITY 'detect permission problem' ' test_expect_success POSIXPERM,SANITY 'detect permission problem' '
corrupt_graph_setup && corrupt_graph_setup &&
chmod 000 $objdir/info/commit-graph && chmod 000 $objdir/info/commit-graph &&
corrupt_graph_verify "Could not open" corrupt_graph_verify "Could not open" "no-copy"
' '
test_expect_success 'detect too small' ' test_expect_success 'detect too small' '
@ -522,6 +528,7 @@ test_expect_success 'git fsck (checks commit-graph)' '
git fsck && git fsck &&
corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \ corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
"incorrect checksum" && "incorrect checksum" &&
cp commit-graph-pre-write-test $objdir/info/commit-graph &&
test_must_fail git fsck test_must_fail git fsck
' '