git-commit-vandalism/commit-graph.h
Jeff King 6abada1880 upload-pack: disable commit graph more gently for shallow traversal
When the client has asked for certain shallow options like
"deepen-since", we do a custom rev-list walk that pretends to be
shallow. Before doing so, we have to disable the commit-graph, since it
is not compatible with the shallow view of the repository. That's
handled by 829a321569 (commit-graph: close_commit_graph before shallow
walk, 2018-08-20). That commit literally closes and frees our
repo->objects->commit_graph struct.

That creates an interesting problem for commits that have _already_ been
parsed using the commit graph. Their commit->object.parsed flag is set,
their commit->graph_pos is set, but their commit->maybe_tree may still
be NULL. When somebody later calls repo_get_commit_tree(), we see that
we haven't loaded the tree oid yet and try to get it from the commit
graph. But since it has been freed, we segfault!

So the root of the issue is a data dependency between the commit's
lazy-load of the tree oid and the fact that the commit graph can go
away mid-process. How can we resolve it?

There are a couple of general approaches:

  1. The obvious answer is to avoid loading the tree from the graph when
     we see that it's NULL. But then what do we return for the tree oid?
     If we return NULL, our caller in do_traverse() will rightly
     complain that we have no tree. We'd have to fallback to loading the
     actual commit object and re-parsing it. That requires teaching
     parse_commit_buffer() to understand re-parsing (i.e., not starting
     from a clean slate and not leaking any allocated bits like parent
     list pointers).

  2. When we close the commit graph, walk through the set of in-memory
     objects and clear any graph_pos pointers. But this means we also
     have to "unparse" any such commits so that we know they still need
     to open the commit object to fill in their trees. So it's no less
     complicated than (1), and is more expensive (since we clear objects
     we might not later need).

  3. Stop freeing the commit-graph struct. Continue to let it be used
     for lazy-loads of tree oids, but let upload-pack specify that it
     shouldn't be used for further commit parsing.

  4. Push the whole shallow rev-list out to its own sub-process, with
     the commit-graph disabled from the start, giving it a clean memory
     space to work from.

I've chosen (3) here. Options (1) and (2) would work, but are
non-trivial to implement. Option (4) is more expensive, and I'm not sure
how complicated it is (shelling out for the actual rev-list part is
easy, but we do then parse the resulting commits internally, and I'm not
clear which parts need to be handling shallow-ness).

The new test in t5500 triggers this segfault, but see the comments there
for how horribly intimate it has to be with how both upload-pack and
commit graphs work.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-12 12:30:08 -07:00

117 lines
3.4 KiB
C

#ifndef COMMIT_GRAPH_H
#define COMMIT_GRAPH_H
#include "git-compat-util.h"
#include "repository.h"
#include "string-list.h"
#include "cache.h"
#define GIT_TEST_COMMIT_GRAPH "GIT_TEST_COMMIT_GRAPH"
#define GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD "GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD"
struct commit;
char *get_commit_graph_filename(const char *obj_dir);
int open_commit_graph(const char *graph_file, int *fd, struct stat *st);
/*
* Given a commit struct, try to fill the commit struct info, including:
* 1. tree object
* 2. date
* 3. parents.
*
* Returns 1 if and only if the commit was found in the packed graph.
*
* See parse_commit_buffer() for the fallback after this call.
*/
int parse_commit_in_graph(struct repository *r, struct commit *item);
/*
* It is possible that we loaded commit contents from the commit buffer,
* but we also want to ensure the commit-graph content is correctly
* checked and filled. Fill the graph_pos and generation members of
* the given commit.
*/
void load_commit_graph_info(struct repository *r, struct commit *item);
struct tree *get_commit_tree_in_graph(struct repository *r,
const struct commit *c);
struct commit_graph {
int graph_fd;
const unsigned char *data;
size_t data_len;
unsigned char hash_len;
unsigned char num_chunks;
uint32_t num_commits;
struct object_id oid;
char *filename;
const char *obj_dir;
uint32_t num_commits_in_base;
struct commit_graph *base_graph;
const uint32_t *chunk_oid_fanout;
const unsigned char *chunk_oid_lookup;
const unsigned char *chunk_commit_data;
const unsigned char *chunk_extra_edges;
const unsigned char *chunk_base_graphs;
};
struct commit_graph *load_commit_graph_one_fd_st(int fd, struct stat *st);
struct commit_graph *read_commit_graph_one(struct repository *r, const char *obj_dir);
struct commit_graph *parse_commit_graph(void *graph_map, int fd,
size_t graph_size);
/*
* Return 1 if and only if the repository has a commit-graph
* file and generation numbers are computed in that file.
*/
int generation_numbers_enabled(struct repository *r);
enum commit_graph_write_flags {
COMMIT_GRAPH_WRITE_APPEND = (1 << 0),
COMMIT_GRAPH_WRITE_PROGRESS = (1 << 1),
COMMIT_GRAPH_WRITE_SPLIT = (1 << 2),
/* Make sure that each OID in the input is a valid commit OID. */
COMMIT_GRAPH_WRITE_CHECK_OIDS = (1 << 3)
};
struct split_commit_graph_opts {
int size_multiple;
int max_commits;
timestamp_t expire_time;
};
/*
* The write_commit_graph* methods return zero on success
* and a negative value on failure. Note that if the repository
* is not compatible with the commit-graph feature, then the
* methods will return 0 without writing a commit-graph.
*/
int write_commit_graph_reachable(const char *obj_dir,
enum commit_graph_write_flags flags,
const struct split_commit_graph_opts *split_opts);
int write_commit_graph(const char *obj_dir,
struct string_list *pack_indexes,
struct string_list *commit_hex,
enum commit_graph_write_flags flags,
const struct split_commit_graph_opts *split_opts);
#define COMMIT_GRAPH_VERIFY_SHALLOW (1 << 0)
int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags);
void close_commit_graph(struct raw_object_store *);
void free_commit_graph(struct commit_graph *);
/*
* Disable further use of the commit graph in this process when parsing a
* "struct commit".
*/
void disable_commit_graph(struct repository *r);
#endif