git-commit-vandalism/repository.h

242 lines
6.4 KiB
C
Raw Normal View History

#ifndef REPOSITORY_H
#define REPOSITORY_H
#include "path.h"
struct config_set;
fsmonitor: config settings are repository-specific Move fsmonitor config settings to a new and opaque `struct fsmonitor_settings` structure. Add a lazily-loaded pointer to this into `struct repo_settings` Create an `enum fsmonitor_mode` type in `struct fsmonitor_settings` to represent the state of fsmonitor. This lets us represent which, if any, fsmonitor provider (hook or IPC) is enabled. Create `fsm_settings__get_*()` getters to lazily look up fsmonitor- related config settings. Get rid of the `core_fsmonitor` global variable. Move the code to lookup the existing `core.fsmonitor` config value into the fsmonitor settings. Create a hook pathname variable in `struct fsmonitor-settings` and only set it when in hook mode. Extend the definition of `core.fsmonitor` to be either a boolean or a hook pathname. When true, the builtin FSMonitor is used. When false or unset, no FSMonitor (neither builtin nor hook) is used. The existing `core_fsmonitor` global variable was used to store the pathname to the fsmonitor hook *and* it was used as a boolean to see if fsmonitor was enabled. This dual usage and global visibility leads to confusion when we add the IPC-based provider. So lets hide the details in fsmonitor-settings.c and let it decide which provider to use in the case of multiple settings. This avoids cluttering up repo-settings.c with these private details. A future commit in builtin-fsmonitor series will add the ability to disqualify worktrees for various reasons, such as being mounted from a remote volume, where fsmonitor should not be started. Having the config settings hidden in fsmonitor-settings.c allows such worktree restrictions to override the config values used. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-25 19:02:46 +01:00
struct fsmonitor_settings;
struct git_hash_algo;
struct index_state;
struct lock_file;
struct pathspec;
struct raw_object_store;
struct submodule_cache;
struct promisor_remote_config;
struct remote_state;
enum untracked_cache_setting {
repo-settings.c: simplify the setup Simplify the setup code in repo-settings.c in various ways, making the code shorter, easier to read, and requiring fewer hacks to do the same thing as it did before: Since 7211b9e7534 (repo-settings: consolidate some config settings, 2019-08-13) we have memset() the whole "settings" structure to -1 in prepare_repo_settings(), and subsequently relied on the -1 value. Most of the fields did not need to be initialized to -1, and because we were doing that we had the enum labels "UNTRACKED_CACHE_UNSET" and "FETCH_NEGOTIATION_UNSET" purely to reflect the resulting state created this memset() in prepare_repo_settings(). No other code used or relied on them, more on that below. For the rest most of the subsequent "are we -1, then read xyz" can simply be removed by re-arranging what we read first. E.g. when setting the "index.version" setting we should have first read "feature.experimental", so that it (and "feature.manyfiles") can provide a default for our "index.version". Instead the code setting it, added when "feature.manyFiles"[1] was created, was using the UPDATE_DEFAULT_BOOL() macro added in an earlier commit[2]. That macro is now gone, since it was only needed for this pattern of reading things in the wrong order. This also fixes an (admittedly obscure) logic error where we'd conflate an explicit "-1" value in the config with our own earlier memset() -1. We can also remove the UPDATE_DEFAULT_BOOL() wrapper added in [3]. Using it is redundant to simply using the return value from repo_config_get_bool(), which is non-zero if the provided key exists in the config. Details on edge cases relating to the memset() to -1, continued from "more on that below" above: * UNTRACKED_CACHE_KEEP: In [4] the "unset" and "keep" handling for core.untrackedCache was consolidated. But it while we understand the "keep" value, we don't handle it differently than the case of any other unknown value. So let's retain UNTRACKED_CACHE_KEEP and remove the UNTRACKED_CACHE_UNSET setting (which was always implicitly UNTRACKED_CACHE_KEEP before). We don't need to inform any code after prepare_repo_settings() that the setting was "unset", as far as anyone else is concerned it's core.untrackedCache=keep. if "core.untrackedcache" isn't present in the config. * FETCH_NEGOTIATION_UNSET & FETCH_NEGOTIATION_NONE: Since these two two enum fields added in [5] don't rely on the memzero() setting them to "-1" anymore we don't have to provide them with explicit values. 1. c6cc4c5afd2 (repo-settings: create feature.manyFiles setting, 2019-08-13) 2. 31b1de6a09b (commit-graph: turn on commit-graph by default, 2019-08-13) 3. 31b1de6a09b (commit-graph: turn on commit-graph by default, 2019-08-13) 4. ad0fb659993 (repo-settings: parse core.untrackedCache, 2019-08-13) 5. aaf633c2ad1 (repo-settings: create feature.experimental setting, 2019-08-13) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-21 15:13:02 +02:00
UNTRACKED_CACHE_KEEP,
UNTRACKED_CACHE_REMOVE,
UNTRACKED_CACHE_WRITE,
};
enum fetch_negotiation_setting {
FETCH_NEGOTIATION_CONSECUTIVE,
repo-settings.c: simplify the setup Simplify the setup code in repo-settings.c in various ways, making the code shorter, easier to read, and requiring fewer hacks to do the same thing as it did before: Since 7211b9e7534 (repo-settings: consolidate some config settings, 2019-08-13) we have memset() the whole "settings" structure to -1 in prepare_repo_settings(), and subsequently relied on the -1 value. Most of the fields did not need to be initialized to -1, and because we were doing that we had the enum labels "UNTRACKED_CACHE_UNSET" and "FETCH_NEGOTIATION_UNSET" purely to reflect the resulting state created this memset() in prepare_repo_settings(). No other code used or relied on them, more on that below. For the rest most of the subsequent "are we -1, then read xyz" can simply be removed by re-arranging what we read first. E.g. when setting the "index.version" setting we should have first read "feature.experimental", so that it (and "feature.manyfiles") can provide a default for our "index.version". Instead the code setting it, added when "feature.manyFiles"[1] was created, was using the UPDATE_DEFAULT_BOOL() macro added in an earlier commit[2]. That macro is now gone, since it was only needed for this pattern of reading things in the wrong order. This also fixes an (admittedly obscure) logic error where we'd conflate an explicit "-1" value in the config with our own earlier memset() -1. We can also remove the UPDATE_DEFAULT_BOOL() wrapper added in [3]. Using it is redundant to simply using the return value from repo_config_get_bool(), which is non-zero if the provided key exists in the config. Details on edge cases relating to the memset() to -1, continued from "more on that below" above: * UNTRACKED_CACHE_KEEP: In [4] the "unset" and "keep" handling for core.untrackedCache was consolidated. But it while we understand the "keep" value, we don't handle it differently than the case of any other unknown value. So let's retain UNTRACKED_CACHE_KEEP and remove the UNTRACKED_CACHE_UNSET setting (which was always implicitly UNTRACKED_CACHE_KEEP before). We don't need to inform any code after prepare_repo_settings() that the setting was "unset", as far as anyone else is concerned it's core.untrackedCache=keep. if "core.untrackedcache" isn't present in the config. * FETCH_NEGOTIATION_UNSET & FETCH_NEGOTIATION_NONE: Since these two two enum fields added in [5] don't rely on the memzero() setting them to "-1" anymore we don't have to provide them with explicit values. 1. c6cc4c5afd2 (repo-settings: create feature.manyFiles setting, 2019-08-13) 2. 31b1de6a09b (commit-graph: turn on commit-graph by default, 2019-08-13) 3. 31b1de6a09b (commit-graph: turn on commit-graph by default, 2019-08-13) 4. ad0fb659993 (repo-settings: parse core.untrackedCache, 2019-08-13) 5. aaf633c2ad1 (repo-settings: create feature.experimental setting, 2019-08-13) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-21 15:13:02 +02:00
FETCH_NEGOTIATION_SKIPPING,
FETCH_NEGOTIATION_NOOP,
};
struct repo_settings {
int initialized;
int core_commit_graph;
commit-graph: pass repo_settings instead of repository The parse_commit_graph() function takes a 'struct repository *' pointer, but it only ever accesses config settings (either directly or through the .settings field of the repo struct). Move all relevant config settings into the repo_settings struct, and update parse_commit_graph() and its existing callers so that it takes 'struct repo_settings *' instead. Callers of parse_commit_graph() will now need to call prepare_repo_settings() themselves, or initialize a 'struct repo_settings' directly. Prior to ab14d0676c (commit-graph: pass a 'struct repository *' in more places, 2020-09-09), parsing a commit-graph was a pure function depending only on the contents of the commit-graph itself. Commit ab14d0676c introduced a dependency on a `struct repository` pointer, and later commits such as b66d84756f (commit-graph: respect 'commitGraph.readChangedPaths', 2020-09-09) added dependencies on config settings, which were accessed through the `settings` field of the repository pointer. This field was initialized via a call to `prepare_repo_settings()`. Additionally, this fixes an issue in fuzz-commit-graph: In 44c7e62 (2021-12-06, repo-settings:prepare_repo_settings only in git repos), prepare_repo_settings was changed to issue a BUG() if it is called by a process whose CWD is not a Git repository. The combination of commits mentioned above broke fuzz-commit-graph, which attempts to parse arbitrary fuzzing-engine-provided bytes as a commit graph file. Prior to this change, parse_commit_graph() called prepare_repo_settings(), but since we run the fuzz tests without a valid repository, we are hitting the BUG() from 44c7e62 for every test case. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 23:43:06 +02:00
int commit_graph_generation_version;
int commit_graph_read_changed_paths;
int gc_write_commit_graph;
int gc_cruft_packs;
int fetch_write_commit_graph;
repository.h: don't use a mix of int and bitfields Change the bitfield added in 58300f47432 (sparse-index: add index.sparse config option, 2021-03-30) and 3964fc2aae7 (sparse-index: add guard to ensure full index, 2021-03-30) to just use an "int" boolean instead. It might be smart to optimize the space here in the future, but by consistently using an "int" we can take its address and pass it to repo_cfg_bool(), and therefore don't need to handle "sparse_index" as a special-case when reading the "index.sparse" setting. There's no corresponding config for "command_requires_full_index", but let's change it too for consistency and to prevent future bugs creeping in due to one of these being "unsigned". Using "int" consistently also prevents subtle bugs or undesired control flow creeping in here. Before the preceding commit the initialization of "command_requires_full_index" in prepare_repo_settings() did nothing, i.e. this: r->settings.command_requires_full_index = 1 Was redundant to the earlier memset() to -1. Likewise for "sparse_index" added in 58300f47432 (sparse-index: add index.sparse config option, 2021-03-30) the code and comment added there was misleading, we weren't initializing it to off, but re-initializing it from "1" to "0", and then finally checking the config, and perhaps setting it to "1" again. I.e. we could have applied this patch before the preceding commit: + assert(r->settings.command_requires_full_index == 1); r->settings.command_requires_full_index = 1; /* * Initialize this as off. */ + assert(r->settings.sparse_index == 1); r->settings.sparse_index = 0; if (!repo_config_get_bool(r, "index.sparse", &value) && value) r->settings.sparse_index = 1; Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-21 15:13:03 +02:00
int command_requires_full_index;
int sparse_index;
fsmonitor: config settings are repository-specific Move fsmonitor config settings to a new and opaque `struct fsmonitor_settings` structure. Add a lazily-loaded pointer to this into `struct repo_settings` Create an `enum fsmonitor_mode` type in `struct fsmonitor_settings` to represent the state of fsmonitor. This lets us represent which, if any, fsmonitor provider (hook or IPC) is enabled. Create `fsm_settings__get_*()` getters to lazily look up fsmonitor- related config settings. Get rid of the `core_fsmonitor` global variable. Move the code to lookup the existing `core.fsmonitor` config value into the fsmonitor settings. Create a hook pathname variable in `struct fsmonitor-settings` and only set it when in hook mode. Extend the definition of `core.fsmonitor` to be either a boolean or a hook pathname. When true, the builtin FSMonitor is used. When false or unset, no FSMonitor (neither builtin nor hook) is used. The existing `core_fsmonitor` global variable was used to store the pathname to the fsmonitor hook *and* it was used as a boolean to see if fsmonitor was enabled. This dual usage and global visibility leads to confusion when we add the IPC-based provider. So lets hide the details in fsmonitor-settings.c and let it decide which provider to use in the case of multiple settings. This avoids cluttering up repo-settings.c with these private details. A future commit in builtin-fsmonitor series will add the ability to disqualify worktrees for various reasons, such as being mounted from a remote volume, where fsmonitor should not be started. Having the config settings hidden in fsmonitor-settings.c allows such worktree restrictions to override the config values used. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-25 19:02:46 +01:00
struct fsmonitor_settings *fsmonitor; /* lazily loaded */
int index_version;
int index_skip_hash;
enum untracked_cache_setting core_untracked_cache;
int pack_use_sparse;
enum fetch_negotiation_setting fetch_negotiation_algorithm;
int core_multi_pack_index;
};
struct repo_path_cache {
char *squash_msg;
char *merge_msg;
char *merge_rr;
char *merge_mode;
char *merge_head;
char *merge_autostash;
char *auto_merge;
char *fetch_head;
char *shallow;
};
struct repository {
/* Environment */
/*
* Path to the git directory.
* Cannot be NULL after initialization.
*/
char *gitdir;
/*
* Path to the common git directory.
* Cannot be NULL after initialization.
*/
char *commondir;
/*
* Holds any information related to accessing the raw object content.
*/
struct raw_object_store *objects;
/*
* All objects in this repository that have been parsed. This structure
* owns all objects it references, so users of "struct object *"
* generally do not need to free them; instead, when a repository is no
* longer used, call parsed_object_pool_clear() on this structure, which
* is called by the repositories repo_clear on its desconstruction.
*/
struct parsed_object_pool *parsed_objects;
/*
* The store in which the refs are held. This should generally only be
* accessed via get_main_ref_store(), as that will lazily initialize
* the ref object.
*/
struct ref_store *refs_private;
/*
* Contains path to often used file names.
*/
struct repo_path_cache cached_paths;
/*
* Path to the repository's graft file.
* Cannot be NULL after initialization.
*/
char *graft_file;
/*
* Path to the current worktree's index file.
* Cannot be NULL after initialization.
*/
char *index_file;
/*
* Path to the working directory.
* A NULL value indicates that there is no working directory.
*/
char *worktree;
/*
* Path from the root of the top-level superproject down to this
* repository. This is only non-NULL if the repository is initialized
* as a submodule of another repository.
*/
char *submodule_prefix;
struct repo_settings settings;
/* Subsystems */
/*
* Repository's config which contains key-value pairs from the usual
* set of config files (i.e. repo specific .git/config, user wide
* ~/.gitconfig, XDG config file and the global /etc/gitconfig)
*/
struct config_set *config;
/* Repository's submodule config as defined by '.gitmodules' */
struct submodule_cache *submodule_cache;
/*
* Repository's in-memory index.
* 'repo_read_index()' can be used to populate 'index'.
*/
struct index_state *index;
/* Repository's remotes and associated structures. */
struct remote_state *remote_state;
/* Repository's current hash algorithm, as serialized on disk. */
const struct git_hash_algo *hash_algo;
/* A unique-id for tracing purposes. */
int trace2_repo_id;
upload-pack: disable commit graph more gently for shallow traversal When the client has asked for certain shallow options like "deepen-since", we do a custom rev-list walk that pretends to be shallow. Before doing so, we have to disable the commit-graph, since it is not compatible with the shallow view of the repository. That's handled by 829a321569 (commit-graph: close_commit_graph before shallow walk, 2018-08-20). That commit literally closes and frees our repo->objects->commit_graph struct. That creates an interesting problem for commits that have _already_ been parsed using the commit graph. Their commit->object.parsed flag is set, their commit->graph_pos is set, but their commit->maybe_tree may still be NULL. When somebody later calls repo_get_commit_tree(), we see that we haven't loaded the tree oid yet and try to get it from the commit graph. But since it has been freed, we segfault! So the root of the issue is a data dependency between the commit's lazy-load of the tree oid and the fact that the commit graph can go away mid-process. How can we resolve it? There are a couple of general approaches: 1. The obvious answer is to avoid loading the tree from the graph when we see that it's NULL. But then what do we return for the tree oid? If we return NULL, our caller in do_traverse() will rightly complain that we have no tree. We'd have to fallback to loading the actual commit object and re-parsing it. That requires teaching parse_commit_buffer() to understand re-parsing (i.e., not starting from a clean slate and not leaking any allocated bits like parent list pointers). 2. When we close the commit graph, walk through the set of in-memory objects and clear any graph_pos pointers. But this means we also have to "unparse" any such commits so that we know they still need to open the commit object to fill in their trees. So it's no less complicated than (1), and is more expensive (since we clear objects we might not later need). 3. Stop freeing the commit-graph struct. Continue to let it be used for lazy-loads of tree oids, but let upload-pack specify that it shouldn't be used for further commit parsing. 4. Push the whole shallow rev-list out to its own sub-process, with the commit-graph disabled from the start, giving it a clean memory space to work from. I've chosen (3) here. Options (1) and (2) would work, but are non-trivial to implement. Option (4) is more expensive, and I'm not sure how complicated it is (shelling out for the actual rev-list part is easy, but we do then parse the resulting commits internally, and I'm not clear which parts need to be handling shallow-ness). The new test in t5500 triggers this segfault, but see the comments there for how horribly intimate it has to be with how both upload-pack and commit graphs work. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-12 16:44:45 +02:00
/* True if commit-graph has been disabled within this process. */
int commit_graph_disabled;
/* Configurations related to promisor remotes. */
char *repository_format_partial_clone;
struct promisor_remote_config *promisor_remote_config;
/* Configurations */
/* Indicate if a repository has a different 'commondir' from 'gitdir' */
unsigned different_commondir:1;
};
extern struct repository *the_repository;
/*
* Define a custom repository layout. Any field can be NULL, which
* will default back to the path according to the default layout.
*/
struct set_gitdir_args {
const char *commondir;
const char *object_dir;
const char *graft_file;
const char *index_file;
const char *alternate_db;
int disable_ref_updates;
};
void repo_set_gitdir(struct repository *repo, const char *root,
const struct set_gitdir_args *extra_args);
void repo_set_worktree(struct repository *repo, const char *path);
void repo_set_hash_algo(struct repository *repo, int algo);
void initialize_the_repository(void);
RESULT_MUST_BE_USED
int repo_init(struct repository *r, const char *gitdir, const char *worktree);
/*
* Initialize the repository 'subrepo' as the submodule at the given path. If
* the submodule's gitdir cannot be found at <path>/.git, this function calls
* submodule_from_path() to try to find it. treeish_name is only used if
* submodule_from_path() needs to be called; see its documentation for more
* information.
* Return 0 upon success and a non-zero value upon failure.
*/
struct object_id;
RESULT_MUST_BE_USED
int repo_submodule_init(struct repository *subrepo,
struct repository *superproject,
const char *path,
const struct object_id *treeish_name);
void repo_clear(struct repository *repo);
/*
* Populates the repository's index from its index_file, an index struct will
* be allocated if needed.
*
* Return the number of index entries in the populated index or a value less
* than zero if an error occurred. If the repository's index has already been
* populated then the number of entries will simply be returned.
*/
int repo_read_index(struct repository *repo);
int repo_hold_locked_index(struct repository *repo,
struct lock_file *lf,
int flags);
int repo_read_index_preload(struct repository *,
const struct pathspec *pathspec,
unsigned refresh_flags);
int repo_read_index_unmerged(struct repository *);
/*
* Opportunistically update the index but do not complain if we can't.
* The lockfile is always committed or rolled back.
*/
void repo_update_index_if_able(struct repository *, struct lock_file *);
void prepare_repo_settings(struct repository *r);
/*
* Return 1 if upgrade repository format to target_version succeeded,
* 0 if no upgrade is necessary, and -1 when upgrade is not possible.
*/
int upgrade_repository_format(int target_version);
#endif /* REPOSITORY_H */