git-commit-vandalism/setup.c

1631 lines
44 KiB
C
Raw Normal View History

#include "cache.h"
#include "repository.h"
#include "config.h"
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
#include "dir.h"
#include "string-list.h"
set_work_tree: use chdir_notify When we change to the top of the working tree, we manually re-adjust $GIT_DIR and call set_git_dir() again, in order to update any relative git-dir we'd compute earlier. Instead of the work-tree code having to know to call the git-dir code, let's use the new chdir_notify interface. There are two spots that need updating, with a few subtleties in each: 1. the set_git_dir() code needs to chdir_notify_register() so it can be told when to update its path. Technically we could push this down into repo_set_gitdir(), so that even repository structs besides the_repository could benefit from this. But that opens up a lot of complications: - we'd still need to touch set_git_dir(), because it does some other setup (like setting $GIT_DIR in the environment) - submodules using other repository structs get cleaned up, which means we'd need to remove them from the chdir_notify list - it's unlikely to fix any bugs, since we shouldn't generally chdir() in the middle of working on a submodule 2. setup_work_tree now needs to call chdir_notify(), and can lose its manual set_git_dir() call. Note that at first glance it looks like this undoes the absolute-to-relative optimization added by 044bbbcb63 (Make git_dir a path relative to work_tree in setup_work_tree(), 2008-06-19). But for the most part that optimization was just _undoing_ the relative-to-absolute conversion which the function was doing earlier (and which is now gone). It is true that if you already have an absolute git_dir that the setup_work_tree() function will no longer make it relative as a side effect. But: - we generally do have relative git-dir's due to the way the discovery code works - if we really care about making git-dir's relative when possible, then we should be relativizing them earlier (e.g., when we see an absolute $GIT_DIR we could turn it relative, whether we are going to chdir into a worktree or not). That would cover all cases, including ones that 044bbbcb63 did not. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-30 20:35:08 +02:00
#include "chdir-notify.h"
#include "promisor-remote.h"
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
#include "quote.h"
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
static int inside_git_dir = -1;
static int inside_work_tree = -1;
setup_git_directory: delay core.bare/core.worktree errors If both core.bare and core.worktree are set, we complain about the bogus config and die. Dying is good, because it avoids commands running and doing damage in a potentially incorrect setup. But dying _there_ is bad, because it means that commands which do not even care about the work tree cannot run. This can make repairing the situation harder: [setup] $ git config core.bare true $ git config core.worktree /some/path [OK, expected.] $ git status fatal: core.bare and core.worktree do not make sense [Hrm...] $ git config --unset core.worktree fatal: core.bare and core.worktree do not make sense [Nope...] $ git config --edit fatal: core.bare and core.worktree do not make sense [Gaaah.] $ git help config fatal: core.bare and core.worktree do not make sense Instead, let's issue a warning about the bogus config when we notice it (i.e., for all commands), but only die when the command tries to use the work tree (by calling setup_work_tree). So we now get: $ git status warning: core.bare and core.worktree do not make sense fatal: unable to set up work tree using invalid config $ git config --unset core.worktree warning: core.bare and core.worktree do not make sense We have to update t1510 to accomodate this; it uses symbolic-ref to check whether the configuration works or not, but of course that command does not use the working tree. Instead, we switch it to use `git status`, as it requires a work-tree, does not need any special setup, and is read-only (so a failure will not adversely affect further tests). In addition, we add a new test that checks the desired behavior (i.e., that running "git config" with the bogus config does in fact work). Reported-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-29 08:49:10 +02:00
static int work_tree_config_is_bogus;
setup: make startup_info available everywhere Commit a60645f (setup: remember whether repository was found, 2010-08-05) introduced the startup_info structure, which records some parts of the setup_git_directory() process (notably, whether we actually found a repository or not). One of the uses of this data is for functions to behave appropriately based on whether we are in a repo. But the startup_info struct is just a pointer to storage provided by the main program, and the only program that sets it up is the git.c wrapper. Thus builtins have access to startup_info, but externally linked programs do not. Worse, library code which is accessible from both has to be careful about accessing startup_info. This can be used to trigger a die("BUG") via get_sha1(): $ git fast-import <<-\EOF tag foo from HEAD:./whatever EOF fatal: BUG: startup_info struct is not initialized. Obviously that's fairly nonsensical input to feed to fast-import, but we should never hit a die("BUG"). And there may be other ways to trigger it if other non-builtins resolve sha1s. So let's point the storage for startup_info to a static variable in setup.c, making it available to all users of the library code. We _could_ turn startup_info into a regular extern struct, but doing so would mean tweaking all of the existing use sites. So let's leave the pointer indirection in place. We can, however, drop any checks for NULL, as they will always be false (and likewise, we can drop the test covering this case, which was a rather artificial situation using one of the test-* programs). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-05 23:10:27 +01:00
static struct startup_info the_startup_info;
struct startup_info *startup_info = &the_startup_info;
const char *tmp_original_cwd;
setup: make startup_info available everywhere Commit a60645f (setup: remember whether repository was found, 2010-08-05) introduced the startup_info structure, which records some parts of the setup_git_directory() process (notably, whether we actually found a repository or not). One of the uses of this data is for functions to behave appropriately based on whether we are in a repo. But the startup_info struct is just a pointer to storage provided by the main program, and the only program that sets it up is the git.c wrapper. Thus builtins have access to startup_info, but externally linked programs do not. Worse, library code which is accessible from both has to be careful about accessing startup_info. This can be used to trigger a die("BUG") via get_sha1(): $ git fast-import <<-\EOF tag foo from HEAD:./whatever EOF fatal: BUG: startup_info struct is not initialized. Obviously that's fairly nonsensical input to feed to fast-import, but we should never hit a die("BUG"). And there may be other ways to trigger it if other non-builtins resolve sha1s. So let's point the storage for startup_info to a static variable in setup.c, making it available to all users of the library code. We _could_ turn startup_info into a regular extern struct, but doing so would mean tweaking all of the existing use sites. So let's leave the pointer indirection in place. We can, however, drop any checks for NULL, as they will always be false (and likewise, we can drop the test covering this case, which was a rather artificial situation using one of the test-* programs). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-05 23:10:27 +01:00
/*
* The input parameter must contain an absolute path, and it must already be
* normalized.
*
* Find the part of an absolute path that lies inside the work tree by
* dereferencing symlinks outside the work tree, for example:
* /dir1/repo/dir2/file (work tree is /dir1/repo) -> dir2/file
* /dir/file (work tree is /) -> dir/file
* /dir/symlink1/symlink2 (symlink1 points to work tree) -> symlink2
* /dir/repolink/file (repolink points to /dir/repo) -> file
* /dir/repo (exactly equal to work tree) -> (empty string)
*/
static int abspath_part_inside_repo(char *path)
{
size_t len;
size_t wtlen;
char *path0;
int off;
const char *work_tree = get_git_work_tree();
struct strbuf realpath = STRBUF_INIT;
if (!work_tree)
return -1;
wtlen = strlen(work_tree);
len = strlen(path);
off = offset_1st_component(path);
/* check if work tree is already the prefix */
if (wtlen <= len && !fspathncmp(path, work_tree, wtlen)) {
if (path[wtlen] == '/') {
memmove(path, path + wtlen + 1, len - wtlen);
return 0;
} else if (path[wtlen - 1] == '/' || path[wtlen] == '\0') {
/* work tree is the root, or the whole path */
memmove(path, path + wtlen, len - wtlen + 1);
return 0;
}
/* work tree might match beginning of a symlink to work tree */
off = wtlen;
}
path0 = path;
path += off;
/* check each '/'-terminated level */
while (*path) {
path++;
if (*path == '/') {
*path = '\0';
strbuf_realpath(&realpath, path0, 1);
if (fspathcmp(realpath.buf, work_tree) == 0) {
memmove(path0, path + 1, len - (path - path0));
strbuf_release(&realpath);
return 0;
}
*path = '/';
}
}
/* check whole path */
strbuf_realpath(&realpath, path0, 1);
if (fspathcmp(realpath.buf, work_tree) == 0) {
*path0 = '\0';
strbuf_release(&realpath);
return 0;
}
strbuf_release(&realpath);
return -1;
}
/*
* Normalize "path", prepending the "prefix" for relative paths. If
* remaining_prefix is not NULL, return the actual prefix still
* remains in the path. For example, prefix = sub1/sub2/ and path is
*
* foo -> sub1/sub2/foo (full prefix)
* ../foo -> sub1/foo (remaining prefix is sub1/)
* ../../bar -> bar (no remaining prefix)
* ../../sub1/sub2/foo -> sub1/sub2/foo (but no remaining prefix)
* `pwd`/../bar -> sub1/bar (no remaining prefix)
*/
char *prefix_path_gently(const char *prefix, int len,
int *remaining_prefix, const char *path)
setup: sanitize absolute and funny paths in get_pathspec() The prefix_path() function called from get_pathspec() is responsible for translating list of user-supplied pathspecs to list of pathspecs that is relative to the root of the work tree. When working inside a subdirectory, the user-supplied pathspecs are taken to be relative to the current subdirectory. Among special path components in pathspecs, we used to accept and interpret only "." ("the directory", meaning a no-op) and ".." ("up one level") at the beginning. Everything else was passed through as-is. For example, if you are in Documentation/ directory of the project, you can name Documentation/howto/maintain-git.txt as: howto/maintain-git.txt ../Documentation/howto/maitain-git.txt ../././Documentation/howto/maitain-git.txt but not as: howto/./maintain-git.txt $(pwd)/howto/maintain-git.txt This patch updates prefix_path() in several ways: - If the pathspec is not absolute, prefix (i.e. the current subdirectory relative to the root of the work tree, with terminating slash, if not empty) and the pathspec is concatenated first and used in the next step. Otherwise, that absolute pathspec is used in the next step. - Then special path components "." (no-op) and ".." (up one level) are interpreted to simplify the path. It is an error to have too many ".." to cause the intermediate result to step outside of the input to this step. - If the original pathspec was not absolute, the result from the previous step is the resulting "sanitized" pathspec. Otherwise, the result from the previous step is still absolute, and it is an error if it does not begin with the directory that corresponds to the root of the work tree. The directory is stripped away from the result and is returned. - In any case, the resulting pathspec in the array get_pathspec() returns omit the ones that caused errors. With this patch, the last two examples also behave as expected. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 07:44:27 +01:00
{
const char *orig = path;
char *sanitized;
if (is_absolute_path(orig)) {
sanitized = xmallocz(strlen(path));
if (remaining_prefix)
*remaining_prefix = 0;
if (normalize_path_copy_len(sanitized, path, remaining_prefix)) {
free(sanitized);
return NULL;
}
if (abspath_part_inside_repo(sanitized)) {
free(sanitized);
return NULL;
}
} else {
sanitized = xstrfmt("%.*s%s", len, len ? prefix : "", path);
if (remaining_prefix)
*remaining_prefix = len;
if (normalize_path_copy_len(sanitized, sanitized, remaining_prefix)) {
free(sanitized);
return NULL;
setup: sanitize absolute and funny paths in get_pathspec() The prefix_path() function called from get_pathspec() is responsible for translating list of user-supplied pathspecs to list of pathspecs that is relative to the root of the work tree. When working inside a subdirectory, the user-supplied pathspecs are taken to be relative to the current subdirectory. Among special path components in pathspecs, we used to accept and interpret only "." ("the directory", meaning a no-op) and ".." ("up one level") at the beginning. Everything else was passed through as-is. For example, if you are in Documentation/ directory of the project, you can name Documentation/howto/maintain-git.txt as: howto/maintain-git.txt ../Documentation/howto/maitain-git.txt ../././Documentation/howto/maitain-git.txt but not as: howto/./maintain-git.txt $(pwd)/howto/maintain-git.txt This patch updates prefix_path() in several ways: - If the pathspec is not absolute, prefix (i.e. the current subdirectory relative to the root of the work tree, with terminating slash, if not empty) and the pathspec is concatenated first and used in the next step. Otherwise, that absolute pathspec is used in the next step. - Then special path components "." (no-op) and ".." (up one level) are interpreted to simplify the path. It is an error to have too many ".." to cause the intermediate result to step outside of the input to this step. - If the original pathspec was not absolute, the result from the previous step is the resulting "sanitized" pathspec. Otherwise, the result from the previous step is still absolute, and it is an error if it does not begin with the directory that corresponds to the root of the work tree. The directory is stripped away from the result and is returned. - In any case, the resulting pathspec in the array get_pathspec() returns omit the ones that caused errors. With this patch, the last two examples also behave as expected. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-29 07:44:27 +01:00
}
}
return sanitized;
}
char *prefix_path(const char *prefix, int len, const char *path)
{
char *r = prefix_path_gently(prefix, len, NULL, path);
if (!r) {
const char *hint_path = get_git_work_tree();
if (!hint_path)
hint_path = get_git_dir();
die(_("'%s' is outside repository at '%s'"), path,
absolute_path(hint_path));
}
return r;
}
int path_inside_repo(const char *prefix, const char *path)
{
int len = prefix ? strlen(prefix) : 0;
char *r = prefix_path_gently(prefix, len, NULL, path);
if (r) {
free(r);
return 1;
}
return 0;
}
int check_filename(const char *prefix, const char *arg)
{
char *to_free = NULL;
struct stat st;
if (skip_prefix(arg, ":/", &arg)) {
if (!*arg) /* ":/" is root dir, always exists */
return 1;
prefix = NULL;
} else if (skip_prefix(arg, ":!", &arg) ||
skip_prefix(arg, ":^", &arg)) {
if (!*arg) /* excluding everything is silly, but allowed */
return 1;
}
if (prefix)
arg = to_free = prefix_filename(prefix, arg);
if (!lstat(arg, &st)) {
free(to_free);
return 1; /* file exists */
}
if (is_missing_file_error(errno)) {
free(to_free);
return 0; /* file does not exist */
}
die_errno(_("failed to stat '%s'"), arg);
}
static void NORETURN die_verify_filename(struct repository *r,
const char *prefix,
const char *arg,
int diagnose_misspelt_rev)
{
if (!diagnose_misspelt_rev)
die(_("%s: no such path in the working tree.\n"
"Use 'git <command> -- <path>...' to specify paths that do not exist locally."),
arg);
/*
* Saying "'(icase)foo' does not exist in the index" when the
* user gave us ":(icase)foo" is just stupid. A magic pathspec
* begins with a colon and is followed by a non-alnum; do not
* let maybe_die_on_misspelt_object_name() even trigger.
*/
if (!(arg[0] == ':' && !isalnum(arg[1])))
maybe_die_on_misspelt_object_name(r, arg, prefix);
/* ... or fall back the most general message. */
die(_("ambiguous argument '%s': unknown revision or path not in the working tree.\n"
"Use '--' to separate paths from revisions, like this:\n"
"'git <command> [<revision>...] -- [<file>...]'"), arg);
}
verify_filename(): treat ":(magic)" as a pathspec For commands that take revisions and pathspecs, magic pathspecs like ":(exclude)foo" require the user to specify a disambiguating "--", since they do not match a file in the filesystem, like: git grep foo -- :(exclude)bar This makes them more annoying to use than they need to be. We loosened the rules for wildcards in 28fcc0b71 (pathspec: avoid the need of "--" when wildcard is used, 2015-05-02). Let's do the same for pathspecs with long-form magic. We already handle the short-forms ":/" and ":^" specially in check_filename(), so we don't need to handle them here. And in fact, we could do the same with long-form magic, parsing out the actual filename and making sure it exists. But there are a few reasons not to do it that way: - the parsing gets much more complicated, and we'd want to hand it off to the pathspec code. But that code isn't ready to do this kind of speculative parsing (it's happy to die() when it sees a syntactically invalid pathspec). - not all pathspec magic maps to a filesystem path. E.g., :(attr) should be treated as a pathspec regardless of what is in the filesystem - we can be a bit looser with ":(" than with the short-form ":/", because it is much less likely to have a false positive. Whereas ":/" also means "search for a commit with this regex". Note that because the change is in verify_filename() and not in its helper check_filename(), this doesn't affect the verify_non_filename() case. I.e., if an item that matches our new rule doesn't resolve as an object, we may fallback to treating it as a pathspec (rather than complaining it doesn't exist). But if it does resolve (e.g., as a file in the index that starts with an open-paren), we won't then complain that it's also a valid pathspec. This matches the wildcard-exception behavior. And of course in either case, one can always insert the "--" to get more precise results. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 21:10:31 +02:00
/*
* Check for arguments that don't resolve as actual files,
* but which look sufficiently like pathspecs that we'll consider
* them such for the purposes of rev/pathspec DWIM parsing.
*/
static int looks_like_pathspec(const char *arg)
{
const char *p;
int escaped = 0;
/*
* Wildcard characters imply the user is looking to match pathspecs
* that aren't in the filesystem. Note that this doesn't include
* backslash even though it's a glob special; by itself it doesn't
* cause any increase in the match. Likewise ignore backslash-escaped
* wildcard characters.
*/
for (p = arg; *p; p++) {
if (escaped) {
escaped = 0;
} else if (is_glob_special(*p)) {
if (*p == '\\')
escaped = 1;
else
return 1;
}
}
verify_filename(): treat ":(magic)" as a pathspec For commands that take revisions and pathspecs, magic pathspecs like ":(exclude)foo" require the user to specify a disambiguating "--", since they do not match a file in the filesystem, like: git grep foo -- :(exclude)bar This makes them more annoying to use than they need to be. We loosened the rules for wildcards in 28fcc0b71 (pathspec: avoid the need of "--" when wildcard is used, 2015-05-02). Let's do the same for pathspecs with long-form magic. We already handle the short-forms ":/" and ":^" specially in check_filename(), so we don't need to handle them here. And in fact, we could do the same with long-form magic, parsing out the actual filename and making sure it exists. But there are a few reasons not to do it that way: - the parsing gets much more complicated, and we'd want to hand it off to the pathspec code. But that code isn't ready to do this kind of speculative parsing (it's happy to die() when it sees a syntactically invalid pathspec). - not all pathspec magic maps to a filesystem path. E.g., :(attr) should be treated as a pathspec regardless of what is in the filesystem - we can be a bit looser with ":(" than with the short-form ":/", because it is much less likely to have a false positive. Whereas ":/" also means "search for a commit with this regex". Note that because the change is in verify_filename() and not in its helper check_filename(), this doesn't affect the verify_non_filename() case. I.e., if an item that matches our new rule doesn't resolve as an object, we may fallback to treating it as a pathspec (rather than complaining it doesn't exist). But if it does resolve (e.g., as a file in the index that starts with an open-paren), we won't then complain that it's also a valid pathspec. This matches the wildcard-exception behavior. And of course in either case, one can always insert the "--" to get more precise results. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 21:10:31 +02:00
/* long-form pathspec magic */
if (starts_with(arg, ":("))
return 1;
return 0;
}
/*
* Verify a filename that we got as an argument for a pathspec
* entry. Note that a filename that begins with "-" never verifies
* as true, because even if such a filename were to exist, we want
* it to be preceded by the "--" marker (or we want the user to
* use a format like "./-filename")
*
* The "diagnose_misspelt_rev" is used to provide a user-friendly
* diagnosis when dying upon finding that "name" is not a pathname.
* If set to 1, the diagnosis will try to diagnose "name" as an
* invalid object name (e.g. HEAD:foo). If set to 0, the diagnosis
* will only complain about an inexisting file.
*
* This function is typically called to check that a "file or rev"
* argument is unambiguous. In this case, the caller will want
* diagnose_misspelt_rev == 1 when verifying the first non-rev
* argument (which could have been a revision), and
* diagnose_misspelt_rev == 0 for the next ones (because we already
* saw a filename, there's not ambiguity anymore).
*/
void verify_filename(const char *prefix,
const char *arg,
int diagnose_misspelt_rev)
{
if (*arg == '-')
die(_("option '%s' must come before non-option arguments"), arg);
if (looks_like_pathspec(arg) || check_filename(prefix, arg))
return;
die_verify_filename(the_repository, prefix, arg, diagnose_misspelt_rev);
}
/*
* Opposite of the above: the command line did not have -- marker
* and we parsed the arg as a refname. It should not be interpretable
* as a filename.
*/
void verify_non_filename(const char *prefix, const char *arg)
{
if (!is_inside_work_tree() || is_inside_git_dir())
return;
if (*arg == '-')
return; /* flag */
if (!check_filename(prefix, arg))
return;
die(_("ambiguous argument '%s': both revision and filename\n"
"Use '--' to separate paths from revisions, like this:\n"
"'git <command> [<revision>...] -- [<file>...]'"), arg);
}
int get_common_dir(struct strbuf *sb, const char *gitdir)
{
const char *git_env_common_dir = getenv(GIT_COMMON_DIR_ENVIRONMENT);
if (git_env_common_dir) {
strbuf_addstr(sb, git_env_common_dir);
return 1;
} else {
return get_common_dir_noenv(sb, gitdir);
}
}
int get_common_dir_noenv(struct strbuf *sb, const char *gitdir)
{
struct strbuf data = STRBUF_INIT;
struct strbuf path = STRBUF_INIT;
int ret = 0;
strbuf_addf(&path, "%s/commondir", gitdir);
if (file_exists(path.buf)) {
if (strbuf_read_file(&data, path.buf, 0) <= 0)
die_errno(_("failed to read %s"), path.buf);
while (data.len && (data.buf[data.len - 1] == '\n' ||
data.buf[data.len - 1] == '\r'))
data.len--;
data.buf[data.len] = '\0';
strbuf_reset(&path);
if (!is_absolute_path(data.buf))
strbuf_addf(&path, "%s/", gitdir);
strbuf_addbuf(&path, &data);
strbuf_add_real_path(sb, path.buf);
ret = 1;
} else {
strbuf_addstr(sb, gitdir);
}
strbuf_release(&data);
strbuf_release(&path);
return ret;
}
/*
* Test if it looks like we're at a git directory.
* We want to see:
*
* - either an objects/ directory _or_ the proper
* GIT_OBJECT_DIRECTORY environment variable
* - a refs/ directory
* - either a HEAD symlink or a HEAD file that is formatted as
* a proper "ref:", or a regular file HEAD that has a properly
* formatted sha1 object name.
*/
standardize and improve lookup rules for external local repos When you specify a local repository on the command line of clone, ls-remote, upload-pack, receive-pack, or upload-archive, or in a request to git-daemon, we perform a little bit of lookup magic, doing things like looking in working trees for .git directories and appending ".git" for bare repos. For clone, this magic happens in get_repo_path. For everything else, it happens in enter_repo. In both cases, there are some ambiguous or confusing cases that aren't handled well, and there is one case that is not handled the same by both methods. This patch tries to provide (and test!) standard, sensible lookup rules for both code paths. The intended changes are: 1. When looking up "foo", we have always preferred a working tree "foo" (containing "foo/.git" over the bare "foo.git". But we did not prefer a bare "foo" over "foo.git". With this patch, we do so. 2. We would select directories that existed but didn't actually look like git repositories. With this patch, we make sure a selected directory looks like a git repo. Not only is this more sensible in general, but it will help anybody who is negatively affected by change (1) negatively (e.g., if they had "foo.git" next to its separate work tree "foo", and expect to keep finding "foo.git" when they reference "foo"). 3. The enter_repo code path would, given "foo", look for "foo.git/.git" (i.e., do the ".git" append magic even for a repo with working tree). The clone code path did not; with this patch, they now behave the same. In the unlikely case of a working tree overlaying a bare repo (i.e., a ".git" directory _inside_ a bare repo), we continue to treat it as a working tree (prefering the "inner" .git over the bare repo). This is mainly because the combination seems nonsensical, and I'd rather stick with existing behavior on the off chance that somebody is relying on it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02 22:59:13 +01:00
int is_git_directory(const char *suspect)
{
struct strbuf path = STRBUF_INIT;
int ret = 0;
size_t len;
/* Check worktree-related signatures */
strbuf_addstr(&path, suspect);
strbuf_complete(&path, '/');
strbuf_addstr(&path, "HEAD");
if (validate_headref(path.buf))
goto done;
strbuf_reset(&path);
get_common_dir(&path, suspect);
len = path.len;
/* Check non-worktree-related signatures */
if (getenv(DB_ENVIRONMENT)) {
if (access(getenv(DB_ENVIRONMENT), X_OK))
goto done;
}
else {
strbuf_setlen(&path, len);
strbuf_addstr(&path, "/objects");
if (access(path.buf, X_OK))
goto done;
}
strbuf_setlen(&path, len);
strbuf_addstr(&path, "/refs");
if (access(path.buf, X_OK))
goto done;
ret = 1;
done:
strbuf_release(&path);
return ret;
}
int is_nonbare_repository_dir(struct strbuf *path)
{
int ret = 0;
int gitfile_error;
size_t orig_path_len = path->len;
assert(orig_path_len != 0);
strbuf_complete(path, '/');
strbuf_addstr(path, ".git");
if (read_gitfile_gently(path->buf, &gitfile_error) || is_git_directory(path->buf))
ret = 1;
if (gitfile_error == READ_GITFILE_ERR_OPEN_FAILED ||
gitfile_error == READ_GITFILE_ERR_READ_FAILED)
ret = 1;
strbuf_setlen(path, orig_path_len);
return ret;
}
int is_inside_git_dir(void)
{
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
if (inside_git_dir < 0)
inside_git_dir = is_inside_dir(get_git_dir());
return inside_git_dir;
introduce GIT_WORK_TREE to specify the work tree setup_gdg is used as abbreviation for setup_git_directory_gently. The work tree can be specified using the environment variable GIT_WORK_TREE and the config option core.worktree (the environment variable has precendence over the config option). Additionally there is a command line option --work-tree which sets the environment variable. setup_gdg does the following now: GIT_DIR unspecified repository in .git directory parent directory of the .git directory is used as work tree, GIT_WORK_TREE is ignored GIT_DIR unspecified repository in cwd GIT_DIR is set to cwd see the cases with GIT_DIR specified what happens next and also see the note below GIT_DIR specified GIT_WORK_TREE/core.worktree unspecified cwd is used as work tree GIT_DIR specified GIT_WORK_TREE/core.worktree specified the specified work tree is used Note on the case where GIT_DIR is unspecified and repository is in cwd: GIT_WORK_TREE is used but is_inside_git_dir is always true. I did it this way because setup_gdg might be called multiple times (e.g. when doing alias expansion) and in successive calls setup_gdg should do the same thing every time. Meaning of is_bare/is_inside_work_tree/is_inside_git_dir: (1) is_bare_repository A repository is bare if core.bare is true or core.bare is unspecified and the name suggests it is bare (directory not named .git). The bare option disables a few protective checks which are useful with a working tree. Currently this changes if a repository is bare: updates of HEAD are allowed git gc packs the refs the reflog is disabled by default (2) is_inside_work_tree True if the cwd is inside the associated working tree (if there is one), false otherwise. (3) is_inside_git_dir True if the cwd is inside the git directory, false otherwise. Before this patch is_inside_git_dir was always true for bare repositories. When setup_gdg finds a repository git_config(git_default_config) is always called. This ensure that is_bare_repository makes use of core.bare and does not guess even though core.bare is specified. inside_work_tree and inside_git_dir are set if setup_gdg finds a repository. The is_inside_work_tree and is_inside_git_dir functions will die if they are called before a successful call to setup_gdg. Signed-off-by: Matthias Lederhofer <matled@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-06 09:10:42 +02:00
}
int is_inside_work_tree(void)
{
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
if (inside_work_tree < 0)
inside_work_tree = is_inside_dir(get_git_work_tree());
return inside_work_tree;
introduce GIT_WORK_TREE to specify the work tree setup_gdg is used as abbreviation for setup_git_directory_gently. The work tree can be specified using the environment variable GIT_WORK_TREE and the config option core.worktree (the environment variable has precendence over the config option). Additionally there is a command line option --work-tree which sets the environment variable. setup_gdg does the following now: GIT_DIR unspecified repository in .git directory parent directory of the .git directory is used as work tree, GIT_WORK_TREE is ignored GIT_DIR unspecified repository in cwd GIT_DIR is set to cwd see the cases with GIT_DIR specified what happens next and also see the note below GIT_DIR specified GIT_WORK_TREE/core.worktree unspecified cwd is used as work tree GIT_DIR specified GIT_WORK_TREE/core.worktree specified the specified work tree is used Note on the case where GIT_DIR is unspecified and repository is in cwd: GIT_WORK_TREE is used but is_inside_git_dir is always true. I did it this way because setup_gdg might be called multiple times (e.g. when doing alias expansion) and in successive calls setup_gdg should do the same thing every time. Meaning of is_bare/is_inside_work_tree/is_inside_git_dir: (1) is_bare_repository A repository is bare if core.bare is true or core.bare is unspecified and the name suggests it is bare (directory not named .git). The bare option disables a few protective checks which are useful with a working tree. Currently this changes if a repository is bare: updates of HEAD are allowed git gc packs the refs the reflog is disabled by default (2) is_inside_work_tree True if the cwd is inside the associated working tree (if there is one), false otherwise. (3) is_inside_git_dir True if the cwd is inside the git directory, false otherwise. Before this patch is_inside_git_dir was always true for bare repositories. When setup_gdg finds a repository git_config(git_default_config) is always called. This ensure that is_bare_repository makes use of core.bare and does not guess even though core.bare is specified. inside_work_tree and inside_git_dir are set if setup_gdg finds a repository. The is_inside_work_tree and is_inside_git_dir functions will die if they are called before a successful call to setup_gdg. Signed-off-by: Matthias Lederhofer <matled@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-06 09:10:42 +02:00
}
void setup_work_tree(void)
{
set_work_tree: use chdir_notify When we change to the top of the working tree, we manually re-adjust $GIT_DIR and call set_git_dir() again, in order to update any relative git-dir we'd compute earlier. Instead of the work-tree code having to know to call the git-dir code, let's use the new chdir_notify interface. There are two spots that need updating, with a few subtleties in each: 1. the set_git_dir() code needs to chdir_notify_register() so it can be told when to update its path. Technically we could push this down into repo_set_gitdir(), so that even repository structs besides the_repository could benefit from this. But that opens up a lot of complications: - we'd still need to touch set_git_dir(), because it does some other setup (like setting $GIT_DIR in the environment) - submodules using other repository structs get cleaned up, which means we'd need to remove them from the chdir_notify list - it's unlikely to fix any bugs, since we shouldn't generally chdir() in the middle of working on a submodule 2. setup_work_tree now needs to call chdir_notify(), and can lose its manual set_git_dir() call. Note that at first glance it looks like this undoes the absolute-to-relative optimization added by 044bbbcb63 (Make git_dir a path relative to work_tree in setup_work_tree(), 2008-06-19). But for the most part that optimization was just _undoing_ the relative-to-absolute conversion which the function was doing earlier (and which is now gone). It is true that if you already have an absolute git_dir that the setup_work_tree() function will no longer make it relative as a side effect. But: - we generally do have relative git-dir's due to the way the discovery code works - if we really care about making git-dir's relative when possible, then we should be relativizing them earlier (e.g., when we see an absolute $GIT_DIR we could turn it relative, whether we are going to chdir into a worktree or not). That would cover all cases, including ones that 044bbbcb63 did not. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-30 20:35:08 +02:00
const char *work_tree;
static int initialized = 0;
if (initialized)
return;
setup_git_directory: delay core.bare/core.worktree errors If both core.bare and core.worktree are set, we complain about the bogus config and die. Dying is good, because it avoids commands running and doing damage in a potentially incorrect setup. But dying _there_ is bad, because it means that commands which do not even care about the work tree cannot run. This can make repairing the situation harder: [setup] $ git config core.bare true $ git config core.worktree /some/path [OK, expected.] $ git status fatal: core.bare and core.worktree do not make sense [Hrm...] $ git config --unset core.worktree fatal: core.bare and core.worktree do not make sense [Nope...] $ git config --edit fatal: core.bare and core.worktree do not make sense [Gaaah.] $ git help config fatal: core.bare and core.worktree do not make sense Instead, let's issue a warning about the bogus config when we notice it (i.e., for all commands), but only die when the command tries to use the work tree (by calling setup_work_tree). So we now get: $ git status warning: core.bare and core.worktree do not make sense fatal: unable to set up work tree using invalid config $ git config --unset core.worktree warning: core.bare and core.worktree do not make sense We have to update t1510 to accomodate this; it uses symbolic-ref to check whether the configuration works or not, but of course that command does not use the working tree. Instead, we switch it to use `git status`, as it requires a work-tree, does not need any special setup, and is read-only (so a failure will not adversely affect further tests). In addition, we add a new test that checks the desired behavior (i.e., that running "git config" with the bogus config does in fact work). Reported-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-29 08:49:10 +02:00
if (work_tree_config_is_bogus)
die(_("unable to set up work tree using invalid config"));
setup_git_directory: delay core.bare/core.worktree errors If both core.bare and core.worktree are set, we complain about the bogus config and die. Dying is good, because it avoids commands running and doing damage in a potentially incorrect setup. But dying _there_ is bad, because it means that commands which do not even care about the work tree cannot run. This can make repairing the situation harder: [setup] $ git config core.bare true $ git config core.worktree /some/path [OK, expected.] $ git status fatal: core.bare and core.worktree do not make sense [Hrm...] $ git config --unset core.worktree fatal: core.bare and core.worktree do not make sense [Nope...] $ git config --edit fatal: core.bare and core.worktree do not make sense [Gaaah.] $ git help config fatal: core.bare and core.worktree do not make sense Instead, let's issue a warning about the bogus config when we notice it (i.e., for all commands), but only die when the command tries to use the work tree (by calling setup_work_tree). So we now get: $ git status warning: core.bare and core.worktree do not make sense fatal: unable to set up work tree using invalid config $ git config --unset core.worktree warning: core.bare and core.worktree do not make sense We have to update t1510 to accomodate this; it uses symbolic-ref to check whether the configuration works or not, but of course that command does not use the working tree. Instead, we switch it to use `git status`, as it requires a work-tree, does not need any special setup, and is read-only (so a failure will not adversely affect further tests). In addition, we add a new test that checks the desired behavior (i.e., that running "git config" with the bogus config does in fact work). Reported-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-29 08:49:10 +02:00
work_tree = get_git_work_tree();
set_work_tree: use chdir_notify When we change to the top of the working tree, we manually re-adjust $GIT_DIR and call set_git_dir() again, in order to update any relative git-dir we'd compute earlier. Instead of the work-tree code having to know to call the git-dir code, let's use the new chdir_notify interface. There are two spots that need updating, with a few subtleties in each: 1. the set_git_dir() code needs to chdir_notify_register() so it can be told when to update its path. Technically we could push this down into repo_set_gitdir(), so that even repository structs besides the_repository could benefit from this. But that opens up a lot of complications: - we'd still need to touch set_git_dir(), because it does some other setup (like setting $GIT_DIR in the environment) - submodules using other repository structs get cleaned up, which means we'd need to remove them from the chdir_notify list - it's unlikely to fix any bugs, since we shouldn't generally chdir() in the middle of working on a submodule 2. setup_work_tree now needs to call chdir_notify(), and can lose its manual set_git_dir() call. Note that at first glance it looks like this undoes the absolute-to-relative optimization added by 044bbbcb63 (Make git_dir a path relative to work_tree in setup_work_tree(), 2008-06-19). But for the most part that optimization was just _undoing_ the relative-to-absolute conversion which the function was doing earlier (and which is now gone). It is true that if you already have an absolute git_dir that the setup_work_tree() function will no longer make it relative as a side effect. But: - we generally do have relative git-dir's due to the way the discovery code works - if we really care about making git-dir's relative when possible, then we should be relativizing them earlier (e.g., when we see an absolute $GIT_DIR we could turn it relative, whether we are going to chdir into a worktree or not). That would cover all cases, including ones that 044bbbcb63 did not. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-30 20:35:08 +02:00
if (!work_tree || chdir_notify(work_tree))
die(_("this operation must be run in a work tree"));
/*
* Make sure subsequent git processes find correct worktree
* if $GIT_WORK_TREE is set relative
*/
if (getenv(GIT_WORK_TREE_ENVIRONMENT))
setenv(GIT_WORK_TREE_ENVIRONMENT, ".", 1);
initialized = 1;
}
static void setup_original_cwd(void)
{
struct strbuf tmp = STRBUF_INIT;
const char *worktree = NULL;
int offset = -1;
if (!tmp_original_cwd)
return;
/*
* startup_info->original_cwd points to the current working
* directory we inherited from our parent process, which is a
* directory we want to avoid removing.
*
* For convience, we would like to have the path relative to the
* worktree instead of an absolute path.
*
* Yes, startup_info->original_cwd is usually the same as 'prefix',
* but differs in two ways:
* - prefix has a trailing '/'
* - if the user passes '-C' to git, that modifies the prefix but
* not startup_info->original_cwd.
*/
/* Normalize the directory */
strbuf_realpath(&tmp, tmp_original_cwd, 1);
free((char*)tmp_original_cwd);
tmp_original_cwd = NULL;
startup_info->original_cwd = strbuf_detach(&tmp, NULL);
/*
* Get our worktree; we only protect the current working directory
* if it's in the worktree.
*/
worktree = get_git_work_tree();
if (!worktree)
goto no_prevention_needed;
offset = dir_inside_of(startup_info->original_cwd, worktree);
if (offset >= 0) {
/*
* If startup_info->original_cwd == worktree, that is already
* protected and we don't need original_cwd as a secondary
* protection measure.
*/
if (!*(startup_info->original_cwd + offset))
goto no_prevention_needed;
/*
* original_cwd was inside worktree; precompose it just as
* we do prefix so that built up paths will match
*/
startup_info->original_cwd = \
precompose_string_if_needed(startup_info->original_cwd
+ offset);
return;
}
no_prevention_needed:
free((char*)startup_info->original_cwd);
startup_info->original_cwd = NULL;
}
worktree: add per-worktree config files A new repo extension is added, worktreeConfig. When it is present: - Repository config reading by default includes $GIT_DIR/config _and_ $GIT_DIR/config.worktree. "config" file remains shared in multiple worktree setup. - The special treatment for core.bare and core.worktree, to stay effective only in main worktree, is gone. These config settings are supposed to be in config.worktree. This extension is most useful in multiple worktree setup because you now have an option to store per-worktree config (which is either .git/config.worktree for main worktree, or .git/worktrees/xx/config.worktree for linked ones). This extension can be used in single worktree mode, even though it's pretty much useless (but this can happen after you remove all linked worktrees and move back to single worktree). "git config" reads from both "config" and "config.worktree" by default (i.e. without either --user, --file...) when this extension is present. Default writes still go to "config", not "config.worktree". A new option --worktree is added for that (*). Since a new repo extension is introduced, existing git binaries should refuse to access to the repo (both from main and linked worktrees). So they will not misread the config file (i.e. skip the config.worktree part). They may still accidentally write to the config file anyway if they use with "git config --file <path>". This design places a bet on the assumption that the majority of config variables are shared so it is the default mode. A safer move would be default writes go to per-worktree file, so that accidental changes are isolated. (*) "git config --worktree" points back to "config" file when this extension is not present and there is only one worktree so that it works in any both single and multiple worktree setups. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-21 16:02:28 +02:00
static int read_worktree_config(const char *var, const char *value, void *vdata)
{
struct repository_format *data = vdata;
if (strcmp(var, "core.bare") == 0) {
data->is_bare = git_config_bool(var, value);
} else if (strcmp(var, "core.worktree") == 0) {
if (!value)
return config_error_nonbool(var);
free(data->work_tree);
worktree: add per-worktree config files A new repo extension is added, worktreeConfig. When it is present: - Repository config reading by default includes $GIT_DIR/config _and_ $GIT_DIR/config.worktree. "config" file remains shared in multiple worktree setup. - The special treatment for core.bare and core.worktree, to stay effective only in main worktree, is gone. These config settings are supposed to be in config.worktree. This extension is most useful in multiple worktree setup because you now have an option to store per-worktree config (which is either .git/config.worktree for main worktree, or .git/worktrees/xx/config.worktree for linked ones). This extension can be used in single worktree mode, even though it's pretty much useless (but this can happen after you remove all linked worktrees and move back to single worktree). "git config" reads from both "config" and "config.worktree" by default (i.e. without either --user, --file...) when this extension is present. Default writes still go to "config", not "config.worktree". A new option --worktree is added for that (*). Since a new repo extension is introduced, existing git binaries should refuse to access to the repo (both from main and linked worktrees). So they will not misread the config file (i.e. skip the config.worktree part). They may still accidentally write to the config file anyway if they use with "git config --file <path>". This design places a bet on the assumption that the majority of config variables are shared so it is the default mode. A safer move would be default writes go to per-worktree file, so that accidental changes are isolated. (*) "git config --worktree" points back to "config" file when this extension is not present and there is only one worktree so that it works in any both single and multiple worktree setups. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-21 16:02:28 +02:00
data->work_tree = xstrdup(value);
}
return 0;
}
verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 14:25:13 +02:00
enum extension_result {
EXTENSION_ERROR = -1, /* compatible with error(), etc */
EXTENSION_UNKNOWN = 0,
EXTENSION_OK = 1
};
/*
* Do not add new extensions to this function. It handles extensions which are
* respected even in v0-format repositories for historical compatibility.
*/
static enum extension_result handle_extension_v0(const char *var,
const char *value,
const char *ext,
struct repository_format *data)
{
if (!strcmp(ext, "noop")) {
return EXTENSION_OK;
} else if (!strcmp(ext, "preciousobjects")) {
data->precious_objects = git_config_bool(var, value);
return EXTENSION_OK;
} else if (!strcmp(ext, "partialclone")) {
data->partial_clone = xstrdup(value);
return EXTENSION_OK;
} else if (!strcmp(ext, "worktreeconfig")) {
data->worktree_config = git_config_bool(var, value);
return EXTENSION_OK;
}
return EXTENSION_UNKNOWN;
}
/*
* Record any new extensions in this function.
*/
static enum extension_result handle_extension(const char *var,
const char *value,
const char *ext,
struct repository_format *data)
{
if (!strcmp(ext, "noop-v1")) {
return EXTENSION_OK;
} else if (!strcmp(ext, "objectformat")) {
int format;
verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 14:25:13 +02:00
if (!value)
return config_error_nonbool(var);
format = hash_algo_by_name(value);
if (format == GIT_HASH_UNKNOWN)
return error("invalid value for 'extensions.objectformat'");
data->hash_algo = format;
return EXTENSION_OK;
}
verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 14:25:13 +02:00
return EXTENSION_UNKNOWN;
}
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
static int check_repo_format(const char *var, const char *value, void *vdata)
{
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
struct repository_format *data = vdata;
introduce "extensions" form of core.repositoryformatversion Normally we try to avoid bumps of the whole-repository core.repositoryformatversion field. However, it is unavoidable if we want to safely change certain aspects of git in a backwards-incompatible way (e.g., modifying the set of ref tips that we must traverse to generate a list of unreachable, safe-to-prune objects). If we were to bump the repository version for every such change, then any implementation understanding version `X` would also have to understand `X-1`, `X-2`, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature). This patch documents the existing repositoryformatversion strategy and introduces a new format, "1", which lets a repository specify that it must run with an arbitrary set of extensions. This can be used, for example: - to inform git that the objects should not be pruned based only on the reachability of the ref tips (e.g, because it has "clone --shared" children) - that the refs are stored in a format besides the usual "refs" and "packed-refs" directories Because we bump to format "1", and because format "1" requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats. For example, if the user chooses to use database storage for refs, they may set the "extensions.refbackend" config to "db". Older versions of git will not understand format "1" and bail. Versions of git which understand "1" but do not know about "refbackend", or which know about "refbackend" but not about the "db" backend, will refuse to run. This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read. Note that we are only defining the rules for format 1 here. We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 12:53:58 +02:00
const char *ext;
if (strcmp(var, "core.repositoryformatversion") == 0)
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
data->version = git_config_int(var, value);
introduce "extensions" form of core.repositoryformatversion Normally we try to avoid bumps of the whole-repository core.repositoryformatversion field. However, it is unavoidable if we want to safely change certain aspects of git in a backwards-incompatible way (e.g., modifying the set of ref tips that we must traverse to generate a list of unreachable, safe-to-prune objects). If we were to bump the repository version for every such change, then any implementation understanding version `X` would also have to understand `X-1`, `X-2`, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature). This patch documents the existing repositoryformatversion strategy and introduces a new format, "1", which lets a repository specify that it must run with an arbitrary set of extensions. This can be used, for example: - to inform git that the objects should not be pruned based only on the reachability of the ref tips (e.g, because it has "clone --shared" children) - that the refs are stored in a format besides the usual "refs" and "packed-refs" directories Because we bump to format "1", and because format "1" requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats. For example, if the user chooses to use database storage for refs, they may set the "extensions.refbackend" config to "db". Older versions of git will not understand format "1" and bail. Versions of git which understand "1" but do not know about "refbackend", or which know about "refbackend" but not about the "db" backend, will refuse to run. This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read. Note that we are only defining the rules for format 1 here. We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 12:53:58 +02:00
else if (skip_prefix(var, "extensions.", &ext)) {
verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 14:25:13 +02:00
switch (handle_extension_v0(var, value, ext, data)) {
case EXTENSION_ERROR:
return -1;
case EXTENSION_OK:
return 0;
case EXTENSION_UNKNOWN:
break;
}
switch (handle_extension(var, value, ext, data)) {
case EXTENSION_ERROR:
return -1;
case EXTENSION_OK:
string_list_append(&data->v1_only_extensions, ext);
return 0;
case EXTENSION_UNKNOWN:
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
string_list_append(&data->unknown_extensions, ext);
verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 14:25:13 +02:00
return 0;
}
introduce "extensions" form of core.repositoryformatversion Normally we try to avoid bumps of the whole-repository core.repositoryformatversion field. However, it is unavoidable if we want to safely change certain aspects of git in a backwards-incompatible way (e.g., modifying the set of ref tips that we must traverse to generate a list of unreachable, safe-to-prune objects). If we were to bump the repository version for every such change, then any implementation understanding version `X` would also have to understand `X-1`, `X-2`, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature). This patch documents the existing repositoryformatversion strategy and introduces a new format, "1", which lets a repository specify that it must run with an arbitrary set of extensions. This can be used, for example: - to inform git that the objects should not be pruned based only on the reachability of the ref tips (e.g, because it has "clone --shared" children) - that the refs are stored in a format besides the usual "refs" and "packed-refs" directories Because we bump to format "1", and because format "1" requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats. For example, if the user chooses to use database storage for refs, they may set the "extensions.refbackend" config to "db". Older versions of git will not understand format "1" and bail. Versions of git which understand "1" but do not know about "refbackend", or which know about "refbackend" but not about the "db" backend, will refuse to run. This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read. Note that we are only defining the rules for format 1 here. We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 12:53:58 +02:00
}
worktree: add per-worktree config files A new repo extension is added, worktreeConfig. When it is present: - Repository config reading by default includes $GIT_DIR/config _and_ $GIT_DIR/config.worktree. "config" file remains shared in multiple worktree setup. - The special treatment for core.bare and core.worktree, to stay effective only in main worktree, is gone. These config settings are supposed to be in config.worktree. This extension is most useful in multiple worktree setup because you now have an option to store per-worktree config (which is either .git/config.worktree for main worktree, or .git/worktrees/xx/config.worktree for linked ones). This extension can be used in single worktree mode, even though it's pretty much useless (but this can happen after you remove all linked worktrees and move back to single worktree). "git config" reads from both "config" and "config.worktree" by default (i.e. without either --user, --file...) when this extension is present. Default writes still go to "config", not "config.worktree". A new option --worktree is added for that (*). Since a new repo extension is introduced, existing git binaries should refuse to access to the repo (both from main and linked worktrees). So they will not misread the config file (i.e. skip the config.worktree part). They may still accidentally write to the config file anyway if they use with "git config --file <path>". This design places a bet on the assumption that the majority of config variables are shared so it is the default mode. A safer move would be default writes go to per-worktree file, so that accidental changes are isolated. (*) "git config --worktree" points back to "config" file when this extension is not present and there is only one worktree so that it works in any both single and multiple worktree setups. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-21 16:02:28 +02:00
return read_worktree_config(var, value, vdata);
}
static int check_repository_format_gently(const char *gitdir, struct repository_format *candidate, int *nongit_ok)
{
struct strbuf sb = STRBUF_INIT;
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
struct strbuf err = STRBUF_INIT;
int has_common;
introduce "extensions" form of core.repositoryformatversion Normally we try to avoid bumps of the whole-repository core.repositoryformatversion field. However, it is unavoidable if we want to safely change certain aspects of git in a backwards-incompatible way (e.g., modifying the set of ref tips that we must traverse to generate a list of unreachable, safe-to-prune objects). If we were to bump the repository version for every such change, then any implementation understanding version `X` would also have to understand `X-1`, `X-2`, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature). This patch documents the existing repositoryformatversion strategy and introduces a new format, "1", which lets a repository specify that it must run with an arbitrary set of extensions. This can be used, for example: - to inform git that the objects should not be pruned based only on the reachability of the ref tips (e.g, because it has "clone --shared" children) - that the refs are stored in a format besides the usual "refs" and "packed-refs" directories Because we bump to format "1", and because format "1" requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats. For example, if the user chooses to use database storage for refs, they may set the "extensions.refbackend" config to "db". Older versions of git will not understand format "1" and bail. Versions of git which understand "1" but do not know about "refbackend", or which know about "refbackend" but not about the "db" backend, will refuse to run. This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read. Note that we are only defining the rules for format 1 here. We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 12:53:58 +02:00
has_common = get_common_dir(&sb, gitdir);
strbuf_addstr(&sb, "/config");
read_repository_format(candidate, sb.buf);
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
strbuf_release(&sb);
/*
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
* For historical use of check_repository_format() in git-init,
* we treat a missing config as a silent "ok", even when nongit_ok
* is unset.
*/
if (candidate->version < 0)
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
return 0;
if (verify_repository_format(candidate, &err) < 0) {
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
if (nongit_ok) {
warning("%s", err.buf);
strbuf_release(&err);
*nongit_ok = -1;
return -1;
}
die("%s", err.buf);
}
Revert "check_repository_format_gently(): refuse extensions for old repositories" This reverts commit 14c7fa269e42df4133edd9ae7763b678ed6594cd. The core.repositoryFormatVersion field was introduced in ab9cb76f661 (Repository format version check., 2005-11-25), providing a welcome bit of forward compatibility, thanks to some welcome analysis by Martin Atukunda. The semantics are simple: a repository with core.repositoryFormatVersion set to 0 should be comprehensible by all Git implementations in active use; and Git implementations should error out early instead of trying to act on Git repositories with higher core.repositoryFormatVersion values representing new formats that they do not understand. A new repository format did not need to be defined until 00a09d57eb8 (introduce "extensions" form of core.repositoryformatversion, 2015-06-23). This provided a finer-grained extension mechanism for Git repositories. In a repository with core.repositoryFormatVersion set to 1, Git implementations can act on "extensions.*" settings that modify how a repository is interpreted. In repository format version 1, unrecognized extensions settings cause Git to error out. What happens if a user sets an extension setting but forgets to increase the repository format version to 1? The extension settings were still recognized in that case; worse, unrecognized extensions settings do *not* cause Git to error out. So combining repository format version 0 with extensions settings produces in some sense the worst of both worlds. To improve that situation, since 14c7fa269e4 (check_repository_format_gently(): refuse extensions for old repositories, 2020-06-05) Git instead ignores extensions in v0 mode. This way, v0 repositories get the historical (pre-2015) behavior and maintain compatibility with Git implementations that do not know about the v1 format. Unfortunately, users had been using this sort of configuration and this behavior change came to many as a surprise: - users of "git config --worktree" that had followed its advice to enable extensions.worktreeConfig (without also increasing the repository format version) would find their worktree configuration no longer taking effect - tools such as copybara[*] that had set extensions.partialClone in existing repositories (without also increasing the repository format version) would find that setting no longer taking effect The behavior introduced in 14c7fa269e4 might be a good behavior if we were traveling back in time to 2015, but we're far too late. For some reason I thought that it was what had been originally implemented and that it had regressed. Apologies for not doing my research when 14c7fa269e4 was under development. Let's return to the behavior we've had since 2015: always act on extensions.* settings, regardless of repository format version. While we're here, include some tests to describe the effect on the "upgrade repository version" code path. [*] https://github.com/google/copybara/commit/ca76c0b1e13c4e36448d12c2aba4a5d9d98fb6e7 Reported-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 08:24:29 +02:00
repository_format_precious_objects = candidate->precious_objects;
repository_format_worktree_config = candidate->worktree_config;
string_list_clear(&candidate->unknown_extensions, 0);
verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 14:25:13 +02:00
string_list_clear(&candidate->v1_only_extensions, 0);
worktree: add per-worktree config files A new repo extension is added, worktreeConfig. When it is present: - Repository config reading by default includes $GIT_DIR/config _and_ $GIT_DIR/config.worktree. "config" file remains shared in multiple worktree setup. - The special treatment for core.bare and core.worktree, to stay effective only in main worktree, is gone. These config settings are supposed to be in config.worktree. This extension is most useful in multiple worktree setup because you now have an option to store per-worktree config (which is either .git/config.worktree for main worktree, or .git/worktrees/xx/config.worktree for linked ones). This extension can be used in single worktree mode, even though it's pretty much useless (but this can happen after you remove all linked worktrees and move back to single worktree). "git config" reads from both "config" and "config.worktree" by default (i.e. without either --user, --file...) when this extension is present. Default writes still go to "config", not "config.worktree". A new option --worktree is added for that (*). Since a new repo extension is introduced, existing git binaries should refuse to access to the repo (both from main and linked worktrees). So they will not misread the config file (i.e. skip the config.worktree part). They may still accidentally write to the config file anyway if they use with "git config --file <path>". This design places a bet on the assumption that the majority of config variables are shared so it is the default mode. A safer move would be default writes go to per-worktree file, so that accidental changes are isolated. (*) "git config --worktree" points back to "config" file when this extension is not present and there is only one worktree so that it works in any both single and multiple worktree setups. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-21 16:02:28 +02:00
if (repository_format_worktree_config) {
/*
* pick up core.bare and core.worktree from per-worktree
* config if present
*/
strbuf_addf(&sb, "%s/config.worktree", gitdir);
git_config_from_file(read_worktree_config, sb.buf, candidate);
strbuf_release(&sb);
has_common = 0;
}
if (!has_common) {
if (candidate->is_bare != -1) {
is_bare_repository_cfg = candidate->is_bare;
if (is_bare_repository_cfg == 1)
inside_work_tree = -1;
}
if (candidate->work_tree) {
free(git_work_tree_cfg);
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
git_work_tree_cfg = xstrdup(candidate->work_tree);
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
inside_work_tree = -1;
}
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
}
return 0;
}
int upgrade_repository_format(int target_version)
{
struct strbuf sb = STRBUF_INIT;
struct strbuf err = STRBUF_INIT;
struct strbuf repo_version = STRBUF_INIT;
struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
strbuf_git_common_path(&sb, the_repository, "config");
read_repository_format(&repo_fmt, sb.buf);
strbuf_release(&sb);
if (repo_fmt.version >= target_version)
return 0;
if (verify_repository_format(&repo_fmt, &err) < 0) {
error("cannot upgrade repository format from %d to %d: %s",
repo_fmt.version, target_version, err.buf);
strbuf_release(&err);
return -1;
}
if (!repo_fmt.version && repo_fmt.unknown_extensions.nr)
return error("cannot upgrade repository format: "
"unknown extension %s",
repo_fmt.unknown_extensions.items[0].string);
strbuf_addf(&repo_version, "%d", target_version);
git_config_set("core.repositoryformatversion", repo_version.buf);
strbuf_release(&repo_version);
return 1;
}
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
static void init_repository_format(struct repository_format *format)
{
const struct repository_format fresh = REPOSITORY_FORMAT_INIT;
memcpy(format, &fresh, sizeof(fresh));
}
int read_repository_format(struct repository_format *format, const char *path)
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
{
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
clear_repository_format(format);
git_config_from_file(check_repo_format, path, format);
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
if (format->version == -1)
clear_repository_format(format);
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
return format->version;
}
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
void clear_repository_format(struct repository_format *format)
{
string_list_clear(&format->unknown_extensions, 0);
verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 14:25:13 +02:00
string_list_clear(&format->v1_only_extensions, 0);
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
free(format->work_tree);
free(format->partial_clone);
init_repository_format(format);
}
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
int verify_repository_format(const struct repository_format *format,
struct strbuf *err)
{
if (GIT_REPO_VERSION_READ < format->version) {
strbuf_addf(err, _("Expected git repo version <= %d, found %d"),
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
GIT_REPO_VERSION_READ, format->version);
return -1;
}
if (format->version >= 1 && format->unknown_extensions.nr) {
introduce "extensions" form of core.repositoryformatversion Normally we try to avoid bumps of the whole-repository core.repositoryformatversion field. However, it is unavoidable if we want to safely change certain aspects of git in a backwards-incompatible way (e.g., modifying the set of ref tips that we must traverse to generate a list of unreachable, safe-to-prune objects). If we were to bump the repository version for every such change, then any implementation understanding version `X` would also have to understand `X-1`, `X-2`, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature). This patch documents the existing repositoryformatversion strategy and introduces a new format, "1", which lets a repository specify that it must run with an arbitrary set of extensions. This can be used, for example: - to inform git that the objects should not be pruned based only on the reachability of the ref tips (e.g, because it has "clone --shared" children) - that the refs are stored in a format besides the usual "refs" and "packed-refs" directories Because we bump to format "1", and because format "1" requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats. For example, if the user chooses to use database storage for refs, they may set the "extensions.refbackend" config to "db". Older versions of git will not understand format "1" and bail. Versions of git which understand "1" but do not know about "refbackend", or which know about "refbackend" but not about the "db" backend, will refuse to run. This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read. Note that we are only defining the rules for format 1 here. We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 12:53:58 +02:00
int i;
strbuf_addstr(err, Q_("unknown repository extension found:",
"unknown repository extensions found:",
format->unknown_extensions.nr));
introduce "extensions" form of core.repositoryformatversion Normally we try to avoid bumps of the whole-repository core.repositoryformatversion field. However, it is unavoidable if we want to safely change certain aspects of git in a backwards-incompatible way (e.g., modifying the set of ref tips that we must traverse to generate a list of unreachable, safe-to-prune objects). If we were to bump the repository version for every such change, then any implementation understanding version `X` would also have to understand `X-1`, `X-2`, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature). This patch documents the existing repositoryformatversion strategy and introduces a new format, "1", which lets a repository specify that it must run with an arbitrary set of extensions. This can be used, for example: - to inform git that the objects should not be pruned based only on the reachability of the ref tips (e.g, because it has "clone --shared" children) - that the refs are stored in a format besides the usual "refs" and "packed-refs" directories Because we bump to format "1", and because format "1" requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats. For example, if the user chooses to use database storage for refs, they may set the "extensions.refbackend" config to "db". Older versions of git will not understand format "1" and bail. Versions of git which understand "1" but do not know about "refbackend", or which know about "refbackend" but not about the "db" backend, will refuse to run. This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read. Note that we are only defining the rules for format 1 here. We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 12:53:58 +02:00
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
for (i = 0; i < format->unknown_extensions.nr; i++)
strbuf_addf(err, "\n\t%s",
format->unknown_extensions.items[i].string);
return -1;
introduce "extensions" form of core.repositoryformatversion Normally we try to avoid bumps of the whole-repository core.repositoryformatversion field. However, it is unavoidable if we want to safely change certain aspects of git in a backwards-incompatible way (e.g., modifying the set of ref tips that we must traverse to generate a list of unreachable, safe-to-prune objects). If we were to bump the repository version for every such change, then any implementation understanding version `X` would also have to understand `X-1`, `X-2`, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature). This patch documents the existing repositoryformatversion strategy and introduces a new format, "1", which lets a repository specify that it must run with an arbitrary set of extensions. This can be used, for example: - to inform git that the objects should not be pruned based only on the reachability of the ref tips (e.g, because it has "clone --shared" children) - that the refs are stored in a format besides the usual "refs" and "packed-refs" directories Because we bump to format "1", and because format "1" requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats. For example, if the user chooses to use database storage for refs, they may set the "extensions.refbackend" config to "db". Older versions of git will not understand format "1" and bail. Versions of git which understand "1" but do not know about "refbackend", or which know about "refbackend" but not about the "db" backend, will refuse to run. This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read. Note that we are only defining the rules for format 1 here. We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-23 12:53:58 +02:00
}
verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 14:25:13 +02:00
if (format->version == 0 && format->v1_only_extensions.nr) {
int i;
strbuf_addstr(err,
Q_("repo version is 0, but v1-only extension found:",
"repo version is 0, but v1-only extensions found:",
format->v1_only_extensions.nr));
verify_repository_format(): complain about new extensions in v0 repo We made the mistake in the past of respecting extensions.* even when the repository format version was set to 0. This is bad because forgetting to bump the repository version means that older versions of Git (which do not know about our extensions) won't complain. I.e., it's not a problem in itself, but it means your repository is in a state which does not give you the protection you think you're getting from older versions. For compatibility reasons, we are stuck with that decision for existing extensions. However, we'd prefer not to extend the damage further. We can do that by catching any newly-added extensions and complaining about the repository format. Note that this is a pretty heavy hammer: we'll refuse to work with the repository at all. A lesser option would be to ignore (possibly with a warning) any new extensions. But because of the way the extensions are handled, that puts the burden on each new extension that is added to remember to "undo" itself (because they are handled before we know for sure whether we are in a v1 repo or not, since we don't insist on a particular ordering of config entries). So one option would be to rewrite that handling to record any new extensions (and their values) during the config parse, and then only after proceed to handle new ones only if we're in a v1 repository. But I'm not sure if it's worth the trouble: - ignoring extensions is likely to end up with broken results anyway (e.g., ignoring a proposed objectformat extension means parsing any object data is likely to encounter errors) - this is a sign that whatever tool wrote the extension field is broken. We may be better off notifying immediately and forcefully so that such tools don't even appear to work accidentally. The only downside is that fixing the situation is a little tricky, because programs like "git config" won't want to work with the repository. But: git config --file=.git/config core.repositoryformatversion 1 should still suffice. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-16 14:25:13 +02:00
for (i = 0; i < format->v1_only_extensions.nr; i++)
strbuf_addf(err, "\n\t%s",
format->v1_only_extensions.items[i].string);
return -1;
}
setup: refactor repo format reading and verification When we want to know if we're in a git repository of reasonable vintage, we can call check_repository_format_gently(), which does three things: 1. Reads the config from the .git/config file. 2. Verifies that the version info we read is sane. 3. Writes some global variables based on this. There are a few things we could improve here. One is that steps 1 and 3 happen together. So if the verification in step 2 fails, we still clobber the global variables. This is especially bad if we go on to try another repository directory; we may end up with a state of mixed config variables. The second is there's no way to ask about the repository version for anything besides the main repository we're in. git-init wants to do this, and it's possible that we would want to start doing so for submodules (e.g., to find out which ref backend they're using). We can improve both by splitting the first two steps into separate functions. Now check_repository_format_gently() calls out to steps 1 and 2, and does 3 only if step 2 succeeds. Note that the public interface for read_repository_format() and what check_repository_format_gently() needs from it are not quite the same, leading us to have an extra read_repository_format_1() helper. The extra needs from check_repository_format_gently() will go away in a future patch, and we can simplify this then to just the public interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-11 23:37:07 +01:00
return 0;
}
void read_gitfile_error_die(int error_code, const char *path, const char *dir)
{
switch (error_code) {
case READ_GITFILE_ERR_STAT_FAILED:
case READ_GITFILE_ERR_NOT_A_FILE:
/* non-fatal; follow return path */
break;
case READ_GITFILE_ERR_OPEN_FAILED:
die_errno(_("error opening '%s'"), path);
case READ_GITFILE_ERR_TOO_LARGE:
die(_("too large to be a .git file: '%s'"), path);
case READ_GITFILE_ERR_READ_FAILED:
die(_("error reading %s"), path);
case READ_GITFILE_ERR_INVALID_FORMAT:
die(_("invalid gitfile format: %s"), path);
case READ_GITFILE_ERR_NO_PATH:
die(_("no path in gitfile: %s"), path);
case READ_GITFILE_ERR_NOT_A_REPO:
die(_("not a git repository: %s"), dir);
default:
BUG("unknown error code");
}
}
/*
* Try to read the location of the git directory from the .git file,
* return path to git directory if found. The return value comes from
* a shared buffer.
*
* On failure, if return_error_code is not NULL, return_error_code
* will be set to an error code and NULL will be returned. If
* return_error_code is NULL the function will die instead (for most
* cases).
*/
const char *read_gitfile_gently(const char *path, int *return_error_code)
{
const int max_file_size = 1 << 20; /* 1MB */
int error_code = 0;
char *buf = NULL;
char *dir = NULL;
const char *slash;
struct stat st;
int fd;
ssize_t len;
static struct strbuf realpath = STRBUF_INIT;
if (stat(path, &st)) {
/* NEEDSWORK: discern between ENOENT vs other errors */
error_code = READ_GITFILE_ERR_STAT_FAILED;
goto cleanup_return;
}
if (!S_ISREG(st.st_mode)) {
error_code = READ_GITFILE_ERR_NOT_A_FILE;
goto cleanup_return;
}
if (st.st_size > max_file_size) {
error_code = READ_GITFILE_ERR_TOO_LARGE;
goto cleanup_return;
}
fd = open(path, O_RDONLY);
if (fd < 0) {
error_code = READ_GITFILE_ERR_OPEN_FAILED;
goto cleanup_return;
}
buf = xmallocz(st.st_size);
len = read_in_full(fd, buf, st.st_size);
close(fd);
if (len != st.st_size) {
error_code = READ_GITFILE_ERR_READ_FAILED;
goto cleanup_return;
}
if (!starts_with(buf, "gitdir: ")) {
error_code = READ_GITFILE_ERR_INVALID_FORMAT;
goto cleanup_return;
}
while (buf[len - 1] == '\n' || buf[len - 1] == '\r')
len--;
if (len < 9) {
error_code = READ_GITFILE_ERR_NO_PATH;
goto cleanup_return;
}
buf[len] = '\0';
dir = buf + 8;
if (!is_absolute_path(dir) && (slash = strrchr(path, '/'))) {
size_t pathlen = slash+1 - path;
dir = xstrfmt("%.*s%.*s", (int)pathlen, path,
(int)(len - 8), buf + 8);
free(buf);
buf = dir;
}
if (!is_git_directory(dir)) {
error_code = READ_GITFILE_ERR_NOT_A_REPO;
goto cleanup_return;
}
strbuf_realpath(&realpath, dir, 1);
path = realpath.buf;
cleanup_return:
if (return_error_code)
*return_error_code = error_code;
else if (error_code)
read_gitfile_error_die(error_code, path, dir);
free(buf);
return error_code ? NULL : path;
}
static const char *setup_explicit_git_dir(const char *gitdirenv,
struct strbuf *cwd,
struct repository_format *repo_fmt,
int *nongit_ok)
{
const char *work_tree_env = getenv(GIT_WORK_TREE_ENVIRONMENT);
const char *worktree;
char *gitfile;
int offset;
if (PATH_MAX - 40 < strlen(gitdirenv))
die(_("'$%s' too big"), GIT_DIR_ENVIRONMENT);
gitfile = (char*)read_gitfile(gitdirenv);
if (gitfile) {
gitfile = xstrdup(gitfile);
gitdirenv = gitfile;
}
if (!is_git_directory(gitdirenv)) {
if (nongit_ok) {
*nongit_ok = 1;
free(gitfile);
return NULL;
}
die(_("not a git repository: '%s'"), gitdirenv);
}
if (check_repository_format_gently(gitdirenv, repo_fmt, nongit_ok)) {
free(gitfile);
return NULL;
}
/* #3, #7, #11, #15, #19, #23, #27, #31 (see t1510) */
if (work_tree_env)
set_git_work_tree(work_tree_env);
else if (is_bare_repository_cfg > 0) {
setup_git_directory: delay core.bare/core.worktree errors If both core.bare and core.worktree are set, we complain about the bogus config and die. Dying is good, because it avoids commands running and doing damage in a potentially incorrect setup. But dying _there_ is bad, because it means that commands which do not even care about the work tree cannot run. This can make repairing the situation harder: [setup] $ git config core.bare true $ git config core.worktree /some/path [OK, expected.] $ git status fatal: core.bare and core.worktree do not make sense [Hrm...] $ git config --unset core.worktree fatal: core.bare and core.worktree do not make sense [Nope...] $ git config --edit fatal: core.bare and core.worktree do not make sense [Gaaah.] $ git help config fatal: core.bare and core.worktree do not make sense Instead, let's issue a warning about the bogus config when we notice it (i.e., for all commands), but only die when the command tries to use the work tree (by calling setup_work_tree). So we now get: $ git status warning: core.bare and core.worktree do not make sense fatal: unable to set up work tree using invalid config $ git config --unset core.worktree warning: core.bare and core.worktree do not make sense We have to update t1510 to accomodate this; it uses symbolic-ref to check whether the configuration works or not, but of course that command does not use the working tree. Instead, we switch it to use `git status`, as it requires a work-tree, does not need any special setup, and is read-only (so a failure will not adversely affect further tests). In addition, we add a new test that checks the desired behavior (i.e., that running "git config" with the bogus config does in fact work). Reported-by: SZEDER Gábor <szeder@ira.uka.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-29 08:49:10 +02:00
if (git_work_tree_cfg) {
/* #22.2, #30 */
warning("core.bare and core.worktree do not make sense");
work_tree_config_is_bogus = 1;
}
/* #18, #26 */
set_git_dir(gitdirenv, 0);
free(gitfile);
return NULL;
}
else if (git_work_tree_cfg) { /* #6, #14 */
if (is_absolute_path(git_work_tree_cfg))
set_git_work_tree(git_work_tree_cfg);
else {
char *core_worktree;
if (chdir(gitdirenv))
die_errno(_("cannot chdir to '%s'"), gitdirenv);
if (chdir(git_work_tree_cfg))
die_errno(_("cannot chdir to '%s'"), git_work_tree_cfg);
core_worktree = xgetcwd();
if (chdir(cwd->buf))
die_errno(_("cannot come back to cwd"));
set_git_work_tree(core_worktree);
free(core_worktree);
}
}
setup: suppress implicit "." work-tree for bare repos If an explicit GIT_DIR is given without a working tree, we implicitly assume that the current working directory should be used as the working tree. E.g.,: GIT_DIR=/some/repo.git git status would compare against the cwd. Unfortunately, we fool this rule for sub-invocations of git by setting GIT_DIR internally ourselves. For example: git init foo cd foo/.git git status ;# fails, as we expect git config alias.st status git status ;# does not fail, but should What happens is that we run setup_git_directory when doing alias lookup (since we need to see the config), set GIT_DIR as a result, and then leave GIT_WORK_TREE blank (because we do not have one). Then when we actually run the status command, we do setup_git_directory again, which sees our explicit GIT_DIR and uses the cwd as an implicit worktree. It's tempting to argue that we should be suppressing that second invocation of setup_git_directory, as it could use the values we already found in memory. However, the problem still exists for sub-processes (e.g., if "git status" were an external command). You can see another example with the "--bare" option, which sets GIT_DIR explicitly. For example: git init foo cd foo/.git git status ;# fails git --bare status ;# does NOT fail We need some way of telling sub-processes "even though GIT_DIR is set, do not use cwd as an implicit working tree". We could do it by putting a special token into GIT_WORK_TREE, but the obvious choice (an empty string) has some portability problems. Instead, we add a new boolean variable, GIT_IMPLICIT_WORK_TREE, which suppresses the use of cwd as a working tree when GIT_DIR is set. We trigger the new variable when we know we are in a bare setting. The variable is left intentionally undocumented, as this is an internal detail (for now, anyway). If somebody comes up with a good alternate use for it, and once we are confident we have shaken any bugs out of it, we can consider promoting it further. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-08 10:32:22 +01:00
else if (!git_env_bool(GIT_IMPLICIT_WORK_TREE_ENVIRONMENT, 1)) {
/* #16d */
set_git_dir(gitdirenv, 0);
setup: suppress implicit "." work-tree for bare repos If an explicit GIT_DIR is given without a working tree, we implicitly assume that the current working directory should be used as the working tree. E.g.,: GIT_DIR=/some/repo.git git status would compare against the cwd. Unfortunately, we fool this rule for sub-invocations of git by setting GIT_DIR internally ourselves. For example: git init foo cd foo/.git git status ;# fails, as we expect git config alias.st status git status ;# does not fail, but should What happens is that we run setup_git_directory when doing alias lookup (since we need to see the config), set GIT_DIR as a result, and then leave GIT_WORK_TREE blank (because we do not have one). Then when we actually run the status command, we do setup_git_directory again, which sees our explicit GIT_DIR and uses the cwd as an implicit worktree. It's tempting to argue that we should be suppressing that second invocation of setup_git_directory, as it could use the values we already found in memory. However, the problem still exists for sub-processes (e.g., if "git status" were an external command). You can see another example with the "--bare" option, which sets GIT_DIR explicitly. For example: git init foo cd foo/.git git status ;# fails git --bare status ;# does NOT fail We need some way of telling sub-processes "even though GIT_DIR is set, do not use cwd as an implicit working tree". We could do it by putting a special token into GIT_WORK_TREE, but the obvious choice (an empty string) has some portability problems. Instead, we add a new boolean variable, GIT_IMPLICIT_WORK_TREE, which suppresses the use of cwd as a working tree when GIT_DIR is set. We trigger the new variable when we know we are in a bare setting. The variable is left intentionally undocumented, as this is an internal detail (for now, anyway). If somebody comes up with a good alternate use for it, and once we are confident we have shaken any bugs out of it, we can consider promoting it further. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-08 10:32:22 +01:00
free(gitfile);
return NULL;
}
else /* #2, #10 */
set_git_work_tree(".");
/* set_git_work_tree() must have been called by now */
worktree = get_git_work_tree();
/* both get_git_work_tree() and cwd are already normalized */
if (!strcmp(cwd->buf, worktree)) { /* cwd == worktree */
set_git_dir(gitdirenv, 0);
free(gitfile);
return NULL;
}
offset = dir_inside_of(cwd->buf, worktree);
if (offset >= 0) { /* cwd inside worktree? */
set_git_dir(gitdirenv, 1);
if (chdir(worktree))
die_errno(_("cannot chdir to '%s'"), worktree);
strbuf_addch(cwd, '/');
free(gitfile);
return cwd->buf + offset;
}
/* cwd outside worktree */
set_git_dir(gitdirenv, 0);
free(gitfile);
return NULL;
}
static const char *setup_discovered_git_dir(const char *gitdir,
struct strbuf *cwd, int offset,
struct repository_format *repo_fmt,
int *nongit_ok)
{
if (check_repository_format_gently(gitdir, repo_fmt, nongit_ok))
return NULL;
/* --work-tree is set without --git-dir; use discovered one */
if (getenv(GIT_WORK_TREE_ENVIRONMENT) || git_work_tree_cfg) {
char *to_free = NULL;
const char *ret;
if (offset != cwd->len && !is_absolute_path(gitdir))
gitdir = to_free = real_pathdup(gitdir, 1);
if (chdir(cwd->buf))
die_errno(_("cannot come back to cwd"));
ret = setup_explicit_git_dir(gitdir, cwd, repo_fmt, nongit_ok);
free(to_free);
return ret;
}
/* #16.2, #17.2, #20.2, #21.2, #24, #25, #28, #29 (see t1510) */
if (is_bare_repository_cfg > 0) {
set_git_dir(gitdir, (offset != cwd->len));
if (chdir(cwd->buf))
die_errno(_("cannot come back to cwd"));
return NULL;
}
/* #0, #1, #5, #8, #9, #12, #13 */
set_git_work_tree(".");
if (strcmp(gitdir, DEFAULT_GIT_DIR_ENVIRONMENT))
set_git_dir(gitdir, 0);
inside_git_dir = 0;
inside_work_tree = 1;
if (offset >= cwd->len)
return NULL;
/* Make "offset" point past the '/' (already the case for root dirs) */
if (offset != offset_1st_component(cwd->buf))
offset++;
/* Add a '/' at the end */
strbuf_addch(cwd, '/');
return cwd->buf + offset;
}
/* #16.1, #17.1, #20.1, #21.1, #22.1 (see t1510) */
static const char *setup_bare_git_dir(struct strbuf *cwd, int offset,
struct repository_format *repo_fmt,
int *nongit_ok)
{
int root_len;
if (check_repository_format_gently(".", repo_fmt, nongit_ok))
return NULL;
setup: suppress implicit "." work-tree for bare repos If an explicit GIT_DIR is given without a working tree, we implicitly assume that the current working directory should be used as the working tree. E.g.,: GIT_DIR=/some/repo.git git status would compare against the cwd. Unfortunately, we fool this rule for sub-invocations of git by setting GIT_DIR internally ourselves. For example: git init foo cd foo/.git git status ;# fails, as we expect git config alias.st status git status ;# does not fail, but should What happens is that we run setup_git_directory when doing alias lookup (since we need to see the config), set GIT_DIR as a result, and then leave GIT_WORK_TREE blank (because we do not have one). Then when we actually run the status command, we do setup_git_directory again, which sees our explicit GIT_DIR and uses the cwd as an implicit worktree. It's tempting to argue that we should be suppressing that second invocation of setup_git_directory, as it could use the values we already found in memory. However, the problem still exists for sub-processes (e.g., if "git status" were an external command). You can see another example with the "--bare" option, which sets GIT_DIR explicitly. For example: git init foo cd foo/.git git status ;# fails git --bare status ;# does NOT fail We need some way of telling sub-processes "even though GIT_DIR is set, do not use cwd as an implicit working tree". We could do it by putting a special token into GIT_WORK_TREE, but the obvious choice (an empty string) has some portability problems. Instead, we add a new boolean variable, GIT_IMPLICIT_WORK_TREE, which suppresses the use of cwd as a working tree when GIT_DIR is set. We trigger the new variable when we know we are in a bare setting. The variable is left intentionally undocumented, as this is an internal detail (for now, anyway). If somebody comes up with a good alternate use for it, and once we are confident we have shaken any bugs out of it, we can consider promoting it further. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-08 10:32:22 +01:00
setenv(GIT_IMPLICIT_WORK_TREE_ENVIRONMENT, "0", 1);
/* --work-tree is set without --git-dir; use discovered one */
if (getenv(GIT_WORK_TREE_ENVIRONMENT) || git_work_tree_cfg) {
static const char *gitdir;
gitdir = offset == cwd->len ? "." : xmemdupz(cwd->buf, offset);
if (chdir(cwd->buf))
die_errno(_("cannot come back to cwd"));
return setup_explicit_git_dir(gitdir, cwd, repo_fmt, nongit_ok);
}
inside_git_dir = 1;
inside_work_tree = 0;
if (offset != cwd->len) {
if (chdir(cwd->buf))
die_errno(_("cannot come back to cwd"));
root_len = offset_1st_component(cwd->buf);
strbuf_setlen(cwd, offset > root_len ? offset : root_len);
set_git_dir(cwd->buf, 0);
}
else
set_git_dir(".", 0);
return NULL;
}
static dev_t get_device_or_die(const char *path, const char *prefix, int prefix_len)
{
struct stat buf;
if (stat(path, &buf)) {
die_errno(_("failed to stat '%*s%s%s'"),
prefix_len,
prefix ? prefix : "",
prefix ? "/" : "", path);
}
return buf.st_dev;
}
longest_ancestor_length(): require prefix list entries to be normalized Move the responsibility for normalizing prefixes from longest_ancestor_length() to its callers. Use slightly different normalizations at the two callers: In setup_git_directory_gently_1(), use the old normalization, which ignores paths that are not usable. In the next commit we will change this caller to also resolve symlinks in the paths from GIT_CEILING_DIRECTORIES as part of the normalization. In "test-path-utils longest_ancestor_length", use the old normalization, but die() if any paths are unusable. Also change t0060 to only pass normalized paths to the test program (no empty entries or non-absolute paths, strip trailing slashes from the paths, and remove tests that thereby become redundant). The point of this change is to reduce the scope of the ancestor_length tests in t0060 from testing normalization+longest_prefix to testing only mostly longest_prefix. This is necessary because when setup_git_directory_gently_1() starts resolving symlinks as part of its normalization, it will not be reasonable to do the same in the test suite, because that would make the test results depend on the contents of the root directory of the filesystem on which the test is run. HOWEVER: under Windows, bash mangles arguments that look like absolute POSIX paths into DOS paths. So we have to retain the level of normalization done by normalize_path_copy() to convert the bash-mangled DOS paths (which contain backslashes) into paths that use forward slashes. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>
2012-10-28 17:16:25 +01:00
/*
* A "string_list_each_func_t" function that canonicalizes an entry
* from GIT_CEILING_DIRECTORIES using real_pathdup(), or
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
* discards it if unusable. The presence of an empty entry in
* GIT_CEILING_DIRECTORIES turns off canonicalization for all
* subsequent entries.
longest_ancestor_length(): require prefix list entries to be normalized Move the responsibility for normalizing prefixes from longest_ancestor_length() to its callers. Use slightly different normalizations at the two callers: In setup_git_directory_gently_1(), use the old normalization, which ignores paths that are not usable. In the next commit we will change this caller to also resolve symlinks in the paths from GIT_CEILING_DIRECTORIES as part of the normalization. In "test-path-utils longest_ancestor_length", use the old normalization, but die() if any paths are unusable. Also change t0060 to only pass normalized paths to the test program (no empty entries or non-absolute paths, strip trailing slashes from the paths, and remove tests that thereby become redundant). The point of this change is to reduce the scope of the ancestor_length tests in t0060 from testing normalization+longest_prefix to testing only mostly longest_prefix. This is necessary because when setup_git_directory_gently_1() starts resolving symlinks as part of its normalization, it will not be reasonable to do the same in the test suite, because that would make the test results depend on the contents of the root directory of the filesystem on which the test is run. HOWEVER: under Windows, bash mangles arguments that look like absolute POSIX paths into DOS paths. So we have to retain the level of normalization done by normalize_path_copy() to convert the bash-mangled DOS paths (which contain backslashes) into paths that use forward slashes. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>
2012-10-28 17:16:25 +01:00
*/
static int canonicalize_ceiling_entry(struct string_list_item *item,
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
void *cb_data)
longest_ancestor_length(): require prefix list entries to be normalized Move the responsibility for normalizing prefixes from longest_ancestor_length() to its callers. Use slightly different normalizations at the two callers: In setup_git_directory_gently_1(), use the old normalization, which ignores paths that are not usable. In the next commit we will change this caller to also resolve symlinks in the paths from GIT_CEILING_DIRECTORIES as part of the normalization. In "test-path-utils longest_ancestor_length", use the old normalization, but die() if any paths are unusable. Also change t0060 to only pass normalized paths to the test program (no empty entries or non-absolute paths, strip trailing slashes from the paths, and remove tests that thereby become redundant). The point of this change is to reduce the scope of the ancestor_length tests in t0060 from testing normalization+longest_prefix to testing only mostly longest_prefix. This is necessary because when setup_git_directory_gently_1() starts resolving symlinks as part of its normalization, it will not be reasonable to do the same in the test suite, because that would make the test results depend on the contents of the root directory of the filesystem on which the test is run. HOWEVER: under Windows, bash mangles arguments that look like absolute POSIX paths into DOS paths. So we have to retain the level of normalization done by normalize_path_copy() to convert the bash-mangled DOS paths (which contain backslashes) into paths that use forward slashes. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>
2012-10-28 17:16:25 +01:00
{
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
int *empty_entry_found = cb_data;
char *ceil = item->string;
longest_ancestor_length(): require prefix list entries to be normalized Move the responsibility for normalizing prefixes from longest_ancestor_length() to its callers. Use slightly different normalizations at the two callers: In setup_git_directory_gently_1(), use the old normalization, which ignores paths that are not usable. In the next commit we will change this caller to also resolve symlinks in the paths from GIT_CEILING_DIRECTORIES as part of the normalization. In "test-path-utils longest_ancestor_length", use the old normalization, but die() if any paths are unusable. Also change t0060 to only pass normalized paths to the test program (no empty entries or non-absolute paths, strip trailing slashes from the paths, and remove tests that thereby become redundant). The point of this change is to reduce the scope of the ancestor_length tests in t0060 from testing normalization+longest_prefix to testing only mostly longest_prefix. This is necessary because when setup_git_directory_gently_1() starts resolving symlinks as part of its normalization, it will not be reasonable to do the same in the test suite, because that would make the test results depend on the contents of the root directory of the filesystem on which the test is run. HOWEVER: under Windows, bash mangles arguments that look like absolute POSIX paths into DOS paths. So we have to retain the level of normalization done by normalize_path_copy() to convert the bash-mangled DOS paths (which contain backslashes) into paths that use forward slashes. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>
2012-10-28 17:16:25 +01:00
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
if (!*ceil) {
*empty_entry_found = 1;
longest_ancestor_length(): require prefix list entries to be normalized Move the responsibility for normalizing prefixes from longest_ancestor_length() to its callers. Use slightly different normalizations at the two callers: In setup_git_directory_gently_1(), use the old normalization, which ignores paths that are not usable. In the next commit we will change this caller to also resolve symlinks in the paths from GIT_CEILING_DIRECTORIES as part of the normalization. In "test-path-utils longest_ancestor_length", use the old normalization, but die() if any paths are unusable. Also change t0060 to only pass normalized paths to the test program (no empty entries or non-absolute paths, strip trailing slashes from the paths, and remove tests that thereby become redundant). The point of this change is to reduce the scope of the ancestor_length tests in t0060 from testing normalization+longest_prefix to testing only mostly longest_prefix. This is necessary because when setup_git_directory_gently_1() starts resolving symlinks as part of its normalization, it will not be reasonable to do the same in the test suite, because that would make the test results depend on the contents of the root directory of the filesystem on which the test is run. HOWEVER: under Windows, bash mangles arguments that look like absolute POSIX paths into DOS paths. So we have to retain the level of normalization done by normalize_path_copy() to convert the bash-mangled DOS paths (which contain backslashes) into paths that use forward slashes. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>
2012-10-28 17:16:25 +01:00
return 0;
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
} else if (!is_absolute_path(ceil)) {
longest_ancestor_length(): require prefix list entries to be normalized Move the responsibility for normalizing prefixes from longest_ancestor_length() to its callers. Use slightly different normalizations at the two callers: In setup_git_directory_gently_1(), use the old normalization, which ignores paths that are not usable. In the next commit we will change this caller to also resolve symlinks in the paths from GIT_CEILING_DIRECTORIES as part of the normalization. In "test-path-utils longest_ancestor_length", use the old normalization, but die() if any paths are unusable. Also change t0060 to only pass normalized paths to the test program (no empty entries or non-absolute paths, strip trailing slashes from the paths, and remove tests that thereby become redundant). The point of this change is to reduce the scope of the ancestor_length tests in t0060 from testing normalization+longest_prefix to testing only mostly longest_prefix. This is necessary because when setup_git_directory_gently_1() starts resolving symlinks as part of its normalization, it will not be reasonable to do the same in the test suite, because that would make the test results depend on the contents of the root directory of the filesystem on which the test is run. HOWEVER: under Windows, bash mangles arguments that look like absolute POSIX paths into DOS paths. So we have to retain the level of normalization done by normalize_path_copy() to convert the bash-mangled DOS paths (which contain backslashes) into paths that use forward slashes. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>
2012-10-28 17:16:25 +01:00
return 0;
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
} else if (*empty_entry_found) {
/* Keep entry but do not canonicalize it */
return 1;
} else {
char *real_path = real_pathdup(ceil, 0);
if (!real_path) {
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
return 0;
}
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
free(item->string);
item->string = real_path;
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
return 1;
}
longest_ancestor_length(): require prefix list entries to be normalized Move the responsibility for normalizing prefixes from longest_ancestor_length() to its callers. Use slightly different normalizations at the two callers: In setup_git_directory_gently_1(), use the old normalization, which ignores paths that are not usable. In the next commit we will change this caller to also resolve symlinks in the paths from GIT_CEILING_DIRECTORIES as part of the normalization. In "test-path-utils longest_ancestor_length", use the old normalization, but die() if any paths are unusable. Also change t0060 to only pass normalized paths to the test program (no empty entries or non-absolute paths, strip trailing slashes from the paths, and remove tests that thereby become redundant). The point of this change is to reduce the scope of the ancestor_length tests in t0060 from testing normalization+longest_prefix to testing only mostly longest_prefix. This is necessary because when setup_git_directory_gently_1() starts resolving symlinks as part of its normalization, it will not be reasonable to do the same in the test suite, because that would make the test results depend on the contents of the root directory of the filesystem on which the test is run. HOWEVER: under Windows, bash mangles arguments that look like absolute POSIX paths into DOS paths. So we have to retain the level of normalization done by normalize_path_copy() to convert the bash-mangled DOS paths (which contain backslashes) into paths that use forward slashes. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>
2012-10-28 17:16:25 +01:00
}
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
struct safe_directory_data {
const char *path;
int is_safe;
};
static int safe_directory_cb(const char *key, const char *value, void *d)
{
struct safe_directory_data *data = d;
if (strcmp(key, "safe.directory"))
return 0;
if (!value || !*value) {
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
data->is_safe = 0;
} else if (!strcmp(value, "*")) {
data->is_safe = 1;
} else {
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
const char *interpolated = NULL;
if (!git_config_pathname(&interpolated, key, value) &&
!fspathcmp(data->path, interpolated ? interpolated : value))
data->is_safe = 1;
free((char *)interpolated);
}
return 0;
}
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
/*
* Check if a repository is safe, by verifying the ownership of the
* worktree (if any), the git directory, and the gitfile (if any).
*
* Exemptions for known-safe repositories can be added via `safe.directory`
* config settings; for non-bare repositories, their worktree needs to be
* added, for bare ones their git directory.
*/
static int ensure_valid_ownership(const char *gitfile,
const char *worktree, const char *gitdir)
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
{
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
struct safe_directory_data data = {
.path = worktree ? worktree : gitdir
};
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
if (!git_env_bool("GIT_TEST_ASSUME_DIFFERENT_OWNER", 0) &&
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
(!gitfile || is_path_owned_by_current_user(gitfile)) &&
(!worktree || is_path_owned_by_current_user(worktree)) &&
(!gitdir || is_path_owned_by_current_user(gitdir)))
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
return 1;
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
/*
* data.path is the "path" that identifies the repository and it is
* constant regardless of what failed above. data.is_safe should be
* initialized to false, and might be changed by the callback.
*/
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
read_very_early_config(safe_directory_cb, &data);
return data.is_safe;
}
enum discovery_result {
GIT_DIR_NONE = 0,
GIT_DIR_EXPLICIT,
GIT_DIR_DISCOVERED,
GIT_DIR_BARE,
/* these are errors */
GIT_DIR_HIT_CEILING = -1,
GIT_DIR_HIT_MOUNT_POINT = -2,
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
GIT_DIR_INVALID_GITFILE = -3,
GIT_DIR_INVALID_OWNERSHIP = -4
};
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
/*
* We cannot decide in this function whether we are in the work tree or
* not, since the config can only be read _after_ this function was called.
*
* Also, we avoid changing any global state (such as the current working
* directory) to allow early callers.
*
* The directory where the search should start needs to be passed in via the
* `dir` parameter; upon return, the `dir` buffer will contain the path of
* the directory where the search ended, and `gitdir` will contain the path of
* the discovered .git/ directory, if any. If `gitdir` is not absolute, it
* is relative to `dir` (i.e. *not* necessarily the cwd).
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
*/
static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
struct strbuf *gitdir,
int die_on_error)
{
const char *env_ceiling_dirs = getenv(CEILING_DIRECTORIES_ENVIRONMENT);
struct string_list ceiling_dirs = STRING_LIST_INIT_DUP;
const char *gitdirenv;
int ceil_offset = -1, min_offset = offset_1st_component(dir->buf);
dev_t current_device = 0;
int one_filesystem = 1;
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
/*
* If GIT_DIR is set explicitly, we're not going
* to do any discovery, but we still do repository
* validation.
*/
gitdirenv = getenv(GIT_DIR_ENVIRONMENT);
if (gitdirenv) {
strbuf_addstr(gitdir, gitdirenv);
return GIT_DIR_EXPLICIT;
}
if (env_ceiling_dirs) {
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
int empty_entry_found = 0;
string_list_split(&ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1);
filter_string_list(&ceiling_dirs, 0,
Provide a mechanism to turn off symlink resolution in ceiling paths Commit 1b77d83cab 'setup_git_directory_gently_1(): resolve symlinks in ceiling paths' changed the setup code to resolve symlinks in the entries in GIT_CEILING_DIRECTORIES. Because those entries are compared textually to the symlink-resolved current directory, an entry in GIT_CEILING_DIRECTORIES that contained a symlink would have no effect. It was known that this could cause performance problems if the symlink resolution *itself* touched slow filesystems, but it was thought that such use cases would be unlikely. The intention of the earlier change was to deal with a case when the user has this: GIT_CEILING_DIRECTORIES=/home/gitster but in reality, /home/gitster is a symbolic link to somewhere else, e.g. /net/machine/home4/gitster. A textual comparison between the specified value /home/gitster and the location getcwd(3) returns would not help us, but readlink("/home/gitster") would still be fast. After this change was released, Anders Kaseorg <andersk@mit.edu> reported: > [...] my computer has been acting so slow when I’m not connected to > the network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and > /afs/athena.mit.edu/user/a/n which all live in different AFS > volumes). Now when I’m not connected to the network, every > invocation of Git, including the __git_ps1 in my shell prompt, waits > for AFS to timeout. To allow users to work around this problem, give them a mechanism to turn off symlink resolution in GIT_CEILING_DIRECTORIES entries. All the entries that follow an empty entry will not be checked for symbolic links and used literally in comparison. E.g. with these: GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy we will not readlink("/xyzzy") because it comes after an empty entry. With the former (but not with the latter), "/foo/bar" comes after an empty entry, and we will not readlink it, either. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 10:09:24 +01:00
canonicalize_ceiling_entry, &empty_entry_found);
ceil_offset = longest_ancestor_length(dir->buf, &ceiling_dirs);
string_list_clear(&ceiling_dirs, 0);
}
if (ceil_offset < 0)
ceil_offset = min_offset - 2;
if (min_offset && min_offset == dir->len &&
!is_dir_sep(dir->buf[min_offset - 1])) {
strbuf_addch(dir, '/');
min_offset++;
}
introduce GIT_WORK_TREE to specify the work tree setup_gdg is used as abbreviation for setup_git_directory_gently. The work tree can be specified using the environment variable GIT_WORK_TREE and the config option core.worktree (the environment variable has precendence over the config option). Additionally there is a command line option --work-tree which sets the environment variable. setup_gdg does the following now: GIT_DIR unspecified repository in .git directory parent directory of the .git directory is used as work tree, GIT_WORK_TREE is ignored GIT_DIR unspecified repository in cwd GIT_DIR is set to cwd see the cases with GIT_DIR specified what happens next and also see the note below GIT_DIR specified GIT_WORK_TREE/core.worktree unspecified cwd is used as work tree GIT_DIR specified GIT_WORK_TREE/core.worktree specified the specified work tree is used Note on the case where GIT_DIR is unspecified and repository is in cwd: GIT_WORK_TREE is used but is_inside_git_dir is always true. I did it this way because setup_gdg might be called multiple times (e.g. when doing alias expansion) and in successive calls setup_gdg should do the same thing every time. Meaning of is_bare/is_inside_work_tree/is_inside_git_dir: (1) is_bare_repository A repository is bare if core.bare is true or core.bare is unspecified and the name suggests it is bare (directory not named .git). The bare option disables a few protective checks which are useful with a working tree. Currently this changes if a repository is bare: updates of HEAD are allowed git gc packs the refs the reflog is disabled by default (2) is_inside_work_tree True if the cwd is inside the associated working tree (if there is one), false otherwise. (3) is_inside_git_dir True if the cwd is inside the git directory, false otherwise. Before this patch is_inside_git_dir was always true for bare repositories. When setup_gdg finds a repository git_config(git_default_config) is always called. This ensure that is_bare_repository makes use of core.bare and does not guess even though core.bare is specified. inside_work_tree and inside_git_dir are set if setup_gdg finds a repository. The is_inside_work_tree and is_inside_git_dir functions will die if they are called before a successful call to setup_gdg. Signed-off-by: Matthias Lederhofer <matled@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-06 09:10:42 +02:00
/*
* Test in the following order (relative to the dir):
* - .git (file containing "gitdir: <path>")
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
* - .git/
* - ./ (bare)
* - ../.git
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
* - ../.git/
* - ../ (bare)
* - ../../.git
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
* etc.
introduce GIT_WORK_TREE to specify the work tree setup_gdg is used as abbreviation for setup_git_directory_gently. The work tree can be specified using the environment variable GIT_WORK_TREE and the config option core.worktree (the environment variable has precendence over the config option). Additionally there is a command line option --work-tree which sets the environment variable. setup_gdg does the following now: GIT_DIR unspecified repository in .git directory parent directory of the .git directory is used as work tree, GIT_WORK_TREE is ignored GIT_DIR unspecified repository in cwd GIT_DIR is set to cwd see the cases with GIT_DIR specified what happens next and also see the note below GIT_DIR specified GIT_WORK_TREE/core.worktree unspecified cwd is used as work tree GIT_DIR specified GIT_WORK_TREE/core.worktree specified the specified work tree is used Note on the case where GIT_DIR is unspecified and repository is in cwd: GIT_WORK_TREE is used but is_inside_git_dir is always true. I did it this way because setup_gdg might be called multiple times (e.g. when doing alias expansion) and in successive calls setup_gdg should do the same thing every time. Meaning of is_bare/is_inside_work_tree/is_inside_git_dir: (1) is_bare_repository A repository is bare if core.bare is true or core.bare is unspecified and the name suggests it is bare (directory not named .git). The bare option disables a few protective checks which are useful with a working tree. Currently this changes if a repository is bare: updates of HEAD are allowed git gc packs the refs the reflog is disabled by default (2) is_inside_work_tree True if the cwd is inside the associated working tree (if there is one), false otherwise. (3) is_inside_git_dir True if the cwd is inside the git directory, false otherwise. Before this patch is_inside_git_dir was always true for bare repositories. When setup_gdg finds a repository git_config(git_default_config) is always called. This ensure that is_bare_repository makes use of core.bare and does not guess even though core.bare is specified. inside_work_tree and inside_git_dir are set if setup_gdg finds a repository. The is_inside_work_tree and is_inside_git_dir functions will die if they are called before a successful call to setup_gdg. Signed-off-by: Matthias Lederhofer <matled@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-06 09:10:42 +02:00
*/
one_filesystem = !git_env_bool("GIT_DISCOVERY_ACROSS_FILESYSTEM", 0);
if (one_filesystem)
current_device = get_device_or_die(dir->buf, NULL, 0);
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-01 02:30:14 +02:00
for (;;) {
int offset = dir->len, error_code = 0;
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
char *gitdir_path = NULL;
char *gitfile = NULL;
if (offset > min_offset)
strbuf_addch(dir, '/');
strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT);
gitdirenv = read_gitfile_gently(dir->buf, die_on_error ?
NULL : &error_code);
if (!gitdirenv) {
if (die_on_error ||
error_code == READ_GITFILE_ERR_NOT_A_FILE) {
/* NEEDSWORK: fail if .git is not file nor dir */
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
if (is_git_directory(dir->buf)) {
gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT;
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
gitdir_path = xstrdup(dir->buf);
}
} else if (error_code != READ_GITFILE_ERR_STAT_FAILED)
return GIT_DIR_INVALID_GITFILE;
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
} else
gitfile = xstrdup(dir->buf);
/*
* Earlier, we tentatively added DEFAULT_GIT_DIR_ENVIRONMENT
* to check that directory for a repository.
* Now trim that tentative addition away, because we want to
* focus on the real directory we are in.
*/
strbuf_setlen(dir, offset);
if (gitdirenv) {
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
enum discovery_result ret;
if (ensure_valid_ownership(gitfile,
dir->buf,
(gitdir_path ? gitdir_path : gitdirenv))) {
strbuf_addstr(gitdir, gitdirenv);
ret = GIT_DIR_DISCOVERED;
} else
ret = GIT_DIR_INVALID_OWNERSHIP;
/*
* Earlier, during discovery, we might have allocated
* string copies for gitdir_path or gitfile so make
* sure we don't leak by freeing them now, before
* leaving the loop and function.
*
* Note: gitdirenv will be non-NULL whenever these are
* allocated, therefore we need not take care of releasing
* them outside of this conditional block.
*/
free(gitdir_path);
free(gitfile);
return ret;
}
if (is_git_directory(dir->buf)) {
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
if (!ensure_valid_ownership(NULL, NULL, dir->buf))
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
return GIT_DIR_INVALID_OWNERSHIP;
strbuf_addstr(gitdir, ".");
return GIT_DIR_BARE;
}
if (offset <= min_offset)
return GIT_DIR_HIT_CEILING;
while (--offset > ceil_offset && !is_dir_sep(dir->buf[offset]))
; /* continue */
if (offset <= ceil_offset)
return GIT_DIR_HIT_CEILING;
strbuf_setlen(dir, offset > min_offset ? offset : min_offset);
if (one_filesystem &&
current_device != get_device_or_die(dir->buf, NULL, offset))
return GIT_DIR_HIT_MOUNT_POINT;
introduce GIT_WORK_TREE to specify the work tree setup_gdg is used as abbreviation for setup_git_directory_gently. The work tree can be specified using the environment variable GIT_WORK_TREE and the config option core.worktree (the environment variable has precendence over the config option). Additionally there is a command line option --work-tree which sets the environment variable. setup_gdg does the following now: GIT_DIR unspecified repository in .git directory parent directory of the .git directory is used as work tree, GIT_WORK_TREE is ignored GIT_DIR unspecified repository in cwd GIT_DIR is set to cwd see the cases with GIT_DIR specified what happens next and also see the note below GIT_DIR specified GIT_WORK_TREE/core.worktree unspecified cwd is used as work tree GIT_DIR specified GIT_WORK_TREE/core.worktree specified the specified work tree is used Note on the case where GIT_DIR is unspecified and repository is in cwd: GIT_WORK_TREE is used but is_inside_git_dir is always true. I did it this way because setup_gdg might be called multiple times (e.g. when doing alias expansion) and in successive calls setup_gdg should do the same thing every time. Meaning of is_bare/is_inside_work_tree/is_inside_git_dir: (1) is_bare_repository A repository is bare if core.bare is true or core.bare is unspecified and the name suggests it is bare (directory not named .git). The bare option disables a few protective checks which are useful with a working tree. Currently this changes if a repository is bare: updates of HEAD are allowed git gc packs the refs the reflog is disabled by default (2) is_inside_work_tree True if the cwd is inside the associated working tree (if there is one), false otherwise. (3) is_inside_git_dir True if the cwd is inside the git directory, false otherwise. Before this patch is_inside_git_dir was always true for bare repositories. When setup_gdg finds a repository git_config(git_default_config) is always called. This ensure that is_bare_repository makes use of core.bare and does not guess even though core.bare is specified. inside_work_tree and inside_git_dir are set if setup_gdg finds a repository. The is_inside_work_tree and is_inside_git_dir functions will die if they are called before a successful call to setup_gdg. Signed-off-by: Matthias Lederhofer <matled@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-06 09:10:42 +02:00
}
}
int discover_git_directory(struct strbuf *commondir,
struct strbuf *gitdir)
{
struct strbuf dir = STRBUF_INIT, err = STRBUF_INIT;
size_t gitdir_offset = gitdir->len, cwd_len;
size_t commondir_offset = commondir->len;
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
struct repository_format candidate = REPOSITORY_FORMAT_INIT;
if (strbuf_getcwd(&dir))
return -1;
cwd_len = dir.len;
if (setup_git_directory_gently_1(&dir, gitdir, 0) <= 0) {
strbuf_release(&dir);
return -1;
}
/*
* The returned gitdir is relative to dir, and if dir does not reflect
* the current working directory, we simply make the gitdir absolute.
*/
if (dir.len < cwd_len && !is_absolute_path(gitdir->buf + gitdir_offset)) {
/* Avoid a trailing "/." */
if (!strcmp(".", gitdir->buf + gitdir_offset))
strbuf_setlen(gitdir, gitdir_offset);
else
strbuf_addch(&dir, '/');
strbuf_insert(gitdir, gitdir_offset, dir.buf, dir.len);
}
get_common_dir(commondir, gitdir->buf + gitdir_offset);
strbuf_reset(&dir);
strbuf_addf(&dir, "%s/config", commondir->buf + commondir_offset);
read_repository_format(&candidate, dir.buf);
strbuf_release(&dir);
if (verify_repository_format(&candidate, &err) < 0) {
warning("ignoring git dir '%s': %s",
gitdir->buf + gitdir_offset, err.buf);
strbuf_release(&err);
strbuf_setlen(commondir, commondir_offset);
strbuf_setlen(gitdir, gitdir_offset);
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
clear_repository_format(&candidate);
return -1;
}
/* take ownership of candidate.partial_clone */
the_repository->repository_format_partial_clone =
candidate.partial_clone;
candidate.partial_clone = NULL;
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
clear_repository_format(&candidate);
return 0;
}
const char *setup_git_directory_gently(int *nongit_ok)
{
static struct strbuf cwd = STRBUF_INIT;
struct strbuf dir = STRBUF_INIT, gitdir = STRBUF_INIT;
Simplify handling of setup_git_directory_gently() failure cases. setup_git_directory_gently() expects two types of failures to discover a git directory (e.g. .git/): - GIT_DIR_HIT_CEILING: could not find a git directory in any parent directories of the cwd. - GIT_DIR_HIT_MOUNT_POINT: could not find a git directory in any parent directories up to the mount point of the cwd. Both cases are handled in a similar way, but there are misleading and unimportant differences. In both cases, setup_git_directory_gently() should: - Die if we are not in a git repository. Otherwise: - Set nongit_ok = 1, indicating that we are not in a git repository but this is ok. - Call strbuf_release() on any non-static struct strbufs that we allocated. Before this change are two misleading additional behaviors: - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no apparent reason. We never had the chance to change directories up to this point so chdir(current cwd) is pointless. - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer of a static struct strbuf (cwd). This is unnecessary because the struct is static so its buffer is always reachable. This is also misleading because nowhere else in the function is this buffer released. This change eliminates these two misleading additional behaviors and deletes setup_nogit() because the code is clearer without it. The result is that we can see clearly that GIT_DIR_HIT_CEILING and GIT_DIR_HIT_MOUNT_POINT lead to the same behavior (ignoring the different help messages). During review, this change was amended to additionally include: - Neither GIT_DIR_HIT_CEILING nor GIT_DIR_HIT_MOUNT_POINT may return early from setup_git_directory_gently() before the GIT_PREFIX environment variable is reset. Change both cases to break instead of return. See GIT_PREFIX below for more details. - GIT_DIR_NONE: setup_git_directory_gently_1() never returns this value, but if it ever did, setup_git_directory_gently() would incorrectly record that it had found a repository. Explicitly BUG on this case because it is underspecified. - GIT_PREFIX: this environment variable must always match the value of startup_info->prefix and the prefix returned from setup_git_directory_gently(). Make how we handle this slightly more repetitive but also more clear. - setup_git_env() and repo_set_hash_algo(): Add comments showing that only GIT_DIR_EXPLICIT, GIT_DIR_DISCOVERED, and GIT_DIR_BARE will cause setup_git_directory_gently() to call these setup functions. This was obvious (but partly incorrect) before this change when GIT_DIR_HIT_MOUNT_POINT returned early from setup_git_directory_gently(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-28 00:36:29 +01:00
const char *prefix = NULL;
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
/*
* We may have read an incomplete configuration before
* setting-up the git directory. If so, clear the cache so
* that the next queries to the configuration reload complete
* configuration (including the per-repo config file that we
* ignored previously).
*/
git_config_clear();
/*
* Let's assume that we are in a git repository.
* If it turns out later that we are somewhere else, the value will be
* updated accordingly.
*/
if (nongit_ok)
*nongit_ok = 0;
if (strbuf_getcwd(&cwd))
die_errno(_("Unable to read current working directory"));
strbuf_addbuf(&dir, &cwd);
switch (setup_git_directory_gently_1(&dir, &gitdir, 1)) {
case GIT_DIR_EXPLICIT:
prefix = setup_explicit_git_dir(gitdir.buf, &cwd, &repo_fmt, nongit_ok);
break;
case GIT_DIR_DISCOVERED:
if (dir.len < cwd.len && chdir(dir.buf))
die(_("cannot change to '%s'"), dir.buf);
prefix = setup_discovered_git_dir(gitdir.buf, &cwd, dir.len,
&repo_fmt, nongit_ok);
break;
case GIT_DIR_BARE:
if (dir.len < cwd.len && chdir(dir.buf))
die(_("cannot change to '%s'"), dir.buf);
prefix = setup_bare_git_dir(&cwd, dir.len, &repo_fmt, nongit_ok);
break;
case GIT_DIR_HIT_CEILING:
Simplify handling of setup_git_directory_gently() failure cases. setup_git_directory_gently() expects two types of failures to discover a git directory (e.g. .git/): - GIT_DIR_HIT_CEILING: could not find a git directory in any parent directories of the cwd. - GIT_DIR_HIT_MOUNT_POINT: could not find a git directory in any parent directories up to the mount point of the cwd. Both cases are handled in a similar way, but there are misleading and unimportant differences. In both cases, setup_git_directory_gently() should: - Die if we are not in a git repository. Otherwise: - Set nongit_ok = 1, indicating that we are not in a git repository but this is ok. - Call strbuf_release() on any non-static struct strbufs that we allocated. Before this change are two misleading additional behaviors: - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no apparent reason. We never had the chance to change directories up to this point so chdir(current cwd) is pointless. - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer of a static struct strbuf (cwd). This is unnecessary because the struct is static so its buffer is always reachable. This is also misleading because nowhere else in the function is this buffer released. This change eliminates these two misleading additional behaviors and deletes setup_nogit() because the code is clearer without it. The result is that we can see clearly that GIT_DIR_HIT_CEILING and GIT_DIR_HIT_MOUNT_POINT lead to the same behavior (ignoring the different help messages). During review, this change was amended to additionally include: - Neither GIT_DIR_HIT_CEILING nor GIT_DIR_HIT_MOUNT_POINT may return early from setup_git_directory_gently() before the GIT_PREFIX environment variable is reset. Change both cases to break instead of return. See GIT_PREFIX below for more details. - GIT_DIR_NONE: setup_git_directory_gently_1() never returns this value, but if it ever did, setup_git_directory_gently() would incorrectly record that it had found a repository. Explicitly BUG on this case because it is underspecified. - GIT_PREFIX: this environment variable must always match the value of startup_info->prefix and the prefix returned from setup_git_directory_gently(). Make how we handle this slightly more repetitive but also more clear. - setup_git_env() and repo_set_hash_algo(): Add comments showing that only GIT_DIR_EXPLICIT, GIT_DIR_DISCOVERED, and GIT_DIR_BARE will cause setup_git_directory_gently() to call these setup functions. This was obvious (but partly incorrect) before this change when GIT_DIR_HIT_MOUNT_POINT returned early from setup_git_directory_gently(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-28 00:36:29 +01:00
if (!nongit_ok)
die(_("not a git repository (or any of the parent directories): %s"),
DEFAULT_GIT_DIR_ENVIRONMENT);
*nongit_ok = 1;
break;
case GIT_DIR_HIT_MOUNT_POINT:
Simplify handling of setup_git_directory_gently() failure cases. setup_git_directory_gently() expects two types of failures to discover a git directory (e.g. .git/): - GIT_DIR_HIT_CEILING: could not find a git directory in any parent directories of the cwd. - GIT_DIR_HIT_MOUNT_POINT: could not find a git directory in any parent directories up to the mount point of the cwd. Both cases are handled in a similar way, but there are misleading and unimportant differences. In both cases, setup_git_directory_gently() should: - Die if we are not in a git repository. Otherwise: - Set nongit_ok = 1, indicating that we are not in a git repository but this is ok. - Call strbuf_release() on any non-static struct strbufs that we allocated. Before this change are two misleading additional behaviors: - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no apparent reason. We never had the chance to change directories up to this point so chdir(current cwd) is pointless. - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer of a static struct strbuf (cwd). This is unnecessary because the struct is static so its buffer is always reachable. This is also misleading because nowhere else in the function is this buffer released. This change eliminates these two misleading additional behaviors and deletes setup_nogit() because the code is clearer without it. The result is that we can see clearly that GIT_DIR_HIT_CEILING and GIT_DIR_HIT_MOUNT_POINT lead to the same behavior (ignoring the different help messages). During review, this change was amended to additionally include: - Neither GIT_DIR_HIT_CEILING nor GIT_DIR_HIT_MOUNT_POINT may return early from setup_git_directory_gently() before the GIT_PREFIX environment variable is reset. Change both cases to break instead of return. See GIT_PREFIX below for more details. - GIT_DIR_NONE: setup_git_directory_gently_1() never returns this value, but if it ever did, setup_git_directory_gently() would incorrectly record that it had found a repository. Explicitly BUG on this case because it is underspecified. - GIT_PREFIX: this environment variable must always match the value of startup_info->prefix and the prefix returned from setup_git_directory_gently(). Make how we handle this slightly more repetitive but also more clear. - setup_git_env() and repo_set_hash_algo(): Add comments showing that only GIT_DIR_EXPLICIT, GIT_DIR_DISCOVERED, and GIT_DIR_BARE will cause setup_git_directory_gently() to call these setup functions. This was obvious (but partly incorrect) before this change when GIT_DIR_HIT_MOUNT_POINT returned early from setup_git_directory_gently(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-28 00:36:29 +01:00
if (!nongit_ok)
die(_("not a git repository (or any parent up to mount point %s)\n"
"Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set)."),
dir.buf);
*nongit_ok = 1;
break;
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
case GIT_DIR_INVALID_OWNERSHIP:
if (!nongit_ok) {
struct strbuf quoted = STRBUF_INIT;
sq_quote_buf_pretty(&quoted, dir.buf);
setup: tighten ownership checks post CVE-2022-24765 8959555cee7 (setup_git_directory(): add an owner check for the top-level directory, 2022-03-02), adds a function to check for ownership of repositories using a directory that is representative of it, and ways to add exempt a specific repository from said check if needed, but that check didn't account for owership of the gitdir, or (when used) the gitfile that points to that gitdir. An attacker could create a git repository in a directory that they can write into but that is owned by the victim to work around the fix that was introduced with CVE-2022-24765 to potentially run code as the victim. An example that could result in privilege escalation to root in *NIX would be to set a repository in a shared tmp directory by doing (for example): $ git -C /tmp init To avoid that, extend the ensure_valid_ownership function to be able to check for all three paths. This will have the side effect of tripling the number of stat() calls when a repository is detected, but the effect is expected to be likely minimal, as it is done only once during the directory walk in which Git looks for a repository. Additionally make sure to resolve the gitfile (if one was used) to find the relevant gitdir for checking. While at it change the message printed on failure so it is clear we are referring to the repository by its worktree (or gitdir if it is bare) and not to a specific directory. Helped-by: Junio C Hamano <junio@pobox.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
2022-05-10 21:35:29 +02:00
die(_("detected dubious ownership in repository at '%s'\n"
setup_git_directory(): add an owner check for the top-level directory It poses a security risk to search for a git directory outside of the directories owned by the current user. For example, it is common e.g. in computer pools of educational institutes to have a "scratch" space: a mounted disk with plenty of space that is regularly swiped where any authenticated user can create a directory to do their work. Merely navigating to such a space with a Git-enabled `PS1` when there is a maliciously-crafted `/scratch/.git/` can lead to a compromised account. The same holds true in multi-user setups running Windows, as `C:\` is writable to every authenticated user by default. To plug this vulnerability, we stop Git from accepting top-level directories owned by someone other than the current user. We avoid looking at the ownership of each and every directories between the current and the top-level one (if there are any between) to avoid introducing a performance bottleneck. This new default behavior is obviously incompatible with the concept of shared repositories, where we expect the top-level directory to be owned by only one of its legitimate users. To re-enable that use case, we add support for adding exceptions from the new default behavior via the config setting `safe.directory`. The `safe.directory` config setting is only respected in the system and global configs, not from repository configs or via the command-line, and can have multiple values to allow for multiple shared repositories. We are particularly careful to provide a helpful message to any user trying to use a shared repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2022-03-02 12:23:04 +01:00
"To add an exception for this directory, call:\n"
"\n"
"\tgit config --global --add safe.directory %s"),
dir.buf, quoted.buf);
}
*nongit_ok = 1;
break;
Simplify handling of setup_git_directory_gently() failure cases. setup_git_directory_gently() expects two types of failures to discover a git directory (e.g. .git/): - GIT_DIR_HIT_CEILING: could not find a git directory in any parent directories of the cwd. - GIT_DIR_HIT_MOUNT_POINT: could not find a git directory in any parent directories up to the mount point of the cwd. Both cases are handled in a similar way, but there are misleading and unimportant differences. In both cases, setup_git_directory_gently() should: - Die if we are not in a git repository. Otherwise: - Set nongit_ok = 1, indicating that we are not in a git repository but this is ok. - Call strbuf_release() on any non-static struct strbufs that we allocated. Before this change are two misleading additional behaviors: - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no apparent reason. We never had the chance to change directories up to this point so chdir(current cwd) is pointless. - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer of a static struct strbuf (cwd). This is unnecessary because the struct is static so its buffer is always reachable. This is also misleading because nowhere else in the function is this buffer released. This change eliminates these two misleading additional behaviors and deletes setup_nogit() because the code is clearer without it. The result is that we can see clearly that GIT_DIR_HIT_CEILING and GIT_DIR_HIT_MOUNT_POINT lead to the same behavior (ignoring the different help messages). During review, this change was amended to additionally include: - Neither GIT_DIR_HIT_CEILING nor GIT_DIR_HIT_MOUNT_POINT may return early from setup_git_directory_gently() before the GIT_PREFIX environment variable is reset. Change both cases to break instead of return. See GIT_PREFIX below for more details. - GIT_DIR_NONE: setup_git_directory_gently_1() never returns this value, but if it ever did, setup_git_directory_gently() would incorrectly record that it had found a repository. Explicitly BUG on this case because it is underspecified. - GIT_PREFIX: this environment variable must always match the value of startup_info->prefix and the prefix returned from setup_git_directory_gently(). Make how we handle this slightly more repetitive but also more clear. - setup_git_env() and repo_set_hash_algo(): Add comments showing that only GIT_DIR_EXPLICIT, GIT_DIR_DISCOVERED, and GIT_DIR_BARE will cause setup_git_directory_gently() to call these setup functions. This was obvious (but partly incorrect) before this change when GIT_DIR_HIT_MOUNT_POINT returned early from setup_git_directory_gently(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-28 00:36:29 +01:00
case GIT_DIR_NONE:
/*
* As a safeguard against setup_git_directory_gently_1 returning
* this value, fallthrough to BUG. Otherwise it is possible to
* set startup_info->have_repository to 1 when we did nothing to
* find a repository.
*/
default:
BUG("unhandled setup_git_directory_1() result");
}
Simplify handling of setup_git_directory_gently() failure cases. setup_git_directory_gently() expects two types of failures to discover a git directory (e.g. .git/): - GIT_DIR_HIT_CEILING: could not find a git directory in any parent directories of the cwd. - GIT_DIR_HIT_MOUNT_POINT: could not find a git directory in any parent directories up to the mount point of the cwd. Both cases are handled in a similar way, but there are misleading and unimportant differences. In both cases, setup_git_directory_gently() should: - Die if we are not in a git repository. Otherwise: - Set nongit_ok = 1, indicating that we are not in a git repository but this is ok. - Call strbuf_release() on any non-static struct strbufs that we allocated. Before this change are two misleading additional behaviors: - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no apparent reason. We never had the chance to change directories up to this point so chdir(current cwd) is pointless. - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer of a static struct strbuf (cwd). This is unnecessary because the struct is static so its buffer is always reachable. This is also misleading because nowhere else in the function is this buffer released. This change eliminates these two misleading additional behaviors and deletes setup_nogit() because the code is clearer without it. The result is that we can see clearly that GIT_DIR_HIT_CEILING and GIT_DIR_HIT_MOUNT_POINT lead to the same behavior (ignoring the different help messages). During review, this change was amended to additionally include: - Neither GIT_DIR_HIT_CEILING nor GIT_DIR_HIT_MOUNT_POINT may return early from setup_git_directory_gently() before the GIT_PREFIX environment variable is reset. Change both cases to break instead of return. See GIT_PREFIX below for more details. - GIT_DIR_NONE: setup_git_directory_gently_1() never returns this value, but if it ever did, setup_git_directory_gently() would incorrectly record that it had found a repository. Explicitly BUG on this case because it is underspecified. - GIT_PREFIX: this environment variable must always match the value of startup_info->prefix and the prefix returned from setup_git_directory_gently(). Make how we handle this slightly more repetitive but also more clear. - setup_git_env() and repo_set_hash_algo(): Add comments showing that only GIT_DIR_EXPLICIT, GIT_DIR_DISCOVERED, and GIT_DIR_BARE will cause setup_git_directory_gently() to call these setup functions. This was obvious (but partly incorrect) before this change when GIT_DIR_HIT_MOUNT_POINT returned early from setup_git_directory_gently(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-28 00:36:29 +01:00
/*
* At this point, nongit_ok is stable. If it is non-NULL and points
* to a non-zero value, then this means that we haven't found a
* repository and that the caller expects startup_info to reflect
* this.
*
* Regardless of the state of nongit_ok, startup_info->prefix and
* the GIT_PREFIX environment variable must always match. For details
* see Documentation/config/alias.txt.
*/
if (nongit_ok && *nongit_ok)
Simplify handling of setup_git_directory_gently() failure cases. setup_git_directory_gently() expects two types of failures to discover a git directory (e.g. .git/): - GIT_DIR_HIT_CEILING: could not find a git directory in any parent directories of the cwd. - GIT_DIR_HIT_MOUNT_POINT: could not find a git directory in any parent directories up to the mount point of the cwd. Both cases are handled in a similar way, but there are misleading and unimportant differences. In both cases, setup_git_directory_gently() should: - Die if we are not in a git repository. Otherwise: - Set nongit_ok = 1, indicating that we are not in a git repository but this is ok. - Call strbuf_release() on any non-static struct strbufs that we allocated. Before this change are two misleading additional behaviors: - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no apparent reason. We never had the chance to change directories up to this point so chdir(current cwd) is pointless. - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer of a static struct strbuf (cwd). This is unnecessary because the struct is static so its buffer is always reachable. This is also misleading because nowhere else in the function is this buffer released. This change eliminates these two misleading additional behaviors and deletes setup_nogit() because the code is clearer without it. The result is that we can see clearly that GIT_DIR_HIT_CEILING and GIT_DIR_HIT_MOUNT_POINT lead to the same behavior (ignoring the different help messages). During review, this change was amended to additionally include: - Neither GIT_DIR_HIT_CEILING nor GIT_DIR_HIT_MOUNT_POINT may return early from setup_git_directory_gently() before the GIT_PREFIX environment variable is reset. Change both cases to break instead of return. See GIT_PREFIX below for more details. - GIT_DIR_NONE: setup_git_directory_gently_1() never returns this value, but if it ever did, setup_git_directory_gently() would incorrectly record that it had found a repository. Explicitly BUG on this case because it is underspecified. - GIT_PREFIX: this environment variable must always match the value of startup_info->prefix and the prefix returned from setup_git_directory_gently(). Make how we handle this slightly more repetitive but also more clear. - setup_git_env() and repo_set_hash_algo(): Add comments showing that only GIT_DIR_EXPLICIT, GIT_DIR_DISCOVERED, and GIT_DIR_BARE will cause setup_git_directory_gently() to call these setup functions. This was obvious (but partly incorrect) before this change when GIT_DIR_HIT_MOUNT_POINT returned early from setup_git_directory_gently(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-28 00:36:29 +01:00
startup_info->have_repository = 0;
else
Simplify handling of setup_git_directory_gently() failure cases. setup_git_directory_gently() expects two types of failures to discover a git directory (e.g. .git/): - GIT_DIR_HIT_CEILING: could not find a git directory in any parent directories of the cwd. - GIT_DIR_HIT_MOUNT_POINT: could not find a git directory in any parent directories up to the mount point of the cwd. Both cases are handled in a similar way, but there are misleading and unimportant differences. In both cases, setup_git_directory_gently() should: - Die if we are not in a git repository. Otherwise: - Set nongit_ok = 1, indicating that we are not in a git repository but this is ok. - Call strbuf_release() on any non-static struct strbufs that we allocated. Before this change are two misleading additional behaviors: - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no apparent reason. We never had the chance to change directories up to this point so chdir(current cwd) is pointless. - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer of a static struct strbuf (cwd). This is unnecessary because the struct is static so its buffer is always reachable. This is also misleading because nowhere else in the function is this buffer released. This change eliminates these two misleading additional behaviors and deletes setup_nogit() because the code is clearer without it. The result is that we can see clearly that GIT_DIR_HIT_CEILING and GIT_DIR_HIT_MOUNT_POINT lead to the same behavior (ignoring the different help messages). During review, this change was amended to additionally include: - Neither GIT_DIR_HIT_CEILING nor GIT_DIR_HIT_MOUNT_POINT may return early from setup_git_directory_gently() before the GIT_PREFIX environment variable is reset. Change both cases to break instead of return. See GIT_PREFIX below for more details. - GIT_DIR_NONE: setup_git_directory_gently_1() never returns this value, but if it ever did, setup_git_directory_gently() would incorrectly record that it had found a repository. Explicitly BUG on this case because it is underspecified. - GIT_PREFIX: this environment variable must always match the value of startup_info->prefix and the prefix returned from setup_git_directory_gently(). Make how we handle this slightly more repetitive but also more clear. - setup_git_env() and repo_set_hash_algo(): Add comments showing that only GIT_DIR_EXPLICIT, GIT_DIR_DISCOVERED, and GIT_DIR_BARE will cause setup_git_directory_gently() to call these setup functions. This was obvious (but partly incorrect) before this change when GIT_DIR_HIT_MOUNT_POINT returned early from setup_git_directory_gently(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-28 00:36:29 +01:00
startup_info->have_repository = 1;
setup: make startup_info available everywhere Commit a60645f (setup: remember whether repository was found, 2010-08-05) introduced the startup_info structure, which records some parts of the setup_git_directory() process (notably, whether we actually found a repository or not). One of the uses of this data is for functions to behave appropriately based on whether we are in a repo. But the startup_info struct is just a pointer to storage provided by the main program, and the only program that sets it up is the git.c wrapper. Thus builtins have access to startup_info, but externally linked programs do not. Worse, library code which is accessible from both has to be careful about accessing startup_info. This can be used to trigger a die("BUG") via get_sha1(): $ git fast-import <<-\EOF tag foo from HEAD:./whatever EOF fatal: BUG: startup_info struct is not initialized. Obviously that's fairly nonsensical input to feed to fast-import, but we should never hit a die("BUG"). And there may be other ways to trigger it if other non-builtins resolve sha1s. So let's point the storage for startup_info to a static variable in setup.c, making it available to all users of the library code. We _could_ turn startup_info into a regular extern struct, but doing so would mean tweaking all of the existing use sites. So let's leave the pointer indirection in place. We can, however, drop any checks for NULL, as they will always be false (and likewise, we can drop the test covering this case, which was a rather artificial situation using one of the test-* programs). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-05 23:10:27 +01:00
/*
* Not all paths through the setup code will call 'set_git_dir()' (which
* directly sets up the environment) so in order to guarantee that the
* environment is in a consistent state after setup, explicitly setup
* the environment if we have a repository.
*
* NEEDSWORK: currently we allow bogus GIT_DIR values to be set in some
* code paths so we also need to explicitly setup the environment if
* the user has set GIT_DIR. It may be beneficial to disallow bogus
* GIT_DIR values at some point in the future.
*/
Simplify handling of setup_git_directory_gently() failure cases. setup_git_directory_gently() expects two types of failures to discover a git directory (e.g. .git/): - GIT_DIR_HIT_CEILING: could not find a git directory in any parent directories of the cwd. - GIT_DIR_HIT_MOUNT_POINT: could not find a git directory in any parent directories up to the mount point of the cwd. Both cases are handled in a similar way, but there are misleading and unimportant differences. In both cases, setup_git_directory_gently() should: - Die if we are not in a git repository. Otherwise: - Set nongit_ok = 1, indicating that we are not in a git repository but this is ok. - Call strbuf_release() on any non-static struct strbufs that we allocated. Before this change are two misleading additional behaviors: - GIT_DIR_HIT_CEILING: setup_nongit() changes to the cwd for no apparent reason. We never had the chance to change directories up to this point so chdir(current cwd) is pointless. - GIT_DIR_HIT_MOUNT_POINT: strbuf_release() frees the buffer of a static struct strbuf (cwd). This is unnecessary because the struct is static so its buffer is always reachable. This is also misleading because nowhere else in the function is this buffer released. This change eliminates these two misleading additional behaviors and deletes setup_nogit() because the code is clearer without it. The result is that we can see clearly that GIT_DIR_HIT_CEILING and GIT_DIR_HIT_MOUNT_POINT lead to the same behavior (ignoring the different help messages). During review, this change was amended to additionally include: - Neither GIT_DIR_HIT_CEILING nor GIT_DIR_HIT_MOUNT_POINT may return early from setup_git_directory_gently() before the GIT_PREFIX environment variable is reset. Change both cases to break instead of return. See GIT_PREFIX below for more details. - GIT_DIR_NONE: setup_git_directory_gently_1() never returns this value, but if it ever did, setup_git_directory_gently() would incorrectly record that it had found a repository. Explicitly BUG on this case because it is underspecified. - GIT_PREFIX: this environment variable must always match the value of startup_info->prefix and the prefix returned from setup_git_directory_gently(). Make how we handle this slightly more repetitive but also more clear. - setup_git_env() and repo_set_hash_algo(): Add comments showing that only GIT_DIR_EXPLICIT, GIT_DIR_DISCOVERED, and GIT_DIR_BARE will cause setup_git_directory_gently() to call these setup functions. This was obvious (but partly incorrect) before this change when GIT_DIR_HIT_MOUNT_POINT returned early from setup_git_directory_gently(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-28 00:36:29 +01:00
if (/* GIT_DIR_EXPLICIT, GIT_DIR_DISCOVERED, GIT_DIR_BARE */
startup_info->have_repository ||
/* GIT_DIR_EXPLICIT */
getenv(GIT_DIR_ENVIRONMENT)) {
if (!the_repository->gitdir) {
const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
if (!gitdir)
gitdir = DEFAULT_GIT_DIR_ENVIRONMENT;
setup_git_env(gitdir);
}
if (startup_info->have_repository) {
repo_set_hash_algo(the_repository, repo_fmt.hash_algo);
/* take ownership of repo_fmt.partial_clone */
the_repository->repository_format_partial_clone =
repo_fmt.partial_clone;
repo_fmt.partial_clone = NULL;
}
}
/*
* Since precompose_string_if_needed() needs to look at
* the core.precomposeunicode configuration, this
* has to happen after the above block that finds
* out where the repository is, i.e. a preparation
* for calling git_config_get_bool().
*/
if (prefix) {
prefix = precompose_string_if_needed(prefix);
startup_info->prefix = prefix;
setenv(GIT_PREFIX_ENVIRONMENT, prefix, 1);
} else {
startup_info->prefix = NULL;
setenv(GIT_PREFIX_ENVIRONMENT, "", 1);
}
setup_original_cwd();
strbuf_release(&dir);
strbuf_release(&gitdir);
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
clear_repository_format(&repo_fmt);
return prefix;
}
int git_config_perm(const char *var, const char *value)
{
int i;
char *endptr;
if (value == NULL)
return PERM_GROUP;
if (!strcmp(value, "umask"))
return PERM_UMASK;
if (!strcmp(value, "group"))
return PERM_GROUP;
if (!strcmp(value, "all") ||
!strcmp(value, "world") ||
!strcmp(value, "everybody"))
return PERM_EVERYBODY;
/* Parse octal numbers */
i = strtol(value, &endptr, 8);
/* If not an octal number, maybe true/false? */
if (*endptr != 0)
return git_config_bool(var, value) ? PERM_GROUP : PERM_UMASK;
/*
* Treat values 0, 1 and 2 as compatibility cases, otherwise it is
* a chmod value to restrict to.
*/
switch (i) {
case PERM_UMASK: /* 0 */
return PERM_UMASK;
case OLD_PERM_GROUP: /* 1 */
return PERM_GROUP;
case OLD_PERM_EVERYBODY: /* 2 */
return PERM_EVERYBODY;
}
/* A filemode value was given: 0xxx */
if ((i & 0600) != 0600)
die(_("problem with core.sharedRepository filemode value "
"(0%.3o).\nThe owner of files must always have "
"read and write permissions."), i);
/*
* Mask filemode value. Others can not get write permission.
* x flags for directories are handled separately.
*/
return -(i & 0666);
}
void check_repository_format(struct repository_format *fmt)
{
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
struct repository_format repo_fmt = REPOSITORY_FORMAT_INIT;
if (!fmt)
fmt = &repo_fmt;
check_repository_format_gently(get_git_dir(), fmt, NULL);
startup_info->have_repository = 1;
repo_set_hash_algo(the_repository, fmt->hash_algo);
the_repository->repository_format_partial_clone =
xstrdup_or_null(fmt->partial_clone);
setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-28 21:36:28 +01:00
clear_repository_format(&repo_fmt);
}
/*
* Returns the "prefix", a path to the current working directory
* relative to the work tree root, or NULL, if the current working
* directory is not a strict subdirectory of the work tree root. The
* prefix always ends with a '/' character.
*/
const char *setup_git_directory(void)
{
return setup_git_directory_gently(NULL);
}
const char *resolve_gitdir_gently(const char *suspect, int *return_error_code)
{
if (is_git_directory(suspect))
return suspect;
return read_gitfile_gently(suspect, return_error_code);
}
/* if any standard file descriptor is missing open it to /dev/null */
void sanitize_stdfds(void)
{
int fd = xopen("/dev/null", O_RDWR);
while (fd < 2)
fd = xdup(fd);
if (fd > 2)
close(fd);
}
int daemonize(void)
{
#ifdef NO_POSIX_GOODIES
errno = ENOSYS;
return -1;
#else
switch (fork()) {
case 0:
break;
case -1:
die_errno(_("fork failed"));
default:
exit(0);
}
if (setsid() == -1)
die_errno(_("setsid failed"));
close(0);
close(1);
close(2);
sanitize_stdfds();
return 0;
#endif
}