From f94804c1f2626831c6bdf8cc269a571324e3f2f2 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 11:19:18 -0400 Subject: [PATCH 01/34] t9300: drop some useless uses of cat These waste a process, and make the line longer than it needs to be. Signed-off-by: Jeff King --- t/t9300-fast-import.sh | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index d47560b634..1d2a7516fd 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2125,12 +2125,12 @@ test_expect_success 'R: export-marks feature results in a marks file being creat EOF - cat input | git fast-import && + git fast-import output && + git fast-import 2>output Date: Thu, 29 Aug 2019 13:43:23 -0400 Subject: [PATCH 02/34] t9300: create marks files for double-import-marks test Our tests confirm that providing two "import-marks" options in a fast-import stream is an error. However, the invoked command would fail even without covering this case, because the marks files themselves do not actually exist. Let's create the files to make sure we fail for the right reason (we actually do, because the option parsing happens before we open anything, but this future-proofs our test). Signed-off-by: Jeff King --- t/t9300-fast-import.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 1d2a7516fd..c0d04ec3ee 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2107,6 +2107,8 @@ test_expect_success 'R: abort on receiving feature after data command' ' ' test_expect_success 'R: only one import-marks feature allowed per stream' ' + >git.marks && + >git2.marks && cat >input <<-EOF && feature import-marks=git.marks feature import-marks=git2.marks From 11e934d56e46875b24d8a047d44b45ff243f6715 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 11:25:45 -0400 Subject: [PATCH 03/34] fast-import: tighten parsing of boolean command line options We parse options like "--max-pack-size=" using skip_prefix(), which makes sense to get at the bytes after the "=". However, we also parse "--quiet" and "--stats" with skip_prefix(), which allows things like "--quiet-nonsense" to behave like "--quiet". This was a mistaken conversion in 0f6927c229 (fast-import: put option parsing code in separate functions, 2009-12-04). Let's tighten this to an exact match, which was the original intent. Signed-off-by: Jeff King --- fast-import.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fast-import.c b/fast-import.c index 32ac5323f6..1d0a06ccfd 100644 --- a/fast-import.c +++ b/fast-import.c @@ -3312,9 +3312,9 @@ static int parse_one_option(const char *option) option_active_branches(option); } else if (skip_prefix(option, "export-pack-edges=", &option)) { option_export_pack_edges(option); - } else if (starts_with(option, "quiet")) { + } else if (!strcmp(option, "quiet")) { show_stats = 0; - } else if (starts_with(option, "stats")) { + } else if (!strcmp(option, "stats")) { show_stats = 1; } else { return 0; From e075dba3723875f478654068609f69b2a5af8566 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 13:07:04 -0400 Subject: [PATCH 04/34] fast-import: stop creating leading directories for import-marks When asked to import marks from "subdir/file.marks", we create the leading directory "subdir" if it doesn't exist. This makes no sense for importing marks, where we only ever open the path for reading. Most of the time this would be a noop, since if the marks file exists, then the leading directories exist, too. But if it doesn't (e.g., because --import-marks-if-exists was used), then we'd create the useless directory. This dates back to 580d5f83e7 (fast-import: always create marks_file directories, 2010-03-29). Even then it was useless, so it seems to have been added in error alongside the --export-marks case (which _is_ helpful). Signed-off-by: Jeff King --- fast-import.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fast-import.c b/fast-import.c index 1d0a06ccfd..a5b7685e84 100644 --- a/fast-import.c +++ b/fast-import.c @@ -3228,7 +3228,6 @@ static void option_import_marks(const char *marks, } import_marks_file = make_fast_import_path(marks); - safe_create_leading_directories_const(import_marks_file); import_marks_file_from_stream = from_stream; import_marks_file_ignore_missing = ignore_missing; } From 019683025f1b14d7cb671312ab01f7330e9b33e7 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 13:33:48 -0400 Subject: [PATCH 05/34] fast-import: delay creating leading directories for export-marks When we parse the --export-marks option, we don't immediately open the file, but we do create any leading directories. This can be especially confusing when a command-line option overrides an in-stream one, in which case we'd create the leading directory for the in-stream file, even though we never actually write the file. Let's instead create the directories just before opening the file, which means we'll create only useful directories. Note that this could change the handling of relative paths if we chdir() in between, but we don't actually do so; the only permanent chdir is from setup_git_directory() which runs before either code path (potentially we should take the pre-setup dir into account to avoid surprising the user, but that's an orthogonal change). The test just adapts the existing "override" test to use paths with leading directories. This checks both that the correct directory is created (which worked before but was not tested), and that the overridden one is not (our new fix here). While we're here, let's also check the error result of safe_create_leading_directories(). We'd presumably notice any failure immediately after when we try to open the file itself, but we can give a more specific error message in this case. Signed-off-by: Jeff King --- fast-import.c | 7 ++++++- t/t9300-fast-import.sh | 13 +++++++++++-- 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/fast-import.c b/fast-import.c index a5b7685e84..d64609b083 100644 --- a/fast-import.c +++ b/fast-import.c @@ -1861,6 +1861,12 @@ static void dump_marks(void) if (!export_marks_file || (import_marks_file && !import_marks_file_done)) return; + if (safe_create_leading_directories_const(export_marks_file)) { + failure |= error_errno("unable to create leading directories of %s", + export_marks_file); + return; + } + if (hold_lock_file_for_update(&mark_lock, export_marks_file, 0) < 0) { failure |= error_errno("Unable to write marks file %s", export_marks_file); @@ -3268,7 +3274,6 @@ static void option_active_branches(const char *branches) static void option_export_marks(const char *marks) { export_marks_file = make_fast_import_path(marks); - safe_create_leading_directories_const(export_marks_file); } static void option_cat_blob_fd(const char *fd) diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index c0d04ec3ee..1ba20c1f1a 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2132,8 +2132,17 @@ test_expect_success 'R: export-marks feature results in a marks file being creat ' test_expect_success 'R: export-marks options can be overridden by commandline options' ' - git fast-import --export-marks=other.marks input <<-\EOF && + feature export-marks=feature-sub/git.marks + blob + mark :1 + data 3 + hi + + EOF + git fast-import --export-marks=cmdline-sub/other.marks Date: Thu, 29 Aug 2019 14:37:26 -0400 Subject: [PATCH 06/34] fast-import: disallow "feature export-marks" by default The fast-import stream command "feature export-marks=" lets the stream write marks to an arbitrary path. This may be surprising if you are running fast-import against an untrusted input (which otherwise cannot do anything except update Git objects and refs). Let's disallow the use of this feature by default, and provide a command-line option to re-enable it (you can always just use the command-line --export-marks as well, but the in-stream version provides an easy way for exporters to control the process). This is a backwards-incompatible change, since the default is flipping to the new, safer behavior. However, since the main users of the in-stream versions would be import/export-based remote helpers, and since we trust remote helpers already (which are already running arbitrary code), we'll pass the new option by default when reading a remote helper's stream. This should minimize the impact. Note that the implementation isn't totally simple, as we have to work around the fact that fast-import doesn't parse its command-line options until after it has read any "feature" lines from the stream. This is how it lets command-line options override in-stream. But in our case, it's important to parse the new --allow-unsafe-features first. There are three options for resolving this: 1. Do a separate "early" pass over the options. This is easy for us to do because there are no command-line options that allow the "unstuck" form (so there's no chance of us mistaking an argument for an option), though it does introduce a risk of incorrect parsing later (e.g,. if we convert to parse-options). 2. Move the option parsing phase back to the start of the program, but teach the stream-reading code never to override an existing value. This is tricky, because stream "feature" lines override each other (meaning we'd have to start tracking the source for every option). 3. Accept that we might parse a "feature export-marks" line that is forbidden, as long we don't _act_ on it until after we've parsed the command line options. This would, in fact, work with the current code, but only because the previous patch fixed the export-marks parser to avoid touching the filesystem. So while it works, it does carry risk of somebody getting it wrong in the future in a rather subtle and unsafe way. I've gone with option (1) here as simple, safe, and unlikely to cause regressions. This fixes CVE-2019-1348. Signed-off-by: Jeff King --- Documentation/git-fast-import.txt | 14 ++++++++++++++ fast-import.c | 25 +++++++++++++++++++++++++ t/t9300-fast-import.sh | 23 +++++++++++++++-------- transport-helper.c | 1 + 4 files changed, 55 insertions(+), 8 deletions(-) diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt index 3d3d219e58..fbb3f914f2 100644 --- a/Documentation/git-fast-import.txt +++ b/Documentation/git-fast-import.txt @@ -50,6 +50,20 @@ OPTIONS memory used by fast-import during this run. Showing this output is currently the default, but can be disabled with --quiet. +--allow-unsafe-features:: + Many command-line options can be provided as part of the + fast-import stream itself by using the `feature` or `option` + commands. However, some of these options are unsafe (e.g., + allowing fast-import to access the filesystem outside of the + repository). These options are disabled by default, but can be + allowed by providing this option on the command line. This + currently impacts only the `feature export-marks` command. ++ + Only enable this option if you trust the program generating the + fast-import stream! This option is enabled automatically for + remote-helpers that use the `import` capability, as they are + already trusted to run their own code. + Options for Frontends ~~~~~~~~~~~~~~~~~~~~~ diff --git a/fast-import.c b/fast-import.c index d64609b083..967077ad0b 100644 --- a/fast-import.c +++ b/fast-import.c @@ -366,6 +366,7 @@ static uintmax_t next_mark; static struct strbuf new_data = STRBUF_INIT; static int seen_data_command; static int require_explicit_termination; +static int allow_unsafe_features; /* Signal handling */ static volatile sig_atomic_t checkpoint_requested; @@ -3320,6 +3321,8 @@ static int parse_one_option(const char *option) show_stats = 0; } else if (!strcmp(option, "stats")) { show_stats = 1; + } else if (!strcmp(option, "allow-unsafe-features")) { + ; /* already handled during early option parsing */ } else { return 0; } @@ -3327,6 +3330,13 @@ static int parse_one_option(const char *option) return 1; } +static void check_unsafe_feature(const char *feature, int from_stream) +{ + if (from_stream && !allow_unsafe_features) + die(_("feature '%s' forbidden in input without --allow-unsafe-features"), + feature); +} + static int parse_one_feature(const char *feature, int from_stream) { const char *arg; @@ -3338,6 +3348,7 @@ static int parse_one_feature(const char *feature, int from_stream) } else if (skip_prefix(feature, "import-marks-if-exists=", &arg)) { option_import_marks(arg, from_stream, 1); } else if (skip_prefix(feature, "export-marks=", &arg)) { + check_unsafe_feature(feature, from_stream); option_export_marks(arg); } else if (!strcmp(feature, "get-mark")) { ; /* Don't die - this feature is supported */ @@ -3464,6 +3475,20 @@ int cmd_main(int argc, const char **argv) avail_tree_table = xcalloc(avail_tree_table_sz, sizeof(struct avail_tree_content*)); marks = pool_calloc(1, sizeof(struct mark_set)); + /* + * We don't parse most options until after we've seen the set of + * "feature" lines at the start of the stream (which allows the command + * line to override stream data). But we must do an early parse of any + * command-line options that impact how we interpret the feature lines. + */ + for (i = 1; i < argc; i++) { + const char *arg = argv[i]; + if (*arg != '-' || !strcmp(arg, "--")) + break; + if (!strcmp(arg, "--allow-unsafe-features")) + allow_unsafe_features = 1; + } + global_argc = argc; global_argv = argv; diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 1ba20c1f1a..ba5a35c32c 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2117,6 +2117,11 @@ test_expect_success 'R: only one import-marks feature allowed per stream' ' test_must_fail git fast-import input && + test_must_fail git fast-import input <<-EOF && feature export-marks=git.marks @@ -2127,7 +2132,7 @@ test_expect_success 'R: export-marks feature results in a marks file being creat EOF - git fast-import one.marks && tail -n +3 marks.out > two.marks && - git fast-import --import-marks=one.marks --import-marks=two.marks in = helper->out; argv_array_push(&fastimport->args, "fast-import"); + argv_array_push(&fastimport->args, "--allow-unsafe-features"); argv_array_push(&fastimport->args, debug ? "--stats" : "--quiet"); if (data->bidi_import) { From a52ed76142f6e8d993bb4c50938a408966eb2b7c Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 15:08:42 -0400 Subject: [PATCH 07/34] fast-import: disallow "feature import-marks" by default As with export-marks in the previous commit, import-marks can access the filesystem. This is significantly less dangerous than export-marks because it only involves reading from arbitrary paths, rather than writing them. However, it could still be surprising and have security implications (e.g., exfiltrating data from a service that accepts fast-import streams). Let's lump it (and its "if-exists" counterpart) in with export-marks, and enable the in-stream version only if --allow-unsafe-features is set. Signed-off-by: Jeff King --- Documentation/git-fast-import.txt | 3 ++- fast-import.c | 2 ++ t/t9300-fast-import.sh | 22 +++++++++++++++++----- 3 files changed, 21 insertions(+), 6 deletions(-) diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt index fbb3f914f2..ff71fc2962 100644 --- a/Documentation/git-fast-import.txt +++ b/Documentation/git-fast-import.txt @@ -57,7 +57,8 @@ OPTIONS allowing fast-import to access the filesystem outside of the repository). These options are disabled by default, but can be allowed by providing this option on the command line. This - currently impacts only the `feature export-marks` command. + currently impacts only the `export-marks`, `import-marks`, and + `import-marks-if-exists` feature commands. + Only enable this option if you trust the program generating the fast-import stream! This option is enabled automatically for diff --git a/fast-import.c b/fast-import.c index 967077ad0b..93c3838254 100644 --- a/fast-import.c +++ b/fast-import.c @@ -3344,8 +3344,10 @@ static int parse_one_feature(const char *feature, int from_stream) if (skip_prefix(feature, "date-format=", &arg)) { option_date_format(arg); } else if (skip_prefix(feature, "import-marks=", &arg)) { + check_unsafe_feature("import-marks", from_stream); option_import_marks(arg, from_stream, 0); } else if (skip_prefix(feature, "import-marks-if-exists=", &arg)) { + check_unsafe_feature("import-marks-if-exists", from_stream); option_import_marks(arg, from_stream, 1); } else if (skip_prefix(feature, "export-marks=", &arg)) { check_unsafe_feature(feature, from_stream); diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index ba5a35c32c..77104f9daa 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2106,6 +2106,14 @@ test_expect_success 'R: abort on receiving feature after data command' ' test_must_fail git fast-import git.marks && + echo "feature import-marks=git.marks" >input && + test_must_fail git fast-import input && + test_must_fail git fast-import git.marks && >git2.marks && @@ -2114,7 +2122,7 @@ test_expect_success 'R: only one import-marks feature allowed per stream' ' feature import-marks=git2.marks EOF - test_must_fail git fast-import expect && - git fast-import --export-marks=io.marks <<-\EOF && + git fast-import --export-marks=io.marks \ + --allow-unsafe-features <<-\EOF && feature import-marks-if-exists=not_io.marks EOF test_cmp expect io.marks && @@ -2221,7 +2230,8 @@ test_expect_success 'R: feature import-marks-if-exists' ' echo ":1 $blob" >expect && echo ":2 $blob" >>expect && - git fast-import --export-marks=io.marks <<-\EOF && + git fast-import --export-marks=io.marks \ + --allow-unsafe-features <<-\EOF && feature import-marks-if-exists=io.marks blob mark :2 @@ -2234,7 +2244,8 @@ test_expect_success 'R: feature import-marks-if-exists' ' echo ":3 $blob" >>expect && git fast-import --import-marks=io.marks \ - --export-marks=io.marks <<-\EOF && + --export-marks=io.marks \ + --allow-unsafe-features <<-\EOF && feature import-marks-if-exists=not_io.marks blob mark :3 @@ -2247,7 +2258,8 @@ test_expect_success 'R: feature import-marks-if-exists' ' >expect && git fast-import --import-marks-if-exists=not_io.marks \ - --export-marks=io.marks <<-\EOF && + --export-marks=io.marks \ + --allow-unsafe-features <<-\EOF && feature import-marks-if-exists=io.marks EOF test_cmp expect io.marks From 0060fd1511b94c918928fa3708f69a3f33895a4a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 12 Sep 2019 14:20:39 +0200 Subject: [PATCH 08/34] clone --recurse-submodules: prevent name squatting on Windows In addition to preventing `.git` from being tracked by Git, on Windows we also have to prevent `git~1` from being tracked, as the default NTFS short name (also known as the "8.3 filename") for the file name `.git` is `git~1`, otherwise it would be possible for malicious repositories to write directly into the `.git/` directory, e.g. a `post-checkout` hook that would then be executed _during_ a recursive clone. When we implemented appropriate protections in 2b4c6efc821 (read-cache: optionally disallow NTFS .git variants, 2014-12-16), we had analyzed carefully that the `.git` directory or file would be guaranteed to be the first directory entry to be written. Otherwise it would be possible e.g. for a file named `..git` to be assigned the short name `git~1` and subsequently, the short name generated for `.git` would be `git~2`. Or `git~3`. Or even `~9999999` (for a detailed explanation of the lengths we have to go to protect `.gitmodules`, see the commit message of e7cb0b4455c (is_ntfs_dotgit: match other .git files, 2018-05-11)). However, by exploiting two issues (that will be addressed in a related patch series close by), it is currently possible to clone a submodule into a non-empty directory: - On Windows, file names cannot end in a space or a period (for historical reasons: the period separating the base name from the file extension was not actually written to disk, and the base name/file extension was space-padded to the full 8/3 characters, respectively). Helpfully, when creating a directory under the name, say, `sub.`, that trailing period is trimmed automatically and the actual name on disk is `sub`. This means that while Git thinks that the submodule names `sub` and `sub.` are different, they both access `.git/modules/sub/`. - While the backslash character is a valid file name character on Linux, it is not so on Windows. As Git tries to be cross-platform, it therefore allows backslash characters in the file names stored in tree objects. Which means that it is totally possible that a submodule `c` sits next to a file `c\..git`, and on Windows, during recursive clone a file called `..git` will be written into `c/`, of course _before_ the submodule is cloned. Note that the actual exploit is not quite as simple as having a submodule `c` next to a file `c\..git`, as we have to make sure that the directory `.git/modules/b` already exists when the submodule is checked out, otherwise a different code path is taken in `module_clone()` that does _not_ allow a non-empty submodule directory to exist already. Even if we will address both issues nearby (the next commit will disallow backslash characters in tree entries' file names on Windows, and another patch will disallow creating directories/files with trailing spaces or periods), it is a wise idea to defend in depth against this sort of attack vector: when submodules are cloned recursively, we now _require_ the directory to be empty, addressing CVE-2019-1349. Note: the code path we patch is shared with the code path of `git submodule update --init`, which must not expect, in general, that the directory is empty. Hence we have to introduce the new option `--force-init` and hand it all the way down from `git submodule` to the actual `git submodule--helper` process that performs the initial clone. Reported-by: Nicolas Joly Signed-off-by: Johannes Schindelin --- builtin/clone.c | 2 +- builtin/submodule--helper.c | 13 ++++++++++++- git-submodule.sh | 6 ++++++ t/t7415-submodule-names.sh | 31 +++++++++++++++++++++++++++++++ 4 files changed, 50 insertions(+), 2 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index f7e17d2295..44b716923f 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -757,7 +757,7 @@ static int checkout(int submodule_progress) if (!err && (option_recurse_submodules.nr > 0)) { struct argv_array args = ARGV_ARRAY_INIT; - argv_array_pushl(&args, "submodule", "update", "--init", "--recursive", NULL); + argv_array_pushl(&args, "submodule", "update", "--require-init", "--recursive", NULL); if (option_shallow_submodules == 1) argv_array_push(&args, "--depth=1"); diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 676cfed770..79156fac45 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -13,6 +13,7 @@ #include "remote.h" #include "refs.h" #include "connect.h" +#include "dir.h" static char *get_default_remote(void) { @@ -623,6 +624,7 @@ static int module_clone(int argc, const char **argv, const char *prefix) char *p, *path = NULL, *sm_gitdir; struct strbuf sb = STRBUF_INIT; struct string_list reference = STRING_LIST_INIT_NODUP; + int require_init = 0; char *sm_alternate = NULL, *error_strategy = NULL; struct option module_clone_options[] = { @@ -647,6 +649,8 @@ static int module_clone(int argc, const char **argv, const char *prefix) OPT__QUIET(&quiet, "Suppress output for cloning a submodule"), OPT_BOOL(0, "progress", &progress, N_("force cloning progress")), + OPT_BOOL(0, "require-init", &require_init, + N_("disallow cloning into non-empty directory")), OPT_END() }; @@ -685,6 +689,8 @@ static int module_clone(int argc, const char **argv, const char *prefix) die(_("clone of '%s' into submodule path '%s' failed"), url, path); } else { + if (require_init && !access(path, X_OK) && !is_empty_dir(path)) + die(_("directory not empty: '%s'"), path); if (safe_create_leading_directories_const(path) < 0) die(_("could not create directory '%s'"), path); strbuf_addf(&sb, "%s/index", sm_gitdir); @@ -733,6 +739,7 @@ struct submodule_update_clone { int quiet; int recommend_shallow; struct string_list references; + unsigned require_init; const char *depth; const char *recursive_prefix; const char *prefix; @@ -748,7 +755,7 @@ struct submodule_update_clone { int failed_clones_nr, failed_clones_alloc; }; #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \ - SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, \ + SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \ NULL, NULL, NULL, \ STRING_LIST_INIT_DUP, 0, NULL, 0, 0} @@ -850,6 +857,8 @@ static int prepare_to_clone_next_submodule(const struct cache_entry *ce, argv_array_pushl(&child->args, "--prefix", suc->prefix, NULL); if (suc->recommend_shallow && sub->recommend_shallow == 1) argv_array_push(&child->args, "--depth=1"); + if (suc->require_init) + argv_array_push(&child->args, "--require-init"); argv_array_pushl(&child->args, "--path", sub->path, NULL); argv_array_pushl(&child->args, "--name", sub->name, NULL); argv_array_pushl(&child->args, "--url", sub->url, NULL); @@ -992,6 +1001,8 @@ static int update_clone(int argc, const char **argv, const char *prefix) OPT__QUIET(&suc.quiet, N_("don't print cloning progress")), OPT_BOOL(0, "progress", &suc.progress, N_("force cloning progress")), + OPT_BOOL(0, "require-init", &suc.require_init, + N_("disallow cloning into non-empty directory")), OPT_END() }; diff --git a/git-submodule.sh b/git-submodule.sh index 8f260fbd9c..e4843a5874 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -34,6 +34,7 @@ reference= cached= recursive= init= +require_init= files= remote= nofetch= @@ -528,6 +529,10 @@ cmd_update() -i|--init) init=1 ;; + --require-init) + init=1 + require_init=1 + ;; --remote) remote=1 ;; @@ -606,6 +611,7 @@ cmd_update() ${update:+--update "$update"} \ ${reference:+"$reference"} \ ${depth:+--depth "$depth"} \ + ${require_init:+--require-init} \ ${recommend_shallow:+"$recommend_shallow"} \ ${jobs:+$jobs} \ "$@" || echo "#unmatched" $? diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh index 75fa071c6d..e1cd0a3545 100755 --- a/t/t7415-submodule-names.sh +++ b/t/t7415-submodule-names.sh @@ -73,4 +73,35 @@ test_expect_success 'clone evil superproject' ' ! grep "RUNNING POST CHECKOUT" output ' +test_expect_success MINGW 'prevent git~1 squatting on Windows' ' + git init squatting && + ( + cd squatting && + mkdir a && + touch a/..git && + git add a/..git && + test_tick && + git commit -m initial && + + modules="$(test_write_lines \ + "[submodule \"b.\"]" "url = ." "path = c" \ + "[submodule \"b\"]" "url = ." "path = d\\\\a" | + git hash-object -w --stdin)" && + rev="$(git rev-parse --verify HEAD)" && + hash="$(echo x | git hash-object -w --stdin)" && + git update-index --add \ + --cacheinfo 100644,$modules,.gitmodules \ + --cacheinfo 160000,$rev,c \ + --cacheinfo 160000,$rev,d\\a \ + --cacheinfo 100644,$hash,d./a/x \ + --cacheinfo 100644,$hash,d./a/..git && + test_tick && + git commit -m "module" + ) && + test_must_fail git \ + clone --recurse-submodules squatting squatting-clone 2>err && + test_i18ngrep "directory not empty" err && + ! grep gitdir squatting-clone/d/a/git~2 +' + test_done From e1d911dd4c7b76a5a8cec0f5c8de15981e34da83 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 12 Sep 2019 14:54:05 +0200 Subject: [PATCH 09/34] mingw: disallow backslash characters in tree objects' file names The backslash character is not a valid part of a file name on Windows. Hence it is dangerous to allow writing files that were unpacked from tree objects, when the stored file name contains a backslash character: it will be misinterpreted as directory separator. This not only causes ambiguity when a tree contains a blob `a\b` and a tree `a` that contains a blob `b`, but it also can be used as part of an attack vector to side-step the careful protections against writing into the `.git/` directory during a clone of a maliciously-crafted repository. Let's prevent that, addressing CVE-2019-1354. Note: we guard against backslash characters in tree objects' file names _only_ on Windows (because on other platforms, even on those where NTFS volumes can be mounted, the backslash character is _not_ a directory separator), and _only_ when `core.protectNTFS = true` (because users might need to generate tree objects for other platforms, of course without touching the worktree, e.g. using `git update-index --cacheinfo`). Signed-off-by: Johannes Schindelin --- t/t1450-fsck.sh | 1 + t/t7415-submodule-names.sh | 8 +++++--- t/t9350-fast-export.sh | 1 + tree-walk.c | 6 ++++++ 4 files changed, 13 insertions(+), 3 deletions(-) diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index cb4b66e29d..33c955f912 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -419,6 +419,7 @@ while read name path pretty; do ( git init $name-$type && cd $name-$type && + git config core.protectNTFS false && echo content >file && git add file && git commit -m base && diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh index e1cd0a3545..7c65e7a35c 100755 --- a/t/t7415-submodule-names.sh +++ b/t/t7415-submodule-names.sh @@ -89,16 +89,18 @@ test_expect_success MINGW 'prevent git~1 squatting on Windows' ' git hash-object -w --stdin)" && rev="$(git rev-parse --verify HEAD)" && hash="$(echo x | git hash-object -w --stdin)" && - git update-index --add \ + git -c core.protectNTFS=false update-index --add \ --cacheinfo 100644,$modules,.gitmodules \ --cacheinfo 160000,$rev,c \ --cacheinfo 160000,$rev,d\\a \ --cacheinfo 100644,$hash,d./a/x \ --cacheinfo 100644,$hash,d./a/..git && test_tick && - git commit -m "module" + git -c core.protectNTFS=false commit -m "module" && + test_must_fail git show HEAD: 2>err && + test_i18ngrep backslash err ) && - test_must_fail git \ + test_must_fail git -c core.protectNTFS=false \ clone --recurse-submodules squatting squatting-clone 2>err && test_i18ngrep "directory not empty" err && ! grep gitdir squatting-clone/d/a/git~2 diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index 866ddf6058..e6062071e6 100755 --- a/t/t9350-fast-export.sh +++ b/t/t9350-fast-export.sh @@ -421,6 +421,7 @@ test_expect_success 'directory becomes symlink' ' test_expect_success 'fast-export quotes pathnames' ' git init crazy-paths && + test_config -C crazy-paths core.protectNTFS false && (cd crazy-paths && blob=$(echo foo | git hash-object -w --stdin) && git update-index --add \ diff --git a/tree-walk.c b/tree-walk.c index d459feda23..54ff959d7f 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -41,6 +41,12 @@ static int decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned l strbuf_addstr(err, _("empty filename in tree entry")); return -1; } +#ifdef GIT_WINDOWS_NATIVE + if (protect_ntfs && strchr(path, '\\')) { + strbuf_addf(err, _("filename in tree entry contains backslash: '%s'"), path); + return -1; + } +#endif len = strlen(path) + 1; /* Initialize the descriptor entry */ From 525e7fba7854c23ee3530d0bf88d75f106f14c95 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 16 Sep 2019 20:44:31 +0200 Subject: [PATCH 10/34] path.c: document the purpose of `is_ntfs_dotgit()` Previously, this function was completely undocumented. It is worth, though, to explain what is going on, as it is not really obvious at all. Suggested-by: Garima Singh Signed-off-by: Johannes Schindelin --- path.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/path.c b/path.c index 9ac0531a29..22bd0b6f52 100644 --- a/path.c +++ b/path.c @@ -1302,6 +1302,34 @@ static int only_spaces_and_periods(const char *path, size_t len, size_t skip) return 1; } +/* + * On NTFS, we need to be careful to disallow certain synonyms of the `.git/` + * directory: + * + * - For historical reasons, file names that end in spaces or periods are + * automatically trimmed. Therefore, `.git . . ./` is a valid way to refer + * to `.git/`. + * + * - For other historical reasons, file names that do not conform to the 8.3 + * format (up to eight characters for the basename, three for the file + * extension, certain characters not allowed such as `+`, etc) are associated + * with a so-called "short name", at least on the `C:` drive by default. + * Which means that `git~1/` is a valid way to refer to `.git/`. + * + * Note: Technically, `.git/` could receive the short name `git~2` if the + * short name `git~1` were already used. In Git, however, we guarantee that + * `.git` is the first item in a directory, therefore it will be associated + * with the short name `git~1` (unless short names are disabled). + * + * When this function returns 1, it indicates that the specified file/directory + * name refers to a `.git` file or directory, or to any of these synonyms, and + * Git should therefore not track it. + * + * This function is intended to be used by `git fsck` even on platforms where + * the backslash is a regular filename character, therefore it needs to handle + * backlash characters in the provided `name` specially: they are interpreted + * as directory separators. + */ int is_ntfs_dotgit(const char *name) { size_t len; From a62f9d1ace8c6556cbc1bb7df69eff0a0bb9e774 Mon Sep 17 00:00:00 2001 From: Garima Singh Date: Wed, 4 Sep 2019 13:36:39 -0400 Subject: [PATCH 11/34] test-path-utils: offer to run a protectNTFS/protectHFS benchmark In preparation to flipping the default on `core.protectNTFS`, let's have some way to measure the speed impact of this config setting reliably (and for comparison, the `core.protectHFS` config setting). For now, this is a manual performance benchmark: ./t/helper/test-path-utils protect_ntfs_hfs [arguments...] where the arguments are an optional number of file names to test with, optionally followed by minimum and maximum length of the random file names. The default values are one million, 3 and 20, respectively. Just like `sqrti()` in `bisect.c`, we introduce a very simple function to approximation the square root of a given value, in order to avoid having to introduce the first user of `` in Git's source code. Note: this is _not_ implemented as a Unix shell script in t/perf/ because we really care about _very_ precise timings here, and Unix shell scripts are simply unsuited for precise and consistent benchmarking. Signed-off-by: Garima Singh Signed-off-by: Johannes Schindelin --- t/helper/test-path-utils.c | 96 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 96 insertions(+) diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c index 94846550f7..16d8e689c8 100644 --- a/t/helper/test-path-utils.c +++ b/t/helper/test-path-utils.c @@ -176,6 +176,99 @@ static int is_dotgitmodules(const char *path) return is_hfs_dotgitmodules(path) || is_ntfs_dotgitmodules(path); } +/* + * A very simple, reproducible pseudo-random generator. Copied from + * `test-genrandom.c`. + */ +static uint64_t my_random_value = 1234; + +static uint64_t my_random(void) +{ + my_random_value = my_random_value * 1103515245 + 12345; + return my_random_value; +} + +/* + * A fast approximation of the square root, without requiring math.h. + * + * It uses Newton's method to approximate the solution of 0 = x^2 - value. + */ +static double my_sqrt(double value) +{ + const double epsilon = 1e-6; + double x = value; + + if (value == 0) + return 0; + + for (;;) { + double delta = (value / x - x) / 2; + if (delta < epsilon && delta > -epsilon) + return x + delta; + x += delta; + } +} + +static int protect_ntfs_hfs_benchmark(int argc, const char **argv) +{ + size_t i, j, nr, min_len = 3, max_len = 20; + char **names; + int repetitions = 15, file_mode = 0100644; + uint64_t begin, end; + double m[3][2], v[3][2]; + uint64_t cumul; + double cumul2; + + if (argc > 1 && !strcmp(argv[1], "--with-symlink-mode")) { + file_mode = 0120000; + argc--; + argv++; + } + + nr = argc > 1 ? strtoul(argv[1], NULL, 0) : 1000000; + ALLOC_ARRAY(names, nr); + + if (argc > 2) { + min_len = strtoul(argv[2], NULL, 0); + if (argc > 3) + max_len = strtoul(argv[3], NULL, 0); + if (min_len > max_len) + die("min_len > max_len"); + } + + for (i = 0; i < nr; i++) { + size_t len = min_len + (my_random() % (max_len + 1 - min_len)); + + names[i] = xmallocz(len); + while (len > 0) + names[i][--len] = (char)(' ' + (my_random() % ('\x7f' - ' '))); + } + + for (protect_ntfs = 0; protect_ntfs < 2; protect_ntfs++) + for (protect_hfs = 0; protect_hfs < 2; protect_hfs++) { + cumul = 0; + cumul2 = 0; + for (i = 0; i < repetitions; i++) { + begin = getnanotime(); + for (j = 0; j < nr; j++) + verify_path(names[j], file_mode); + end = getnanotime(); + printf("protect_ntfs = %d, protect_hfs = %d: %lfms\n", protect_ntfs, protect_hfs, (end-begin) / (double)1e6); + cumul += end - begin; + cumul2 += (end - begin) * (end - begin); + } + m[protect_ntfs][protect_hfs] = cumul / (double)repetitions; + v[protect_ntfs][protect_hfs] = my_sqrt(cumul2 / (double)repetitions - m[protect_ntfs][protect_hfs] * m[protect_ntfs][protect_hfs]); + printf("mean: %lfms, stddev: %lfms\n", m[protect_ntfs][protect_hfs] / (double)1e6, v[protect_ntfs][protect_hfs] / (double)1e6); + } + + for (protect_ntfs = 0; protect_ntfs < 2; protect_ntfs++) + for (protect_hfs = 0; protect_hfs < 2; protect_hfs++) + printf("ntfs=%d/hfs=%d: %lf%% slower\n", protect_ntfs, protect_hfs, (m[protect_ntfs][protect_hfs] - m[0][0]) * 100 / m[0][0]); + + return 0; +} + int cmd_main(int argc, const char **argv) { if (argc == 3 && !strcmp(argv[1], "normalize_path_copy")) { @@ -290,6 +383,9 @@ int cmd_main(int argc, const char **argv) return !!res; } + if (argc > 1 && !strcmp(argv[1], "protect_ntfs_hfs")) + return !!protect_ntfs_hfs_benchmark(argc - 1, argv + 1); + fprintf(stderr, "%s: unknown function name: %s\n", argv[0], argv[1] ? argv[1] : "(there was none)"); return 1; From 288a74bcd28229a00c3632f18cba92dbfdf73ee9 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 23 Sep 2019 08:58:11 +0200 Subject: [PATCH 12/34] is_ntfs_dotgit(): only verify the leading segment The config setting `core.protectNTFS` is specifically designed to work not only on Windows, but anywhere, to allow for repositories hosted on, say, Linux servers to be protected against NTFS-specific attack vectors. As a consequence, `is_ntfs_dotgit()` manually splits backslash-separated paths (but does not do the same for paths separated by forward slashes), under the assumption that the backslash might not be a valid directory separator on the _current_ Operating System. However, the two callers, `verify_path()` and `fsck_tree()`, are supposed to feed only individual path segments to the `is_ntfs_dotgit()` function. This causes a lot of duplicate scanning (and very inefficient scanning, too, as the inner loop of `is_ntfs_dotgit()` was optimized for readability rather than for speed. Let's simplify the design of `is_ntfs_dotgit()` by putting the burden of splitting the paths by backslashes as directory separators on the callers of said function. Consequently, the `verify_path()` function, which already splits the path by directory separators, now treats backslashes as directory separators _explicitly_ when `core.protectNTFS` is turned on, even on platforms where the backslash is _not_ a directory separator. Note that we have to repeat some code in `verify_path()`: if the backslash is not a directory separator on the current Operating System, we want to allow file names like `\`, but we _do_ want to disallow paths that are clearly intended to cause harm when the repository is cloned on Windows. The `fsck_tree()` function (the other caller of `is_ntfs_dotgit()`) now needs to look for backslashes in tree entries' names specifically when `core.protectNTFS` is turned on. While it would be tempting to completely disallow backslashes in that case (much like `fsck` reports names containing forward slashes as "full paths"), this would be overzealous: when `core.protectNTFS` is turned on in a non-Windows setup, backslashes are perfectly valid characters in file names while we _still_ want to disallow tree entries that are clearly designed to exploit NTFS-specific behavior. This simplification will make subsequent changes easier to implement, such as turning `core.protectNTFS` on by default (not only on Windows) or protecting against attack vectors involving NTFS Alternate Data Streams. Incidentally, this change allows for catching malicious repositories that contain tree entries of the form `dir\.gitmodules` already on the server side rather than only on the client side (and previously only on Windows): in contrast to `is_ntfs_dotgit()`, the `is_ntfs_dotgitmodules()` function already expects the caller to split the paths by directory separators. Signed-off-by: Johannes Schindelin --- fsck.c | 11 ++++++++++- path.c | 5 +---- read-cache.c | 8 ++++++++ 3 files changed, 19 insertions(+), 5 deletions(-) diff --git a/fsck.c b/fsck.c index b1579c7e28..d80a96f4be 100644 --- a/fsck.c +++ b/fsck.c @@ -551,7 +551,7 @@ static int fsck_tree(struct tree *item, struct fsck_options *options) while (desc.size) { unsigned mode; - const char *name; + const char *name, *backslash; const struct object_id *oid; oid = tree_entry_extract(&desc, &name, &mode); @@ -565,6 +565,15 @@ static int fsck_tree(struct tree *item, struct fsck_options *options) is_hfs_dotgit(name) || is_ntfs_dotgit(name)); has_zero_pad |= *(char *)desc.buffer == '0'; + + if ((backslash = strchr(name, '\\'))) { + while (backslash) { + backslash++; + has_dotgit |= is_ntfs_dotgit(backslash); + backslash = strchr(backslash, '\\'); + } + } + if (update_tree_entry_gently(&desc)) { retval += report(options, &item->object, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); break; diff --git a/path.c b/path.c index 22bd0b6f52..f62a37d5f5 100644 --- a/path.c +++ b/path.c @@ -1342,10 +1342,7 @@ int is_ntfs_dotgit(const char *name) if (only_spaces_and_periods(name, len, 5) && !strncasecmp(name, "git~1", 5)) return 1; - if (name[len] != '\\') - return 0; - name += len + 1; - len = -1; + return 0; } } diff --git a/read-cache.c b/read-cache.c index 5b57b369e8..bde1e70c51 100644 --- a/read-cache.c +++ b/read-cache.c @@ -874,7 +874,15 @@ inside: if ((c == '.' && !verify_dotfile(path, mode)) || is_dir_sep(c) || c == '\0') return 0; + } else if (c == '\\' && protect_ntfs) { + if (is_ntfs_dotgit(path)) + return 0; + if (S_ISLNK(mode)) { + if (is_ntfs_dotgitmodules(path)) + return 0; + } } + c = *path++; } } From 7c3745fc6185495d5765628b4dfe1bd2c25a2981 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 28 Aug 2019 12:22:17 +0200 Subject: [PATCH 13/34] path: safeguard `.git` against NTFS Alternate Streams Accesses Probably inspired by HFS' resource streams, NTFS supports "Alternate Data Streams": by appending `:` to the file name, information in addition to the file contents can be written and read, information that is copied together with the file (unless copied to a non-NTFS location). These Alternate Data Streams are typically used for things like marking an executable as having just been downloaded from the internet (and hence not necessarily being trustworthy). In addition to a stream name, a stream type can be appended, like so: `::`. Unless specified, the default stream type is `$DATA` for files and `$INDEX_ALLOCATION` for directories. In other words, `.git::$INDEX_ALLOCATION` is a valid way to reference the `.git` directory! In our work in Git v2.2.1 to protect Git on NTFS drives under `core.protectNTFS`, we focused exclusively on NTFS short names, unaware of the fact that NTFS Alternate Data Streams offer a similar attack vector. Let's fix this. Seeing as it is better to be safe than sorry, we simply disallow paths referring to *any* NTFS Alternate Data Stream of `.git`, not just `::$INDEX_ALLOCATION`. This also simplifies the implementation. This closes CVE-2019-1352. Further reading about NTFS Alternate Data Streams: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c54dec26-1551-4d3a-a0ea-4fa40f848eb3 Reported-by: Nicolas Joly Signed-off-by: Johannes Schindelin --- path.c | 12 +++++++++++- t/t1014-read-tree-confusing.sh | 1 + 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/path.c b/path.c index f62a37d5f5..e39ecf4689 100644 --- a/path.c +++ b/path.c @@ -1321,10 +1321,19 @@ static int only_spaces_and_periods(const char *path, size_t len, size_t skip) * `.git` is the first item in a directory, therefore it will be associated * with the short name `git~1` (unless short names are disabled). * + * - For yet other historical reasons, NTFS supports so-called "Alternate Data + * Streams", i.e. metadata associated with a given file, referred to via + * `::`. There exists a default stream + * type for directories, allowing `.git/` to be accessed via + * `.git::$INDEX_ALLOCATION/`. + * * When this function returns 1, it indicates that the specified file/directory * name refers to a `.git` file or directory, or to any of these synonyms, and * Git should therefore not track it. * + * For performance reasons, _all_ Alternate Data Streams of `.git/` are + * forbidden, not just `::$INDEX_ALLOCATION`. + * * This function is intended to be used by `git fsck` even on platforms where * the backslash is a regular filename character, therefore it needs to handle * backlash characters in the provided `name` specially: they are interpreted @@ -1335,7 +1344,8 @@ int is_ntfs_dotgit(const char *name) size_t len; for (len = 0; ; len++) - if (!name[len] || name[len] == '\\' || is_dir_sep(name[len])) { + if (!name[len] || name[len] == '\\' || is_dir_sep(name[len]) || + name[len] == ':') { if (only_spaces_and_periods(name, len, 4) && !strncasecmp(name, ".git", 4)) return 1; diff --git a/t/t1014-read-tree-confusing.sh b/t/t1014-read-tree-confusing.sh index 2f5a25d503..da3376b3bb 100755 --- a/t/t1014-read-tree-confusing.sh +++ b/t/t1014-read-tree-confusing.sh @@ -49,6 +49,7 @@ git~1 .git.SPACE .git.{space} .\\\\.GIT\\\\foobar backslashes .git\\\\foobar backslashes2 +.git...:alternate-stream EOF test_expect_success 'utf-8 paths allowed with core.protectHFS off' ' From 3a85dc7d534fc2d410ddc0c771c963b20d1b4857 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 6 Sep 2019 21:09:35 +0200 Subject: [PATCH 14/34] is_ntfs_dotgit(): speed it up Previously, this function was written without focusing on speed, intending to make reviewing the code as easy as possible, to avoid any bugs in this critical code. Turns out: we can do much better on both accounts. With this patch, we make it as fast as this developer can make it go: - We avoid the call to `is_dir_sep()` and make all the character comparisons explicit. - We avoid the cost of calling `strncasecmp()` and unroll the test for `.git` and `git~1`, not even using `tolower()` because it is faster to compare against two constant values. - We look for `.git` and `.git~1` first thing, and return early if not found. - We also avoid calling a separate function for detecting chains of spaces and periods. Each of these improvements has a noticeable impact on the speed of `is_ntfs_dotgit()`. Signed-off-by: Johannes Schindelin --- path.c | 55 ++++++++++++++++++++++++++++++------------------------- 1 file changed, 30 insertions(+), 25 deletions(-) diff --git a/path.c b/path.c index 2037e2d8c1..43b16aabd4 100644 --- a/path.c +++ b/path.c @@ -1288,20 +1288,6 @@ int daemon_avoid_alias(const char *p) } } -static int only_spaces_and_periods(const char *path, size_t len, size_t skip) -{ - if (len < skip) - return 0; - len -= skip; - path += skip; - while (len-- > 0) { - char c = *(path++); - if (c != ' ' && c != '.') - return 0; - } - return 1; -} - /* * On NTFS, we need to be careful to disallow certain synonyms of the `.git/` * directory: @@ -1341,19 +1327,38 @@ static int only_spaces_and_periods(const char *path, size_t len, size_t skip) */ int is_ntfs_dotgit(const char *name) { - size_t len; + char c; - for (len = 0; ; len++) - if (!name[len] || name[len] == '\\' || is_dir_sep(name[len]) || - name[len] == ':') { - if (only_spaces_and_periods(name, len, 4) && - !strncasecmp(name, ".git", 4)) - return 1; - if (only_spaces_and_periods(name, len, 5) && - !strncasecmp(name, "git~1", 5)) - return 1; + /* + * Note that when we don't find `.git` or `git~1` we end up with `name` + * advanced partway through the string. That's okay, though, as we + * return immediately in those cases, without looking at `name` any + * further. + */ + c = *(name++); + if (c == '.') { + /* .git */ + if (((c = *(name++)) != 'g' && c != 'G') || + ((c = *(name++)) != 'i' && c != 'I') || + ((c = *(name++)) != 't' && c != 'T')) return 0; - } + } else if (c == 'g' || c == 'G') { + /* git ~1 */ + if (((c = *(name++)) != 'i' && c != 'I') || + ((c = *(name++)) != 't' && c != 'T') || + *(name++) != '~' || + *(name++) != '1') + return 0; + } else + return 0; + + for (;;) { + c = *(name++); + if (!c || c == '\\' || c == '/' || c == ':') + return 1; + if (c != '.' && c != ' ') + return 0; + } } static int is_ntfs_dot_generic(const char *name, From 6d8684161ee9c03bed5cb69ae76dfdddb85a0003 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 13 Sep 2019 16:32:43 +0200 Subject: [PATCH 15/34] mingw: fix quoting of arguments We need to be careful to follow proper quoting rules. For example, if an argument contains spaces, we have to quote them. Double-quotes need to be escaped. Backslashes need to be escaped, but only if they are followed by a double-quote character. We need to be _extra_ careful to consider the case where an argument ends in a backslash _and_ needs to be quoted: in this case, we append a double-quote character, i.e. the backslash now has to be escaped! The current code, however, fails to recognize that, and therefore can turn an argument that ends in a single backslash into a quoted argument that now ends in an escaped double-quote character. This allows subsequent command-line parameters to be split and part of them being mistaken for command-line options, e.g. through a maliciously-crafted submodule URL during a recursive clone. Technically, we would not need to quote _all_ arguments which end in a backslash _unless_ the argument needs to be quoted anyway. For example, `test\` would not need to be quoted, while `test \` would need to be. To keep the code simple, however, and therefore easier to reason about and ensure its correctness, we now _always_ quote an argument that ends in a backslash. This addresses CVE-2019-1350. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 9 ++++++--- t/t7416-submodule-dash-url.sh | 14 ++++++++++++++ 2 files changed, 20 insertions(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 8b6fa0db44..459ee20df6 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -872,7 +872,7 @@ static const char *quote_arg(const char *arg) p++; len++; } - if (*p == '"') + if (*p == '"' || !*p) n += count*2 + 1; continue; } @@ -894,16 +894,19 @@ static const char *quote_arg(const char *arg) count++; *d++ = *arg++; } - if (*arg == '"') { + if (*arg == '"' || !*arg) { while (count-- > 0) *d++ = '\\'; + /* don't escape the surrounding end quote */ + if (!*arg) + break; *d++ = '\\'; } } *d++ = *arg++; } *d++ = '"'; - *d++ = 0; + *d++ = '\0'; return q; } diff --git a/t/t7416-submodule-dash-url.sh b/t/t7416-submodule-dash-url.sh index 459193c976..2966e93071 100755 --- a/t/t7416-submodule-dash-url.sh +++ b/t/t7416-submodule-dash-url.sh @@ -31,4 +31,18 @@ test_expect_success 'clone rejects unprotected dash' ' test_i18ngrep ignoring err ' +test_expect_success 'trailing backslash is handled correctly' ' + git init testmodule && + test_commit -C testmodule c && + git submodule add ./testmodule && + : ensure that the name ends in a double backslash && + sed -e "s|\\(submodule \"testmodule\\)\"|\\1\\\\\\\\\"|" \ + -e "s|url = .*|url = \" --should-not-be-an-option\"|" \ + <.gitmodules >.new && + mv .new .gitmodules && + git commit -am "Add testmodule" && + test_must_fail git clone --verbose --recurse-submodules . dolly 2>err && + test_i18ngrep ! "unknown option" err +' + test_done From 91bd46588e6959e6903e275f78b10bd07830d547 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 28 Aug 2019 12:22:17 +0200 Subject: [PATCH 16/34] path: also guard `.gitmodules` against NTFS Alternate Data Streams We just safe-guarded `.git` against NTFS Alternate Data Stream-related attack vectors, and now it is time to do the same for `.gitmodules`. Note: In the added regression test, we refrain from verifying all kinds of variations between short names and NTFS Alternate Data Streams: as the new code disallows _all_ Alternate Data Streams of `.gitmodules`, it is enough to test one in order to know that all of them are guarded against. Signed-off-by: Johannes Schindelin --- path.c | 2 +- t/t0060-path-utils.sh | 7 ++++++- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/path.c b/path.c index e39ecf4689..2037e2d8c1 100644 --- a/path.c +++ b/path.c @@ -1369,7 +1369,7 @@ static int is_ntfs_dot_generic(const char *name, only_spaces_and_periods: for (;;) { char c = name[i++]; - if (!c) + if (!c || c == ':') return 1; if (c != ' ' && c != '.') return 0; diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 3f3357ed9f..2b8589e921 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -408,6 +408,9 @@ test_expect_success 'match .gitmodules' ' ~1000000 \ ~9999999 \ \ + .gitmodules:\$DATA \ + "gitmod~4 . :\$DATA" \ + \ --not \ ".gitmodules x" \ ".gitmodules .x" \ @@ -432,7 +435,9 @@ test_expect_success 'match .gitmodules' ' \ GI7EB~1 \ GI7EB~01 \ - GI7EB~1X + GI7EB~1X \ + \ + .gitmodules,:\$DATA ' test_done From 9102f958ee5254b10c0be72672aa3305bf4f4704 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 9 Sep 2019 21:04:41 +0200 Subject: [PATCH 17/34] protect_ntfs: turn on NTFS protection by default Back in the DOS days, in the FAT file system, file names always consisted of a base name of length 8 plus a file extension of length 3. Shorter file names were simply padded with spaces to the full 8.3 format. Later, the FAT file system was taught to support _also_ longer names, with an 8.3 "short name" as primary file name. While at it, the same facility allowed formerly illegal file names, such as `.git` (empty base names were not allowed), which would have the "short name" `git~1` associated with it. For backwards-compatibility, NTFS supports alternative 8.3 short filenames, too, even if starting with Windows Vista, they are only generated on the system drive by default. We addressed the problem that the `.git/` directory can _also_ be accessed via `git~1/` (when short names are enabled) in 2b4c6efc821 (read-cache: optionally disallow NTFS .git variants, 2014-12-16), i.e. since Git v1.9.5, by introducing the config setting `core.protectNTFS` and enabling it by default on Windows. In the meantime, Windows 10 introduced the "Windows Subsystem for Linux" (short: WSL), i.e. a way to run Linux applications/distributions in a thinly-isolated subsystem on Windows (giving rise to many a "2016 is the Year of Linux on the Desktop" jokes). WSL is getting increasingly popular, also due to the painless way Linux application can operate directly ("natively") on files on Windows' file system: the Windows drives are mounted automatically (e.g. `C:` as `/mnt/c/`). Taken together, this means that we now have to enable the safe-guards of Git v1.9.5 also in WSL: it is possible to access a `.git` directory inside `/mnt/c/` via the 8.3 name `git~1` (unless short name generation was disabled manually). Since regular Linux distributions run in WSL, this means we have to enable `core.protectNTFS` at least on Linux, too. To enable Services for Macintosh in Windows NT to store so-called resource forks, NTFS introduced "Alternate Data Streams". Essentially, these constitute additional metadata that are connected to (and copied with) their associated files, and they are accessed via pseudo file names of the form `filename::`. In a recent patch, we extended `core.protectNTFS` to also protect against accesses via NTFS Alternate Data Streams, e.g. to prevent contents of the `.git/` directory to be "tracked" via yet another alternative file name. While it is not possible (at least by default) to access files via NTFS Alternate Data Streams from within WSL, the defaults on macOS when mounting network shares via SMB _do_ allow accessing files and directories in that way. Therefore, we need to enable `core.protectNTFS` on macOS by default, too, and really, on any Operating System that can mount network shares via SMB/CIFS. A couple of approaches were considered for fixing this: 1. We could perform a dynamic NTFS check similar to the `core.symlinks` check in `init`/`clone`: instead of trying to create a symbolic link in the `.git/` directory, we could create a test file and try to access `.git/config` via 8.3 name and/or Alternate Data Stream. 2. We could simply "flip the switch" on `core.protectNTFS`, to make it "on by default". The obvious downside of 1. is that it won't protect worktrees that were clone with a vulnerable Git version already. We considered patching code paths that check out files to check whether we're running on an NTFS system dynamically and persist the result in the repository-local config setting `core.protectNTFS`, but in the end decided that this solution would be too fragile, and too involved. The obvious downside of 2. is that everybody will have to "suffer" the performance penalty incurred from calling `is_ntfs_dotgit()` on every path, even in setups where. After the recent work to accelerate `is_ntfs_dotgit()` in most cases, it looks as if the time spent on validating ten million random file names increases only negligibly (less than 20ms, well within the standard deviation of ~50ms). Therefore the benefits outweigh the cost. Another downside of this is that paths that might have been acceptable previously now will be forbidden. Realistically, though, this is an improvement because public Git hosters already would reject any `git push` that contains such file names. Note: There might be a similar problem mounting HFS+ on Linux. However, this scenario has been considered unlikely and in light of the cost (in the aforementioned benchmark, `core.protectHFS = true` increased the time from ~440ms to ~610ms), it was decided _not_ to touch the default of `core.protectHFS`. This change addresses CVE-2019-1353. Reported-by: Nicolas Joly Helped-by: Garima Singh Signed-off-by: Johannes Schindelin --- config.mak.uname | 2 -- environment.c | 2 +- 2 files changed, 1 insertion(+), 3 deletions(-) diff --git a/config.mak.uname b/config.mak.uname index 6604b130f8..333bd399d0 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -379,7 +379,6 @@ ifeq ($(uname_S),Windows) EXTLIBS = user32.lib advapi32.lib shell32.lib wininet.lib ws2_32.lib invalidcontinue.obj PTHREAD_LIBS = lib = - BASIC_CFLAGS += -DPROTECT_NTFS_DEFAULT=1 ifndef DEBUG BASIC_CFLAGS += -GL -Os -MD BASIC_LDFLAGS += -LTCG @@ -516,7 +515,6 @@ ifneq (,$(findstring MINGW,$(uname_S))) COMPAT_OBJS += compat/mingw.o compat/winansi.o \ compat/win32/pthread.o compat/win32/syslog.o \ compat/win32/dirent.o - BASIC_CFLAGS += -DPROTECT_NTFS_DEFAULT=1 EXTLIBS += -lws2_32 GITLIBS += git.res PTHREAD_LIBS = diff --git a/environment.c b/environment.c index 3fd4b10845..ab38deefa5 100644 --- a/environment.c +++ b/environment.c @@ -73,7 +73,7 @@ enum log_refs_config log_all_ref_updates = LOG_REFS_UNSET; int protect_hfs = PROTECT_HFS_DEFAULT; #ifndef PROTECT_NTFS_DEFAULT -#define PROTECT_NTFS_DEFAULT 0 +#define PROTECT_NTFS_DEFAULT 1 #endif int protect_ntfs = PROTECT_NTFS_DEFAULT; From a8dee3ca610f5a1d403634492136c887f83b59d2 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Tue, 1 Oct 2019 23:27:18 +0200 Subject: [PATCH 18/34] Disallow dubiously-nested submodule git directories Currently it is technically possible to let a submodule's git directory point right into the git dir of a sibling submodule. Example: the git directories of two submodules with the names `hippo` and `hippo/hooks` would be `.git/modules/hippo/` and `.git/modules/hippo/hooks/`, respectively, but the latter is already intended to house the former's hooks. In most cases, this is just confusing, but there is also a (quite contrived) attack vector where Git can be fooled into mistaking remote content for file contents it wrote itself during a recursive clone. Let's plug this bug. To do so, we introduce the new function `validate_submodule_git_dir()` which simply verifies that no git dir exists for any leading directories of the submodule name (if there are any). Note: this patch specifically continues to allow sibling modules names of the form `core/lib`, `core/doc`, etc, as long as `core` is not a submodule name. This fixes CVE-2019-1387. Reported-by: Nicolas Joly Signed-off-by: Johannes Schindelin --- builtin/submodule--helper.c | 4 +++ submodule.c | 49 +++++++++++++++++++++++++++++++++++-- submodule.h | 5 ++++ t/t7415-submodule-names.sh | 23 +++++++++++++++++ 4 files changed, 79 insertions(+), 2 deletions(-) diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 79156fac45..3376b1bb29 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -678,6 +678,10 @@ static int module_clone(int argc, const char **argv, const char *prefix) } else path = xstrdup(path); + if (validate_submodule_git_dir(sm_gitdir, name) < 0) + die(_("refusing to create/use '%s' in another submodule's " + "git dir"), sm_gitdir); + if (!file_exists(sm_gitdir)) { if (safe_create_leading_directories_const(sm_gitdir) < 0) die(_("could not create directory '%s'"), sm_gitdir); diff --git a/submodule.c b/submodule.c index 36f45f5a5a..9abc90d9cd 100644 --- a/submodule.c +++ b/submodule.c @@ -1842,6 +1842,47 @@ int parallel_submodules(void) return parallel_jobs; } +int validate_submodule_git_dir(char *git_dir, const char *submodule_name) +{ + size_t len = strlen(git_dir), suffix_len = strlen(submodule_name); + char *p; + int ret = 0; + + if (len <= suffix_len || (p = git_dir + len - suffix_len)[-1] != '/' || + strcmp(p, submodule_name)) + BUG("submodule name '%s' not a suffix of git dir '%s'", + submodule_name, git_dir); + + /* + * We prevent the contents of sibling submodules' git directories to + * clash. + * + * Example: having a submodule named `hippo` and another one named + * `hippo/hooks` would result in the git directories + * `.git/modules/hippo/` and `.git/modules/hippo/hooks/`, respectively, + * but the latter directory is already designated to contain the hooks + * of the former. + */ + for (; *p; p++) { + if (is_dir_sep(*p)) { + char c = *p; + + *p = '\0'; + if (is_git_directory(git_dir)) + ret = -1; + *p = c; + + if (ret < 0) + return error(_("submodule git dir '%s' is " + "inside git dir '%.*s'"), + git_dir, + (int)(p - git_dir), git_dir); + } + } + + return 0; +} + /* * Embeds a single submodules git directory into the superprojects git dir, * non recursively. @@ -1850,7 +1891,7 @@ static void relocate_single_git_dir_into_superproject(const char *prefix, const char *path) { char *old_git_dir = NULL, *real_old_git_dir = NULL, *real_new_git_dir = NULL; - const char *new_git_dir; + char *new_git_dir; const struct submodule *sub; if (submodule_uses_worktrees(path)) @@ -1868,10 +1909,14 @@ static void relocate_single_git_dir_into_superproject(const char *prefix, if (!sub) die(_("could not lookup name for submodule '%s'"), path); - new_git_dir = git_path("modules/%s", sub->name); + new_git_dir = git_pathdup("modules/%s", sub->name); + if (validate_submodule_git_dir(new_git_dir, sub->name) < 0) + die(_("refusing to move '%s' into an existing git dir"), + real_old_git_dir); if (safe_create_leading_directories_const(new_git_dir) < 0) die(_("could not create directory '%s'"), new_git_dir); real_new_git_dir = real_pathdup(new_git_dir, 1); + free(new_git_dir); fprintf(stderr, _("Migrating git directory of '%s%s' from\n'%s' to\n'%s'\n"), get_super_prefix_or_empty(), path, diff --git a/submodule.h b/submodule.h index 3c239d1ecf..cb1ab07b9a 100644 --- a/submodule.h +++ b/submodule.h @@ -120,6 +120,11 @@ extern int parallel_submodules(void); */ int submodule_to_gitdir(struct strbuf *buf, const char *submodule); +/* + * Make sure that no submodule's git dir is nested in a sibling submodule's. + */ +int validate_submodule_git_dir(char *git_dir, const char *submodule_name); + #define SUBMODULE_MOVE_HEAD_DRY_RUN (1<<0) #define SUBMODULE_MOVE_HEAD_FORCE (1<<1) extern int submodule_move_head(const char *path, diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh index 7c65e7a35c..8bd3d0937d 100755 --- a/t/t7415-submodule-names.sh +++ b/t/t7415-submodule-names.sh @@ -106,4 +106,27 @@ test_expect_success MINGW 'prevent git~1 squatting on Windows' ' ! grep gitdir squatting-clone/d/a/git~2 ' +test_expect_success 'git dirs of sibling submodules must not be nested' ' + git init nested && + test_commit -C nested nested && + ( + cd nested && + cat >.gitmodules <<-EOF && + [submodule "hippo"] + url = . + path = thing1 + [submodule "hippo/hooks"] + url = . + path = thing2 + EOF + git clone . thing1 && + git clone . thing2 && + git add .gitmodules thing1 thing2 && + test_tick && + git commit -m nested + ) && + test_must_fail git clone --recurse-submodules nested clone 2>err && + test_i18ngrep "is inside git dir" err +' + test_done From ad1559252945179e28fba7d693494051352810c5 Mon Sep 17 00:00:00 2001 From: Garima Singh Date: Wed, 18 Sep 2019 16:03:59 -0400 Subject: [PATCH 19/34] tests: add a helper to stress test argument quoting On Windows, we have to do all the command-line argument quoting ourselves. Worse: we have to have two versions of said quoting, one for MSYS2 programs (which have their own dequoting rules) and the rest. We care mostly about the rest, and to make sure that that works, let's have a stress test that comes up with all kinds of awkward arguments, verifying that a spawned sub-process receives those unharmed. Signed-off-by: Garima Singh Signed-off-by: Johannes Schindelin --- t/helper/test-run-command.c | 118 +++++++++++++++++++++++++++++++++++- 1 file changed, 116 insertions(+), 2 deletions(-) diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c index d24d157379..6c801cb529 100644 --- a/t/helper/test-run-command.c +++ b/t/helper/test-run-command.c @@ -12,8 +12,8 @@ #include "run-command.h" #include "argv-array.h" #include "strbuf.h" -#include -#include +#include "gettext.h" +#include "parse-options.h" static int number_callbacks; static int parallel_next(struct child_process *cp, @@ -49,11 +49,125 @@ static int task_finished(int result, return 1; } +static uint64_t my_random_next = 1234; + +static uint64_t my_random(void) +{ + uint64_t res = my_random_next; + my_random_next = my_random_next * 1103515245 + 12345; + return res; +} + +static int quote_stress_test(int argc, const char **argv) +{ + /* + * We are running a quote-stress test. + * spawn a subprocess that runs quote-stress with a + * special option that echoes back the arguments that + * were passed in. + */ + char special[] = ".?*\\^_\"'`{}()[]<>@~&+:;$%"; // \t\r\n\a"; + int i, j, k, trials = 100; + struct strbuf out = STRBUF_INIT; + struct argv_array args = ARGV_ARRAY_INIT; + struct option options[] = { + OPT_INTEGER('n', "trials", &trials, "Number of trials"), + OPT_END() + }; + const char * const usage[] = { + "test-run-command quote-stress-test ", + NULL + }; + + argc = parse_options(argc, argv, NULL, options, usage, 0); + + for (i = 0; i < trials; i++) { + struct child_process cp = CHILD_PROCESS_INIT; + size_t arg_count = 1 + (my_random() % 5), arg_offset; + int ret = 0; + + argv_array_clear(&args); + argv_array_pushl(&args, "test-run-command", + "quote-echo", NULL); + arg_offset = args.argc; + for (j = 0; j < arg_count; j++) { + char buf[20]; + size_t min_len = 1; + size_t arg_len = min_len + + (my_random() % (ARRAY_SIZE(buf) - min_len)); + + for (k = 0; k < arg_len; k++) + buf[k] = special[my_random() % + ARRAY_SIZE(special)]; + buf[arg_len] = '\0'; + + argv_array_push(&args, buf); + } + + cp.argv = args.argv; + strbuf_reset(&out); + if (pipe_command(&cp, NULL, 0, &out, 0, NULL, 0) < 0) + return error("Failed to spawn child process"); + + for (j = 0, k = 0; j < arg_count; j++) { + const char *arg = args.argv[j + arg_offset]; + + if (strcmp(arg, out.buf + k)) + ret = error("incorrectly quoted arg: '%s', " + "echoed back as '%s'", + arg, out.buf + k); + k += strlen(out.buf + k) + 1; + } + + if (k != out.len) + ret = error("got %d bytes, but consumed only %d", + (int)out.len, (int)k); + + if (ret) { + fprintf(stderr, "Trial #%d failed. Arguments:\n", i); + for (j = 0; j < arg_count; j++) + fprintf(stderr, "arg #%d: '%s'\n", + (int)j, args.argv[j + arg_offset]); + + strbuf_release(&out); + argv_array_clear(&args); + + return ret; + } + + if (i && (i % 100) == 0) + fprintf(stderr, "Trials completed: %d\n", (int)i); + } + + strbuf_release(&out); + argv_array_clear(&args); + + return 0; +} + +static int quote_echo(int argc, const char **argv) +{ + while (argc > 1) { + fwrite(argv[1], strlen(argv[1]), 1, stdout); + fputc('\0', stdout); + argv++; + argc--; + } + + return 0; +} + int cmd_main(int argc, const char **argv) { struct child_process proc = CHILD_PROCESS_INIT; int jobs; + if (argc >= 2 && !strcmp(argv[1], "quote-stress-test")) + return !!quote_stress_test(argc - 1, argv + 1); + + if (argc >= 2 && !strcmp(argv[1], "quote-echo")) + return !!quote_echo(argc - 1, argv + 1); + if (argc < 3) return 1; proc.argv = (const char **)argv + 2; From 55953c77c0bfcb727ffd7e293e4661b7a24b791b Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 20 Sep 2019 19:09:39 +0200 Subject: [PATCH 20/34] quote-stress-test: accept arguments to test via the command-line When the stress test reported a problem with quoting certain arguments, it is helpful to have a facility to play with those arguments in order to find out whether variations of those arguments are affected, too. Let's allow `test-run-command quote-stress-test -- ` to be used for that purpose. Signed-off-by: Johannes Schindelin --- t/helper/test-run-command.c | 31 ++++++++++++++++++++----------- 1 file changed, 20 insertions(+), 11 deletions(-) diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c index 6c801cb529..bdbc5ec56a 100644 --- a/t/helper/test-run-command.c +++ b/t/helper/test-run-command.c @@ -83,25 +83,34 @@ static int quote_stress_test(int argc, const char **argv) for (i = 0; i < trials; i++) { struct child_process cp = CHILD_PROCESS_INIT; - size_t arg_count = 1 + (my_random() % 5), arg_offset; + size_t arg_count, arg_offset; int ret = 0; argv_array_clear(&args); argv_array_pushl(&args, "test-run-command", "quote-echo", NULL); arg_offset = args.argc; - for (j = 0; j < arg_count; j++) { - char buf[20]; - size_t min_len = 1; - size_t arg_len = min_len + - (my_random() % (ARRAY_SIZE(buf) - min_len)); - for (k = 0; k < arg_len; k++) - buf[k] = special[my_random() % - ARRAY_SIZE(special)]; - buf[arg_len] = '\0'; + if (argc > 0) { + trials = 1; + arg_count = argc; + for (j = 0; j < arg_count; j++) + argv_array_push(&args, argv[j]); + } else { + arg_count = 1 + (my_random() % 5); + for (j = 0; j < arg_count; j++) { + char buf[20]; + size_t min_len = 1; + size_t arg_len = min_len + + (my_random() % (ARRAY_SIZE(buf) - min_len)); - argv_array_push(&args, buf); + for (k = 0; k < arg_len; k++) + buf[k] = special[my_random() % + ARRAY_SIZE(special)]; + buf[arg_len] = '\0'; + + argv_array_push(&args, buf); + } } cp.argv = args.argv; From 35edce205615c553fdc49bcf10b0c91f061c56c9 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 9 Sep 2019 15:43:35 +0200 Subject: [PATCH 21/34] t6130/t9350: prepare for stringent Win32 path validation On Windows, file names cannot contain asterisks nor newline characters. In an upcoming commit, we will make this limitation explicit, disallowing even the creation of commits that introduce such file names. However, in the test scripts touched by this patch, we _know_ that those paths won't be checked out, so we _want_ to allow such file names. Happily, the stringent path validation will be guarded via the `core.protectNTFS` flag, so all we need to do is to force that flag off temporarily. Signed-off-by: Johannes Schindelin --- t/t6130-pathspec-noglob.sh | 1 + t/t9350-fast-export.sh | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/t/t6130-pathspec-noglob.sh b/t/t6130-pathspec-noglob.sh index 658353277e..4129d9fd9a 100755 --- a/t/t6130-pathspec-noglob.sh +++ b/t/t6130-pathspec-noglob.sh @@ -10,6 +10,7 @@ test_expect_success 'create commits with glob characters' ' # the name "f*" in the worktree, because it is not allowed # on Windows (the tests below do not depend on the presence # of the file in the worktree) + git config core.protectNTFS false && git update-index --add --cacheinfo 100644 "$(git rev-parse HEAD:foo)" "f*" && test_tick && git commit -m star && diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index e6062071e6..15b167d29d 100755 --- a/t/t9350-fast-export.sh +++ b/t/t9350-fast-export.sh @@ -424,7 +424,7 @@ test_expect_success 'fast-export quotes pathnames' ' test_config -C crazy-paths core.protectNTFS false && (cd crazy-paths && blob=$(echo foo | git hash-object -w --stdin) && - git update-index --add \ + git -c core.protectNTFS=false update-index --add \ --cacheinfo 100644 $blob "$(printf "path with\\nnewline")" \ --cacheinfo 100644 $blob "path with \"quote\"" \ --cacheinfo 100644 $blob "path with \\backslash" \ From 7530a6287e20a74b9fe6d4ca3a66df0f0f5cc52c Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 19 Sep 2019 23:46:31 +0200 Subject: [PATCH 22/34] quote-stress-test: allow skipping some trials When the, say, 93rd trial run fails, it is a good idea to have a way to skip the first 92 trials and dig directly into the 93rd in a debugger. Signed-off-by: Johannes Schindelin --- t/helper/test-run-command.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c index bdbc5ec56a..07989f78ec 100644 --- a/t/helper/test-run-command.c +++ b/t/helper/test-run-command.c @@ -67,11 +67,12 @@ static int quote_stress_test(int argc, const char **argv) * were passed in. */ char special[] = ".?*\\^_\"'`{}()[]<>@~&+:;$%"; // \t\r\n\a"; - int i, j, k, trials = 100; + int i, j, k, trials = 100, skip = 0; struct strbuf out = STRBUF_INIT; struct argv_array args = ARGV_ARRAY_INIT; struct option options[] = { OPT_INTEGER('n', "trials", &trials, "Number of trials"), + OPT_INTEGER('s', "skip", &skip, "Skip trials"), OPT_END() }; const char * const usage[] = { @@ -113,6 +114,9 @@ static int quote_stress_test(int argc, const char **argv) } } + if (i < skip) + continue; + cp.argv = args.argv; strbuf_reset(&out); if (pipe_command(&cp, NULL, 0, &out, 0, NULL, 0) < 0) From cc756edda63769cf6d7acc99e6ad3a9cbb5dc3ec Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 9 Sep 2019 13:56:15 +0200 Subject: [PATCH 23/34] unpack-trees: let merged_entry() pass through do_add_entry()'s errors A `git clone` will end with exit code 0 when `merged_entry()` returns a positive value during a call of `unpack_trees()` to `traverse_trees()`. The reason is that `unpack_trees()` will interpret a positive value not to be an error. The problem is, however, that `add_index_entry()` (which is called by `merged_entry()` can report an error, and we really should fail the entire clone in such a case. Let's fix this problem, in preparation for a Windows-specific patch disallowing `mkdir()` with directory names that contain a trailing space (which is illegal on NTFS): we want `git clone` to abort when a path cannot be checked out due to that condition. Signed-off-by: Johannes Schindelin --- unpack-trees.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/unpack-trees.c b/unpack-trees.c index 862cfce661..649d11855e 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1821,7 +1821,8 @@ static int merged_entry(const struct cache_entry *ce, invalidate_ce_path(old, o); } - do_add_entry(o, merge, update, CE_STAGEMASK); + if (do_add_entry(o, merge, update, CE_STAGEMASK) < 0) + return -1; return 1; } From 817ddd64c20b29b2d86b3a0589f7ff88d1279109 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 5 Sep 2019 13:44:21 +0200 Subject: [PATCH 24/34] mingw: refuse to access paths with illegal characters Certain characters are not admissible in file names on Windows, even if Cygwin/MSYS2 (and therefore, Git for Windows' Bash) pretend that they are, e.g. `:`, `<`, `>`, etc Let's disallow those characters explicitly in Windows builds of Git. Note: just like trailing spaces or periods, it _is_ possible on Windows to create commits adding files with such illegal characters, as long as the operation leaves the worktree untouched. To allow for that, we continue to guard `is_valid_win32_path()` behind the config setting `core.protectNTFS`, so that users _can_ continue to do that, as long as they turn the protections off via that config setting. Among other problems, this prevents Git from trying to write to an "NTFS Alternate Data Stream" (which refers to metadata stored alongside a file, under a special name: ":"). This fix therefore also prevents an attack vector that was exploited in demonstrations of a number of recently-fixed security bugs. Further reading on illegal characters in Win32 filenames: https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file Signed-off-by: Johannes Schindelin --- compat/mingw.c | 10 ++++++++++ compat/mingw.h | 7 +++++-- t/t0060-path-utils.sh | 4 +++- 3 files changed, 18 insertions(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 17b4da16e8..3aea26982d 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -2134,6 +2134,8 @@ int is_valid_win32_path(const char *path) if (!protect_ntfs) return 1; + skip_dos_drive_prefix((char **)&path); + for (;;) { char c = *(path++); switch (c) { @@ -2155,6 +2157,14 @@ int is_valid_win32_path(const char *path) preceding_space_or_period = 1; i++; continue; + case ':': /* DOS drive prefix was already skipped */ + case '<': case '>': case '"': case '|': case '?': case '*': + /* illegal character */ + return 0; + default: + if (c > '\0' && c < '\x20') + /* illegal character */ + return 0; } preceding_space_or_period = 0; i++; diff --git a/compat/mingw.h b/compat/mingw.h index 8c49c1d09b..7482f196af 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -431,8 +431,11 @@ int mingw_offset_1st_component(const char *path); /** * Verifies that the given path is a valid one on Windows. * - * In particular, path segments are disallowed which end in a period or a - * space (except the special directories `.` and `..`). + * In particular, path segments are disallowed which + * + * - end in a period or a space (except the special directories `.` and `..`). + * + * - contain any of the reserved characters, e.g. `:`, `;`, `*`, etc * * Returns 1 upon success, otherwise 0. */ diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 1171e0bb88..f7e2529bff 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -445,13 +445,15 @@ test_expect_success MINGW 'is_valid_path() on Windows' ' win32 \ "win32 x" \ ../hello.txt \ + C:\\git \ \ --not \ "win32 " \ "win32 /x " \ "win32." \ "win32 . ." \ - .../hello.txt + .../hello.txt \ + colon:test ' test_done From 379e51d1ae668a1f26d50eb59b3f8befc1eb8883 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 20 Sep 2019 00:12:37 +0200 Subject: [PATCH 25/34] quote-stress-test: offer to test quoting arguments for MSYS2 sh It is unfortunate that we need to quote arguments differently on Windows, depending whether we build a command-line for MSYS2's `sh` or for other Windows executables. We already have a test helper to verify the latter, with this patch we can also verify the former. Signed-off-by: Johannes Schindelin --- t/helper/test-run-command.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c index 07989f78ec..b622334407 100644 --- a/t/helper/test-run-command.c +++ b/t/helper/test-run-command.c @@ -67,12 +67,13 @@ static int quote_stress_test(int argc, const char **argv) * were passed in. */ char special[] = ".?*\\^_\"'`{}()[]<>@~&+:;$%"; // \t\r\n\a"; - int i, j, k, trials = 100, skip = 0; + int i, j, k, trials = 100, skip = 0, msys2 = 0; struct strbuf out = STRBUF_INIT; struct argv_array args = ARGV_ARRAY_INIT; struct option options[] = { OPT_INTEGER('n', "trials", &trials, "Number of trials"), OPT_INTEGER('s', "skip", &skip, "Skip trials"), + OPT_BOOL('m', "msys2", &msys2, "Test quoting for MSYS2's sh"), OPT_END() }; const char * const usage[] = { @@ -82,14 +83,20 @@ static int quote_stress_test(int argc, const char **argv) argc = parse_options(argc, argv, NULL, options, usage, 0); + setenv("MSYS_NO_PATHCONV", "1", 0); + for (i = 0; i < trials; i++) { struct child_process cp = CHILD_PROCESS_INIT; size_t arg_count, arg_offset; int ret = 0; argv_array_clear(&args); - argv_array_pushl(&args, "test-run-command", - "quote-echo", NULL); + if (msys2) + argv_array_pushl(&args, "sh", "-c", + "printf %s\\\\0 \"$@\"", "skip", NULL); + else + argv_array_pushl(&args, "test-run-command", + "quote-echo", NULL); arg_offset = args.argc; if (argc > 0) { From d2c84dad1c88f40906799bc879f70b965efd8ba6 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 5 Sep 2019 13:27:53 +0200 Subject: [PATCH 26/34] mingw: refuse to access paths with trailing spaces or periods When creating a directory on Windows whose path ends in a space or a period (or chains thereof), the Win32 API "helpfully" trims those. For example, `mkdir("abc ");` will return success, but actually create a directory called `abc` instead. This stems back to the DOS days, when all file names had exactly 8 characters plus exactly 3 characters for the file extension, and the only way to have shorter names was by padding with spaces. Sadly, this "helpful" behavior is a bit inconsistent: after a successful `mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because the directory `abc ` does not actually exist). Even if it would work, we now have a serious problem because a Git repository could contain directories `abc` and `abc `, and on Windows, they would be "merged" unintentionally. As these paths are illegal on Windows, anyway, let's disallow any accesses to such paths on that Operating System. For practical reasons, this behavior is still guarded by the config setting `core.protectNTFS`: it is possible (and at least two regression tests make use of it) to create commits without involving the worktree. In such a scenario, it is of course possible -- even on Windows -- to create such file names. Among other consequences, this patch disallows submodules' paths to end in spaces on Windows (which would formerly have confused Git enough to try to write into incorrect paths, anyway). While this patch does not fix a vulnerability on its own, it prevents an attack vector that was exploited in demonstrations of a number of recently-fixed security bugs. The regression test added to `t/t7417-submodule-path-url.sh` reflects that attack vector. Note that we have to adjust the test case "prevent git~1 squatting on Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue. It tries to clone two submodules whose names differ only in a trailing period character, and as a consequence their git directories differ in the same way. Previously, when Git tried to clone the second submodule, it thought that the git directory already existed (because on Windows, when you create a directory with the name `b.` it actually creates `b`), but with this patch, the first submodule's clone will fail because of the illegal name of the git directory. Therefore, when cloning the second submodule, Git will take a different code path: a fresh clone (without an existing git directory). Both code paths fail to clone the second submodule, both because the the corresponding worktree directory exists and is not empty, but the error messages are worded differently. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 57 ++++++++++++++++++++++++++++++++++- compat/mingw.h | 11 +++++++ git-compat-util.h | 4 +++ read-cache.c | 3 ++ t/helper/test-path-utils.c | 17 +++++++++++ t/t0060-path-utils.sh | 14 +++++++++ t/t7415-submodule-names.sh | 2 +- t/t7417-submodule-path-url.sh | 17 +++++++++++ 8 files changed, 123 insertions(+), 2 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 8b6fa0db44..17b4da16e8 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -333,6 +333,12 @@ int mingw_mkdir(const char *path, int mode) { int ret; wchar_t wpath[MAX_PATH]; + + if (!is_valid_win32_path(path)) { + errno = EINVAL; + return -1; + } + if (xutftowcs_path(wpath, path) < 0) return -1; ret = _wmkdir(wpath); @@ -345,13 +351,18 @@ int mingw_open (const char *filename, int oflags, ...) { va_list args; unsigned mode; - int fd; + int fd, create = (oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL); wchar_t wfilename[MAX_PATH]; va_start(args, oflags); mode = va_arg(args, int); va_end(args); + if (!is_valid_win32_path(filename)) { + errno = create ? EINVAL : ENOENT; + return -1; + } + if (filename && !strcmp(filename, "/dev/null")) filename = "nul"; @@ -413,6 +424,11 @@ FILE *mingw_fopen (const char *filename, const char *otype) int hide = needs_hiding(filename); FILE *file; wchar_t wfilename[MAX_PATH], wotype[4]; + if (!is_valid_win32_path(filename)) { + int create = otype && strchr(otype, 'w'); + errno = create ? EINVAL : ENOENT; + return NULL; + } if (filename && !strcmp(filename, "/dev/null")) filename = "nul"; if (xutftowcs_path(wfilename, filename) < 0 || @@ -435,6 +451,11 @@ FILE *mingw_freopen (const char *filename, const char *otype, FILE *stream) int hide = needs_hiding(filename); FILE *file; wchar_t wfilename[MAX_PATH], wotype[4]; + if (!is_valid_win32_path(filename)) { + int create = otype && strchr(otype, 'w'); + errno = create ? EINVAL : ENOENT; + return NULL; + } if (filename && !strcmp(filename, "/dev/null")) filename = "nul"; if (xutftowcs_path(wfilename, filename) < 0 || @@ -2106,6 +2127,40 @@ static void setup_windows_environment(void) setenv("TERM", "cygwin", 1); } +int is_valid_win32_path(const char *path) +{ + int preceding_space_or_period = 0, i = 0, periods = 0; + + if (!protect_ntfs) + return 1; + + for (;;) { + char c = *(path++); + switch (c) { + case '\0': + case '/': case '\\': + /* cannot end in ` ` or `.`, except for `.` and `..` */ + if (preceding_space_or_period && + (i != periods || periods > 2)) + return 0; + if (!c) + return 1; + + i = periods = preceding_space_or_period = 0; + continue; + case '.': + periods++; + /* fallthru */ + case ' ': + preceding_space_or_period = 1; + i++; + continue; + } + preceding_space_or_period = 0; + i++; + } +} + /* * Disable MSVCRT command line wildcard expansion (__getmainargs called from * mingw startup code, see init.c in mingw runtime). diff --git a/compat/mingw.h b/compat/mingw.h index e03aecfe2e..8c49c1d09b 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -428,6 +428,17 @@ int mingw_offset_1st_component(const char *path); #include #endif +/** + * Verifies that the given path is a valid one on Windows. + * + * In particular, path segments are disallowed which end in a period or a + * space (except the special directories `.` and `..`). + * + * Returns 1 upon success, otherwise 0. + */ +int is_valid_win32_path(const char *path); +#define is_valid_path(path) is_valid_win32_path(path) + /** * Converts UTF-8 encoded string to UTF-16LE. * diff --git a/git-compat-util.h b/git-compat-util.h index 6cb3c2f19e..e587ac4e23 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -370,6 +370,10 @@ static inline int git_offset_1st_component(const char *path) #define offset_1st_component git_offset_1st_component #endif +#ifndef is_valid_path +#define is_valid_path(path) 1 +#endif + #ifndef find_last_dir_sep static inline char *git_find_last_dir_sep(const char *path) { diff --git a/read-cache.c b/read-cache.c index bde1e70c51..771171c402 100644 --- a/read-cache.c +++ b/read-cache.c @@ -847,6 +847,9 @@ int verify_path(const char *path, unsigned mode) if (has_dos_drive_prefix(path)) return 0; + if (!is_valid_path(path)) + return 0; + goto inside; for (;;) { if (!c) diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c index 16d8e689c8..8b3ce07860 100644 --- a/t/helper/test-path-utils.c +++ b/t/helper/test-path-utils.c @@ -386,6 +386,23 @@ int cmd_main(int argc, const char **argv) if (argc > 1 && !strcmp(argv[1], "protect_ntfs_hfs")) return !!protect_ntfs_hfs_benchmark(argc - 1, argv + 1); + if (argc > 1 && !strcmp(argv[1], "is_valid_path")) { + int res = 0, expect = 1, i; + + for (i = 2; i < argc; i++) + if (!strcmp("--not", argv[i])) + expect = 0; + else if (expect != is_valid_path(argv[i])) + res = error("'%s' is%s a valid path", + argv[i], expect ? " not" : ""); + else + fprintf(stderr, + "'%s' is%s a valid path\n", + argv[i], expect ? "" : " not"); + + return !!res; + } + fprintf(stderr, "%s: unknown function name: %s\n", argv[0], argv[1] ? argv[1] : "(there was none)"); return 1; diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 2b8589e921..1171e0bb88 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -440,4 +440,18 @@ test_expect_success 'match .gitmodules' ' .gitmodules,:\$DATA ' +test_expect_success MINGW 'is_valid_path() on Windows' ' + test-path-utils is_valid_path \ + win32 \ + "win32 x" \ + ../hello.txt \ + \ + --not \ + "win32 " \ + "win32 /x " \ + "win32." \ + "win32 . ." \ + .../hello.txt +' + test_done diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh index 7c65e7a35c..5141ff45c3 100755 --- a/t/t7415-submodule-names.sh +++ b/t/t7415-submodule-names.sh @@ -102,7 +102,7 @@ test_expect_success MINGW 'prevent git~1 squatting on Windows' ' ) && test_must_fail git -c core.protectNTFS=false \ clone --recurse-submodules squatting squatting-clone 2>err && - test_i18ngrep "directory not empty" err && + test_i18ngrep -e "directory not empty" -e "not an empty directory" err && ! grep gitdir squatting-clone/d/a/git~2 ' diff --git a/t/t7417-submodule-path-url.sh b/t/t7417-submodule-path-url.sh index 638293f0da..fad9e20dc4 100755 --- a/t/t7417-submodule-path-url.sh +++ b/t/t7417-submodule-path-url.sh @@ -17,4 +17,21 @@ test_expect_success 'clone rejects unprotected dash' ' test_i18ngrep ignoring err ' +test_expect_success MINGW 'submodule paths disallows trailing spaces' ' + git init super && + test_must_fail git -C super submodule add ../upstream "sub " && + + : add "sub", then rename "sub" to "sub ", the hard way && + git -C super submodule add ../upstream sub && + tree=$(git -C super write-tree) && + git -C super ls-tree $tree >tree && + sed "s/sub/sub /" tree.new && + tree=$(git -C super mktree err && + test_i18ngrep "sub " err +' + test_done From f82a97eb9197c1e3768e72648f37ce0ca3233734 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 6 Sep 2019 00:09:10 +0200 Subject: [PATCH 27/34] mingw: handle `subst`-ed "DOS drives" MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Over a decade ago, in 25fe217b86c (Windows: Treat Windows style path names., 2008-03-05), Git was taught to handle absolute Windows paths, i.e. paths that start with a drive letter and a colon. Unbeknownst to us, while drive letters of physical drives are limited to letters of the English alphabet, there is a way to assign virtual drive letters to arbitrary directories, via the `subst` command, which is _not_ limited to English letters. It is therefore possible to have absolute Windows paths of the form `1:\what\the\hex.txt`. Even "better": pretty much arbitrary Unicode letters can also be used, e.g. `ä:\tschibät.sch`. While it can be sensibly argued that users who set up such funny drive letters really seek adverse consequences, the Windows Operating System is known to be a platform where many users are at the mercy of administrators who have their very own idea of what constitutes a reasonable setup. Therefore, let's just make sure that such funny paths are still considered absolute paths by Git, on Windows. In addition to Unicode characters, pretty much any character is a valid drive letter, as far as `subst` is concerned, even `:` and `"` or even a space character. While it is probably the opposite of smart to use them, let's safeguard `is_dos_drive_prefix()` against all of them. Note: `[::1]:repo` is a valid URL, but not a valid path on Windows. As `[` is now considered a valid drive letter, we need to be very careful to avoid misinterpreting such a string as valid local path in `url_is_local_not_ssh()`. To do that, we use the just-introduced function `is_valid_path()` (which will label the string as invalid file name because of the colon characters). This fixes CVE-2019-1351. Reported-by: Nicolas Joly Signed-off-by: Johannes Schindelin --- compat/mingw.c | 24 ++++++++++++++++++++++++ compat/mingw.h | 4 ++-- connect.c | 2 +- t/t0060-path-utils.sh | 9 +++++++++ 4 files changed, 36 insertions(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 3aea26982d..27d6f4ac30 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -1986,6 +1986,30 @@ pid_t waitpid(pid_t pid, int *status, int options) return -1; } +int mingw_has_dos_drive_prefix(const char *path) +{ + int i; + + /* + * Does it start with an ASCII letter (i.e. highest bit not set), + * followed by a colon? + */ + if (!(0x80 & (unsigned char)*path)) + return *path && path[1] == ':' ? 2 : 0; + + /* + * While drive letters must be letters of the English alphabet, it is + * possible to assign virtually _any_ Unicode character via `subst` as + * a drive letter to "virtual drives". Even `1`, or `ä`. Or fun stuff + * like this: + * + * subst ֍: %USERPROFILE%\Desktop + */ + for (i = 1; i < 4 && (0x80 & (unsigned char)path[i]); i++) + ; /* skip first UTF-8 character */ + return path[i] == ':' ? i + 1 : 0; +} + int mingw_skip_dos_drive_prefix(char **path) { int ret = has_dos_drive_prefix(*path); diff --git a/compat/mingw.h b/compat/mingw.h index 7482f196af..17064665d9 100644 --- a/compat/mingw.h +++ b/compat/mingw.h @@ -394,8 +394,8 @@ HANDLE winansi_get_osfhandle(int fd); * git specific compatibility */ -#define has_dos_drive_prefix(path) \ - (isalpha(*(path)) && (path)[1] == ':' ? 2 : 0) +int mingw_has_dos_drive_prefix(const char *path); +#define has_dos_drive_prefix mingw_has_dos_drive_prefix int mingw_skip_dos_drive_prefix(char **path); #define skip_dos_drive_prefix mingw_skip_dos_drive_prefix static inline int mingw_is_dir_sep(int c) diff --git a/connect.c b/connect.c index 49b28b83be..a053cc256d 100644 --- a/connect.c +++ b/connect.c @@ -264,7 +264,7 @@ int url_is_local_not_ssh(const char *url) const char *colon = strchr(url, ':'); const char *slash = strchr(url, '/'); return !colon || (slash && slash < colon) || - has_dos_drive_prefix(url); + (has_dos_drive_prefix(url) && is_valid_path(url)); } static const char *prot_name(enum protocol protocol) diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index f7e2529bff..40db3e1e1a 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -165,6 +165,15 @@ test_expect_success 'absolute path rejects the empty string' ' test_must_fail test-path-utils absolute_path "" ' +test_expect_success MINGW ':\\abc is an absolute path' ' + for letter in : \" C Z 1 ä + do + path=$letter:\\abc && + absolute="$(test-path-utils absolute_path "$path")" && + test "$path" = "$absolute" || return 1 + done +' + test_expect_success 'real path rejects the empty string' ' test_must_fail test-path-utils real_path "" ' From 66d2a6159f511924e7e0b8a21c93538879bfd622 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 4 Dec 2019 19:58:46 +0100 Subject: [PATCH 28/34] Git 2.14.6 Signed-off-by: Johannes Schindelin --- Documentation/RelNotes/2.14.6.txt | 54 +++++++++++++++++++++++++++++++ GIT-VERSION-GEN | 2 +- RelNotes | 2 +- 3 files changed, 56 insertions(+), 2 deletions(-) create mode 100644 Documentation/RelNotes/2.14.6.txt diff --git a/Documentation/RelNotes/2.14.6.txt b/Documentation/RelNotes/2.14.6.txt new file mode 100644 index 0000000000..72b7af6799 --- /dev/null +++ b/Documentation/RelNotes/2.14.6.txt @@ -0,0 +1,54 @@ +Git v2.14.6 Release Notes +========================= + +This release addresses the security issues CVE-2019-1348, +CVE-2019-1349, CVE-2019-1350, CVE-2019-1351, CVE-2019-1352, +CVE-2019-1353, CVE-2019-1354, and CVE-2019-1387. + +Fixes since v2.14.5 +------------------- + + * CVE-2019-1348: + The --export-marks option of git fast-import is exposed also via + the in-stream command feature export-marks=... and it allows + overwriting arbitrary paths. + + * CVE-2019-1349: + When submodules are cloned recursively, under certain circumstances + Git could be fooled into using the same Git directory twice. We now + require the directory to be empty. + + * CVE-2019-1350: + Incorrect quoting of command-line arguments allowed remote code + execution during a recursive clone in conjunction with SSH URLs. + + * CVE-2019-1351: + While the only permitted drive letters for physical drives on + Windows are letters of the US-English alphabet, this restriction + does not apply to virtual drives assigned via subst : + . Git mistook such paths for relative paths, allowing writing + outside of the worktree while cloning. + + * CVE-2019-1352: + Git was unaware of NTFS Alternate Data Streams, allowing files + inside the .git/ directory to be overwritten during a clone. + + * CVE-2019-1353: + When running Git in the Windows Subsystem for Linux (also known as + "WSL") while accessing a working directory on a regular Windows + drive, none of the NTFS protections were active. + + * CVE-2019-1354: + Filenames on Linux/Unix can contain backslashes. On Windows, + backslashes are directory separators. Git did not use to refuse to + write out tracked files with such filenames. + + * CVE-2019-1387: + Recursive clones are currently affected by a vulnerability that is + caused by too-lax validation of submodule names, allowing very + targeted attacks via remote code execution in recursive clones. + +Credit for finding these vulnerabilities goes to Microsoft Security +Response Center, in particular to Nicolas Joly. The `fast-import` +fixes were provided by Jeff King, the other fixes by Johannes +Schindelin with help from Garima Singh. diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN index 40680482ce..46557afb1e 100755 --- a/GIT-VERSION-GEN +++ b/GIT-VERSION-GEN @@ -1,7 +1,7 @@ #!/bin/sh GVF=GIT-VERSION-FILE -DEF_VER=v2.14.5 +DEF_VER=v2.14.6 LF=' ' diff --git a/RelNotes b/RelNotes index a127ce63f2..229381cd97 120000 --- a/RelNotes +++ b/RelNotes @@ -1 +1 @@ -Documentation/RelNotes/2.14.5.txt \ No newline at end of file +Documentation/RelNotes/2.14.6.txt \ No newline at end of file From e904deb89d9a9669a76a426182506a084d3f6308 Mon Sep 17 00:00:00 2001 From: Jonathan Nieder Date: Thu, 5 Dec 2019 01:28:28 -0800 Subject: [PATCH 29/34] submodule: reject submodule.update = !command in .gitmodules Since ac1fbbda2013 (submodule: do not copy unknown update mode from .gitmodules, 2013-12-02), Git has been careful to avoid copying [submodule "foo"] update = !run an arbitrary scary command from .gitmodules to a repository's local config, copying in the setting 'update = none' instead. The gitmodules(5) manpage documents the intention: The !command form is intentionally ignored here for security reasons Unfortunately, starting with v2.20.0-rc0 (which integrated ee69b2a9 (submodule--helper: introduce new update-module-mode helper, 2018-08-13, first released in v2.20.0-rc0)), there are scenarios where we *don't* ignore it: if the config store contains no submodule.foo.update setting, the submodule-config API falls back to reading .gitmodules and the repository-supplied !command gets run after all. This was part of a general change over time in submodule support to read more directly from .gitmodules, since unlike .git/config it allows a project to change values between branches and over time (while still allowing .git/config to override things). But it was never intended to apply to this kind of dangerous configuration. The behavior change was not advertised in ee69b2a9's commit message and was missed in review. Let's take the opportunity to make the protection more robust, even in Git versions that are technically not affected: instead of quietly converting 'update = !command' to 'update = none', noisily treat it as an error. Allowing the setting but treating it as meaning something else was just confusing; users are better served by seeing the error sooner. Forbidding the construct makes the semantics simpler and means we can check for it in fsck (in a separate patch). As a result, the submodule-config API cannot read this value from .gitmodules under any circumstance, and we can declare with confidence For security reasons, the '!command' form is not accepted here. Reported-by: Joern Schneeweisz Signed-off-by: Jonathan Nieder Signed-off-by: Johannes Schindelin --- Documentation/gitmodules.txt | 5 ++--- submodule-config.c | 12 ++++++++++-- t/t7406-submodule-update.sh | 14 ++++++++------ 3 files changed, 20 insertions(+), 11 deletions(-) diff --git a/Documentation/gitmodules.txt b/Documentation/gitmodules.txt index db5d47eb19..ac44a1510c 100644 --- a/Documentation/gitmodules.txt +++ b/Documentation/gitmodules.txt @@ -44,9 +44,8 @@ submodule..update:: submodule init` to initialize the configuration variable of the same name. Allowed values here are 'checkout', 'rebase', 'merge' or 'none'. See description of 'update' command in - linkgit:git-submodule[1] for their meaning. Note that the - '!command' form is intentionally ignored here for security - reasons. + linkgit:git-submodule[1] for their meaning. For security + reasons, the '!command' form is not accepted here. submodule..branch:: A remote branch name for tracking updates in the upstream submodule. diff --git a/submodule-config.c b/submodule-config.c index 3414fa1c1b..464908df76 100644 --- a/submodule-config.c +++ b/submodule-config.c @@ -396,6 +396,13 @@ struct parse_config_parameter { int overwrite; }; +/* + * Parse a config item from .gitmodules. + * + * This does not handle submodule-related configuration from the main + * config store (.git/config, etc). Callers are responsible for + * checking for overrides in the main config store when appropriate. + */ static int parse_config(const char *var, const char *value, void *data) { struct parse_config_parameter *me = data; @@ -473,8 +480,9 @@ static int parse_config(const char *var, const char *value, void *data) warn_multiple_config(me->treeish_name, submodule->name, "update"); else if (parse_submodule_update_strategy(value, - &submodule->update_strategy) < 0) - die(_("invalid value for %s"), var); + &submodule->update_strategy) < 0 || + submodule->update_strategy.type == SM_UPDATE_COMMAND) + die(_("invalid value for %s"), var); } else if (!strcmp(item.buf, "shallow")) { if (!me->overwrite && submodule->recommend_shallow != -1) warn_multiple_config(me->treeish_name, submodule->name, diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 6f083c4d68..779932457a 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -406,12 +406,12 @@ test_expect_success 'submodule update - command in .git/config' ' ) ' -test_expect_success 'submodule update - command in .gitmodules is ignored' ' +test_expect_success 'submodule update - command in .gitmodules is rejected' ' test_when_finished "git -C super reset --hard HEAD^" && git -C super config -f .gitmodules submodule.submodule.update "!false" && git -C super commit -a -m "add command to .gitmodules file" && git -C super/submodule reset --hard $submodulesha1^ && - git -C super submodule update submodule + test_must_fail git -C super submodule update submodule ' cat << EOF >expect @@ -480,6 +480,9 @@ test_expect_success 'recursive submodule update - command in .git/config catches ' test_expect_success 'submodule init does not copy command into .git/config' ' + test_when_finished "git -C super update-index --force-remove submodule1" && + test_when_finished git config -f super/.gitmodules \ + --remove-section submodule.submodule1 && (cd super && H=$(git ls-files -s submodule | cut -d" " -f2) && mkdir submodule1 && @@ -487,10 +490,9 @@ test_expect_success 'submodule init does not copy command into .git/config' ' git config -f .gitmodules submodule.submodule1.path submodule1 && git config -f .gitmodules submodule.submodule1.url ../submodule && git config -f .gitmodules submodule.submodule1.update !false && - git submodule init submodule1 && - echo "none" >expect && - git config submodule.submodule1.update >actual && - test_cmp expect actual + test_must_fail git submodule init submodule1 && + test_expect_code 1 git config submodule.submodule1.update >actual && + test_must_be_empty actual ) ' From 7cdafcaacf677b9e0700fa988c247bda192db48d Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 4 Dec 2019 21:33:29 +0100 Subject: [PATCH 30/34] Git 2.15.4 Signed-off-by: Johannes Schindelin --- Documentation/RelNotes/2.15.4.txt | 11 +++++++++++ GIT-VERSION-GEN | 2 +- RelNotes | 2 +- 3 files changed, 13 insertions(+), 2 deletions(-) create mode 100644 Documentation/RelNotes/2.15.4.txt diff --git a/Documentation/RelNotes/2.15.4.txt b/Documentation/RelNotes/2.15.4.txt new file mode 100644 index 0000000000..dc241cba34 --- /dev/null +++ b/Documentation/RelNotes/2.15.4.txt @@ -0,0 +1,11 @@ +Git v2.15.4 Release Notes +========================= + +This release merges up the fixes that appear in v2.14.6 to address +the security issues CVE-2019-1348, CVE-2019-1349, CVE-2019-1350, +CVE-2019-1351, CVE-2019-1352, CVE-2019-1353, CVE-2019-1354, and +CVE-2019-1387; see the release notes for that version for details. + +In conjunction with a vulnerability that was fixed in v2.20.2, +`.gitmodules` is no longer allowed to contain entries of the form +`submodule..update=!command`. diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN index 4a63ce35ad..6fe1c9dcc7 100755 --- a/GIT-VERSION-GEN +++ b/GIT-VERSION-GEN @@ -1,7 +1,7 @@ #!/bin/sh GVF=GIT-VERSION-FILE -DEF_VER=v2.15.3 +DEF_VER=v2.15.4 LF=' ' diff --git a/RelNotes b/RelNotes index e7fe59f5d0..03f405050e 120000 --- a/RelNotes +++ b/RelNotes @@ -1 +1 @@ -Documentation/RelNotes/2.15.3.txt \ No newline at end of file +Documentation/RelNotes/2.15.4.txt \ No newline at end of file From 68440496c77c6d3a606537c78ea4b62eb895a64a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 4 Dec 2019 21:40:01 +0100 Subject: [PATCH 31/34] test-drop-caches: use `has_dos_drive_prefix()` This is a companion patch to 'mingw: handle `subst`-ed "DOS drives"': use the DOS drive prefix handling that is already provided by `compat/mingw.c` (and which just learned to handle non-alphabetical "drive letters"). Signed-off-by: Johannes Schindelin --- t/helper/test-drop-caches.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/t/helper/test-drop-caches.c b/t/helper/test-drop-caches.c index bd1a857d52..f125192c97 100644 --- a/t/helper/test-drop-caches.c +++ b/t/helper/test-drop-caches.c @@ -6,18 +6,21 @@ static int cmd_sync(void) { char Buffer[MAX_PATH]; DWORD dwRet; - char szVolumeAccessPath[] = "\\\\.\\X:"; + char szVolumeAccessPath[] = "\\\\.\\XXXX:"; HANDLE hVolWrite; - int success = 0; + int success = 0, dos_drive_prefix; dwRet = GetCurrentDirectory(MAX_PATH, Buffer); if ((0 == dwRet) || (dwRet > MAX_PATH)) return error("Error getting current directory"); - if ((Buffer[0] < 'A') || (Buffer[0] > 'Z')) - return error("Invalid drive letter '%c'", Buffer[0]); + dos_drive_prefix = has_dos_drive_prefix(Buffer); + if (!dos_drive_prefix) + return error("'%s': invalid drive letter", Buffer); + + memcpy(szVolumeAccessPath, Buffer, dos_drive_prefix); + szVolumeAccessPath[dos_drive_prefix] = '\0'; - szVolumeAccessPath[4] = Buffer[0]; hVolWrite = CreateFile(szVolumeAccessPath, GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, 0, NULL); if (INVALID_HANDLE_VALUE == hVolWrite) From eb288bc455ac67e3ceeff90daf6f25972bb586d0 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 4 Dec 2019 21:45:07 +0100 Subject: [PATCH 32/34] Git 2.16.6 Signed-off-by: Johannes Schindelin --- Documentation/RelNotes/2.16.6.txt | 8 ++++++++ GIT-VERSION-GEN | 2 +- RelNotes | 2 +- 3 files changed, 10 insertions(+), 2 deletions(-) create mode 100644 Documentation/RelNotes/2.16.6.txt diff --git a/Documentation/RelNotes/2.16.6.txt b/Documentation/RelNotes/2.16.6.txt new file mode 100644 index 0000000000..438306e60b --- /dev/null +++ b/Documentation/RelNotes/2.16.6.txt @@ -0,0 +1,8 @@ +Git v2.16.6 Release Notes +========================= + +This release merges up the fixes that appear in v2.14.6 and in +v2.15.4 addressing the security issues CVE-2019-1348, CVE-2019-1349, +CVE-2019-1350, CVE-2019-1351, CVE-2019-1352, CVE-2019-1353, +CVE-2019-1354, and CVE-2019-1387; see the release notes for those +versions for details. diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN index 64f5097bcb..3c4ff15b48 100755 --- a/GIT-VERSION-GEN +++ b/GIT-VERSION-GEN @@ -1,7 +1,7 @@ #!/bin/sh GVF=GIT-VERSION-FILE -DEF_VER=v2.16.5 +DEF_VER=v2.16.6 LF=' ' diff --git a/RelNotes b/RelNotes index 7b0f25d4c7..232b7f1a89 120000 --- a/RelNotes +++ b/RelNotes @@ -1 +1 @@ -Documentation/RelNotes/2.16.5.txt \ No newline at end of file +Documentation/RelNotes/2.16.6.txt \ No newline at end of file From bb92255ebe6bccd76227e023d6d0bc997e318ad0 Mon Sep 17 00:00:00 2001 From: Jonathan Nieder Date: Thu, 5 Dec 2019 01:30:43 -0800 Subject: [PATCH 33/34] fsck: reject submodule.update = !command in .gitmodules This allows hosting providers to detect whether they are being used to attack users using malicious 'update = !command' settings in .gitmodules. Since ac1fbbda2013 (submodule: do not copy unknown update mode from .gitmodules, 2013-12-02), in normal cases such settings have been treated as 'update = none', so forbidding them should not produce any collateral damage to legitimate uses. A quick search does not reveal any repositories making use of this construct, either. Reported-by: Joern Schneeweisz Signed-off-by: Jonathan Nieder Signed-off-by: Johannes Schindelin --- fsck.c | 7 +++++++ t/t7406-submodule-update.sh | 14 ++++++++++++++ 2 files changed, 21 insertions(+) diff --git a/fsck.c b/fsck.c index 2fc6bbca16..0741e62586 100644 --- a/fsck.c +++ b/fsck.c @@ -66,6 +66,7 @@ static struct oidset gitmodules_done = OIDSET_INIT; FUNC(GITMODULES_SYMLINK, ERROR) \ FUNC(GITMODULES_URL, ERROR) \ FUNC(GITMODULES_PATH, ERROR) \ + FUNC(GITMODULES_UPDATE, ERROR) \ /* warnings */ \ FUNC(BAD_FILEMODE, WARN) \ FUNC(EMPTY_NAME, WARN) \ @@ -975,6 +976,12 @@ static int fsck_gitmodules_fn(const char *var, const char *value, void *vdata) FSCK_MSG_GITMODULES_PATH, "disallowed submodule path: %s", value); + if (!strcmp(key, "update") && value && + parse_submodule_update_type(value) == SM_UPDATE_COMMAND) + data->ret |= report(data->options, data->obj, + FSCK_MSG_GITMODULES_UPDATE, + "disallowed submodule update setting: %s", + value); free(name); return 0; diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh index 779932457a..ceb5eed6e1 100755 --- a/t/t7406-submodule-update.sh +++ b/t/t7406-submodule-update.sh @@ -414,6 +414,20 @@ test_expect_success 'submodule update - command in .gitmodules is rejected' ' test_must_fail git -C super submodule update submodule ' +test_expect_success 'fsck detects command in .gitmodules' ' + git init command-in-gitmodules && + ( + cd command-in-gitmodules && + git submodule add ../submodule submodule && + test_commit adding-submodule && + + git config -f .gitmodules submodule.submodule.update "!false" && + git add .gitmodules && + test_commit configuring-update && + test_must_fail git fsck + ) +' + cat << EOF >expect Execution of 'false $submodulesha1' failed in submodule path 'submodule' EOF From a5ab8d03173458b76b8452efd90a7173f490c132 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 4 Dec 2019 22:13:04 +0100 Subject: [PATCH 34/34] Git 2.17.3 Signed-off-by: Johannes Schindelin --- Documentation/RelNotes/2.17.3.txt | 12 ++++++++++++ GIT-VERSION-GEN | 2 +- RelNotes | 2 +- 3 files changed, 14 insertions(+), 2 deletions(-) create mode 100644 Documentation/RelNotes/2.17.3.txt diff --git a/Documentation/RelNotes/2.17.3.txt b/Documentation/RelNotes/2.17.3.txt new file mode 100644 index 0000000000..5a46c94271 --- /dev/null +++ b/Documentation/RelNotes/2.17.3.txt @@ -0,0 +1,12 @@ +Git v2.17.3 Release Notes +========================= + +This release merges up the fixes that appear in v2.14.6 and in +v2.15.4 addressing the security issues CVE-2019-1348, CVE-2019-1349, +CVE-2019-1350, CVE-2019-1351, CVE-2019-1352, CVE-2019-1353, +CVE-2019-1354, and CVE-2019-1387; see the release notes for those +versions for details. + +In addition, `git fsck` was taught to identify `.gitmodules` entries +of the form `submodule..update=!command`, which have been +disallowed in v2.15.4. diff --git a/GIT-VERSION-GEN b/GIT-VERSION-GEN index bc54879938..bd6cd16e3d 100755 --- a/GIT-VERSION-GEN +++ b/GIT-VERSION-GEN @@ -1,7 +1,7 @@ #!/bin/sh GVF=GIT-VERSION-FILE -DEF_VER=v2.17.2 +DEF_VER=v2.17.3 LF=' ' diff --git a/RelNotes b/RelNotes index 733d1745a9..d14bdb5eda 120000 --- a/RelNotes +++ b/RelNotes @@ -1 +1 @@ -Documentation/RelNotes/2.17.2.txt \ No newline at end of file +Documentation/RelNotes/2.17.3.txt \ No newline at end of file