Users of oneway_merge() (like "reset --hard") learned to take
advantage of fsmonitor to avoid unnecessary lstat(2) calls.
* us/unpack-trees-fsmonitor:
unpack-trees: skip stat on fsmonitor-valid files
Recently we have declared that GIT_TEST_* variables take the
usual boolean values (it used to be that some used "non-empty
means true" and taking GIT_TEST_VAR=YesPlease as true); make
sure we notice and fail when non-bool strings are given to
these variables.
* sg/test-bool-env:
t5608-clone-2gb.sh: turn GIT_TEST_CLONE_2GB into a bool
tests: add 'test_bool_env' to catch non-bool GIT_TEST_* values
The revision walking machinery uses resources like per-object flag
bits that need to be reset before a new iteration of walking
begins, but the resources related to topological walk were not
cleared correctly, which has been corrected.
* mh/clear-topo-walk-upon-reset:
revision: free topo_walk_info before creating a new one in init_topo_walk
revision: clear the topo-walk flags in reset_revision_walk
We have had compatibility fallback macro definitions for "PRIuMAX",
"PRIu32", etc. but did not for "PRIdMAX", while the code used the
last one apparently without any hiccup reported recently. The
fallback macro definitions for these <inttypes.h> macros that must
appear in C99 systems have been removed.
* hv/assume-priumax-is-available-anywhere:
git-compat-util.h: drop the `PRIuMAX` and other fallback definitions
"git submodule status" that is run from a subdirectory of the
superproject did not work well, which has been corrected.
* mg/submodule-status-from-a-subdirectory:
submodule: fix 'submodule status' when called from a subdirectory
Test cleanup.
* dl/t5520-cleanup:
t5520: replace `! git` with `test_must_fail git`
t5520: remove redundant lines in test cases
t5520: replace $(cat ...) comparison with test_cmp
t5520: don't put git in upstream of pipe
t5520: test single-line files by git with test_cmp
t5520: use test_cmp_rev where possible
t5520: replace test -{n,z} with test-lib functions
t5520: use test_line_count where possible
t5520: remove spaces after redirect operator
t5520: replace test -f with test-lib functions
t5520: let sed open its own input
t5520: use sq for test case names
t5520: improve test style
t: teach test_cmp_rev to accept ! for not-equals
t0000: test multiple local assignment
"git reset --patch $object" without any pathspec should allow a
tree object to be given, but incorrectly required a committish,
which has been corrected.
* nl/reset-patch-takes-a-tree:
reset: parse rev as tree-ish in patch mode
"git submodule status" and "git submodule status --cached" show
different things, but the documentation did not cover them
correctly, which has been corrected.
* mg/doc-submodule-status-cached:
doc: document 'git submodule status --cached'
The code to parse GPG output used to assume incorrectly that the
finterprint for the primary key would always be present for a valid
signature, which has been corrected.
* hi/gpg-optional-pkfp-fix:
gpg-interface: limit search for primary key fingerprint
gpg-interface: refactor the free-and-xmemdupz pattern
The sequencer machinery compared the HEAD and the state it is
attempting to commit to decide if the result would be a no-op
commit, even when amending a commit, which was incorrect, and
has been corrected.
* pw/sequencer-compare-with-right-parent-to-check-empty-commits:
sequencer: fix empty commit check when amending
"git rev-parse --show-toplevel" run outside of any working tree did
not error out, which has been corrected.
* jk/fail-show-toplevel-outside-working-tree:
rev-parse: make --show-toplevel without a worktree an error
"git unpack-objects" used to show progress based only on the number
of received and unpacked objects, which stalled when it has to
handle an unusually large object. It now shows the throughput as
well.
* sg/unpack-progress-throughput:
builtin/unpack-objects.c: show throughput progress
CI jobs for macOS has been made less chatty when updating perforce
package used during testing.
* jc/azure-ci-osx-fix-fix:
ci(osx): update homebrew-cask repository with less noise
"git range-diff" learned to take the "--notes=<ref>" and the
"--no-notes" options to control the commit notes included in the
log message that gets compared.
* dl/range-diff-with-notes:
format-patch: pass notes configuration to range-diff
range-diff: pass through --notes to `git log`
range-diff: output `## Notes ##` header
t3206: range-diff compares logs with commit notes
t3206: s/expected/expect/
t3206: disable parameter substitution in heredoc
t3206: remove spaces after redirect operators
pretty-options.txt: --notes accepts a ref instead of treeish
rev-list-options.txt: remove reference to --show-notes
argv-array: add space after `while`
The userdiff machinery has been taught that "async def" is another
way to begin a "function" in Python.
* jh/userdiff-python-async:
userdiff: support Python async functions
The logic to avoid duplicate label names generated by "git rebase
--rebase-merges" forgot that the machinery itself uses "onto" as a
label name, which must be avoided by auto-generated labels, which
has been corrected.
* dd/rebase-merge-reserves-onto-label:
sequencer: handle rebase-merges for "onto" message
The beginning of rewriting "git add -i" in C.
* js/builtin-add-i:
built-in add -i: implement the `help` command
built-in add -i: use color in the main loop
built-in add -i: support `?` (prompt help)
built-in add -i: show unique prefixes of the commands
built-in add -i: implement the main loop
built-in add -i: color the header in the `status` command
built-in add -i: implement the `status` command
diff: export diffstat interface
Start to implement a built-in version of `git add --interactive`
A label used in the todo list that are generated by "git rebase
--rebase-merges" is used as a part of a refname; the logic to come
up with the label has been tightened to avoid names that cannot be
used as such.
* js/rebase-r-safer-label:
rebase -r: let `label` generate safer labels
rebase-merges: move labels' whitespace mangling into `label_oid()`
git-gui learned to delete untracked files when the "Revert Changes"
option is selected. Since there are two types of revert operations (one
for tracked files and one for untracked ones), the "checkout" and
"deletion" operations are done in parallel. The status bar is updated
to allow both to use it in parallel.
* jg/revert-untracked:
git-gui: revert untracked files by deleting them
git-gui: update status bar to track operations
git-gui: consolidate naming conventions
Update the revert_helper proc to check for untracked files as well as
changes, and then handle changes to be reverted and untracked files with
independent blocks of code. Prompt the user independently for untracked
files, since the underlying action is fundamentally different (rm -f).
If after deleting untracked files, the directory containing them becomes
empty, then remove the directory as well. Migrate unlocking of the index
out of _close_updateindex to a responsibility of the caller, to permit
paths that don't directly unlock the index, and refactor the error
handling added in d4e890e5 so that callers can make flow control
decisions in the event of errors. Update Tcl/Tk dependency from 8.4 to
8.6 in git-gui.sh.
A new proc delete_files takes care of actually deleting the files in
batches, using the Tcler's Wiki recommended approach for keeping the UI
responsive.
Since the checkout_index and delete_files calls are both asynchronous
and could potentially complete in any order, a "chord" is used to
coordinate unlocking the index and returning the UI to a usable state
only after both operations are complete. The `SimpleChord` class,
based on TclOO (Tcl/Tk 8.6), is added in this commit.
Signed-off-by: Jonathan Gilbert <JonathanG@iQmetrix.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Update the status bar to track updates as individual "operations" that
can overlap. Update all call sites to interact with the new status bar
mechanism. Update initialization to explicitly clear status text,
since otherwise it may persist across future operations.
Signed-off-by: Jonathan Gilbert <JonathanG@iQmetrix.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
A few variables in this file use camelCase, while the overall standard
is snake_case. A consistent naming scheme will improve readability of
future changes. To avoid mixing naming changes with semantic changes,
this commit contains only naming changes.
Signed-off-by: Jonathan Gilbert <JonathanG@iQmetrix.com>
Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>
Changes involving only blank lines are hidden with --ignore-blank-lines,
unless they appear in the context lines of other changes. This is
handled by xdl_get_hunk() for context added by --inter-hunk-context, -u
and -U.
Function context for -W and --function-context added by xdl_emit_diff()
doesn't pay attention to such ignored changes; it relies fully on
xdl_get_hunk() and shows just the post-image of ignored changes
appearing in function context. That's inconsistent and confusing.
Improve the result of using --ignore-blank-lines and --function-context
together by fully showing ignored changes if they happen to fall within
function context.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While the only permitted drive letters for physical drives on Windows
are letters of the US-English alphabet, this restriction does not apply
to virtual drives assigned via `subst <letter>: <path>`.
To prevent targeted attacks against systems where "funny" drive letters
such as `1` or `!` are assigned, let's handle them as regular drive
letters on Windows.
This fixes CVE-2019-1351.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, filenames cannot have trailing spaces or periods, when
opening such paths, they are stripped automatically. Read: you can open
the file `README` via the file name `README . . .`. This ambiguity can
be used in combination with other security bugs to cause e.g. remote
code execution during recursive clones. This patch series fixes that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch fixes a vulnerability in the Windows-specific code where a
submodule names ending in a backslash were quoted incorrectly, and that
bug could be abused to insert command-line parameters e.g. to `ssh` in a
recursive clone.
Note: this bug is Windows-only, as we have to construct a command line
for the process-to-spawn, unlike Linux/macOS, where `execv()` accepts an
already-split command line.
While at it, other quoting issues are fixed as well.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Recursive clones are currently affected by a vulnerability that is
caused by too-lax validation of submodule names.
This topic branch fixes that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch series makes it safe to use Git on Windows drives, even if
running on a mounted network share or within the Windows Subsystem for
Linux (WSL).
This topic branch addresses CVE-2019-1353.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Over a decade ago, in 25fe217b86 (Windows: Treat Windows style path
names., 2008-03-05), Git was taught to handle absolute Windows paths,
i.e. paths that start with a drive letter and a colon.
Unbeknownst to us, while drive letters of physical drives are limited to
letters of the English alphabet, there is a way to assign virtual drive
letters to arbitrary directories, via the `subst` command, which is
_not_ limited to English letters.
It is therefore possible to have absolute Windows paths of the form
`1:\what\the\hex.txt`. Even "better": pretty much arbitrary Unicode
letters can also be used, e.g. `ä:\tschibät.sch`.
While it can be sensibly argued that users who set up such funny drive
letters really seek adverse consequences, the Windows Operating System
is known to be a platform where many users are at the mercy of
administrators who have their very own idea of what constitutes a
reasonable setup.
Therefore, let's just make sure that such funny paths are still
considered absolute paths by Git, on Windows.
In addition to Unicode characters, pretty much any character is a valid
drive letter, as far as `subst` is concerned, even `:` and `"` or even a
space character. While it is probably the opposite of smart to use them,
let's safeguard `is_dos_drive_prefix()` against all of them.
Note: `[::1]:repo` is a valid URL, but not a valid path on Windows.
As `[` is now considered a valid drive letter, we need to be very
careful to avoid misinterpreting such a string as valid local path in
`url_is_local_not_ssh()`. To do that, we use the just-introduced
function `is_valid_path()` (which will label the string as invalid file
name because of the colon characters).
This fixes CVE-2019-1351.
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This patch series plugs an attack vector we had overlooked in our
December 2014 work on `core.protectNTFS`.
Essentially, the path `.git::$INDEX_ALLOCATION/config` is interpreted as
`.git/config` when NTFS Alternate Data Streams are available (which they
are on Windows, and at least on network shares that are SMB-mounted on
macOS).
Needless to say: we don't want that.
In fact, we want to stay on the very safe side and not even special-case
the `$INDEX_ALLOCATION` stream type: let's just prevent Git from
touching _any_ explicitly specified Alternate Data Stream of `.git`.
In essence, we'll prevent Git from tracking, or writing to, any path
with a segment of the form `.git:<anything>`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When creating a directory on Windows whose path ends in a space or a
period (or chains thereof), the Win32 API "helpfully" trims those. For
example, `mkdir("abc ");` will return success, but actually create a
directory called `abc` instead.
This stems back to the DOS days, when all file names had exactly 8
characters plus exactly 3 characters for the file extension, and the
only way to have shorter names was by padding with spaces.
Sadly, this "helpful" behavior is a bit inconsistent: after a successful
`mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because
the directory `abc ` does not actually exist).
Even if it would work, we now have a serious problem because a Git
repository could contain directories `abc` and `abc `, and on Windows,
they would be "merged" unintentionally.
As these paths are illegal on Windows, anyway, let's disallow any
accesses to such paths on that Operating System.
For practical reasons, this behavior is still guarded by the
config setting `core.protectNTFS`: it is possible (and at least two
regression tests make use of it) to create commits without involving the
worktree. In such a scenario, it is of course possible -- even on
Windows -- to create such file names.
Among other consequences, this patch disallows submodules' paths to end
in spaces on Windows (which would formerly have confused Git enough to
try to write into incorrect paths, anyway).
While this patch does not fix a vulnerability on its own, it prevents an
attack vector that was exploited in demonstrations of a number of
recently-fixed security bugs.
The regression test added to `t/t7417-submodule-path-url.sh` reflects
that attack vector.
Note that we have to adjust the test case "prevent git~1 squatting on
Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue.
It tries to clone two submodules whose names differ only in a trailing
period character, and as a consequence their git directories differ in
the same way. Previously, when Git tried to clone the second submodule,
it thought that the git directory already existed (because on Windows,
when you create a directory with the name `b.` it actually creates `b`),
but with this patch, the first submodule's clone will fail because of
the illegal name of the git directory. Therefore, when cloning the
second submodule, Git will take a different code path: a fresh clone
(without an existing git directory). Both code paths fail to clone the
second submodule, both because the the corresponding worktree directory
exists and is not empty, but the error messages are worded differently.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
It is unfortunate that we need to quote arguments differently on
Windows, depending whether we build a command-line for MSYS2's `sh` or
for other Windows executables.
We already have a test helper to verify the latter, with this patch we
can also verify the former.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Certain characters are not admissible in file names on Windows, even if
Cygwin/MSYS2 (and therefore, Git for Windows' Bash) pretend that they
are, e.g. `:`, `<`, `>`, etc
Let's disallow those characters explicitly in Windows builds of Git.
Note: just like trailing spaces or periods, it _is_ possible on Windows
to create commits adding files with such illegal characters, as long as
the operation leaves the worktree untouched. To allow for that, we
continue to guard `is_valid_win32_path()` behind the config setting
`core.protectNTFS`, so that users _can_ continue to do that, as long as
they turn the protections off via that config setting.
Among other problems, this prevents Git from trying to write to an "NTFS
Alternate Data Stream" (which refers to metadata stored alongside a
file, under a special name: "<filename>:<stream-name>"). This fix
therefore also prevents an attack vector that was exploited in
demonstrations of a number of recently-fixed security bugs.
Further reading on illegal characters in Win32 filenames:
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
A `git clone` will end with exit code 0 when `merged_entry()` returns a
positive value during a call of `unpack_trees()` to `traverse_trees()`.
The reason is that `unpack_trees()` will interpret a positive value not
to be an error.
The problem is, however, that `add_index_entry()` (which is called by
`merged_entry()` can report an error, and we really should fail the
entire clone in such a case.
Let's fix this problem, in preparation for a Windows-specific patch
disallowing `mkdir()` with directory names that contain a trailing space
(which is illegal on NTFS): we want `git clone` to abort when a path
cannot be checked out due to that condition.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the, say, 93rd trial run fails, it is a good idea to have a way to
skip the first 92 trials and dig directly into the 93rd in a debugger.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, file names cannot contain asterisks nor newline characters.
In an upcoming commit, we will make this limitation explicit,
disallowing even the creation of commits that introduce such file names.
However, in the test scripts touched by this patch, we _know_ that those
paths won't be checked out, so we _want_ to allow such file names.
Happily, the stringent path validation will be guarded via the
`core.protectNTFS` flag, so all we need to do is to force that flag off
temporarily.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the stress test reported a problem with quoting certain arguments,
it is helpful to have a facility to play with those arguments in order
to find out whether variations of those arguments are affected, too.
Let's allow `test-run-command quote-stress-test -- <args>` to be used
for that purpose.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, we have to do all the command-line argument quoting
ourselves. Worse: we have to have two versions of said quoting, one for
MSYS2 programs (which have their own dequoting rules) and the rest.
We care mostly about the rest, and to make sure that that works, let's
have a stress test that comes up with all kinds of awkward arguments,
verifying that a spawned sub-process receives those unharmed.
Signed-off-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Currently it is technically possible to let a submodule's git
directory point right into the git dir of a sibling submodule.
Example: the git directories of two submodules with the names `hippo`
and `hippo/hooks` would be `.git/modules/hippo/` and
`.git/modules/hippo/hooks/`, respectively, but the latter is already
intended to house the former's hooks.
In most cases, this is just confusing, but there is also a (quite
contrived) attack vector where Git can be fooled into mistaking remote
content for file contents it wrote itself during a recursive clone.
Let's plug this bug.
To do so, we introduce the new function `validate_submodule_git_dir()`
which simply verifies that no git dir exists for any leading directories
of the submodule name (if there are any).
Note: this patch specifically continues to allow sibling modules names
of the form `core/lib`, `core/doc`, etc, as long as `core` is not a
submodule name.
This fixes CVE-2019-1387.
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Back in the DOS days, in the FAT file system, file names always
consisted of a base name of length 8 plus a file extension of length 3.
Shorter file names were simply padded with spaces to the full 8.3
format.
Later, the FAT file system was taught to support _also_ longer names,
with an 8.3 "short name" as primary file name. While at it, the same
facility allowed formerly illegal file names, such as `.git` (empty base
names were not allowed), which would have the "short name" `git~1`
associated with it.
For backwards-compatibility, NTFS supports alternative 8.3 short
filenames, too, even if starting with Windows Vista, they are only
generated on the system drive by default.
We addressed the problem that the `.git/` directory can _also_ be
accessed via `git~1/` (when short names are enabled) in 2b4c6efc82
(read-cache: optionally disallow NTFS .git variants, 2014-12-16), i.e.
since Git v1.9.5, by introducing the config setting `core.protectNTFS`
and enabling it by default on Windows.
In the meantime, Windows 10 introduced the "Windows Subsystem for Linux"
(short: WSL), i.e. a way to run Linux applications/distributions in a
thinly-isolated subsystem on Windows (giving rise to many a "2016 is the
Year of Linux on the Desktop" jokes). WSL is getting increasingly
popular, also due to the painless way Linux application can operate
directly ("natively") on files on Windows' file system: the Windows
drives are mounted automatically (e.g. `C:` as `/mnt/c/`).
Taken together, this means that we now have to enable the safe-guards of
Git v1.9.5 also in WSL: it is possible to access a `.git` directory
inside `/mnt/c/` via the 8.3 name `git~1` (unless short name generation
was disabled manually). Since regular Linux distributions run in WSL,
this means we have to enable `core.protectNTFS` at least on Linux, too.
To enable Services for Macintosh in Windows NT to store so-called
resource forks, NTFS introduced "Alternate Data Streams". Essentially,
these constitute additional metadata that are connected to (and copied
with) their associated files, and they are accessed via pseudo file
names of the form `filename:<stream-name>:<stream-type>`.
In a recent patch, we extended `core.protectNTFS` to also protect
against accesses via NTFS Alternate Data Streams, e.g. to prevent
contents of the `.git/` directory to be "tracked" via yet another
alternative file name.
While it is not possible (at least by default) to access files via NTFS
Alternate Data Streams from within WSL, the defaults on macOS when
mounting network shares via SMB _do_ allow accessing files and
directories in that way. Therefore, we need to enable `core.protectNTFS`
on macOS by default, too, and really, on any Operating System that can
mount network shares via SMB/CIFS.
A couple of approaches were considered for fixing this:
1. We could perform a dynamic NTFS check similar to the `core.symlinks`
check in `init`/`clone`: instead of trying to create a symbolic link
in the `.git/` directory, we could create a test file and try to
access `.git/config` via 8.3 name and/or Alternate Data Stream.
2. We could simply "flip the switch" on `core.protectNTFS`, to make it
"on by default".
The obvious downside of 1. is that it won't protect worktrees that were
clone with a vulnerable Git version already. We considered patching code
paths that check out files to check whether we're running on an NTFS
system dynamically and persist the result in the repository-local config
setting `core.protectNTFS`, but in the end decided that this solution
would be too fragile, and too involved.
The obvious downside of 2. is that everybody will have to "suffer" the
performance penalty incurred from calling `is_ntfs_dotgit()` on every
path, even in setups where.
After the recent work to accelerate `is_ntfs_dotgit()` in most cases,
it looks as if the time spent on validating ten million random
file names increases only negligibly (less than 20ms, well within the
standard deviation of ~50ms). Therefore the benefits outweigh the cost.
Another downside of this is that paths that might have been acceptable
previously now will be forbidden. Realistically, though, this is an
improvement because public Git hosters already would reject any `git
push` that contains such file names.
Note: There might be a similar problem mounting HFS+ on Linux. However,
this scenario has been considered unlikely and in light of the cost (in
the aforementioned benchmark, `core.protectHFS = true` increased the
time from ~440ms to ~610ms), it was decided _not_ to touch the default
of `core.protectHFS`.
This change addresses CVE-2019-1353.
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Helped-by: Garima Singh <garima.singh@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We just safe-guarded `.git` against NTFS Alternate Data Stream-related
attack vectors, and now it is time to do the same for `.gitmodules`.
Note: In the added regression test, we refrain from verifying all kinds
of variations between short names and NTFS Alternate Data Streams: as
the new code disallows _all_ Alternate Data Streams of `.gitmodules`, it
is enough to test one in order to know that all of them are guarded
against.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We need to be careful to follow proper quoting rules. For example, if an
argument contains spaces, we have to quote them. Double-quotes need to
be escaped. Backslashes need to be escaped, but only if they are
followed by a double-quote character.
We need to be _extra_ careful to consider the case where an argument
ends in a backslash _and_ needs to be quoted: in this case, we append a
double-quote character, i.e. the backslash now has to be escaped!
The current code, however, fails to recognize that, and therefore can
turn an argument that ends in a single backslash into a quoted argument
that now ends in an escaped double-quote character. This allows
subsequent command-line parameters to be split and part of them being
mistaken for command-line options, e.g. through a maliciously-crafted
submodule URL during a recursive clone.
Technically, we would not need to quote _all_ arguments which end in a
backslash _unless_ the argument needs to be quoted anyway. For example,
`test\` would not need to be quoted, while `test \` would need to be.
To keep the code simple, however, and therefore easier to reason about
and ensure its correctness, we now _always_ quote an argument that ends
in a backslash.
This addresses CVE-2019-1350.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Previously, this function was written without focusing on speed,
intending to make reviewing the code as easy as possible, to avoid any
bugs in this critical code.
Turns out: we can do much better on both accounts. With this patch, we
make it as fast as this developer can make it go:
- We avoid the call to `is_dir_sep()` and make all the character
comparisons explicit.
- We avoid the cost of calling `strncasecmp()` and unroll the test for
`.git` and `git~1`, not even using `tolower()` because it is faster to
compare against two constant values.
- We look for `.git` and `.git~1` first thing, and return early if not
found.
- We also avoid calling a separate function for detecting chains of
spaces and periods.
Each of these improvements has a noticeable impact on the speed of
`is_ntfs_dotgit()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Probably inspired by HFS' resource streams, NTFS supports "Alternate
Data Streams": by appending `:<stream-name>` to the file name,
information in addition to the file contents can be written and read,
information that is copied together with the file (unless copied to a
non-NTFS location).
These Alternate Data Streams are typically used for things like marking
an executable as having just been downloaded from the internet (and
hence not necessarily being trustworthy).
In addition to a stream name, a stream type can be appended, like so:
`:<stream-name>:<stream-type>`. Unless specified, the default stream
type is `$DATA` for files and `$INDEX_ALLOCATION` for directories. In
other words, `.git::$INDEX_ALLOCATION` is a valid way to reference the
`.git` directory!
In our work in Git v2.2.1 to protect Git on NTFS drives under
`core.protectNTFS`, we focused exclusively on NTFS short names, unaware
of the fact that NTFS Alternate Data Streams offer a similar attack
vector.
Let's fix this.
Seeing as it is better to be safe than sorry, we simply disallow paths
referring to *any* NTFS Alternate Data Stream of `.git`, not just
`::$INDEX_ALLOCATION`. This also simplifies the implementation.
This closes CVE-2019-1352.
Further reading about NTFS Alternate Data Streams:
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c54dec26-1551-4d3a-a0ea-4fa40f848eb3
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>