Now that the index can block pathnames that can be mistaken
to mean ".git" on NTFS and FAT32, it would be helpful for
fsck to notice such problematic paths. This lets servers
which use receive.fsckObjects block them before the damage
spreads.
Note that the fsck check is always on, even for systems
without core.protectNTFS set. This is technically more
restrictive than we need to be, as a set of users on ext4
could happily use these odd filenames without caring about
NTFS.
However, on balance, it's helpful for all servers to block
these (because the paths can be used for mischief, and
servers which bother to fsck would want to stop the spread
whether they are on NTFS themselves or not), and hardly
anybody will be affected (because the blocked names are
variants of .git or git~1, meaning mischief is almost
certainly what the tree author had in mind).
Ideally these would be controlled by a separate
"fsck.protectNTFS" flag. However, it would be much nicer to
be able to enable/disable _any_ fsck flag individually, and
any scheme we choose should match such a system. Given the
likelihood of anybody using such a path in practice, it is
not unreasonable to wait until such a system materializes.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of disallowing ".git" in the index is that we
would never want to accidentally overwrite files in the
repository directory. But this means we need to respect the
filesystem's idea of when two paths are equal. The prior
commit added a helper to make such a comparison for NTFS
and FAT32; let's use it in verify_path().
We make this check optional for two reasons:
1. It restricts the set of allowable filenames, which is
unnecessary for people who are not on NTFS nor FAT32.
In practice this probably doesn't matter, though, as
the restricted names are rather obscure and almost
certainly would never come up in practice.
2. It has a minor performance penalty for every path we
insert into the index.
This patch ties the check to the core.protectNTFS config
option. Though this is expected to be most useful on Windows,
we allow it to be set everywhere, as NTFS may be mounted on
other platforms. The variable does default to on for Windows,
though.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not allow paths with a ".git" component to be added to
the index, as that would mean repository contents could
overwrite our repository files. However, asking "is this
path the same as .git" is not as simple as strcmp() on some
filesystems.
On NTFS (and FAT32), there exist so-called "short names" for
backwards-compatibility: 8.3 compliant names that refer to the same files
as their long names. As ".git" is not an 8.3 compliant name, a short name
is generated automatically, typically "git~1".
Depending on the Windows version, any combination of trailing spaces and
periods are ignored, too, so that both "git~1." and ".git." still refer
to the Git directory. The reason is that 8.3 stores file names shorter
than 8 characters with trailing spaces. So literally, it does not matter
for the short name whether it is padded with spaces or whether it is
shorter than 8 characters, it is considered to be the exact same.
The period is the separator between file name and file extension, and
again, an empty extension consists just of spaces in 8.3 format. So
technically, we would need only take care of the equivalent of this
regex:
(\.git {0,4}|git~1 {0,3})\. {0,3}
However, there are indications that at least some Windows versions might
be more lenient and accept arbitrary combinations of trailing spaces and
periods and strip them out. So we're playing it real safe here. Besides,
there can be little doubt about the intention behind using file names
matching even the more lenient pattern specified above, therefore we
should be fine with disallowing such patterns.
Extra care is taken to catch names such as '.\\.git\\booh' because the
backslash is marked as a directory separator only on Windows, and we want
to use this new helper function also in fsck on other platforms.
A big thank you goes to Ed Thomson and an unnamed Microsoft engineer for
the detailed analysis performed to come up with the corresponding fixes
for libgit2.
This commit adds a function to detect whether a given file name can refer
to the Git directory by mistake.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the index can block pathnames that case-fold to
".git" on HFS+, it would be helpful for fsck to notice such
problematic paths. This lets servers which use
receive.fsckObjects block them before the damage spreads.
Note that the fsck check is always on, even for systems
without core.protectHFS set. This is technically more
restrictive than we need to be, as a set of users on ext4
could happily use these odd filenames without caring about
HFS+.
However, on balance, it's helpful for all servers to block
these (because the paths can be used for mischief, and
servers which bother to fsck would want to stop the spread
whether they are on HFS+ themselves or not), and hardly
anybody will be affected (because the blocked names are
variants of .git with invisible Unicode code-points mixed
in, meaning mischief is almost certainly what the tree
author had in mind).
Ideally these would be controlled by a separate
"fsck.protectHFS" flag. However, it would be much nicer to
be able to enable/disable _any_ fsck flag individually, and
any scheme we choose should match such a system. Given the
likelihood of anybody using such a path in practice, it is
not unreasonable to wait until such a system materializes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The point of disallowing ".git" in the index is that we
would never want to accidentally overwrite files in the
repository directory. But this means we need to respect the
filesystem's idea of when two paths are equal. The prior
commit added a helper to make such a comparison for HFS+;
let's use it in verify_path.
We make this check optional for two reasons:
1. It restricts the set of allowable filenames, which is
unnecessary for people who are not on HFS+. In practice
this probably doesn't matter, though, as the restricted
names are rather obscure and almost certainly would
never come up in practice.
2. It has a minor performance penalty for every path we
insert into the index.
This patch ties the check to the core.protectHFS config
option. Though this is expected to be most useful on OS X,
we allow it to be set everywhere, as HFS+ may be mounted on
other platforms. The variable does default to on for OS X,
though.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not allow paths with a ".git" component to be added to
the index, as that would mean repository contents could
overwrite our repository files. However, asking "is this
path the same as .git" is not as simple as strcmp() on some
filesystems.
HFS+'s case-folding does more than just fold uppercase into
lowercase (which we already handle with strcasecmp). It may
also skip past certain "ignored" Unicode code points, so
that (for example) ".gi\u200ct" is mapped ot ".git".
The full list of folds can be found in the tables at:
https://www.opensource.apple.com/source/xnu/xnu-1504.15.3/bsd/hfs/hfscommon/Unicode/UCStringCompareData.h
Implementing a full "is this path the same as that path"
comparison would require us importing the whole set of
tables. However, what we want to do is much simpler: we
only care about checking ".git". We know that 'G' is the
only thing that folds to 'g', and so on, so we really only
need to deal with the set of ignored code points, which is
much smaller.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We complain about ".git" in a tree because it cannot be
loaded into the index or checked out. Since we now also
reject ".GIT" case-insensitively, fsck should notice the
same, so that errors do not propagate.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check that fsck notices and complains about confusing
paths in trees. However, there are a few shortcomings:
1. We check only for these paths as file entries, not as
intermediate paths (so ".git" and not ".git/foo").
2. We check "." and ".." together, so it is possible that
we notice only one and not the other.
3. We repeat a lot of boilerplate.
Let's use some loops to be more thorough in our testing, and
still end up with shorter code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not allow ".git" to enter into the index as a path
component, because checking out the result to the working
tree may causes confusion for subsequent git commands.
However, on case-insensitive file systems, ".Git" or ".GIT"
is the same. We should catch and prevent those, too.
Note that technically we could allow this for repos on
case-sensitive filesystems. But there's not much point. It's
unlikely that anybody cares, and it creates a repository
that is unexpectedly non-portable to other systems.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We should prevent nonsense paths from entering the index in
the first place, as they can cause confusing results if they
are ever checked out into the working tree. We already do
so, but we never tested it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When unpack_trees tries to write an entry to the index,
add_index_entry may report an error to stderr, but we ignore
its return value. This leads to us returning a successful
exit code for an operation that partially failed. Let's make
sure to propagate this code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to recognize the packed-refs header and skip to the
"traits" part of the line. We currently do it by feeding
sizeof() a static const array to strncmp. However, it's a
bit simpler to just skip_prefix, which expresses the
intention more directly, and without remembering to account
for the NUL-terminator in each sizeof() call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have a strbuf in read_packed_refs, we can pass
it straight to the line parser, which saves us an extra
strlen.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Current code uses a fixed PATH_MAX-sized buffer for reading
packed-refs lines. This is a reasonable guess, in the sense
that git generally cannot work with refs larger than
PATH_MAX. However, there are a few cases where it is not
great:
1. Some systems may have a low value of PATH_MAX, but can
actually handle larger paths in practice. Fixing this
code path probably isn't enough to make them work
completely with long refs, but it is a step in the
right direction.
2. We use fgets, which will happily give us half a line on
the first read, and then the rest of the line on the
second. This is probably OK in practice, because our
refline parser is careful enough to look for the
trailing newline on the first line. The second line may
look like a peeled line to us, but since "^" is illegal
in refnames, it is not likely to come up.
Still, it does not hurt to be more careful.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our loop should always process all lines, even if we hit the
beginning of the file. We have a conditional after the loop
ends to double-check that there is nothing left and to
process it. But this should never happen, and is a sign of a
logic bug in the loop. Let's turn it into a BUG assertion.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we read a reflog file in reverse, we read whole chunks
of BUFSIZ bytes, then loop over the buffer, parsing any
lines we find. We find the beginning of each line by looking
for the newline from the previous line. If we don't find
one, we know that we are either at the beginning of
the file, or that we have to read another block.
In the latter case, we stuff away what we have into a
strbuf, read another block, and continue our parse. But we
missed one case here. If we did find a newline, and it is at
the beginning of the block, we must also stuff that newline
into the strbuf, as it belongs to the block we are about to
read.
The minimal fix here would be to add this special case to
the conditional that checks whether we found a newline.
But we can make the flow a little clearer by rearranging a
bit: we first handle lines that we are going to show, and
then at the end of each loop, stuff away any leftovers if
necessary. That lets us fold this special-case in with the
more common "we ended in the middle of a line" case.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The plan for the push.default transition had all along been
to use the "simple" method rather than "upstream" as a
default if the user did not specify their own push.default
value. Commit 11037ee (push: switch default from "matching"
to "simple", 2013-01-04) tried to implement that by moving
PUSH_DEFAULT_UNSPECIFIED in our switch statement to
fall-through to the PUSH_DEFAULT_SIMPLE case.
When the commit that became 11037ee was originally written,
that would have been enough. We would fall through to
calling setup_push_upstream() with the "simple" parameter
set to 1. However, it was delayed for a while until we were
ready to make the transition in Git 2.0.
And in the meantime, commit ed2b182 (push: change `simple`
to accommodate triangular workflows, 2013-06-19) threw a
monkey wrench into the works. That commit drops the "simple"
parameter to setup_push_upstream, and instead checks whether
the global "push_default" is PUSH_DEFAULT_SIMPLE. This is
right when the user has explicitly configured push.default
to simple, but wrong when we are a fall-through for the
"unspecified" case.
We never noticed because our push.default tests do not cover
the case of the variable being totally unset; they only
check the "simple" behavior itself.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An example where this happens is when doing an ls-tree on a tree that
contains a commit link. In that case, find_unique_abbrev is called
to get a non-abbreviated hex sha1, but still, a lookup is done as
to whether the sha1 is in the repository (which ends up looking for
a loose object in .git/objects), while the result of that lookup is
not used when returning a non-abbreviated hex sha1.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git add foo bar" adds neither foo nor bar when bar is ignored, but dies
to let the user recheck their command invocation. This becomes less
helpful when "git add foo.*" is subject to shell expansion and some of
the expanded files are ignored.
"git add --ignore-errors" is supposed to ignore errors when indexing
some files and adds the others. It does ignore errors from actual
indexing attempts, but does not ignore the error "file is ignored" as
outlined above. This is unexpected.
Change "git add foo bar" to add foo when bar is ignored, but issue
a warning and return a failure code as before the change.
That is, in the case of trying to add ignored files we now act the same
way (with or without "--ignore-errors") in which we act for more
severe indexing errors when "--ignore-errors" is specified.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user specifiers "normal" for a foreground color, this
should be a noop (while this may sound useless, it is the
only way to specify an unchanged foreground color followed
by a specific background color).
We also check that color "-1" does the same thing. This is
not documented, but has worked forever, so let's make sure
we keep supporting it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of git-config's command line options use OPT_BIT to
choose an action, and then parse the non-option arguments
in a context-dependent way. However, --get-color and
--get-colorbool are unlike the rest of the options, in that
they are OPT_STRING, taking the option name as a parameter.
This generally works, because we then use the presence of
those strings to set an action bit anyway. But it does mean
that the option-parser will continue looking for options
even after the key (because it is not a non-option; it is an
argument to an option). And running:
git config --get-color some.key -1
(to use "-1" as the default color spec) will barf, claiming
that "-1" is not an option. Instead, we should treat
--get-color and --get-colorbool as action bits, just like
--add, --get, and all the other actions, and then check that
the non-option arguments we got are sane. This fixes the
weirdness above, and makes those two options like all the
others.
This "fixes" a test in t4026, which checked that feeding
"-2" as a color should fail (it does fail, but prior to this
patch, because parseopt barfed, not because we actually ever
tried to parse the color).
This also catches other errors, like:
git config --get-color some.key black blue
which previously silently ignored "blue" (and now will
complain that you gave too many arguments).
There are some possible regressions, though. We now disallow
these, which currently do what you would expect:
# specifying other options after the action
git config --get-color some.key --file whatever
# using long-arg syntax
git config --get-color=some.key
However, we have never advertised these in the
documentation, and in fact they did not work in some older
versions of git. The behavior was apparently switched as an
accidental side effect of d64ec16 (git config: reorganize to
use parseopt, 2009-02-21).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our color specifications have supported the 256-color ANSI
extension for years, but we never documented it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ralf reported that '--recurse-submodules' option in push.c should not be
translated [1]. Before his commit is merged, remove superfluous
translations for push.c.
[1] http://www.spinics.net/lists/git/msg241964.html
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
In order to catch up with the release of Git 2.2.0 final, make a batch
l10n update for the new l10n change brought by commit d52adf1 (trailer:
display a trailer without its trailing newline).
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Small fixes to a new experimental command already in 'master'.
* cc/interpret-trailers:
trailer: display a trailer without its trailing newline
trailer: ignore comment lines inside the trailers
As of CGI.pm's 4.08 release, the behavior to call
CGI::param() in a list context is deprecated (because it can
be potentially unsafe if called inside a hash constructor).
This causes gitweb to issue a warning for some of our code,
which in turn causes the tests to fail.
Our use is in fact _not_ one of the dangerous cases, as we
are intentionally using a list context. The recommended
route by 4.08 is to use the new CGI::multi_param() call to
make it explicit that we know what we are doing.
However, that function is only available in 4.08, which is
about a month old; we cannot rely on having it.
One option would be to set $CGI::LIST_CONTEXT_WARN globally,
which turns off the warning. However, that would eliminate
the protection these newer releases are trying to provide.
We want to annotate each site as OK using the new function.
So instead, let's check whether CGI provides the
multi_param() function, and if not, provide an
implementation that just wraps param(). That will work on
both old and new versions of CGI. Sadly, we cannot just
check defined(\&CGI::multi_param), because CGI uses the
autoload feature, which claims that all functions are
defined. Instead, we just do a version check.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like the perl scripts, python scripts need a dependency to ensure they
are rebuilt when switching between the "dummy" versions that run
without Python and the real thing.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>