When we read a sha1 file, we first look for a packed
version, then a loose version, and then re-check the pack
directory again before concluding that we cannot find it.
This lets us handle a process that is writing to the
repository simultaneously (e.g., receive-pack writing a new
pack followed by a ref update, or git-repack packing
existing loose objects into a new pack).
However, we do not do the same trick with has_sha1_file; we
only check the packed objects once, followed by loose
objects. This means that we might incorrectly report that we
do not have an object, even though we could find it if we
simply re-checked the pack directory.
By itself, this is usually not a big deal. The other process
is running simultaneously, so we may run has_sha1_file
before it writes, anyway. It is a race whether we see the
object or not. However, we may also see other things
the writing process has done (like updating refs); and in
that case, we must be able to also see the new objects.
For example, imagine we are doing a for_each_ref iteration,
and somebody simultaneously pushes. Receive-pack may write
the pack and update a ref after we have examined the
objects/pack directory, but before the iteration gets to the
updated ref. When we do finally see the updated ref,
for_each_ref will call has_sha1_file to check whether the
ref is broken. If has_sha1_file returns the wrong answer, we
erroneously will think that the ref is broken.
For a normal iteration without DO_FOR_EACH_INCLUDE_BROKEN,
this means that the caller does not see the ref at all
(neither the old nor the new value). So not only will we
fail to see the new value of the ref (which is acceptable,
since we are running simultaneously with the writer, and we
might well read the ref before the writer commits its
write), but we will not see the old value either. For
programs that act on reachability like pack-objects or
prune, this can cause data loss, as we may see the objects
referenced by the original ref value as dangling (and either
omit them from the pack, or delete them via prune).
There's no test included here, because the success case is
two processes running simultaneously forever. But you can
replicate the issue with:
# base.sh
# run this in one terminal; it creates and pushes
# repeatedly to a repository
git init parent &&
(cd parent &&
# create a base commit that will trigger us looking at
# the objects/pack directory before we hit the updated ref
echo content >file &&
git add file &&
git commit -m base &&
# set the unpack limit abnormally low, which
# lets us simulate full-size pushes using tiny ones
git config receive.unpackLimit 1
) &&
git clone parent child &&
cd child &&
n=0 &&
while true; do
echo $n >file && git add file && git commit -m $n &&
git push origin HEAD:refs/remotes/child/master &&
n=$(($n + 1))
done
# fsck.sh
# now run this simultaneously in another terminal; it
# repeatedly fscks, looking for us to consider the
# newly-pushed ref broken. We cannot use for-each-ref
# here, as it uses DO_FOR_EACH_INCLUDE_BROKEN, which
# skips the has_sha1_file check (and if it wants
# more information on the object, it will actually read
# the object, which does the proper two-step lookup)
cd parent &&
while true; do
broken=`git fsck 2>&1 | grep remotes/child`
if test -n "$broken"; then
echo $broken
exit 1
fi
done
Without this patch, the fsck loop fails within a few seconds
(and almost instantly if the test repository actually has a
large number of refs). With it, the two can run
indefinitely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we had to revert two topics at the last minute, let's have
another (hopefully short) round of rc to make sure the final release
will be sound.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit cdfd94837b, as it
does not just apply to "@" (and forms with modifiers like @{u}
applied to it), but also affects e.g. "refs/heads/@/foo", which it
shouldn't.
The basic idea of giving a short-hand might be good, and the topic
can be retried later, but let's revert to avoid affecting existing
use cases for now for the upcoming release.
This reverts commit a73653130e, as it
has been reported that "ls-files --killed" is too time-consuming in
a deep directory with too many untracked crufts (e.g. $HOME/.git
tracking only a few files).
We'd need to revisit it later but "ls-files --killed" needs to be
optimized before it happens.
A handful of past contributors are recorded with multiple e-mail
addresses, all of which are undeliverable. With a lot of help from
Jonathan, we located all of them except for one person, and a pair
of addresses we suspect belong to a single person but we are not
certain.
Update the found ones with their currently preferred address, and
use the last known address to consolidate contributions by the lost
one.
Helped-by: Stefan Beller <stefanbeller@googlemail.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds no new names, but fixes the mistakes I made in the previous
commits. (94b410bba8, f4f49e225, c07a6bc57, 2013-07-12, .mailmap: Map
email addresses to names).
These mistakes are double white spaces between name and surname,
different capitalization in email address, or just the email address set
as name.
Also I forgot to include James Knight to the mailmap file.
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t/t7407-submodule-foreach.sh there is a typo in one of the
path names given for a test step. The correct path is
nested1/nested2/.git, but nested1/nested1/nested2/.git is
given instead. The typo is hidden because this line also
accidentally omits the && chain operator. The omitted chain
also means the return values of all the previous commands in
this test are also being ignored.
Fix the path and add the chain operator so the entire test
sequence can be properly validated.
Signed-off-by: Phil Hord <hordp@cisco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
a469a10193 (silence some -Wuninitialized false positives;
2012-12-15) triggered "unused value" warnings when the return value of
opterror() and several other error-related functions was not used.
5ded807f7c (fix clang -Wunused-value warnings for error functions;
2013-01-16) applied a fix by adding #if !defined(__clang__) in cache.h
and git-compat-util.h, but misspelled it as #if !defined(clang) in
parse-options.h. Fix this.
This mistake went unnoticed because existing callers of opterror()
utilize its return value. 1158826394 (parse-options: add
OPT_CMDMODE(); 2013-07-30), however, adds a new invocation of opterror()
which ignores the return value, thus triggering the "unused value"
warning.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Translate 99 new messages came from git.pot update in
28b3cff (l10n: git.pot: v1.8.4 round 1 (99 new, 46 removed)).
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Acked-by: Thomas Rast <trast@inf.ethz.ch>
This is with mostly minor documentation and test updates, nothing
spectacular except for removal of funky lstat(2) emulation on Cygwin.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This switches the translation from pure German to German+English.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Acked-by: Thomas Rast <trast@inf.ethz.ch>
* maint:
fix typo in documentation of git-svn
Documentation/rev-list-options: add missing word in --*-parents
log doc: the argument to --encoding is not optional
Sub-test 42 of t8001 and t8002 ("blame -L :literal") fails on NetBSD
with the following verbose output:
git annotate -L:main hello.c
Author F (expected 4, attributed 3) bad
Author G (expected 1, attributed 1) good
This is not caused by different behaviour of git blame or annotate on
that platform, but by different test input, in turn caused by a sed
command that forgets to add a newline on NetBSD. Here's the diff of the
commit that adds "goodbye" to hello.c, for Linux:
@@ -1,4 +1,5 @@
int main(int argc, const char *argv[])
{
puts("hello");
+ puts("goodbye");
}
We see that it adds an extra TAB, but that's not a problem. Here's the
same on NetBSD:
@@ -1,4 +1,4 @@
int main(int argc, const char *argv[])
{
puts("hello");
-}
+ puts("goodbye");}
It also adds an extra TAB, but it is missing the newline character
after the semicolon.
The following patch gets rid of the extra TAB at the beginning, but
more importantly adds the missing newline at the end in a (hopefully)
portable way, mentioned in http://sed.sourceforge.net/sedfaq4.html.
The diff becomes this, on both Linux and NetBSD:
@@ -1,4 +1,5 @@
int main(int argc, const char *argv[])
{
puts("hello");
+ puts("goodbye");
}
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We wanted to catch all codepoints that ends with FFFE and FFFF,
not with 0FFFE and 0FFFF.
Noticed and corrected by Peter Krefting.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not like that our longer term desire is to someday start
accept log messages with NULs in them, so it is wrong to mark a test
that demonstrates "git commit" that correctly fails given such an
input as "expect-failure". "git commit" should fail today, and it
should fail the same way in the future given a message with NUL in it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test file that the UTF-16 rejection test looks for is missing, but this went
unnoticed because the test is expected to fail anyway; as a consequence, the
test fails because the file containing the commit message is missing, and not
because the test file contains a NUL byte. Fix this by including a sample text
file containing a commit message encoded in UTF-16.
Signed-off-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Tested-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A commit has "parent commits" or "parents", not "commits".
Signed-off-by: Torstein Hegge <hegge@resisty.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
$ git log --encoding
fatal: Option '--encoding' requires a value
$ git rev-list --encoding
fatal: Option '--encoding' requires a value
The argument to --encoding has always been mandatory. Unfortunately
manpages like git-rev-list(1), git-log(1), and git-show(1) have
described the option's syntax as "--encoding[=<encoding>]" since it
was first documented. Clarify by removing the extra brackets.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cygwin port added a "not quite correct but a lot faster and good
enough for many lstat() calls that are only used to see if the
working tree entity matches the index entry" lstat() emulation some
time ago, and it started biting us in places. This removes it and
uses the standard lstat() that comes with Cygwin.
Recent topic that uses lstat on packed-refs file is broken when
this cheating lstat is used, and this is a simplest fix that is
also the cleanest direction to go in the long run.
* rj/cygwin-clarify-use-of-cheating-lstat:
cygwin: Remove the Win32 l/stat() implementation
This reverts commit c334b87b30c1464a1ab563fe1fb8de5eaf0e5bac; the
update assumed that people only used the command to read from
"rev-list --objects" output, whose lines begin with a 40-hex object
name followed by a whitespace, but it turns out that scripts feed
random extended SHA-1 expressions (e.g. "HEAD:$pathname") in which
a whitespace has to be kept.