Commit 7c0fe330d5 (rev-list: handle missing tree objects properly,
2018-10-05) taught the traversal machinery used by git-rev-list to
ignore missing trees, so that rev-list could handle them itself.
However, it does so only by checking via oid_object_info_extended() that
the object exists at all. This can miss several classes of errors that
were previously detected by rev-list:
- type mismatches (e.g., we expected a tree but got a blob)
- failure to read the object data (e.g., due to bitrot on disk)
This is especially important because we use "rev-list --objects" as our
connectivity check to admit new objects to the repository, and it will
now miss these cases (though the bitrot one is less important here,
because we'd typically have just hashed and stored the object).
There are a few options to fix this:
1. we could check these properties in rev-list when we do the existence
check. This is probably too expensive in practice (perhaps even for
a type check, but definitely for checking the whole content again,
which implies loading each object into memory twice).
2. teach the traversal machinery to differentiate between a missing
object, and one that could not be loaded as expected. Detecting type
mismatches this way probably wouldn't be too hard, but detecting
bitrot versus a truly missing object would require deep changes to
the object-loading code.
3. have the traversal machinery communicate the failure to the caller,
so that it can decide how to proceed without re-evaluating the object
itself.
Of those, I think (3) is probably the best path forward. However, this
patch does none of them. In the name of expediently fixing the
regression to a normal "rev-list --objects" that we use for connectivity
checks, this simply restores the pre-7c0fe330d5 behavior of having the
traversal die as soon as it fails to load a tree (when --missing is set
to MA_ERROR, which is the default).
Note that we can't get rid of the object-existence check in
finish_object(), because this also handles blobs (which are not
otherwise checked at all by the traversal code).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Return NULL from 'get_commit_tree()' when a commit's root tree is
corrupt, doesn't exist, or points to an object which is not a tree.
In [1], this situation became a BUG(), but it can certainly occur in
cases which are not a bug in Git, e.g., if a caller manually crafts
a commit whose tree is corrupt in any of the above ways.
Note that the expect_failure test in t6102 triggers this BUG(), but we
can't flip it to expect_success yet. Solving this problem actually
reveals a second bug.
[1]: 7b8a21dba1 (commit-graph: lazy-load trees for commits, 2018-04-06)
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply similar treatment as the previous commit for non-tree entries,
too.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix one of the cases described in the previous commit where a tree entry
that claims to point to a blob in fact points to a non-blob.
When 'lookup_blob()' returns NULL, it is because Git has cached the
requested object as a non-blob. In this case, prevent a SIGSEGV by
'die()'-ing immediately before attempting to dereference the result.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call an object's type "unexpected" when the actual type of an object
does not match Git's contextual expectation. For example, a tree entry
whose mode differs from the object's actual type, or a commit's parent
which is not another commit, and so on.
This can manifest itself in various unfortunate ways, including Git
SIGSEGV-ing under specific conditions. Consider the following example:
Git traverses a blob (say, via `git rev-list`), and then tries to read
out a tree-entry which lists that object as something other than a blob.
In this case, `lookup_blob()` will return NULL, and the subsequent
dereference will result in a SIGSEGV.
Introduce tests that present objects of "unexpected" type in the above
fashion to 'git rev-list'. Mark as failures the combinations that are
already broken (i.e., they exhibit the segfault described above). In the
cases that are not broken (i.e., they have NULL-ness checks or similar),
mark these as expecting success.
We might hit an unexpected type in two different ways (imagine we have a
tree entry that claims to be a tree but actually points to a blob):
- when we call lookup_tree(), we might find that we've already seen
the object referenced as a blob, in which case we'd get NULL. We
can exercise this with "git rev-list --objects $blob $tree", which
guarantees that the blob will have been parsed before we look in
the tree. These tests are marked as "seen" in the test script.
- we call lookup_tree() successfully, but when we try to read the
object, we find out it's something else. We construct our tests
such that $blob is not otherwise mentioned in $tree. These tests
are marked as "lone" in the script.
We should check that we behave sensibly in both cases (especially
because it is easy for a malicious actor to provoke one case or the
other).
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The helper 'hex2oct' is used to convert base-16 encoded data into a
base-8 binary form, and is useful for preparing data for commands that
accept input in a binary format, such as 'git hash-object', via
'printf'.
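As a rough illustration of the conversion described above, here is a
minimal pure-shell sketch. The name 'hex2oct_sketch' and its
argument-based interface are assumptions made for this example only; it
is not the definition that lives in 't'.

```shell
# Hypothetical stand-in for the helper described above; the name and the
# argument-based interface are assumptions made for illustration.
hex2oct_sketch () {
	hex=$1
	# Refuse odd-length input: every byte needs exactly two hex digits.
	[ $(( ${#hex} % 2 )) -eq 0 ] || return 1
	out=
	while [ -n "$hex" ]
	do
		rest=${hex#??}       # everything after the first two digits
		pair=${hex%"$rest"}  # the first two digits
		hex=$rest
		# Emit a backslash followed by the byte value as 3 octal digits.
		out="$out$(printf '\\%03o' "0x$pair")"
	done
	printf '%s\n' "$out"
}

escaped=$(hex2oct_sketch 48656c6c6f)
printf '%s\n' "$escaped"    # prints \110\145\154\154\157
printf "$escaped"; echo     # prints Hello
```

The escaped form is what 'printf' expects in its format string, which is
why the result can be turned back into raw bytes and piped to a command
that reads binary input.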
This helper is defined identically in three separate places throughout
't'. Move the definition to test-lib-functions.sh, so that it can be used
in new test suites, and its definition is not redundant.
This will likewise make our job easier in the subsequent commit, which
also uses 'hex2oct'.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recently the Git for Windows project started the upgrade process to
an MSYS2 runtime version based on Cygwin v3.x.
This has the very notable consequence that `$(uname -r)` no longer
reports a version starting with "2", but a version with "3".
That breaks our build, as df5218b4c3 (config.mak.uname: support MSys2,
2016-01-13) simply did not expect the version reported by `uname -r` to
depend on the underlying Cygwin version: it expected the reported
version to match the "2" in "MSYS2".
So let's invert that test case to test for *anything other* than a
version starting with "1" (for MSys). That should safeguard us for the
future, even if Cygwin ends up releasing versions like 314.272.65536.
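The inverted check can be sketched in shell as follows. This is only an
illustration of the logic: the function name 'msys_flavor' is made up
for this example, and the real check lives in the build configuration
rather than in a shell function.

```shell
# Hypothetical helper: classify a `uname -r` style version string.
# Only a leading "1." is treated as the old MSys; everything else,
# including versions that do not exist yet, is treated as MSYS2.
msys_flavor () {
	case "$1" in
	1.*) echo MSys1 ;;
	*)   echo MSYS2 ;;
	esac
}

for version in 1.0.18 2.11.2 3.0.7 314.272.65536
do
	printf '%s -> %s\n' "$version" "$(msys_flavor "$version")"
done
```

Testing for the one known-old prefix, instead of the one known-new
prefix, means a future major version bump needs no further change.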
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On NetBSD, the version of msgfmt is still 0.14.4. There's no hope for
an upgrade due to some GPLv3 allergy of NetBSD's. This version chokes
on heavily decorated commented entries in po files. It's safer to get
rid of all these obsolete entries.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
During the six months of development of the Azure Pipelines support, the
patches went through quite a few iterations of changes, and to test
those iterations, a temporary build definition was used.
In the meantime, Azure Pipelines support made it to `master`, and we now
have a regular Azure Pipeline, installed via the common GitHub App
workflow. This new pipeline has a different name (git.git instead of
test-git.git), and a new ID (11 instead of 2).
Let's adjust the badge in our README to reflect that final shape of the
Azure Pipeline.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change an unportable invocation of "dd" with count=0, that wanted to
truncate the commit-graph file. In POSIX it is unspecified what
happens when count=0 is provided[1]. The NetBSD "dd" behavior
differs from GNU (and seemingly other BSDs), which has left this test
broken since d2b86fbaa1 ("commit-graph: fix buffer read-overflow",
2019-01-15).
Copying from /dev/null would seek/truncate to seek=$zero_pos and
stop immediately after that (without being able to copy anything),
which is the right way to truncate the file.
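A small stand-alone sketch of that truncation technique, outside of the
test suite. The scratch file, its contents and the offset value are
assumptions chosen for the example; only the "seek" and "if=/dev/null"
combination comes from the description above.

```shell
# Work in a scratch directory so the example leaves nothing behind.
scratch=$(mktemp -d) || exit 1
file=$scratch/data
zero_pos=4    # hypothetical offset at which to cut the file

printf '0123456789' >"$file"

# /dev/null yields no input, so dd seeks to byte $zero_pos in the
# output, truncates there, and stops without copying anything.
dd if=/dev/null of="$file" bs=1 seek="$zero_pos" 2>/dev/null

size=$(( $(wc -c <"$file") ))
content=$(cat "$file")
printf 'size=%s content=%s\n' "$size" "$content"

rm -rf "$scratch"
```

With bs=1, the seek value is a byte offset, so the file keeps exactly
its first $zero_pos bytes.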
1. http://pubs.opengroup.org/onlinepubs/9699919799/utilities/dd.html
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix widely supported but non-POSIX basic regex syntax introduced in
[1] and [2]. On GNU, NetBSD and FreeBSD the following works:
$ echo xy >f
$ grep 'xy\?' f; echo $?
xy
0
The same goes for "\+". The "?" and "+" operators are not part of the
BRE syntax, only of ERE. Some implementations accept them in BRE mode
when the meta-operator is prefixed with "\", but OpenBSD does not:
$ uname -a
OpenBSD obsd.my.domain 6.2 GENERIC#132 amd64
$ grep --version
grep version 0.9
$ grep 'xy\?' f; echo $?
1
Let's fix this by moving to ERE syntax instead, where "?" and "+" are
universally supported:
$ grep -E 'xy?' f; echo $?
xy
0
1. 2ed5c8e174 ("describe: setup working tree for --dirty", 2019-02-03)
2. c801170b0c ("t6120: test for describe with a bare repository",
2019-02-03)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mk/t5562-no-input-to-too-large-an-input-test:
t5562: do not depend on /dev/zero
Revert "t5562: replace /dev/zero with a pipe from generate_zero_bytes"
Some expected failures of git-http-backend leave its children
(receive-pack or upload-pack) running. These children still hold open
descriptors to act.err, and with some probability they live long enough
to write their failure messages there after the next test has already
truncated the files. This causes occasional failures of the test script.
Avoid the issue by using separate output and error files for each test,
appending the test number to their names.
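To make the race and its fix concrete, here is a minimal sketch that is
independent of the test script. The file names and the "sleep"-based
straggler are assumptions for illustration; they only mimic a child
process that outlives its test.

```shell
scratch=$(mktemp -d) || exit 1

# "Test 1": a lingering child keeps its error file open and writes to
# it late, after the test that started it is already over.
( sleep 1; echo "late message from test 1" ) >"$scratch/act.err.1" &

# "Test 2" starts right away. It truncates and uses its own file, so
# the straggler from test 1 cannot write into it.
: >"$scratch/act.err.2"
echo "message from test 2" >>"$scratch/act.err.2"

wait    # let the straggler finish before inspecting the files

first_err=$(cat "$scratch/act.err.1")
second_err=$(cat "$scratch/act.err.2")
printf '%s\n%s\n' "$first_err" "$second_err"

rm -rf "$scratch"
```

Had both tests shared a single act.err, the late message would have
landed in the file that the second test goes on to inspect.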
Reported-by: Carlo Arenas <carenas@gmail.com>
Helped-by: Carlo Arenas <carenas@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In cc95bc2025 (t5562: replace /dev/zero with a pipe from
generate_zero_bytes, 2019-02-09), we replaced usage of /dev/zero (which
is not available on NonStop, apparently) by a Perl script snippet to
generate NUL bytes.
Sadly, it does not seem to work on NonStop, as t5562 reportedly hangs.
Worse, this also hangs in the Ubuntu 16.04 agents of the CI builds on
Azure Pipelines: for some reason, the Perl script snippet that is run
via `generate_zero_bytes` in t5562's 'CONTENT_LENGTH overflow ssite_t'
test case tries to write out an infinite amount of NUL bytes unless a
broken pipe is encountered, that snippet never encounters the broken
pipe, and keeps going until the build times out.
Oddly enough, this does not reproduce on the Windows and macOS agents,
nor in a local Ubuntu 18.04.
This developer tried for a day to figure out the exact circumstances
under which this hang happens, to no avail; the details remain a
mystery.
In the end, though, what counts is that this here change incidentally
fixes that hang (maybe also on NonStop?). Even more positively, it gets
rid of yet another unnecessary Perl invocation.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was reported [1] that the NonStop platform does not have /dev/zero.
The test uses /dev/zero as a dummy input. In the passing case
(http-backend fails because the input size is too large), nothing
should be read from it. If http-backend erroneously tried to read any
data, returning EOF would probably be even safer than providing some
meaningless data.
Replace /dev/zero with /dev/null to avoid issues with platforms which do
not have /dev/zero.
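The difference matters only if the command reads its input at all. A
minimal sketch of what a reader sees from /dev/null (the variable names
are made up for this example):

```shell
# A reader that consumes all of its input gets zero bytes.
bytes_read=$(( $(wc -c </dev/null) ))

# A reader that asks for a single line hits EOF immediately.
if read -r line </dev/null
then
	read_status=data
else
	read_status=eof
fi

printf 'bytes=%s status=%s\n' "$bytes_read" "$read_status"
# prints: bytes=0 status=eof
```

So a command that wrongly tries to read its input terminates its read
at once instead of consuming an endless stream of NUL bytes.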
[1] https://public-inbox.org/git/20190209185930.5256-4-randall.s.becker@rogers.com/
Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Max Kirillov <max@max630.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Revert cc95bc20 ("t5562: replace /dev/zero with a pipe from
generate_zero_bytes", 2019-02-09), as not feeding anything to the
command is a better way to test it.
'root commit' is usually translated as 'Root-Commit'. But on one
occasion it's translated as 'Basis-Commit', which is the translation
for 'base commit'.
Signed-off-by: Sebastian Staudt <koraktor@gmail.com>
Running up to v2.21.0, we fixed two bugs that were made prominent by the
Windows-specific change to retain copies of only the 30 latest getenv()
calls' returned strings, invalidating any copies of previous getenv()
calls' return values.
While this really shines a light onto bugs of the form where we hold
onto getenv()'s return values without copying them, it is also a real
problem for users.
And even if Jeff King's patches merged via 773e408881 (Merge branch
'jk/save-getenv-result', 2019-01-29) provide further work on that front,
we are far from done. Just one example: on Windows, we unset environment
variables when spawning new processes, which potentially invalidates
strings that were previously obtained via getenv(), and therefore we
have to duplicate environment values that are somehow involved in
spawning new processes (e.g. GIT_MAN_VIEWER in show_man_page()).
We do not have a chance to investigate, let alone address, all of those issues
in time for v2.21.0, so let's at least help Windows users by increasing
the number of getenv() calls' return values that are kept valid. The
number 64 was determined by looking at the average number of getenv()
calls per process in the entire test suite run on Windows (which is
around 40) and then adding a bit for good measure. And it is a power of
two (which would have hit yesterday's theme perfectly).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>