Normally fsck makes a pass over all objects to check their
integrity, and then follows up with a reachability check to
make sure we have all of the referenced objects (and to know
which ones are dangling). The latter checks for the HAS_OBJ
flag in obj->flags to see if we found the object in the
first pass.
Commit 02976bf85 (fsck: introduce `git fsck --connectivity-only`,
2015-06-22) taught fsck to skip the initial pass, and to
fallback to has_sha1_file() instead of the HAS_OBJ check.
However, it converted only one HAS_OBJ check to use
has_sha1_file(). But there are many other places in
builtin/fsck.c that assume that the flag is set (or that
lookup_object() will return an object at all). This leads to
several bugs with --connectivity-only:
1. mark_object() will not queue objects for examination,
so recursively following links from commits to trees,
etc, did nothing. I.e., we were checking the
reachability of hardly anything at all.
2. When a set of heads is given on the command-line, we
use lookup_object() to see if they exist. But without
the initial pass, we assume nothing exists.
3. When loading reflog entries, we do a similar
lookup_object() check, and complain that the reflog is
broken if the object doesn't exist in our hash.
So in short, --connectivity-only is broken pretty badly, and
will claim that your repository is fine when it's not.
Presumably nobody noticed for a few reasons.
One is that the embedded test does not actually test the
recursive nature of the reachability check. All of the
missing objects are still in the index, and we directly
check items from the index. This patch modifies the test to
delete the index, which shows off breakage (1).
Another is that --connectivity-only just skips the initial
pass for loose objects. So on a real repository, the packed
objects were still checked correctly. But on the flipside,
it means that "git fsck --connectivity-only" still checks
the sha1 of all of the packed objects, nullifying its
original purpose of being a faster git-fsck.
And of course the final problem is that the bug only shows
up when there _is_ corruption, which is rare. So anybody
running "git fsck --connectivity-only" proactively would
assume it was being thorough, when it was not.
One possibility for fixing this is to find all of the spots
that rely on HAS_OBJ and tweak them for the connectivity-only
case. But besides the risk that we might miss a spot (and I
found three already, corresponding to the three bugs above),
there are other parts of fsck that _can't_ work without a
full list of objects. E.g., the list of dangling objects.
Instead, let's make the connectivity-only case look more
like the normal case. Rather than skip the initial pass
completely, we'll do an abbreviated one that sets up the
HAS_OBJ flag for each object, without actually loading the
object data.
That's simple and fast, and we don't have to care about the
connectivity_only flag in the rest of the code at all.
While we're at it, let's make sure we treat loose and packed
objects the same (i.e., setting up dummy objects for both
and skipping the actual sha1 check). That makes the
connectivity-only check actually fast on a real repo (40
seconds versus 180 seconds on my copy of linux.git).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After checking connectivity, fsck looks through the list of
any objects we've seen mentioned, and reports unreachable
and un-"used" ones as dangling. However, it skips any object
which is not marked as "parsed", as that is an object that
we _don't_ have (but that somebody mentioned).
Since 6e454b9a3 (clear parsed flag when we free tree
buffers, 2013-06-05), that flag can't be relied on, and the
correct method is to check the HAS_OBJ flag. The cleanup in
that commit missed this callsite, though. As a result, we
would generally fail to report dangling trees.
We never noticed because there were no tests in this area
(for trees or otherwise). Let's add some.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This test creates a multi-level set of trees, but its
cleanup routine only removes the top-level tree. After the
test finishes, the inner tree and the blob it points to
remain, making the inner tree dangling.
A later test ("cleaned up") verifies that we've removed any
cruft and "git fsck" output is clean. This passes only
because of a bug in git-fsck which fails to notice dangling
trees.
In preparation for fixing the bug, let's teach this earlier
test to clean up after itself correctly. We have to remove
the inner tree (and therefore the blob, too, which becomes
dangling after removing that tree).
Since the setup code happens inside a subshell, we can't
just set a variable for each object. However, we can stuff
all of the sha1s into the $T output variable, which is not
used for anything except cleanup.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Improve the rule to convert "unsigned char [20]" into "struct
object_id *" in contrib/coccinelle/
* rs/cocci:
cocci: avoid self-references in object_id transformations
Update to the test framework made in 2.9 timeframe broke running
the tests under valgrind, which has been fixed.
* nd/test-helpers:
valgrind: support test helpers
Portability update and workaround for builds on recent Mac OS X.
* ls/macos-update:
travis-ci: disable GIT_TEST_HTTPD for macOS
Makefile: set NO_OPENSSL on macOS by default
Fix for a racy false-positive test failure.
* as/merge-attr-sleep:
t6026: clarify the point of "kill $(cat sleep.pid)"
t6026: ensure that long-running script really is
Revert "t6026-merge-attr: don't fail if sleep exits early"
Revert "t6026-merge-attr: ensure that the merge driver was called"
t6026-merge-attr: ensure that the merge driver was called
t6026-merge-attr: don't fail if sleep exits early
Recent update to git-sh-setup (a library of shell functions that
are used by our in-tree scripted Porcelain commands) included
another shell library git-sh-i18n without specifying where it is,
relying on the $PATH. This has been fixed to be more explicit by
prefixing $(git --exec-path) output in front.
* ak/sh-setup-dot-source-i18n-fix:
git-sh-setup: be explicit where to dot-source git-sh-i18n from.
"git daemon" used fixed-length buffers to turn URL to the
repository the client asked for into the server side directory
path, using snprintf() to avoid overflowing these buffers, but
allowed possibly truncated paths to the directory. This has been
tightened to reject such a request that causes overlong path to be
required to serve.
* jk/daemon-path-ok-check-truncation:
daemon: detect and reject too-long paths
The code that we have used for the past 10+ years to cycle
4-element ring buffers turns out to be not quite portable in
theoretical world.
* rs/ring-buffer-wraparound:
hex: make wraparound of the index into ring-buffer explicit
"git send-email" attempts to pick up valid e-mails from the
trailers, but people in real world write non-addresses there, like
"Cc: Stable <add@re.ss> # 4.8+", which broke the output depending
on the availability and vintage of Mail::Address perl module.
* mm/send-email-cc-cruft-after-address:
Git.pm: add comment pointing to t9000
t9000-addresses: update expected results after fix
parse_mailboxes: accept extra text after <...> address
The command-line completion script (in contrib/) learned to
complete "git cmd ^mas<HT>" to complete the negative end of
reference to "git cmd ^master".
* cp/completion-negative-refs:
completion: support excluding refs
Extract a small helper out of the function that reads the authors
script file "git am" internally uses.
This by itself is not useful until a second caller appears in the
future for "rebase -i" helper.
* jc/am-read-author-file:
am: refactor read_author_script()
Since 650c44925 (common-main: call git_extract_argv0_path(),
2016-07-01), the argv[0] that is seen in cmd_main() of
individual programs is always the basename of the
executable, as common-main strips off the full path. This
can produce confusing results for git-daemon, which wants to
re-exec itself.
For instance, if the program was originally run as
"/usr/lib/git/git-daemon", it will try just re-execing
"git-daemon", which will find the first instance in $PATH.
If git's exec-path has not been prepended to $PATH, we may
find the git-daemon from a different version (or no
git-daemon at all).
Normally this isn't a problem. Git commands are run as "git
daemon", the git wrapper puts the exec-path at the front of
$PATH, and argv[0] is already "daemon" anyway. But running
git-daemon via its full exec-path, while not really a
recommended method, did work prior to 650c44925. Let's make
it work again.
The real goal of 650c44925 was not to munge argv[0], but to
reliably set the argv0_path global. The only reason it
munges at all is that one caller, the git.c wrapper,
piggy-backed on that computation to find the command
basename. Instead, let's leave argv[0] untouched in
common-main, and have git.c do its own basename computation.
While we're at it, let's drop the return value from
git_extract_argv0_path(). It was only ever used in this one
callsite, and its dual purposes is what led to this
confusion in the first place.
Note that by changing the interface, the compiler can
confirm for us that there are no other callers storing the
return value. But the compiler can't tell us whether any of
the cmd_main() functions (besides git.c) were relying on the
basename munging. However, we can observe that prior to
650c44925, no other cmd_main() functions did that munging,
and no new cmd_main() functions have been introduced since
then. So we can't be regressing any of those cases.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Translate one message introduced by commit:
* 358718064b i18n: fix unmatched single quote in error message
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
"git archive" and "git mailinfo" stopped reading from local
configuration file with a recent update.
* jc/setup-cleanup-fix:
archive: read local configuration
mailinfo: read local configuration
"git rebase -i" did not work well with core.commentchar
configuration variable for two reasons, both of which have been
fixed.
* js/rebase-i-commentchar-fix:
rebase -i: handle core.commentChar=auto
stripspace: respect repository config
rebase -i: highlight problems with core.commentchar
Using a %(HEAD) placeholder in "for-each-ref --format=" option
caused the command to segfault when on an unborn branch.
* jc/for-each-ref-head-segfault-fix:
for-each-ref: do not segv with %(HEAD) on an unborn branch
Since b9605bc4f2 ("config: only read .git/config from configured
repos", 2016-09-12), we do not read from ".git/config" unless we
know we are in a repository. "git archive" however didn't do the
repository discovery and instead relied on the old behaviour.
Teach the command to run a "gentle" version of repository discovery
so that local configuration variables are honoured.
[jc: stole tests from peff]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since b9605bc4f2 ("config: only read .git/config from configured
repos", 2016-09-12), we do not read from ".git/config" unless we
know we are in a repository. "git mailinfo" however didn't do the
repository discovery and instead relied on the old behaviour. This
was mostly OK because it was merely run as a helper program by other
porcelain scripts that first chdir's up to the root of the working
tree.
Teach the command to run a "gentle" version of repository discovery
so that local configuration variables like mailinfo.scissors are
honoured.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git 2.11.0-rc2 introduced one small l10n update, and this commit fixed
the affected translations all in one batch.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
In commit 1462450 ("trailer: allow non-trailers in trailer block",
2016-10-21), functionality was added (and tested [1]) to allow
non-trailer lines in trailer blocks, as long as those blocks contain at
least one Git-generated or user-configured trailer, and consists of at
least 25% trailers. The documentation was updated to mention this new
functionality, but did not mention "user-configured trailer".
Further update the documentation to also mention "user-configured
trailer".
[1] "with non-trailer lines mixed with a configured trailer" in
t/t7513-interpret-trailers.sh
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 84c9dc2 (commit: allow core.commentChar=auto for character auto
selection, 2014-05-17) extended the core.commentChar functionality to
allow for the value 'auto', it forgot that rebase -i was already taught to
handle core.commentChar, and in turn forgot to let rebase -i handle that
new value gracefully.
Reported by Taufiq Hoven.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The way "git stripspace" reads the configuration was not quite
kosher, in that the code forgot to probe for a possibly existing
repository (note: stripspace is designed to be usable outside the
repository as well). It read .git/config only when it was run from
the top-level of the working tree by accident. A recent change
b9605bc4f2 ("config: only read .git/config from configured repos",
2016-09-12) stopped reading the repository-local configuration file
".git/config" unless the repository discovery process is done, so
that .git/config is never read even when run from the top-level,
exposing the old bug more.
When rebasing interactively with a commentChar defined in the
current repository's config, the help text at the bottom of the edit
script potentially used an incorrect comment character. This was not
only funny-looking, but also resulted in tons of warnings like this
one:
Warning: the command isn't recognized in the following line
- #
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>