The 'invalid tag name' and 'missing tagger entry' warnings can now be
upgraded to errors by specifying `invalidTagName` and
`missingTaggerEntry` in the receive.fsck.<msg-id> config setting.
Incidentally, the missing tagger warning is now really shown as a warning
(as opposed to being reported with the "error:" prefix, as it used to be
the case before this commit).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An fsck issue in a legacy repository might be so common that one would
like not to bother the user with mentioning it at all. With this change,
that is possible by setting the respective message type to "ignore".
This change "abuses" the missingEmail=warn test to verify that "ignore"
is also accepted and works correctly. And while at it, it makes sure
that multiple options work, too (they are passed to unpack-objects or
index-pack as a comma-separated list via the --strict=... command-line
option).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some kinds of errors are intrinsically unrecoverable (e.g. errors while
uncompressing objects). It does not make sense to allow demoting them to
mere warnings.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fsck_tag() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.
Just like fsck_commit(), there are certain problems that could hide other
issues with the same tag object. For example, if the 'type' line is not
encountered in the correct position, the 'tag' line – if there is any –
would not be handled at all.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This problem has been detected in the wild, and is the primary reason
to introduce an option to demote certain fsck errors to warnings. Let's
offer to ignore this particular problem specifically.
Technically, we could handle such repositories by setting
receive.fsck.<msg-id> to missingCommitter=warn, but that could hide
missing tree objects in the same commit because we cannot continue
verifying any commit object after encountering a missing committer line,
while we can continue in the case of multiple author lines.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fsck_commit() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.
Note that some problems are too problematic to simply ignore. For
example, when the header lines are mixed up, we punt after encountering
an incorrect line. Therefore, demoting certain warnings to errors can
hide other problems. Example: demoting the missingauthor error to
a warning would hide a problematic committer line.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fsck_ident() identifies a problem with the ident, it should still
advance the pointer to the next line so that fsck can continue in the
case of a mere warning.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some repositories written by legacy code have objects with non-fatal
fsck issues. To allow the user to ignore those issues, let's print
out the ID (e.g. when encountering "missingEmail", the user might
want to call `git config --add receive.fsck.missingEmail=warn`).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For example, missing emails in commit and tag objects can be demoted to
mere warnings with
git config receive.fsck.missingemail=warn
The value is actually a comma-separated list.
In case that the same key is listed in multiple receive.fsck.<msg-id>
lines in the config, the latter configuration wins (this can happen for
example when both $HOME/.gitconfig and .git/config contain message type
settings).
As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are legacy repositories out there whose older commits and tags
have issues that prevent pushing them when 'receive.fsckObjects' is set.
One real-life example is a commit object that has been hand-crafted to
list two authors.
Often, it is not possible to fix those issues without disrupting the
work with said repositories, yet it is still desirable to perform checks
by setting `receive.fsckObjects = true`. This commit is the first step
to allow demoting specific fsck issues to mere warnings.
The `fsck_set_msg_types()` function added by this commit parses a list
of settings in the form:
missingemail=warn,badname=warn,...
Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
git fsck so far, but other call paths (e.g. git index-pack --strict)
error out *always* no matter what type was specified. Therefore, we need
to take extra care to set all message types to FSCK_ERROR by default in
those cases.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit d95d728aba.
It turns out that many other commands that need to interact with the
result of running diff-files and diff-index, e.g. "git apply", "git
rm", etc., need to be adjusted to the new world order it brings in.
For example, it would break this sequence to correct a whitespace
breakage in the parts you changed:
git add -N file
git diff --cached file | git apply --cached --whitespace=fix
git checkout file
In the old world order, "diff" showed a patch to modify an existing
empty file by adding its full contents, and "apply" updated the
index by modifying the existing empty blob (which is what an
Intent-to-Add entry records in the index) with that patch.
In the new world order, "diff" shows a patch to create a new file
with its full contents, but because "apply" thinks that the i-t-a
entry already exists in the index, it refused to accept a creation.
Adjusting "apply" to this new world order is easy, but we need to
assess the extent of the damage to the rest of the system the new
world order brought in before going forward and adjust them all,
after which we can resurrect the commit being reverted here.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need to switch branches to parse another branch's ancestry.
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes two instances where a &&-chain was broken in the subtree
tests and fixes a test error that was revealed because of this.
Many tests in t7900-subtree.sh make a commit and then use 'undo' to
reset the state for the next test. In the 'check hash of split' test,
an 'undo' was being invoked after a 'subtree split' even though the
particular invocation of 'subtree split' did not actually make a commit.
The subsequent check_equal was failing, but this failure was masked by
that broken &&-chain.
Removing this undo causes the failing check_equal to succeed but breaks
the a check_equal later on in the same test.
It turns out that an earlier test ('check if --message for merge works
with squash too') makes a commit but doesn't 'undo' to the state
expected by the remaining tests. None of the intervening tests cared
enough about the state of the test repo to fail and the spurious 'undo'
in 'check hash of split' restored the expected state for any remaining
test that might care.
Adding the missing 'undo' to 'check if --message for merge works
with squash too' and removing the spurious one from 'check hash of
split' fixes all tests once the &&-chains are completed.
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although subtrees tests uses more spaces for indentation than tabs,
there are still quite a lot of lines indented with tabs. As tabs conform
with Git coding guidelines resolve the inconsistency in favour of tabs.
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The unsigned long option parsing (including 'k'/'m'/'g' suffix
parsing) is more widely applicable. Add support for OPT_MAGNITUDE
to parse-options.h and change pack-objects.c use this support.
The error behavior on parse errors follows that of OPT_INTEGER. The
name of the option that failed to parse is reported with a brief
message describing the expect format for the option argument and
then the full usage message for the command invoked.
This differs from the previous behavior for OPT_ULONG used in
pack-objects for --max-pack-size and --window-memory which used to
display the value supplied in the error message and did not display
the full usage message.
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the printf specification to treat 'integer' as the signed type
that it is and add a test that checks that we parse negative option
arguments.
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can sometimes be useful to examine all objects in the
repository. Normally this is done with "git rev-list --all
--objects", but:
1. That shows only reachable objects. You may want to look
at all available objects.
2. It's slow. We actually open each object to walk the
graph. If your operation is OK with seeing unreachable
objects, it's an order of magnitude faster to just
enumerate the loose directories and pack indices.
You can do this yourself using "ls" and "git show-index",
but it's non-obvious. This patch adds an option to
"cat-file --batch-check" to operate on all available
objects (rather than reading names from stdin).
This is based on a proposal by Charles Bailey to provide a
separate "git list-all-objects" command. That is more
orthogonal, as it splits enumerating the objects from
getting information about them. However, in practice you
will either:
a. Feed the list of objects directly into cat-file anyway,
so you can find out information about them. Keeping it
in a single process is more efficient.
b. Ask the listing process to start telling you more
information about the objects, in which case you will
reinvent cat-file's batch-check formatter.
Adding a cat-file option is simple and efficient. And if you
really do want just the object names, you can always do:
git cat-file --batch-check='%(objectname)' --batch-all-objects
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are really two things going on in this function:
1. We convert the name we got on stdin to a sha1.
2. We look up and print information on the sha1.
Let's split out the second half so that we can call it
separately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If batch_one_object returns an error code, we stop reading
input. However, it will only do so if we feed it NULL,
which cannot happen; we give it the "buf" member of a
strbuf, which is always non-NULL.
We did originally stop on other errors (like a missing
object), but this was changed in 3c076db (cat-file --batch /
--batch-check: do not exit if hashes are missing,
2008-06-09). These days we keep going for any per-object
error (and print "missing" when necessary).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use a direct write() to output the results of --batch and
--batch-check. This is good for processes feeding the input
and reading the output interactively, but it introduces
measurable overhead if you do not want this feature. For
example, on linux.git:
$ git rev-list --objects --all | cut -d' ' -f1 >objects
$ time git cat-file --batch-check='%(objectsize)' \
<objects >/dev/null
real 0m5.440s
user 0m5.060s
sys 0m0.384s
This patch adds an option to use regular stdio buffering:
$ time git cat-file --batch-check='%(objectsize)' \
--buffer <objects >/dev/null
real 0m4.975s
user 0m4.888s
sys 0m0.092s
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not put extra whitespace before the first macro
argument.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When for_each_packed_object is called, we call
prepare_packed_git() to make sure we have the actual list of
packs. But the latter does not actually open the pack
indices, meaning that pack->nr_objects may simply be 0 if
the pack has not otherwise been used since the program
started.
In practice, this didn't come up for the current callers,
because they iterate the packed objects only after iterating
all reachable objects (so for it to matter you would have to
have a pack consisting only of unreachable objects). But it
is a dangerous and confusing interface that should be fixed
for future callers.
Note that we do not end the iteration when a pack cannot be
opened, but we do return an error. That lets you complete
the iteration even in actively-repacked repository where an
.idx file may racily go away, but it also lets callers know
that they may not have gotten the complete list (which the
current reachability-check caller does care about).
We have to tweak one of the prune tests due to the changed
return value; an earlier test creates bogus .idx files and
does not clean them up. Having to make this tweak is a good
thing; it means we will not prune in a broken repository,
and the test confirms that we do not negatively impact a
more lenient caller, count-objects.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
verify-tag by default displays human-readable output on standard error.
However, it can also be useful to get access to the raw gpg status
information, which is machine-readable, allowing automated
implementation of signing policy. Add a --raw option to make verify-tag
produce the gpg status information on standard error instead of the
human-readable format.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
verify-commit by default displays human-readable output on standard
error. However, it can also be useful to get access to the raw gpg
status information, which is machine-readable, allowing automated
implementation of signing policy. Add a --raw option to make
verify-commit produce the gpg status information on standard error
instead of the human-readable format.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to handle printing of signature data from a struct
signature_check is very similar between verify-commit and verify-tag.
Place this in a single function. verify-tag retains its special case
behavior of printing the tag even when no valid signature is found.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
verify-commit and verify-tag both share a central codepath for verifying
commits: check_signature. However, verify-tag exited successfully for
untrusted signature, while verify-commit exited unsuccessfully.
Centralize this signature check and make verify-commit adopt the older
verify-tag behavior. This behavior is more logical anyway, as the
signature is in fact valid, whether or not there's a path of trust to
the author.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
verify-tag exits successfully if the signature is good but the key is
untrusted. verify-commit exits unsuccessfully. This divergence in
behavior is unexpected and unwanted. Since verify-tag existed earlier,
add a failing test to have verify-commit share verify-tag's behavior.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
verify-tag was executing an entirely different codepath than
verify-commit, except for the underlying verify_signed_buffer. Move
much of the code from check_commit_signature to a generic
check_signature function and adjust both codepaths to call it.
Update verify-tag to explicitly output the signature text, as we now
call verify_signed_buffer with strbufs to catch the output, which
prevents it from being printed automatically.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
verify-tag was lacking tests. Add some, mirroring those used for
verify-commit.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ref_transaction_update() family of functions use the following
convention for their old_sha1 parameters:
* old_sha1 == NULL: Don't check the old value at all.
* is_null_sha1(old_sha1): Ensure that the reference didn't exist
before the transaction.
* otherwise: Ensure that the reference had the specified value before
the transaction.
delete_ref() had a different convention, namely treating
is_null_sha1(old_sha1) as "don't care". Change it to adhere to the
standard convention to reduce the scope for confusion.
Please note that it is now a bug to pass old_sha1=NULL_SHA1 to
delete_ref() (because it doesn't make sense to delete a reference that
you already know doesn't exist). This is consistent with the behavior
of ref_transaction_delete().
Most of the callers of delete_ref() never pass old_sha1=NULL_SHA1 to
delete_ref(), and are therefore unaffected by this change. The
two exceptions are:
* The call in cmd_update_ref(), which passed NULL_SHA1 if the old
value passed in on the command line was 0{40} or the empty string.
Change that caller to pass NULL in those cases.
Arguably, it should be an error to call "update-ref -d" with the old
value set to "does not exist", just as it is for the `--stdin`
command "delete". But since this usage was accepted until now,
continue to accept it.
* The call in delete_branches(), which could pass NULL_SHA1 if
deleting a broken or symbolic ref. Change it to pass NULL in these
cases.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Restructure the code to avoid clearing oldsha1 when oldval is unset.
It's value is not needed in that case, so this change makes it more
obvious that its initialization is consistent with its later use.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are deleting the reference, then we don't need to read the
reference's old value. It doesn't provide any race safety, because the
value read just before the delete is no "better" than the value that
would be read under lock during the delete. And even if the reference
previously didn't exist, we can call delete_ref() on it if we don't
provide an old_sha1 value.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it clear that this function does not overwrite its first
argument.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some functions from the refs module were still declared in cache.h.
Move them to refs.h.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In initial_ref_transaction_commit(), check for D/F conflicts (i.e.,
the type of conflict that exists between "refs/foo" and
"refs/foo/bar") among the references being created and between the
references being created and any hypothetical existing references.
Ideally, there shouldn't *be* any existing references when this
function is called. But, at least in the case of the "testgit" remote
helper, "clone" can be called after the remote-tracking "HEAD" and
"master" branches have already been created. So let's just do the
full-blown check.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Error out if the ref_transaction includes more than one update for any
refname.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following functions are no longer used from outside the refs
module:
* lock_packed_refs()
* add_packed_ref()
* commit_packed_refs()
* rollback_packed_refs()
So make these functions private.
This is an important step, because it means that nobody outside of the
refs module needs to know the difference between loose and packed
references.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clone" uses shortcuts when creating the initial set of
references:
* It writes them directly to packed-refs.
* It doesn't lock the individual references (though it does lock the
packed-refs file).
* It doesn't check for refname conflicts between two new references or
between one new reference and any hypothetical old ones.
* It doesn't create reflog entries for the reference creations.
This functionality was implemented in builtin/clone.c. But really that
file shouldn't have such intimate knowledge of how references are
stored. So provide a new function in the refs API,
initial_ref_transaction_commit(), which can be used for initial
reference creation. The new function is based on the ref_transaction
interface.
This means that we can make some other functions private to the refs
module. That will be done in a followup commit.
It would seem to make sense to add a test here that there are no
existing references, because that is how the function *should* be
used. But in fact, the "testgit" remote helper appears to call it
*after* having set up refs/remotes/<name>/HEAD and
refs/remotes/<name>/master, so we can't be so strict. For now, the
function trusts its caller to only call it when it makes sense. Future
commits will add some more limited sanity checks.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is no longer called from outside of the refs module. Also move its
docstring and change it to imperative voice.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old version just looped over the references to delete, calling
delete_ref() on each one. But that has quadratic behavior, because
each call to delete_ref() might have to rewrite the packed-refs file.
This can be very expensive in a repository with a large number of
references. In some (admittedly extreme) repositories, we've seen
cases where the ref-pruning part of fetch takes multiple tens of
minutes.
Instead call delete_refs(), which (aside from being less code) has the
optimization that it only rewrites the packed-refs file a single time.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This slightly changes how errors are reported. The old and new code
both report errors that come from repack_without_refs() the same way.
But if an error occurs within delete_ref(), the old version only
emitted an error within delete_ref() without further comment. The new
version (in delete_refs()) still emits that error, but then follows it
up with
error(_("could not remove reference %s"), refname)
The "could not remove reference" error originally came from a similar
message in remove_branches() (from builtin/remote.c).
This is an improvement, because the error from delete_ref() (which
usually comes from ref_transaction_commit()) can be pretty low-level,
like
Cannot lock ref '%s': unable to resolve reference %s: %s
where the last "%s" is the original strerror.
In any case, I don't think we need to sweat the details too much,
because these errors should only ever be seen in the case of a broken
repository or a race between two processes; i.e., only in pretty rare
and anomalous situations.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we fail to delete the doomed references from the packed-refs file,
then it is unsafe to delete their loose references, because doing so
might expose a value from the packed-refs file that is obsolete and
perhaps even points at an object that has been garbage collected.
So if repack_without_refs() fails, emit a more explicit error message
and bail.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the error message from
Could not remove branch %s
to
could not remove reference %s
First of all, the old error message referred to "branch
refs/remotes/origin/foo", which was awkward even for the existing
caller. Normally we would refer to a reference like that as either
"remote-tracking branch origin/foo" or "reference
refs/remotes/origin/foo". Here I take the lazier alternative.
Moreover, now that this function is part of the refs API, it might be
called for refs that are neither branches nor remote-tracking
branches.
While we're at it, convert the error message to lower case, as per our
usual convention.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the function remove_branches() from builtin/remote.c to refs.c,
rename it to delete_refs(), and make it public.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
delete_ref() uses a different convention for its old_sha1 parameter
than, say, ref_transaction_delete(): NULL_SHA1 means not to check the
old value. Make this fact a little bit clearer in the code by handling
it in explicit, commented code rather than burying it in a conditional
expression.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>