Commit Graph

89 Commits

Author SHA1 Message Date
Jeff King
76e86fc6e3 fsck: notice .git case-insensitively
We complain about ".git" in a tree because it cannot be
loaded into the index or checked out. Since we now also
reject ".GIT" case-insensitively, fsck should notice the
same, so that errors do not propagate.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-17 11:04:39 -08:00
Jeff King
7add441984 fsck: properly bound "invalid tag name" error message
When we detect an invalid tag-name header in a tag object,
like, "tag foo bar\n", we feed the pointer starting at "foo
bar" to a printf "%s" formatter. This shows the name, as we
want, but then it keeps printing the rest of the tag buffer,
rather than stopping at the end of the line.

Our tests did not notice because they look only for the
matching line, but the bug is that we print much more than
we wanted to. So we also adjust the test to be more exact.

Note that when fscking tags with "index-pack --strict", this
is even worse. index-pack does not add a trailing
NUL-terminator after the object, so we may actually read
past the buffer and print uninitialized memory. Running
t5302 with valgrind does notice the bug for that reason.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-09 11:54:25 -08:00
Johannes Schindelin
cec097be3a fsck: check tag objects' headers
We inspect commit objects pretty much in detail in git-fsck, but we just
glanced over the tag objects. Let's be stricter.

Since we do not want to limit 'tag' lines unduly, values that would fail
the refname check only result in warnings, not errors.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11 10:44:26 -07:00
Johannes Schindelin
4d0d89755e Make sure fsck_commit_buffer() does not run out of the buffer
So far, we assumed that the buffer is NUL terminated, but this is not
a safe assumption, now that we opened the fsck_object() API to pass a
buffer directly.

So let's make sure that there is at least an empty line in the buffer.
That way, our checks would fail if the empty line was encountered
prematurely, and consequently we can get away with the current string
comparisons even with non-NUL-terminated buffers are passed to
fsck_object().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11 10:44:01 -07:00
Johannes Schindelin
90a398bbd7 fsck_object(): allow passing object data separately from the object itself
When fsck'ing an incoming pack, we need to fsck objects that cannot be
read via read_sha1_file() because they are not local yet (and might even
be rejected if transfer.fsckobjects is set to 'true').

For commits, there is a hack in place: we basically cache commit
objects' buffers anyway, but the same is not true, say, for tag objects.

By refactoring fsck_object() to take the object buffer and size as
optional arguments -- optional, because we still fall back to the
previous method to look at the cached commit objects if the caller
passes NULL -- we prepare the machinery for the upcoming handling of tag
objects.

The assumption that such buffers are inherently NUL terminated is now
wrong, of course, hence we pass the size of the buffer so that we can
add a sanity check later, to prevent running past the end of the buffer.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-10 13:54:21 -07:00
René Scharfe
9d02150cf4 fsck: simplify fsck_commit_buffer() by using commit_list_count()
fsck_commit_buffer() checks that the number of items in the parents
list of a commit matches the number of parent lines in its buffer or --
if a graft is used -- the number of parents in that graft.  Simplify
the code by using commit_list_count() instead of counting by hand.
Also use different variables for the number of lines and the number of
list items, making it easier to compare them.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-10 14:10:27 -07:00
Junio C Hamano
e91ae32a01 Merge branch 'jk/skip-prefix'
* jk/skip-prefix:
  http-push: refactor parsing of remote object names
  imap-send: use skip_prefix instead of using magic numbers
  use skip_prefix to avoid repeated calculations
  git: avoid magic number with skip_prefix
  fetch-pack: refactor parsing in get_ack
  fast-import: refactor parsing of spaces
  stat_opt: check extra strlen call
  daemon: use skip_prefix to avoid magic numbers
  fast-import: use skip_prefix for parsing input
  use skip_prefix to avoid repeating strings
  use skip_prefix to avoid magic numbers
  transport-helper: avoid reading past end-of-string
  fast-import: fix read of uninitialized argv memory
  apply: use skip_prefix instead of raw addition
  refactor skip_prefix to return a boolean
  avoid using skip_prefix as a boolean
  daemon: mark some strings as const
  parse_diff_color_slot: drop ofs parameter
2014-07-09 11:33:28 -07:00
Jeff King
cf4fff579e refactor skip_prefix to return a boolean
The skip_prefix() function returns a pointer to the content
past the prefix, or NULL if the prefix was not found. While
this is nice and simple, in practice it makes it hard to use
for two reasons:

  1. When you want to conditionally skip or keep the string
     as-is, you have to introduce a temporary variable.
     For example:

       tmp = skip_prefix(buf, "foo");
       if (tmp)
	       buf = tmp;

  2. It is verbose to check the outcome in a conditional, as
     you need extra parentheses to silence compiler
     warnings. For example:

       if ((cp = skip_prefix(buf, "foo"))
	       /* do something with cp */

Both of these make it harder to use for long if-chains, and
we tend to use starts_with() instead. However, the first line
of "do something" is often to then skip forward in buf past
the prefix, either using a magic constant or with an extra
strlen(3) (which is generally computed at compile time, but
means we are repeating ourselves).

This patch refactors skip_prefix() to return a simple boolean,
and to provide the pointer value as an out-parameter. If the
prefix is not found, the out-parameter is untouched. This
lets you write:

  if (skip_prefix(arg, "foo ", &arg))
	  do_foo(arg);
  else if (skip_prefix(arg, "bar ", &arg))
	  do_bar(arg);

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:44:43 -07:00
Jeff King
8597ea3afe commit: record buffer length in cache
Most callsites which use the commit buffer try to use the
cached version attached to the commit, rather than
re-reading from disk. Unfortunately, that interface provides
only a pointer to the NUL-terminated buffer, with no
indication of the original length.

For the most part, this doesn't matter. People do not put
NULs in their commit messages, and the log code is happy to
treat it all as a NUL-terminated string. However, some code
paths do care. For example, when checking signatures, we
want to be very careful that we verify all the bytes to
avoid malicious trickery.

This patch just adds an optional "size" out-pointer to
get_commit_buffer and friends. The existing callers all pass
NULL (there did not seem to be any obvious sites where we
could avoid an immediate strlen() call, though perhaps with
some further refactoring we could).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 12:09:38 -07:00
Jeff King
bc6b8fc130 use get_commit_buffer everywhere
Each of these sites assumes that commit->buffer is valid.
Since they would segfault if this was not the case, they are
likely to be correct in practice. However, we can
future-proof them by using get_commit_buffer.

And as a side effect, we abstract away the final bare uses
of commit->buffer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 12:08:17 -07:00
Junio C Hamano
f7804e250d Merge branch 'hs/simplify-bit-setting-in-fsck-tree'
* hs/simplify-bit-setting-in-fsck-tree:
  fsck: use bitwise-or assignment operator to set flag
2014-03-31 16:30:44 -07:00
Junio C Hamano
40adf520a3 Merge branch 'ys/fsck-commit-parsing'
* ys/fsck-commit-parsing:
  fsck.c:fsck_commit(): use skip_prefix() to verify and skip constant
  fsck.c:fsck_ident(): ident points at a const string
2014-03-28 13:51:24 -07:00
Hiroyuki Sano
effd12ec87 fsck: use bitwise-or assignment operator to set flag
fsck_tree() has two different ways to set a flag variable, either by
using a if-statement that guards an assignment, or by using a
bitwise-or assignment operator.  Most are done with the former, and
only one variable is assigned with the latter.

Since all the conditions are short-and-sweet, we can afford to
uniformly use the latter style, which makes the resulting code
shorter and easier to read.

Signed-off-by: Hiroyuki Sano <sh19910711@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-20 11:20:48 -07:00
Yuxuan Shui
2d820a61df fsck.c:fsck_commit(): use skip_prefix() to verify and skip constant
fsck_commit() uses memcmp() to check if the buffer starts with a
certain prefix, and skips the prefix if it does.

This is exactly what skip_prefix() was designed for.

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-19 15:34:56 -07:00
Yuxuan Shui
de42180f6a fsck.c:fsck_ident(): ident points at a const string
Since fsck_ident doesn't change the content of **ident, the type of
ident could be const char **.

This change is required to rewrite fsck_commit() to use skip_prefix().

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-13 13:00:30 -07:00
Jeff King
7ca36d9398 date: check date overflow against time_t
When we check whether a timestamp has overflowed, we check
only against ULONG_MAX, meaning that strtoul has overflowed.
However, we also feed these timestamps to system functions
like gmtime, which expect a time_t. On many systems, time_t
is actually smaller than "unsigned long" (e.g., because it
is signed), and we would overflow when using these
functions.  We don't know the actual size or signedness of
time_t, but we can easily check for truncation with a simple
assignment.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 10:12:58 -08:00
Jeff King
d4b8de0420 fsck: report integer overflow in author timestamps
When we check commit objects, we complain if commit->date is
ULONG_MAX, which is an indication that we saw integer
overflow when parsing it. However, we do not do any check at
all for author lines, which also contain a timestamp.

Let's actually check the timestamps on each ident line
with strtoul. This catches both author and committer lines,
and we can get rid of the now-redundant commit->date check.

Note that like the existing check, we compare only against
ULONG_MAX. Now that we are calling strtoul at the site of
the check, we could be slightly more careful and also check
that errno is set to ERANGE. However, this will make further
refactoring in future patches a little harder, and it
doesn't really matter in practice.

For 32-bit systems, one would have to create a commit at the
exact wrong second in 2038. But by the time we get close to
that, all systems will hopefully have moved to 64-bit (and
if they haven't, they have a real problem one second later).

For 64-bit systems, by the time we get close to ULONG_MAX,
all systems will hopefully have been consumed in the fiery
wrath of our expanding Sun.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 10:12:58 -08:00
Jeff King
5c17f51270 fsck: warn about ".git" in trees
Having a ".git" entry inside a tree can cause confusing
results on checkout. At the top-level, you could not
checkout such a tree, as it would complain about overwriting
the real ".git" directory. In a subdirectory, you might
check it out, but performing operations in the subdirectory
would confusingly consider the in-tree ".git" directory as
the repository.

The regular git tools already make it hard to accidentally
add such an entry to a tree, and do not allow such entries
to enter the index at all. Teaching fsck about it provides
an additional safety check, and let's us avoid propagating
any such bogosity when transfer.fsckObjects is on.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-28 13:52:54 -08:00
Jeff King
5d34a4359d fsck: warn about '.' and '..' in trees
A tree with meta-paths like '.' or '..' does not work well
with git; the index will refuse to load it or check it out
to the filesystem (and even if we did not have that safety,
it would look like we were overwriting an untracked
directory). For the same reason, it is difficult to create
such a tree with regular git.

Let's warn about these dubious entries during fsck, just in
case somebody has created a bogus tree (and this also lets
us prevent them from propagating when transfer.fsckObjects
is set).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-28 10:41:08 -08:00
Junio C Hamano
03adeeaad6 Merge branch 'jk/maint-null-in-trees' into maint-1.7.11
"git diff" had a confusion between taking data from a path in the
working tree and taking data from an object that happens to have
name 0{40} recorded in a tree.

* jk/maint-null-in-trees:
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-09-10 15:24:54 -07:00
Jeff King
c479d14a80 fsck: detect null sha1 in tree entries
Short of somebody happening to beat the 1 in 2^160 odds of
actually generating content that hashes to the null sha1, we
should never see this value in a tree entry. So let's have
fsck warn if it it seen.

As in the previous commit, we test both blob and submodule
entries to future-proof the test suite against the
implementation depending on connectivity to notice the
error.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-29 15:14:08 -07:00
Pete Wyckoff
82247e9bd5 remove superfluous newlines in error messages
The error handling routines add a newline.  Remove
the duplicate ones in error messages.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-30 15:45:51 -07:00
Dmitry Ivankov
53f53cff24 fsck: improve committer/author check
fsck allows a name with > character in it like "name> <email>". Also for
"name email>" fsck says "missing space before email".

More precisely, it seeks for a first '<', checks that ' ' preceeds it.
Then seeks to '<' or '>' and checks that it is the '>'. Missing space is
reported if either '<' is not found or it's not preceeded with ' '.

Change it to following. Seek to '<' or '>', check that it is '<' and is
preceeded with ' '. Seek to '<' or '>' and check that it is '>'. So now
"name> <email>" is rejected as "bad name". More strict name check is the
only change in what is accepted.

Report 'missing space' only if '<' is found and is not preceeded with a
space.

Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-11 12:21:07 -07:00
Junio C Hamano
5b42477b59 Merge branch 'jm/maint-misc-fix' into maint
* jm/maint-misc-fix:
  read_gitfile_gently: use ssize_t to hold read result
  remove tests of always-false condition
  rerere.c: diagnose a corrupt MERGE_RR when hitting EOF between TAB and '\0'
2011-05-30 00:09:41 -07:00
Jim Meyering
5dd564895e remove tests of always-false condition
* fsck.c (fsck_error_function): Don't test obj->sha1 == 0.
It can never be true, since that sha1 member is an array.
* transport.c (set_upstreams): Likewise for ref->new_sha1.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-26 11:24:24 -07:00
Jeff King
ebeb60900f strbuf: add strbuf_vaddf
In a variable-args function, the code for writing into a strbuf is
non-trivial. We ended up cutting and pasting it in several places
because there was no vprintf-style function for strbufs (which in turn
was held up by a lack of va_copy).

Now that we have a fallback va_copy, we can add strbuf_vaddf, the
strbuf equivalent of vsprintf. And we can clean up the cut and paste
mess.

Signed-off-by: Jeff King <peff@peff.net>
Improved-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-26 01:06:50 -08:00
Jonathan Nieder
0adc6a3d49 fsck: fix bogus commit header check
daae1922 (fsck: check ident lines in commit objects, 2010-04-24)
taught fsck to expect commit objects to have the form

  tree <object name>
  <parents>
  author <valid ident string>
  committer <valid ident string>

  log message

The check is overly strict: for example, it errors out with the
message “expected blank line” for perfectly valid commits with an
"encoding ISO-8859-1" line.

Later it might make sense to teach fsck about the rest of the header
and warn about unrecognized header lines, but for simplicity, let’s
accept arbitrary trailing lines for now.

Reported-by: Tuncer Ayaz <tuncer.ayaz@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-28 15:08:27 -07:00
Jonathan Nieder
daae19224a fsck: check ident lines in commit objects
Check that email addresses do not contain <, >, or newline so they can
be quickly scanned without trouble.  The copy() function in ident.c
already ensures that ordinary git commands will not write email
addresses without this property.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-01 12:15:06 -07:00
Eric Wong
f290974390 Allow the Unix epoch to be a valid commit date
It is common practice to use the Unix epoch as a fallback date
when a suitable date is not available.  This is true of git svn
and possibly other importing tools that import non-git history
into git.

Instead of clobbering established strtoul() error reporting
semantics with our own, preserve the strtoul() error value
of ULONG_MAX for fsck.c to handle.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-06 09:36:16 -07:00
Benjamin Kramer
eb3a9dd327 Remove unused function scope local variables
These variables were unused and can be removed safely:

  builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
  builtin-fetch-pack.c::find_common(): len
  builtin-remote.c::mv(): symref
  diff.c::show_stats():show_stats(): total
  diffcore-break.c::should_break(): base_size
  fast-import.c::validate_raw_date(): date, sign
  fsck.c::fsck_tree(): o_sha1, sha1
  xdiff-interface.c::parse_num(): read_some

Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 20:52:17 -08:00
Junio C Hamano
8bb4646dae Merge branch 'maint'
* maint:
  Fix non-literal format in printf-style calls
  git-submodule: Avoid printing a spurious message.
  git ls-remote: make usage string match manpage
  Makefile: help people who run 'make check' by mistake
2008-11-11 14:49:50 -08:00
Daniel Lowe
9db56f71b9 Fix non-literal format in printf-style calls
These were found using gcc 4.3.2-1ubuntu11 with the warning:

    warning: format not a string literal and no format arguments

Incorporated suggestions from Brandon Casey <casey@nrlssc.navy.mil>.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-11 14:43:59 -08:00
Brandon Casey
f285a2d7ed Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializer
Many call sites use strbuf_init(&foo, 0) to initialize local
strbuf variable "foo" which has not been accessed since its
declaration. These can be replaced with a static initialization
using the STRBUF_INIT macro which is just as readable, saves a
function call, and takes up fewer lines.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-12 12:36:19 -07:00
Junio C Hamano
79b1138e78 fsck.c: fix bogus "empty tree" check
ba002f3 (builtin-fsck: move common object checking code to fsck.c) did
more than what it claimed to.  Most notably, it wrongly made an empty tree
object an error by pretending to only move code from fsck_tree() in
builtin-fsck.c to fsck_tree() in fsck.c, but in fact adding a bogus check
to barf on an empty tree.

An empty tree object is _unusual_.  Recent porcelains try reasonably hard
not to let the user create a commit that contains such a tree.  Perhaps
warning about them in git-fsck may have some merit.

HOWEVER.

Being unusual and being errorneous are two quite different things.  This
is especially true now we seem to use the same fsck_$object() code in
places other than git-fsck itself.  For example, receive-pack should not
reject unusual objects, even if it would be a good idea to tighten it to
reject incorrect ones.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-03-05 10:32:01 -08:00
Martin Koegler
d6ffc8d784 add common fsck error printing function
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-25 23:57:35 -08:00
Martin Koegler
ba002f3b28 builtin-fsck: move common object checking code to fsck.c
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-25 23:57:35 -08:00
Martin Koegler
355885d531 add generic, type aware object chain walker
The requirements are:
* it may not crash on NULL pointers
* a callback function is needed, as index-pack/unpack-objects
  need to do different things
* the type information is needed to check the expected <-> real type
  and print better error messages

Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-02-25 23:57:34 -08:00
Mark Wooding
b4dfefe00f Make fsck and fsck-objects be builtins.
The earlier change df391b192 to rename fsck-objects to fsck broke
fsck-objects.  This should fix it again.

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-29 09:36:21 -08:00
Junio C Hamano
df391b192d git-fsck-objects is now synonym to git-fsck
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-28 16:33:58 -08:00