In the same spirit as commit 8eca0b47ff, let's try to survive a pack
corruption by making packed_object_info() able to fall back to alternate
packs or loose objects.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is possible to have pack corruption in the object header. Currently
unpack_object_header() simply die() on them instead of letting the caller
deal with that gracefully.
So let's have unpack_object_header() return an error instead, and find
a better name for unpack_object_header_gently() in that context. All
callers of unpack_object_header() are ready for it.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In one case, it was possible to have a bad offset equal to 0 effectively
pointing a delta onto itself and crashing git after too many recursions.
In the other cases, a negative offset could result due to off_t being
signed. Catch those.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Abstract
--------
With index v2 we have a per object CRC to allow quick and safe reuse of
pack data when repacking. This, however, doesn't currently prevent a
stealth corruption from being propagated into a new pack when _not_
reusing pack data as demonstrated by the modification to t5302 included
here.
The Context
-----------
The Git database is all checksummed with SHA1 hashes. Any kind of
corruption can be confirmed by verifying this per object hash against
corresponding data. However this can be costly to perform systematically
and therefore this check is often not performed at run time when
accessing the object database.
First, the loose object format is entirely compressed with zlib which
already provide a CRC verification of its own when inflating data. Any
disk corruption would be caught already in this case.
Then, packed objects are also compressed with zlib but only for their
actual payload. The object headers and delta base references are not
deflated for obvious performance reasons, however this leave them
vulnerable to potentially undetected disk corruptions. Object types
are often validated against the expected type when they're requested,
and deflated size must always match the size recorded in the object header,
so those cases are pretty much covered as well.
Where corruptions could go unnoticed is in the delta base reference.
Of course, in the OBJ_REF_DELTA case, the odds for a SHA1 reference to
get corrupted so it actually matches the SHA1 of another object with the
same size (the delta header stores the expected size of the base object
to apply against) are virtually zero. In the OBJ_OFS_DELTA case, the
reference is a pack offset which would have to match the start boundary
of a different base object but still with the same size, and although this
is relatively much more "probable" than in the OBJ_REF_DELTA case, the
probability is also about zero in absolute terms. Still, the possibility
exists as demonstrated in t5302 and is certainly greater than a SHA1
collision, especially in the OBJ_OFS_DELTA case which is now the default
when repacking.
Again, repacking by reusing existing pack data is OK since the per object
CRC provided by index v2 guards against any such corruptions. What t5302
failed to test is a full repack in such case.
The Solution
------------
As unlikely as this kind of stealth corruption can be in practice, it
certainly isn't acceptable to propagate it into a freshly created pack.
But, because this is so unlikely, we don't want to pay the run time cost
associated with extra validation checks all the time either. Furthermore,
consequences of such corruption in anything but repacking should be rather
visible, and even if it could be quite unpleasant, it still has far less
severe consequences than actively creating bad packs.
So the best compromize is to check packed object CRC when unpacking
objects, and only during the compression/writing phase of a repack, and
only when not streaming the result. The cost of this is minimal (less
than 1% CPU time), and visible only with a full repack.
Someone with a stats background could provide an objective evaluation of
this, but I suspect that it's bad RAM that has more potential for data
corruptions at this point, even in those cases where this extra check
is not performed. Still, it is best to prevent a known hole for
corruption when recreating object data into a new pack.
What about the streamed pack case? Well, any client receiving a pack
must always consider that pack as untrusty and perform full validation
anyway, hence no such stealth corruption could be propagated to remote
repositoryes already. It is therefore worthless doing local validation
in that case.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to have non-POSIX comformant BRE in our code, and linked with GNU
regexp library on a few platforms (Darwin, FreeBSD and AIX) to work it
around. This was backwards.
We've fixed the broken regexps to use ERE that native regexp libraries on
these platforms can handle just fine. There is no need to link with GNU
regexp library on these platforms anymore.
Tested-on-AIX-by: Mike Ralphson <mike@abacus.co.uk>
Tested-on-FreeBSD-by: Jeff King <peff@peff.net>
Tested-on-Darwin-by: Arjen Laarhoven <arjen@yaph.org>
Tested-on-Darwin-by: Pieter de Bie <pieter@frim.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current behavior of git-daemon is to simply close the connection on
any error condition. This leaves the client without any information as
to the cause of the failed fetch/push/etc.
This patch allows get_remote_heads to accept a line prefixed with "ERR"
that it can display to the user in an informative fashion. Once clients
can understand this ERR line, git-daemon can be made to properly report
"repository not found", "permission denied", or other errors.
Example
S: ERR No matching repository.
C: fatal: remote error: No matching repository.
Signed-off-by: Tom Preston-Werner <tom@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Occasionally, it may be useful to prevent branches from getting deleted from
a centralized repository, particularly when no administrative access to the
server is available to undo it via reflog. It also makes
receive.denyNonFastForwards more useful if it is used for access control
since it prevents force-updating by deleting and re-creating a ref.
Signed-off-by: Jan Krüger <jk@jk.gs>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Start 1.6.0.4 cycle
add instructions on how to send patches to the mailing list with Gmail
Documentation/gitattributes: Add subsection header for each attribute
git send-email: avoid leaking directory file descriptors.
send-pack: do not send out single-level refs such as refs/stash
fix overlapping memcpy in normalize_absolute_path
pack-objects: avoid reading uninitalized data
correct cache_entry allocation
Conflicts:
RelNotes
Gmail is one of the most popular email providers in the world. Now that Gmail
supports IMAP, sending properly formatted patches via `git imap-send` is
trivial. This section in SubmittingPatches explains how to do so.
Signed-off-by: Tom Preston-Werner <tom@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes attributes easier to find; before this patch some
attributes had individual subsections, and some didn't.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since no version of receive-pack accepts these "funny refs", we should
mirror the check when considering the list of refs to send. IOW, don't
even make them eligible for matching or mirroring.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comments for normalize_absolute_path explicitly claim
that the source and destination buffers may be the same
(though they may not otherwise overlap). Thus the call to
memcpy may involve copying overlapping data, and memmove
should be used instead.
This fixes a valgrind error in t1504.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the main loop of find_deltas, we do:
struct object_entry *entry = *list++;
...
if (!*list_size)
...
break
Because we look at and increment *list _before_ the check of
list_size, in the very last iteration of the loop we will
look at uninitialized data, and increment the pointer beyond
one past the end of the allocated space. Since we don't
actually do anything with the data until after the check,
this is not a problem in practice.
But since it technically violates the C standard, and
because it provokes a spurious valgrind warning, let's just
move the initialization of entry to a safe place.
This fixes valgrind errors in t5300, t5301, t5302, t303, and
t9400.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most cache_entry structs are allocated by using the
cache_entry_size macro, which rounds the size of the struct
up to the nearest multiple of 8 bytes (presumably to avoid
memory fragmentation).
There is one exception: the special "conflict entry" is
allocated with an empty name, and so is explicitly given
just one extra byte to hold the NUL.
However, later code doesn't realize that this particular
struct has been allocated differently, and happily tries
reading and copying it based on the ce_size macro, which
assumes the 8-byte alignment.
This can lead to reading uninitalized data, though since
that data is simply padding, there shouldn't be any problem
as a result. Still, it makes sense to hold the padding
assumption so as not to surprise later maintainers.
This fixes valgrind errors in t1005, t3030, t4002, and
t4114.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git://repo.or.cz/git-gui: (27 commits)
git-gui: Update German translation.
git-gui: Do not munge conflict marker lines in a normal diff
git-gui: Add a simple implementation of SSH_ASKPASS.
git-gui: Add a dialog that shows the OpenSSH public key.
git-gui: Mark-up strings in show_{other,unmerged}_diff() for localization
git-gui: Show a round number of bytes of large untracked text files
git-gui: Fix the blame viewer destroy handler.
git-gui: Add a search command to the blame viewer.
git-gui: Fix the blame window shape.
git-gui: Fix switch statement in lib/merge.tcl
git-gui: Fix fetching from remotes when adding them
git-gui: Fix removing non-pushable remotes
git-gui: Make input boxes in init/clone/open dialogs consistent
git-gui: Avoid using the term URL when specifying repositories
git-gui: gui.autoexplore makes explorer to pop up automatically after picking
git-gui: Add Explore Working Copy to the Repository menu
git-gui: Use git web--browser for web browsing
git-gui: mkdir -p when initializing new remote repository
git-gui: Add support for removing remotes
git-gui: Add support for adding remotes
...
Previously, conflict markers were highlighted in two ways: (1) They
received a distinguishing color; and (2) they had the '+' removed at the
beginning of the line. However, by doing (2), a hunk that contained
conflict markers could not be staged or unstaged because the resulting
patch was corrupted. With this change we no longer modify the diff text
of a 2-way diff, so that "Stage Hunk" and friends work.
Note that 3-way diff of a conflicted file is unaffected by this change,
and '++' before conflict markers is still removed. But this has no negative
impact because in this mode staging hunks or lines is disabled anyway.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
OpenSSH allows specifying an external program to use
for direct user interaction. While most Linux systems
already have such programs, some environments, for
instance, msysgit, lack it. This patch adds a simple
fallback Tcl implementation of the tool.
In msysgit it is also necessary to set a fake value of
the DISPLAY variable, because otherwise ssh won't even
try to use SSH_ASKPASS handlers.
Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com>
Acked-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Generating a new SSH key or finding an existing one may
be a difficult task for non-technical users, especially
on Windows.
This commit adds a new dialog that shows the public key,
or allows the user to generate a new one if none were found.
Since this is a convenience/informational feature for new
users, and the dialog is mostly read-only, it is located
in the Help menu.
The command line used to invoke ssh-keygen is designed to
force it to use SSH_ASKPASS if available, or accept empty
passphrases, but _never_ wait for user response on the tty.
Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com>
Acked-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* maint:
git-svn: change dashed git-commit-tree to git commit-tree
Documentation: clarify information about 'ident' attribute
bash completion: add doubledash to "git show"
Use test-chmtime -v instead of perl in t5000 to get mtime of a file
Add --verbose|-v to test-chmtime
asciidoc: add minor workaround to add an empty line after code blocks
Plug a memleak in builtin-revert
Add file delete/create info when we overflow rename_limit
Install git-cvsserver in $(bindir)
Install git-shell in bindir, too
The documentation spoke of the attribute being set "to" a path; this can
mistakenly be interpreted as "the attribute needs to have its value set to
some kind of path". This clarifies things.
Signed-off-by: Jan Krüger <jk@jk.gs>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test was broken on admittedly broken combination of Windows, Cygwin,
and ActiveState Perl.
Signed-off-by: Alex Riesen <ariesen@harmanbecker.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us replace perl when getting the mtime of a file because
of time zone conversions, though at the moment only one platform which
does this has been identified: Cygwin when used with ActiveState Perl
(as usual).
The output format is:
<mtime1> TAB <filename1> <LF>
<mtime2> TAB <filename2> <LF>
...
which, if only mtime is needed can be parsed with cut(1):
test-chmtime -v +0 filename1 | cut -f 1
Also, the change adds a description of programs features, with examples.
Signed-off-by: Alex Riesen <ariesen@harmanbecker.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Insert an empty <simpara> in manpages after code blocks to force and
empty line.
The problem can be seen on the manpage for the git tutorial, where an
example command and the following paragraph is printed with no empty
line between them:
First, note that you can get documentation for a command such as git
log --graph with:
$ man git-log
It is a good idea to introduce yourself to git [...]
Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original patch that lead to an earlier commit adbc0b6 (cygwin: Use
native Win32 API for stat, 2008-09-30) did not call git_default_config()
and it was a good thing. The lazy config reading when lstat/stat is
called for the first time to find out if core.filemode is set can happen
anytime in the calling program. If it happens after the calling program
parsed the configuration file to prime its default parameter settings and
processed its command line parameters to tweak them, this will overwrite
the values set by the program with the values read from the config file.
This essentially reverts the code to the version as submitted by Mark,
with a bit more comments to clarify why we do not fall back on the default
configuration parser from git_cygwin_config().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we refuse to do rename detection due to having too many files
created or deleted, let the user know the numbers. That way there is a
reasonable starting point for setting the diff.renamelimit option.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is one of the server side programs and needs to be found on usual $PATH.
Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
/etc/passwd shell field must be something execable, you can't enter
"/usr/bin/git shell" there. git-shell must be present as a separate
executable, or it is useless.
Signed-off-by: Tommi Virtanen <tv@eagain.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Current 'git add -p' will say "No changes." if there are no changes to
text files, which can be confusing if there _are_ changes to binary
files. Add some code to distinguish the two cases, and give a
different message in the latter one.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This moves the call to git_config to a place where it doesn't break the
logic for using git archive in a bare repository but retains the fix to
make git archive respect core.autocrlf.
Tests are by René Scharfe.
Signed-off-by: Charles Bailey <charles@hashpling.org>
Tested-by: Deskin Miller <deskinm@umich.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the alternate_object_database structure, ent->base[] is a buffer the
users can use to form pathnames to loose objects, and ent->name is a
pointer into that buffer (it points at one beyond ".git/objects/"). If
you get a call to add_refs_from_alternate() after somebody used the entry
(has_loose_object() has been called, for example), *ent->name would not be
NUL, and ent->base[] won't be the path to the object store.
This caller is expecting to read the path to the object store in ent->base[];
it needs to NUL terminate the buffer if it wants to.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows hooks like pre-receive to look at the client's IP
address.
Of course the IP address can't be used to get strong security;
git-daemon isn't the right thing to use if you need that. However,
basic IP address checking can be good enough in some situations.
REMOTE_ADDR is the same environment variable used to communicate the
client's address to CGI scripts.
Signed-off-by: Joey Hess <joey@kitenet.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Not all greps support "-e", but in this case we can easily convert it to a
single extended regex.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, branches were listed on a single line in each section. But
if there are many branches, then horizontal, line-wrapped lists are very
inconvenient to scan for a human. This makes the lists vertical, i.e one
branch per line is printed.
Since "git remote" is porcelain, we can easily make this
backwards-incompatible change.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a file is different between the working tree copy, the index, and the
HEAD, then we do not allow it to be deleted without --force.
However, this is overly tight in the face of "git add --intent-to-add":
$ git add --intent-to-add file
$ : oops, I don't actually want to stage that yet
$ git rm --cached file
error: 'empty' has staged content different from both the
file and the HEAD (use -f to force removal)
$ git rm -f --cached file
Unfortunately, there is currently no way to distinguish between an empty
file that has been added and an "intent to add" file. The ideal behavior
would be to disallow the former while allowing the latter.
This patch loosens the safety valve to allow the deletion only if we are
deleting the cached entry and the cached content is empty. This covers
the intent-to-add situation, and assumes there is little harm in not
protecting users who have legitimately added an empty file. In many
cases, the file will still be empty, in which case the safety valve does
not trigger anyway (since the content remains untouched in the working
tree). Otherwise, we do remove the fact that no content was staged, but
given that the content is by definition empty, it is not terribly
difficult for a user to recreate it.
However, we still document the desired behavior in the form of two
tests. One checks the correct removal of an intent-to-add file. The other
checks that we still disallow removal of empty files, but is marked as
expect_failure to indicate this compromise. If the intent-to-add feature
is ever extended to differentiate between normal empty files and
intent-to-add files, then the safety valve can be re-tightened.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/diff-convfilter:
diff: add filter for converting binary to text
diff: introduce diff.<driver>.binary
diff: unify external diff and funcname parsing code
t4012: use test_cmp instead of cmp
* js/maint-fetch-update-head:
pull: allow "git pull origin $something:$current_branch" into an unborn branch
Fix fetch/pull when run without --update-head-ok
Conflicts:
t/t5510-fetch.sh
* jc/maint-co-track:
Enhance hold_lock_file_for_{update,append}() API
demonstrate breakage of detached checkout with symbolic link HEAD
Fix "checkout --track -b newbranch" on detached HEAD
Conflicts:
builtin-commit.c