We should never need to write the null sha1 into an index
entry (short of the 1 in 2^160 chance that somebody actually
has content that hashes to it). If we attempt to do so, it
is much more likely that it is a bug, since we use the null
sha1 as a sentinel value to mean "not valid".
The presence of null sha1s in the index (which can come
from, among other things, "update-index --cacheinfo", or by
reading a corrupted tree) can cause problems for later
readers, because they cannot distinguish the literal null
sha1 from its use a sentinel value. For example, "git
diff-files" on such an entry would make it appear as if it
is stat-dirty, and until recently, the diff code assumed
such an entry meant that we should be diffing a working tree
file rather than a blob.
Ideally, we would stop such entries from entering even our
in-core index. However, we do sometimes legitimately add
entries with null sha1s in order to represent these sentinel
situations; simply forbidding them in add_index_entry breaks
a lot of the existing code. However, we can at least make
sure that our in-core sentinel representation never makes it
to disk.
To be thorough, we will test an attempt to add both a blob
and a submodule entry. In the former case, we might run into
problems anyway because we will be missing the blob object.
But in the latter case, we do not enforce connectivity
across gitlink entries, making this our only point of
enforcement. The current implementation does not care which
type of entry we are seeing, but testing both cases helps
future-proof the test suite in case that changes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff code represents paths using the diff_filespec
struct. This struct has a sha1 to represent the sha1 of the
content at that path, as well as a sha1_valid member which
indicates whether its sha1 field is actually useful. If
sha1_valid is not true, then the filespec represents a
working tree file (e.g., for the no-index case, or for when
the index is not up-to-date).
The diff_filespec is only used internally, though. At the
interfaces to the diff subsystem, callers feed the sha1
directly, and we create a diff_filespec from it. It's at
that point that we look at the sha1 and decide whether it is
valid or not; callers may pass the null sha1 as a sentinel
value to indicate that it is not.
We should not typically see the null sha1 coming from any
other source (e.g., in the index itself, or from a tree).
However, a corrupt tree might have a null sha1, which would
cause "diff --patch" to accidentally diff the working tree
version of a file instead of treating it as a blob.
This patch extends the edges of the diff interface to accept
a "sha1_valid" flag whenever we accept a sha1, and to use
that flag when creating a filespec. In some cases, this
means passing the flag through several layers, making the
code change larger than would be desirable.
One alternative would be to simply die() upon seeing
corrupted trees with null sha1s. However, this fix more
directly addresses the problem (while bogus sha1s in a tree
are probably a bad thing, it is really the sentinel
confusion sending us down the wrong code path that is what
makes it devastating). And it means that git is more capable
of examining and debugging these corrupted trees. For
example, you can still "diff --raw" such a tree to find out
when the bogus entry was introduced; you just cannot do a
"--patch" diff (just as you could not with any other
corrupted tree, as we do not have any content to diff).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we are leaving a detached HEAD, we do a revision traversal to
check whether we are orphaning any commits, marking the commit we're
leaving as the start of the traversal, and all existing refs as
uninteresting.
Prior to commit 468224e5, we did so by calling for_each_ref, and
feeding each resulting refname to setup_revisions. Commit 468224e5
refactored this to simply mark the pending objects, saving an extra
lookup.
However, it confused the "flags" parameter to the each_ref_fn
clalback, which is about the flags we found while looking up the ref
with the object flag. Because REF_ISSYMREF ("this ref is a symbolic
ref, e.g. refs/remotes/origin/HEAD") happens to be the same bit
pattern as SEEN ("we have picked this object up from the pending
list and moved it to revs.commits list"), we incorrectly reported
that a commit previously at the detached HEAD will become
unreachable if the only ref that can reach the commit happens to be
pointed at by a symbolic ref.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was a bit hard to learn how <rev>^@, <rev>^! and various other
forms of range specifiers are used, because they were discussed
mostly in the prose part of the documentation, unlike various forms
of extended SHA-1 expressions that are listed in an enumerated list.
Also add a few more examples showing use of <rev>, <rev>..<rev> and
<rev>^! forms, stolen from a patch by Max Horn.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not document COMMIT_EDITMSG at all, but users may want
to know about it for two reasons:
1. They may want to tell their editor to configure itself
for formatting a commit message.
2. If a commit is aborted by an error, the user may want
to recover the commit message they typed.
Let's put a note in git-commit(1).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This way we do not have to risk the list of tools going out of sync
between the implementation and the documentation.
In the same spirit as bf73fc2 (difftool: print list of valid tools
with '--tool-help', 2012-03-29), trim the list of merge backends in
the documentation. We do not want to have a complete list of valid
tools; we only want a list to help people guess what kind of things
the tools do to be specified there, and refer them to --tool-help
for a complete list.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The identity of the committer will ultimately be pulled from
the ident code by commit_tree(). However, we make an attempt
to check the author and committer identity early, before the
user has done any manual work like inputting a commit
message. That lets us abort without them having to worry
about salvaging the work from .git/COMMIT_EDITMSG.
The early check for committer ident does not use the
IDENT_STRICT flag, meaning that it would not find an empty
name field. The motivation was presumably because we did not
want to be too restrictive, as later calls might be more lax
(for example, when we create the reflog entry, we do not
care too much about a real name). However, because
commit_tree will always get a strict identity to put in the
commit object itself, there is no point in being lax only to
die later (and in fact it is harmful, because the user will
have wasted time typing their commit message).
Incidentally, this bug was masked prior to 060d4bb, as the
initial loose call would taint the later strict call. So the
commit would succeed (albeit with a bogus committer line in
the commit object), and nobody noticed that our early check
did not match the later one.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The advise() function takes a variable number of arguments
and converts them into a va_list object to pass to strbuf
for handling. However, we accidentally called strbuf_addf
(that takes a variable number of arguments) instead of
strbuf_vaddf (that takes a va_list).
This bug dates back to v1.7.8.1-1-g23cb5bf, but we never
noticed because none of the current callers passes a string
with a format specifier in it. And the compiler did not
notice because the format string is not available at
compile time.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
block-sha1/ is fast on most known platforms. Clarify the Makefile to
be less misleading about that.
Early versions of block-sha1/ explicitly relied on fast htonl() and
fast 32-bit loads with arbitrary alignment. Now it uses those on some
arches but the default behavior is byte-at-a-time access for the sake
of arches like ARM, Alpha, and their kin and it is still pretty fast
on these arches (fast enough to supersede the mozilla SHA1
implementation and the hand-written ARM assembler implementation that
were bundled before).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When I invoke "make block-sha1/sha1.s", 'make' runs $(CC) -S without
specifying where it should put its output and the output ends up in
./sha1.s. Confusing.
Add an -o option to the .s rule to fix this. We were already doing
that for most compiler invocations but had forgotten it for the
assembler listings.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
't' is currently always a numeric constant, but it can't hurt to
prepare for the day that it becomes useful for a caller to pass in a
more complex expression.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With 660231aa (block-sha1: support for architectures with memory
alignment restrictions, 2009-08-12), blk_SHA1_Update was modified to
access 32-bit chunks of memory one byte at a time on arches that
prefer that:
#define get_be32(p) ( \
(*((unsigned char *)(p) + 0) << 24) | \
(*((unsigned char *)(p) + 1) << 16) | \
(*((unsigned char *)(p) + 2) << 8) | \
(*((unsigned char *)(p) + 3) << 0) )
The code previously accessed these values by just using htonl(*p).
Unfortunately, Michael noticed on an Alpha machine that git was using
plain 32-bit reads anyway. As soon as we convert a pointer to int *,
the compiler can assume that the object pointed to is correctly
aligned as an int (C99 section 6.3.2.3 "pointer conversions"
paragraph 7), and gcc takes full advantage by using a single 32-bit
load, resulting in a whole bunch of unaligned access traps.
So we need to obey the alignment constraints even when only dealing
with pointers instead of actual values. Do so by changing the type
of 'data' to void *. This patch renames 'data' to 'block' at the same
time to make sure all references are updated to reflect the new type.
Reported-tested-and-explained-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error message from "git push $there :bogo" (and its equivalent
"git push $there --delete bogo") mentioned that we tried and failed
to guess what ref is being deleted based on the LHS of the refspec,
which we don't.
* jk/push-delete-ref-error-message:
push: don't guess at qualifying remote refs on deletion
A handful of files and directories we create had tighter than
necessary permission bits when the user wanted to have group
writability (e.g. by setting "umask 002").
* ar/clone-honor-umask-at-top:
add: create ADD_EDIT.patch with mode 0666
rerere: make rr-cache fanout directory honor umask
Restore umasks influence on the permissions of work tree created by clone
"commit --amend" used to refuse amending a commit with an empty log
message, with or without "--allow-empty-message".
* cw/amend-commit-without-message:
Allow edit of empty message with commit --amend
"git commit --amend --only --" was meant to allow "Clever" people to
rewrite the commit message without making any change even when they
have already changes for the next commit added to their index, but
it never worked as advertised since it was introduced in 1.3.0 era.
* jk/maint-commit-amend-only-no-paths:
commit: fix "--amend --only" with no pathspec
Even though the index can record pathnames longer than 1<<12 bytes,
in some places we were not comparing them in full, potentially
replacing index entries instead of adding.
* tg/maint-cache-name-compare:
cache_name_compare(): do not truncate while comparing paths
"git show"'s auto-walking behaviour was an unreliable and
unpredictable hack; it now behaves just like "git log" does when it
walks.
* tr/maint-show-walk:
show: fix "range implies walking"
Demonstrate git-show is broken with ranges
"git diff", "git status" and anything that internally uses the
comparison machinery was utterly broken when the difference
involved a file with "-" as its name. This was due to the way "git
diff --no-index" was incorrectly bolted on to the system, making
any comparison that involves a file "-" at the root level
incorrectly read from the standard input.
* jc/refactor-diff-stdin:
diff-index.c: "git diff" has no need to read blob from the standard input
diff-index.c: unify handling of command line paths
diff-index.c: do not pretend paths are pathspecs
We did not have test to make sure "git rebase" without extra options
filters out an empty commit in the original history.
* mz/empty-rebase-test:
add test case for rebase of empty commit
"git fast-export" produced an input stream for fast-import without
properly quoting pathnames when they contain SPs in them.
* js/fast-export-paths-with-spaces:
fast-export: quote paths with spaces
"git checkout --detach", when you are still on an unborn branch,
should be forbidden, but it wasn't.
* cw/no-detaching-an-unborn:
git-checkout: disallow --detach on unborn branch
Some implementations of Perl terminates "lines" with CRLF even when
the script is operating on just a sequence of bytes. Make sure to
use "$PERL_PATH", the version of Perl the user told Git to use, in
our tests to avoid unnecessary breakages in tests.
* vr/use-our-perl-in-tests:
t/README: add a bit more Don'ts
tests: enclose $PERL_PATH in double quotes
t/test-lib.sh: export PERL_PATH for use in scripts
t: Replace 'perl' by $PERL_PATH
There are three ways to specify an external diff command:
GIT_EXTERNAL_DIFF in the environment, diff.external in the
config, or a "diff" gitattribute. The current order of
precedence is:
1. gitattribute
2. GIT_EXTERNAL_DIFF
3. diff.external
Usually our rule is that environment variables should take
precedence over on-disk config (i.e., option 2 should come
before option 1). However, this situation is trickier than
some, because option 1 is more specific to the individual
file than option 2 (which affects all files), so it might be
preferable. So the current behavior can be seen as
implementing "do the specific thing if we can, but fall back
to this general thing".
This is probably not what we would do if we were writing git
from scratch, but it has been this way for several years,
and is not worth changing. So let's at least document that
this is the way it's supposed to work with a test.
While we're there, let's also make sure that diff.external
(which was not previously tested at all) works by running it
through the same tests as GIT_EXTERNAL_DIFF.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Upon seeing a type-change filepair, "diff --no-ext-diff" does not
show the usual "deletion followed by addition" split patch and does
not run the external diff driver either.
This is because the logic to disable external diff was placed at a
wrong level in the callchain. run_diff_cmd() decides to show the
split patch only when external diff driver is not configured or
specified via GIT_EXTERNAL_DIFF environment, but this is done before
checking if --no-ext-diff was given. To make things worse,
run_diff_cmd() checks --no-ext-diff and disables the output for such
a filepair completely, as the callchain below it (e.g. builtin_diff)
does not want to handle typechange filepairs.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
79a9312 (commit-tree: update the command line parsing, 2011-11-09)
updated the command line parser to understand the usual "flags first
and then non-flag arguments" order, in addition to the original and
a bit unusual "tree comes first and then zero or more -p <parent>".
Unfortunately, ba3c69a (commit: teach --gpg-sign option, 2011-10-05)
broke it by mistake. Resurrect it, and protect the feature with a
test from future breakages.
Noticed by Keshav Kini
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If "git am" fails to apply something, the end user may need to know
where to find the patch that failed to apply, so that the user can
do other things (e.g. trying "GNU patch" on it, running "diffstat"
to see what it tried to change, etc.) The input to "am" may have
contained more than one patch, or the message may have been MIME
encoded, and knowing what the user fed to "am" does not help very
much for this purpose.
Also introduce advice.amworkdir configuration to allow people who
learned where to look to squelch this message.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running filter-branch on a history that has a commit with timestamp
at epoch used to fail, but it should have been fixed. Add test to
make sure it won't break again.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is perfectly OK for a valid decimal integer to begin with '9' but
116eb3a (parse_date(): allow ancient git-timestamp, 2012-02-02) did
not express the range correctly.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 69c3051 (submodules: refactor computation of relative gitdir path)
cloning a submodule recursively fails for nested submodules when a
symbolic link is part of the path to the work tree of the superproject.
This happens when module_clone() tries to find the relative paths between
the work tree and the git dir. When a symbolic link in current $PWD points
to a directory that is at a different level, then determining the number
of "../" needed to traverse to the superproject's work tree leads to a
wrong result.
As there is no portable way to say "pwd -P", use cd_to_toplevel to remove
the link from $PWD, which fixes this problem.
A test for this issue has been added to t7406.
Reported-by: Bob Halley <halley@play-bow.org>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git blame" did not try to make sure that the abbreviated commit
object names in its output are unique.
* jc/maint-blame-unique-abbrev:
blame: compute abbreviation width that ensures uniqueness
On Cygwin, the platform pread(2) is not thread safe, just like our own
compat/ emulation, and cannot be used in the index-pack program.
Makefile variable NO_THREAD_SAFE_PREAD can be defined to avoid use of
this function in a threaded program.
* rj/platform-pread-may-be-thread-unsafe:
index-pack: Disable threading on cygwin
"git diff --no-index" did not correctly handle relative paths and
did not correctly give exit codes when run under "--quiet" option.
* th/diff-no-index-fixes:
diff-no-index: exit(1) if 'diff --quiet <repo file> <external file>' finds changes
diff: handle relative paths in no-index
"git clone --single-branch" to clone a single branch did not limit
the cloning to the specified branch.
* nd/clone-single-fix:
clone: fix ref selection in --single-branch --branch=xxx
When "git log" gets "--simplify-merges/by-decoration" together with
"--first-parent", the combination of these options makes the
simplification logic to use in-core commit objects that haven't been
examined for relevance, either producing incorrect result or taking
too long to produce any output. Teach the simplification logic to
ignore commits that the first-parent traversal logic ignored when
both are in effect to work around the issue.
* jc/rev-list-simplify-merges-first-parent:
revision: ignore side parents while running simplify-merges
revision: note the lack of free() in simplify_merges()
revision: "simplify" options imply topo-order sort
"git add" allows adding a regular file to the path where a submodule
used to exist, but "git update-index" did not allow an equivalent
operation to Porcelain writers.
* hv/submodule-update-nuke-submodules:
update-index: allow overwriting existing submodule index entries
"git diff --no-index" did not work with pagers correctly.
* jk/diff-no-index-pager:
do not run pager with diff --no-index --quiet
fix pager.diff with diff --no-index
"git diff COPYING HEAD:COPYING" gave a nonsense error message that
claimed that the treeish HEAD did not have COPYING in it.
* mm/verify-filename-fix:
verify_filename(): ask the caller to chose the kind of diagnosis
sha1_name: do not trigger detailed diagnosis for file arguments
The documentation for "git cherry-pick A B..C" was misleading.
* cn/cherry-pick-range-docs:
git-cherry-pick.txt: clarify the use of revision range notation
Documentation: --no-walk is no-op if range is specified
"git archive" incorrectly computed the header checksum; the symptom
was observed only when using pathnames with hi-bit set.
* jc/ustar-checksum-is-unsigned:
archive: ustar header checksum is computed unsigned
Running "git bundle verify" on a bundle that records a complete
history said "it requires these 0 commits".
* jc/bundle-complete-notice:
tweak "bundle verify" of a complete history
"git ls-files --exclude=t -i" did not consider anything under t/ as
excluded, as it did not pay attention to exclusion of leading paths
while walking the index. Other two users of excluded() are also
updated.
* jc/ls-files-i-dir:
dir.c: make excluded() file scope static
unpack-trees.c: use path_excluded() in check_ok_to_remove()
builtin/add.c: use path_excluded()
path_excluded(): update API to less cache-entry centric
ls-files -i: micro-optimize path_excluded()
ls-files -i: pay attention to exclusion of leading paths
"git request-pull $url dev" when the tip of "dev" branch was tagged
with "ext4-for-linus" used the contents from the tag in the output
but still asked the "dev" branch to be pulled, not the tag.
* jc/request-pull-match-tagname:
request-pull: really favor a matching tag
We failed to use ce_namelen() equivalent and instead only compared
up to the CE_NAMEMASK bytes by mistake. Adding an overlong path
that shares the same common prefix as an existing entry in the index
did not add a new entry, but instead replaced the existing one, as
the result.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we do not have any pathspec, we typically disallow an
explicit "--only", because it makes no sense (your commit
would, by definition, be empty). But since 6a74642
(git-commit --amend: two fixes., 2006-04-20), we have
allowed "--amend --only" with the intent that it would amend
the commit, ignoring any contents staged in the index.
However, while that commit allowed the combination, we never
actually implemented the logic to make it work. The current
code notices that we have no pathspec and assumes we want to
do an as-is commit (i.e., the "--only" is ignored).
Instead, we must make sure to follow the partial-commit
code-path. We also need to tweak the list_paths function to
handle a NULL pathspec.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>