When upload-pack receives a "want" line from the client, it
adds it to an object array. We call lookup_object to find
the actual object, which will only check for objects already
in memory. This works because we are expecting to find
objects that we already loaded during the ref advertisement.
We use the resulting object structs for a variety of
purposes. Some of them care only about the object flags, but
others care about the type of the object (e.g.,
ok_to_give_up), or even feed them to the revision parser
(when --depth is used), which assumes that objects it
receives are fully parsed.
Once upon a time, this was OK; any object we loaded into
memory would also have been parsed. But since 435c833
(upload-pack: use peel_ref for ref advertisements,
2012-10-04), we try to avoid parsing objects during the ref
advertisement. This means that lookup_object may return an
object with a type of OBJ_NONE. The resulting mess depends
on the exact set of objects, but can include the revision
parser barfing, or the shallow code sending the wrong set of
objects.
This patch teaches upload-pack to parse each "want" object
as we receive it. We do not replace the lookup_object call
with parse_object, as the current code is careful not to let
just any object appear on a "want" line, but rather only one
we have previously advertised (whereas parse_object would
actually load any arbitrary object from disk).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git submodule frotz" was not diagnosed as "frotz" being an unknown
subcommand to "git submodule"; the user instead got a complaint that
"git submodule status" was run with an unknown path "frotz".
* rr/maint-submodule-unknown-cmd:
submodule: if $command was not matched, don't parse other args
"git fetch" over http advertised that it supports "deflate", which
is much less common, and did not advertise more common "gzip" on its
Accept-Encoding header.
* sp/maint-http-enable-gzip:
Enable info/refs gzip decompression in HTTP client
* jc/maint-log-grep-all-match-1:
grep.c: make two symbols really file-scope static this time
t7810-grep: test --all-match with multiple --grep and --author options
t7810-grep: test interaction of multiple --grep and --author options
t7810-grep: test multiple --author with --all-match
t7810-grep: test multiple --grep with and without --all-match
t7810-grep: bring log --grep tests in common form
grep.c: mark private file-scope symbols as static
log: document use of multiple commit limiting options
log --grep/--author: honor --all-match honored for multiple --grep patterns
grep: show --debug output only once
grep: teach --debug option to dump the parse tree
"git submodule" command DWIMs the command line and assumes a
unspecified action word for 'status' action. This is a UI mistake
that leads to a confusing behaviour. A mistyped command name is
instead treated as a request for 'status' of the submodule with that
name, e.g.
$ git submodule show
error: pathspec 'show' did not match any file(s) known to git.
Did you forget to 'git add'?
Stop DWIMming an unknown or mistyped subcommand name as pathspec
given to unspelled "status" subcommand. "git submodule" without any
argument is still interpreted as "git submodule status", but its
value is questionable.
Adjust t7400 to match, and stop advertising the default subcommand
being 'status' which does not help much in practice, other than
promoting laziness and confusion.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even during a conflicted merge, "git blame $path" always meant to
blame uncommitted changes to the "working tree" version; make it
more useful by showing cleanly merged parts as coming from the other
branch that is being merged.
This incidentally fixes an unrelated problem on a case insensitive
filesystem, where "git blame MAKEFILE" run in a history that has
"Makefile" but not "MAKEFILE" did not say "No such file MAKEFILE in
HEAD" but pretended as if "MAKEFILE" was a newly added file.
* jc/maint-blame-no-such-path:
blame: allow "blame file" in the middle of a conflicted merge
blame $path: avoid getting fooled by case insensitive filesystems
"git fetch --all", when passed "--no-tags", did not honor the
"--no-tags" option while fetching from individual remotes (the same
issue existed with "--tags", but combination "--all --tags" makes
much less sense than "--all --no-tags").
* dj/fetch-all-tags:
fetch --all: pass --tags/--no-tags through to each remote
submodule: use argv_array instead of hand-building arrays
fetch: use argv_array instead of hand-building arrays
argv-array: fix bogus cast when freeing array
argv-array: add pop function
Some HTTP servers try to use gzip compression on the /info/refs
request to save transfer bandwidth. Repositories with many tags
may find the /info/refs request can be gzipped to be 50% of the
original size due to the few but often repeated bytes used (hex
SHA-1 and commonly digits in tag names).
For most HTTP requests enable "Accept-Encoding: gzip" ensuring
the /info/refs payload can use this encoding format.
Only request gzip encoding from servers. Although deflate is
supported by libcurl, most servers have standardized on gzip
encoding for compression as that is what most browsers support.
Asking for deflate increases request sizes by a few bytes, but is
unlikely to ever be used by a server.
Disable the Accept-Encoding header on probe RPCs as response bodies
are supposed to be exactly 4 bytes long, "0000". The HTTP headers
requesting and indicating compression use more space than the data
transferred in the body.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"Content-type: text/plain; charset=UTF-8" header should not appear
twice in the input, but it is always better to gracefully deal with
such a case. The current code concatenates the value to the values
we have seen previously, producing nonsense such as "utf8UTF-8".
Instead of concatenating, forget the previous value and use the last
value we see.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code used to have a bug that ignores "--all-match", that requires
all "--grep" to have matched, when "--author" or "--committer" was used.
Make sure the bug will not be reintroduced.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are tests for this interaction already. Restructure slightly and
avoid any claims about --all-match.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--all-match" option is about "--grep", and does not affect how
"--author" or "--committer" limitation is applied.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The log --grep tests generate the expected out in different ways.
Make them all use command blocks so that subshells are avoided and the
expected output is easier to grasp visually.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mz/cherry-pick-cmdline-order:
cherry-pick/revert: respect order of revisions to pick
demonstrate broken 'git cherry-pick three one two'
teach log --no-walk=unsorted, which avoids sorting
"git apply -p0" did not parse pathnames on "diff --git" line
correctly. This caused patches that had pathnames in no other
places to be mistakenly rejected (most notably, binary patch that
does not rename nor change mode). Textual patches, renames or mode
changes have preimage and postimage pathnames in different places in
a form that can be parsed unambiguously and did not suffer from this
problem.
* jc/apply-binary-p0:
apply: compute patch->def_name correctly under -p0
"git log .." errored out saying it is both rev range and a path when
there is no disambiguating "--" is on the command line. Update the
command line parser to interpret ".." as a path in such a case.
* jc/dotdot-is-parent-directory:
specifying ranges: we did not mean to make ".." an empty set
Pushing to smart HTTP server with recent Git fails without having
the username in the URL to force authentication, if the server is
configured to allow GET anonymously, while requiring authentication
for POST.
* jk/maint-http-half-auth-push:
http: prompt for credentials on failed POST
http: factor out http error code handling
t: test http access to "half-auth" repositories
t: test basic smart-http authentication
t/lib-httpd: recognize */smart/* repos as smart-http
t/lib-httpd: only route auth/dumb to dumb repos
t5550: factor out http auth setup
t5550: put auth-required repo in auth/dumb
"git for-each-ref" did not honor multiple "--sort=<key>" arguments
correctly.
* kk/maint-for-each-ref-multi-sort:
for-each-ref: Fix sort with multiple keys
t6300: test sort with multiple keys
"git blame file" has always meant "find the origin of each line of
the file in the history leading to HEAD, oh by the way, blame the
lines that are modified locally to the working tree".
This teaches "git blame" that during a conflicted merge, some
uncommitted changes may have come from the other history that is
being merged.
The verify_working_tree_path() function introduced in the previous
patch to notice a typo in the filename (primarily on case insensitive
filesystems) has been updated to allow a filename that does not exist
in HEAD (i.e. the tip of our history) as long as it exists one of the
commits being merged, so that a "we deleted, the other side modified"
case tracks the history of the file in the history of the other side.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/test-prereq:
t3910: use the UTF8_NFD_TO_NFC test prereq
test-lib: provide UTF8 behaviour as a prerequisite
t0050: use the SYMLINKS test prereq
t0050: use the CASE_INSENSITIVE_FS test prereq
test-lib: provide case insensitivity as a prerequisite
test: allow prerequisite to be evaluated lazily
test: rename $satisfied to $satisfied_prereq
"git blame MAKEFILE" run in a history that has "Makefile" but not
MAKEFILE can get confused on a case insensitive filesystem, because
the check we run to see if there is a corresponding file in the
working tree with lstat("MAKEFILE") succeeds. In addition to that
check, we have to make sure that the given path also exists in the
commit we start digging history from (i.e. "HEAD").
Note that this reveals the breakage in a test added in cd8ae20
(git-blame shouldn't crash if run in an unmerged tree, 2007-10-18),
which expects the entire merge-in-progress path to be blamed to the
working tree when it did not exist in our tree. As it is clear in
the log message of that commit, the old breakage was that it was
causing an internal error and the fix was about avoiding it.
Just check that the command does not die an uncontrolled death. For
this particular case, the blame should fail, as the history for the
file in that contents has not been committed yet at the point in the
test.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.7.11:
Almost 1.7.11.6
gitweb: URL-decode $my_url/$my_uri when stripping PATH_INFO
rebase -i: use full onto sha1 in reflog
sh-setup: protect from exported IFS
receive-pack: do not leak output from auto-gc to standard output
t/t5400: demonstrate breakage caused by informational message from prune
setup: clarify error messages for file/revisions ambiguity
send-email: improve RFC2047 quote parsing
fsck: detect null sha1 in tree entries
do not write null sha1s to on-disk index
diff: do not use null sha1 as a sentinel value
When "git push" triggered the automatic gc on the receiving end, a
message from "git prune" that said it was removing cruft leaked to
the standard output, breaking the communication protocol.
* bc/receive-pack-stdout-protection:
receive-pack: do not leak output from auto-gc to standard output
t/t5400: demonstrate breakage caused by informational message from prune
"git diff" had a confusion between taking data from a path in the
working tree and taking data from an object that happens to have
name 0{40} recorded in a tree.
* jk/maint-null-in-trees:
fsck: detect null sha1 in tree entries
do not write null sha1s to on-disk index
diff: do not use null sha1 as a sentinel value
"git send-email" did not unquote encoded words that appear on the
header correctly, and lost "_" from strings.
* tr/maint-send-email-2047:
send-email: improve RFC2047 quote parsing
When fetch is invoked with --all, we need to pass the tag-following
preference to each individual fetch; without this, we will always
auto-follow tags, preventing us from fetching the remote tags into a
remote-specific namespace, for example.
Reported-by: Oswald Buddenhagen <ossi@kde.org>
Signed-off-by: Dan Johnson <ComputerDruid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All remote subcommands are spelled out words except 'rm'. 'rm', being a
popular UNIX command name, may mislead users that there are also 'ls' or
'mv'. Use 'remove' to fit with the rest of subcommands.
'rm' is still supported and used in the test suite. It's just not
widely advertised.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When giving multiple individual revisions to cherry-pick or revert, as
in 'git cherry-pick A B' or 'git revert B A', one would expect them to
be picked/reverted in the order given on the command line. They are
instead ordered by their commit timestamp -- in chronological order
for "cherry-pick" and in reverse chronological order for
"revert". This matches the order in which one would usually give them
on the command line, making this bug somewhat hard to notice. Still,
it has been reported at least once before [1].
It seems like the chronological sorting happened by accident because
the revision walker has traditionally always sorted commits in reverse
chronological order when rev_info.no_walk was enabled. In the case of
'git revert B A' where B is newer than A, this sorting is a no-op. For
'git cherry-pick A B', the sorting would reverse the arguments, but
because the sequencer also flips the rev_info.reverse flag when
picking (as opposed to reverting), the end result is a chronological
order. The rev_info.reverse flag was probably flipped so that the
revision walker emits B before C in 'git cherry-pick A..C'; that it
happened to effectively undo the unexpected sorting done when not
walking, was probably a coincidence that allowed this bug to happen at
all.
Fix the bug by telling the revision walker not to sort the commits
when not walking. The only case we want to reverse the order is now
when cherry-picking and walking revisions (rev_info.no_walk = 0).
[1] http://thread.gmane.org/gmane.comp.version-control.git/164794
Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cherry-picking commits out of order (w.r.t. commit time stamp) doesn't
currently work. Add a test case to demonstrate it.
Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git log' is passed the --no-walk option, no revision walk takes
place, naturally. Perhaps somewhat surprisingly, however, the provided
revisions still get sorted by commit date. So e.g 'git log --no-walk
HEAD HEAD~1' and 'git log --no-walk HEAD~1 HEAD' give the same result
(unless the two revisions share the commit date, in which case they
will retain the order given on the command line). As the commit that
introduced --no-walk (8e64006 (Teach revision machinery about
--no-walk, 2007-07-24)) points out, the sorting is intentional, to
allow things like
git log --abbrev-commit --pretty=oneline --decorate --all --no-walk
to show all refs in order by commit date.
But there are also other cases where the sorting is not wanted, such
as
<command producing revisions in order> |
git log --oneline --no-walk --stdin
To accomodate both cases, leave the decision of whether or not to sort
up to the caller, by allowing --no-walk={sorted,unsorted}, defaulting
to 'sorted' for backward-compatibility reasons.
Signed-off-by: Martin von Zweigbergk <martinvonz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally the "--quiet" option was parsed by the
diff-option parser into the internal QUICK option. This had
the effect of silencing diff output from the log (which was
not intended, but happened to work and people started to
use it). But it also had other odd side effects at the diff
level (for example, it would suppress the second commit in
"git show A B").
To fix this, commit 1c40c36 converted log to parse-options
and handled the "quiet" option separately, not passing it
on to the diff code. However, it simply ignored the option,
which was a regression for people using it as a synonym for
"-s". Commit 01771a8 then fixed that by interpreting the
option to add DIFF_FORMAT_NO_OUTPUT to the list of output
formats.
However, that commit did not fix it in all cases. It sets
the flag after setup_revisions is called. Naively, this
makes sense because you would expect the setup_revisions
parser to overwrite our output format flag if "-p" or
another output format flag is seen.
However, that is not how the NO_OUTPUT flag works. We
actually store it in the bit-field as just another format.
At the end of setup_revisions, we call diff_setup_done,
which post-processes the bitfield and clears any other
formats if we have set NO_OUTPUT. By setting the flag after
setup_revisions is done, diff_setup_done does not have a
chance to make this tweak, and we end up with other format
options still set.
As a result, the flag would have no effect in "git log -p
--quiet" or "git show --quiet". Fix it by setting the
format flag before the call to setup_revisions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the smart-http GET requests go through the http_get_*
functions, which will prompt for credentials and retry if we
see an HTTP 401.
POST requests, however, do not go through any central point.
Moreover, it is difficult to retry in the general case; we
cannot assume the request body fits in memory or is even
seekable, and we don't know how much of it was consumed
during the attempt.
Most of the time, this is not a big deal; for both fetching
and pushing, we make a GET request before doing any POSTs,
so typically we figure out the credentials during the first
request, then reuse them during the POST. However, some
servers may allow a client to get the list of refs from
receive-pack without authentication, and then require
authentication when the client actually tries to POST the
pack.
This is not ideal, as the client may do a non-trivial amount
of work to generate the pack (e.g., delta-compressing
objects). However, for a long time it has been the
recommended example configuration in git-http-backend(1) for
setting up a repository with anonymous fetch and
authenticated push. This setup has always been broken
without putting a username into the URL. Prior to commit
986bbc0, it did work with a username in the URL, because git
would prompt for credentials before making any requests at
all. However, post-986bbc0, it is totally broken. Since it
has been advertised in the manpage for some time, we should
make sure it works.
Unfortunately, it is not as easy as simply calling post_rpc
again when it fails, due to the input issue mentioned above.
However, we can still make this specific case work by
retrying in two specific instances:
1. If the request is large (bigger than LARGE_PACKET_MAX),
we will first send a probe request with a single flush
packet. Since this request is static, we can freely
retry it.
2. If the request is small and we are not using gzip, then
we have the whole thing in-core, and we can freely
retry.
That means we will not retry in some instances, including:
1. If we are using gzip. However, we only do so when
calling git-upload-pack, so it does not apply to
pushes.
2. If we have a large request, the probe succeeds, but
then the real POST wants authentication. This is an
extremely unlikely configuration and not worth worrying
about.
While it might be nice to cover those instances, doing so
would be significantly more complex for very little
real-world gain. In the long run, we will be much better off
when curl learns to internally handle authentication as a
callback, and we can cleanly handle all cases that way.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some sites set up http access to repositories such that
fetching is anonymous and unauthenticated, but pushing is
authenticated. While there are multiple ways to do this, the
technique advertised in the git-http-backend manpage is to
block access to locations matching "/git-receive-pack$".
Let's emulate that advice in our test setup, which makes it
clear that this advice does not actually work.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not currently test authentication over smart-http at
all. In theory, it should work exactly as it does for dumb
http (which we do test). It does indeed work for these
simple tests, but this patch lays the groundwork for more
complex tests in future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>