* es/rebase-i-no-abbrev:
rebase -i: fix short SHA-1 collision
t3404: rebase -i: demonstrate short SHA-1 collision
t3404: make tests more self-contained
Conflicts:
t/t3404-rebase-interactive.sh
- Don't start tests with 'test $? = 0' to catch preparation done
outside the test_expect_success block.
- Move writing the bogus patch and the expected output into the
appropriate test_expect_success blocks.
- Use the test_must_fail helper instead of manually checking for
non-zero exit code.
- Use the debug-friendly test_path_is_file helper instead of 'test -f'.
- No space after '>'.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A range notation "A..B" means exactly the same thing as what "^A B"
means, i.e. the set of commits that are reachable from B but not
from A. But the internal representation after the revision parser
parsed these two notations are subtly different.
- "rev-list ^A B" leaves A and B in the revs->pending.objects[]
array, with the former marked as UNINTERESTING and the revision
traversal machinery propagates the mark to underlying commit
objects A^0 and B^0.
- "rev-list A..B" peels tags and leaves A^0 (marked as
UNINTERESTING) and B^0 in revs->pending.objects[] array before
the traversal machinery kicks in.
This difference usually does not matter, but starts to matter when
the --objects option is used. For example, we see this:
$ git rev-list --objects v1.8.4^1..v1.8.4 | grep $(git rev-parse v1.8.4)
$ git rev-list --objects v1.8.4 ^v1.8.4^1 | grep $(git rev-parse v1.8.4)
04f013dc38d7512eadb915eba22efc414f18b869 v1.8.4
With the former invocation, the revision traversal machinery never
hears about the tag v1.8.4 (it only sees the result of peeling it,
i.e. the commit v1.8.4^0), and the tag itself does not appear in the
output. The latter does send the tag object itself to the output.
Make the range notation keep the unpeeled objects and feed them to
the traversal machinery to fix this inconsistency.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-prune-packed operates on GIT_OBJECT_DIRECTORY, not
GIT_OBJECT_DIR.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The '+=' operator is not supported by old Bash versions (3.0) we still
care about.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split_ident currently parses left to right. Given this
input:
Your Name <email@example.com> 123456789 -0500\n
We assume the name starts the line and runs until the first
"<". That starts the email address, which runs until the
first ">". Everything after that is assumed to be the
timestamp.
This works fine in the normal case, but is easily broken by
corrupted ident lines that contain an extra ">". Some
examples seen in the wild are:
1. Name <email>-<> 123456789 -0500\n
2. Name <email> <Name<email>> 123456789 -0500\n
3. Name1 <email1>, Name2 <email2> 123456789 -0500\n
Currently each of these produces some email address (which
is not necessarily the one the user intended) and end up
with a NULL date (which is generally interpreted as the
epoch by "git log" and friends).
But in each case we could get the correct timestamp simply
by parsing from the right-hand side, looking backwards for
the final ">", and then reading the timestamp from there.
In general, it's a losing battle to try to automatically
guess what the user meant with their broken crud. But this
particular workaround is probably worth doing. One, it's
dirt simple, and can't impact non-broken cases. Two, it
doesn't catch a single breakage we've seen, but rather a
large class of errors (i.e., any breakage inside the email
angle brackets may affect the email, but won't spill over
into the timestamp parsing). And three, the timestamp is
arguably more valuable to get right, because it can affect
correctness (e.g., in --until cutoffs).
This patch implements the right-to-left scheme described
above. We adjust the tests in t4212, which generate a commit
with such a broken ident, and now gets the timestamp right.
We also add a test that fsck continues to detect the
breakage.
For reference, here are pointers to the breakages seen (as
numbered above):
[1] http://article.gmane.org/gmane.comp.version-control.git/221441
[2] http://article.gmane.org/gmane.comp.version-control.git/222362
[3] http://perl5.git.perl.org/perl.git/commit/13b79730adea97e660de84bbe67f9d7cbe344302
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For efficiency and security reasons, an earlier commit in
this series taught http_get_* to re-write the base url based
on redirections we saw while making a specific request.
This commit wires that option into the info/refs request,
meaning that a redirect from
http://example.com/foo.git/info/refs
to
https://example.com/bar.git/info/refs
will behave as if "https://example.com/bar.git" had been
provided to git in the first place.
The tests bear some explanation. We introduce two new
hierearchies into the httpd test config:
1. Requests to /smart-redir-limited will work only for the
initial info/refs request, but not any subsequent
requests. As a result, we can confirm whether the
client is re-rooting its requests after the initial
contact, since otherwise it will fail (it will ask for
"repo.git/git-upload-pack", which is not redirected).
2. Requests to smart-redir-auth will redirect, and require
auth after the redirection. Since we are using the
redirected base for further requests, we also update
the credential struct, in order not to mislead the user
(or credential helpers) about which credential is
needed. We can therefore check the GIT_ASKPASS prompts
to make sure we are prompting for the new location.
Because we have neither multiple servers nor https
support in our test setup, we can only redirect between
paths, meaning we need to turn on
credential.useHttpPath to see the difference.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
We use a strbuf to generate the string containing the remote
URL, but then detach it to a bare pointer. This makes it
harder to later manipulate the URL, as we have forgotten the
length (and the allocation semantics are not as clear).
Let's instead keep the strbuf around. As a bonus, this
eliminates a confusing double-use of the "buf" strbuf in
main(). Prior to this, it was used both for constructing the
url, and for reading commands from stdin.
The downside is that we have to update each call site to
refer to "url.buf" rather than just "url" when they want the
C string.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
In the discover_refs function, we use a strbuf named
"buffer" for multiple purposes. First we build the info/refs
URL in it, and then detach that to a bare pointer. Then, we
use the same strbuf to store the result of fetching the
refs.
Let's instead keep a separate refs_url strbuf. This is less
confusing, as the "buffer" strbuf is now used for only one
thing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
If a caller asks the http_get_* functions to go to a
particular URL and we end up elsewhere due to a redirect,
the effective_url field can tell us where we went.
It would be nice to remember this redirect and short-cut
further requests for two reasons:
1. It's more efficient. Otherwise we spend an extra http
round-trip to the server for each subsequent request,
just to get redirected.
2. If we end up with an http 401 and are going to ask for
credentials, it is to feed them to the redirect target.
If the redirect is an http->https upgrade, this means
our credentials may be provided on the http leg, just
to end up redirected to https. And if the redirect
crosses server boundaries, then curl will drop the
credentials entirely as it follows the redirect.
However, it, it is not enough to simply record the effective
URL we saw and use that for subsequent requests. We were
originally fed a "base" url like:
http://example.com/foo.git
and we want to figure out what the new base is, even though
the URLs we see may be:
original: http://example.com/foo.git/info/refs
effective: http://example.com/bar.git/info/refs
Subsequent requests will not be for "info/refs", but for
other paths relative to the base. We must ask the caller to
pass in the original base, and we must pass the redirected
base back to the caller (so that it can generate more URLs
from it). Furthermore, we need to feed the new base to the
credential code, so that requests to credential helpers (or
to the user) match the URL we will be requesting.
This patch teaches http_request_reauth to do this munging.
Since it is the caller who cares about making more URLs, it
seems at first glance that callers could simply check
effective_url themselves and handle it. However, since we
need to update the credential struct before the second
re-auth request, we have to do it inside http_request_reauth.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
When we ask curl to access a URL, it may follow one or more
redirects to reach the final location. We have no idea
this has happened, as curl takes care of the details and
simply returns the final content to us.
The final URL that we ended up with can be accessed via
CURLINFO_EFFECTIVE_URL. Let's make that optionally available
to callers of http_get_*, so that they can make further
decisions based on the redirection.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
When we are handling a curl response code in http_request or
in the remote-curl RPC code, we use the handle_curl_result
helper to translate curl's response into an easy-to-use
code. When we see an HTTP 401, we do one of two things:
1. If we already had a filled-in credential, we mark it as
rejected, and then return HTTP_NOAUTH to indicate to
the caller that we failed.
2. If we didn't, then we ask for a new credential and tell
the caller HTTP_REAUTH to indicate that they may want
to try again.
Rejecting in the first case makes sense; it is the natural
result of the request we just made. However, prompting for
more credentials in the second step does not always make
sense. We do not know for sure that the caller is going to
make a second request, and nor are we sure that it will be
to the same URL. Logically, the prompt belongs not to the
request we just finished, but to the request we are (maybe)
about to make.
In practice, it is very hard to trigger any bad behavior.
Currently, if we make a second request, it will always be to
the same URL (even in the face of redirects, because curl
handles the redirects internally). And we almost always
retry on HTTP_REAUTH these days. The one exception is if we
are streaming a large RPC request to the server (e.g., a
pushed packfile), in which case we cannot restart. It's
extremely unlikely to see a 401 response at this stage,
though, as we would typically have seen it when we sent a
probe request, before streaming the data.
This patch drops the automatic prompt out of case 2, and
instead requires the caller to do it. This is a few extra
lines of code, and the bug it fixes is unlikely to come up
in practice. But it is conceptually cleaner, and paves the
way for better handling of credentials across redirects.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Since 920b691 (clone: refuse to clone if --branch
points to bogus ref) we refuse to clone with option
"-b" if the specified branch does not exist in the
(non-empty) upstream. If the upstream repository is empty,
the branch doesn't exist, either. So refuse the clone too.
Reported-by: Robert Mitwicki <robert.mitwicki@opensoftware.pl>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Over time, the http_get_strbuf function has grown several
optional parameters. We now have a bitfield with multiple
boolean options, as well as an optional strbuf for returning
the content-type of the response. And a future patch in this
series is going to add another strbuf option.
Treating these as separate arguments has a few downsides:
1. Most call sites need to add extra NULLs and 0s for the
options they aren't interested in.
2. The http_get_* functions are actually wrappers around
2 layers of low-level implementation functions. We have
to pass these options through individually.
3. The http_get_strbuf wrapper learned these options, but
nobody bothered to do so for http_get_file, even though
it is backed by the same function that does understand
the options.
Let's consolidate the options into a single struct. For the
common case of the default options, we'll allow callers to
simply pass a NULL for the options struct.
The resulting code is often a few lines longer, but it ends
up being easier to read (and to change as we add new
options, since we do not need to update each call site).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
When we retrieve the content-type of an http response, curl
gives us a pointer to internal storage, which we then copy
into a strbuf. Let's factor out the get-and-copy routine,
which can be used for getting other curl info.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Besides being ugly, the extra parentheses are idiomatic for
suppressing compiler warnings when we are assigning within a
conditional. We aren't doing that here, and they just
confuse the reader.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
The "diff-algorithm" option to the recursive merge strategy takes the
name of the algorithm as an option, but it uses strcmp on the option
string to check if it starts with "diff-algorithm=", meaning that this
options cannot actually be used.
Fix this by switching to prefixcmp. At the same time, clarify the
following line by using strlen instead of a hard-coded length, which
also makes it consistent with nearby code.
Reported-by: Luke Noel-Storr <luke.noel-storr@integrate.co.uk>
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
* km/svn-1.8-serf-only:
Git.pm: revert _temp_cache use of temp_is_locked
git-svn: allow git-svn fetching to work using serf
Git.pm: add new temp_is_locked function
Mediawiki introduces a new API for queries w/ more than 500 results in
version 1.21. That change triggered an infinite loop while cloning a
mediawiki with such a page.
The latest API renamed and moved the "continuing" information in the
response, necessary to build the next query. The code failed to retrieve
that information but still detected that it was in a "continuing
query". As a result, it launched the same query over and over again.
If a "continuing" information is detected in the response (old or new),
the next query is updated accordingly. If not, we quit assuming it's not
a continuing query.
Reported-by: Benjamin Cathey
Signed-off-by: Benoit Person <benoit.person@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Commit a908047 taught format-patch the "--from" option,
which places the author ident into an in-body from header,
and uses the committer ident in the rfc822 from header. The
documentation claims that it will omit the in-body header
when it is the same as the rfc822 header, but the code never
implemented that behavior.
This patch completes the feature by comparing the two idents
and doing nothing when they are the same (this is the same
as simply omitting the in-body header, as the two are by
definition indistinguishable in this case). This makes it
reasonable to turn on "--from" all the time (if it matches
your particular workflow), rather than only using it when
exporting other people's patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of git's traversals are robust against minor breakages
in commit data. For example, "git log" will still output an
entry for a commit that has a broken encoding or missing
author, and will not abort the whole operation.
Shortlog, on the other hand, will die as soon as it sees a
commit without an author, meaning that a repository with
a broken commit cannot get any shortlog output at all.
Let's downgrade this fatal error to a warning, and continue
the operation.
We simply ignore the commit and do not count it in the total
(since we do not have any author under which to file it).
Alternatively, we could output some kind of "<empty>" record
to collect these bogus commits. It is probably not worth it,
though; we have already warned to stderr, so the user is
aware that such bogosities exist, and any placeholder we
came up with would either be syntactically invalid, or would
potentially conflict with real data.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A clone will always create a transport struct, whether we
are cloning locally or using an actual protocol. In the
local case, we only use the transport to get the list of
refs, and then transfer the objects out-of-band.
However, there are many options that we do not bother
setting up in the local case. For the most part, these are
noops, because they only affect the object-fetching stage
(e.g., the --depth option). However, some options do have a
visible impact. For example, giving the path to upload-pack
via "-u" does not currently work for a local clone, even
though we need upload-pack to get the ref list.
We can just drop the conditional entirely and set these
options for both local and non-local clones. Rather than
keep track of which options impact the object versus the ref
fetching stage, we can simply let the noops be noops (and
the cost of setting the options in the first place is not
high).
The one exception is that we also check that the transport
provides both a "get_refs_list" and a "fetch" method. We
will now be checking the former for both cases (which is
good, since a transport that cannot fetch refs would not
work for a local clone), and we tweak the conditional to
check for a "fetch" only when we are non-local.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When stderr does not point to a tty, we typically suppress
"we are now in this phase" progress reporting (e.g., we ask
the server not to send us "counting objects" and the like).
The new "checking connectivity" message is in the same vein,
and should be suppressed. Since clone relies on the
transport code to make the decision, we can simply sneak a
peek at the "progress" field of the transport struct. That
properly takes into account both the verbosity and progress
options we were given, as well as the result of isatty().
Note that we do not set up that progress flag for a local
clone, as we do not fetch using the transport at all. That's
acceptable here, though, because we also do not perform a
connectivity check in that case.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Putting messages like "Cloning into.." and "done" on stdout
is un-Unix and uselessly clutters the stdout channel. Send
them to stderr.
We have to tweak two tests to accommodate this:
1. t5601 checks for doubled output due to forking, and
doesn't actually care where the output goes; adjust it
to check stderr.
2. t5702 is trying to test whether progress output was
sent to stderr, but naively does so by checking
whether stderr produced any output. Instead, have it
look for "%", a token found in progress output but not
elsewhere (and which lets us avoid hard-coding the
progress text in the test).
This should not regress any scripts that try to parse the
current output, as the output is already internationalized
and therefore unstable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some people still use rather old versions of bash, which cannot grok
some constructs like 'printf -v varname' the prompt and completion
code started to use recently.
* bc/completion-for-bash-3.0:
contrib/git-prompt.sh: handle missing 'printf -v' more gracefully
t9902-completion.sh: old Bash still does not support array+=('') notation
git-completion.bash: use correct Bash/Zsh array length syntax
Fixes a minor bug in "git rebase -i" (there could be others, as the
root cause is pretty generic) where the code feeds a random, data
dependeant string to 'echo' and expects it to come out literally.
* mm/no-shell-escape-in-die-message:
die_with_status: use "printf '%s\n'", not "echo"
Output from "git log --full-diff -- <pathspec>" looked strange,
because comparison was done with the previous ancestor that touched
the specified <pathspec>, causing the patches for paths outside the
pathspec to show more than the single commit has changed.
* tr/log-full-diff-keep-true-parents:
log: use true parents for diff when walking reflogs
log: use true parents for diff even when rewriting
The auto-tag-following code in "git fetch" tries to reuse the same
transport twice when the serving end does not cooperate and does
not give tags that point to commits that are asked for as part of
the primary transfer. Unfortunately, Git-aware transport helper
interface is not designed to be used more than once, hence this
does not work over smart-http transfer.
* jc/transport-do-not-use-connect-twice-in-fetch:
builtin/fetch.c: Fix a sparse warning
fetch: work around "transport-take-over" hack
fetch: refactor code that fetches leftover tags
fetch: refactor code that prepares a transport
fetch: rename file-scope global "transport" to "gtransport"
t5802: add test for connect helper
Send a large request to read(2)/write(2) as a smaller but still
reasonably large chunks, which would improve the latency when the
operation needs to be killed and incidentally works around broken
64-bit systems that cannot take a 2GB write or read in one go.
* sp/clip-read-write-to-8mb:
Revert "compat/clipped-write.c: large write(2) fails on Mac OS X/XNU"
xread, xwrite: limit size of IO to 8MB
By doing this, clients of upload-pack can now reliably tell what ref
a symbolic ref points at; the updated test in t5505 used to expect
failure due to the ambiguity and made sure we give diagnostics, but
we no longer need to be so pessimistic. Make sure we correctly learn
which branch HEAD points at from the other side instead.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the same mechanism as used to tell where "HEAD" points at to
the other end, we can tell the target of other symbolic refs as
well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One long-standing flaw in the pack transfer protocol was that there
was no way to tell the other end which branch "HEAD" points at.
With a capability "symref=HEAD:refs/heads/master", let the sender to
tell the receiver what symbolic ref points at what ref.
This capability can be repeated more than once to represent symbolic
refs other than HEAD, such as "refs/remotes/origin/HEAD").
Add an infrastructure to collect symbolic refs, format them as extra
capabilities and put it on the wire. For now, just send information
on the "HEAD" and nothing else.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callee does not use cb_data, and the caller is an intermediate
function in a callchain that later wants to use the cb_data for its
own use. Clarify the code by breaking the dataflow explicitly by
not passing cb_data down to mark_our_ref().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When two or more branches point at the same commit and HEAD is
pointing at one of them, without the symref extension, there is no
way to remotely tell which one of these branches HEAD points at.
The test in question attempts to make sure that this situation is
diagnosed and results in a failure.
However, even if there _were_ a way to reliably tell which branch
the HEAD points at, "set-head --auto" would fail if there is no
remote tracking branch. Make sure that this test does not fail
for that "wrong" reason.
Signed-off-by: Junio C Hamano <gitster@pobox.com>