The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.
But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros). But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.
Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).
Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.
Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Git server end's ability to accept Accept-Language header was introduced
in f18604bbf2 (http: add Accept-Language header if possible, 2015-01-28),
but this is only used by very early phase of the transfer, which is HTTP
GET request to discover references. For other phases, like POST request
in the smart HTTP, the server does not know what language the client
speaks.
Teach git client to learn end-user's preferred language and throw
accept-language header to the server side. Once the server gets this header,
it has the ability to talk to end-user with language they understand.
This would be very helpful for many non-English speakers.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This method will be used in an upcoming extension of git-remote-curl to
download a single file over HTTP(S) by request.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the error shown when a http.pinnedPubKey doesn't match to point
the http.pinnedPubKey variable added in aeff8a6121 (http: implement
public key pinning, 2016-02-15), e.g.:
git -c http.pinnedPubKey=sha256/someNonMatchingKey ls-remote https://github.com/git/git.git
fatal: unable to access 'https://github.com/git/git.git/' with http.pinnedPubkey configuration: SSL: public key does not match pinned public key!
Before this we'd emit the exact same thing without the " with
http.pinnedPubkey configuration". The advantage of doing this is that
we're going to get a translated message (everything after the ":" is
hardcoded in English in libcurl), and we've got a reference to the
git-specific configuration variable that's causing the error.
Unfortunately we can't test this easily, as there are no tests that
require https:// in the test suite, and t/lib-httpd.sh doesn't know
how to set up such tests. See [1] for the start of a discussion about
what it would take to have divergent "t/lib-httpd/apache.conf" test
setups. #leftoverbits
1. https://lore.kernel.org/git/YUonS1uoZlZEt+Yd@coredump.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the conditional use of CURLAUTH_DIGEST_IE and
CURLOPT_USE_SSL. These two have been split from earlier simpler checks
against LIBCURL_VERSION_NUM for ease of review.
According to
https://github.com/curl/curl/blob/master/docs/libcurl/symbols-in-versions
the CURLAUTH_DIGEST_IE flag became available in 7.19.3, and
CURLOPT_USE_SSL in 7.17.0.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the last commit we dropped support for curl < 7.16.0, let's
continue that and drop support for versions older than 7.19.3. This
allows us to simplify the code by getting rid of some "#ifdef"'s.
Git was broken with vanilla curl < 7.19.4 from v2.12.0 until
v2.15.0. Compiling with it was broken by using CURLPROTO_* outside any
"#ifdef" in aeae4db174 (http: create function to get curl allowed
protocols, 2016-12-14), and fixed in v2.15.0 in f18777ba6e (http: fix
handling of missing CURLPROTO_*, 2017-08-11).
It's unclear how much anyone was impacted by that in practice, since
as noted in [1] RHEL versions using curl older than that still
compiled, because RedHat backported some features. Perhaps other
vendors did the same.
Still, it's one datapoint indicating that it wasn't in active use at
the time. That (the v2.12.0 release) was in Feb 24, 2017, with v2.15.0
on Oct 30, 2017, it's now mid-2021.
1. http://lore.kernel.org/git/c8a2716d-76ac-735c-57f9-175ca3acbcb0@jupiterrise.com;
followed-up by f18777ba6e (http: fix handling of missing CURLPROTO_*,
2017-08-11)
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the last commit we dropped support for curl < 7.11.1, let's
continue that and drop support for versions older than 7.16.0. This
allows us to get rid of some now-obsolete #ifdefs.
Choosing 7.16.0 is a somewhat arbitrary cutoff:
1. It came out in October of 2006, almost 15 years ago.
Besides being a nice round number, around 10 years is
a common end-of-life support period, even for conservative
distributions.
2. That version introduced the curl_multi interface, which
gives us a lot of bang for the buck in removing #ifdefs
RHEL 5 came with curl 7.15.5[1] (released in August 2006). RHEL 5's
extended life cycle program ended on 2020-11-30[1]. RHEL 6 comes with
curl 7.19.7 (released in November 2009), and RHEL 7 comes with
7.29.0 (released in February 2013).
1. http://lore.kernel.org/git/873e1f31-2a96-5b72-2f20-a5816cad1b51@jupiterrise.com
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Drop support for this ancient version of curl and simplify the code by
allowing us get rid of some "#ifdef"'s.
Git will not build with vanilla curl older than 7.11.1 due our use of
CURLOPT_POSTFIELDSIZE in 37ee680d9b
(http.postbuffer: allow full range of ssize_t values,
2017-04-11). This field was introduced in curl 7.11.1.
We could solve these compilation problems with more #ifdefs,
but it's not worth the trouble. Version 7.11.1 came out in
March of 2004, over 17 years ago. Let's declare that too old
and drop any existing ifdefs that go further back. One
obvious benefit is that we'll have fewer conditional bits
cluttering the code.
This patch drops all #ifdefs that reference older versions
(note that curl's preprocessor macros are in hex, so we're
looking for 070b01, not 071101).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, when fetching, packfiles referenced by URIs are run through
index-pack without any arguments other than --stdin and --keep, no
matter what arguments are used for the packfile that is inline in the
fetch response. As a preparation for ensuring that all packs (whether
inline or not) use the same index-pack arguments, teach the http
subsystem to allow custom index-pack arguments.
http-fetch has been updated to use the new API. For now, it passes
--keep alone instead of --keep with a process ID, but this is only
temporary because http-fetch itself will be taught to accept index-pack
parameters (instead of using a hardcoded constant) in a subsequent
commit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "fetch/clone" protocol has been updated to allow the server to
instruct the clients to grab pre-packaged packfile(s) in addition
to the packed object data coming over the wire.
* jt/cdn-offload:
upload-pack: fix a sparse '0 as NULL pointer' warning
upload-pack: send part of packfile response as uri
fetch-pack: support more than one pack lockfile
upload-pack: refactor reading of pack-objects out
Documentation: add Packfile URIs design doc
Documentation: order protocol v2 sections
http-fetch: support fetching packfiles by URL
http-fetch: refactor into function
http: refactor finish_http_pack_request()
http: use --stdin when indexing dumb HTTP pack
Teach http-fetch the ability to download packfiles directly, given a
URL, and to verify them.
The http_pack_request suite has been augmented with a function that
takes a URL directly. With this function, the hash is only used to
determine the name of the temporary file.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
finish_http_pack_request() does multiple tasks, including some
housekeeping on a struct packed_git - (1) closing its index, (2)
removing it from a list, and (3) installing it. These concerns are
independent of fetching a pack through HTTP: they are there only because
(1) the calling code opens the pack's index before deciding to fetch it,
(2) the calling code maintains a list of packfiles that can be fetched,
and (3) the calling code fetches it in order to make use of its objects
in the same process.
In preparation for a subsequent commit, which adds a feature that does
not need any of this housekeeping, remove (1), (2), and (3) from
finish_http_pack_request(). (2) and (3) are now done by a helper
function, and (1) is the responsibility of the caller (in this patch,
done closer to the point where the pack index is opened).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if
GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting
CURLOPT_VERBOSE.
This is to prevent inadvertent revelation of sensitive data. In
particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header
nor any cookies specified by GIT_REDACT_COOKIES.
Unifying the tracing mechanism also has the future benefit that any
improvements to the tracing mechanism will benefit both users of
GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to
implement any improvement twice.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f0ed8226c9 (Add custom memory allocator to MinGW and MacOS builds, 2009-05-31)
never told cURL about it.
Correct that by using the cURL initializer available since version 7.12 to
point to xmalloc and friends for consistency which then will pass the
allocation requests along when USE_NED_ALLOCATOR=YesPlease is used (most
likely in Windows)
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mechanically and systematically drop "extern" from function
declarlation.
* dl/no-extern-in-func-decl:
*.[ch]: manually align parameter lists
*.[ch]: remove extern from function declarations using sed
*.[ch]: remove extern from function declarations using spatch
In previous patches, extern was mechanically removed from function
declarations without care to formatting, causing parameter lists to be
misaligned. Manually format changed sections such that the parameter
lists should be realigned.
Viewing this patch with 'git diff -w' should produce no output.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There has been a push to remove extern from function declarations.
Remove some instances of "extern" for function declarations which are
caught by Coccinelle. Note that Coccinelle has some difficulty with
processing functions with `__attribute__` or varargs so some `extern`
declarations are left behind to be dealt with in a future patch.
This was the Coccinelle patch used:
@@
type T;
identifier f;
@@
- extern
T f(...);
and it was run with:
$ git ls-files \*.{c,h} |
grep -v ^compat/ |
xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place
Files under `compat/` are intentionally excluded as some are directly
copied from external sources and we should avoid churning them as much
as possible.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Conversion from unsigned char[20] to struct object_id continues.
* bc/hash-transition-16: (35 commits)
gitweb: make hash size independent
Git.pm: make hash size independent
read-cache: read data in a hash-independent way
dir: make untracked cache extension hash size independent
builtin/difftool: use parse_oid_hex
refspec: make hash size independent
archive: convert struct archiver_args to object_id
builtin/get-tar-commit-id: make hash size independent
get-tar-commit-id: parse comment record
hash: add a function to lookup hash algorithm by length
remote-curl: make hash size independent
http: replace sha1_to_hex
http: compute hash of downloaded objects using the_hash_algo
http: replace hard-coded constant with the_hash_algo
http-walker: replace sha1_to_hex
http-push: remove remaining uses of sha1_to_hex
http-backend: allow 64-character hex names
http-push: convert to use the_hash_algo
builtin/pull: make hash-size independent
builtin/am: make hash size independent
...
We make some requests with CURLOPT_FAILONERROR and some without, and
then handle_curl_result() normalizes any failures to a uniform CURLcode.
There are some other code paths in the dumb-http walker which don't use
handle_curl_result(); let's pull the normalization into its own function
so it can be reused.
Arguably those code paths would benefit from the rest of
handle_curl_result(), notably the auth handling. But retro-fitting it
now would be a lot of work, and in practice it doesn't matter too much
(whatever authentication we needed to make the initial contact with the
server is generally sufficient for the rest of the dumb-http requests).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
curl stops parsing a response when it sees a bad HTTP status code and it
has CURLOPT_FAILONERROR set. This prevents GIT_CURL_VERBOSE to show HTTP
headers on error.
keep_error is an option to receive the HTTP response body for those
error responses. By enabling this option, curl will process the HTTP
response headers, and they're shown if GIT_CURL_VERBOSE is set.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The dumb-http walker code still passes around and stores object ids as
"unsigned char *sha1". Let's modernize it.
There's probably still more work to be done to handle dumb-http fetches
with a new, larger hash. But that can wait; this is enough that we can
now convert some of the low-level object routines that we call into from
here (and in fact, some of the "oid.hash" references added here will be
further improved in the next patch).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid unchecked snprintf() to make future code auditing easier.
* jk/snprintf-truncation:
fmt_with_err: add a comment that truncation is OK
shorten_unambiguous_ref: use xsnprintf
fsmonitor: use internal argv_array of struct child_process
log_write_email_headers: use strbufs
http: use strbufs instead of fixed buffers
We keep the names of incoming packs and objects in fixed
PATH_MAX-size buffers, and snprintf() into them. This is
unlikely to end up with truncated filenames, but it is
possible (especially on systems where PATH_MAX is shorter
than actual paths can be). Let's switch to using strbufs,
which makes the question go away entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a way for callers to request that extra headers be included when
making http requests.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unfortunately, in order to push some large repos where a server does
not support chunked encoding, the http postbuffer must sometimes
exceed two gigabytes. On a 64-bit system, this is OK: we just malloc
a larger buffer.
This means that we need to use CURLOPT_POSTFIELDSIZE_LARGE to set the
buffer size.
Signed-off-by: David Turner <dturner@twosigma.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Transport with dumb http can be fooled into following foreign URLs
that the end user does not intend to, especially with the server
side redirects and http-alternates mechanism, which can lead to
security issues. Tighten the redirection and make it more obvious
to the end user when it happens.
* jk/http-walker-limit-redirect-2.9:
http: treat http-alternates like redirects
http: make redirects more obvious
remote-curl: rename shadowed options variable
http: always update the base URL for redirects
http: simplify update_url_from_redirect
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.
Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:
1. Mallory may simply redirect all fetch requests to Bob's
server. Git will transparently follow those redirects
and fetch Bob's history, which Alice may believe she
got from Mallory. The subsequent push seems like it is
just feeding Mallory back her own objects, but is
actually leaking Bob's objects. There is nothing in
git's output to indicate that Bob's repository was
involved at all.
The downside (for Mallory) of this attack is that Alice
will have received Bob's entire repository, and is
likely to notice that when building on top of it.
2. If Mallory happens to know the sha1 of some object X in
Bob's repository, she can instead build her own history
that references that object. She then runs a dumb http
server, and Alice's client will fetch each object
individually. When it asks for X, Mallory redirects her
to Bob's server. The end result is that Alice obtains
objects from Bob, but they may be buried deep in
history. Alice is less likely to notice.
Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.
But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.
First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.
Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.
As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HTTP transport gained an option to produce more detailed debugging
trace.
* ep/http-curl-trace:
imap-send.c: introduce the GIT_TRACE_CURL enviroment variable
http.c: implement the GIT_TRACE_CURL environment variable
Implement the GIT_TRACE_CURL environment variable to allow a
greater degree of detail of GIT_CURL_VERBOSE, in particular
the complete transport header and all the data payload exchanged.
It might be useful if a particular situation could require a more
thorough debugging analysis. Document the new GIT_TRACE_CURL
environment variable.
Helped-by: Torsten Bögershausen <tboegi@web.de>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We introduce a way to send custom HTTP headers with all requests.
This allows us, for example, to send an extra token from build agents
for temporary access to private repositories. (This is the use case that
triggered this patch.)
This feature can be used like this:
git -c http.extraheader='Secret: sssh!' fetch $URL $REF
Note that `curl_easy_setopt(..., CURLOPT_HTTPHEADER, ...)` takes only
a single list, overriding any previous call. This means we have to
collect _all_ of the headers we want to use into a single list, and
feed it to cURL in one shot. Since we already unconditionally set a
"pragma" header when initializing the curl handles, we can add our new
headers to that list.
For callers which override the default header list (like probe_rpc),
we provide `http_copy_default_headers()` so they can do the same
trick.
Big thanks to Jeff King and Junio Hamano for their outstanding help and
patient reviews.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch" and friends that make network connections can now be
told to only use ipv4 (or ipv6).
* ew/force-ipv4:
connect & http: support -4 and -6 switches for remote operations
Sometimes it is necessary to force IPv4-only or IPv6-only operation
on networks where name lookups may return a non-routable address and
stall remote operations.
The ssh(1) command has an equivalent switches which we may pass when
we run them. There may be old ssh(1) implementations out there
which do not support these switches; they should report the
appropriate error in that case.
rsync support is untouched for now since it is deprecated and
scheduled to be removed.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, the only way to pass proxy credentials to curl is by including them
in the proxy URL. Usually, this means they will end up on disk unencrypted, one
way or another (by inclusion in ~/.gitconfig, shell profile or history). Since
proxy authentication often uses a domain user, credentials can be security
sensitive; therefore, a safer way of passing credentials is desirable.
If the configured proxy contains a username but not a password, query the
credential API for one. Also, make sure we approve/reject proxy credentials
properly.
For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy
environment variables, which would otherwise be evaluated as a fallback by curl.
Without this, we would have different semantics for git configuration and
environment variables.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A HTTP server is permitted to return a non-range response to a HTTP
range request (and Apache httpd in fact does this in some cases).
While libcurl knows how to correctly handle this (by skipping bytes
before and after the requested range), it only turns on this handling
if it is aware that a range request is being made. By manually
setting the range header instead of using CURLOPT_RANGE, we were
hiding the fact that this was a range request from libcurl. This
could cause corruption.
Signed-off-by: David Turner <dturner@twopensource.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
They used to be used directly by remote-curl.c for the smart-http
protocol. But they got wrapped by run_one_slot() in beed336 (http:
never use curl_easy_perform, 2014-02-18). Any future users are
expected to follow that route.
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the previous commit, we now give a sanitized,
shortened version of the content-type header to any callers
who ask for it.
This patch adds back a way for them to cleanly access
specific parameters to the type. We could easily extract all
parameters and make them available via a string_list, but:
1. That complicates the interface and memory management.
2. In practice, no planned callers care about anything
except the charset.
This patch therefore goes with the simplest thing, and we
can expand or change the interface later if it becomes
necessary.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eradicate mistaken use of "nor" (that is, essentially "nor" used
not in "neither A nor B" ;-)) from in-code comments, command output
strings, and documentations.
* jl/nor-or-nand-and:
code and test: fix misuses of "nor"
comments: fix misuses of "nor"
contrib: fix misuses of "nor"
Documentation: fix misuses of "nor"
We currently don't reuse http connections when fetching via
the smart-http protocol. This is bad because the TCP
handshake introduces latency, and especially because SSL
connection setup may be non-trivial.
We can fix it by consistently using curl's "multi"
interface. The reason is rather complicated:
Our http code has two ways of being used: queuing many
"slots" to be fetched in parallel, or fetching a single
request in a blocking manner. The parallel code is built on
curl's "multi" interface. Most of the single-request code
uses http_request, which is built on top of the parallel
code (we just feed it one slot, and wait until it finishes).
However, one could also accomplish the single-request scheme
by avoiding curl's multi interface entirely and just using
curl_easy_perform. This is simpler, and is used by post_rpc
in the smart-http protocol.
It does work to use the same curl handle in both contexts,
as long as it is not at the same time. However, internally
curl may not share all of the cached resources between both
contexts. In particular, a connection formed using the
"multi" code will go into a reuse pool connected to the
"multi" object. Further requests using the "easy" interface
will not be able to reuse that connection.
The smart http protocol does ref discovery via http_request,
which uses the "multi" interface, and then follows up with
the "easy" interface for its rpc calls. As a result, we make
two HTTP connections rather than reusing a single one.
We could teach the ref discovery to use the "easy"
interface. But it is only once we have done this discovery
that we know whether the protocol will be smart or dumb. If
it is dumb, then our further requests, which want to fetch
objects in parallel, will not be able to reuse the same
connection.
Instead, this patch switches post_rpc to build on the
parallel interface, which means that we use it consistently
everywhere. It's a little more complicated to use, but since
we have the infrastructure already, it doesn't add any code;
we can just factor out the relevant bits from http_request.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Issue "100 Continue" responses to help use of GSS-Negotiate
authentication scheme over HTTP transport when needed.
* bc/http-100-continue:
remote-curl: fix large pushes with GSSAPI
remote-curl: pass curl slot_results back through run_slot
http: return curl's AUTHAVAIL via slot_results
Callers of the http code may want to know which auth types
were available for the previous request. But after finishing
with the curl slot, they are not supposed to look at the
curl handle again. We already handle returning other
information via the slot_results struct; let's add a flag to
check the available auth.
Note that older versions of curl did not support this, so we
simply return 0 (something like "-1" would be worse, as the
value is a bitflag and we might accidentally set a flag).
This is sufficient for the callers planned in this series,
who only trigger some optional behavior if particular bits
are set, and can live with a fake "no bits" answer.
Signed-off-by: Jeff King <peff@peff.net>
If a caller asks the http_get_* functions to go to a
particular URL and we end up elsewhere due to a redirect,
the effective_url field can tell us where we went.
It would be nice to remember this redirect and short-cut
further requests for two reasons:
1. It's more efficient. Otherwise we spend an extra http
round-trip to the server for each subsequent request,
just to get redirected.
2. If we end up with an http 401 and are going to ask for
credentials, it is to feed them to the redirect target.
If the redirect is an http->https upgrade, this means
our credentials may be provided on the http leg, just
to end up redirected to https. And if the redirect
crosses server boundaries, then curl will drop the
credentials entirely as it follows the redirect.
However, it, it is not enough to simply record the effective
URL we saw and use that for subsequent requests. We were
originally fed a "base" url like:
http://example.com/foo.git
and we want to figure out what the new base is, even though
the URLs we see may be:
original: http://example.com/foo.git/info/refs
effective: http://example.com/bar.git/info/refs
Subsequent requests will not be for "info/refs", but for
other paths relative to the base. We must ask the caller to
pass in the original base, and we must pass the redirected
base back to the caller (so that it can generate more URLs
from it). Furthermore, we need to feed the new base to the
credential code, so that requests to credential helpers (or
to the user) match the URL we will be requesting.
This patch teaches http_request_reauth to do this munging.
Since it is the caller who cares about making more URLs, it
seems at first glance that callers could simply check
effective_url themselves and handle it. However, since we
need to update the credential struct before the second
re-auth request, we have to do it inside http_request_reauth.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
When we ask curl to access a URL, it may follow one or more
redirects to reach the final location. We have no idea
this has happened, as curl takes care of the details and
simply returns the final content to us.
The final URL that we ended up with can be accessed via
CURLINFO_EFFECTIVE_URL. Let's make that optionally available
to callers of http_get_*, so that they can make further
decisions based on the redirection.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
When we are handling a curl response code in http_request or
in the remote-curl RPC code, we use the handle_curl_result
helper to translate curl's response into an easy-to-use
code. When we see an HTTP 401, we do one of two things:
1. If we already had a filled-in credential, we mark it as
rejected, and then return HTTP_NOAUTH to indicate to
the caller that we failed.
2. If we didn't, then we ask for a new credential and tell
the caller HTTP_REAUTH to indicate that they may want
to try again.
Rejecting in the first case makes sense; it is the natural
result of the request we just made. However, prompting for
more credentials in the second step does not always make
sense. We do not know for sure that the caller is going to
make a second request, and nor are we sure that it will be
to the same URL. Logically, the prompt belongs not to the
request we just finished, but to the request we are (maybe)
about to make.
In practice, it is very hard to trigger any bad behavior.
Currently, if we make a second request, it will always be to
the same URL (even in the face of redirects, because curl
handles the redirects internally). And we almost always
retry on HTTP_REAUTH these days. The one exception is if we
are streaming a large RPC request to the server (e.g., a
pushed packfile), in which case we cannot restart. It's
extremely unlikely to see a 401 response at this stage,
though, as we would typically have seen it when we sent a
probe request, before streaming the data.
This patch drops the automatic prompt out of case 2, and
instead requires the caller to do it. This is a few extra
lines of code, and the bug it fixes is unlikely to come up
in practice. But it is conceptually cleaner, and paves the
way for better handling of credentials across redirects.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Over time, the http_get_strbuf function has grown several
optional parameters. We now have a bitfield with multiple
boolean options, as well as an optional strbuf for returning
the content-type of the response. And a future patch in this
series is going to add another strbuf option.
Treating these as separate arguments has a few downsides:
1. Most call sites need to add extra NULLs and 0s for the
options they aren't interested in.
2. The http_get_* functions are actually wrappers around
2 layers of low-level implementation functions. We have
to pass these options through individually.
3. The http_get_strbuf wrapper learned these options, but
nobody bothered to do so for http_get_file, even though
it is backed by the same function that does understand
the options.
Let's consolidate the options into a single struct. For the
common case of the default options, we'll allow callers to
simply pass a NULL for the options struct.
The resulting code is often a few lines longer, but it ends
up being easier to read (and to change as we add new
options, since we do not need to update each call site).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>