Most struct child_process variables are cleared using memset first after
declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead. That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/skip-prefix:
http-push: refactor parsing of remote object names
imap-send: use skip_prefix instead of using magic numbers
use skip_prefix to avoid repeated calculations
git: avoid magic number with skip_prefix
fetch-pack: refactor parsing in get_ack
fast-import: refactor parsing of spaces
stat_opt: check extra strlen call
daemon: use skip_prefix to avoid magic numbers
fast-import: use skip_prefix for parsing input
use skip_prefix to avoid repeating strings
use skip_prefix to avoid magic numbers
transport-helper: avoid reading past end-of-string
fast-import: fix read of uninitialized argv memory
apply: use skip_prefix instead of raw addition
refactor skip_prefix to return a boolean
avoid using skip_prefix as a boolean
daemon: mark some strings as const
parse_diff_color_slot: drop ofs parameter
In some cases, we use starts_with to check for a prefix, and
then use an already-calculated prefix length to advance a
pointer past the prefix. There are no magic numbers or
duplicated strings here, but we can still make the code
simpler and more obvious by using skip_prefix.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
extract_content_type() could not extract a charset parameter if the
parameter is not the first one and there is a whitespace and a following
semicolon just before the parameter. For example:
text/plain; format=fixed ;charset=utf-8
And it also could not handle correctly some other cases, such as:
text/plain; charset=utf-8; format=fixed
text/plain; some-param="a long value with ;semicolons;"; charset=utf-8
Thanks-to: Jeff King <peff@peff.net>
Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is specified by RFC 2616 as the default if no "charset"
parameter is given.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the previous commit, we now give a sanitized,
shortened version of the content-type header to any callers
who ask for it.
This patch adds back a way for them to cleanly access
specific parameters to the type. We could easily extract all
parameters and make them available via a string_list, but:
1. That complicates the interface and memory management.
2. In practice, no planned callers care about anything
except the charset.
This patch therefore goes with the simplest thing, and we
can expand or change the interface later if it becomes
necessary.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we get a content-type from curl, we get the whole
header line, including any parameters, and without any
normalization (like downcasing or whitespace) applied.
If we later try to match it with strcmp() or even
strcasecmp(), we may get false negatives.
This could cause two visible behaviors:
1. We might fail to recognize a smart-http server by its
content-type.
2. We might fail to relay text/plain error messages to
users (especially if they contain a charset parameter).
This patch teaches the http code to extract and normalize
just the type/subtype portion of the string. This is
technically passing out less information to the callers, who
can no longer see the parameters. But none of the current
callers cares, and a future patch will add back an
easier-to-use method for accessing those parameters.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mh/object-code-cleanup:
sha1_file.c: document a bunch of functions defined in the file
sha1_file_name(): declare to return a const string
find_pack_entry(): document last_found_pack
replace_object: use struct members instead of an array
Change the return value of sha1_file_name() to (const char *).
(Callers have no business mucking about here.) Change callers
accordingly, deleting a few superfluous temporary variables along the
way.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently don't reuse http connections when fetching via
the smart-http protocol. This is bad because the TCP
handshake introduces latency, and especially because SSL
connection setup may be non-trivial.
We can fix it by consistently using curl's "multi"
interface. The reason is rather complicated:
Our http code has two ways of being used: queuing many
"slots" to be fetched in parallel, or fetching a single
request in a blocking manner. The parallel code is built on
curl's "multi" interface. Most of the single-request code
uses http_request, which is built on top of the parallel
code (we just feed it one slot, and wait until it finishes).
However, one could also accomplish the single-request scheme
by avoiding curl's multi interface entirely and just using
curl_easy_perform. This is simpler, and is used by post_rpc
in the smart-http protocol.
It does work to use the same curl handle in both contexts,
as long as it is not at the same time. However, internally
curl may not share all of the cached resources between both
contexts. In particular, a connection formed using the
"multi" code will go into a reuse pool connected to the
"multi" object. Further requests using the "easy" interface
will not be able to reuse that connection.
The smart http protocol does ref discovery via http_request,
which uses the "multi" interface, and then follows up with
the "easy" interface for its rpc calls. As a result, we make
two HTTP connections rather than reusing a single one.
We could teach the ref discovery to use the "easy"
interface. But it is only once we have done this discovery
that we know whether the protocol will be smart or dumb. If
it is dumb, then our further requests, which want to fetch
objects in parallel, will not be able to reuse the same
connection.
Instead, this patch switches post_rpc to build on the
parallel interface, which means that we use it consistently
everywhere. It's a little more complicated to use, but since
we have the infrastructure already, it doesn't add any code;
we can just factor out the relevant bits from http_request.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.
* cc/starts-n-ends-with:
replace {pre,suf}fixcmp() with {starts,ends}_with()
strbuf: introduce starts_with() and ends_with()
builtin/remote: remove postfixcmp() and use suffixcmp() instead
environment: normalize use of prefixcmp() by removing " != 0"
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Issue "100 Continue" responses to help use of GSS-Negotiate
authentication scheme over HTTP transport when needed.
* bc/http-100-continue:
remote-curl: fix large pushes with GSSAPI
remote-curl: pass curl slot_results back through run_slot
http: return curl's AUTHAVAIL via slot_results
Callers of the http code may want to know which auth types
were available for the previous request. But after finishing
with the curl slot, they are not supposed to look at the
curl handle again. We already handle returning other
information via the slot_results struct; let's add a flag to
check the available auth.
Note that older versions of curl did not support this, so we
simply return 0 (something like "-1" would be worse, as the
value is a bitflag and we might accidentally set a flag).
This is sufficient for the callers planned in this series,
who only trigger some optional behavior if particular bits
are set, and can live with a fake "no bits" answer.
Signed-off-by: Jeff King <peff@peff.net>
Handle the case where http transport gets redirected during the
authorization request better.
* jk/http-auth-redirects:
http.c: Spell the null pointer as NULL
remote-curl: rewrite base url from info/refs redirects
remote-curl: store url as a strbuf
remote-curl: make refs_url a strbuf
http: update base URLs when we see redirects
http: provide effective url to callers
http: hoist credential request out of handle_curl_result
http: refactor options to http_get_*
http_request: factor out curlinfo_strbuf
http_get_file: style fixes
Commit 1bbcc224 ("http: refactor options to http_get_*", 28-09-2013)
changed the type of final 'options' argument of the http_get_file()
function from an int to an 'struct http_get_options' pointer.
However, it neglected to update the (single) call site. Since this
call was passing '0' to that argument, it was (correctly) being
interpreted as a null pointer. Change to argument to NULL.
Noticed by sparse. ("Using plain integer as NULL pointer")
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit a15d069 taught git to use curl's SOCKOPTFUNCTION hook
to turn on TCP keepalives. However, modern versions of curl
have a TCP_KEEPALIVE option, which can do this for us. As an
added bonus, the curl code knows how to turn on keepalive
for a much wider variety of platforms. The only downside to
using this option is that not everybody has a new enough curl.
Let's split our keepalive options into three conditionals:
1. With curl 7.25.0 and newer, we rely on curl to do it
right.
2. With older curl that still knows SOCKOPTFUNCTION, we
use the code from a15d069.
3. Otherwise, we are out of luck, and the call is a no-op.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a caller asks the http_get_* functions to go to a
particular URL and we end up elsewhere due to a redirect,
the effective_url field can tell us where we went.
It would be nice to remember this redirect and short-cut
further requests for two reasons:
1. It's more efficient. Otherwise we spend an extra http
round-trip to the server for each subsequent request,
just to get redirected.
2. If we end up with an http 401 and are going to ask for
credentials, it is to feed them to the redirect target.
If the redirect is an http->https upgrade, this means
our credentials may be provided on the http leg, just
to end up redirected to https. And if the redirect
crosses server boundaries, then curl will drop the
credentials entirely as it follows the redirect.
However, it, it is not enough to simply record the effective
URL we saw and use that for subsequent requests. We were
originally fed a "base" url like:
http://example.com/foo.git
and we want to figure out what the new base is, even though
the URLs we see may be:
original: http://example.com/foo.git/info/refs
effective: http://example.com/bar.git/info/refs
Subsequent requests will not be for "info/refs", but for
other paths relative to the base. We must ask the caller to
pass in the original base, and we must pass the redirected
base back to the caller (so that it can generate more URLs
from it). Furthermore, we need to feed the new base to the
credential code, so that requests to credential helpers (or
to the user) match the URL we will be requesting.
This patch teaches http_request_reauth to do this munging.
Since it is the caller who cares about making more URLs, it
seems at first glance that callers could simply check
effective_url themselves and handle it. However, since we
need to update the credential struct before the second
re-auth request, we have to do it inside http_request_reauth.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
When we ask curl to access a URL, it may follow one or more
redirects to reach the final location. We have no idea
this has happened, as curl takes care of the details and
simply returns the final content to us.
The final URL that we ended up with can be accessed via
CURLINFO_EFFECTIVE_URL. Let's make that optionally available
to callers of http_get_*, so that they can make further
decisions based on the redirection.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
When we are handling a curl response code in http_request or
in the remote-curl RPC code, we use the handle_curl_result
helper to translate curl's response into an easy-to-use
code. When we see an HTTP 401, we do one of two things:
1. If we already had a filled-in credential, we mark it as
rejected, and then return HTTP_NOAUTH to indicate to
the caller that we failed.
2. If we didn't, then we ask for a new credential and tell
the caller HTTP_REAUTH to indicate that they may want
to try again.
Rejecting in the first case makes sense; it is the natural
result of the request we just made. However, prompting for
more credentials in the second step does not always make
sense. We do not know for sure that the caller is going to
make a second request, and nor are we sure that it will be
to the same URL. Logically, the prompt belongs not to the
request we just finished, but to the request we are (maybe)
about to make.
In practice, it is very hard to trigger any bad behavior.
Currently, if we make a second request, it will always be to
the same URL (even in the face of redirects, because curl
handles the redirects internally). And we almost always
retry on HTTP_REAUTH these days. The one exception is if we
are streaming a large RPC request to the server (e.g., a
pushed packfile), in which case we cannot restart. It's
extremely unlikely to see a 401 response at this stage,
though, as we would typically have seen it when we sent a
probe request, before streaming the data.
This patch drops the automatic prompt out of case 2, and
instead requires the caller to do it. This is a few extra
lines of code, and the bug it fixes is unlikely to come up
in practice. But it is conceptually cleaner, and paves the
way for better handling of credentials across redirects.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
This is a follow up to commit e47a8583 (enable SO_KEEPALIVE for
connected TCP sockets, 2011-12-06).
Sockets may never receive notification of some link errors,
causing "git fetch" or similar processes to hang forever.
Enabling keepalive messages allows hung processes to error out
after a few minutes/hours depending on the keepalive settings of
the system.
I noticed this problem with some non-interactive cronjobs getting
hung when talking to HTTP servers.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Over time, the http_get_strbuf function has grown several
optional parameters. We now have a bitfield with multiple
boolean options, as well as an optional strbuf for returning
the content-type of the response. And a future patch in this
series is going to add another strbuf option.
Treating these as separate arguments has a few downsides:
1. Most call sites need to add extra NULLs and 0s for the
options they aren't interested in.
2. The http_get_* functions are actually wrappers around
2 layers of low-level implementation functions. We have
to pass these options through individually.
3. The http_get_strbuf wrapper learned these options, but
nobody bothered to do so for http_get_file, even though
it is backed by the same function that does understand
the options.
Let's consolidate the options into a single struct. For the
common case of the default options, we'll allow callers to
simply pass a NULL for the options struct.
The resulting code is often a few lines longer, but it ends
up being easier to read (and to change as we add new
options, since we do not need to update each call site).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
When we retrieve the content-type of an http response, curl
gives us a pointer to internal storage, which we then copy
into a strbuf. Let's factor out the get-and-copy routine,
which can be used for getting other curl info.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Besides being ugly, the extra parentheses are idiomatic for
suppressing compiler warnings when we are assigning within a
conditional. We aren't doing that here, and they just
confuse the reader.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Allow section.<urlpattern>.var configuration variables to be
treated as a "virtual" section.var given a URL, and use the
mechanism to enhance http.* configuration variables.
This is a reroll of Kyle J. McKay's work.
* jc/url-match:
builtin/config.c: compilation fix
config: "git config --get-urlmatch" parses section.<url>.key
builtin/config: refactor collect_config()
config: parse http.<url>.<variable> using urlmatch
config: add generic callback wrapper to parse section.<url>.key
config: add helper to normalize and match URLs
http.c: fix parsing of http.sslCertPasswordProtected variable
Use the urlmatch_config_entry() to wrap the underlying
http_options() two-level variable parser in order to set
http.<variable> to the value with the most specific URL in the
configuration.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing code triggers only when the configuration variable is
set to true. Once the variable is set to true in a more generic
configuration file (e.g. ~/.gitconfig), it cannot be overriden to
false in the repository specific one (e.g. .git/config).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HTTP servers may send Set-Cookie headers in a response and expect them
to be set on subsequent requests. By default, libcurl behavior is to
store such cookies in memory and reuse them across requests within a
single session. However, it may also make sense, depending on the
server and the cookies, to store them across sessions. Provide users
an option to enable this behavior, writing cookies out to the same
file specified in http.cookiefile.
Signed-off-by: Dave Borowitz <dborowitz@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Older cURL wanted piece of memory we call it with to be stable, but
we updated the auth material after handing it to a call.
* bc/http-keep-memory-given-to-curl:
http.c: don't rewrite the user:passwd string multiple times
Curl older than 7.17 (RHEL 4.X provides 7.12 and RHEL 5.X provides
7.15) requires that we manage any strings that we pass to it as
pointers. So, we really shouldn't be modifying this strbuf after we
have passed it to curl.
Our interaction with curl is currently safe (before or after this
patch) since the pointer that is passed to curl is never invalidated;
it is repeatedly rewritten with the same sequence of characters but
the strbuf functions never need to allocate a larger string, so the
same memory buffer is reused.
This "guarantee" of safety is somewhat subtle and could be overlooked
by someone who may want to add a more complex handling of the username
and password. So, let's stop modifying this strbuf after we have
passed it to curl, but also leave a note to describe the assumptions
that have been made about username/password lifetime and to draw
attention to the code.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because we reuse curl handles for multiple requests, the
setup of a handle happens in two stages: stable, global
setup and per-request setup. The lifecycle of a handle is
something like:
1. get_curl_handle; do basic global setup that will last
through the whole program (e.g., setting the user
agent, ssl options, etc)
2. get_active_slot; set up a per-request baseline (e.g.,
clearing the read/write functions, making it a GET
request, etc)
3. perform the request with curl_*_perform functions
4. goto step 2 to perform another request
Breaking it down this way means we can avoid doing global
setup from step (1) repeatedly, but we still finish step (2)
with a predictable baseline setup that callers can rely on.
Until commit 6d052d7 (http: add HTTP_KEEP_ERROR option,
2013-04-05), setting curl's FAILONERROR option was a global
setup; we never changed it. However, 6d052d7 introduced an
option where some requests might turn off FAILONERROR. Later
requests using the same handle would have the option
unexpectedly turned off, which meant they would not notice
http failures at all.
This could easily be seen in the test-suite for the
"half-auth" cases of t5541 and t5551. The initial requests
turned off FAILONERROR, which meant it was erroneously off
for the rpc POST. That worked fine for a successful request,
but meant that we failed to react properly to the HTTP 401
(instead, we treated whatever the server handed us as a
successful message body).
The solution is simple: now that FAILONERROR is a
per-request setting, we move it to get_active_slot to make
sure it is reset for each request.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a boolean http.sslTry option which allows to enable AUTH SSL/TLS and
encrypted data transfers when connecting via regular FTP protocol.
Default is false since it might trigger certificate verification errors on
misconfigured servers.
Signed-off-by: Modestas Vainius <modestas@vainius.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is a single-liner and is only called from one
place. Just inline it, which makes the code more obvious.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we report an http error code, we say something like:
error: The requested URL reported failure: 403 Forbidden while accessing http://example.com/repo.git
Everything between "error:" and "while" is written by curl,
and the resulting sentence is hard to read (especially
because there is no punctuation between curl's sentence and
the remainder of ours). Instead, let's re-order this to give
better flow:
error: unable to access 'http://example.com/repo.git: The requested URL reported failure: 403 Forbidden
This is still annoyingly long, but at least reads more
clearly left to right.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helper function should really be a one-liner that
prints an error message, but it has ended up unnecessarily
complicated:
1. We call error() directly when we fail to start the curl
request, so we must later avoid printing a duplicate
error in http_error().
It would be much simpler in this case to just stuff the
error message into our usual curl_errorstr buffer
rather than printing it ourselves. This means that
http_error does not even have to care about curl's exit
value (the interesting part is in the errorstr buffer
already).
2. We return the "ret" value passed in to us, but none of
the callers actually cares about our return value. We
can just drop this entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently set curl's FAILONERROR option, which means that
any http failures are reported as curl errors, and the
http body content from the server is thrown away.
This patch introduces a new option to http_get_strbuf which
specifies that the body content from a failed http response
should be placed in the destination strbuf, where it can be
accessed by the caller.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Having the packet sizes defined near the packet read/write
functions makes more sense.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers may pass us a strbuf which we use to record the
content-type of the response. However, we simply appended to
it rather than overwriting its contents, meaning that cruft
in the strbuf gave us a bogus type. E.g., the multiple
requests triggered by http_request could yield a type like
"text/plainapplication/x-git-receive-pack-advertisement".
Reported-by: Michael Schubert <mschub@elegosoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before parsing a suspected smart-HTTP response verify the returned
Content-Type matches the standard. This protects a client from
attempting to process a payload that smells like a smart-HTTP
server response.
JGit has been doing this check on all responses since the dawn of
time. I mistakenly failed to include it in git-core when smart HTTP
was introduced. At the time I didn't know how to get the Content-Type
from libcurl. I punted, meant to circle back and fix this, and just
plain forgot about it.
Signed-off-by: Shawn Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If sslCertPasswordProtected is set to true do not ask for username to decrypt rsa key. This question is pointless, the key is only protected by a password. Internaly the username is simply set to "".
Signed-off-by: Rene Bredlau <git@unrelated.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes curl_multi_timeout() function suggested a wrong timeout
value when there is no file descriptors to wait on and the http
transport ended up sleeping for minutes in select(2) system call.
Detect this and reduce the wait timeout in such a case.
* sz/maint-curl-multi-timeout:
Fix potential hang in https handshake
Further clean-up to the http codepath that picks up results after
cURL library is done with one request slot.
* jk/maint-http-init-not-in-result-handler:
http: do not set up curl auth after a 401
remote-curl: do not call run_slot repeatedly
It has been observed that curl_multi_timeout may return a very long
timeout value (e.g., 294 seconds and some usec) just before
curl_multi_fdset returns no file descriptors for reading. The
upshot is that select() will hang for a long time -- long enough for
an https handshake to be dropped. The observed behavior is that
the git command will hang at the terminal and never transfer any
data.
This patch is a workaround for a probable bug in libcurl. The bug
only seems to manifest around a very specific set of circumstances:
- curl version (from curl/curlver.h):
#define LIBCURL_VERSION_NUM 0x071307
- git-remote-https running on an ubuntu-lucid VM.
- Connecting through squid proxy running on another VM.
Interestingly, the problem doesn't manifest if a host connects
through squid proxy running on localhost; only if the proxy is on
a separate VM (not sure if the squid host needs to be on a separate
physical machine). That would seem to suggest that this issue
is timing-sensitive.
This patch is more or less in line with a recommendation in the
curl docs about how to behave when curl_multi_fdset doesn't return
and file descriptors:
http://curl.haxx.se/libcurl/c/curl_multi_fdset.html
Signed-off-by: Stefan Zager <szager@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixes a regression in maint-1.7.11 (v1.7.11.7), maint (v1.7.12.1)
and master (v1.8.0-rc0).
* jk/maint-http-half-auth-push:
http: fix segfault in handle_curl_result
When we get an http 401, we prompt for credentials and put
them in our global credential struct. We also feed them to
the curl handle that produced the 401, with the intent that
they will be used on a retry.
When the code was originally introduced in commit 42653c0,
this was a necessary step. However, since dfa1725, we always
feed our global credential into every curl handle when we
initialize the slot with get_active_slot. So every further
request already feeds the credential to curl.
Moreover, accessing the slot here is somewhat dubious. After
the slot has produced a response, we don't actually control
it any more. If we are using curl_multi, it may even have
been re-initialized to handle a different request.
It just so happens that we will reuse the curl handle within
the slot in such a case, and that because we only keep one
global credential, it will be the one we want. So the
current code is not buggy, but it is misleading.
By cleaning it up, we can remove the slot argument entirely
from handle_curl_result, making it much more obvious that
slots should not be accessed after they are marked as
finished.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we create an http active_request_slot, we can set its
"results" pointer back to local storage. The http code will
fill in the details of how the request went, and we can
access those details even after the slot has been cleaned
up.
Commit 8809703 (http: factor out http error code handling)
switched us from accessing our local results struct directly
to accessing it via the "results" pointer of the slot. That
means we're accessing the slot after it has been marked as
finished, defeating the whole purpose of keeping the results
storage separate.
Most of the time this doesn't matter, as finishing the slot
does not actually clean up the pointer. However, when using
curl's multi interface with the dumb-http revision walker,
we might actually start a new request before handing control
back to the original caller. In that case, we may reuse the
slot, zeroing its results pointer, and leading the original
caller to segfault while looking for its results inside the
slot.
Instead, we need to pass a pointer to our local results
storage to the handle_curl_result function, rather than
relying on the pointer in the slot struct. This matches what
the original code did before the refactoring (which did not
use a separate function, and therefore just accessed the
results struct directly).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some HTTP servers try to use gzip compression on the /info/refs
request to save transfer bandwidth. Repositories with many tags
may find the /info/refs request can be gzipped to be 50% of the
original size due to the few but often repeated bytes used (hex
SHA-1 and commonly digits in tag names).
For most HTTP requests enable "Accept-Encoding: gzip" ensuring
the /info/refs payload can use this encoding format.
Only request gzip encoding from servers. Although deflate is
supported by libcurl, most servers have standardized on gzip
encoding for compression as that is what most browsers support.
Asking for deflate increases request sizes by a few bytes, but is
unlikely to ever be used by a server.
Disable the Accept-Encoding header on probe RPCs as response bodies
are supposed to be exactly 4 bytes long, "0000". The HTTP headers
requesting and indicating compression use more space than the data
transferred in the body.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pushing to smart HTTP server with recent Git fails without having
the username in the URL to force authentication, if the server is
configured to allow GET anonymously, while requiring authentication
for POST.
* jk/maint-http-half-auth-push:
http: prompt for credentials on failed POST
http: factor out http error code handling
t: test http access to "half-auth" repositories
t: test basic smart-http authentication
t/lib-httpd: recognize */smart/* repos as smart-http
t/lib-httpd: only route auth/dumb to dumb repos
t5550: factor out http auth setup
t5550: put auth-required repo in auth/dumb
Most of our http requests go through the http_request()
interface, which does some nice post-processing on the
results. In particular, it handles prompting for missing
credentials as well as approving and rejecting valid or
invalid credentials. Unfortunately, it only handles GET
requests. Making it handle POSTs would be quite complex, so
let's pull result handling code into its own function so
that it can be reused from the POST code paths.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reverts be22d92 (http: avoid empty error messages for some curl
errors, 2011-09-05) on platforms with older versions of libcURL
where the function is not available.
Signed-off-by: Joachim Schmitz <jojo@schmitz-digital.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This means we will respect the GIT_USER_AGENT build-time
configuration and run-time environment variable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error handling routines add a newline. Remove
the duplicate ones in error messages.
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We give the username and password to curl by sticking them
in a buffer of the form "user:pass" and handing the result
to CURLOPT_USERPWD. Since curl 7.19.1, there is a split
mechanism, where you can specify each element individually.
This has the advantage that a username can contain a ":"
character. It also is less code for us, since we can hand
our strings over to curl directly. And since curl 7.17.0 and
higher promise to copy the strings for us, we we don't even
have to worry about memory ownership issues.
Unfortunately, we have to keep the ugly code for old curl
around, but as it is now nicely #if'd out, we can easily get
rid of it when we decide that 7.19.1 is "old enough".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we have a credential to give to curl, we must copy it
into a "user:pass" buffer and then hand the buffer to curl.
Old versions of curl did not copy the buffer, and we were
expected to keep it valid. Newer versions of curl will copy
the buffer.
Our solution was to use a strbuf and detach it, giving
ownership of the resulting buffer to curl. However, this
meant that we were leaking the buffer on newer versions of
curl, since curl was just copying it and throwing away the
string we passed. Furthermore, when we replaced a
credential (e.g., because our original one was rejected), we
were also leaking on both old and new versions of curl.
This got even worse in the last patch, which started
replacing the credential (and thus leaking) on every http
request.
Instead, let's use a static buffer to make the ownership
more clear and less leaky. We already keep a static "struct
credential", so we are only handling a single credential at
a time, anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HTTP authentication is currently handled by get_refs and fetch_ref, but
not by fetch_object, fetch_pack or fetch_alternates. In the
single-threaded case, this is not an issue, since get_refs is always
called first. It recognigzes the 401 and prompts the user for
credentials, which will then be used subsequently.
If the curl multi interface is used, however, only the multi handle used
by get_refs will have credentials configured. Requests made by other
handles fail with an authentication error.
Fix this by setting CURLOPT_USERPWD whenever a slot is requested.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the proxy server specified by the http.proxy configuration or the
http_proxy environment variable requires authentication, git failed to
connect to the proxy, because we did not configure the cURL handle with
CURLOPT_PROXYAUTH.
When a proxy is in use, and you tell git that the proxy requires
authentication by having username in the http.proxy configuration, an
extra request needs to be made to the proxy to find out what
authentication method it supports, as this patch uses CURLAUTH_ANY to let
the library pick the most secure method supported by the proxy server.
The extra round-trip adds extra latency, but relieves the user from the
burden to configure a specific authentication method. If it becomes
problem, a later patch could add a configuration option to specify what
method to use, but let's start simple for the time being.
Signed-off-by: Nelson Benitez Leon <nbenitezl@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/credentials:
t: add test harness for external credential helpers
credentials: add "store" helper
strbuf: add strbuf_add*_urlencode
Makefile: unix sockets may not available on some platforms
credentials: add "cache" helper
docs: end-user documentation for the credential subsystem
credential: make relevance of http path configurable
credential: add credential.*.username
credential: apply helper config
http: use credential API to get passwords
credential: add function for parsing url components
introduce credentials API
t5550: fix typo
test-lib: add test_config_global variant
Conflicts:
strbuf.c
Before commit 986bbc08, git was proactive about asking for
http passwords. It assumed that if you had a username in
your URL, you would also want a password, and asked for it
before making any http requests.
However, this could interfere with the use of .netrc (see
986bbc08 for details). And it was also unnecessary, since
the http fetching code had learned to recognize an HTTP 401
and prompt the user then. Furthermore, the proactive prompt
could interfere with the usage of .netrc (see 986bbc08 for
details).
Unfortunately, the http push-over-DAV code never learned to
recognize HTTP 401, and so was broken by this change. This
patch does a quick fix of re-enabling the "proactive auth"
strategy only for http-push, leaving the dumb http fetch and
smart-http as-is.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch converts the http code to use the new credential
API, both for http authentication as well as for getting
certificate passwords.
Most of the code change is simply variable naming (the
passwords are now contained inside the credential struct)
or deletion of obsolete code (the credential code handles
URL parsing and prompting for us).
The behavior should be the same, with one exception: the
credential code will prompt with a description based on the
credential components. Therefore, the old prompt of:
Username for 'example.com':
Password for 'example.com':
now looks like:
Username for 'https://example.com/repo.git':
Password for 'https://user@example.com/repo.git':
Note that we include more information in each line,
specifically:
1. We now include the protocol. While more noisy, this is
an important part of knowing what you are accessing
(especially if you care about http vs https).
2. We include the username in the password prompt. This is
not a big deal when you have just been prompted for it,
but the username may also come from the remote's URL
(and after future patches, from configuration or
credential helpers). In that case, it's a nice
reminder of the user for which you're giving the
password.
3. We include the path component of the URL. In many
cases, the user won't care about this and it's simply
noise (i.e., they'll use the same credential for a
whole site). However, that is part of a larger
question, which is whether path components should be
part of credential context, both for prompting and for
lookup by storage helpers. That issue will be addressed
as a whole in a future patch.
Similarly, for unlocking certificates, we used to say:
Certificate Password for 'example.com':
and we now say:
Password for 'cert:///path/to/certificate':
Showing the path to the client certificate makes more sense,
as that is what you are unlocking, not "example.com".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mf/curl-select-fdset:
http: drop "local" member from request struct
http.c: Rely on select instead of tracking whether data was received
http.c: Use timeout suggested by curl instead of fixed 50ms timeout
http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
This is a FILE pointer in the case that we are sending our
output to a file. We originally used it to run ftell() to
determine whether data had been written to our file during
our last call to curl. However, as of the last patch, we no
longer care about that flag anymore. All uses of this struct
member are now just book-keeping that can go away.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since now select is used with the file descriptors of the http connections,
tracking whether data was received recently (and trying to read more in
that case) is no longer necessary. Instead, always call select and rely on
it to return as soon as new data can be read.
Signed-off-by: Mika Fischer <mika.fischer@zoopnet.de>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent versions of curl can suggest a period of time the library user
should sleep and try again, when curl is blocked on reading or writing
(or connecting). Use this timeout instead of always sleeping for 50ms.
Signed-off-by: Mika Fischer <mika.fischer@zoopnet.de>
Helped-by: Daniel Stenberg <daniel@haxx.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of sleeping unconditionally for a 50ms, when no data can be read
from the http connection(s), use curl_multi_fdset() to obtain the actual
file descriptors of the open connections and use them in the select call.
This way, the 50ms sleep is interrupted when new data arrives.
Signed-off-by: Mika Fischer <mika.fischer@zoopnet.de>
Helped-by: Daniel Stenberg <daniel@haxx.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a username is already specified at the beginning of any HTTP
transaction (e.g. "git push https://user@hosting.example.com/project.git"
or "git ls-remote https://user@hosting.example.com/project.git"), the code
interactively asks for a password before calling into the libcurl library.
It is very likely that the reason why user included the username in the
URL is because the user knows that it would require authentication to
access the resource. Asking for the password upfront would save one
roundtrip to get a 401 response, getting the password and then retrying
the request. This is a reasonable optimization.
HOWEVER.
This is done even when $HOME/.netrc might have a corresponding entry to
access the site, or the site does not require authentication to access the
resource after all. But neither condition can be determined until we call
into libcurl library (we do not read and parse $HOME/.netrc ourselves). In
these cases, the user is forced to respond to the password prompt, only to
give a password that is not used in the HTTP transaction. If the password
is in $HOME/.netrc, an empty input would later let the libcurl layer to
pick up the password from there, and if the resource does not require
authentication, any input would be taken and then discarded without
getting used. It is wasteful to ask this unused information to the end
user.
Reduce the confusion by not trying to optimize for this case and always
incur roundtrip penalty. An alternative might be to document this and keep
this round-trip optimization as-is.
Signed-off-by: Stefan Naewe <stefan.naewe@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/http-auth:
http_init: accept separate URL parameter
http: use hostname in credential description
http: retry authentication failures for all http requests
remote-curl: don't retry auth failures with dumb protocol
improve httpd auth tests
url: decode buffers that are not NUL-terminated
The http_init function takes a "struct remote". Part of its
initialization procedure is to look at the remote's url and
grab some auth-related parameters. However, using the url
included in the remote is:
- wrong; the remote-curl helper may have a separate,
unrelated URL (e.g., from remote.*.pushurl). Looking at
the remote's configured url is incorrect.
- incomplete; http-fetch doesn't have a remote, so passes
NULL. So http_init never gets to see the URL we are
actually going to use.
- cumbersome; http-push has a similar problem to
http-fetch, but actually builds a fake remote just to
pass in the URL.
Instead, let's just add a separate URL parameter to
http_init, and all three callsites can pass in the
appropriate information.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Until now, a request for an http password looked like:
Username:
Password:
Now it will look like:
Username for 'example.com':
Password for 'example.com':
Picked-from: Jeff King <peff@peff.net>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When asked to fetch over SSL without a valid
/etc/ssl/certs/ca-certificates.crt file, "git fetch" writes
error: while accessing https://github.com/torvalds/linux.git/info/refs
which is a little disconcerting. Better to fall back to
curl_easy_strerror(result) when the error string is empty, like the
curl utility does:
error: Problem with the SSL CA cert (path? access rights?) while
accessing https://github.com/torvalds/linux.git/info/refs
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no need for a blank line between the detailed error message
and the later "fatal: HTTP request failed" notice. Keep the newline
written by error() itself and eliminate the extra one.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a free() on the static buffer returned by sha1_file_name().
While we're at it, replace xmalloc() calls on the structs
http_(object|pack)_request with xcalloc() so that pointers in the
structs get initialized to NULL. That way, free()'s are safe - for
example, a free() on the url string member when aborting.
This fixes an invalid free().
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jeff King peff@peff.net
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 42653c0 (Prompt for a username when an HTTP request
401s, 2010-04-01) changed http_get_strbuf to prompt for
credentials when we receive a 401, but didn't touch
http_get_file. The latter is called only for dumb http;
while it's usually the case that people don't use
authentication on top of dumb http, there is no reason not
to allow both types of requests to use this feature.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The url_decode function needs only minor tweaks to handle
arbitrary buffers. Let's do those tweaks, which cleans up an
unreadable mess of temporary strings in http.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the config option http.cookiefile is set, pass this file to libCURL using
the CURLOPT_COOKIEFILE option. This is similar to calling curl with the -b
option. This allows git http authorization with authentication mechanisms
that use cookies, such as SAML Enhanced Client or Proxy (ECP) used by
Shibboleth.
To use SAML/ECP, the user needs to request a session cookie with their own ECP
code. See for example:
<https://wiki.shibboleth.net/confluence/display/SHIB2/ECP>
Once the cookie file has been created, it can be passed to git with, e.g.
git config --global http.cookiefile "/home/dbrown/.curlcookies"
libCURL will then pass the appropriate session cookies to the git http server.
Signed-off-by: Duncan Brown <duncan.brown@ligo.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Yes, these don't match perfectly with the void* first parameter of the
fread/fwrite in the standard library, but they do match the curl
expected method signature. This is needed when a refactor passes a
curl_write_callback around, which would otherwise give incorrect
parameter warnings.
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After posting a short request using CURLOPT_POSTFIELDS, if the slot
is reused for posting a large payload, the slot ends up having both
POSTFIELDS (which now points at a random garbage) and READFUNCTION,
in which case the curl library tries to use the stale POSTFIELDS.
Clear it as part of the general slot initialization in get_active_slot().
Heavylifting-by: Shawn Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn Pearce <spearce@spearce.org>
* tc/http-urls-ends-with-slash:
http-fetch: rework url handling
http-push: add trailing slash at arg-parse time, instead of later on
http-push: check path length before using it
http-push: Normalise directory names when pushing to some WebDAV servers
http-backend: use end_url_with_slash()
url: add str wrapper for end_url_with_slash()
shift end_url_with_slash() from http.[ch] to url.[ch]
t5550-http-fetch: add test for http-fetch
t5550-http-fetch: add missing '&&'
* gc/http-with-non-ascii-username-url:
Fix username and password extraction from HTTP URLs
t5550: test HTTP authentication and userinfo decoding
Conflicts:
t/lib-httpd/apache.conf
This allows non-http/curl users to access it too (eg. http-backend.c).
Update include headers in end_url_with_slash() users too.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the authentification initialisation to percent-decode username
and password for HTTP URLs.
Signed-off-by: Gabriel Corona <gabriel.corona@enst-bretagne.fr>
Acked-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For a long time (29508e1 "Isolate shared HTTP request functionality", Fri
Nov 18 11:02:58 2005), we've followed HTTP redirects with
CURLOPT_FOLLOWLOCATION.
However, when the remote HTTP server returns a redirect the default
libcurl action is to change a POST request into a GET request while
following the redirect, but the remote http backend does not expect
that.
Fix this by telling libcurl to always keep the request as type POST with
CURLOPT_POSTREDIR.
For users of libcurl older than 7.19.1, use CURLOPT_POST301 instead,
which only follows 301s instead of both 301s and 302s.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some firewalls restrict HTTP connections based on the clients user agent. This
commit provides the user the ability to modify the user agent string via either
a new config option (http.useragent) or by an environment variable
(GIT_HTTP_USER_AGENT).
Relevant documentation is added to Documentation/config.txt.
Signed-off-by: Spencer E. Olson <olsonse@umich.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sp/maint-dumb-http-pack-reidx:
http.c::new_http_pack_request: do away with the temp variable filename
http-fetch: Use temporary files for pack-*.idx until verified
http-fetch: Use index-pack rather than verify-pack to check packs
Allow parse_pack_index on temporary files
Extract verify_pack_index for reuse from verify_pack
Introduce close_pack_index to permit replacement
http.c: Remove unnecessary strdup of sha1_to_hex result
http.c: Don't store destination name in request structures
http.c: Drop useless != NULL test in finish_http_pack_request
http.c: Tiny refactoring of finish_http_pack_request
t5550-http-fetch: Use subshell for repository operations
http.c: Remove bad free of static block
* rc/maint-curl-helper:
remote-curl: ensure that URLs have a trailing slash
http: make end_url_with_slash() public
t5541-http-push: add test for URLs with trailing slash
Conflicts:
remote-curl.c
Now that the temporary variable char *filename is only used in one
place, do away with it and just call sha1_pack_name() directly.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Verify that a downloaded pack-*.idx file is consistent and valid
as an index file before we rename it into its final destination.
This prevents a corrupt index file from later being treated as a
usable file, confusing readers.
Check that we do not have the pack index file before invoking
fetch_pack_index(); that way, we can do without the has_pack_index()
check in fetch_pack_index().
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To ensure we don't leave a corrupt pack file positioned as though
it were a valid pack file, run index-pack on the temporary pack
before we rename it to its final name. If index-pack crashes out
when it discovers file corruption (e.g. GitHub's error HTML at the
end of the file), simply delete the temporary files to cleanup.
By waiting until the pack has been validated before we move it
to its final name, we eliminate a race condition where another
concurrent reader might try to access the pack at the same time
that we are still trying to verify its not corrupt.
Switching from verify-pack to index-pack is a change in behavior,
but it should turn out better for users. The index-pack algorithm
tries to minimize disk seeks, as well as the number of times any
given object is inflated, by organizing its work along delta chains.
The verify-pack logic does not attempt to do this, thrashing the
delta base cache and the filesystem cache.
By recreating the index file locally, we also can automatically
upgrade from a v1 pack table of contents to v2. This makes the
CRC32 data available for use during later repacks, even if the
server didn't have them on hand.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Acked-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The easiest way to verify a pack index is to open it through the
standard parse_pack_index function, permitting the header check
to happen when the file is mapped. However, the dumb HTTP client
needs to verify a pack index before its moved into its proper file
name within the objects/pack directory, to prevent a corrupt index
from being made available. So permit the caller to specify the
exact path of the index file.
For now we're still using the final destination name within the
sole call site in http.c, but eventually we will start to parse
the temporary path instead.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the time the dumb HTTP transport is run without the verbose
flag set, so we only need the result of sha1_to_hex(sha1) once, to
construct the pack URL. Don't bother with an unnecessary malloc,
copy, free chain of this buffer.
If verbose is set, we'll format the SHA-1 twice now. But this
tiny extra CPU time spent is nothing compared to the slowdown that
is usually imposed by the verbose messages being sent to the tty,
and is entirely trivial compared to the latency involved with the
remote HTTP server sending something as big as a pack file.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Acked-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The destination name within the object store is easily computed
on demand, reusing a static buffer held by sha1_file.c. We don't
need to copy the entire path into the request structure for safe
keeping, when it can be easily reformatted after the download has
been completed.
This reduces the size of the per-request structure, and removes
yet another PATH_MAX based limit.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test preq->packfile != NULL is always true. If packfile was
actually NULL when entering this function the ftell() above would
crash out with a SIGSEGV, resulting in never reaching this point.
Simplify the code by just removing the conditional.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Always remove the struct packed_git from the active list, even
if the rename of the temporary file fails.
While we are here, simplify the code a bit by using a common
local variable name ("p") to hold the relevant packed_git.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The filename variable here is pointing to a block of memory that
was allocated by sha1_file.c and is also held in a static variable
scoped within the sha1_pack_name() function. Doing a free() here is
returning that memory to the allocator while we might still try to
reuse it on a subsequent sha1_pack_name() invocation. That's not
acceptable, so don't free it.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When an HTTP request returns a 401, Git will currently fail with a
confusing message saying that it got a 401, which is not very
descriptive.
Currently if a user wants to use Git over HTTP, they have to use one
URL with the username in the URL (e.g. "http://user@host.com/repo.git")
for write access and another without the username for unauthenticated
read access (unless they want to be prompted for the password each
time). However, since the HTTP servers will return a 401 if an action
requires authentication, we can prompt for username and password if we
see this, allowing us to use a single URL for both purposes.
This patch changes http_request to prompt for the username and password,
then return HTTP_REAUTH so http_get_strbuf can try again. If it gets
a 401 even when a user/pass is supplied, http_request will now return
HTTP_NOAUTH which remote_curl can then use to display a more
intelligent error message that is less confusing.
Signed-off-by: Scott Chacon <schacon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git tries to read a password from the terminal in imap-send and
when talking to a http server that requires authentication.
When a GUI is driving git, however, the end user is not paying
attention to the terminal (there may not even be a terminal).
GUI would appear to hang forever.
Fix this problem by allowing a password-retrieving command
to be specified in GIT_ASKPASS
Signed-off-by: Frank Li <lznuaa@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tr/http-updates:
Remove http.authAny
Allow curl to rewind the RPC read buffer
Add an option for using any HTTP authentication scheme, not only basic
http: maintain curl sessions
Back when the feature to use different HTTP authentication methods was
originally written, it needed an extra HTTP request for everything when
the feature was in effect, because we didn't reuse curl sessions.
However, b8ac923 (Add an option for using any HTTP authentication scheme,
not only basic, 2009-11-27) builds on top of an updated codebase that does
reuse curl sessions; there is no need to manually avoid the extra overhead
by making this configurable anymore.
Acked-by: Martin Storsjo <martin@martin.st>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds the configuration option http.authAny (overridable with
the environment variable GIT_HTTP_AUTH_ANY), for instructing curl
to allow any HTTP authentication scheme, not only basic (which
sends the password in plaintext).
When this is enabled, curl has to do double requests most of the time,
in order to discover which HTTP authentication method to use, which
lowers the performance slightly. Therefore this isn't enabled by default.
One example of another authentication scheme to use is digest, which
doesn't send the password in plaintext, but uses a challenge-response
mechanism instead. Using digest authentication in practice requires
at least curl 7.18.1, due to bugs in the digest handling in earlier
versions of curl.
Signed-off-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow curl sessions to be kept alive (ie. not ended with
curl_easy_cleanup()) even after the request is completed, the number of
which is determined by the configuration setting http.minSessions.
Add a count for curl sessions, and update it, across slots, when
starting and ending curl sessions.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-remote-curl backend detects if the remote server supports
the git-receive-pack service, and if so, runs git-send-pack in a
pipe to dump the command and pack data as a single POST request.
The advertisements from the server that were obtained during the
discovery are passed into git-send-pack before the POST request
starts. This permits git-send-pack to operate largely unmodified.
For smaller packs (those under 1 MiB) a HTTP/1.0 POST with a
Content-Length is used, permitting interaction with any server.
The 1 MiB limit is arbitrary, but is sufficent to fit most deltas
created by human authors against text sources with the occasional
small binary file (e.g. few KiB icon image). The configuration
option http.postBuffer can be used to increase (or shink) this
buffer if the default is not sufficient.
For larger packs which cannot be spooled entirely into the helper's
memory space (due to http.postBuffer being too small), the POST
request requires HTTP/1.1 and sets "Transfer-Encoding: chunked".
This permits the client to upload an unknown amount of data in one
HTTP transaction without needing to pregenerate the entire pack
file locally.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An earlier 59b8d38 (http.c: remove verification of remote packs) left
the variable "url" uninitialized; "goto cleanup" codepath can free it
which is not very nice.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make http.c::fetch_pack_index() no longer check for the remote pack
with a HEAD request before fetching the corresponding pack index file.
Not only does sending a HEAD request before we do a GET incur a
performance penalty, it does not offer any significant error-
prevention advantages (pack fetching in the *_http_pack_request()
methods is capable of handling any errors on its own).
This addresses an issue raised elsewhere:
http://code.google.com/p/msysgit/issues/detail?id=323http://support.github.com/discussions/repos/957-cant-clone-over-http-or-git
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set the members callback_func and callback_data of freq->slot to NULL
when releasing a http_object_request. release_active_slot() is also
invoked on the slot to remove the curl handle associated with the slot
from the multi stack (CURLM *curlm in http.c).
These prevent the callback function and data from being used in http
methods (like http.c::finish_active_slot()) after a
http_object_request has been free'd.
Noticed by Ali Polatel, who later tested this patch to verify that it
fixes the problem he saw; Dscho helped to identify the problem spot.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make append_remote_object_url() (and by implication,
get_remote_object_url) use end_url_with_slash() to ensure that the url
ends with a slash.
Previously, they assumed that the url did not end with a slash and
as a result appended a slash, sometimes errorneously.
This fixes an issue introduced in 5424bc5 ("http*: add helper methods
for fetching objects (loose)"), where the append_remote_object_url()
implementation in http-push.c, which assumed that urls end with a
slash, was replaced by another one in http.c, which assumed urls did
not end with a slash.
The above issue was raised by Thomas Schlichter:
http://marc.info/?l=git&m=125043105231327
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Tested-by: Thomas Schlichter <thomas.schlichter@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In new_http_object_request(), check ftruncate() call return value and
handle possible errors.
Signed-off-by: Jeff Lasslett <jeff.lasslett@gmail.com>
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use preq->url in new_http_pack_request and freq->url in
new_http_object_request when calling curl_setopt(CURLOPT_URL), instead
of using an intermediate variable, 'url'.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Free preq in new_http_pack_request when aborting. preq was allocated
before jumping to the 'abort' label so this is safe.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a configuration option, http.sslCertPasswordProtected, and associated
environment variable, GIT_SSL_CERT_PASSWORD_PROTECTED, to enable SSL client
certificate password prompt from within git. If this option is false and
if the environment variable does not exist, git falls back to OpenSSL's
prompts (as in earlier versions of git).
The environment variable may only be used to enable, not to disable
git's password prompt. This behavior mimics GIT_NO_VERIFY; the mere
existence of the variable is all that is checked.
Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If an SSL client certificate is enabled (via http.sslcert or
GIT_SSL_CERT), prompt for the certificate password rather than
defaulting to OpenSSL's password prompt. This causes the prompt to only
appear once each run. Previously, OpenSSL prompted the user *many*
times, causing git to be unusable over HTTPS with client-side
certificates.
Note that the password is stored in memory in the clear while the
program is running. This may be a security problem if git crashes and
core dumps.
The user is always prompted, even if the certificate is not encrypted.
This should be fine; unencrypted certificates are rare and a security
risk anyway.
Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the minimimum required libcurl version for the http.sslKey option
to 7.9.3. Previously, preprocessor macros checked for >= 7.9.2, which
is incorrect because CURLOPT_SSLKEY was introduced in 7.9.3. This now
allows git to compile with libcurl 7.9.2.
Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code handling the fetching of loose objects in http-push.c and
http-walker.c have been refactored into new methods and a new struct
(object_http_request) in http.c. They are not meant to be invoked
elsewhere.
The new methods in http.c are
- new_http_object_request
- process_http_object_request
- finish_http_object_request
- abort_http_object_request
- release_http_object_request
and the new struct is http_object_request.
RANGER_HEADER_SIZE and no_pragma_header is no longer made available
outside of http.c, since after the above changes, there are no other
instances of usage outside of http.c.
Remove members of the transfer_request struct in http-push.c and
http-walker.c, including filename, real_sha1 and zret, as they are used
no longer used.
Move the methods append_remote_object_url() and get_remote_object_url()
from http-push.c to http.c. Additionally, get_remote_object_url() is no
longer defined only when USE_CURL_MULTI is defined, since
non-USE_CURL_MULTI code in http.c uses it (namely, in
new_http_object_request()).
Refactor code from http-push.c::start_fetch_loose() and
http-walker.c::start_object_fetch_request() that deals with the details
of coming up with the filename to store the retrieved object, resuming
a previously aborted request, and making a new curl request, into a new
function, new_http_object_request().
Refactor code from http-walker.c::process_object_request() into the
function, process_http_object_request().
Refactor code from http-push.c::finish_request() and
http-walker.c::finish_object_request() into a new function,
finish_http_object_request(). It returns the result of the
move_temp_to_file() invocation.
Add a function, release_http_object_request(), which cleans up object
request data. http-push.c and http-walker.c invoke this function
separately; http-push.c::release_request() and
http-walker.c::release_object_request() do not invoke this function.
Add a function, abort_http_object_request(), which unlink()s the object
file and invokes release_http_object_request(). Update
http-walker.c::abort_object_request() to use this.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code handling the fetching of packs in http-push.c and
http-walker.c have been refactored into new methods and a new struct
(http_pack_request) in http.c. They are not meant to be invoked
elsewhere.
The new methods in http.c are
- new_http_pack_request
- finish_http_pack_request
- release_http_pack_request
and the new struct is http_pack_request.
Add a function, new_http_pack_request(), that deals with the details of
coming up with the filename to store the retrieved packfile, resuming a
previously aborted request, and making a new curl request. Update
http-push.c::start_fetch_packed() and http-walker.c::fetch_pack() to
use this.
Add a function, finish_http_pack_request(), that deals with renaming
the pack, advancing the pack list, and installing the pack. Update
http-push.c::finish_request() and http-walker.c::fetch_pack to use
this.
Update release_request() in http-push.c and http-walker.c to invoke
release_http_pack_request() to clean up pack request helper data.
The local_stream member of the transfer_request struct in http-push.c
has been removed, as the packfile pointer will be managed in the struct
http_pack_request.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
http-push.c and http-walker.c no longer have to use fetch_index or
setup_index; they simply need to use http_get_info_packs, a new http
method, in their fetch_indices implementations.
Move fetch_index() and rename to fetch_pack_index() in http.c; this
method is not meant to be used outside of http.c. It invokes
end_url_with_slash with base_url; apart from that change, the code is
identical.
Move setup_index() and rename to fetch_and_setup_pack_index() in
http.c; this method is not meant to be used outside of http.c.
Do not immediately set ret to 0 in http-walker.c::fetch_indices();
instead do it in the HTTP_MISSING_TARGET case, to make it clear that
the HTTP_OK and HTTP_MISSING_TARGET cases both return 0.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error message ("Unable to start request") has been removed, since
the http API already prints it.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new functions added are:
- http_request() (internal function)
- http_get_strbuf()
- http_get_file()
- http_error()
http_get_strbuf and http_get_file allow respectively to retrieve contents of
an URL to a strbuf or an opened file handle.
http_error prints out an error message containing the URL and the curl error
(in curl_errorstr).
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The logic to append a slash to the url if necessary in quote_ref_url
(added in 113106e "http.c: use strbuf API in quote_ref_url") has been
moved to a new function, end_url_with_slash.
The method takes a strbuf, the URL, and the path to be appended to the
URL. It first adds the URL to the strbuf. It then appends a slash
if the URL does not end with a slash.
The check on ref in quote_ref_url for a slash at the beginning has been
removed as a result of using end_url_with_slash. This check is not
needed, because slashes will be quoted anyway.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move RANGE_HEADER_SIZE to http.h.
Create no_pragma_header, the curl header list containing the header
"Pragma:" in http.[ch]. It is allocated in http_init, and freed in
http_cleanup. This replaces the no_pragma_header in http-push.c, and
the no_pragma_header member in walker_data in http-walker.c.
Create http_is_verbose. It is to be used by methods in http.c, and is
modified at the entry points of http.c's users, namely http-push.c
(when parsing options) and http-walker.c (in get_http_walker).
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using multi-pass authentication methods, the curl library may
need to rewind the read buffers (depending on how much already has
been fed to the server) used for providing data to HTTP PUT, POST or
PROPFIND, and in order to allow the library to do so, we need to tell
it how by providing either an ioctl callback or a seek callback.
This patch adds an ioctl callback, which should be usable on older
curl versions (since 7.12.3) than the seek callback (introduced in
curl 7.18.0).
Some HTTP servers (such as Apache) give an 401 error reply immediately
after receiving the headers (so no data has been read from the read
buffers, and thus no rewinding is needed), but other servers (such
as Lighttpd) only replies after the whole request has been sent and
all data has been read from the read buffers, making rewinding necessary.
Signed-off-by: Martin Storsjo <martin@martin.st>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Besides, we have already called easy_setopt with the option before coming
to this function if it was available, so there is no need to repeat it
here.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Curl is designed not to ask for password when only username is given in
the URL, but has a way for application to feed a (username, password) pair
to it. With this patch, you do not have to keep your password in
plaintext in your $HOME/.netrc file when talking with a password protected
URL with http://<username>@<host>/path/to/repository.git/ syntax.
The code handles only the http-walker side, not the push side. At least,
not yet. But interested parties can add support for it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We honor the command line options, environment variables, variables in
repository configuration file, variables in user's global configuration
file, variables in the system configuration file, and then finally use
built-in default. To implement this semantics, the code should:
- start from built-in default values;
- call git_config() with the configuration parser callback, which
implements "later definition overrides earlier ones" logic
(git_config() reads the system's, user's and then repository's
configuration file in this order);
- override the result from the above with environment variables if set;
- override the result from the above with command line options.
The initialization code http_init() for http transfer got this wrong, and
implemented a "first one wins, ignoring the later ones" in http_options(),
to compensate this mistake, read environment variables before calling
git_config(). This is all wrong.
As a second class citizen, the http codepath hasn't been audited as
closely as other parts of the system, but we should try to bring sanity to
it, before inviting contributors to improve on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition, ''quote_ref_url'' inserts a slash between the base URL and
remote ref path only if needed. Previously, this insertion wasn't
contingent on the lack of a separating slash.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
GIT 1.6.0.5
"git diff <tree>{3,}": do not reverse order of arguments
tag: delete TAG_EDITMSG only on successful tag
gitweb: Make project specific override for 'grep' feature work
http.c: use 'git_config_string' to get 'curl_http_proxy'
fetch-pack: Avoid memcpy() with src==dst