Commit Graph

273 Commits

Author SHA1 Message Date
Jeff King
92b9bf4a15 Merge branch 'pt/http-socks-proxy' into maint
Add support for talking http/https over socks proxy.

* pt/http-socks-proxy:
  remote-http(s): support SOCKS proxies
2015-12-01 17:19:12 -05:00
Charles Bailey
bf9acba2c1 http: treat config options sslCAPath and sslCAInfo as paths
This enables ~ and ~user expansion for these config options.

Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-24 18:51:00 -05:00
brian m. carlson
f4e54d02b8 Convert struct ref to use object_id.
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
Pat Thoyts
6d7afe07f2 remote-http(s): support SOCKS proxies
With this patch we properly support SOCKS proxies, configured e.g. like
this:

	git config http.proxy socks5://192.168.67.1:32767

Without this patch, Git mistakenly tries to use SOCKS proxies as if they
were HTTP proxies, resulting in a error message like:

	fatal: unable to access 'http://.../': Proxy CONNECT aborted

This patch was required to work behind a faulty AP and scraped from
http://stackoverflow.com/questions/15227130/#15228479 and guarded with
an appropriate cURL version check by Johannes Schindelin.

Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-20 07:31:39 -05:00
Ramsay Jones
838ecf0b0f http: fix some printf format warnings
Commit f8117f55 ("http: use off_t to store partial file size",
02-11-2015) changed the type of some variables from long to off_t.
Unfortunately, the off_t type is not portable and can be represented
by several different actual types (even multiple types on the same
platform). This makes it difficult to print an off_t variable in
a platform independent way. As a result, this commit causes gcc to
issue some printf format warnings on a couple of different platforms.

In order to suppress the warnings, change the format specifier to use
the PRIuMAX macro and cast the off_t argument to uintmax_t. (See also
the http_opt_request_remainder() function, which uses the same
solution).

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-11 19:10:41 -05:00
Jeff King
f8117f550b http: use off_t to store partial file size
When we try to resume transfer of a partially-downloaded
object or pack, we fopen() the existing file for append,
then use ftell() to get the current position. We use a
"long", which can hold only 2GB on a 32-bit system, even
though packfiles may be larger than that.

Let's switch to using off_t, which should hold any file size
our system is capable of storing. We need to use ftello() to
get the off_t. This is in POSIX and hopefully available
everywhere; if not, we should be able to wrap it by falling
back to ftell(), which would presumably return "-1" on such
a large file (and we would simply skip resuming in that case).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-02 14:19:54 -08:00
David Turner
835c4d3689 http.c: use CURLOPT_RANGE for range requests
A HTTP server is permitted to return a non-range response to a HTTP
range request (and Apache httpd in fact does this in some cases).
While libcurl knows how to correctly handle this (by skipping bytes
before and after the requested range), it only turns on this handling
if it is aware that a range request is being made.  By manually
setting the range header instead of using CURLOPT_RANGE, we were
hiding the fact that this was a range request from libcurl.  This
could cause corruption.

Signed-off-by: David Turner <dturner@twopensource.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-02 14:18:06 -08:00
Junio C Hamano
78891795df Merge branch 'jk/war-on-sprintf'
Many allocations that is manually counted (correctly) that are
followed by strcpy/sprintf have been replaced with a less error
prone constructs such as xstrfmt.

Macintosh-specific breakage was noticed and corrected in this
reroll.

* jk/war-on-sprintf: (70 commits)
  name-rev: use strip_suffix to avoid magic numbers
  use strbuf_complete to conditionally append slash
  fsck: use for_each_loose_file_in_objdir
  Makefile: drop D_INO_IN_DIRENT build knob
  fsck: drop inode-sorting code
  convert strncpy to memcpy
  notes: document length of fanout path with a constant
  color: add color_set helper for copying raw colors
  prefer memcpy to strcpy
  help: clean up kfmclient munging
  receive-pack: simplify keep_arg computation
  avoid sprintf and strcpy with flex arrays
  use alloc_ref rather than hand-allocating "struct ref"
  color: add overflow checks for parsing colors
  drop strcpy in favor of raw sha1_to_hex
  use sha1_to_hex_r() instead of strcpy
  daemon: use cld->env_array when re-spawning
  stat_tracking_info: convert to argv_array
  http-push: use an argv_array for setup_revisions
  fetch-pack: use argv_array for index-pack / unpack-objects
  ...
2015-10-20 15:24:01 -07:00
Junio C Hamano
3adc4ec7b9 Sync with v2.5.4 2015-09-28 19:16:54 -07:00
Junio C Hamano
11a458befc Sync with 2.4.10 2015-09-28 15:33:56 -07:00
Junio C Hamano
6343e2f6f2 Sync with 2.3.10 2015-09-28 15:28:31 -07:00
Blake Burkhart
b258116462 http: limit redirection depth
By default, libcurl will follow circular http redirects
forever. Let's put a cap on this so that somebody who can
trigger an automated fetch of an arbitrary repository (e.g.,
for CI) cannot convince git to loop infinitely.

The value chosen is 20, which is the same default that
Firefox uses.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 15:32:28 -07:00
Blake Burkhart
f4113cac0c http: limit redirection to protocol-whitelist
Previously, libcurl would follow redirection to any protocol
it was compiled for support with. This is desirable to allow
redirection from HTTP to HTTPS. However, it would even
successfully allow redirection from HTTP to SFTP, a protocol
that git does not otherwise support at all. Furthermore
git's new protocol-whitelisting could be bypassed by
following a redirect within the remote helper, as it was
only enforced at transport selection time.

This patch limits redirects within libcurl to HTTP, HTTPS,
FTP and FTPS. If there is a protocol-whitelist present, this
list is limited to those also allowed by the whitelist. As
redirection happens from within libcurl, it is impossible
for an HTTP redirect to a protocol implemented within
another remote helper.

When the curl version git was compiled with is too old to
support restrictions on protocol redirection, we warn the
user if GIT_ALLOW_PROTOCOL restrictions were requested. This
is a little inaccurate, as even without that variable in the
environment, we would still restrict SFTP, etc, and we do
not warn in that case. But anything else means we would
literally warn every time git accesses an http remote.

This commit includes a test, but it is not as robust as we
would hope. It redirects an http request to ftp, and checks
that curl complained about the protocol, which means that we
are relying on curl's specific error message to know what
happened. Ideally we would redirect to a working ftp server
and confirm that we can clone without protocol restrictions,
and not with them. But we do not have a portable way of
providing an ftp server, nor any other protocol that curl
supports (https is the closest, but we would have to deal
with certificates).

[jk: added test and version warning]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 15:30:39 -07:00
Jeff King
9ae97018fb use strip_suffix and xstrfmt to replace suffix
When we want to convert "foo.pack" to "foo.idx", we do it by
duplicating the original string and then munging the bytes
in place. Let's use strip_suffix and xstrfmt instead, which
has several advantages:

  1. It's more clear what the intent is.

  2. It does not implicitly rely on the fact that
     strlen(".idx") <= strlen(".pack") to avoid an overflow.

  3. We communicate the assumption that the input file ends
     with ".pack" (and get a run-time check that this is so).

  4. We drop calls to strcpy, which makes auditing the code
     base easier.

Likewise, we can do this to convert ".pack" to ".bitmap",
avoiding some manual memory computation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Jeff King
5096d4909f convert trivial sprintf / strcpy calls to xsnprintf
We sometimes sprintf into fixed-size buffers when we know
that the buffer is large enough to fit the input (either
because it's a constant, or because it's numeric input that
is bounded in size). Likewise with strcpy of constant
strings.

However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Junio C Hamano
ed070a4007 Merge branch 'ep/http-configure-ssl-version'
A new configuration variable http.sslVersion can be used to specify
what specific version of SSL/TLS to use to make a connection.

* ep/http-configure-ssl-version:
  http: add support for specifying the SSL version
2015-08-26 15:45:31 -07:00
Junio C Hamano
51a22ce147 Merge branch 'jc/finalize-temp-file'
Long overdue micro clean-up.

* jc/finalize-temp-file:
  sha1_file.c: rename move_temp_to_file() to finalize_object_file()
2015-08-19 14:48:55 -07:00
Elia Pinto
01861cb7a2 http: add support for specifying the SSL version
Teach git about a new option, "http.sslVersion", which permits one
to specify the SSL version to use when negotiating SSL connections.
The setting can be overridden by the GIT_SSL_VERSION environment
variable.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-17 10:16:34 -07:00
Junio C Hamano
cb5add5868 sha1_file.c: rename move_temp_to_file() to finalize_object_file()
Since 5a688fe4 ("core.sharedrepository = 0mode" should set, not
loosen, 2009-03-25), we kept reminding ourselves:

    NEEDSWORK: this should be renamed to finalize_temp_file() as
    "moving" is only a part of what it does, when no patch between
    master to pu changes the call sites of this function.

without doing anything about it.  Let's do so.

The purpose of this function was not to move but to finalize.  The
detail of the primarily implementation of finalizing was to link the
temporary file to its final name and then to unlink, which wasn't
even "moving".  The alternative implementation did "move" by calling
rename(2), which is a fun tangent.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 11:10:37 -07:00
Junio C Hamano
0e521a41b5 Merge branch 'et/http-proxyauth'
We used to ask libCURL to use the most secure authentication method
available when talking to an HTTP proxy only when we were told to
talk to one via configuration variables.  We now ask libCURL to
always use the most secure authentication method, because the user
can tell libCURL to use an HTTP proxy via an environment variable
without using configuration variables.

* et/http-proxyauth:
  http: always use any proxy auth method available
2015-07-13 14:00:24 -07:00
Enrique Tobis
5841520b03 http: always use any proxy auth method available
We set CURLOPT_PROXYAUTH to use the most secure authentication
method available only when the user has set configuration variables
to specify a proxy.  However, libcurl also supports specifying a
proxy through environment variables.  In that case libcurl defaults
to only using the Basic proxy authentication method, because we do
not use CURLOPT_PROXYAUTH.

Set CURLOPT_PROXYAUTH to always use the most secure authentication
method available, even when there is no git configuration telling us
to use a proxy. This allows the user to use environment variables to
configure a proxy that requires an authentication method different
from Basic.

Signed-off-by: Enrique A. Tobis <etobis@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-29 09:57:43 -07:00
Junio C Hamano
39fa79178f Merge branch 'ls/http-ssl-cipher-list'
Introduce http.<url>.SSLCipherList configuration variable to tweak
the list of cipher suite to be used with libcURL when talking with
https:// sites.

* ls/http-ssl-cipher-list:
  http: add support for specifying an SSL cipher list
2015-05-22 12:41:45 -07:00
Lars Kellogg-Stedman
f6f2a9e42d http: add support for specifying an SSL cipher list
Teach git about a new option, "http.sslCipherList", which permits one to
specify a list of ciphers to use when negotiating SSL connections.  The
setting can be overwridden by the GIT_SSL_CIPHER_LIST environment
variable.

Signed-off-by: Lars Kellogg-Stedman <lars@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-08 10:56:26 -07:00
Stefan Beller
826aed50cb http: release the memory of a http pack request as well
The cleanup function is used in 4 places now and it's always safe to
free up the memory as well.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-24 12:36:10 -07:00
Junio C Hamano
74c91d1f7a Merge branch 'ye/http-accept-language'
Compilation fix for a recent topic in 'master'.

* ye/http-accept-language:
  gettext.c: move get_preferred_languages() from http.c
2015-03-06 15:02:25 -08:00
Jeff King
93f7d9108a gettext.c: move get_preferred_languages() from http.c
Calling setlocale(LC_MESSAGES, ...) directly from http.c, without
including <locale.h>, was causing compilation warnings.  Move the
helper function to gettext.c that already includes the header and
where locale-related issues are handled.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-26 14:09:20 -08:00
Junio C Hamano
90eea883fd Merge branch 'tc/missing-http-proxyauth'
We did not check the curl library version before using
CURLOPT_PROXYAUTH feature that may not exist.

* tc/missing-http-proxyauth:
  http: support curl < 7.10.7
2015-02-25 15:40:12 -08:00
Junio C Hamano
117c1b333d Merge branch 'jk/dumb-http-idx-fetch-fix' into maint
A broken pack .idx file in the receiving repository prevented the
dumb http transport from fetching a good copy of it from the other
side.

* jk/dumb-http-idx-fetch-fix:
  dumb-http: do not pass NULL path to parse_pack_index
2015-02-24 22:10:37 -08:00
Junio C Hamano
f18e3896f7 Merge branch 'ye/http-accept-language'
Using environment variable LANGUAGE and friends on the client side,
HTTP-based transports now send Accept-Language when making requests.

* ye/http-accept-language:
  http: add Accept-Language header if possible
2015-02-18 11:44:57 -08:00
Junio C Hamano
b93b5b21b5 Merge branch 'jk/dumb-http-idx-fetch-fix'
A broken pack .idx file in the receiving repository prevented the
dumb http transport from fetching a good copy of it from the other
side.

* jk/dumb-http-idx-fetch-fix:
  dumb-http: do not pass NULL path to parse_pack_index
2015-02-17 10:15:25 -08:00
Junio C Hamano
c985aaf879 Merge branch 'jc/unused-symbols'
Mark file-local symbols as "static", and drop functions that nobody
uses.

* jc/unused-symbols:
  shallow.c: make check_shallow_file_for_update() static
  remote.c: make clear_cas_option() static
  urlmatch.c: make match_urls() static
  revision.c: make save_parents() and free_saved_parents() static
  line-log.c: make line_log_data_init() static
  pack-bitmap.c: make pack_bitmap_filename() static
  prompt.c: remove git_getpass() nobody uses
  http.c: make finish_active_slot() and handle_curl_result() static
2015-02-11 13:44:07 -08:00
Tom G. Christensen
1c2dbf2095 http: support curl < 7.10.7
Commit dd61399 introduced support for http proxies that require
authentication but it relies on the CURL_PROXYAUTH option which was
added in curl 7.10.7.
This makes sure proxy authentication is only enabled if libcurl can
support it.

Signed-off-by: Tom G. Christensen <tgc@statsbiblioteket.dk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-03 13:53:17 -08:00
Yi EungJun
f18604bbf2 http: add Accept-Language header if possible
Add an Accept-Language header which indicates the user's preferred
languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.

Examples:
  LANGUAGE= -> ""
  LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
  LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
  LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"

This gives git servers a chance to display remote error messages in
the user's preferred language.

Limit the number of languages to 1,000 because q-value must not be
smaller than 0.001, and limit the length of Accept-Language header to
4,000 bytes for some HTTP servers which cannot accept such long header.

Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-28 11:17:08 -08:00
Jeff King
8b9c2dd4de dumb-http: do not pass NULL path to parse_pack_index
Once upon a time, dumb http always fetched .idx files
directly into their final location, and then checked their
validity with parse_pack_index. This was refactored in
commit 750ef42 (http-fetch: Use temporary files for
pack-*.idx until verified, 2010-04-19), which uses the
following logic:

  1. If we have the idx already in place, see if it's
     valid (using parse_pack_index). If so, use it.

  2. Otherwise, fetch the .idx to a tempfile, check
     that, and if so move it into place.

  3. Either way, fetch the pack itself if necessary.

However, it got step 1 wrong. We pass a NULL path parameter
to parse_pack_index, so an existing .idx file always looks
broken. Worse, we do not treat this broken .idx as an
opportunity to re-fetch, but instead return an error,
ignoring the pack entirely. This can lead to a dumb-http
fetch failing to retrieve the necessary objects.

This doesn't come up much in practice, because it must be a
packfile that we found out about (and whose .idx we stored)
during an earlier dumb-http fetch, but whose packfile we
_didn't_ fetch. I.e., we did a partial clone of a
repository, didn't need some packfiles, and now a followup
fetch needs them.

Discovery and tests by Charles Bailey <charles@hashpling.org>.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-27 12:41:45 -08:00
Junio C Hamano
b90a3d7b32 http.c: make finish_active_slot() and handle_curl_result() static
They used to be used directly by remote-curl.c for the smart-http
protocol. But they got wrapped by run_one_slot() in beed336 (http:
never use curl_easy_perform, 2014-02-18).  Any future users are
expected to follow that route.

Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-15 11:00:52 -08:00
brian m. carlson
4dbe66464b remote-curl: fall back to Basic auth if Negotiate fails
Apache servers using mod_auth_kerb can be configured to allow the user
to authenticate either using Negotiate (using the Kerberos ticket) or
Basic authentication (using the Kerberos password).  Often, one will
want to use Negotiate authentication if it is available, but fall back
to Basic authentication if the ticket is missing or expired.

However, libcurl will try very hard to use something other than Basic
auth, even over HTTPS.  If Basic and something else are offered, libcurl
will never attempt to use Basic, even if the other option fails.
Teach the HTTP client code to stop trying authentication mechanisms that
don't use a password (currently Negotiate) after the first failure,
since if they failed the first time, they will never succeed.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-07 19:48:19 -08:00
Junio C Hamano
211836f77b Merge branch 'da/include-compat-util-first-in-c'
Code clean-up.

* da/include-compat-util-first-in-c:
  cleanups: ensure that git-compat-util.h is included first
2014-10-14 10:49:01 -07:00
David Aguilar
1c4b660412 cleanups: ensure that git-compat-util.h is included first
CodingGuidelines states that the first #include in C files should be
git-compat-util.h or another header file that includes it, such as
cache.h or builtin.h.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-15 12:05:14 -07:00
Junio C Hamano
6c1d42acae Merge branch 'br/http-init-fix'
Code clean-up.

* br/http-init-fix:
  http: style fixes for curl_multi_init error check
  http.c: die if curl_*_init fails
2014-09-11 10:33:28 -07:00
Jeff King
8837eb47f2 http: style fixes for curl_multi_init error check
Unless there is a good reason, we should use die() rather than
fprintf/exit. We can also shorten the message to match other curl init
failures (and match our usual lowercase no-full-stop style).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-21 10:53:55 -07:00
René Scharfe
d318027932 run-command: introduce CHILD_PROCESS_INIT
Most struct child_process variables are cleared using memset first after
declaration.  Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead.  That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20 09:53:37 -07:00
Bernhard Reiter
faa3807cfe http.c: die if curl_*_init fails
Signed-off-by: Bernhard Reiter <ockham@raz.or.at>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-18 10:11:42 -07:00
Junio C Hamano
e91ae32a01 Merge branch 'jk/skip-prefix'
* jk/skip-prefix:
  http-push: refactor parsing of remote object names
  imap-send: use skip_prefix instead of using magic numbers
  use skip_prefix to avoid repeated calculations
  git: avoid magic number with skip_prefix
  fetch-pack: refactor parsing in get_ack
  fast-import: refactor parsing of spaces
  stat_opt: check extra strlen call
  daemon: use skip_prefix to avoid magic numbers
  fast-import: use skip_prefix for parsing input
  use skip_prefix to avoid repeating strings
  use skip_prefix to avoid magic numbers
  transport-helper: avoid reading past end-of-string
  fast-import: fix read of uninitialized argv memory
  apply: use skip_prefix instead of raw addition
  refactor skip_prefix to return a boolean
  avoid using skip_prefix as a boolean
  daemon: mark some strings as const
  parse_diff_color_slot: drop ofs parameter
2014-07-09 11:33:28 -07:00
Jeff King
de8118e153 use skip_prefix to avoid repeated calculations
In some cases, we use starts_with to check for a prefix, and
then use an already-calculated prefix length to advance a
pointer past the prefix. There are no magic numbers or
duplicated strings here, but we can still make the code
simpler and more obvious by using skip_prefix.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:45:19 -07:00
Yi EungJun
f34a655d4d http: fix charset detection of extract_content_type()
extract_content_type() could not extract a charset parameter if the
parameter is not the first one and there is a whitespace and a following
semicolon just before the parameter. For example:

    text/plain; format=fixed ;charset=utf-8

And it also could not handle correctly some other cases, such as:

    text/plain; charset=utf-8; format=fixed
    text/plain; some-param="a long value with ;semicolons;"; charset=utf-8

Thanks-to: Jeff King <peff@peff.net>
Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-17 15:25:00 -07:00
Jeff King
c553fd1c1e http: default text charset to iso-8859-1
This is specified by RFC 2616 as the default if no "charset"
parameter is given.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27 09:59:22 -07:00
Jeff King
e31316263a http: optionally extract charset parameter from content-type
Since the previous commit, we now give a sanitized,
shortened version of the content-type header to any callers
who ask for it.

This patch adds back a way for them to cleanly access
specific parameters to the type. We could easily extract all
parameters and make them available via a string_list, but:

  1. That complicates the interface and memory management.

  2. In practice, no planned callers care about anything
     except the charset.

This patch therefore goes with the simplest thing, and we
can expand or change the interface later if it becomes
necessary.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27 09:59:19 -07:00
Jeff King
bf197fd7ee http: extract type/subtype portion of content-type
When we get a content-type from curl, we get the whole
header line, including any parameters, and without any
normalization (like downcasing or whitespace) applied.
If we later try to match it with strcmp() or even
strcasecmp(), we may get false negatives.

This could cause two visible behaviors:

  1. We might fail to recognize a smart-http server by its
     content-type.

  2. We might fail to relay text/plain error messages to
     users (especially if they contain a charset parameter).

This patch teaches the http code to extract and normalize
just the type/subtype portion of the string. This is
technically passing out less information to the callers, who
can no longer see the parameters. But none of the current
callers cares, and a future patch will add back an
easier-to-use method for accessing those parameters.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27 09:57:00 -07:00
Junio C Hamano
060be00621 Merge branch 'mh/object-code-cleanup'
* mh/object-code-cleanup:
  sha1_file.c: document a bunch of functions defined in the file
  sha1_file_name(): declare to return a const string
  find_pack_entry(): document last_found_pack
  replace_object: use struct members instead of an array
2014-03-14 14:26:29 -07:00
Michael Haggerty
30d6c6eabf sha1_file_name(): declare to return a const string
Change the return value of sha1_file_name() to (const char *).
(Callers have no business mucking about here.)  Change callers
accordingly, deleting a few superfluous temporary variables along the
way.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 09:10:22 -08:00
Jeff King
beed336c3e http: never use curl_easy_perform
We currently don't reuse http connections when fetching via
the smart-http protocol. This is bad because the TCP
handshake introduces latency, and especially because SSL
connection setup may be non-trivial.

We can fix it by consistently using curl's "multi"
interface.  The reason is rather complicated:

Our http code has two ways of being used: queuing many
"slots" to be fetched in parallel, or fetching a single
request in a blocking manner. The parallel code is built on
curl's "multi" interface. Most of the single-request code
uses http_request, which is built on top of the parallel
code (we just feed it one slot, and wait until it finishes).

However, one could also accomplish the single-request scheme
by avoiding curl's multi interface entirely and just using
curl_easy_perform. This is simpler, and is used by post_rpc
in the smart-http protocol.

It does work to use the same curl handle in both contexts,
as long as it is not at the same time.  However, internally
curl may not share all of the cached resources between both
contexts. In particular, a connection formed using the
"multi" code will go into a reuse pool connected to the
"multi" object. Further requests using the "easy" interface
will not be able to reuse that connection.

The smart http protocol does ref discovery via http_request,
which uses the "multi" interface, and then follows up with
the "easy" interface for its rpc calls. As a result, we make
two HTTP connections rather than reusing a single one.

We could teach the ref discovery to use the "easy"
interface. But it is only once we have done this discovery
that we know whether the protocol will be smart or dumb. If
it is dumb, then our further requests, which want to fetch
objects in parallel, will not be able to reuse the same
connection.

Instead, this patch switches post_rpc to build on the
parallel interface, which means that we use it consistently
everywhere. It's a little more complicated to use, but since
we have the infrastructure already, it doesn't add any code;
we can just factor out the relevant bits from http_request.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18 15:50:57 -08:00
Junio C Hamano
ad70448576 Merge branch 'cc/starts-n-ends-with'
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.

* cc/starts-n-ends-with:
  replace {pre,suf}fixcmp() with {starts,ends}_with()
  strbuf: introduce starts_with() and ends_with()
  builtin/remote: remove postfixcmp() and use suffixcmp() instead
  environment: normalize use of prefixcmp() by removing " != 0"
2013-12-17 12:02:44 -08:00
Christian Couder
5955654823 replace {pre,suf}fixcmp() with {starts,ends}_with()
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:13:21 -08:00
Junio C Hamano
c5a77e8f92 Merge branch 'bc/http-100-continue'
Issue "100 Continue" responses to help use of GSS-Negotiate
authentication scheme over HTTP transport when needed.

* bc/http-100-continue:
  remote-curl: fix large pushes with GSSAPI
  remote-curl: pass curl slot_results back through run_slot
  http: return curl's AUTHAVAIL via slot_results
2013-12-05 12:58:59 -08:00
Jeff King
0972ccd97c http: return curl's AUTHAVAIL via slot_results
Callers of the http code may want to know which auth types
were available for the previous request. But after finishing
with the curl slot, they are not supposed to look at the
curl handle again. We already handle returning other
information via the slot_results struct; let's add a flag to
check the available auth.

Note that older versions of curl did not support this, so we
simply return 0 (something like "-1" would be worse, as the
value is a bitflag and we might accidentally set a flag).
This is sufficient for the callers planned in this series,
who only trigger some optional behavior if particular bits
are set, and can live with a fake "no bits" answer.

Signed-off-by: Jeff King <peff@peff.net>
2013-10-31 10:05:55 -07:00
Junio C Hamano
177f0a4009 Merge branch 'jk/http-auth-redirects'
Handle the case where http transport gets redirected during the
authorization request better.

* jk/http-auth-redirects:
  http.c: Spell the null pointer as NULL
  remote-curl: rewrite base url from info/refs redirects
  remote-curl: store url as a strbuf
  remote-curl: make refs_url a strbuf
  http: update base URLs when we see redirects
  http: provide effective url to callers
  http: hoist credential request out of handle_curl_result
  http: refactor options to http_get_*
  http_request: factor out curlinfo_strbuf
  http_get_file: style fixes
2013-10-30 12:09:53 -07:00
Junio C Hamano
bb2fd90c7b Merge branch 'ew/keepalive'
* ew/keepalive:
  http: use curl's tcp keepalive if available
  http: enable keepalive on TCP sockets
2013-10-28 10:43:24 -07:00
Ramsay Jones
70900eda4a http.c: Spell the null pointer as NULL
Commit 1bbcc224 ("http: refactor options to http_get_*", 28-09-2013)
changed the type of final 'options' argument of the http_get_file()
function from an int to an 'struct http_get_options' pointer.
However, it neglected to update the (single) call site. Since this
call was passing '0' to that argument, it was (correctly) being
interpreted as a null pointer. Change to argument to NULL.

Noticed by sparse. ("Using plain integer as NULL pointer")

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-24 14:42:26 -07:00
Jeff King
47ce115370 http: use curl's tcp keepalive if available
Commit a15d069 taught git to use curl's SOCKOPTFUNCTION hook
to turn on TCP keepalives. However, modern versions of curl
have a TCP_KEEPALIVE option, which can do this for us. As an
added bonus, the curl code knows how to turn on keepalive
for a much wider variety of platforms. The only downside to
using this option is that not everybody has a new enough curl.
Let's split our keepalive options into three conditionals:

  1. With curl 7.25.0 and newer, we rely on curl to do it
     right.

  2. With older curl that still knows SOCKOPTFUNCTION, we
     use the code from a15d069.

  3. Otherwise, we are out of luck, and the call is a no-op.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-16 11:26:09 -07:00
Jeff King
c93c92f309 http: update base URLs when we see redirects
If a caller asks the http_get_* functions to go to a
particular URL and we end up elsewhere due to a redirect,
the effective_url field can tell us where we went.

It would be nice to remember this redirect and short-cut
further requests for two reasons:

  1. It's more efficient. Otherwise we spend an extra http
     round-trip to the server for each subsequent request,
     just to get redirected.

  2. If we end up with an http 401 and are going to ask for
     credentials, it is to feed them to the redirect target.
     If the redirect is an http->https upgrade, this means
     our credentials may be provided on the http leg, just
     to end up redirected to https. And if the redirect
     crosses server boundaries, then curl will drop the
     credentials entirely as it follows the redirect.

However, it, it is not enough to simply record the effective
URL we saw and use that for subsequent requests. We were
originally fed a "base" url like:

   http://example.com/foo.git

and we want to figure out what the new base is, even though
the URLs we see may be:

     original: http://example.com/foo.git/info/refs
    effective: http://example.com/bar.git/info/refs

Subsequent requests will not be for "info/refs", but for
other paths relative to the base. We must ask the caller to
pass in the original base, and we must pass the redirected
base back to the caller (so that it can generate more URLs
from it). Furthermore, we need to feed the new base to the
credential code, so that requests to credential helpers (or
to the user) match the URL we will be requesting.

This patch teaches http_request_reauth to do this munging.
Since it is the caller who cares about making more URLs, it
seems at first glance that callers could simply check
effective_url themselves and handle it. However, since we
need to update the credential struct before the second
re-auth request, we have to do it inside http_request_reauth.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-10-14 16:56:47 -07:00
Jeff King
78868962c0 http: provide effective url to callers
When we ask curl to access a URL, it may follow one or more
redirects to reach the final location. We have no idea
this has happened, as curl takes care of the details and
simply returns the final content to us.

The final URL that we ended up with can be accessed via
CURLINFO_EFFECTIVE_URL. Let's make that optionally available
to callers of http_get_*, so that they can make further
decisions based on the redirection.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-10-14 16:55:23 -07:00
Jeff King
2501aff8b7 http: hoist credential request out of handle_curl_result
When we are handling a curl response code in http_request or
in the remote-curl RPC code, we use the handle_curl_result
helper to translate curl's response into an easy-to-use
code. When we see an HTTP 401, we do one of two things:

  1. If we already had a filled-in credential, we mark it as
     rejected, and then return HTTP_NOAUTH to indicate to
     the caller that we failed.

  2. If we didn't, then we ask for a new credential and tell
     the caller HTTP_REAUTH to indicate that they may want
     to try again.

Rejecting in the first case makes sense; it is the natural
result of the request we just made. However, prompting for
more credentials in the second step does not always make
sense. We do not know for sure that the caller is going to
make a second request, and nor are we sure that it will be
to the same URL. Logically, the prompt belongs not to the
request we just finished, but to the request we are (maybe)
about to make.

In practice, it is very hard to trigger any bad behavior.
Currently, if we make a second request, it will always be to
the same URL (even in the face of redirects, because curl
handles the redirects internally). And we almost always
retry on HTTP_REAUTH these days. The one exception is if we
are streaming a large RPC request to the server (e.g., a
pushed packfile), in which case we cannot restart. It's
extremely unlikely to see a 401 response at this stage,
though, as we would typically have seen it when we sent a
probe request, before streaming the data.

This patch drops the automatic prompt out of case 2, and
instead requires the caller to do it. This is a few extra
lines of code, and the bug it fixes is unlikely to come up
in practice. But it is conceptually cleaner, and paves the
way for better handling of credentials across redirects.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-10-14 16:55:13 -07:00
Eric Wong
a15d069a19 http: enable keepalive on TCP sockets
This is a follow up to commit e47a8583 (enable SO_KEEPALIVE for
connected TCP sockets, 2011-12-06).

Sockets may never receive notification of some link errors,
causing "git fetch" or similar processes to hang forever.
Enabling keepalive messages allows hung processes to error out
after a few minutes/hours depending on the keepalive settings of
the system.

I noticed this problem with some non-interactive cronjobs getting
hung when talking to HTTP servers.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-10-14 07:03:59 -07:00
Jeff King
1bbcc224cc http: refactor options to http_get_*
Over time, the http_get_strbuf function has grown several
optional parameters. We now have a bitfield with multiple
boolean options, as well as an optional strbuf for returning
the content-type of the response. And a future patch in this
series is going to add another strbuf option.

Treating these as separate arguments has a few downsides:

  1. Most call sites need to add extra NULLs and 0s for the
     options they aren't interested in.

  2. The http_get_* functions are actually wrappers around
     2 layers of low-level implementation functions. We have
     to pass these options through individually.

  3. The http_get_strbuf wrapper learned these options, but
     nobody bothered to do so for http_get_file, even though
     it is backed by the same function that does understand
     the options.

Let's consolidate the options into a single struct. For the
common case of the default options, we'll allow callers to
simply pass a NULL for the options struct.

The resulting code is often a few lines longer, but it ends
up being easier to read (and to change as we add new
options, since we do not need to update each call site).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-30 17:21:59 -07:00
Jeff King
132b70a2ed http_request: factor out curlinfo_strbuf
When we retrieve the content-type of an http response, curl
gives us a pointer to internal storage, which we then copy
into a strbuf. Let's factor out the get-and-copy routine,
which can be used for getting other curl info.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-30 13:04:45 -07:00
Jeff King
3d1fb769b2 http_get_file: style fixes
Besides being ugly, the extra parentheses are idiomatic for
suppressing compiler warnings when we are assigning within a
conditional. We aren't doing that here, and they just
confuse the reader.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-30 13:04:31 -07:00
Junio C Hamano
a0a08d48d0 Merge branch 'jc/url-match'
Allow section.<urlpattern>.var configuration variables to be
treated as a "virtual" section.var given a URL, and use the
mechanism to enhance http.* configuration variables.

This is a reroll of Kyle J. McKay's work.

* jc/url-match:
  builtin/config.c: compilation fix
  config: "git config --get-urlmatch" parses section.<url>.key
  builtin/config: refactor collect_config()
  config: parse http.<url>.<variable> using urlmatch
  config: add generic callback wrapper to parse section.<url>.key
  config: add helper to normalize and match URLs
  http.c: fix parsing of http.sslCertPasswordProtected variable
2013-09-09 14:50:36 -07:00
Kyle J. McKay
6a56993b2e config: parse http.<url>.<variable> using urlmatch
Use the urlmatch_config_entry() to wrap the underlying
http_options() two-level variable parser in order to set
http.<variable> to the value with the most specific URL in the
configuration.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-05 16:02:03 -07:00
Junio C Hamano
3f4ccd2b0b http.c: fix parsing of http.sslCertPasswordProtected variable
The existing code triggers only when the configuration variable is
set to true.  Once the variable is set to true in a more generic
configuration file (e.g. ~/.gitconfig), it cannot be overriden to
false in the repository specific one (e.g. .git/config).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-31 12:09:13 -07:00
Dave Borowitz
912b2acf2f http: add http.savecookies option to write out HTTP cookies
HTTP servers may send Set-Cookie headers in a response and expect them
to be set on subsequent requests. By default, libcurl behavior is to
store such cookies in memory and reuse them across requests within a
single session. However, it may also make sense, depending on the
server and the cookies, to store them across sessions. Provide users
an option to enable this behavior, writing cookies out to the same
file specified in http.cookiefile.

Signed-off-by: Dave Borowitz <dborowitz@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-30 09:19:04 -07:00
Junio C Hamano
dc2ed04c23 Merge branch 'bc/http-keep-memory-given-to-curl'
Older cURL wanted piece of memory we call it with to be stable, but
we updated the auth material after handing it to a call.

* bc/http-keep-memory-given-to-curl:
  http.c: don't rewrite the user:passwd string multiple times
2013-06-27 14:29:49 -07:00
Brandon Casey
a94cf2cb7e http.c: don't rewrite the user:passwd string multiple times
Curl older than 7.17 (RHEL 4.X provides 7.12 and RHEL 5.X provides
7.15) requires that we manage any strings that we pass to it as
pointers.  So, we really shouldn't be modifying this strbuf after we
have passed it to curl.

Our interaction with curl is currently safe (before or after this
patch) since the pointer that is passed to curl is never invalidated;
it is repeatedly rewritten with the same sequence of characters but
the strbuf functions never need to allocate a larger string, so the
same memory buffer is reused.

This "guarantee" of safety is somewhat subtle and could be overlooked
by someone who may want to add a more complex handling of the username
and password.  So, let's stop modifying this strbuf after we have
passed it to curl, but also leave a note to describe the assumptions
that have been made about username/password lifetime and to draw
attention to the code.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-19 10:00:26 -07:00
Junio C Hamano
574d51b575 Merge branch 'mv/ssl-ftp-curl'
Does anybody really use commit walkers over (s)ftp?

* mv/ssl-ftp-curl:
  Support FTP-over-SSL/TLS for regular FTP
2013-04-19 13:31:08 -07:00
Jeff King
b793acf14c http: set curl FAILONERROR each time we select a handle
Because we reuse curl handles for multiple requests, the
setup of a handle happens in two stages: stable, global
setup and per-request setup. The lifecycle of a handle is
something like:

  1. get_curl_handle; do basic global setup that will last
     through the whole program (e.g., setting the user
     agent, ssl options, etc)

  2. get_active_slot; set up a per-request baseline (e.g.,
     clearing the read/write functions, making it a GET
     request, etc)

  3. perform the request with curl_*_perform functions

  4. goto step 2 to perform another request

Breaking it down this way means we can avoid doing global
setup from step (1) repeatedly, but we still finish step (2)
with a predictable baseline setup that callers can rely on.

Until commit 6d052d7 (http: add HTTP_KEEP_ERROR option,
2013-04-05), setting curl's FAILONERROR option was a global
setup; we never changed it. However, 6d052d7 introduced an
option where some requests might turn off FAILONERROR. Later
requests using the same handle would have the option
unexpectedly turned off, which meant they would not notice
http failures at all.

This could easily be seen in the test-suite for the
"half-auth" cases of t5541 and t5551. The initial requests
turned off FAILONERROR, which meant it was erroneously off
for the rpc POST. That worked fine for a successful request,
but meant that we failed to react properly to the HTTP 401
(instead, we treated whatever the server handed us as a
successful message body).

The solution is simple: now that FAILONERROR is a
per-request setting, we move it to get_active_slot to make
sure it is reset for each request.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-16 10:13:46 -07:00
Modestas Vainius
4bc444eb64 Support FTP-over-SSL/TLS for regular FTP
Add a boolean http.sslTry option which allows to enable AUTH SSL/TLS and
encrypted data transfers when connecting via regular FTP protocol.

Default is false since it might trigger certificate verification errors on
misconfigured servers.

Signed-off-by: Modestas Vainius <modestas@vainius.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-12 08:52:23 -07:00
Jeff King
4df13f69e9 http: drop http_error function
This function is a single-liner and is only called from one
place. Just inline it, which makes the code more obvious.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:46 -07:00
Jeff King
39a570f26c http: re-word http error message
When we report an http error code, we say something like:

  error: The requested URL reported failure: 403 Forbidden while accessing http://example.com/repo.git

Everything between "error:" and "while" is written by curl,
and the resulting sentence is hard to read (especially
because there is no punctuation between curl's sentence and
the remainder of ours). Instead, let's re-order this to give
better flow:

  error: unable to access 'http://example.com/repo.git: The requested URL reported failure: 403 Forbidden

This is still annoyingly long, but at least reads more
clearly left to right.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:45 -07:00
Jeff King
67d2a7b5c5 http: simplify http_error helper function
This helper function should really be a one-liner that
prints an error message, but it has ended up unnecessarily
complicated:

  1. We call error() directly when we fail to start the curl
     request, so we must later avoid printing a duplicate
     error in http_error().

     It would be much simpler in this case to just stuff the
     error message into our usual curl_errorstr buffer
     rather than printing it ourselves. This means that
     http_error does not even have to care about curl's exit
     value (the interesting part is in the errorstr buffer
     already).

  2. We return the "ret" value passed in to us, but none of
     the callers actually cares about our return value. We
     can just drop this entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:44 -07:00
Jeff King
6d052d78d7 http: add HTTP_KEEP_ERROR option
We currently set curl's FAILONERROR option, which means that
any http failures are reported as curl errors, and the
http body content from the server is thrown away.

This patch introduces a new option to http_get_strbuf which
specifies that the body content from a failed http response
should be placed in the destination strbuf, where it can be
accessed by the caller.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06 18:56:41 -07:00
Jeff King
047ec60205 pkt-line: move LARGE_PACKET_MAX definition from sideband
Having the packet sizes defined near the packet read/write
functions makes more sense.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:22 -08:00
Jeff King
3443db51a0 http_request: reset "type" strbuf before adding
Callers may pass us a strbuf which we use to record the
content-type of the response. However, we simply appended to
it rather than overwriting its contents, meaning that cruft
in the strbuf gave us a bogus type. E.g., the multiple
requests triggered by http_request could yield a type like
"text/plainapplication/x-git-receive-pack-advertisement".

Reported-by: Michael Schubert <mschub@elegosoft.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-06 07:50:56 -08:00
Shawn Pearce
4656bf47fc Verify Content-Type from smart HTTP servers
Before parsing a suspected smart-HTTP response verify the returned
Content-Type matches the standard. This protects a client from
attempting to process a payload that smells like a smart-HTTP
server response.

JGit has been doing this check on all responses since the dawn of
time. I mistakenly failed to include it in git-core when smart HTTP
was introduced. At the time I didn't know how to get the Content-Type
from libcurl. I punted, meant to circle back and fix this, and just
plain forgot about it.

Signed-off-by: Shawn Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-04 10:22:36 -08:00
Junio C Hamano
8bc714b408 Merge branch 'rb/http-cert-cred-no-username-prompt' into maint
* rb/http-cert-cred-no-username-prompt:
  http.c: Avoid username prompt for certifcate credentials
2013-01-10 14:03:54 -08:00
Rene Bredlau
75e9a405d4 http.c: Avoid username prompt for certifcate credentials
If sslCertPasswordProtected is set to true do not ask for username to decrypt rsa key. This question is pointless, the key is only protected by a password. Internaly the username is simply set to "".

Signed-off-by: Rene Bredlau <git@unrelated.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-21 10:19:40 -08:00
Jeff King
23a50a1fb1 Merge branch 'sz/maint-curl-multi-timeout'
Sometimes curl_multi_timeout() function suggested a wrong timeout
value when there is no file descriptors to wait on and the http
transport ended up sleeping for minutes in select(2) system call.
Detect this and reduce the wait timeout in such a case.

* sz/maint-curl-multi-timeout:
  Fix potential hang in https handshake
2012-11-09 12:50:56 -05:00
Jeff King
58f3f9893d Merge branch 'jk/maint-http-init-not-in-result-handler'
Further clean-up to the http codepath that picks up results after
cURL library is done with one request slot.

* jk/maint-http-init-not-in-result-handler:
  http: do not set up curl auth after a 401
  remote-curl: do not call run_slot repeatedly
2012-10-29 04:13:09 -04:00
Stefan Zager
7202b81ffc Fix potential hang in https handshake
It has been observed that curl_multi_timeout may return a very long
timeout value (e.g., 294 seconds and some usec) just before
curl_multi_fdset returns no file descriptors for reading.  The
upshot is that select() will hang for a long time -- long enough for
an https handshake to be dropped.  The observed behavior is that
the git command will hang at the terminal and never transfer any
data.

This patch is a workaround for a probable bug in libcurl.  The bug
only seems to manifest around a very specific set of circumstances:

- curl version (from curl/curlver.h):

 #define LIBCURL_VERSION_NUM 0x071307

- git-remote-https running on an ubuntu-lucid VM.
- Connecting through squid proxy running on another VM.

Interestingly, the problem doesn't manifest if a host connects
through squid proxy running on localhost; only if the proxy is on
a separate VM (not sure if the squid host needs to be on a separate
physical machine).  That would seem to suggest that this issue
is timing-sensitive.

This patch is more or less in line with a recommendation in the
curl docs about how to behave when curl_multi_fdset doesn't return
and file descriptors:

http://curl.haxx.se/libcurl/c/curl_multi_fdset.html

Signed-off-by: Stefan Zager <szager@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-19 14:15:17 -07:00
Junio C Hamano
e98fa647aa Merge branch 'jk/maint-http-half-auth-push' into maint
* jk/maint-http-half-auth-push:
  http: fix segfault in handle_curl_result
2012-10-17 10:29:24 -07:00
Junio C Hamano
053a08f5bb Merge branch 'jk/maint-http-half-auth-push'
Fixes a regression in maint-1.7.11 (v1.7.11.7), maint (v1.7.12.1)
and master (v1.8.0-rc0).

* jk/maint-http-half-auth-push:
  http: fix segfault in handle_curl_result
2012-10-16 11:44:37 -07:00
Jeff King
1960897ebc http: do not set up curl auth after a 401
When we get an http 401, we prompt for credentials and put
them in our global credential struct. We also feed them to
the curl handle that produced the 401, with the intent that
they will be used on a retry.

When the code was originally introduced in commit 42653c0,
this was a necessary step. However, since dfa1725, we always
feed our global credential into every curl handle when we
initialize the slot with get_active_slot. So every further
request already feeds the credential to curl.

Moreover, accessing the slot here is somewhat dubious. After
the slot has produced a response, we don't actually control
it any more.  If we are using curl_multi, it may even have
been re-initialized to handle a different request.

It just so happens that we will reuse the curl handle within
the slot in such a case, and that because we only keep one
global credential, it will be the one we want.  So the
current code is not buggy, but it is misleading.

By cleaning it up, we can remove the slot argument entirely
from handle_curl_result, making it much more obvious that
slots should not be accessed after they are marked as
finished.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12 09:45:15 -07:00
Jeff King
188923f0d1 http: fix segfault in handle_curl_result
When we create an http active_request_slot, we can set its
"results" pointer back to local storage. The http code will
fill in the details of how the request went, and we can
access those details even after the slot has been cleaned
up.

Commit 8809703 (http: factor out http error code handling)
switched us from accessing our local results struct directly
to accessing it via the "results" pointer of the slot. That
means we're accessing the slot after it has been marked as
finished, defeating the whole purpose of keeping the results
storage separate.

Most of the time this doesn't matter, as finishing the slot
does not actually clean up the pointer. However, when using
curl's multi interface with the dumb-http revision walker,
we might actually start a new request before handing control
back to the original caller. In that case, we may reuse the
slot, zeroing its results pointer, and leading the original
caller to segfault while looking for its results inside the
slot.

Instead, we need to pass a pointer to our local results
storage to the handle_curl_result function, rather than
relying on the pointer in the slot struct. This matches what
the original code did before the refactoring (which did not
use a separate function, and therefore just accessed the
results struct directly).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12 09:42:31 -07:00
Shawn O. Pearce
aa90b9697f Enable info/refs gzip decompression in HTTP client
Some HTTP servers try to use gzip compression on the /info/refs
request to save transfer bandwidth. Repositories with many tags
may find the /info/refs request can be gzipped to be 50% of the
original size due to the few but often repeated bytes used (hex
SHA-1 and commonly digits in tag names).

For most HTTP requests enable "Accept-Encoding: gzip" ensuring
the /info/refs payload can use this encoding format.

Only request gzip encoding from servers. Although deflate is
supported by libcurl, most servers have standardized on gzip
encoding for compression as that is what most browsers support.
Asking for deflate increases request sizes by a few bytes, but is
unlikely to ever be used by a server.

Disable the Accept-Encoding header on probe RPCs as response bodies
are supposed to be exactly 4 bytes long, "0000". The HTTP headers
requesting and indicating compression use more space than the data
transferred in the body.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-20 10:26:50 -07:00
Junio C Hamano
3503e9ab32 Merge branch 'maint-1.7.11' into maint 2012-09-12 14:08:05 -07:00
Junio C Hamano
7d9483c299 Merge branch 'jk/maint-http-half-auth-push' into maint-1.7.11
Pushing to smart HTTP server with recent Git fails without having
the username in the URL to force authentication, if the server is
configured to allow GET anonymously, while requiring authentication
for POST.

* jk/maint-http-half-auth-push:
  http: prompt for credentials on failed POST
  http: factor out http error code handling
  t: test http access to "half-auth" repositories
  t: test basic smart-http authentication
  t/lib-httpd: recognize */smart/* repos as smart-http
  t/lib-httpd: only route auth/dumb to dumb repos
  t5550: factor out http auth setup
  t5550: put auth-required repo in auth/dumb
2012-09-12 13:58:23 -07:00
Jeff King
8809703072 http: factor out http error code handling
Most of our http requests go through the http_request()
interface, which does some nice post-processing on the
results. In particular, it handles prompting for missing
credentials as well as approving and rejecting valid or
invalid credentials. Unfortunately, it only handles GET
requests. Making it handle POSTs would be quite complex, so
let's pull result handling code into its own function so
that it can be reused from the POST code paths.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-27 10:49:09 -07:00
Joachim Schmitz
4246b0bd90 http.c: don't use curl_easy_strerror prior to curl-7.12.0
Reverts be22d92 (http: avoid empty error messages for some curl
errors, 2011-09-05) on platforms with older versions of libcURL
where the function is not available.

Signed-off-by: Joachim Schmitz <jojo@schmitz-digital.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-23 14:23:18 -07:00
Jeff King
745c7c8e62 http: get default user-agent from git_user_agent
This means we will respect the GIT_USER_AGENT build-time
configuration and run-time environment variable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-06-03 13:11:54 -07:00
Pete Wyckoff
82247e9bd5 remove superfluous newlines in error messages
The error handling routines add a newline.  Remove
the duplicate ones in error messages.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-30 15:45:51 -07:00
Jeff King
6f4c347ca1 http: use newer curl options for setting credentials
We give the username and password to curl by sticking them
in a buffer of the form "user:pass" and handing the result
to CURLOPT_USERPWD. Since curl 7.19.1, there is a split
mechanism, where you can specify each element individually.

This has the advantage that a username can contain a ":"
character. It also is less code for us, since we can hand
our strings over to curl directly. And since curl 7.17.0 and
higher promise to copy the strings for us, we we don't even
have to worry about memory ownership issues.

Unfortunately, we have to keep the ugly code for old curl
around, but as it is now nicely #if'd out, we can easily get
rid of it when we decide that 7.19.1 is "old enough".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-14 16:04:25 -07:00
Jeff King
aa0834a04e http: clean up leak in init_curl_http_auth
When we have a credential to give to curl, we must copy it
into a "user:pass" buffer and then hand the buffer to curl.
Old versions of curl did not copy the buffer, and we were
expected to keep it valid. Newer versions of curl will copy
the buffer.

Our solution was to use a strbuf and detach it, giving
ownership of the resulting buffer to curl. However, this
meant that we were leaking the buffer on newer versions of
curl, since curl was just copying it and throwing away the
string we passed. Furthermore, when we replaced a
credential (e.g., because our original one was rejected), we
were also leaking on both old and new versions of curl.

This got even worse in the last patch, which started
replacing the credential (and thus leaking) on every http
request.

Instead, let's use a static buffer to make the ownership
more clear and less leaky.  We already keep a static "struct
credential", so we are only handling a single credential at
a time, anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-14 16:04:24 -07:00