When we try to resume transfer of a partially-downloaded
object or pack, we fopen() the existing file for append,
then use ftell() to get the current position. We use a
"long", which can hold only 2GB on a 32-bit system, even
though packfiles may be larger than that.
Let's switch to using off_t, which should hold any file size
our system is capable of storing. We need to use ftello() to
get the off_t. This is in POSIX and hopefully available
everywhere; if not, we should be able to wrap it by falling
back to ftell(), which would presumably return "-1" on such
a large file (and we would simply skip resuming in that case).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A HTTP server is permitted to return a non-range response to a HTTP
range request (and Apache httpd in fact does this in some cases).
While libcurl knows how to correctly handle this (by skipping bytes
before and after the requested range), it only turns on this handling
if it is aware that a range request is being made. By manually
setting the range header instead of using CURLOPT_RANGE, we were
hiding the fact that this was a range request from libcurl. This
could cause corruption.
Signed-off-by: David Turner <dturner@twopensource.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many allocations that is manually counted (correctly) that are
followed by strcpy/sprintf have been replaced with a less error
prone constructs such as xstrfmt.
Macintosh-specific breakage was noticed and corrected in this
reroll.
* jk/war-on-sprintf: (70 commits)
name-rev: use strip_suffix to avoid magic numbers
use strbuf_complete to conditionally append slash
fsck: use for_each_loose_file_in_objdir
Makefile: drop D_INO_IN_DIRENT build knob
fsck: drop inode-sorting code
convert strncpy to memcpy
notes: document length of fanout path with a constant
color: add color_set helper for copying raw colors
prefer memcpy to strcpy
help: clean up kfmclient munging
receive-pack: simplify keep_arg computation
avoid sprintf and strcpy with flex arrays
use alloc_ref rather than hand-allocating "struct ref"
color: add overflow checks for parsing colors
drop strcpy in favor of raw sha1_to_hex
use sha1_to_hex_r() instead of strcpy
daemon: use cld->env_array when re-spawning
stat_tracking_info: convert to argv_array
http-push: use an argv_array for setup_revisions
fetch-pack: use argv_array for index-pack / unpack-objects
...
By default, libcurl will follow circular http redirects
forever. Let's put a cap on this so that somebody who can
trigger an automated fetch of an arbitrary repository (e.g.,
for CI) cannot convince git to loop infinitely.
The value chosen is 20, which is the same default that
Firefox uses.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, libcurl would follow redirection to any protocol
it was compiled for support with. This is desirable to allow
redirection from HTTP to HTTPS. However, it would even
successfully allow redirection from HTTP to SFTP, a protocol
that git does not otherwise support at all. Furthermore
git's new protocol-whitelisting could be bypassed by
following a redirect within the remote helper, as it was
only enforced at transport selection time.
This patch limits redirects within libcurl to HTTP, HTTPS,
FTP and FTPS. If there is a protocol-whitelist present, this
list is limited to those also allowed by the whitelist. As
redirection happens from within libcurl, it is impossible
for an HTTP redirect to a protocol implemented within
another remote helper.
When the curl version git was compiled with is too old to
support restrictions on protocol redirection, we warn the
user if GIT_ALLOW_PROTOCOL restrictions were requested. This
is a little inaccurate, as even without that variable in the
environment, we would still restrict SFTP, etc, and we do
not warn in that case. But anything else means we would
literally warn every time git accesses an http remote.
This commit includes a test, but it is not as robust as we
would hope. It redirects an http request to ftp, and checks
that curl complained about the protocol, which means that we
are relying on curl's specific error message to know what
happened. Ideally we would redirect to a working ftp server
and confirm that we can clone without protocol restrictions,
and not with them. But we do not have a portable way of
providing an ftp server, nor any other protocol that curl
supports (https is the closest, but we would have to deal
with certificates).
[jk: added test and version warning]
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we want to convert "foo.pack" to "foo.idx", we do it by
duplicating the original string and then munging the bytes
in place. Let's use strip_suffix and xstrfmt instead, which
has several advantages:
1. It's more clear what the intent is.
2. It does not implicitly rely on the fact that
strlen(".idx") <= strlen(".pack") to avoid an overflow.
3. We communicate the assumption that the input file ends
with ".pack" (and get a run-time check that this is so).
4. We drop calls to strcpy, which makes auditing the code
base easier.
Likewise, we can do this to convert ".pack" to ".bitmap",
avoiding some manual memory computation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We sometimes sprintf into fixed-size buffers when we know
that the buffer is large enough to fit the input (either
because it's a constant, or because it's numeric input that
is bounded in size). Likewise with strcpy of constant
strings.
However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new configuration variable http.sslVersion can be used to specify
what specific version of SSL/TLS to use to make a connection.
* ep/http-configure-ssl-version:
http: add support for specifying the SSL version
Teach git about a new option, "http.sslVersion", which permits one
to specify the SSL version to use when negotiating SSL connections.
The setting can be overridden by the GIT_SSL_VERSION environment
variable.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 5a688fe4 ("core.sharedrepository = 0mode" should set, not
loosen, 2009-03-25), we kept reminding ourselves:
NEEDSWORK: this should be renamed to finalize_temp_file() as
"moving" is only a part of what it does, when no patch between
master to pu changes the call sites of this function.
without doing anything about it. Let's do so.
The purpose of this function was not to move but to finalize. The
detail of the primarily implementation of finalizing was to link the
temporary file to its final name and then to unlink, which wasn't
even "moving". The alternative implementation did "move" by calling
rename(2), which is a fun tangent.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to ask libCURL to use the most secure authentication method
available when talking to an HTTP proxy only when we were told to
talk to one via configuration variables. We now ask libCURL to
always use the most secure authentication method, because the user
can tell libCURL to use an HTTP proxy via an environment variable
without using configuration variables.
* et/http-proxyauth:
http: always use any proxy auth method available
We set CURLOPT_PROXYAUTH to use the most secure authentication
method available only when the user has set configuration variables
to specify a proxy. However, libcurl also supports specifying a
proxy through environment variables. In that case libcurl defaults
to only using the Basic proxy authentication method, because we do
not use CURLOPT_PROXYAUTH.
Set CURLOPT_PROXYAUTH to always use the most secure authentication
method available, even when there is no git configuration telling us
to use a proxy. This allows the user to use environment variables to
configure a proxy that requires an authentication method different
from Basic.
Signed-off-by: Enrique A. Tobis <etobis@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce http.<url>.SSLCipherList configuration variable to tweak
the list of cipher suite to be used with libcURL when talking with
https:// sites.
* ls/http-ssl-cipher-list:
http: add support for specifying an SSL cipher list
Teach git about a new option, "http.sslCipherList", which permits one to
specify a list of ciphers to use when negotiating SSL connections. The
setting can be overwridden by the GIT_SSL_CIPHER_LIST environment
variable.
Signed-off-by: Lars Kellogg-Stedman <lars@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The cleanup function is used in 4 places now and it's always safe to
free up the memory as well.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Calling setlocale(LC_MESSAGES, ...) directly from http.c, without
including <locale.h>, was causing compilation warnings. Move the
helper function to gettext.c that already includes the header and
where locale-related issues are handled.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We did not check the curl library version before using
CURLOPT_PROXYAUTH feature that may not exist.
* tc/missing-http-proxyauth:
http: support curl < 7.10.7
A broken pack .idx file in the receiving repository prevented the
dumb http transport from fetching a good copy of it from the other
side.
* jk/dumb-http-idx-fetch-fix:
dumb-http: do not pass NULL path to parse_pack_index
Using environment variable LANGUAGE and friends on the client side,
HTTP-based transports now send Accept-Language when making requests.
* ye/http-accept-language:
http: add Accept-Language header if possible
A broken pack .idx file in the receiving repository prevented the
dumb http transport from fetching a good copy of it from the other
side.
* jk/dumb-http-idx-fetch-fix:
dumb-http: do not pass NULL path to parse_pack_index
Mark file-local symbols as "static", and drop functions that nobody
uses.
* jc/unused-symbols:
shallow.c: make check_shallow_file_for_update() static
remote.c: make clear_cas_option() static
urlmatch.c: make match_urls() static
revision.c: make save_parents() and free_saved_parents() static
line-log.c: make line_log_data_init() static
pack-bitmap.c: make pack_bitmap_filename() static
prompt.c: remove git_getpass() nobody uses
http.c: make finish_active_slot() and handle_curl_result() static
Commit dd61399 introduced support for http proxies that require
authentication but it relies on the CURL_PROXYAUTH option which was
added in curl 7.10.7.
This makes sure proxy authentication is only enabled if libcurl can
support it.
Signed-off-by: Tom G. Christensen <tgc@statsbiblioteket.dk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an Accept-Language header which indicates the user's preferred
languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
Examples:
LANGUAGE= -> ""
LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"
This gives git servers a chance to display remote error messages in
the user's preferred language.
Limit the number of languages to 1,000 because q-value must not be
smaller than 0.001, and limit the length of Accept-Language header to
4,000 bytes for some HTTP servers which cannot accept such long header.
Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time, dumb http always fetched .idx files
directly into their final location, and then checked their
validity with parse_pack_index. This was refactored in
commit 750ef42 (http-fetch: Use temporary files for
pack-*.idx until verified, 2010-04-19), which uses the
following logic:
1. If we have the idx already in place, see if it's
valid (using parse_pack_index). If so, use it.
2. Otherwise, fetch the .idx to a tempfile, check
that, and if so move it into place.
3. Either way, fetch the pack itself if necessary.
However, it got step 1 wrong. We pass a NULL path parameter
to parse_pack_index, so an existing .idx file always looks
broken. Worse, we do not treat this broken .idx as an
opportunity to re-fetch, but instead return an error,
ignoring the pack entirely. This can lead to a dumb-http
fetch failing to retrieve the necessary objects.
This doesn't come up much in practice, because it must be a
packfile that we found out about (and whose .idx we stored)
during an earlier dumb-http fetch, but whose packfile we
_didn't_ fetch. I.e., we did a partial clone of a
repository, didn't need some packfiles, and now a followup
fetch needs them.
Discovery and tests by Charles Bailey <charles@hashpling.org>.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
They used to be used directly by remote-curl.c for the smart-http
protocol. But they got wrapped by run_one_slot() in beed336 (http:
never use curl_easy_perform, 2014-02-18). Any future users are
expected to follow that route.
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apache servers using mod_auth_kerb can be configured to allow the user
to authenticate either using Negotiate (using the Kerberos ticket) or
Basic authentication (using the Kerberos password). Often, one will
want to use Negotiate authentication if it is available, but fall back
to Basic authentication if the ticket is missing or expired.
However, libcurl will try very hard to use something other than Basic
auth, even over HTTPS. If Basic and something else are offered, libcurl
will never attempt to use Basic, even if the other option fails.
Teach the HTTP client code to stop trying authentication mechanisms that
don't use a password (currently Negotiate) after the first failure,
since if they failed the first time, they will never succeed.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CodingGuidelines states that the first #include in C files should be
git-compat-util.h or another header file that includes it, such as
cache.h or builtin.h.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless there is a good reason, we should use die() rather than
fprintf/exit. We can also shorten the message to match other curl init
failures (and match our usual lowercase no-full-stop style).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most struct child_process variables are cleared using memset first after
declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead. That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/skip-prefix:
http-push: refactor parsing of remote object names
imap-send: use skip_prefix instead of using magic numbers
use skip_prefix to avoid repeated calculations
git: avoid magic number with skip_prefix
fetch-pack: refactor parsing in get_ack
fast-import: refactor parsing of spaces
stat_opt: check extra strlen call
daemon: use skip_prefix to avoid magic numbers
fast-import: use skip_prefix for parsing input
use skip_prefix to avoid repeating strings
use skip_prefix to avoid magic numbers
transport-helper: avoid reading past end-of-string
fast-import: fix read of uninitialized argv memory
apply: use skip_prefix instead of raw addition
refactor skip_prefix to return a boolean
avoid using skip_prefix as a boolean
daemon: mark some strings as const
parse_diff_color_slot: drop ofs parameter
In some cases, we use starts_with to check for a prefix, and
then use an already-calculated prefix length to advance a
pointer past the prefix. There are no magic numbers or
duplicated strings here, but we can still make the code
simpler and more obvious by using skip_prefix.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
extract_content_type() could not extract a charset parameter if the
parameter is not the first one and there is a whitespace and a following
semicolon just before the parameter. For example:
text/plain; format=fixed ;charset=utf-8
And it also could not handle correctly some other cases, such as:
text/plain; charset=utf-8; format=fixed
text/plain; some-param="a long value with ;semicolons;"; charset=utf-8
Thanks-to: Jeff King <peff@peff.net>
Signed-off-by: Yi EungJun <eungjun.yi@navercorp.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is specified by RFC 2616 as the default if no "charset"
parameter is given.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the previous commit, we now give a sanitized,
shortened version of the content-type header to any callers
who ask for it.
This patch adds back a way for them to cleanly access
specific parameters to the type. We could easily extract all
parameters and make them available via a string_list, but:
1. That complicates the interface and memory management.
2. In practice, no planned callers care about anything
except the charset.
This patch therefore goes with the simplest thing, and we
can expand or change the interface later if it becomes
necessary.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we get a content-type from curl, we get the whole
header line, including any parameters, and without any
normalization (like downcasing or whitespace) applied.
If we later try to match it with strcmp() or even
strcasecmp(), we may get false negatives.
This could cause two visible behaviors:
1. We might fail to recognize a smart-http server by its
content-type.
2. We might fail to relay text/plain error messages to
users (especially if they contain a charset parameter).
This patch teaches the http code to extract and normalize
just the type/subtype portion of the string. This is
technically passing out less information to the callers, who
can no longer see the parameters. But none of the current
callers cares, and a future patch will add back an
easier-to-use method for accessing those parameters.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mh/object-code-cleanup:
sha1_file.c: document a bunch of functions defined in the file
sha1_file_name(): declare to return a const string
find_pack_entry(): document last_found_pack
replace_object: use struct members instead of an array
Change the return value of sha1_file_name() to (const char *).
(Callers have no business mucking about here.) Change callers
accordingly, deleting a few superfluous temporary variables along the
way.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently don't reuse http connections when fetching via
the smart-http protocol. This is bad because the TCP
handshake introduces latency, and especially because SSL
connection setup may be non-trivial.
We can fix it by consistently using curl's "multi"
interface. The reason is rather complicated:
Our http code has two ways of being used: queuing many
"slots" to be fetched in parallel, or fetching a single
request in a blocking manner. The parallel code is built on
curl's "multi" interface. Most of the single-request code
uses http_request, which is built on top of the parallel
code (we just feed it one slot, and wait until it finishes).
However, one could also accomplish the single-request scheme
by avoiding curl's multi interface entirely and just using
curl_easy_perform. This is simpler, and is used by post_rpc
in the smart-http protocol.
It does work to use the same curl handle in both contexts,
as long as it is not at the same time. However, internally
curl may not share all of the cached resources between both
contexts. In particular, a connection formed using the
"multi" code will go into a reuse pool connected to the
"multi" object. Further requests using the "easy" interface
will not be able to reuse that connection.
The smart http protocol does ref discovery via http_request,
which uses the "multi" interface, and then follows up with
the "easy" interface for its rpc calls. As a result, we make
two HTTP connections rather than reusing a single one.
We could teach the ref discovery to use the "easy"
interface. But it is only once we have done this discovery
that we know whether the protocol will be smart or dumb. If
it is dumb, then our further requests, which want to fetch
objects in parallel, will not be able to reuse the same
connection.
Instead, this patch switches post_rpc to build on the
parallel interface, which means that we use it consistently
everywhere. It's a little more complicated to use, but since
we have the infrastructure already, it doesn't add any code;
we can just factor out the relevant bits from http_request.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.
* cc/starts-n-ends-with:
replace {pre,suf}fixcmp() with {starts,ends}_with()
strbuf: introduce starts_with() and ends_with()
builtin/remote: remove postfixcmp() and use suffixcmp() instead
environment: normalize use of prefixcmp() by removing " != 0"
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Issue "100 Continue" responses to help use of GSS-Negotiate
authentication scheme over HTTP transport when needed.
* bc/http-100-continue:
remote-curl: fix large pushes with GSSAPI
remote-curl: pass curl slot_results back through run_slot
http: return curl's AUTHAVAIL via slot_results
Callers of the http code may want to know which auth types
were available for the previous request. But after finishing
with the curl slot, they are not supposed to look at the
curl handle again. We already handle returning other
information via the slot_results struct; let's add a flag to
check the available auth.
Note that older versions of curl did not support this, so we
simply return 0 (something like "-1" would be worse, as the
value is a bitflag and we might accidentally set a flag).
This is sufficient for the callers planned in this series,
who only trigger some optional behavior if particular bits
are set, and can live with a fake "no bits" answer.
Signed-off-by: Jeff King <peff@peff.net>