Deal with a few deprecation warning from cURL library.
* jk/curl-avoid-deprecated-api:
http: support CURLOPT_PROTOCOLS_STR
http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.
Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request). The loop terminating condition mistakenly
thought that the original request has yet to be completed.
Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() clears the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed. It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.
One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns. Clear the finished member before the control leaves the
function, which has a side effect of unconfusing compilers like
recent GCC 12 that is over-eager to warn against such an assignment.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generate compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.
Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:
- we don't have to worry that silencing curl's deprecation warnings
might cause us to miss other more useful ones
- we'd eventually want to move to the new variant anyway, so this gets
us set up (albeit with some extra ugly boilerplate for the
conditional)
There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:
GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
if (...http is allowed...)
GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);
and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.
On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftp).
This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.
But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros). But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.
Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).
Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.
Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Code which split an argv_array call across multiple lines, like:
argv_array_pushl(&args, "one argument",
"another argument", "and more",
NULL);
was recently mechanically renamed to use strvec, which results in
mis-matched indentation like:
strvec_pushl(&args, "one argument",
"another argument", "and more",
NULL);
Let's fix these up to align the arguments with the opening paren. I did
this manually by sifting through the results of:
git jump grep 'strvec_.*,$'
and liberally applying my editor's auto-format. Most of the changes are
of the form shown above, though I also normalized a few that had
originally used a single-tab indentation (rather than our usual style of
aligning with the open paren). I also rewrapped a couple of obvious
cases (e.g., where previously too-long lines became short enough to fit
on one), but I wasn't aggressive about it. In cases broken to three or
more lines, the grouping of arguments is sometimes meaningful, and it
wasn't worth my time or reviewer time to ponder each case individually.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We eventually want to drop the argv_array name and just use strvec
consistently. There's no particular reason we have to do it all at once,
or care about interactions between converted and unconverted bits.
Because of our preprocessor compat layer, the names are interchangeable
to the compiler (so even a definition and declaration using different
names is OK).
This patch converts remaining files from the first half of the alphabet,
to keep the diff to a manageable size.
The conversion was done purely mechanically with:
git ls-files '*.c' '*.h' |
xargs perl -i -pe '
s/ARGV_ARRAY/STRVEC/g;
s/argv_array/strvec/g;
'
and then selectively staging files with "git add '[abcdefghjkl]*'".
We'll deal with any indentation/style fallouts separately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "fetch/clone" protocol has been updated to allow the server to
instruct the clients to grab pre-packaged packfile(s) in addition
to the packed object data coming over the wire.
* jt/cdn-offload:
upload-pack: fix a sparse '0 as NULL pointer' warning
upload-pack: send part of packfile response as uri
fetch-pack: support more than one pack lockfile
upload-pack: refactor reading of pack-objects out
Documentation: add Packfile URIs design doc
Documentation: order protocol v2 sections
http-fetch: support fetching packfiles by URL
http-fetch: refactor into function
http: refactor finish_http_pack_request()
http: use --stdin when indexing dumb HTTP pack
Teach http-fetch the ability to download packfiles directly, given a
URL, and to verify them.
The http_pack_request suite has been augmented with a function that
takes a URL directly. With this function, the hash is only used to
determine the name of the temporary file.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
finish_http_pack_request() does multiple tasks, including some
housekeeping on a struct packed_git - (1) closing its index, (2)
removing it from a list, and (3) installing it. These concerns are
independent of fetching a pack through HTTP: they are there only because
(1) the calling code opens the pack's index before deciding to fetch it,
(2) the calling code maintains a list of packfiles that can be fetched,
and (3) the calling code fetches it in order to make use of its objects
in the same process.
In preparation for a subsequent commit, which adds a feature that does
not need any of this housekeeping, remove (1), (2), and (3) from
finish_http_pack_request(). (2) and (3) are now done by a helper
function, and (1) is the responsibility of the caller (in this patch,
done closer to the point where the pack index is opened).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git fetches a pack using dumb HTTP, (among other things) it invokes
index-pack on a ".pack.temp" packfile, specifying the filename as an
argument.
A future commit will require the aforementioned invocation of index-pack
to also generate a "keep" file. To use this, we either have to use
index-pack's naming convention (because --keep requires the pack's
filename to end with ".pack") or to pass the pack through stdin. Of the
two, it is simpler to pass the pack through stdin.
Thus, teach http to pass --stdin to index-pack. As a bonus, the code is
now simpler.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In trace output (when GIT_TRACE_CURL is true), redact the values of all
HTTP cookies by default. Now that auth headers (since the implementation
of GIT_TRACE_CURL in 74c682d3c6 ("http.c: implement the GIT_TRACE_CURL
environment variable", 2016-05-24)) and cookie values (since this
commit) are redacted by default in these traces, also allow the user to
inhibit these redactions through an environment variable.
Since values of all cookies are now redacted by default,
GIT_REDACT_COOKIES (which previously allowed users to select individual
cookies to redact) now has no effect.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if
GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting
CURLOPT_VERBOSE.
This is to prevent inadvertent revelation of sensitive data. In
particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header
nor any cookies specified by GIT_REDACT_COOKIES.
Unifying the tracing mechanism also has the future benefit that any
improvements to the tracing mechanism will benefit both users of
GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to
implement any improvement twice.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We may feed a URL like "cert:///path/to/cert.pem" into the credential
machinery to get the key for a client-side certificate. That
credential has no hostname field, which is about to be disallowed (to
avoid confusion with protocols where a helper _would_ expect a
hostname).
This means as of the next patch, credential helpers won't work for
unlocking certs. Let's fix that by doing two things:
- when we parse a url with an empty host, set the host field to the
empty string (asking only to match stored entries with an empty
host) rather than NULL (asking to match _any_ host).
- when we build a cert:// credential by hand, similarly assign an
empty string
It's the latter that is more likely to impact real users in practice,
since it's what's used for http connections. But we don't have good
infrastructure to test it.
The url-parsing version will help anybody using git-credential in a
script, and is easy to test.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
A handful of options to configure SSL when talking to proxies have
been added.
* js/https-proxy-config:
http: add environment variable support for HTTPS proxies
http: add client cert support for HTTPS proxies
Add 4 environment variables that can be used to configure the proxy
cert, proxy ssl key, the proxy cert password protected flag, and the
CA info for the proxy.
Documentation for the options was also updated.
Signed-off-by: Jorge Lopez Silva <jalopezsilva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git supports performing connections to HTTPS proxies, but we don't
support doing mutual authentication with them (through TLS).
Add the necessary options to be able to send a client certificate to
the HTTPS proxy.
A client certificate can provide an alternative way of authentication
instead of using 'ProxyAuthorization' or other more common methods of
authentication. Libcurl supports this functionality already, so changes
are somewhat minimal. The feature is guarded by the first available
libcurl version that supports these options.
4 configuration options are added and documented, cert, key, cert
password protected and CA info. The CA info should be used to specify a
different CA path to validate the HTTPS proxy cert.
Signed-off-by: Jorge Lopez Silva <jalopezsilva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function for inserting a C string into a strbuf. Use it
throughout the source to get rid of magic string length constants and
explicit strlen() calls.
Like strbuf_addstr(), implement it as an inline function to avoid the
implicit strlen() calls to cause runtime overhead.
Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HTTP transport had possible allocator/deallocator mismatch, which
has been corrected.
* cb/curl-use-xmalloc:
remote-curl: unbreak http.extraHeader with custom allocators
In 93b980e58f (http: use xmalloc with cURL, 2019-08-15), we started to
ask cURL to use `xmalloc()`, and if compiled with nedmalloc, that means
implicitly a different allocator than the system one.
Which means that all of cURL's allocations and releases now _need_ to
use that allocator.
However, the `http_options()` function used `slist_append()` to add any
configured extra HTTP header(s) _before_ asking cURL to use `xmalloc()`,
and `http_cleanup()` would release them _afterwards_, i.e. in the
presence of custom allocators, cURL would attempt to use the wrong
allocator to release the memory.
A naïve attempt at fixing this would move the call to
`curl_global_init()` _before_ the config is parsed (i.e. before that
call to `slist_append()`).
However, that does not work, as we _also_ parse the config setting
`http.sslbackend` and if found, call `curl_global_sslset()` which *must*
be called before `curl_global_init()`, for details see:
https://curl.haxx.se/libcurl/c/curl_global_sslset.html
So let's instead make the config parsing entirely independent from
cURL's data structures. Incidentally, this deletes two more lines than
it introduces, which is nice.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow callers to specify exactly what characters need to be URL-encoded
and which do not. This new API will be taken advantage of in a patch
later in this set.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The fread/fwrite-like functions in http.c, namely fread_buffer,
fwrite_buffer, fwrite_null, fwrite_sha1_file all return the
multiplication of the size and number of items they are being given.
Practically speaking, it doesn't matter, because in all contexts where
those functions are used, size is 1.
But those functions being similar to fread and fwrite (the curl API is
designed around being able to use fread and fwrite directly), it might
be preferable to make them behave like fread and fwrite, which, from
the fread/fwrite manual page, is:
On success, fread() and fwrite() return the number of items read
or written. This number equals the number of bytes transferred
only when size is 1. If an error occurs, or the end of the file
is reached, the return value is a short item count (or zero).
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Conversion from unsigned char[20] to struct object_id continues.
* bc/hash-transition-16: (35 commits)
gitweb: make hash size independent
Git.pm: make hash size independent
read-cache: read data in a hash-independent way
dir: make untracked cache extension hash size independent
builtin/difftool: use parse_oid_hex
refspec: make hash size independent
archive: convert struct archiver_args to object_id
builtin/get-tar-commit-id: make hash size independent
get-tar-commit-id: parse comment record
hash: add a function to lookup hash algorithm by length
remote-curl: make hash size independent
http: replace sha1_to_hex
http: compute hash of downloaded objects using the_hash_algo
http: replace hard-coded constant with the_hash_algo
http-walker: replace sha1_to_hex
http-push: remove remaining uses of sha1_to_hex
http-backend: allow 64-character hex names
http-push: convert to use the_hash_algo
builtin/pull: make hash-size independent
builtin/am: make hash size independent
...
Code clean-up around a much-less-important-than-it-used-to-be
update_server_info() funtion.
* jk/server-info-rabbit-hole:
update_info_refs(): drop unused force parameter
server-info: drop objdirlen pointer arithmetic
server-info: drop nr_alloc struct member
server-info: use strbuf to read old info/packs file
server-info: simplify cleanup in parse_pack_def()
server-info: fix blind pointer arithmetic
http: simplify parsing of remote objects/info/packs
packfile: fix pack basename computation
midx: check both pack and index names for containment
t5319: drop useless --buffer from cat-file
t5319: fix bogus cat-file argument
pack-revindex: open index if necessary
packfile.h: drop extern from function declarations
We can use skip_prefix() and parse_oid_hex() to continuously increment
our pointer, rather than dealing with magic numbers. This also fixes a
few small shortcomings:
- if we see a line with the right prefix, suffix, and length, i.e.
matching /P pack-.{40}.pack\n/, we'll interpret the middle part as
hex without checking if it could be parsed. This could lead to us
looking at uninitialized garbage in the hash array. In practice this
means we'll just make a garbage request to the server which will
fail, though it's interesting that a malicious server could convince
us to leak 40 bytes of uninitialized stack to them.
- the current code is picky about seeing a newline at the end of file,
but we can easily be more liberal
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since sha1_to_hex is limited to SHA-1, replace it with hash_to_hex.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace a hard-coded 40 with a reference to the_hash_algo.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This member is used to represent the pack checksum of the pack in
question. Expand this member to be GIT_MAX_RAWSZ bytes in length so it
works with longer hashes and rename it to be "hash" instead of "sha1".
This transformation was made with a change to the definition and the
following semantic patch:
@@
struct packed_git *E1;
@@
- E1->sha1
+ E1->hash
@@
struct packed_git E1;
@@
- E1.sha1
+ E1.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We make some requests with CURLOPT_FAILONERROR and some without, and
then handle_curl_result() normalizes any failures to a uniform CURLcode.
There are some other code paths in the dumb-http walker which don't use
handle_curl_result(); let's pull the normalization into its own function
so it can be reused.
Arguably those code paths would benefit from the rest of
handle_curl_result(), notably the auth handling. But retro-fitting it
now would be a lot of work, and in practice it doesn't matter too much
(whatever authentication we needed to make the initial contact with the
server is generally sufficient for the rest of the dumb-http requests).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Debugging help for http transport.
* ms/http-no-more-failonerror:
test: test GIT_CURL_VERBOSE=1 shows an error
remote-curl: unset CURLOPT_FAILONERROR
remote-curl: define struct for CURLOPT_WRITEFUNCTION
http: enable keep_error for HTTP requests
http: support file handles for HTTP_KEEP_ERROR
curl stops parsing a response when it sees a bad HTTP status code and it
has CURLOPT_FAILONERROR set. This prevents GIT_CURL_VERBOSE to show HTTP
headers on error.
keep_error is an option to receive the HTTP response body for those
error responses. By enabling this option, curl will process the HTTP
response headers, and they're shown if GIT_CURL_VERBOSE is set.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
HTTP_KEEP_ERROR makes it easy to debug HTTP transport errors. In order
to make HTTP_KEEP_ERROR enabled for all requests, file handles need to
be supported.
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The loose object access code in sha1-file.c is some of the oldest in
Git, and could use some modernizing. It mostly uses "unsigned char *"
for object ids, which these days should be "struct object_id".
It also uses the term "sha1_file" in many functions, which is confusing.
The term "loose_objects" is much better. It clearly distinguishes
them from packed objects (which didn't even exist back when the name
"sha1_file" came into being). And it also distinguishes it from the
checksummed-file concept in csum-file.c (which until recently was
actually called "struct sha1file"!).
This patch converts the functions {open,close,map,stat}_sha1_file() into
open_loose_object(), etc, and switches their sha1 arguments for
object_id structs. Similarly, path functions like fill_sha1_path()
become fill_loose_path() and use object_ids.
The function sha1_loose_object_info() already says "loose", so we can
just drop the "sha1" (and teach it to use object_id).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The dumb-http walker code still passes around and stores object ids as
"unsigned char *sha1". Let's modernize it.
There's probably still more work to be done to handle dumb-http fetches
with a new, larger hash. But that can wait; this is enough that we can
now convert some of the low-level object routines that we call into from
here (and in fact, some of the "oid.hash" references added here will be
further improved in the next patch).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up with optimization for the codepath that checks
(non-)existence of loose objects.
* jk/loose-object-cache:
odb_load_loose_cache: fix strbuf leak
fetch-pack: drop custom loose object cache
sha1-file: use loose object cache for quick existence check
object-store: provide helpers for loose_objects_cache
sha1-file: use an object_directory for the main object dir
handle alternates paths the same as the main object dir
sha1_file_name(): overwrite buffer instead of appending
rename "alternate_object_database" to "object_directory"
submodule--helper: prefer strip_suffix() to ends_with()
fsck: do not reuse child_process structs
The "http.version" configuration variable can be used with recent
enough cURL library to force the version of HTTP used to talk when
fetching and pushing.
* fc/http-version:
http: add support selecting http version
Translating the new strings introduced for v2.20 showed some typos.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sha1_file_name() function is used to generate the path to a loose
object in the object directory. It doesn't make much sense for it to
append, since the the path we write may be absolute (i.e., you cannot
reliably build up a path with it). Because many callers use it with a
static buffer, they have to strbuf_reset() manually before each call
(and the other callers always use an empty buffer, so they don't care
either way). Let's handle this automatically.
Since we're changing the semantics, let's take the opportunity to give
it a more hash-neutral name (which will also catch any callers from
topics in flight).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>