Communication between the HTTP server and http_backend process can
lead to a dead-lock when relaying a large ref negotiation request.
Diagnose the situation better, and mitigate it by reading such a
request first into core (to a reasonable limit).
* jk/http-backend-deadlock:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
* jk/http-backend-deadlock-2.3:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
* jk/http-backend-deadlock-2.2:
http-backend: spool ref negotiation requests to buffer
t5551: factor out tag creation
http-backend: fix die recursion with custom handler
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test clean-up.
* jk/skip-http-tests-under-no-curl:
tests: skip dav http-push tests under NO_EXPAT=NoThanks
t/lib-httpd.sh: skip tests if NO_CURL is defined
One of our tests in t5551 creates a large number of tags,
and jumps through some hoops to do it efficiently. Let's
factor that out into a function so we can make other similar
tests.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we built git without curl, we can't actually test against
an http server. In fact, all of the test scripts which
include lib-httpd.sh already perform this check, with one
exception: t5540. For those scripts, this is a noop, and for
t5540, this is a bugfix (it used to fail when built with
NO_CURL, though it could go unnoticed if you had a stale
git-remote-https in your build directory).
Noticed-by: Junio C Hamano <junio@pobox.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
People often forget to chain the commands in their test together
with &&, leaving a failure from an earlier command in the test go
unnoticed. The new GIT_TEST_CHAIN_LINT mechanism allows you to
catch such a mistake more easily.
* jk/test-chain-lint: (36 commits)
t9001: drop save_confirm helper
t0020: use test_* helpers instead of hand-rolled messages
t: simplify loop exit-code status variables
t: fix some trivial cases of ignored exit codes in loops
t7701: fix ignored exit code inside loop
t3305: fix ignored exit code inside loop
t0020: fix ignored exit code inside loops
perf-lib: fix ignored exit code inside loop
t6039: fix broken && chain
t9158, t9161: fix broken &&-chain in git-svn tests
t9104: fix test for following larger parents
t4104: drop hand-rolled error reporting
t0005: fix broken &&-chains
t7004: fix embedded single-quotes
t0050: appease --chain-lint
t9001: use test_when_finished
t4117: use modern test_* helpers
t6034: use modern test_* helpers
t1301: use modern test_* helpers
t0020: use modern test_* helpers
...
Test fixes.
* jk/test-annoyances:
t5551: make EXPENSIVE test cheaper
t5541: move run_with_cmdline_limit to test-lib.sh
t: pass GIT_TRACE through Apache
t: redirect stderr GIT_TRACE to descriptor 4
t: translate SIGINT to an exit
These are tests which are missing a link in their &&-chain,
but during a setup phase. We may fail to notice failure in
commands that build the test environment, but these are
typically not expected to fail at all (but it's still good
to double-check that our test environment is what we
expect).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are tests which are missing a link in their &&-chain,
but in a way that probably does not effect the outcome of
the test. Most of these are of the form:
some_cmd >actual
test_cmp expect actual
The main point of the test is to verify the output, and a
failure in some_cmd would probably be noticed by bogus
output. But it is good for the tests to also confirm that
"some_cmd" does not die unexpectedly after producing its
output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These are tests which are missing a link in their &&-chain,
in a location which causes a significant portion of the test
to be missed (e.g., the test effectively does nothing, or
consists of a long string of actions and output comparisons,
and we throw away the exit code of at least one part of the
string).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We create 50,000 tags to check that we don't overflow the
command-line of fetch-pack. But by using run_with_cmdline_limit,
we can get the same effect with a much smaller number of
tags. This makes the test fast enough that we can drop the
EXPENSIVE prereq, which means people will actually run it.
It was not documented to do so, but this test was also the
only test of a clone-over-http that requires multiple POSTs
during the conversation. We can continue to test that by
dropping http.postbuffer to its minimum size, and checking
that we get two POSTs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When upload-pack advertises the refs (either for a normal,
non-stateless request, or for the initial contact in a
stateless one), we call for_each_ref with the send_ref
function as its callback. send_ref, in turn, calls
mark_our_ref, which checks whether the ref is hidden, and
sets OUR_REF or HIDDEN_REF on the object as appropriate. If
it is hidden, mark_our_ref also returns "1" to signal
send_ref that the ref should not be advertised.
If we are not advertising refs, (i.e., the follow-up
invocation by an http client to send its "want" lines), we
use mark_our_ref directly as a callback to for_each_ref. Its
marking does the right thing, but when it then returns "1"
to for_each_ref, the latter interprets this as an error and
stops iterating. As a result, we skip marking all of the
refs that come lexicographically after it. Any "want" lines
from the client asking for those objects will fail, as they
were not properly marked with OUR_REF.
To solve this, we introduce a wrapper callback around
mark_our_ref which always returns 0 (even if the ref is
hidden, we want to keep iterating). We also tweak the
signature of mark_our_ref to exclude unnecessary parameters
that were present only to conform to the callback interface.
This should make it less likely for somebody to accidentally
use it as a callback in the future.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'jc/test-lazy-prereq' (early part):
t3419: drop unnecessary NOT_EXPENSIVE pseudo-prerequisite
t3302: drop unnecessary NOT_EXPENSIVE pseudo-prerequisite
t3302: do not chdir around in the primary test process
t3302: coding style updates
test: turn USR_BIN_TIME into a lazy prerequisite
test: turn EXPENSIVE into a lazy prerequisite
Make clear which one is for dumb protocol, which one is for smart from
their file name.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>