git-commit-vandalism/t/t5551-http-fetch-smart.sh

439 lines
13 KiB
Bash
Raw Normal View History

test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
#!/bin/sh
test_description='test smart fetching over http via http-backend'
. ./test-lib.sh
. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd
test_expect_success 'setup repository' '
git config push.default matching &&
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
echo content >file &&
git add file &&
git commit -m one
'
test_expect_success 'create http-accessible bare repository' '
mkdir "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
(cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
git --bare init
) &&
git remote add public "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
git push public master:master
'
setup_askpass_helper
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
test_expect_success 'clone http repository' '
cat >exp <<-\EOF &&
> GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
> Accept: */*
> Accept-Encoding: ENCODINGS
> Pragma: no-cache
< HTTP/1.1 200 OK
< Pragma: no-cache
< Cache-Control: no-cache, max-age=0, must-revalidate
< Content-Type: application/x-git-upload-pack-advertisement
> POST /smart/repo.git/git-upload-pack HTTP/1.1
> Accept-Encoding: ENCODINGS
> Content-Type: application/x-git-upload-pack-request
> Accept: application/x-git-upload-pack-result
> Content-Length: xxx
< HTTP/1.1 200 OK
< Pragma: no-cache
< Cache-Control: no-cache, max-age=0, must-revalidate
< Content-Type: application/x-git-upload-pack-result
EOF
GIT_TRACE_CURL=true git clone --quiet $HTTPD_URL/smart/repo.git clone 2>err &&
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
test_cmp file clone/file &&
tr '\''\015'\'' Q <err |
sed -e "
s/Q\$//
/^[*] /d
/^== Info:/d
/^=> Send header, /d
/^=> Send header:$/d
/^<= Recv header, /d
/^<= Recv header:$/d
s/=> Send header: //
s/= Recv header://
/^<= Recv data/d
/^=> Send data/d
/^$/d
/^< $/d
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
/^[^><]/{
s/^/> /
}
/^> User-Agent: /d
/^> Host: /d
/^> POST /,$ {
/^> Accept: [*]\\/[*]/d
}
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
s/^> Content-Length: .*/> Content-Length: xxx/
/^> 00..want /d
/^> 00.*done/d
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
/^< Server: /d
/^< Expires: /d
/^< Date: /d
/^< Content-Length: /d
/^< Transfer-Encoding: /d
" >actual &&
sed -e "s/^> Accept-Encoding: .*/> Accept-Encoding: ENCODINGS/" \
actual >actual.smudged &&
test_cmp exp actual.smudged &&
grep "Accept-Encoding:.*gzip" actual >actual.gzip &&
test_line_count = 2 actual.gzip
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
'
test_expect_success 'fetch changes via http' '
echo content >>file &&
git commit -a -m two &&
git push public &&
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
(cd clone && git pull) &&
test_cmp file clone/file
'
test_expect_success 'used upload-pack service' '
cat >exp <<-\EOF &&
GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1 200
POST /smart/repo.git/git-upload-pack HTTP/1.1 200
GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1 200
POST /smart/repo.git/git-upload-pack HTTP/1.1 200
EOF
t/lib-httpd: avoid occasional failures when checking access.log The last test of 't5561-http-backend.sh', 'server request log matches test results' may fail occasionally, because the order of entries in Apache's access log doesn't match the order of requests sent in the previous tests, although all the right requests are there. I saw it fail on Travis CI five times in the span of about half a year, when the order of two subsequent requests was flipped, and could trigger the failure with a modified Git. However, I was unable to trigger it with stock Git on my machine. Three tests in 't5541-http-push-smart.sh' and 't5551-http-fetch-smart.sh' check requests in the log the same way, so they might be prone to a similar occasional failure as well. When a test sends a HTTP request, it can continue execution after 'git-http-backend' fulfilled that request, but Apache writes the corresponding access log entry only after 'git-http-backend' exited. Some time inevitably passes between fulfilling the request and writing the log entry, and, under unfavourable circumstances, enough time might pass for the subsequent request to be sent and fulfilled by a different Apache thread or process, and then Apache writes access log entries racily. This effect can be exacerbated by adding a bit of variable delay after the request is fulfilled but before 'git-http-backend' exits, e.g. like this: diff --git a/http-backend.c b/http-backend.c index f3dc218b2..bbf4c125b 100644 --- a/http-backend.c +++ b/http-backend.c @@ -709,5 +709,7 @@ int cmd_main(int argc, const char **argv) max_request_buffer); cmd->imp(&hdr, cmd_arg); + if (getpid() % 2) + sleep(1); return 0; } This delay considerably increases the chances of log entries being written out of order, and in turn makes t5561's last test fail almost every time. Alas, it doesn't seem to be enough to trigger a similar failure in t5541 and t5551. So, since we can't just rely on the order of access log entries always corresponding the order of requests, make checking the access log more deterministic by sorting (simply lexicographically) both the stripped access log entries and the expected entries before the comparison with 'test_cmp'. This way the order of log entries won't matter and occasional out-of-order entries won't trigger a test failure, but the comparison will still notice any unexpected or missing log entries. OTOH, this sorting will make it harder to identify from which test an unexpected log entry came from or which test's request went missing. Therefore, in case of an error include the comparison of the unsorted log enries in the test output as well. And since all this should be performed in four tests in three test scripts, put this into a new helper function 'check_access_log' in 't/lib-httpd.sh'. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-12 14:22:16 +02:00
check_access_log exp
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
'
test_expect_success 'follow redirects (301)' '
git clone $HTTPD_URL/smart-redir-perm/repo.git --quiet repo-p
'
test_expect_success 'follow redirects (302)' '
git clone $HTTPD_URL/smart-redir-temp/repo.git --quiet repo-t
'
remote-curl: rewrite base url from info/refs redirects For efficiency and security reasons, an earlier commit in this series taught http_get_* to re-write the base url based on redirections we saw while making a specific request. This commit wires that option into the info/refs request, meaning that a redirect from http://example.com/foo.git/info/refs to https://example.com/bar.git/info/refs will behave as if "https://example.com/bar.git" had been provided to git in the first place. The tests bear some explanation. We introduce two new hierearchies into the httpd test config: 1. Requests to /smart-redir-limited will work only for the initial info/refs request, but not any subsequent requests. As a result, we can confirm whether the client is re-rooting its requests after the initial contact, since otherwise it will fail (it will ask for "repo.git/git-upload-pack", which is not redirected). 2. Requests to smart-redir-auth will redirect, and require auth after the redirection. Since we are using the redirected base for further requests, we also update the credential struct, in order not to mislead the user (or credential helpers) about which credential is needed. We can therefore check the GIT_ASKPASS prompts to make sure we are prompting for the new location. Because we have neither multiple servers nor https support in our test setup, we can only redirect between paths, meaning we need to turn on credential.useHttpPath to see the difference. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-28 10:35:35 +02:00
test_expect_success 'redirects re-root further requests' '
git clone $HTTPD_URL/smart-redir-limited/repo.git repo-redir-limited
'
http: always update the base URL for redirects If a malicious server redirects the initial ref advertisement, it may be able to leak sha1s from other, unrelated servers that the client has access to. For example, imagine that Alice is a git user, she has access to a private repository on a server hosted by Bob, and Mallory runs a malicious server and wants to find out about Bob's private repository. Mallory asks Alice to clone an unrelated repository from her over HTTP. When Alice's client contacts Mallory's server for the initial ref advertisement, the server issues an HTTP redirect for Bob's server. Alice contacts Bob's server and gets the ref advertisement for the private repository. If there is anything to fetch, she then follows up by asking the server for one or more sha1 objects. But who is the server? If it is still Mallory's server, then Alice will leak the existence of those sha1s to her. Since commit c93c92f30 (http: update base URLs when we see redirects, 2013-09-28), the client usually rewrites the base URL such that all further requests will go to Bob's server. But this is done by textually matching the URL. If we were originally looking for "http://mallory/repo.git/info/refs", and we got pointed at "http://bob/other.git/info/refs", then we know that the right root is "http://bob/other.git". If the redirect appears to change more than just the root, we punt and continue to use the original server. E.g., imagine the redirect adds a URL component that Bob's server will ignore, like "http://bob/other.git/info/refs?dummy=1". We can solve this by aborting in this case rather than silently continuing to use Mallory's server. In addition to protecting from sha1 leakage, it's arguably safer and more sane to refuse a confusing redirect like that in general. For example, part of the motivation in c93c92f30 is avoiding accidentally sending credentials over clear http, just to get a response that says "try again over https". So even in a non-malicious case, we'd prefer to err on the side of caution. The downside is that it's possible this will break a legitimate but complicated server-side redirection scheme. The setup given in the newly added test does work, but it's convoluted enough that we don't need to care about it. A more plausible case would be a server which redirects a request for "info/refs?service=git-upload-pack" to just "info/refs" (because it does not do smart HTTP, and for some reason really dislikes query parameters). Right now we would transparently downgrade to dumb-http, but with this patch, we'd complain (and the user would have to set GIT_SMART_HTTP=0 to fetch). Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06 19:24:35 +01:00
test_expect_success 're-rooting dies on insane schemes' '
test_must_fail git clone $HTTPD_URL/insane-redir/repo.git insane
'
test_expect_success 'clone from password-protected repository' '
echo two >expect &&
set_askpass user@host pass@host &&
git clone --bare "$HTTPD_URL/auth/smart/repo.git" smart-auth &&
expect_askpass both user@host &&
git --git-dir=smart-auth log -1 --format=%s >actual &&
test_cmp expect actual
'
test_expect_success 'clone from auth-only-for-push repository' '
echo two >expect &&
set_askpass wrong &&
git clone --bare "$HTTPD_URL/auth-push/smart/repo.git" smart-noauth &&
expect_askpass none &&
git --git-dir=smart-noauth log -1 --format=%s >actual &&
test_cmp expect actual
'
test_expect_success 'clone from auth-only-for-objects repository' '
echo two >expect &&
set_askpass user@host pass@host &&
git clone --bare "$HTTPD_URL/auth-fetch/smart/repo.git" half-auth &&
expect_askpass both user@host &&
git --git-dir=half-auth log -1 --format=%s >actual &&
test_cmp expect actual
'
test_expect_success 'no-op half-auth fetch does not require a password' '
set_askpass wrong &&
git --git-dir=half-auth fetch &&
expect_askpass none
'
remote-curl: rewrite base url from info/refs redirects For efficiency and security reasons, an earlier commit in this series taught http_get_* to re-write the base url based on redirections we saw while making a specific request. This commit wires that option into the info/refs request, meaning that a redirect from http://example.com/foo.git/info/refs to https://example.com/bar.git/info/refs will behave as if "https://example.com/bar.git" had been provided to git in the first place. The tests bear some explanation. We introduce two new hierearchies into the httpd test config: 1. Requests to /smart-redir-limited will work only for the initial info/refs request, but not any subsequent requests. As a result, we can confirm whether the client is re-rooting its requests after the initial contact, since otherwise it will fail (it will ask for "repo.git/git-upload-pack", which is not redirected). 2. Requests to smart-redir-auth will redirect, and require auth after the redirection. Since we are using the redirected base for further requests, we also update the credential struct, in order not to mislead the user (or credential helpers) about which credential is needed. We can therefore check the GIT_ASKPASS prompts to make sure we are prompting for the new location. Because we have neither multiple servers nor https support in our test setup, we can only redirect between paths, meaning we need to turn on credential.useHttpPath to see the difference. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-28 10:35:35 +02:00
test_expect_success 'redirects send auth to new location' '
set_askpass user@host pass@host &&
remote-curl: rewrite base url from info/refs redirects For efficiency and security reasons, an earlier commit in this series taught http_get_* to re-write the base url based on redirections we saw while making a specific request. This commit wires that option into the info/refs request, meaning that a redirect from http://example.com/foo.git/info/refs to https://example.com/bar.git/info/refs will behave as if "https://example.com/bar.git" had been provided to git in the first place. The tests bear some explanation. We introduce two new hierearchies into the httpd test config: 1. Requests to /smart-redir-limited will work only for the initial info/refs request, but not any subsequent requests. As a result, we can confirm whether the client is re-rooting its requests after the initial contact, since otherwise it will fail (it will ask for "repo.git/git-upload-pack", which is not redirected). 2. Requests to smart-redir-auth will redirect, and require auth after the redirection. Since we are using the redirected base for further requests, we also update the credential struct, in order not to mislead the user (or credential helpers) about which credential is needed. We can therefore check the GIT_ASKPASS prompts to make sure we are prompting for the new location. Because we have neither multiple servers nor https support in our test setup, we can only redirect between paths, meaning we need to turn on credential.useHttpPath to see the difference. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-28 10:35:35 +02:00
git -c credential.useHttpPath=true \
clone $HTTPD_URL/smart-redir-auth/repo.git repo-redir-auth &&
expect_askpass both user@host auth/smart/repo.git
'
test_expect_success 'disable dumb http on server' '
git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
config http.getanyfile false
'
test_expect_success 'GIT_SMART_HTTP can disable smart http' '
(GIT_SMART_HTTP=0 &&
export GIT_SMART_HTTP &&
cd clone &&
test_must_fail git fetch)
'
test_expect_success 'invalid Content-Type rejected' '
test_must_fail git clone $HTTPD_URL/broken_smart/repo.git 2>actual &&
grep "not valid:" actual
'
test_expect_success 'create namespaced refs' '
test_commit namespaced &&
git push public HEAD:refs/namespaces/ns/refs/heads/master &&
git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
symbolic-ref refs/namespaces/ns/HEAD refs/namespaces/ns/refs/heads/master
'
test_expect_success 'smart clone respects namespace' '
git clone "$HTTPD_URL/smart_namespace/repo.git" ns-smart &&
echo namespaced >expect &&
git --git-dir=ns-smart/.git log -1 --format=%s >actual &&
test_cmp expect actual
'
test_expect_success 'dumb clone via http-backend respects namespace' '
git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
config http.getanyfile true &&
GIT_SMART_HTTP=0 git clone \
"$HTTPD_URL/smart_namespace/repo.git" ns-dumb &&
echo namespaced >expect &&
git --git-dir=ns-dumb/.git log -1 --format=%s >actual &&
test_cmp expect actual
'
test_expect_success 'cookies stored in http.cookiefile when http.savecookies set' '
cat >cookies.txt <<-\EOF &&
127.0.0.1 FALSE /smart_cookies/ FALSE 0 othername othervalue
EOF
sort >expect_cookies.txt <<-\EOF &&
127.0.0.1 FALSE /smart_cookies/ FALSE 0 othername othervalue
127.0.0.1 FALSE /smart_cookies/repo.git/info/ FALSE 0 name value
EOF
git config http.cookiefile cookies.txt &&
git config http.savecookies true &&
git ls-remote $HTTPD_URL/smart_cookies/repo.git master &&
tail -3 cookies.txt | sort >cookies_tail.txt &&
test_cmp expect_cookies.txt cookies_tail.txt
'
test_expect_success 'transfer.hiderefs works over smart-http' '
test_commit hidden &&
test_commit visible &&
git push public HEAD^:refs/heads/a HEAD:refs/heads/b &&
git --git-dir="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" \
config transfer.hiderefs refs/heads/a &&
git clone --bare "$HTTPD_URL/smart/repo.git" hidden.git &&
test_must_fail git -C hidden.git rev-parse --verify a &&
git -C hidden.git rev-parse --verify b
'
# create an arbitrary number of tags, numbered from tag-$1 to tag-$2
create_tags () {
rm -f marks &&
for i in $(test_seq "$1" "$2")
do
# don't use here-doc, because it requires a process
# per loop iteration
echo "commit refs/heads/too-many-refs-$1" &&
echo "mark :$i" &&
echo "committer git <git@example.com> $i +0000" &&
echo "data 0" &&
echo "M 644 inline bla.txt" &&
echo "data 4" &&
echo "bla" &&
# make every commit dangling by always
# rewinding the branch after each commit
echo "reset refs/heads/too-many-refs-$1" &&
echo "from :$1"
done | git fast-import --export-marks=marks &&
# now assign tags to all the dangling commits we created above
tag=$(perl -e "print \"bla\" x 30") &&
sed -e "s|^:\([^ ]*\) \(.*\)$|\2 refs/tags/$tag-\1|" <marks >>packed-refs
}
test_expect_success 'create 2,000 tags in the repo' '
(
cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
create_tags 1 2000
)
'
test_expect_success CMDLINE_LIMIT \
'clone the 2,000 tag repo to check OS command line overflow' '
run_with_limited_cmdline git clone $HTTPD_URL/smart/repo.git too-many-refs &&
(
cd too-many-refs &&
git for-each-ref refs/tags >actual &&
test_line_count = 2000 actual
)
'
test_expect_success 'large fetch-pack requests can be split across POSTs' '
GIT_TRACE_CURL=true git -c http.postbuffer=65536 \
clone --bare "$HTTPD_URL/smart/repo.git" split.git 2>err &&
grep "^=> Send header: POST" err >posts &&
test_line_count = 2 posts
'
remote-curl: don't hang when a server dies before any output In the event that a HTTP server closes the connection after giving a 200 but before giving any packets, we don't want to hang forever waiting for a response that will never come. Instead, we should die immediately. One case where this happens is when attempting to fetch a dangling object by its object name. In this case, the server dies before sending any data. Prior to this patch, fetch-pack would wait for data from the server, and remote-curl would wait for fetch-pack, causing a deadlock. Despite this patch, there is other possible malformed input that could cause the same deadlock (e.g. a half-finished pktline, or a pktline but no trailing flush). There are a few possible solutions to this: 1. Allowing remote-curl to tell fetch-pack about the EOF (so that fetch-pack could know that no more data is coming until it says something else). This is tricky because an out-of-band signal would be required, or the http response would have to be re-framed inside another layer of pkt-line or something. 2. Make remote-curl understand some of the protocol. It turns out that in addition to understanding pkt-line, it would need to watch for ack/nak. This is somewhat fragile, as information about the protocol would end up in two places. Also, pkt-lines which are already at the length limit would need special handling. Both of these solutions would require a fair amount of work, whereas this hack is easy and solves at least some of the problem. Still to do: it would be good to give a better error message than "fatal: The remote end hung up unexpectedly". Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-18 21:30:49 +01:00
test_expect_success 'test allowreachablesha1inwant' '
test_when_finished "rm -rf test_reachable.git" &&
server="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
master_sha=$(git -C "$server" rev-parse refs/heads/master) &&
git -C "$server" config uploadpack.allowreachablesha1inwant 1 &&
git init --bare test_reachable.git &&
git -C test_reachable.git remote add origin "$HTTPD_URL/smart/repo.git" &&
git -C test_reachable.git fetch origin "$master_sha"
'
test_expect_success 'test allowreachablesha1inwant with unreachable' '
test_when_finished "rm -rf test_reachable.git; git reset --hard $(git rev-parse HEAD)" &&
#create unreachable sha
echo content >file2 &&
git add file2 &&
git commit -m two &&
git push public HEAD:refs/heads/doomed &&
git push public :refs/heads/doomed &&
server="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
master_sha=$(git -C "$server" rev-parse refs/heads/master) &&
git -C "$server" config uploadpack.allowreachablesha1inwant 1 &&
git init --bare test_reachable.git &&
git -C test_reachable.git remote add origin "$HTTPD_URL/smart/repo.git" &&
test_must_fail git -C test_reachable.git fetch origin "$(git rev-parse HEAD)"
'
test_expect_success 'test allowanysha1inwant with unreachable' '
test_when_finished "rm -rf test_reachable.git; git reset --hard $(git rev-parse HEAD)" &&
#create unreachable sha
echo content >file2 &&
git add file2 &&
git commit -m two &&
git push public HEAD:refs/heads/doomed &&
git push public :refs/heads/doomed &&
server="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
master_sha=$(git -C "$server" rev-parse refs/heads/master) &&
git -C "$server" config uploadpack.allowreachablesha1inwant 1 &&
git init --bare test_reachable.git &&
git -C test_reachable.git remote add origin "$HTTPD_URL/smart/repo.git" &&
test_must_fail git -C test_reachable.git fetch origin "$(git rev-parse HEAD)" &&
git -C "$server" config uploadpack.allowanysha1inwant 1 &&
git -C test_reachable.git fetch origin "$(git rev-parse HEAD)"
'
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 09:37:09 +02:00
test_expect_success EXPENSIVE 'http can handle enormous ref negotiation' '
(
cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
create_tags 2001 50000
) &&
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 09:37:09 +02:00
git -C too-many-refs fetch -q --tags &&
(
cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
create_tags 50001 100000
) &&
git -C too-many-refs fetch -q --tags &&
git -C too-many-refs for-each-ref refs/tags >tags &&
test_line_count = 100000 tags
'
test_expect_success 'custom http headers' '
test_must_fail git -c http.extraheader="x-magic-two: cadabra" \
fetch "$HTTPD_URL/smart_headers/repo.git" &&
git -c http.extraheader="x-magic-one: abra" \
-c http.extraheader="x-magic-two: cadabra" \
fetch "$HTTPD_URL/smart_headers/repo.git" &&
git update-index --add --cacheinfo 160000,$(git rev-parse HEAD),sub &&
git config -f .gitmodules submodule.sub.path sub &&
git config -f .gitmodules submodule.sub.url \
"$HTTPD_URL/smart_headers/repo.git" &&
git submodule init sub &&
test_must_fail git submodule update sub &&
git -c http.extraheader="x-magic-one: abra" \
-c http.extraheader="x-magic-two: cadabra" \
submodule update sub
'
fetch-pack: unify ref in and out param When a user fetches: - at least one up-to-date ref and at least one non-up-to-date ref, - using HTTP with protocol v0 (or something else that uses the fetch command of a remote helper) some refs might not be updated after the fetch. This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow info in output parameter", 2018-06-28) which allowed transports to report the refs that they have fetched in a new out-parameter "fetched_refs". If they do so, transport_fetch_refs() makes this information available to its caller. Users of "fetched_refs" rely on the following 3 properties: (1) it is the complete list of refs that was passed to transport_fetch_refs(), (2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if relevant), and (3) it has updated OIDs if ref-in-want was used (introduced after 989b8c4452). In an effort to satisfy (1), whenever transport_fetch_refs() filters the refs sent to the transport, it re-adds the filtered refs to whatever the transport supplies before returning it to the user. However, the implementation in 989b8c4452 unconditionally re-adds the filtered refs without checking if the transport refrained from reporting anything in "fetched_refs" (which it is allowed to do), resulting in an incomplete list, no longer satisfying (1). An earlier effort to resolve this [1] solved the issue by readding the filtered refs only if the transport did not refrain from reporting in "fetched_refs", but after further discussion, it seems that the better solution is to revert the API change that introduced "fetched_refs". This API change was first suggested as part of a ref-in-want implementation that allowed for ref patterns and, thus, there could be drastic differences between the input refs and the refs actually fetched [2]; we eventually decided to only allow exact ref names, but this API change remained even though its necessity was decreased. Therefore, revert this API change by reverting commit 989b8c4452, and make receive_wanted_refs() update the OIDs in the sought array (like how update_shallow() updates shallow information in the sought array) instead. A test is also included to show that the user-visible bug discussed at the beginning of this commit message no longer exists. [1] https://public-inbox.org/git/20180801171806.GA122458@google.com/ [2] https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathantanmy@google.com/ Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-01 22:13:20 +02:00
test_expect_success 'using fetch command in remote-curl updates refs' '
SERVER="$HTTPD_DOCUMENT_ROOT_PATH/twobranch" &&
rm -rf "$SERVER" client &&
git init "$SERVER" &&
test_commit -C "$SERVER" foo &&
git -C "$SERVER" update-ref refs/heads/anotherbranch foo &&
git clone $HTTPD_URL/smart/twobranch client &&
test_commit -C "$SERVER" bar &&
git -C client -c protocol.version=0 fetch &&
git -C "$SERVER" rev-parse master >expect &&
git -C client rev-parse origin/master >actual &&
test_cmp expect actual
'
test_expect_success 'fetch by SHA-1 without tag following' '
SERVER="$HTTPD_DOCUMENT_ROOT_PATH/server" &&
rm -rf "$SERVER" client &&
git init "$SERVER" &&
test_commit -C "$SERVER" foo &&
git clone $HTTPD_URL/smart/server client &&
test_commit -C "$SERVER" bar &&
git -C "$SERVER" rev-parse bar >bar_hash &&
git -C client -c protocol.version=0 fetch \
--no-tags origin $(cat bar_hash)
'
test_expect_success 'GIT_REDACT_COOKIES redacts cookies' '
rm -rf clone &&
echo "Set-Cookie: Foo=1" >cookies &&
echo "Set-Cookie: Bar=2" >>cookies &&
GIT_TRACE_CURL=true GIT_REDACT_COOKIES=Bar,Baz \
git -c "http.cookieFile=$(pwd)/cookies" clone \
$HTTPD_URL/smart/repo.git clone 2>err &&
grep "Cookie:.*Foo=1" err &&
grep "Cookie:.*Bar=<redacted>" err &&
! grep "Cookie:.*Bar=2" err
'
test_expect_success 'GIT_REDACT_COOKIES handles empty values' '
rm -rf clone &&
echo "Set-Cookie: Foo=" >cookies &&
GIT_TRACE_CURL=true GIT_REDACT_COOKIES=Foo \
git -c "http.cookieFile=$(pwd)/cookies" clone \
$HTTPD_URL/smart/repo.git clone 2>err &&
grep "Cookie:.*Foo=<redacted>" err
'
test_expect_success 'GIT_TRACE_CURL_NO_DATA prevents data from being traced' '
rm -rf clone &&
GIT_TRACE_CURL=true \
git clone $HTTPD_URL/smart/repo.git clone 2>err &&
grep "=> Send data" err &&
rm -rf clone &&
GIT_TRACE_CURL=true GIT_TRACE_CURL_NO_DATA=1 \
git clone $HTTPD_URL/smart/repo.git clone 2>err &&
! grep "=> Send data" err
'
test_expect_success 'server-side error detected' '
test_must_fail git clone $HTTPD_URL/error_smart/repo.git 2>actual &&
grep "server-side error" actual
'
test smart http fetch and push The top level directory "/smart/" of the test Apache server is mapped through our git-http-backend CGI, but uses the same underlying repository space as the server's document root. This is the most simple installation possible. Server logs are checked to verify the client has accessed only the smart URLs during the test. During fetch testing the headers are also logged from libcurl to ensure we are making a reasonably sane HTTP request, and getting back reasonably sane response headers from the CGI. When validating the request headers used during smart fetch we munge away the actual Content-Length and replace it with the placeholder "xxx". This avoids unnecessary varability in the test caused by an unrelated change in the requested capabilities in the first want line of the request. However, we still want to look for and verify that Content-Length was used, because smaller payloads should be using Content-Length and not "Transfer-Encoding: chunked". When validating the server response headers we must discard both Content-Length and Transfer-Encoding, as Apache2 can use either format to return our response. During development of this test I observed Apache returning both forms, depending on when the processes got CPU time. If our CGI returned the pack data quickly, Apache just buffered the whole thing and returned a Content-Length. If our CGI took just a bit too long to complete, Apache flushed its buffer and instead used "Transfer-Encoding: chunked". Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:47 +01:00
stop_httpd
test_done