git-commit-vandalism/upload-pack.c

1060 lines
26 KiB
C
Raw Normal View History

#include "cache.h"
#include "refs.h"
#include "pkt-line.h"
#include "sideband.h"
#include "tag.h"
#include "object.h"
#include "commit.h"
#include "exec_cmd.h"
#include "diff.h"
#include "revision.h"
#include "list-objects.h"
#include "run-command.h"
#include "connect.h"
#include "sigchain.h"
#include "version.h"
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
#include "string-list.h"
#include "parse-options.h"
#include "argv-array.h"
#include "prio-queue.h"
static const char * const upload_pack_usage[] = {
N_("git upload-pack [<options>] <dir>"),
NULL
};
/* Remember to update object flag allocation in object.h */
#define THEY_HAVE (1u << 11)
#define OUR_REF (1u << 12)
#define WANTED (1u << 13)
#define COMMON_KNOWN (1u << 14)
#define REACHABLE (1u << 15)
#define SHALLOW (1u << 16)
#define NOT_SHALLOW (1u << 17)
#define CLIENT_SHALLOW (1u << 18)
#define HIDDEN_REF (1u << 19)
static unsigned long oldest_have;
fetch, upload-pack: --deepen=N extends shallow boundary by N commits In git-fetch, --depth argument is always relative with the latest remote refs. This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper. It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth". Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not. This patch fixes that. A new argument --deepen=<N> will add <N> more (*) parent commits to the current history regardless of where remote refs are. Have/Want negotiation is still respected. So if remote refs move, the server will send two chunks: one between "have" and "want" and another to extend shallow history. In theory, the client could send no "want"s in order to get the second chunk only. But the protocol does not allow that. Either you send no want lines, which means ls-remote; or you have to send at least one want line that carries deep-relative to the server.. The main work was done by Dongcan Jiang. I fixed it up here and there. And of course all the bugs belong to me. (*) We could even support --deepen=<N> where <N> is negative. In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result). Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-12 12:54:09 +02:00
static int deepen_relative;
static int multi_ack;
static int no_done;
static int use_thin_pack, use_ofs_delta, use_include_tag;
static int no_progress, daemon_mode;
/* Allow specifying sha1 if it is a ref tip. */
#define ALLOW_TIP_SHA1 01
/* Allow request of a sha1 if it is reachable from a ref (possibly hidden ref). */
#define ALLOW_REACHABLE_SHA1 02
static unsigned int allow_unadvertised_object_request;
static int shallow_nr;
static struct object_array have_obj;
static struct object_array want_obj;
static struct object_array extra_edge_obj;
static unsigned int timeout;
static int keepalive = 5;
/* 0 for no sideband,
* otherwise maximum packet size (up to 65520 bytes).
*/
static int use_sideband;
static int advertise_refs;
static int stateless_rpc;
upload-pack: provide a hook for running pack-objects When upload-pack serves a client request, it turns to pack-objects to do the heavy lifting of creating a packfile. There's no easy way to intercept the call to pack-objects, but there are a few good reasons to want to do so: 1. If you're debugging a client or server issue with fetching, you may want to store a copy of the generated packfile. 2. If you're gathering data from real-world fetches for performance analysis or debugging, storing a copy of the arguments and stdin lets you replay the pack generation at your leisure. 3. You may want to insert a caching layer around pack-objects; it is the most CPU- and memory-intensive part of serving a fetch, and its output is a pure function[1] of its input, making it an ideal place to consolidate identical requests. This patch adds a simple "hook" interface to intercept calls to pack-objects. The new test demonstrates how it can be used for debugging (using it for caching is a straightforward extension; the tricky part is writing the actual caching layer). This hook is unlike the normal hook scripts found in the "hooks/" directory of a repository. Because we promise that upload-pack is safe to run in an untrusted repository, we cannot execute arbitrary code or commands found in the repository (neither in hooks/, nor in the config). So instead, this hook is triggered from a config variable that is explicitly ignored in the per-repo config. The config variable holds the actual shell command to run as the hook. Another approach would be to simply treat it as a boolean: "should I respect the upload-pack hooks in this repo?", and then run the script from "hooks/" as we usually do. However, that isn't as flexible; there's no way to run a hook approved by the site administrator (e.g., in "/etc/gitconfig") on a repository whose contents are not trusted. The approach taken by this patch is more fine-grained, if a little less conventional for git hooks (it does behave similar to other configured commands like diff.external, etc). [1] Pack-objects isn't _actually_ a pure function. Its output depends on the exact packing of the object database, and if multi-threading is used for delta compression, can even differ racily. But for the purposes of caching, that's OK; of the many possible outputs for a given input, it is sufficient only that we output one of them. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-19 00:45:37 +02:00
static const char *pack_objects_hook;
static void reset_timeout(void)
{
alarm(timeout);
}
static void send_client_data(int fd, const char *data, ssize_t sz)
{
if (use_sideband) {
send_sideband(1, fd, data, sz, use_sideband);
return;
}
if (fd == 3)
/* emergency quit */
fd = 2;
if (fd == 2) {
/* XXX: are we happy to lose stuff here? */
xwrite(fd, data, sz);
return;
}
write_or_die(fd, data, sz);
}
static int write_one_shallow(const struct commit_graft *graft, void *cb_data)
{
FILE *fp = cb_data;
if (graft->nr_parent == -1)
fprintf(fp, "--shallow %s\n", oid_to_hex(&graft->oid));
return 0;
}
static void create_pack_file(void)
{
struct child_process pack_objects = CHILD_PROCESS_INIT;
char data[8193], progress[128];
char abort_msg[] = "aborting due to possible repository "
"corruption on the remote side.";
int buffered = -1;
ssize_t sz;
int i;
FILE *pipe_fd;
upload-pack: provide a hook for running pack-objects When upload-pack serves a client request, it turns to pack-objects to do the heavy lifting of creating a packfile. There's no easy way to intercept the call to pack-objects, but there are a few good reasons to want to do so: 1. If you're debugging a client or server issue with fetching, you may want to store a copy of the generated packfile. 2. If you're gathering data from real-world fetches for performance analysis or debugging, storing a copy of the arguments and stdin lets you replay the pack generation at your leisure. 3. You may want to insert a caching layer around pack-objects; it is the most CPU- and memory-intensive part of serving a fetch, and its output is a pure function[1] of its input, making it an ideal place to consolidate identical requests. This patch adds a simple "hook" interface to intercept calls to pack-objects. The new test demonstrates how it can be used for debugging (using it for caching is a straightforward extension; the tricky part is writing the actual caching layer). This hook is unlike the normal hook scripts found in the "hooks/" directory of a repository. Because we promise that upload-pack is safe to run in an untrusted repository, we cannot execute arbitrary code or commands found in the repository (neither in hooks/, nor in the config). So instead, this hook is triggered from a config variable that is explicitly ignored in the per-repo config. The config variable holds the actual shell command to run as the hook. Another approach would be to simply treat it as a boolean: "should I respect the upload-pack hooks in this repo?", and then run the script from "hooks/" as we usually do. However, that isn't as flexible; there's no way to run a hook approved by the site administrator (e.g., in "/etc/gitconfig") on a repository whose contents are not trusted. The approach taken by this patch is more fine-grained, if a little less conventional for git hooks (it does behave similar to other configured commands like diff.external, etc). [1] Pack-objects isn't _actually_ a pure function. Its output depends on the exact packing of the object database, and if multi-threading is used for delta compression, can even differ racily. But for the purposes of caching, that's OK; of the many possible outputs for a given input, it is sufficient only that we output one of them. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-19 00:45:37 +02:00
if (!pack_objects_hook)
pack_objects.git_cmd = 1;
else {
argv_array_push(&pack_objects.args, pack_objects_hook);
argv_array_push(&pack_objects.args, "git");
pack_objects.use_shell = 1;
}
if (shallow_nr) {
argv_array_push(&pack_objects.args, "--shallow-file");
argv_array_push(&pack_objects.args, "");
}
argv_array_push(&pack_objects.args, "pack-objects");
argv_array_push(&pack_objects.args, "--revs");
if (use_thin_pack)
argv_array_push(&pack_objects.args, "--thin");
argv_array_push(&pack_objects.args, "--stdout");
if (shallow_nr)
argv_array_push(&pack_objects.args, "--shallow");
if (!no_progress)
argv_array_push(&pack_objects.args, "--progress");
if (use_ofs_delta)
argv_array_push(&pack_objects.args, "--delta-base-offset");
if (use_include_tag)
argv_array_push(&pack_objects.args, "--include-tag");
upload-pack: start pack-objects before async rev-list In a pthread-enabled version of upload-pack, there's a race condition that can cause a deadlock on the fflush(NULL) we call from run-command. What happens is this: 1. Upload-pack is informed we are doing a shallow clone. 2. We call start_async() to spawn a thread that will generate rev-list results to feed to pack-objects. It gets a file descriptor to a pipe which will eventually hook to pack-objects. 3. The rev-list thread uses fdopen to create a new output stream around the fd we gave it, called pack_pipe. 4. The thread writes results to pack_pipe. Outside of our control, libc is doing locking on the stream. We keep writing until the OS pipe buffer is full, and then we block in write(), still holding the lock. 5. The main thread now uses start_command to spawn pack-objects. Before forking, it calls fflush(NULL) to flush every stdio output buffer. It blocks trying to get the lock on pack_pipe. And we have a deadlock. The thread will block until somebody starts reading from the pipe. But nobody will read from the pipe until we finish flushing to the pipe. To fix this, we swap the start order: we start the pack-objects reader first, and then the rev-list writer after. Thus the problematic fflush(NULL) happens before we even open the new file descriptor (and even if it didn't, flushing should no longer block, as the reader at the end of the pipe is now active). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-06 23:33:33 +02:00
pack_objects.in = -1;
pack_objects.out = -1;
pack_objects.err = -1;
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
if (start_command(&pack_objects))
die("git upload-pack: unable to fork git-pack-objects");
pipe_fd = xfdopen(pack_objects.in, "w");
if (shallow_nr)
for_each_commit_graft(write_one_shallow, pipe_fd);
for (i = 0; i < want_obj.nr; i++)
fprintf(pipe_fd, "%s\n",
oid_to_hex(&want_obj.objects[i].item->oid));
fprintf(pipe_fd, "--not\n");
for (i = 0; i < have_obj.nr; i++)
fprintf(pipe_fd, "%s\n",
oid_to_hex(&have_obj.objects[i].item->oid));
for (i = 0; i < extra_edge_obj.nr; i++)
fprintf(pipe_fd, "%s\n",
oid_to_hex(&extra_edge_obj.objects[i].item->oid));
fprintf(pipe_fd, "\n");
fflush(pipe_fd);
fclose(pipe_fd);
/* We read from pack_objects.err to capture stderr output for
* progress bar, and pack_objects.out to capture the pack data.
*/
while (1) {
struct pollfd pfd[2];
int pe, pu, pollsize;
int ret;
reset_timeout();
pollsize = 0;
pe = pu = -1;
if (0 <= pack_objects.out) {
pfd[pollsize].fd = pack_objects.out;
pfd[pollsize].events = POLLIN;
pu = pollsize;
pollsize++;
}
if (0 <= pack_objects.err) {
pfd[pollsize].fd = pack_objects.err;
pfd[pollsize].events = POLLIN;
pe = pollsize;
pollsize++;
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
if (!pollsize)
break;
ret = poll(pfd, pollsize,
keepalive < 0 ? -1 : 1000 * keepalive);
if (ret < 0) {
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
if (errno != EINTR) {
error_errno("poll failed, resuming");
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
sleep(1);
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
continue;
}
if (0 <= pe && (pfd[pe].revents & (POLLIN|POLLHUP))) {
/* Status ready; we ship that in the side-band
* or dump to the standard error.
*/
sz = xread(pack_objects.err, progress,
sizeof(progress));
if (0 < sz)
send_client_data(2, progress, sz);
else if (sz == 0) {
close(pack_objects.err);
pack_objects.err = -1;
}
else
goto fail;
/* give priority to status messages */
continue;
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
if (0 <= pu && (pfd[pu].revents & (POLLIN|POLLHUP))) {
/* Data ready; we keep the last byte to ourselves
* in case we detect broken rev-list, so that we
* can leave the stream corrupted. This is
* unfortunate -- unpack-objects would happily
* accept a valid packdata with trailing garbage,
* so appending garbage after we pass all the
* pack data is not good enough to signal
* breakage to downstream.
*/
char *cp = data;
ssize_t outsz = 0;
if (0 <= buffered) {
*cp++ = buffered;
outsz++;
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
sz = xread(pack_objects.out, cp,
sizeof(data) - outsz);
if (0 < sz)
;
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
else if (sz == 0) {
close(pack_objects.out);
pack_objects.out = -1;
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
else
goto fail;
sz += outsz;
if (1 < sz) {
buffered = data[sz-1] & 0xFF;
sz--;
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
else
buffered = -1;
send_client_data(1, data, sz);
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
}
/*
* We hit the keepalive timeout without saying anything; send
* an empty message on the data sideband just to let the other
* side know we're still working on it, but don't have any data
* yet.
*
* If we don't have a sideband channel, there's no room in the
* protocol to say anything, so those clients are just out of
* luck.
*/
if (!ret && use_sideband) {
static const char buf[] = "0005\1";
write_or_die(1, buf, 5);
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
if (finish_command(&pack_objects)) {
error("git upload-pack: git-pack-objects died with error.");
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
goto fail;
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
/* flush the data */
if (0 <= buffered) {
data[0] = buffered;
send_client_data(1, data, 1);
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
fprintf(stderr, "flushed.\n");
}
upload-pack: Use finish_{command,async}() instead of waitpid(). upload-pack spawns two processes, rev-list and pack-objects, and carefully monitors their status so that it can report failure to the remote end. This change removes the complicated procedures on the grounds of the following observations: - If everything is OK, rev-list closes its output pipe end, upon which pack-objects (which reads from the pipe) sees EOF and terminates itself, closing its output (and error) pipes. upload-pack reads from both until it sees EOF in both. It collects the exit codes of the child processes (which indicate success) and terminates successfully. - If rev-list sees an error, it closes its output and terminates with failure. pack-objects sees EOF in its input and terminates successfully. Again upload-pack reads its inputs until EOF. When it now collects the exit codes of its child processes, it notices the failure of rev-list and signals failure to the remote end. - If pack-objects sees an error, it terminates with failure. Since this breaks the pipe to rev-list, rev-list is killed with SIGPIPE. upload-pack reads its input until EOF, then collects the exit codes of the child processes, notices their failures, and signals failure to the remote end. - If upload-pack itself dies unexpectedly, pack-objects is killed with SIGPIPE, and subsequently also rev-list. The upshot of this is that precise monitoring of child processes is not required because both terminate if either one of them dies unexpectedly. This allows us to use finish_command() and finish_async() instead of an explicit waitpid(2) call. The change is smaller than it looks because most of it only reduces the indentation of a large part of the inner loop. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-04 20:46:48 +01:00
if (use_sideband)
packet_flush(1);
return;
fail:
send_client_data(3, abort_msg, sizeof(abort_msg));
die("git upload-pack: %s", abort_msg);
}
static int got_sha1(const char *hex, unsigned char *sha1)
{
struct object *o;
int we_knew_they_have = 0;
if (get_sha1_hex(hex, sha1))
die("git upload-pack: expected SHA1 object, got '%s'", hex);
if (!has_sha1_file(sha1))
return -1;
o = parse_object(sha1);
if (!o)
die("oops (%s)", sha1_to_hex(sha1));
2006-08-13 07:16:51 +02:00
if (o->type == OBJ_COMMIT) {
struct commit_list *parents;
struct commit *commit = (struct commit *)o;
if (o->flags & THEY_HAVE)
we_knew_they_have = 1;
else
o->flags |= THEY_HAVE;
if (!oldest_have || (commit->date < oldest_have))
oldest_have = commit->date;
for (parents = commit->parents;
parents;
parents = parents->next)
parents->item->object.flags |= THEY_HAVE;
}
if (!we_knew_they_have) {
add_object_array(o, NULL, &have_obj);
return 1;
}
return 0;
}
static int reachable(struct commit *want)
{
struct prio_queue work = { compare_commits_by_commit_date };
prio_queue_put(&work, want);
while (work.nr) {
struct commit_list *list;
struct commit *commit = prio_queue_get(&work);
if (commit->object.flags & THEY_HAVE) {
want->object.flags |= COMMON_KNOWN;
break;
}
if (!commit->object.parsed)
parse_object(commit->object.oid.hash);
if (commit->object.flags & REACHABLE)
continue;
commit->object.flags |= REACHABLE;
if (commit->date < oldest_have)
continue;
for (list = commit->parents; list; list = list->next) {
struct commit *parent = list->item;
if (!(parent->object.flags & REACHABLE))
prio_queue_put(&work, parent);
}
}
want->object.flags |= REACHABLE;
clear_commit_marks(want, REACHABLE);
clear_prio_queue(&work);
return (want->object.flags & COMMON_KNOWN);
}
static int ok_to_give_up(void)
{
int i;
if (!have_obj.nr)
return 0;
for (i = 0; i < want_obj.nr; i++) {
struct object *want = want_obj.objects[i].item;
if (want->flags & COMMON_KNOWN)
continue;
want = deref_tag(want, "a want line", 0);
if (!want || want->type != OBJ_COMMIT) {
/* no way to tell if this is reachable by
* looking at the ancestry chain alone, so
* leave a note to ourselves not to worry about
* this object anymore.
*/
want_obj.objects[i].item->flags |= COMMON_KNOWN;
continue;
}
if (!reachable((struct commit *)want))
return 0;
}
return 1;
}
static int get_common_commits(void)
{
unsigned char sha1[20];
Add multi_ack_detailed capability to fetch-pack/upload-pack When multi_ack_detailed is enabled the ACK continue messages returned by the remote upload-pack are broken out to describe the different states within the peer. This permits the client to better understand the server's in-memory state. The fetch-pack/upload-pack protocol now looks like: NAK --------------------------------- Always sent in response to "done" if there was no common base selected from the "have" lines (or no have lines were sent). * no multi_ack or multi_ack_detailed: Sent when the client has sent a pkt-line flush ("0000") and the server has not yet found a common base object. * either multi_ack or multi_ack_detailed: Always sent in response to a pkt-line flush. ACK %s ----------------------------------- * no multi_ack or multi_ack_detailed: Sent in response to "have" when the object exists on the remote side and is therefore an object in common between the peers. The argument is the SHA-1 of the common object. * either multi_ack or multi_ack_detailed: Sent in response to "done" if there are common objects. The argument is the last SHA-1 determined to be common. ACK %s continue ----------------------------------- * multi_ack only: Sent in response to "have". The remote side wants the client to consider this object as common, and immediately stop transmitting additional "have" lines for objects that are reachable from it. The reason the client should stop is not given, but is one of the two cases below available under multi_ack_detailed. ACK %s common ----------------------------------- * multi_ack_detailed only: Sent in response to "have". Both sides have this object. Like with "ACK %s continue" above the client should stop sending have lines reachable for objects from the argument. ACK %s ready ----------------------------------- * multi_ack_detailed only: Sent in response to "have". The client should stop transmitting objects which are reachable from the argument, and send "done" soon to get the objects. If the remote side has the specified object, it should first send an "ACK %s common" message prior to sending "ACK %s ready". Clients may still submit additional "have" lines if there are more side branches for the client to explore that might be added to the common set and reduce the number of objects to transfer. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:25 +01:00
char last_hex[41];
int got_common = 0;
int got_other = 0;
int sent_ready = 0;
save_commit_buffer = 0;
for (;;) {
pkt-line: provide a LARGE_PACKET_MAX static buffer Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 21:02:57 +01:00
char *line = packet_read_line(0, NULL);
const char *arg;
reset_timeout();
pkt-line: provide a LARGE_PACKET_MAX static buffer Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 21:02:57 +01:00
if (!line) {
if (multi_ack == 2 && got_common
&& !got_other && ok_to_give_up()) {
sent_ready = 1;
packet_write(1, "ACK %s ready\n", last_hex);
}
if (have_obj.nr == 0 || multi_ack)
packet_write(1, "NAK\n");
if (no_done && sent_ready) {
packet_write(1, "ACK %s\n", last_hex);
return 0;
}
if (stateless_rpc)
exit(0);
got_common = 0;
got_other = 0;
continue;
}
if (skip_prefix(line, "have ", &arg)) {
switch (got_sha1(arg, sha1)) {
case -1: /* they have what we do not */
got_other = 1;
Add multi_ack_detailed capability to fetch-pack/upload-pack When multi_ack_detailed is enabled the ACK continue messages returned by the remote upload-pack are broken out to describe the different states within the peer. This permits the client to better understand the server's in-memory state. The fetch-pack/upload-pack protocol now looks like: NAK --------------------------------- Always sent in response to "done" if there was no common base selected from the "have" lines (or no have lines were sent). * no multi_ack or multi_ack_detailed: Sent when the client has sent a pkt-line flush ("0000") and the server has not yet found a common base object. * either multi_ack or multi_ack_detailed: Always sent in response to a pkt-line flush. ACK %s ----------------------------------- * no multi_ack or multi_ack_detailed: Sent in response to "have" when the object exists on the remote side and is therefore an object in common between the peers. The argument is the SHA-1 of the common object. * either multi_ack or multi_ack_detailed: Sent in response to "done" if there are common objects. The argument is the last SHA-1 determined to be common. ACK %s continue ----------------------------------- * multi_ack only: Sent in response to "have". The remote side wants the client to consider this object as common, and immediately stop transmitting additional "have" lines for objects that are reachable from it. The reason the client should stop is not given, but is one of the two cases below available under multi_ack_detailed. ACK %s common ----------------------------------- * multi_ack_detailed only: Sent in response to "have". Both sides have this object. Like with "ACK %s continue" above the client should stop sending have lines reachable for objects from the argument. ACK %s ready ----------------------------------- * multi_ack_detailed only: Sent in response to "have". The client should stop transmitting objects which are reachable from the argument, and send "done" soon to get the objects. If the remote side has the specified object, it should first send an "ACK %s common" message prior to sending "ACK %s ready". Clients may still submit additional "have" lines if there are more side branches for the client to explore that might be added to the common set and reduce the number of objects to transfer. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:25 +01:00
if (multi_ack && ok_to_give_up()) {
const char *hex = sha1_to_hex(sha1);
if (multi_ack == 2) {
sent_ready = 1;
Add multi_ack_detailed capability to fetch-pack/upload-pack When multi_ack_detailed is enabled the ACK continue messages returned by the remote upload-pack are broken out to describe the different states within the peer. This permits the client to better understand the server's in-memory state. The fetch-pack/upload-pack protocol now looks like: NAK --------------------------------- Always sent in response to "done" if there was no common base selected from the "have" lines (or no have lines were sent). * no multi_ack or multi_ack_detailed: Sent when the client has sent a pkt-line flush ("0000") and the server has not yet found a common base object. * either multi_ack or multi_ack_detailed: Always sent in response to a pkt-line flush. ACK %s ----------------------------------- * no multi_ack or multi_ack_detailed: Sent in response to "have" when the object exists on the remote side and is therefore an object in common between the peers. The argument is the SHA-1 of the common object. * either multi_ack or multi_ack_detailed: Sent in response to "done" if there are common objects. The argument is the last SHA-1 determined to be common. ACK %s continue ----------------------------------- * multi_ack only: Sent in response to "have". The remote side wants the client to consider this object as common, and immediately stop transmitting additional "have" lines for objects that are reachable from it. The reason the client should stop is not given, but is one of the two cases below available under multi_ack_detailed. ACK %s common ----------------------------------- * multi_ack_detailed only: Sent in response to "have". Both sides have this object. Like with "ACK %s continue" above the client should stop sending have lines reachable for objects from the argument. ACK %s ready ----------------------------------- * multi_ack_detailed only: Sent in response to "have". The client should stop transmitting objects which are reachable from the argument, and send "done" soon to get the objects. If the remote side has the specified object, it should first send an "ACK %s common" message prior to sending "ACK %s ready". Clients may still submit additional "have" lines if there are more side branches for the client to explore that might be added to the common set and reduce the number of objects to transfer. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:25 +01:00
packet_write(1, "ACK %s ready\n", hex);
} else
Add multi_ack_detailed capability to fetch-pack/upload-pack When multi_ack_detailed is enabled the ACK continue messages returned by the remote upload-pack are broken out to describe the different states within the peer. This permits the client to better understand the server's in-memory state. The fetch-pack/upload-pack protocol now looks like: NAK --------------------------------- Always sent in response to "done" if there was no common base selected from the "have" lines (or no have lines were sent). * no multi_ack or multi_ack_detailed: Sent when the client has sent a pkt-line flush ("0000") and the server has not yet found a common base object. * either multi_ack or multi_ack_detailed: Always sent in response to a pkt-line flush. ACK %s ----------------------------------- * no multi_ack or multi_ack_detailed: Sent in response to "have" when the object exists on the remote side and is therefore an object in common between the peers. The argument is the SHA-1 of the common object. * either multi_ack or multi_ack_detailed: Sent in response to "done" if there are common objects. The argument is the last SHA-1 determined to be common. ACK %s continue ----------------------------------- * multi_ack only: Sent in response to "have". The remote side wants the client to consider this object as common, and immediately stop transmitting additional "have" lines for objects that are reachable from it. The reason the client should stop is not given, but is one of the two cases below available under multi_ack_detailed. ACK %s common ----------------------------------- * multi_ack_detailed only: Sent in response to "have". Both sides have this object. Like with "ACK %s continue" above the client should stop sending have lines reachable for objects from the argument. ACK %s ready ----------------------------------- * multi_ack_detailed only: Sent in response to "have". The client should stop transmitting objects which are reachable from the argument, and send "done" soon to get the objects. If the remote side has the specified object, it should first send an "ACK %s common" message prior to sending "ACK %s ready". Clients may still submit additional "have" lines if there are more side branches for the client to explore that might be added to the common set and reduce the number of objects to transfer. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:25 +01:00
packet_write(1, "ACK %s continue\n", hex);
}
break;
default:
got_common = 1;
Add multi_ack_detailed capability to fetch-pack/upload-pack When multi_ack_detailed is enabled the ACK continue messages returned by the remote upload-pack are broken out to describe the different states within the peer. This permits the client to better understand the server's in-memory state. The fetch-pack/upload-pack protocol now looks like: NAK --------------------------------- Always sent in response to "done" if there was no common base selected from the "have" lines (or no have lines were sent). * no multi_ack or multi_ack_detailed: Sent when the client has sent a pkt-line flush ("0000") and the server has not yet found a common base object. * either multi_ack or multi_ack_detailed: Always sent in response to a pkt-line flush. ACK %s ----------------------------------- * no multi_ack or multi_ack_detailed: Sent in response to "have" when the object exists on the remote side and is therefore an object in common between the peers. The argument is the SHA-1 of the common object. * either multi_ack or multi_ack_detailed: Sent in response to "done" if there are common objects. The argument is the last SHA-1 determined to be common. ACK %s continue ----------------------------------- * multi_ack only: Sent in response to "have". The remote side wants the client to consider this object as common, and immediately stop transmitting additional "have" lines for objects that are reachable from it. The reason the client should stop is not given, but is one of the two cases below available under multi_ack_detailed. ACK %s common ----------------------------------- * multi_ack_detailed only: Sent in response to "have". Both sides have this object. Like with "ACK %s continue" above the client should stop sending have lines reachable for objects from the argument. ACK %s ready ----------------------------------- * multi_ack_detailed only: Sent in response to "have". The client should stop transmitting objects which are reachable from the argument, and send "done" soon to get the objects. If the remote side has the specified object, it should first send an "ACK %s common" message prior to sending "ACK %s ready". Clients may still submit additional "have" lines if there are more side branches for the client to explore that might be added to the common set and reduce the number of objects to transfer. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:25 +01:00
memcpy(last_hex, sha1_to_hex(sha1), 41);
if (multi_ack == 2)
packet_write(1, "ACK %s common\n", last_hex);
else if (multi_ack)
packet_write(1, "ACK %s continue\n", last_hex);
else if (have_obj.nr == 1)
Add multi_ack_detailed capability to fetch-pack/upload-pack When multi_ack_detailed is enabled the ACK continue messages returned by the remote upload-pack are broken out to describe the different states within the peer. This permits the client to better understand the server's in-memory state. The fetch-pack/upload-pack protocol now looks like: NAK --------------------------------- Always sent in response to "done" if there was no common base selected from the "have" lines (or no have lines were sent). * no multi_ack or multi_ack_detailed: Sent when the client has sent a pkt-line flush ("0000") and the server has not yet found a common base object. * either multi_ack or multi_ack_detailed: Always sent in response to a pkt-line flush. ACK %s ----------------------------------- * no multi_ack or multi_ack_detailed: Sent in response to "have" when the object exists on the remote side and is therefore an object in common between the peers. The argument is the SHA-1 of the common object. * either multi_ack or multi_ack_detailed: Sent in response to "done" if there are common objects. The argument is the last SHA-1 determined to be common. ACK %s continue ----------------------------------- * multi_ack only: Sent in response to "have". The remote side wants the client to consider this object as common, and immediately stop transmitting additional "have" lines for objects that are reachable from it. The reason the client should stop is not given, but is one of the two cases below available under multi_ack_detailed. ACK %s common ----------------------------------- * multi_ack_detailed only: Sent in response to "have". Both sides have this object. Like with "ACK %s continue" above the client should stop sending have lines reachable for objects from the argument. ACK %s ready ----------------------------------- * multi_ack_detailed only: Sent in response to "have". The client should stop transmitting objects which are reachable from the argument, and send "done" soon to get the objects. If the remote side has the specified object, it should first send an "ACK %s common" message prior to sending "ACK %s ready". Clients may still submit additional "have" lines if there are more side branches for the client to explore that might be added to the common set and reduce the number of objects to transfer. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:25 +01:00
packet_write(1, "ACK %s\n", last_hex);
break;
}
continue;
}
if (!strcmp(line, "done")) {
if (have_obj.nr > 0) {
if (multi_ack)
packet_write(1, "ACK %s\n", last_hex);
return 0;
}
packet_write(1, "NAK\n");
return -1;
}
die("git upload-pack: expected SHA1 list, got '%s'", line);
}
}
static int is_our_ref(struct object *o)
{
int allow_hidden_ref = (allow_unadvertised_object_request &
(ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1));
return o->flags & ((allow_hidden_ref ? HIDDEN_REF : 0) | OUR_REF);
}
/*
* on successful case, it's up to the caller to close cmd->out
*/
static int do_reachable_revlist(struct child_process *cmd,
struct object_array *src,
struct object_array *reachable)
{
static const char *argv[] = {
"rev-list", "--stdin", NULL,
};
struct object *o;
char namebuf[42]; /* ^ + SHA-1 + LF */
int i;
cmd->argv = argv;
cmd->git_cmd = 1;
cmd->no_stderr = 1;
cmd->in = -1;
cmd->out = -1;
/*
* If the next rev-list --stdin encounters an unknown commit,
* it terminates, which will cause SIGPIPE in the write loop
* below.
*/
sigchain_push(SIGPIPE, SIG_IGN);
if (start_command(cmd))
goto error;
namebuf[0] = '^';
namebuf[41] = '\n';
for (i = get_max_object_index(); 0 < i; ) {
o = get_indexed_object(--i);
if (!o)
continue;
if (reachable && o->type == OBJ_COMMIT)
o->flags &= ~TMP_MARK;
if (!is_our_ref(o))
continue;
memcpy(namebuf + 1, oid_to_hex(&o->oid), GIT_SHA1_HEXSZ);
if (write_in_full(cmd->in, namebuf, 42) < 0)
goto error;
}
namebuf[40] = '\n';
for (i = 0; i < src->nr; i++) {
o = src->objects[i].item;
if (is_our_ref(o)) {
if (reachable)
add_object_array(o, NULL, reachable);
continue;
}
if (reachable && o->type == OBJ_COMMIT)
o->flags |= TMP_MARK;
memcpy(namebuf, oid_to_hex(&o->oid), GIT_SHA1_HEXSZ);
if (write_in_full(cmd->in, namebuf, 41) < 0)
goto error;
}
close(cmd->in);
cmd->in = -1;
sigchain_pop(SIGPIPE);
return 0;
error:
sigchain_pop(SIGPIPE);
if (cmd->in >= 0)
close(cmd->in);
if (cmd->out >= 0)
close(cmd->out);
return -1;
}
static int get_reachable_list(struct object_array *src,
struct object_array *reachable)
{
struct child_process cmd = CHILD_PROCESS_INIT;
int i;
struct object *o;
char namebuf[42]; /* ^ + SHA-1 + LF */
if (do_reachable_revlist(&cmd, src, reachable) < 0)
return -1;
while ((i = read_in_full(cmd.out, namebuf, 41)) == 41) {
struct object_id sha1;
if (namebuf[40] != '\n' || get_oid_hex(namebuf, &sha1))
break;
o = lookup_object(sha1.hash);
if (o && o->type == OBJ_COMMIT) {
o->flags &= ~TMP_MARK;
}
}
for (i = get_max_object_index(); 0 < i; i--) {
o = get_indexed_object(i - 1);
if (o && o->type == OBJ_COMMIT &&
(o->flags & TMP_MARK)) {
add_object_array(o, NULL, reachable);
o->flags &= ~TMP_MARK;
}
}
close(cmd.out);
if (finish_command(&cmd))
return -1;
return 0;
}
static int has_unreachable(struct object_array *src)
{
struct child_process cmd = CHILD_PROCESS_INIT;
char buf[1];
int i;
if (do_reachable_revlist(&cmd, src, NULL) < 0)
return 1;
/*
* The commits out of the rev-list are not ancestors of
* our ref.
*/
i = read_in_full(cmd.out, buf, 1);
if (i)
goto error;
close(cmd.out);
cmd.out = -1;
/*
* rev-list may have died by encountering a bad commit
* in the history, in which case we do want to bail out
* even when it showed no commit.
*/
if (finish_command(&cmd))
goto error;
/* All the non-tip ones are ancestors of what we advertised */
return 0;
error:
sigchain_pop(SIGPIPE);
if (cmd.out >= 0)
close(cmd.out);
return 1;
}
static void check_non_tip(void)
{
int i;
/*
* In the normal in-process case without
* uploadpack.allowReachableSHA1InWant,
* non-tip requests can never happen.
*/
if (!stateless_rpc && !(allow_unadvertised_object_request & ALLOW_REACHABLE_SHA1))
goto error;
if (!has_unreachable(&want_obj))
/* All the non-tip ones are ancestors of what we advertised */
return;
error:
/* Pick one of them (we know there at least is one) */
for (i = 0; i < want_obj.nr; i++) {
struct object *o = want_obj.objects[i].item;
if (!is_our_ref(o))
die("git upload-pack: not our ref %s",
oid_to_hex(&o->oid));
}
}
static void send_shallow(struct commit_list *result)
{
while (result) {
struct object *object = &result->item->object;
if (!(object->flags & (CLIENT_SHALLOW|NOT_SHALLOW))) {
packet_write(1, "shallow %s",
oid_to_hex(&object->oid));
register_shallow(object->oid.hash);
shallow_nr++;
}
result = result->next;
}
}
static void send_unshallow(const struct object_array *shallows)
{
int i;
for (i = 0; i < shallows->nr; i++) {
struct object *object = shallows->objects[i].item;
if (object->flags & NOT_SHALLOW) {
struct commit_list *parents;
packet_write(1, "unshallow %s",
oid_to_hex(&object->oid));
object->flags &= ~CLIENT_SHALLOW;
/*
* We want to _register_ "object" as shallow, but we
* also need to traverse object's parents to deepen a
* shallow clone. Unregister it for now so we can
* parse and add the parents to the want list, then
* re-register it.
*/
unregister_shallow(object->oid.hash);
object->parsed = 0;
parse_commit_or_die((struct commit *)object);
parents = ((struct commit *)object)->parents;
while (parents) {
add_object_array(&parents->item->object,
NULL, &want_obj);
parents = parents->next;
}
add_object_array(object, NULL, &extra_edge_obj);
}
/* make sure commit traversal conforms to client */
register_shallow(object->oid.hash);
}
}
fetch, upload-pack: --deepen=N extends shallow boundary by N commits In git-fetch, --depth argument is always relative with the latest remote refs. This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper. It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth". Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not. This patch fixes that. A new argument --deepen=<N> will add <N> more (*) parent commits to the current history regardless of where remote refs are. Have/Want negotiation is still respected. So if remote refs move, the server will send two chunks: one between "have" and "want" and another to extend shallow history. In theory, the client could send no "want"s in order to get the second chunk only. But the protocol does not allow that. Either you send no want lines, which means ls-remote; or you have to send at least one want line that carries deep-relative to the server.. The main work was done by Dongcan Jiang. I fixed it up here and there. And of course all the bugs belong to me. (*) We could even support --deepen=<N> where <N> is negative. In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result). Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-12 12:54:09 +02:00
static void deepen(int depth, int deepen_relative,
struct object_array *shallows)
{
if (depth == INFINITE_DEPTH && !is_repository_shallow()) {
int i;
for (i = 0; i < shallows->nr; i++) {
struct object *object = shallows->objects[i].item;
object->flags |= NOT_SHALLOW;
}
fetch, upload-pack: --deepen=N extends shallow boundary by N commits In git-fetch, --depth argument is always relative with the latest remote refs. This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper. It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth". Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not. This patch fixes that. A new argument --deepen=<N> will add <N> more (*) parent commits to the current history regardless of where remote refs are. Have/Want negotiation is still respected. So if remote refs move, the server will send two chunks: one between "have" and "want" and another to extend shallow history. In theory, the client could send no "want"s in order to get the second chunk only. But the protocol does not allow that. Either you send no want lines, which means ls-remote; or you have to send at least one want line that carries deep-relative to the server.. The main work was done by Dongcan Jiang. I fixed it up here and there. And of course all the bugs belong to me. (*) We could even support --deepen=<N> where <N> is negative. In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result). Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-12 12:54:09 +02:00
} else if (deepen_relative) {
struct object_array reachable_shallows = OBJECT_ARRAY_INIT;
struct commit_list *result;
get_reachable_list(shallows, &reachable_shallows);
result = get_shallow_commits(&reachable_shallows,
depth + 1,
SHALLOW, NOT_SHALLOW);
send_shallow(result);
free_commit_list(result);
object_array_clear(&reachable_shallows);
} else {
struct commit_list *result;
result = get_shallow_commits(&want_obj, depth,
SHALLOW, NOT_SHALLOW);
send_shallow(result);
free_commit_list(result);
}
send_unshallow(shallows);
packet_flush(1);
}
static void deepen_by_rev_list(int ac, const char **av,
struct object_array *shallows)
{
struct commit_list *result;
result = get_shallow_commits_by_rev_list(ac, av, SHALLOW, NOT_SHALLOW);
send_shallow(result);
free_commit_list(result);
send_unshallow(shallows);
packet_flush(1);
}
static void receive_needs(void)
{
struct object_array shallows = OBJECT_ARRAY_INIT;
struct string_list deepen_not = STRING_LIST_INIT_DUP;
pkt-line: provide a LARGE_PACKET_MAX static buffer Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 21:02:57 +01:00
int depth = 0;
int has_non_tip = 0;
unsigned long deepen_since = 0;
int deepen_rev_list = 0;
shallow_nr = 0;
for (;;) {
struct object *o;
const char *features;
unsigned char sha1_buf[20];
pkt-line: provide a LARGE_PACKET_MAX static buffer Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 21:02:57 +01:00
char *line = packet_read_line(0, NULL);
const char *arg;
reset_timeout();
pkt-line: provide a LARGE_PACKET_MAX static buffer Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 21:02:57 +01:00
if (!line)
break;
if (skip_prefix(line, "shallow ", &arg)) {
unsigned char sha1[20];
struct object *object;
if (get_sha1_hex(arg, sha1))
die("invalid shallow line: %s", line);
object = parse_object(sha1);
if (!object)
continue;
if (object->type != OBJ_COMMIT)
die("invalid shallow object %s", sha1_to_hex(sha1));
if (!(object->flags & CLIENT_SHALLOW)) {
object->flags |= CLIENT_SHALLOW;
add_object_array(object, NULL, &shallows);
}
continue;
}
if (skip_prefix(line, "deepen ", &arg)) {
char *end = NULL;
depth = strtol(arg, &end, 0);
if (!end || *end || depth <= 0)
die("Invalid deepen: %s", line);
continue;
}
if (skip_prefix(line, "deepen-since ", &arg)) {
char *end = NULL;
deepen_since = strtoul(arg, &end, 0);
if (!end || *end || !deepen_since ||
/* revisions.c's max_age -1 is special */
deepen_since == -1)
die("Invalid deepen-since: %s", line);
deepen_rev_list = 1;
continue;
}
if (skip_prefix(line, "deepen-not ", &arg)) {
char *ref = NULL;
unsigned char sha1[20];
if (expand_ref(arg, strlen(arg), sha1, &ref) != 1)
die("git upload-pack: ambiguous deepen-not: %s", line);
string_list_append(&deepen_not, ref);
free(ref);
deepen_rev_list = 1;
continue;
}
if (!skip_prefix(line, "want ", &arg) ||
get_sha1_hex(arg, sha1_buf))
die("git upload-pack: protocol error, "
"expected to get sha, not '%s'", line);
features = arg + 40;
fetch, upload-pack: --deepen=N extends shallow boundary by N commits In git-fetch, --depth argument is always relative with the latest remote refs. This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper. It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth". Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not. This patch fixes that. A new argument --deepen=<N> will add <N> more (*) parent commits to the current history regardless of where remote refs are. Have/Want negotiation is still respected. So if remote refs move, the server will send two chunks: one between "have" and "want" and another to extend shallow history. In theory, the client could send no "want"s in order to get the second chunk only. But the protocol does not allow that. Either you send no want lines, which means ls-remote; or you have to send at least one want line that carries deep-relative to the server.. The main work was done by Dongcan Jiang. I fixed it up here and there. And of course all the bugs belong to me. (*) We could even support --deepen=<N> where <N> is negative. In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result). Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-12 12:54:09 +02:00
if (parse_feature_request(features, "deepen-relative"))
deepen_relative = 1;
if (parse_feature_request(features, "multi_ack_detailed"))
Add multi_ack_detailed capability to fetch-pack/upload-pack When multi_ack_detailed is enabled the ACK continue messages returned by the remote upload-pack are broken out to describe the different states within the peer. This permits the client to better understand the server's in-memory state. The fetch-pack/upload-pack protocol now looks like: NAK --------------------------------- Always sent in response to "done" if there was no common base selected from the "have" lines (or no have lines were sent). * no multi_ack or multi_ack_detailed: Sent when the client has sent a pkt-line flush ("0000") and the server has not yet found a common base object. * either multi_ack or multi_ack_detailed: Always sent in response to a pkt-line flush. ACK %s ----------------------------------- * no multi_ack or multi_ack_detailed: Sent in response to "have" when the object exists on the remote side and is therefore an object in common between the peers. The argument is the SHA-1 of the common object. * either multi_ack or multi_ack_detailed: Sent in response to "done" if there are common objects. The argument is the last SHA-1 determined to be common. ACK %s continue ----------------------------------- * multi_ack only: Sent in response to "have". The remote side wants the client to consider this object as common, and immediately stop transmitting additional "have" lines for objects that are reachable from it. The reason the client should stop is not given, but is one of the two cases below available under multi_ack_detailed. ACK %s common ----------------------------------- * multi_ack_detailed only: Sent in response to "have". Both sides have this object. Like with "ACK %s continue" above the client should stop sending have lines reachable for objects from the argument. ACK %s ready ----------------------------------- * multi_ack_detailed only: Sent in response to "have". The client should stop transmitting objects which are reachable from the argument, and send "done" soon to get the objects. If the remote side has the specified object, it should first send an "ACK %s common" message prior to sending "ACK %s ready". Clients may still submit additional "have" lines if there are more side branches for the client to explore that might be added to the common set and reduce the number of objects to transfer. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 01:47:25 +01:00
multi_ack = 2;
else if (parse_feature_request(features, "multi_ack"))
multi_ack = 1;
if (parse_feature_request(features, "no-done"))
no_done = 1;
if (parse_feature_request(features, "thin-pack"))
use_thin_pack = 1;
if (parse_feature_request(features, "ofs-delta"))
use_ofs_delta = 1;
if (parse_feature_request(features, "side-band-64k"))
use_sideband = LARGE_PACKET_MAX;
else if (parse_feature_request(features, "side-band"))
use_sideband = DEFAULT_PACKET_MAX;
if (parse_feature_request(features, "no-progress"))
no_progress = 1;
if (parse_feature_request(features, "include-tag"))
use_include_tag = 1;
upload-pack: load non-tip "want" objects from disk It is a long-time security feature that upload-pack will not serve any "want" lines that do not correspond to the tip of one of our refs. Traditionally, this was enforced by checking the objects in the in-memory hash; they should have been loaded and received the OUR_REF flag during the advertisement. The stateless-rpc mode, however, has a race condition here: one process advertises, and another receives the want lines, so the refs may have changed in the interim. To address this, commit 051e400 added a new verification mode; if the object is not OUR_REF, we set a "has_non_tip" flag, and then later verify that the requested objects are reachable from our current tips. However, we still die immediately when the object is not in our in-memory hash, and at this point we should only have loaded our tip objects. So the check_non_tip code path does not ever actually trigger, as any non-tip objects would have already caused us to die. We can fix that by using parse_object instead of lookup_object, which will load the object from disk if it has not already been loaded. We still need to check that parse_object does not return NULL, though, as it is possible we do not have the object at all. A more appropriate error message would be "no such object" rather than "not our ref"; however, we do not want to leak information about what objects are or are not in the object database, so we continue to use the same "not our ref" message that would be produced by an unreachable object. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-16 11:28:30 +01:00
o = parse_object(sha1_buf);
if (!o)
die("git upload-pack: not our ref %s",
sha1_to_hex(sha1_buf));
if (!(o->flags & WANTED)) {
o->flags |= WANTED;
if (!is_our_ref(o))
has_non_tip = 1;
add_object_array(o, NULL, &want_obj);
}
}
/*
* We have sent all our refs already, and the other end
* should have chosen out of them. When we are operating
* in the stateless RPC mode, however, their choice may
* have been based on the set of older refs advertised
* by another process that handled the initial request.
*/
if (has_non_tip)
check_non_tip();
if (!use_sideband && daemon_mode)
no_progress = 1;
if (depth == 0 && !deepen_rev_list && shallows.nr == 0)
return;
if (depth > 0 && deepen_rev_list)
die("git upload-pack: deepen and deepen-since (or deepen-not) cannot be used together");
if (depth > 0)
fetch, upload-pack: --deepen=N extends shallow boundary by N commits In git-fetch, --depth argument is always relative with the latest remote refs. This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper. It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth". Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not. This patch fixes that. A new argument --deepen=<N> will add <N> more (*) parent commits to the current history regardless of where remote refs are. Have/Want negotiation is still respected. So if remote refs move, the server will send two chunks: one between "have" and "want" and another to extend shallow history. In theory, the client could send no "want"s in order to get the second chunk only. But the protocol does not allow that. Either you send no want lines, which means ls-remote; or you have to send at least one want line that carries deep-relative to the server.. The main work was done by Dongcan Jiang. I fixed it up here and there. And of course all the bugs belong to me. (*) We could even support --deepen=<N> where <N> is negative. In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result). Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-12 12:54:09 +02:00
deepen(depth, deepen_relative, &shallows);
else if (deepen_rev_list) {
struct argv_array av = ARGV_ARRAY_INIT;
int i;
argv_array_push(&av, "rev-list");
if (deepen_since)
argv_array_pushf(&av, "--max-age=%lu", deepen_since);
if (deepen_not.nr) {
argv_array_push(&av, "--not");
for (i = 0; i < deepen_not.nr; i++) {
struct string_list_item *s = deepen_not.items + i;
argv_array_push(&av, s->string);
}
argv_array_push(&av, "--not");
}
for (i = 0; i < want_obj.nr; i++) {
struct object *o = want_obj.objects[i].item;
argv_array_push(&av, oid_to_hex(&o->oid));
}
deepen_by_rev_list(av.argc, av.argv, &shallows);
argv_array_clear(&av);
}
else
if (shallows.nr > 0) {
int i;
for (i = 0; i < shallows.nr; i++)
register_shallow(shallows.objects[i].item->oid.hash);
}
shallow_nr += shallows.nr;
free(shallows.objects);
}
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
/* return non-zero if the ref is hidden, otherwise 0 */
static int mark_our_ref(const char *refname, const char *refname_full,
const struct object_id *oid)
{
struct object *o = lookup_unknown_object(oid->hash);
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
if (ref_is_hidden(refname, refname_full)) {
o->flags |= HIDDEN_REF;
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
return 1;
}
o->flags |= OUR_REF;
return 0;
}
static int check_ref(const char *refname_full, const struct object_id *oid,
int flag, void *cb_data)
{
const char *refname = strip_namespace(refname_full);
mark_our_ref(refname, refname_full, oid);
return 0;
}
static void format_symref_info(struct strbuf *buf, struct string_list *symref)
{
struct string_list_item *item;
if (!symref->nr)
return;
for_each_string_list_item(item, symref)
strbuf_addf(buf, " symref=%s:%s", item->string, (char *)item->util);
}
static int send_ref(const char *refname, const struct object_id *oid,
int flag, void *cb_data)
{
static const char *capabilities = "multi_ack thin-pack side-band"
fetch, upload-pack: --deepen=N extends shallow boundary by N commits In git-fetch, --depth argument is always relative with the latest remote refs. This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper. It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth". Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not. This patch fixes that. A new argument --deepen=<N> will add <N> more (*) parent commits to the current history regardless of where remote refs are. Have/Want negotiation is still respected. So if remote refs move, the server will send two chunks: one between "have" and "want" and another to extend shallow history. In theory, the client could send no "want"s in order to get the second chunk only. But the protocol does not allow that. Either you send no want lines, which means ls-remote; or you have to send at least one want line that carries deep-relative to the server.. The main work was done by Dongcan Jiang. I fixed it up here and there. And of course all the bugs belong to me. (*) We could even support --deepen=<N> where <N> is negative. In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result). Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-12 12:54:09 +02:00
" side-band-64k ofs-delta shallow deepen-since deepen-not"
" deepen-relative no-progress include-tag multi_ack_detailed";
const char *refname_nons = strip_namespace(refname);
struct object_id peeled;
if (mark_our_ref(refname_nons, refname, oid))
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
return 0;
if (capabilities) {
struct strbuf symref_info = STRBUF_INIT;
format_symref_info(&symref_info, cb_data);
packet_write(1, "%s %s%c%s%s%s%s%s agent=%s\n",
oid_to_hex(oid), refname_nons,
0, capabilities,
(allow_unadvertised_object_request & ALLOW_TIP_SHA1) ?
" allow-tip-sha1-in-want" : "",
(allow_unadvertised_object_request & ALLOW_REACHABLE_SHA1) ?
" allow-reachable-sha1-in-want" : "",
stateless_rpc ? " no-done" : "",
symref_info.buf,
git_user_agent_sanitized());
strbuf_release(&symref_info);
} else {
packet_write(1, "%s %s\n", oid_to_hex(oid), refname_nons);
}
capabilities = NULL;
if (!peel_ref(refname, peeled.hash))
packet_write(1, "%s %s^{}\n", oid_to_hex(&peeled), refname_nons);
return 0;
}
static int find_symref(const char *refname, const struct object_id *oid,
int flag, void *cb_data)
{
const char *symref_target;
struct string_list_item *item;
struct object_id unused;
if ((flag & REF_ISSYMREF) == 0)
return 0;
symref_target = resolve_ref_unsafe(refname, 0, unused.hash, &flag);
if (!symref_target || (flag & REF_ISSYMREF) == 0)
die("'%s' is a symref but it is not?", refname);
item = string_list_append(cb_data, refname);
item->util = xstrdup(symref_target);
return 0;
}
static void upload_pack(void)
{
struct string_list symref = STRING_LIST_INIT_DUP;
head_ref_namespaced(find_symref, &symref);
if (advertise_refs || !stateless_rpc) {
reset_timeout();
head_ref_namespaced(send_ref, &symref);
for_each_namespaced_ref(send_ref, &symref);
make the sender advertise shallow commits to the receiver If either receive-pack or upload-pack is called on a shallow repository, shallow commits (*) will be sent after the ref advertisement (but before the packet flush), so that the receiver has the full "shape" of the sender's commit graph. This will be needed for the receiver to update its .git/shallow if necessary. This breaks the protocol for all clients trying to push to a shallow repo, or fetch from one. Which is basically the same end result as today's "is_repository_shallow() && die()" in receive-pack and upload-pack. New clients will be made aware of shallow upstream and can make use of this information. The sender must send all shallow commits that are sent in the following pack. It may send more shallow commits than necessary. upload-pack for example may choose to advertise no shallow commits if it knows in advance that the pack it's going to send contains no shallow commits. But upload-pack is the server, so we choose the cheaper way, send full .git/shallow and let the client deal with it. Smart HTTP is not affected by this patch. Shallow support on smart-http comes later separately. (*) A shallow commit is a commit that terminates the revision walker. It is usually put in .git/shallow in order to keep the revision walker from going out of bound because there is no guarantee that objects behind this commit is available. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:02:32 +01:00
advertise_shallow_grafts(1);
packet_flush(1);
} else {
head_ref_namespaced(check_ref, NULL);
for_each_namespaced_ref(check_ref, NULL);
}
string_list_clear(&symref, 1);
if (advertise_refs)
return;
receive_needs();
if (want_obj.nr) {
get_common_commits();
create_pack_file();
}
}
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
static int upload_pack_config(const char *var, const char *value, void *unused)
{
if (!strcmp("uploadpack.allowtipsha1inwant", var)) {
if (git_config_bool(var, value))
allow_unadvertised_object_request |= ALLOW_TIP_SHA1;
else
allow_unadvertised_object_request &= ~ALLOW_TIP_SHA1;
} else if (!strcmp("uploadpack.allowreachablesha1inwant", var)) {
if (git_config_bool(var, value))
allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
else
allow_unadvertised_object_request &= ~ALLOW_REACHABLE_SHA1;
} else if (!strcmp("uploadpack.keepalive", var)) {
keepalive = git_config_int(var, value);
if (!keepalive)
keepalive = -1;
upload-pack: provide a hook for running pack-objects When upload-pack serves a client request, it turns to pack-objects to do the heavy lifting of creating a packfile. There's no easy way to intercept the call to pack-objects, but there are a few good reasons to want to do so: 1. If you're debugging a client or server issue with fetching, you may want to store a copy of the generated packfile. 2. If you're gathering data from real-world fetches for performance analysis or debugging, storing a copy of the arguments and stdin lets you replay the pack generation at your leisure. 3. You may want to insert a caching layer around pack-objects; it is the most CPU- and memory-intensive part of serving a fetch, and its output is a pure function[1] of its input, making it an ideal place to consolidate identical requests. This patch adds a simple "hook" interface to intercept calls to pack-objects. The new test demonstrates how it can be used for debugging (using it for caching is a straightforward extension; the tricky part is writing the actual caching layer). This hook is unlike the normal hook scripts found in the "hooks/" directory of a repository. Because we promise that upload-pack is safe to run in an untrusted repository, we cannot execute arbitrary code or commands found in the repository (neither in hooks/, nor in the config). So instead, this hook is triggered from a config variable that is explicitly ignored in the per-repo config. The config variable holds the actual shell command to run as the hook. Another approach would be to simply treat it as a boolean: "should I respect the upload-pack hooks in this repo?", and then run the script from "hooks/" as we usually do. However, that isn't as flexible; there's no way to run a hook approved by the site administrator (e.g., in "/etc/gitconfig") on a repository whose contents are not trusted. The approach taken by this patch is more fine-grained, if a little less conventional for git hooks (it does behave similar to other configured commands like diff.external, etc). [1] Pack-objects isn't _actually_ a pure function. Its output depends on the exact packing of the object database, and if multi-threading is used for delta compression, can even differ racily. But for the purposes of caching, that's OK; of the many possible outputs for a given input, it is sufficient only that we output one of them. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-19 00:45:37 +02:00
} else if (current_config_scope() != CONFIG_SCOPE_REPO) {
if (!strcmp("uploadpack.packobjectshook", var))
return git_config_string(&pack_objects_hook, var, value);
}
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
return parse_hide_refs_config(var, value, "uploadpack");
}
add an extra level of indirection to main() There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char **" instead of "char **" for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 07:58:58 +02:00
int cmd_main(int argc, const char **argv)
{
const char *dir;
int strict = 0;
struct option options[] = {
OPT_BOOL(0, "stateless-rpc", &stateless_rpc,
N_("quit after a single request/response exchange")),
OPT_BOOL(0, "advertise-refs", &advertise_refs,
N_("exit immediately after initial ref advertisement")),
OPT_BOOL(0, "strict", &strict,
N_("do not try <directory>/.git/ if <directory> is no Git directory")),
OPT_INTEGER(0, "timeout", &timeout,
N_("interrupt transfer after <n> seconds of inactivity")),
OPT_END()
};
packet_trace_identity("upload-pack");
check_replace_refs = 0;
argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0);
if (argc != 1)
usage_with_options(upload_pack_usage, options);
if (timeout)
daemon_mode = 1;
setup_path();
dir = argv[0];
if (!enter_repo(dir, strict))
die("'%s' does not appear to be a git repository", dir);
make the sender advertise shallow commits to the receiver If either receive-pack or upload-pack is called on a shallow repository, shallow commits (*) will be sent after the ref advertisement (but before the packet flush), so that the receiver has the full "shape" of the sender's commit graph. This will be needed for the receiver to update its .git/shallow if necessary. This breaks the protocol for all clients trying to push to a shallow repo, or fetch from one. Which is basically the same end result as today's "is_repository_shallow() && die()" in receive-pack and upload-pack. New clients will be made aware of shallow upstream and can make use of this information. The sender must send all shallow commits that are sent in the following pack. It may send more shallow commits than necessary. upload-pack for example may choose to advertise no shallow commits if it knows in advance that the pack it's going to send contains no shallow commits. But upload-pack is the server, so we choose the cheaper way, send full .git/shallow and let the client deal with it. Smart HTTP is not affected by this patch. Shallow support on smart-http comes later separately. (*) A shallow commit is a commit that terminates the revision walker. It is usually put in .git/shallow in order to keep the revision walker from going out of bound because there is no guarantee that objects behind this commit is available. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:02:32 +01:00
upload/receive-pack: allow hiding ref hierarchies A repository may have refs that are only used for its internal bookkeeping purposes that should not be exposed to the others that come over the network. Teach upload-pack to omit some refs from its initial advertisement by paying attention to the uploadpack.hiderefs multi-valued configuration variable. Do the same to receive-pack via the receive.hiderefs variable. As a convenient short-hand, allow using transfer.hiderefs to set the value to both of these variables. Any ref that is under the hierarchies listed on the value of these variable is excluded from responses to requests made by "ls-remote", "fetch", etc. (for upload-pack) and "push" (for receive-pack). Because these hidden refs do not count as OUR_REF, an attempt to fetch objects at the tip of them will be rejected, and because these refs do not get advertised, "git push :" will not see local branches that have the same name as them as "matching" ones to be sent. An attempt to update/delete these hidden refs with an explicit refspec, e.g. "git push origin :refs/hidden/22", is rejected. This is not a new restriction. To the pusher, it would appear that there is no such ref, so its push request will conclude with "Now that I sent you all the data, it is time for you to update the refs. I saw that the ref did not exist when I started pushing, and I want the result to point at this commit". The receiving end will apply the compare-and-swap rule to this request and rejects the push with "Well, your update request conflicts with somebody else; I see there is such a ref.", which is the right thing to do. Otherwise a push to a hidden ref will always be "the last one wins", which is not a good default. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 01:08:30 +01:00
git_config(upload_pack_config, NULL);
upload_pack();
return 0;
}