When printing an error message saying a ref was requested that we do not
have, only print that ref, rather than the ref and everything sent to us
on the same packet line (e.g. protocol support specifications).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A bit of history in chronological order, the newest at bottom:
- 80ccaa7 (upload-pack: Move the revision walker into a separate function.)
do_rev_list was introduced with create_full_pack argument
- 21edd3f (upload-pack: Run rev-list in an asynchronous function.)
do_rev_list was now called by start_async, create_full_pack was
passed by rev_list.data
- f0cea83 (Shift object enumeration out of upload-pack)
rev_list.data was now zero permanently. Creating full pack was
done by passing --all to pack-objects
- ae6a560 (run-command: support custom fd-set in async)
rev_list.data = 0 was found out redudant and got rid of.
Get rid of the code as well, for less headache while reading do_rev_list.
[jc: noticed by Elijah Newren]
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds the possibility to supply a set of non-0 file
descriptors for async process communication instead of the
default-created pipe.
Additionally, we now support bi-directional communiction with the
async procedure, by giving the async function both read and write
file descriptors.
To retain compatiblity and similar "API feel" with start_command,
we require start_async callers to set .out = -1 to get a readable
file descriptor. If either of .in or .out is 0, we supply no file
descriptor to the async process.
[sp: Note: Erik started this patch, and a huge bulk of it is
his work. All bugs were introduced later by Shawn.]
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This hook runs after "git fetch" in the repository the objects are
fetched from as the user who fetched, and has security implications.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sp/smart-http: (37 commits)
http-backend: Let gcc check the format of more printf-type functions.
http-backend: Fix access beyond end of string.
http-backend: Fix bad treatment of uintmax_t in Content-Length
t5551-http-fetch: Work around broken Accept header in libcurl
t5551-http-fetch: Work around some libcurl versions
http-backend: Protect GIT_PROJECT_ROOT from /../ requests
Git-aware CGI to provide dumb HTTP transport
http-backend: Test configuration options
http-backend: Use http.getanyfile to disable dumb HTTP serving
test smart http fetch and push
http tests: use /dumb/ URL prefix
set httpd port before sourcing lib-httpd
t5540-http-push: remove redundant fetches
Smart HTTP fetch: gzip requests
Smart fetch over HTTP: client side
Smart push over HTTP: client side
Discover refs via smart HTTP server when available
http-backend: more explict LocationMatch
http-backend: add example for gitweb on same URL
http-backend: use mod_alias instead of mod_rewrite
...
Conflicts:
.gitignore
remote-curl.c
In theory it is possible for sideband channel #2 to be delayed if
pack data is quick to come up for sideband channel #1. And because
data for channel #2 is read only 128 bytes at a time while pack data
is read 8192 bytes at a time, it is possible for many pack blocks to
be sent to the client before the progress message fifo is emptied,
making the situation even worse. This would result in totally garbled
progress display on the client's console as local progress gets mixed
with partial remote progress lines.
Let's prevent such situations by giving transmission priority to
progress messages over pack data at all times.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --stateless-rpc is passed as a command line parameter to
upload-pack or receive-pack the programs now assume they may
perform only a single read-write cycle with stdin and stdout.
This fits with the HTTP POST request processing model where a
program may read the request, write a response, and must exit.
When --advertise-refs is passed as a command line parameter only
the initial ref advertisement is output, and the program exits
immediately. This fits with the HTTP GET request model, where
no request content is received but a response must be produced.
HTTP headers and/or environment are not processed here, but
instead are assumed to be handled by the program invoking
either service backend.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When multi_ack_detailed is enabled the ACK continue messages returned
by the remote upload-pack are broken out to describe the different
states within the peer. This permits the client to better understand
the server's in-memory state.
The fetch-pack/upload-pack protocol now looks like:
NAK
---------------------------------
Always sent in response to "done" if there was no common base
selected from the "have" lines (or no have lines were sent).
* no multi_ack or multi_ack_detailed:
Sent when the client has sent a pkt-line flush ("0000") and
the server has not yet found a common base object.
* either multi_ack or multi_ack_detailed:
Always sent in response to a pkt-line flush.
ACK %s
-----------------------------------
* no multi_ack or multi_ack_detailed:
Sent in response to "have" when the object exists on the remote
side and is therefore an object in common between the peers.
The argument is the SHA-1 of the common object.
* either multi_ack or multi_ack_detailed:
Sent in response to "done" if there are common objects.
The argument is the last SHA-1 determined to be common.
ACK %s continue
-----------------------------------
* multi_ack only:
Sent in response to "have".
The remote side wants the client to consider this object as
common, and immediately stop transmitting additional "have"
lines for objects that are reachable from it. The reason
the client should stop is not given, but is one of the two
cases below available under multi_ack_detailed.
ACK %s common
-----------------------------------
* multi_ack_detailed only:
Sent in response to "have". Both sides have this object.
Like with "ACK %s continue" above the client should stop
sending have lines reachable for objects from the argument.
ACK %s ready
-----------------------------------
* multi_ack_detailed only:
Sent in response to "have".
The client should stop transmitting objects which are reachable
from the argument, and send "done" soon to get the objects.
If the remote side has the specified object, it should
first send an "ACK %s common" message prior to sending
"ACK %s ready".
Clients may still submit additional "have" lines if there are
more side branches for the client to explore that might be added
to the common set and reduce the number of objects to transfer.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were several unchecked use of fdopen(); replace them with xfdopen()
that checks and dies.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2d14d65 (Use a clearer style to issue commands to remote helpers,
2009-09-03) I happened to notice two changes like this:
- write_in_full(helper->in, "list\n", 5);
+
+ strbuf_addstr(&buf, "list\n");
+ write_in_full(helper->in, buf.buf, buf.len);
+ strbuf_reset(&buf);
IMHO, it would be better to define a new function,
static inline ssize_t write_str_in_full(int fd, const char *str)
{
return write_in_full(fd, str, strlen(str));
}
and then use it like this:
- strbuf_addstr(&buf, "list\n");
- write_in_full(helper->in, buf.buf, buf.len);
- strbuf_reset(&buf);
+ write_str_in_full(helper->in, "list\n");
Thus not requiring the added allocation, and still avoiding
the maintenance risk of literal string lengths.
These days, compilers are good enough that strlen("literal")
imposes no run-time cost.
Transformed via this:
perl -pi -e \
's/write_in_full\((.*?), (".*?"), \d+\)/write_str_in_full($1, $2)/'\
$(git grep -l 'write_in_full.*"')
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
First of all, I can't find any reason why thin pack generation is
explicitly disabled when dealing with a shallow repository. The
possible delta base objects are collected from the edge commits which
are always obtained through history walking with the same shallow refs
as the client, Therefore the client is always going to have those base
objects available. So let's remove that restriction.
Then we can make shallow repository deepening much more efficient by
using the remote's unshallowed commits as edge commits to get preferred
base objects for thin pack generation. On git.git, this makes the data
transfer for the deepening of a shallow repository from depth 1 to depth 2
around 134 KB instead of 3.68 MB.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The majority of code in core git appears to use a single
space after if/for/while. This is an attempt to bring more
code to this standard. These are entirely cosmetic changes.
Signed-off-by: Brian Gianforcaro <b.gianfo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A request to clone the repository does not give any "have" but asks for
all the refs we offer with "want". When a request does not ask to clone
the repository fully, but asks to fetch some refs into an empty
repository, it will not give any "have" but its "want" won't ask for all
the refs we offer.
If we suppose (and I would say this is a rather big if) that it makes
sense to distinguish these two cases, a hook cannot reliably do this
alone. The hook can detect lack of "have" and bunch of "want", but there
is no direct way to tell if the other end asked for all refs we offered,
or merely most of them.
Between the time we talked with the other end and the time the hook got
called, we may have acquired more refs or lost some refs in the repository
by concurrent operations. Given that we plan to introduce selective
advertisement of refs with a protocol extension, it would become even more
difficult for hooks to guess between these two cases.
This adds "kind [clone|fetch]" to hook's input, as a stable interface to
allow the hooks to tell these cases apart.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After upload-pack successfully finishes its operation, post-upload-pack
hook can be called for logging purposes.
The hook is passed various pieces of information, one per line, from its
standard input. Currently the following items can be fed to the hook, but
more types of information may be added in the future:
want SHA-1::
40-byte hexadecimal object name the client asked to include in the
resulting pack. Can occur one or more times in the input.
have SHA-1::
40-byte hexadecimal object name the client asked to exclude from
the resulting pack, claiming to have them already. Can occur zero
or more times in the input.
time float::
Number of seconds spent for creating the packfile.
size decimal::
Size of the resulting packfile in bytes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* cc/replace:
t6050: check pushing something based on a replaced commit
Documentation: add documentation for "git replace"
Add git-replace to .gitignore
builtin-replace: use "usage_msg_opt" to give better error messages
parse-options: add new function "usage_msg_opt"
builtin-replace: teach "git replace" to actually replace
Add new "git replace" command
environment: add global variable to disable replacement
mktag: call "check_sha1_signature" with the replacement sha1
replace_object: add a test case
object: call "check_sha1_signature" with the replacement sha1
sha1_file: add a "read_sha1_file_repl" function
replace_object: add mechanism to replace objects found in "refs/replace/"
refs: add a "for_each_replace_ref" function
upload-pack runs pack-objects, which generates progress indicator output
on its stderr. If the client requests a sideband, this indicator is sent
to the client; but if it did not, then the progress is written to
upload-pack's own stderr.
If upload-pack is itself run from git-daemon (and if the client did not
request a sideband) the progress indicator never reaches the client and it
need not be generated in the first place. With this patch the progress
indicator is suppressed in this situation.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Offload object enumeration in upload-pack to pack-objects, but fall
back on internal revision walker for shallow interaction. Aside from
architecturally making more sense, this also leaves the door open for
pack-objects to employ a revision cache mechanism. Test t5530 updated
in order to explicitly check both enumeration methods.
Signed-off-by: Nick Edelen <sirnot@gmail.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new "read_replace_refs" global variable is set to 1 by
default, so that replace refs are used by default. But
reachability traversal and packing commands ("cmd_fsck",
"cmd_prune", "cmd_pack_objects", "upload_pack",
"cmd_unpack_objects") set it to 0, as they must work with the
original DAG.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lt/pack-object-memuse:
show_object(): push path_name() call further down
process_{tree,blob}: show objects without buffering
Conflicts:
builtin-pack-objects.c
builtin-rev-list.c
list-objects.c
list-objects.h
upload-pack.c
In particular, pushing the "path_name()" call _into_ the show() function
would seem to allow
- more clarity into who "owns" the name (ie now when we free the name in
the show_object callback, it's because we generated it ourselves by
calling path_name())
- not calling path_name() at all, either because we don't care about the
name in the first place, or because we are actually happy walking the
linked list of "struct name_path *" and the last component.
Now, I didn't do that latter optimization, because it would require some
more coding, but especially looking at "builtin-pack-objects.c", we really
don't even want the whole pathname, we really would be better off with the
list of path components.
Why? We use that name for two things:
- add_preferred_base_object(), which actually _wants_ to traverse the
path, and now does it by looking for '/' characters!
- for 'name_hash()', which only cares about the last 16 characters of a
name, so again, generating the full name seems to be just unnecessary
work.
Anyway, so I didn't look any closer at those things, but it did convince
me that the "show_object()" calling convention was crazy, and we're
actually better off doing _less_ in list-objects.c, and giving people
access to the internal data structures so that they can decide whether
they want to generate a path-name or not.
This patch does that, and then for people who did use the name (even if
they might do something more clever in the future), it just does the
straightforward "name = path_name(path, component); .. free(name);" thing.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Here's a less trivial thing, and slightly more dubious one.
I was looking at that "struct object_array objects", and wondering why we
do that. I have honestly totally forgotten. Why not just call the "show()"
function as we encounter the objects? Rather than add the objects to the
object_array, and then at the very end going through the array and doing a
'show' on all, just do things more incrementally.
Now, there are possible downsides to this:
- the "buffer using object_array" _can_ in theory result in at least
better I-cache usage (two tight loops rather than one more spread out
one). I don't think this is a real issue, but in theory..
- this _does_ change the order of the objects printed. Instead of doing a
"process_tree(revs, commit->tree, &objects, NULL, "");" in the loop
over the commits (which puts all the root trees _first_ in the object
list, this patch just adds them to the list of pending objects, and
then we'll traverse them in that order (and thus show each root tree
object together with the objects we discover under it)
I _think_ the new ordering actually makes more sense, but the object
ordering is actually a subtle thing when it comes to packing
efficiency, so any change in order is going to have implications for
packing. Good or bad, I dunno.
- There may be some reason why we did it that odd way with the object
array, that I have simply forgotten.
Anyway, now that we don't buffer up the objects before showing them
that may actually result in lower memory usage during that whole
traverse_commit_list() phase.
This is seriously not very deeply tested. It makes sense to me, it seems
to pass all the tests, it looks ok, but...
Does anybody remember why we did that "object_array" thing? It used to be
an "object_list" a long long time ago, but got changed into the array due
to better memory usage patterns (those linked lists of obejcts are
horrible from a memory allocation standpoint). But I wonder why we didn't
do this back then. Maybe there's a reason for it.
Or maybe there _used_ to be a reason, and no longer is.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The goal of this patch is to get rid of the "static struct rev_info
revs" static variable in "builtin-rev-list.c".
To do that, we need to pass the revs to the "show_commit" function
in "builtin-rev-list.c" and this in turn means that the
"traverse_commit_list" function in "list-objects.c" must be passed
functions pointers to functions with 2 parameters instead of one.
So we have to change all the callers and all the functions passed
to "traverse_commit_list".
Anyway this makes the code more clean and more generic, so it
should be a good thing in the long run.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These weren't used outside and can be safely moved
Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Certain remote commands, when asked to do something in a
particular directory that was not actually a git repository,
would say "unable to chdir or not a git archive". The
"chdir" bit is an unnecessary detail, and the term "git
archive" is much less common these days than "git repository".
So let's switch them all to:
fatal: '%s' does not appear to be a git repository
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Programs that use git_config need to find the global configuration.
When runtime prefix computation is enabled, this requires that
git_extract_argv0_path() is called early in the program's main().
This commit adds the necessary calls.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a mechanical conversion of all '*.c' files with:
s/((?:die|error|warning)\("git)-(\S+:)/$1 $2/;
The result was manually inspected and no false positive was found.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We will need the command invocation path in system_path(). This path was
passed to setup_path(), but system_path() can be called earlier, for
example via:
main
commit_pager_choice
setup_pager
git_config
git_etc_gitconfig
system_path
Therefore, we introduce git_set_argv0_path() and call it as soon as
possible.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In upload-pack we must explicitly close the output channel of rev-list.
(On Unix, the channel is closed automatically because process that runs
rev-list terminates.)
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
The new protocol extension "include-tag" allows the client side
of the connection (fetch-pack) to request that the server side of the
native git protocol (upload-pack / pack-objects) use --include-tag
as it prepares the packfile, thus ensuring that an annotated tag object
will be included in the resulting packfile if the object it refers to
was also included into the packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To facilitate testing and verification of the requests sent by
git-fetch to the remote side we permit logging the received packet
lines to the file descriptor specified in GIT_DEBUG_SEND_PACK has
been set. Special start and end lines are included to indicate
the start and end of each connection.
$ GIT_DEBUG_SEND_PACK=3 git fetch 3>UPLOAD_LOG
$ cat UPLOAD_LOG
#S
want 8e10cf4e007ad7e003463c30c34b1050b039db78 multi_ack side-band-64k thin-pack ofs-delta
want ddfa4a33562179aca1ace2bcc662244a17d0b503
#E
#S
want 3253df4d1cf6fb138b52b1938473bcfec1483223 multi_ack side-band-64k thin-pack ofs-delta
#E
>From the above trace the first connection opened by git-fetch was to
download two refs (with values 8e and dd) and the second connection
was opened to automatically follow an annotated tag (32).
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mk/maint-parse-careful:
receive-pack: use strict mode for unpacking objects
index-pack: introduce checking mode
unpack-objects: prevent writing of inconsistent objects
unpack-object: cache for non written objects
add common fsck error printing function
builtin-fsck: move common object checking code to fsck.c
builtin-fsck: reports missing parent commits
Remove unused object-ref code
builtin-fsck: move away from object-refs to fsck_walk
add generic, type aware object chain walker
Conflicts:
Makefile
builtin-fsck.c
* mk/maint-parse-careful:
peel_onion: handle NULL
check return value from parse_commit() in various functions
parse_commit: don't fail, if object is NULL
revision.c: handle tag->tagged == NULL
reachable.c::process_tree/blob: check for NULL
process_tag: handle tag->tagged == NULL
check results of parse_commit in merge_bases
list-objects.c::process_tree/blob: check for NULL
reachable.c::add_one_tree: handle NULL from lookup_tree
mark_blob/tree_uninteresting: check for NULL
get_sha1_oneline: check return value of parse_object
read_object_with_reference: don't read beyond the buffer
A failure in prepare_revision_walk can be caused by
a not parseable object.
Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since git-upload-pack has to spawn git-pack-objects, it has to make sure
that the latter can be found in the PATH. Without this patch an attempt
to clone or pull via ssh from a server fails if the git tools are not in
the standard PATH on the server even though git clone or git pull were
invoked with --upload-pack=/path/to/git-upload-pack.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
upload-pack spawns two processes, rev-list and pack-objects, and carefully
monitors their status so that it can report failure to the remote end.
This change removes the complicated procedures on the grounds of the
following observations:
- If everything is OK, rev-list closes its output pipe end, upon which
pack-objects (which reads from the pipe) sees EOF and terminates itself,
closing its output (and error) pipes. upload-pack reads from both until
it sees EOF in both. It collects the exit codes of the child processes
(which indicate success) and terminates successfully.
- If rev-list sees an error, it closes its output and terminates with
failure. pack-objects sees EOF in its input and terminates successfully.
Again upload-pack reads its inputs until EOF. When it now collects
the exit codes of its child processes, it notices the failure of rev-list
and signals failure to the remote end.
- If pack-objects sees an error, it terminates with failure. Since this
breaks the pipe to rev-list, rev-list is killed with SIGPIPE.
upload-pack reads its input until EOF, then collects the exit codes of
the child processes, notices their failures, and signals failure to the
remote end.
- If upload-pack itself dies unexpectedly, pack-objects is killed with
SIGPIPE, and subsequently also rev-list.
The upshot of this is that precise monitoring of child processes is not
required because both terminate if either one of them dies unexpectedly.
This allows us to use finish_command() and finish_async() instead of
an explicit waitpid(2) call.
The change is smaller than it looks because most of it only reduces the
indentation of a large part of the inner loop.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This gets rid of an explicit fork().
Since upload-pack has to coordinate two processes (rev-list and
pack-objects), we cannot use the normal finish_async(), but have to monitor
the process explicitly. Hence, there are no changes at this front.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This allows us later to use start_async() with this function, and at
the same time is a nice cleanup that makes a long function
(create_pack_file()) shorter.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This gets rid of an explicit fork/exec.
Since upload-pack has to coordinate two processes (rev-list and
pack-objects), we cannot use the normal finish_command(), but have to
monitor the processes explicitly. Hence, the waitpid() call remains.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>