Commit Graph

136 Commits

Author SHA1 Message Date
Josh Triplett
6b01ecfe22 ref namespaces: Support remote repositories via upload-pack and receive-pack
Change upload-pack and receive-pack to use the namespace-prefixed refs
when working with the repository, and use the unprefixed refs when
talking to the client, maintaining the masquerade.  This allows
clone, pull, fetch, and push to work with a suitably configured
GIT_NAMESPACE.

receive-pack advertises refs outside the current namespace as .have refs
(as it currently does for refs in alternates), so that the client can
use them to minimize data transfer but will otherwise ignore them.

With appropriate configuration, this also allows http-backend to expose
namespaces as multiple repositories with different paths.  This only
requires setting GIT_NAMESPACE, which http-backend passes through to
upload-pack and receive-pack.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jamey Sharp <jamey@minilop.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-11 09:35:38 -07:00
Junio C Hamano
982f6c90ee Merge branch 'jk/maint-upload-pack-shallow'
* jk/maint-upload-pack-shallow:
  upload-pack: start pack-objects before async rev-list
2011-04-27 11:36:42 -07:00
Jeff King
b961219779 upload-pack: start pack-objects before async rev-list
In a pthread-enabled version of upload-pack, there's a race condition
that can cause a deadlock on the fflush(NULL) we call from run-command.

What happens is this:

  1. Upload-pack is informed we are doing a shallow clone.

  2. We call start_async() to spawn a thread that will generate rev-list
     results to feed to pack-objects. It gets a file descriptor to a
     pipe which will eventually hook to pack-objects.

  3. The rev-list thread uses fdopen to create a new output stream
     around the fd we gave it, called pack_pipe.

  4. The thread writes results to pack_pipe. Outside of our control,
     libc is doing locking on the stream. We keep writing until the OS
     pipe buffer is full, and then we block in write(), still holding
     the lock.

  5. The main thread now uses start_command to spawn pack-objects.
     Before forking, it calls fflush(NULL) to flush every stdio output
     buffer. It blocks trying to get the lock on pack_pipe.

And we have a deadlock. The thread will block until somebody starts
reading from the pipe. But nobody will read from the pipe until we
finish flushing to the pipe.

To fix this, we swap the start order: we start the
pack-objects reader first, and then the rev-list writer
after. Thus the problematic fflush(NULL) happens before we
even open the new file descriptor (and even if it didn't,
flushing should no longer block, as the reader at the end of
the pipe is now active).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-06 14:38:47 -07:00
Junio C Hamano
2eee1393f3 Merge branches 'sp/maint-fetch-pack-stop-early' and 'sp/maint-upload-pack-stop-early'
* sp/maint-fetch-pack-stop-early:
  enable "no-done" extension only when fetching over smart-http

* sp/maint-upload-pack-stop-early:
  enable "no-done" extension only when serving over smart-http
2011-03-29 14:09:02 -07:00
Junio C Hamano
4e10cf9a17 Revert two "no-done" reverts
Last night I had to make these two emergency reverts, but now we have a
better understanding of which part of the topic was broken, let's get rid
of the revert to fix it correctly.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-29 12:29:10 -07:00
Junio C Hamano
cf2ad8e641 enable "no-done" extension only when serving over smart-http
Do not advertise no-done capability when upload-pack is not serving over
smart-http, as there is no way for this server to know when it should stop
reading in-flight data from the client, even though it is necessary to
drain all the in-flight data in order to unblock the client.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
2011-03-29 12:21:28 -07:00
Junio C Hamano
4793b7e86d Revert "upload-pack: Implement no-done capability"
This reverts 3e63b21 (upload-pack: Implement no-done capability,
2011-03-14).  Together with 761ecf0 (fetch-pack: Implement no-done
capability, 2011-03-14) it seems to make the fetch-pack process out of
sync and makes it keep talking long after upload-pack stopped listening to
it, terminating the process with SIGPIPE.
2011-03-28 23:33:51 -07:00
Junio C Hamano
f59bf09678 Merge branch 'sp/maint-upload-pack-stop-early'
* sp/maint-upload-pack-stop-early:
  upload-pack: Implement no-done capability
  upload-pack: More aggressively send 'ACK %s ready'
2011-03-22 21:38:06 -07:00
Shawn O. Pearce
3e63b21ace upload-pack: Implement no-done capability
If the client requests both multi_ack_detailed and no-done then
upload-pack is free to immediately send a PACK following its first
'ACK %s ready' message.  The upload-pack response actually winds
up being:

  ACK %s common
  ... (maybe more) ...
  ACK %s ready
  NAK
  ACK %s
  PACK.... the pack stream ....

For smart HTTP connections this saves one HTTP RPC, reducing
the overall latency for a trivial fetch.  For git:// and ssh://
a no-done option slightly reduces latency by removing one
server->client->server round-trip at the end of the common
ancestor negotiation.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-15 12:14:35 -07:00
Shawn O. Pearce
49bee717f7 upload-pack: More aggressively send 'ACK %s ready'
If a client is merely following the remote (and has not made any
new commits itself), all "have %s" lines sent by the client will be
common to the server.  As all lines are common upload-pack never
calls ok_to_give_up() and does not compute if it has a good cut
point in the commit graph.

Without this computation the following client is going to send all
tagged commits, as these were determined to be COMMON_REF during the
initial advertisement, but the client does not parse their history
to transitively pass the COMMON flag and empty its queue of commits.

For git.git with 339 commit tags, it takes clients 11 rounds of
negotation to fully send all tagged commits and exhaust its queue
of things to send as common.  This is pretty slow for a client that
has not done any local development activity.

Force computing ok_to_give_up() and send "ACK %s ready" at the end
of the current round if this round only contained common objects
and ok_to_give_up() was therefore not called.  This may allow the
client to break early, avoiding transmission of the COMMON_REFs.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-14 17:27:25 -07:00
Jeff King
bbc30f9963 add packet tracing debug code
This shows a trace of all packets coming in or out of a given
program. This can help with debugging object negotiation or
other protocol issues.

To keep the code changes simple, we operate at the lowest
level, meaning we don't necessarily understand what's in the
packets. The one exception is a packet starting with "PACK",
which causes us to skip that packet and turn off tracing
(since the gigantic pack data will not be interesting to
read, at least not in the trace format).

We show both written and read packets. In the local case,
this may mean you will see packets twice (written by the
sender and read by the receiver). However, for cases where
the other end is remote, this allows you to see the full
conversation.

Packet tracing can be enabled with GIT_TRACE_PACKET=<foo>,
where <foo> takes the same arguments as GIT_TRACE.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-08 12:12:04 -08:00
Thiago Farina
47e44ed1dc commit: Add commit_list prefix in two function names.
Add commit_list prefix to insert_by_date function and to sort_by_date,
so it's clear that these functions refer to commit_list structure.

Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 14:01:52 -08:00
Štěpán Němec
62b4698e55 Use angles for placeholders consistently
Signed-off-by: Štěpán Němec <stepnem@gmail.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-08 12:29:52 -07:00
Thiago Farina
3cd474599f object.h: Add OBJECT_ARRAY_INIT macro and make use of it.
Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-29 22:42:49 -07:00
Elijah Newren
9f9aa76130 upload-pack: Improve error message when bad ref requested
When printing an error message saying a ref was requested that we do not
have, only print that ref, rather than the ref and everything sent to us
on the same packet line (e.g. protocol support specifications).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-02 15:31:59 -07:00
Nguyễn Thái Ngọc Duy
1e39d7deea upload-pack: remove unused "create_full_pack" code in do_rev_list
A bit of history in chronological order, the newest at bottom:

- 80ccaa7 (upload-pack: Move the revision walker into a separate function.)
   do_rev_list was introduced with create_full_pack argument

- 21edd3f (upload-pack: Run rev-list in an asynchronous function.)
   do_rev_list was now called by start_async, create_full_pack was
   passed by rev_list.data

- f0cea83 (Shift object enumeration out of upload-pack)
   rev_list.data was now zero permanently. Creating full pack was
   done by passing --all to pack-objects

- ae6a560 (run-command: support custom fd-set in async)
   rev_list.data = 0 was found out redudant and got rid of.

Get rid of the code as well, for less headache while reading do_rev_list.

[jc: noticed by Elijah Newren]

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-28 13:50:11 -07:00
Erik Faye-Lund
ae6a5609c0 run-command: support custom fd-set in async
This patch adds the possibility to supply a set of non-0 file
descriptors for async process communication instead of the
default-created pipe.

Additionally, we now support bi-directional communiction with the
async procedure, by giving the async function both read and write
file descriptors.

To retain compatiblity and similar "API feel" with start_command,
we require start_async callers to set .out = -1 to get a readable
file descriptor.  If either of .in or .out is 0, we supply no file
descriptor to the async process.

[sp: Note: Erik started this patch, and a huge bulk of it is
     his work.  All bugs were introduced later by Shawn.]

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-05 20:57:22 -08:00
Junio C Hamano
4cb51a65a4 Sync with 1.6.5.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-10 16:20:59 -08:00
Junio C Hamano
1456b043fc Remove post-upload-hook
This hook runs after "git fetch" in the repository the objects are
fetched from as the user who fetched, and has security implications.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-10 12:21:40 -08:00
Junio C Hamano
ef3a4fd670 Merge branch 'np/maint-sideband-favor-status' into maint
* np/maint-sideband-favor-status:
  give priority to progress messages
2009-12-03 13:50:24 -08:00
Junio C Hamano
905bf7742c Merge branch 'sp/smart-http'
* sp/smart-http: (37 commits)
  http-backend: Let gcc check the format of more printf-type functions.
  http-backend: Fix access beyond end of string.
  http-backend: Fix bad treatment of uintmax_t in Content-Length
  t5551-http-fetch: Work around broken Accept header in libcurl
  t5551-http-fetch: Work around some libcurl versions
  http-backend: Protect GIT_PROJECT_ROOT from /../ requests
  Git-aware CGI to provide dumb HTTP transport
  http-backend: Test configuration options
  http-backend: Use http.getanyfile to disable dumb HTTP serving
  test smart http fetch and push
  http tests: use /dumb/ URL prefix
  set httpd port before sourcing lib-httpd
  t5540-http-push: remove redundant fetches
  Smart HTTP fetch: gzip requests
  Smart fetch over HTTP: client side
  Smart push over HTTP: client side
  Discover refs via smart HTTP server when available
  http-backend: more explict LocationMatch
  http-backend: add example for gitweb on same URL
  http-backend: use mod_alias instead of mod_rewrite
  ...

Conflicts:
	.gitignore
	remote-curl.c
2009-11-20 23:51:23 -08:00
Junio C Hamano
e36e6c00cd Merge branch 'np/maint-sideband-favor-status'
* np/maint-sideband-favor-status:
  give priority to progress messages
2009-11-17 22:03:20 -08:00
Nicolas Pitre
6b59f51b31 give priority to progress messages
In theory it is possible for sideband channel #2 to be delayed if
pack data is quick to come up for sideband channel #1.  And because
data for channel #2 is read only 128 bytes at a time while pack data
is read 8192 bytes at a time, it is possible for many pack blocks to
be sent to the client before the progress message fifo is emptied,
making the situation even worse.  This would result in totally garbled
progress display on the client's console as local progress gets mixed
with partial remote progress lines.

Let's prevent such situations by giving transmission priority to
progress messages over pack data at all times.

Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-13 14:39:25 -08:00
Shawn O. Pearce
42526b478e Add stateless RPC options to upload-pack, receive-pack
When --stateless-rpc is passed as a command line parameter to
upload-pack or receive-pack the programs now assume they may
perform only a single read-write cycle with stdin and stdout.
This fits with the HTTP POST request processing model where a
program may read the request, write a response, and must exit.

When --advertise-refs is passed as a command line parameter only
the initial ref advertisement is output, and the program exits
immediately.  This fits with the HTTP GET request model, where
no request content is received but a response must be produced.

HTTP headers and/or environment are not processed here, but
instead are assumed to be handled by the program invoking
either service backend.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-04 17:58:14 -08:00
Shawn O. Pearce
78affc49de Add multi_ack_detailed capability to fetch-pack/upload-pack
When multi_ack_detailed is enabled the ACK continue messages returned
by the remote upload-pack are broken out to describe the different
states within the peer.  This permits the client to better understand
the server's in-memory state.

The fetch-pack/upload-pack protocol now looks like:

NAK
---------------------------------
  Always sent in response to "done" if there was no common base
  selected from the "have" lines (or no have lines were sent).

  * no multi_ack or multi_ack_detailed:

    Sent when the client has sent a pkt-line flush ("0000") and
    the server has not yet found a common base object.

  * either multi_ack or multi_ack_detailed:

    Always sent in response to a pkt-line flush.

ACK %s
-----------------------------------
  * no multi_ack or multi_ack_detailed:

    Sent in response to "have" when the object exists on the remote
    side and is therefore an object in common between the peers.
    The argument is the SHA-1 of the common object.

  * either multi_ack or multi_ack_detailed:

    Sent in response to "done" if there are common objects.
    The argument is the last SHA-1 determined to be common.

ACK %s continue
-----------------------------------
  * multi_ack only:

    Sent in response to "have".

    The remote side wants the client to consider this object as
    common, and immediately stop transmitting additional "have"
    lines for objects that are reachable from it.  The reason
    the client should stop is not given, but is one of the two
    cases below available under multi_ack_detailed.

ACK %s common
-----------------------------------
  * multi_ack_detailed only:

    Sent in response to "have".  Both sides have this object.
    Like with "ACK %s continue" above the client should stop
    sending have lines reachable for objects from the argument.

ACK %s ready
-----------------------------------
  * multi_ack_detailed only:

    Sent in response to "have".

    The client should stop transmitting objects which are reachable
    from the argument, and send "done" soon to get the objects.

    If the remote side has the specified object, it should
    first send an "ACK %s common" message prior to sending
    "ACK %s ready".

    Clients may still submit additional "have" lines if there are
    more side branches for the client to explore that might be added
    to the common set and reduce the number of objects to transfer.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-30 19:20:54 -07:00
Jim Meyering
41698375ad don't dereference NULL upon fdopen failure
There were several unchecked use of fdopen(); replace them with xfdopen()
that checks and dies.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13 01:32:20 -07:00
Jim Meyering
2b7ca830c6 use write_str_in_full helper to avoid literal string lengths
In 2d14d65 (Use a clearer style to issue commands to remote helpers,
2009-09-03) I happened to notice two changes like this:

-	write_in_full(helper->in, "list\n", 5);
+
+	strbuf_addstr(&buf, "list\n");
+	write_in_full(helper->in, buf.buf, buf.len);
+	strbuf_reset(&buf);

IMHO, it would be better to define a new function,

    static inline ssize_t write_str_in_full(int fd, const char *str)
    {
           return write_in_full(fd, str, strlen(str));
    }

and then use it like this:

-       strbuf_addstr(&buf, "list\n");
-       write_in_full(helper->in, buf.buf, buf.len);
-       strbuf_reset(&buf);
+       write_str_in_full(helper->in, "list\n");

Thus not requiring the added allocation, and still avoiding
the maintenance risk of literal string lengths.
These days, compilers are good enough that strlen("literal")
imposes no run-time cost.

Transformed via this:

    perl -pi -e \
        's/write_in_full\((.*?), (".*?"), \d+\)/write_str_in_full($1, $2)/'\
      $(git grep -l 'write_in_full.*"')

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13 01:31:10 -07:00
Junio C Hamano
e4d1afbcf2 Merge branch 'jc/upload-pack-hook'
* jc/upload-pack-hook:
  upload-pack: feed "kind [clone|fetch]" to post-upload-pack hook
  upload-pack: add a trigger for post-upload-pack hook
2009-09-07 15:24:47 -07:00
Junio C Hamano
8e4384fd44 Merge branch 'np/maint-1.6.3-deepen'
* np/maint-1.6.3-deepen:
  pack-objects: free preferred base memory after usage
  make shallow repository deepening more network efficient
2009-09-07 15:23:50 -07:00
Nicolas Pitre
6523078b96 make shallow repository deepening more network efficient
First of all, I can't find any reason why thin pack generation is
explicitly disabled when dealing with a shallow repository.  The
possible delta base objects are collected from the edge commits which
are always obtained through history walking with the same shallow refs
as the client, Therefore the client is always going to have those base
objects available. So let's remove that restriction.

Then we can make shallow repository deepening much more efficient by
using the remote's unshallowed commits as edge commits to get preferred
base objects for thin pack generation.  On git.git, this makes the data
transfer for the deepening of a shallow repository from depth 1 to depth 2
around 134 KB instead of 3.68 MB.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-05 22:25:26 -07:00
Brian Gianforcaro
eeefa7c90e Style fixes, add a space after if/for/while.
The majority of code in core git appears to use a single
space after if/for/while. This is an attempt to bring more
code to this standard. These are entirely cosmetic changes.

Signed-off-by: Brian Gianforcaro <b.gianfo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-31 23:26:28 -07:00
Junio C Hamano
11cae066b2 upload-pack: feed "kind [clone|fetch]" to post-upload-pack hook
A request to clone the repository does not give any "have" but asks for
all the refs we offer with "want".  When a request does not ask to clone
the repository fully, but asks to fetch some refs into an empty
repository, it will not give any "have" but its "want" won't ask for all
the refs we offer.

If we suppose (and I would say this is a rather big if) that it makes
sense to distinguish these two cases, a hook cannot reliably do this
alone.  The hook can detect lack of "have" and bunch of "want", but there
is no direct way to tell if the other end asked for all refs we offered,
or merely most of them.

Between the time we talked with the other end and the time the hook got
called, we may have acquired more refs or lost some refs in the repository
by concurrent operations.  Given that we plan to introduce selective
advertisement of refs with a protocol extension, it would become even more
difficult for hooks to guess between these two cases.

This adds "kind [clone|fetch]" to hook's input, as a stable interface to
allow the hooks to tell these cases apart.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-28 22:39:24 -07:00
Junio C Hamano
a8563ec851 upload-pack: add a trigger for post-upload-pack hook
After upload-pack successfully finishes its operation, post-upload-pack
hook can be called for logging purposes.

The hook is passed various pieces of information, one per line, from its
standard input.  Currently the following items can be fed to the hook, but
more types of information may be added in the future:

    want SHA-1::
        40-byte hexadecimal object name the client asked to include in the
        resulting pack.  Can occur one or more times in the input.

    have SHA-1::
        40-byte hexadecimal object name the client asked to exclude from
        the resulting pack, claiming to have them already.  Can occur zero
        or more times in the input.

    time float::
        Number of seconds spent for creating the packfile.

    size decimal::
        Size of the resulting packfile in bytes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-28 22:39:17 -07:00
Junio C Hamano
f00ecbe42b Merge branch 'cc/replace'
* cc/replace:
  t6050: check pushing something based on a replaced commit
  Documentation: add documentation for "git replace"
  Add git-replace to .gitignore
  builtin-replace: use "usage_msg_opt" to give better error messages
  parse-options: add new function "usage_msg_opt"
  builtin-replace: teach "git replace" to actually replace
  Add new "git replace" command
  environment: add global variable to disable replacement
  mktag: call "check_sha1_signature" with the replacement sha1
  replace_object: add a test case
  object: call "check_sha1_signature" with the replacement sha1
  sha1_file: add a "read_sha1_file_repl" function
  replace_object: add mechanism to replace objects found in "refs/replace/"
  refs: add a "for_each_replace_ref" function
2009-08-21 18:47:53 -07:00
Junio C Hamano
7d1b509812 Merge branch 'ne/futz-upload-pack'
* ne/futz-upload-pack:
  Shift object enumeration out of upload-pack

Conflicts:
	upload-pack.c
2009-08-05 12:38:29 -07:00
Johannes Sixt
9462e3f59c upload-pack: squelch progress indicator if client cannot see it
upload-pack runs pack-objects, which generates progress indicator output
on its stderr. If the client requests a sideband, this indicator is sent
to the client; but if it did not, then the progress is written to
upload-pack's own stderr.

If upload-pack is itself run from git-daemon (and if the client did not
request a sideband) the progress indicator never reaches the client and it
need not be generated in the first place. With this patch the progress
indicator is suppressed in this situation.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-18 11:38:43 -07:00
Nick Edelen
f0cea83f63 Shift object enumeration out of upload-pack
Offload object enumeration in upload-pack to pack-objects, but fall
back on internal revision walker for shallow interaction.   Aside from
architecturally making more sense, this also leaves the door open for
pack-objects to employ a revision cache mechanism.  Test t5530 updated
in order to explicitly check both enumeration methods.

Signed-off-by: Nick Edelen <sirnot@gmail.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-09 23:49:31 -07:00
Christian Couder
dae556bdb1 environment: add global variable to disable replacement
This new "read_replace_refs" global variable is set to 1 by
default, so that replace refs are used by default. But
reachability traversal and packing commands ("cmd_fsck",
"cmd_prune", "cmd_pack_objects", "upload_pack",
"cmd_unpack_objects") set it to 0, as they must work with the
original DAG.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-31 17:02:59 -07:00
Junio C Hamano
7d71be242d Merge branch 'lt/pack-object-memuse' into maint
* lt/pack-object-memuse:
  show_object(): push path_name() call further down
  process_{tree,blob}: show objects without buffering
2009-05-03 15:02:40 -07:00
Junio C Hamano
9824a388e5 Merge branch 'lt/pack-object-memuse'
* lt/pack-object-memuse:
  show_object(): push path_name() call further down
  process_{tree,blob}: show objects without buffering

Conflicts:
	builtin-pack-objects.c
	builtin-rev-list.c
	list-objects.c
	list-objects.h
	upload-pack.c
2009-04-18 14:46:17 -07:00
Linus Torvalds
cf2ab916af show_object(): push path_name() call further down
In particular, pushing the "path_name()" call _into_ the show() function
would seem to allow

 - more clarity into who "owns" the name (ie now when we free the name in
   the show_object callback, it's because we generated it ourselves by
   calling path_name())

 - not calling path_name() at all, either because we don't care about the
   name in the first place, or because we are actually happy walking the
   linked list of "struct name_path *" and the last component.

Now, I didn't do that latter optimization, because it would require some
more coding, but especially looking at "builtin-pack-objects.c", we really
don't even want the whole pathname, we really would be better off with the
list of path components.

Why? We use that name for two things:
 - add_preferred_base_object(), which actually _wants_ to traverse the
   path, and now does it by looking for '/' characters!
 - for 'name_hash()', which only cares about the last 16 characters of a
   name, so again, generating the full name seems to be just unnecessary
   work.

Anyway, so I didn't look any closer at those things, but it did convince
me that the "show_object()" calling convention was crazy, and we're
actually better off doing _less_ in list-objects.c, and giving people
access to the internal data structures so that they can decide whether
they want to generate a path-name or not.

This patch does that, and then for people who did use the name (even if
they might do something more clever in the future), it just does the
straightforward "name = path_name(path, component); .. free(name);" thing.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-12 17:28:31 -07:00
Linus Torvalds
8d2dfc49b1 process_{tree,blob}: show objects without buffering
Here's a less trivial thing, and slightly more dubious one.

I was looking at that "struct object_array objects", and wondering why we
do that. I have honestly totally forgotten. Why not just call the "show()"
function as we encounter the objects? Rather than add the objects to the
object_array, and then at the very end going through the array and doing a
'show' on all, just do things more incrementally.

Now, there are possible downsides to this:

 - the "buffer using object_array" _can_ in theory result in at least
   better I-cache usage (two tight loops rather than one more spread out
   one). I don't think this is a real issue, but in theory..

 - this _does_ change the order of the objects printed. Instead of doing a
   "process_tree(revs, commit->tree, &objects, NULL, "");" in the loop
   over the commits (which puts all the root trees _first_ in the object
   list, this patch just adds them to the list of pending objects, and
   then we'll traverse them in that order (and thus show each root tree
   object together with the objects we discover under it)

   I _think_ the new ordering actually makes more sense, but the object
   ordering is actually a subtle thing when it comes to packing
   efficiency, so any change in order is going to have implications for
   packing. Good or bad, I dunno.

 - There may be some reason why we did it that odd way with the object
   array, that I have simply forgotten.

Anyway, now that we don't buffer up the objects before showing them
that may actually result in lower memory usage during that whole
traverse_commit_list() phase.

This is seriously not very deeply tested. It makes sense to me, it seems
to pass all the tests, it looks ok, but...

Does anybody remember why we did that "object_array" thing? It used to be
an "object_list" a long long time ago, but got changed into the array due
to better memory usage patterns (those linked lists of obejcts are
horrible from a memory allocation standpoint). But I wonder why we didn't
do this back then. Maybe there's a reason for it.

Or maybe there _used_ to be a reason, and no longer is.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-12 17:28:31 -07:00
Christian Couder
11c211fa06 list-objects: add "void *data" parameter to show functions
The goal of this patch is to get rid of the "static struct rev_info
revs" static variable in "builtin-rev-list.c".

To do that, we need to pass the revs to the "show_commit" function
in "builtin-rev-list.c" and this in turn means that the
"traverse_commit_list" function in "list-objects.c" must be passed
functions pointers to functions with 2 parameters instead of one.

So we have to change all the callers and all the functions passed
to "traverse_commit_list".

Anyway this makes the code more clean and more generic, so it
should be a good thing in the long run.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-07 22:12:38 -07:00
Benjamin Kramer
fd13b21f52 Move local variables to narrower scopes
These weren't used outside and can be safely moved

Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 20:52:23 -08:00
Jeff King
05ac6b34e2 improve missing repository error message
Certain remote commands, when asked to do something in a
particular directory that was not actually a git repository,
would say "unable to chdir or not a git archive". The
"chdir" bit is an unnecessary detail, and the term "git
archive" is much less common these days than "git repository".

So let's switch them all to:

  fatal: '%s' does not appear to be a git repository

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-04 20:37:21 -08:00
Alexander Potashev
34263de026 Replace deprecated dashed git commands in usage
Signed-off-by: Alexander Potashev <aspotashev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-04 15:08:49 -08:00
Steffen Prohaska
2fb3f6db96 Add calls to git_extract_argv0_path() in programs that call git_config_*
Programs that use git_config need to find the global configuration.
When runtime prefix computation is enabled, this requires that
git_extract_argv0_path() is called early in the program's main().

This commit adds the necessary calls.

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-26 00:26:05 -08:00
Junio C Hamano
7e44c93558 'git foo' program identifies itself without dash in die() messages
This is a mechanical conversion of all '*.c' files with:

	s/((?:die|error|warning)\("git)-(\S+:)/$1 $2/;

The result was manually inspected and no false positive was found.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-31 09:39:19 -07:00
Johannes Sixt
e1464ca7bb Record the command invocation path early
We will need the command invocation path in system_path(). This path was
passed to setup_path(), but  system_path() can be called earlier, for
example via:

    main
      commit_pager_choice
        setup_pager
          git_config
            git_etc_gitconfig
              system_path

Therefore, we introduce git_set_argv0_path() and call it as soon as
possible.

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-25 17:41:13 -07:00
Johannes Sixt
618ebe9ff9 Windows: Implement asynchronous functions as threads.
In upload-pack we must explicitly close the output channel of rev-list.
(On Unix, the channel is closed automatically because process that runs
rev-list terminates.)

Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
2008-06-26 08:45:08 +02:00