Commit Graph

153 Commits

Author SHA1 Message Date
Junio C Hamano
e43171a4a7 Merge branch 'nd/upload-pack-shallow-must-be-commit'
A minor consistency check patch that does not have much relevance
to the real world.

* nd/upload-pack-shallow-must-be-commit:
  upload-pack: only accept commits from "shallow" line
2013-01-14 08:15:44 -08:00
Nguyễn Thái Ngọc Duy
6293ded348 upload-pack: only accept commits from "shallow" line
We only allow cuts at commits, not arbitrary objects. upload-pack will
fail eventually in register_shallow if a non-commit is given with a
generic error "Object %s is a %s, not a commit". Check it early and
give a more accurate error.

This should never show up in an ordinary session. It's for buggy
clients, or when the user manually edits .git/shallow.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-08 09:28:00 -08:00
Jeff King
435c833237 upload-pack: use peel_ref for ref advertisements
When upload-pack advertises refs, we attempt to peel tags
and advertise the peeled version. We currently hand-roll the
tag dereferencing, and use as many optimizations as we can
to avoid loading non-tag objects into memory.

Not only has peel_ref recently learned these optimizations,
too, but it also contains an even more important one: it
has access to the "peeled" data from the pack-refs file.
That means we can avoid not only loading annotated tags
entirely, but also avoid doing any kind of object lookup at
all.

This cut the CPU time to advertise refs by 50% in the
linux-2.6 repo, as measured by:

  echo 0000 | git-upload-pack . >/dev/null

best-of-five, warm cache, objects and refs fully packed:

  [before]             [after]
  real    0m0.026s     real    0m0.013s
  user    0m0.024s     user    0m0.008s
  sys     0m0.000s     sys     0m0.000s

Those numbers are irrelevantly small compared to an actual
fetch. Here's a larger repo (400K refs, of which 12K are
unique, and of which only 107 are unique annotated tags):

  [before]             [after]
  real    0m0.704s     real    0m0.596s
  user    0m0.600s     user    0m0.496s
  sys     0m0.096s     sys     0m0.092s

This shows only a 15% speedup (mostly because it has fewer
actual tags to parse), but a larger absolute value (100ms,
which isn't a lot compared to a real fetch, but this
advertisement happens on every fetch, even if the client is
just finding out they are completely up to date).

In truly pathological cases, where you have a large number
of unique annotated tags, it can make an even bigger
difference. Here are the numbers for a linux-2.6 repository
that has had every seventh commit tagged (so about 50K
tags):

  [before]             [after]
  real    0m0.443s     real    0m0.097s
  user    0m0.416s     user    0m0.080s
  sys     0m0.024s     sys     0m0.012s

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-04 20:34:29 -07:00
Jeff King
ff5effdf45 include agent identifier in capability string
Instead of having the client advertise a particular version
number in the git protocol, we have managed extensions and
backwards compatibility by having clients and servers
advertise capabilities that they support. This is far more
robust than having each side consult a table of
known versions, and provides sufficient information for the
protocol interaction to complete.

However, it does not allow servers to keep statistics on
which client versions are being used. This information is
not necessary to complete the network request (the
capabilities provide enough information for that), but it
may be helpful to conduct a general survey of client
versions in use.

We already send the client version in the user-agent header
for http requests; adding it here allows us to gather
similar statistics for non-http requests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-03 13:03:34 -07:00
Junio C Hamano
d1afa8baa2 Merge branch 'jk/parse-object-cached'
* jk/parse-object-cached:
  upload-pack: avoid parsing tag destinations
  upload-pack: avoid parsing objects during ref advertisement
  parse_object: try internal cache before reading object db
2012-01-29 13:18:55 -08:00
Junio C Hamano
f47182c852 server_supports(): parse feature list more carefully
We have been carefully choosing feature names used in the protocol
extensions so that the vocabulary does not contain a word that is a
substring of another word, so it is not a real problem, but we have
recently added "quiet" feature word, which would mean we cannot later
add some other word with "quiet" (e.g. "quiet-push"), which is awkward.

Let's make sure that we can eventually be able to do so by teaching the
clients and servers that feature words consist of non whitespace
letters. This parser also allows us to later add features with parameters
e.g. "feature=1.5" (parameter values need to be quoted for whitespaces,
but we will worry about the detauls when we do introduce them).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-08 14:26:28 -08:00
Jeff King
90108a2441 upload-pack: avoid parsing tag destinations
When upload-pack advertises refs, it dereferences any tags
it sees, and shows the resulting sha1 to the client. It does
this by calling deref_tag. That function must load and parse
each tag object to find the sha1 of the tagged object.
However, it also ends up parsing the tagged object itself,
which is not strictly necessary for upload-pack's use.

Each tag produces two object loads (assuming it is not a
recursive tag), when it could get away with only a single
one. Dropping the second load halves the effort we spend.

The downside is that we are no longer verifying the
resulting object by loading it. In particular:

  1. We never cross-check the "type" field given in the tag
     object with the type of the pointed-to object.  If the
     tag says it points to a tag but doesn't, then we will
     keep peeling and realize the error.  If the tag says it
     points to a non-tag but actually points to a tag, we
     will stop peeling and just advertise the pointed-to
     tag.

  2. If we are missing the pointed-to object, we will not
     realize (because we never even look it up in the object
     db).

However, both of these are errors in the object database,
and both will be detected if a client actually requests the
broken objects in question. So we are simply pushing the
verification away from the advertising stage, and down to
the actual fetching stage.

On my test repo with 120K refs, this drops the time to
advertise the refs from ~3.2s to ~2.0s.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-06 13:28:57 -08:00
Jeff King
926f1dd954 upload-pack: avoid parsing objects during ref advertisement
When we advertise a ref, the first thing we do is parse the
pointed-to object. This gives us two things:

  1. a "struct object" we can use to store flags

  2. the type of the object, so we know whether we need to
     dereference it as a tag

Instead, we can just use lookup_unknown_object to get an
object struct, and then fill in just the type field using
sha1_object_info (which, in the case of packed files, can
find the information without actually inflating the object
data).

This can save time if you have a large number of refs, and
the client isn't actually going to request those refs (e.g.,
because most of them are already up-to-date).

The downside is that we are no longer verifying objects that
we advertise by fully parsing them (however, we do still
know we actually have them, because sha1_object_info must
find them to get the type). While we might fail to detect a
corrupt object here, if the client actually fetches the
object, we will parse (and verify) it then.

On a repository with 120K refs, the advertisement portion of
upload-pack goes from ~3.4s to 3.2s (the failure to speed up
more is largely due to the fact that most of these refs are
tags, which need dereferenced to find the tag destination
anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-06 13:28:55 -08:00
Ævar Arnfjörð Bjarmason
5e9637c629 i18n: add infrastructure for translating Git with gettext
Change the skeleton implementation of i18n in Git to one that can show
localized strings to users for our C, Shell and Perl programs using
either GNU libintl or the Solaris gettext implementation.

This new internationalization support is enabled by default. If
gettext isn't available, or if Git is compiled with
NO_GETTEXT=YesPlease, Git falls back on its current behavior of
showing interface messages in English. When using the autoconf script
we'll auto-detect if the gettext libraries are installed and act
appropriately.

This change is somewhat large because as well as adding a C, Shell and
Perl i18n interface we're adding a lot of tests for them, and for
those tests to work we need a skeleton PO file to actually test
translations. A minimal Icelandic translation is included for this
purpose. Icelandic includes multi-byte characters which makes it easy
to test various edge cases, and it's a language I happen to
understand.

The rest of the commit message goes into detail about various
sub-parts of this commit.

= Installation

Gettext .mo files will be installed and looked for in the standard
$(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to
override that, but that's only intended to be used to test Git itself.

= Perl

Perl code that's to be localized should use the new Git::I18n
module. It imports a __ function into the caller's package by default.

Instead of using the high level Locale::TextDomain interface I've
opted to use the low-level (equivalent to the C interface)
Locale::Messages module, which Locale::TextDomain itself uses.

Locale::TextDomain does a lot of redundant work we don't need, and
some of it would potentially introduce bugs. It tries to set the
$TEXTDOMAIN based on package of the caller, and has its own
hardcoded paths where it'll search for messages.

I found it easier just to completely avoid it rather than try to
circumvent its behavior. In any case, this is an issue wholly
internal Git::I18N. Its guts can be changed later if that's deemed
necessary.

See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for
a further elaboration on this topic.

= Shell

Shell code that's to be localized should use the git-sh-i18n
library. It's basically just a wrapper for the system's gettext.sh.

If gettext.sh isn't available we'll fall back on gettext(1) if it's
available. The latter is available without the former on Solaris,
which has its own non-GNU gettext implementation. We also need to
emulate eval_gettext() there.

If neither are present we'll use a dumb printf(1) fall-through
wrapper.

= About libcharset.h and langinfo.h

We use libcharset to query the character set of the current locale if
it's available. I.e. we'll use it instead of nl_langinfo if
HAVE_LIBCHARSET_H is set.

The GNU gettext manual recommends using langinfo.h's
nl_langinfo(CODESET) to acquire the current character set, but on
systems that have libcharset.h's locale_charset() using the latter is
either saner, or the only option on those systems.

GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either,
but MinGW and some others need to use libcharset.h's locale_charset()
instead.

=Credits

This patch is based on work by Jeff Epler <jepler@unpythonic.net> who
did the initial Makefile / C work, and a lot of comments from the Git
mailing list, including Jonathan Nieder, Jakub Narebski, Johannes
Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and
others.

[jc: squashed a small Makefile fix from Ramsay]

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-05 20:46:55 -08:00
Junio C Hamano
2e2e7e9dd0 Merge branch 'jc/fetch-verify'
* jc/fetch-verify:
  fetch: verify we have everything we need before updating our ref
  rev-list --verify-object
  list-objects: pass callback data to show_objects()
2011-10-05 12:36:20 -07:00
Junio C Hamano
f817f2fbb5 Merge branch 'jc/traverse-commit-list'
* jc/traverse-commit-list:
  revision.c: update show_object_with_name() without using malloc()
  revision.c: add show_object_with_name() helper function
  rev-list: fix finish_object() call
2011-10-05 12:36:19 -07:00
Junio C Hamano
4947367267 list-objects: pass callback data to show_objects()
The traverse_commit_list() API takes two callback functions, one to show
commit objects, and the other to show other kinds of objects. Even though
the former has a callback data parameter, so that the callback does not
have to rely on global state, the latter does not.

Give the show_objects() callback the same callback data parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-01 15:46:12 -07:00
Junio C Hamano
b7fcd00715 Sync with 1.7.6.1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-24 12:18:02 -07:00
Brian Harring
2a74532412 get_indexed_object can return NULL if nothing is in that slot; check for it
This fixes a segfault introduced by 051e400; via it, no longer able to
trigger the http/smartserv race.

Signed-off-by: Brian Harring <ferringb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-24 10:50:33 -07:00
Junio C Hamano
91f175165a revision.c: add show_object_with_name() helper function
There are two copies of traverse_commit_list callback that show the object
name followed by pathname the object was found, to produce output similar
to "rev-list --objects".

Unify them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-22 11:34:55 -07:00
Junio C Hamano
2f5cb6aa1e Merge branch 'jc/maint-smart-http-race-upload-pack'
* jc/maint-smart-http-race-upload-pack:
  helping smart-http/stateless-rpc fetch race
2011-08-17 17:35:58 -07:00
Junio C Hamano
051e4005a3 helping smart-http/stateless-rpc fetch race
A request to fetch from a client over smart HTTP protocol is served in
multiple steps. In the first round, the server side shows the set of refs
it has and their values, and the client picks from them and sends "I want
to fetch the history leading to these commits".

When the server tries to respond to this second request, its refs may have
progressed by a push from elsewhere. By design, we do not allow fetching
objects that are not at the tip of an advertised ref, and the server
rejects such a request. The client needs to try again, which is not ideal
especially for a busy server.

Teach upload-pack (which is the workhorse driven by git-daemon and smart
http server interface) that it is OK for a smart-http client to ask for
commits that are not at the tip of any advertised ref, as long as they are
reachable from advertised refs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08 15:33:28 -07:00
Josh Triplett
6b01ecfe22 ref namespaces: Support remote repositories via upload-pack and receive-pack
Change upload-pack and receive-pack to use the namespace-prefixed refs
when working with the repository, and use the unprefixed refs when
talking to the client, maintaining the masquerade.  This allows
clone, pull, fetch, and push to work with a suitably configured
GIT_NAMESPACE.

receive-pack advertises refs outside the current namespace as .have refs
(as it currently does for refs in alternates), so that the client can
use them to minimize data transfer but will otherwise ignore them.

With appropriate configuration, this also allows http-backend to expose
namespaces as multiple repositories with different paths.  This only
requires setting GIT_NAMESPACE, which http-backend passes through to
upload-pack and receive-pack.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Jamey Sharp <jamey@minilop.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-11 09:35:38 -07:00
Junio C Hamano
982f6c90ee Merge branch 'jk/maint-upload-pack-shallow'
* jk/maint-upload-pack-shallow:
  upload-pack: start pack-objects before async rev-list
2011-04-27 11:36:42 -07:00
Jeff King
b961219779 upload-pack: start pack-objects before async rev-list
In a pthread-enabled version of upload-pack, there's a race condition
that can cause a deadlock on the fflush(NULL) we call from run-command.

What happens is this:

  1. Upload-pack is informed we are doing a shallow clone.

  2. We call start_async() to spawn a thread that will generate rev-list
     results to feed to pack-objects. It gets a file descriptor to a
     pipe which will eventually hook to pack-objects.

  3. The rev-list thread uses fdopen to create a new output stream
     around the fd we gave it, called pack_pipe.

  4. The thread writes results to pack_pipe. Outside of our control,
     libc is doing locking on the stream. We keep writing until the OS
     pipe buffer is full, and then we block in write(), still holding
     the lock.

  5. The main thread now uses start_command to spawn pack-objects.
     Before forking, it calls fflush(NULL) to flush every stdio output
     buffer. It blocks trying to get the lock on pack_pipe.

And we have a deadlock. The thread will block until somebody starts
reading from the pipe. But nobody will read from the pipe until we
finish flushing to the pipe.

To fix this, we swap the start order: we start the
pack-objects reader first, and then the rev-list writer
after. Thus the problematic fflush(NULL) happens before we
even open the new file descriptor (and even if it didn't,
flushing should no longer block, as the reader at the end of
the pipe is now active).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-06 14:38:47 -07:00
Junio C Hamano
2eee1393f3 Merge branches 'sp/maint-fetch-pack-stop-early' and 'sp/maint-upload-pack-stop-early'
* sp/maint-fetch-pack-stop-early:
  enable "no-done" extension only when fetching over smart-http

* sp/maint-upload-pack-stop-early:
  enable "no-done" extension only when serving over smart-http
2011-03-29 14:09:02 -07:00
Junio C Hamano
4e10cf9a17 Revert two "no-done" reverts
Last night I had to make these two emergency reverts, but now we have a
better understanding of which part of the topic was broken, let's get rid
of the revert to fix it correctly.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-29 12:29:10 -07:00
Junio C Hamano
cf2ad8e641 enable "no-done" extension only when serving over smart-http
Do not advertise no-done capability when upload-pack is not serving over
smart-http, as there is no way for this server to know when it should stop
reading in-flight data from the client, even though it is necessary to
drain all the in-flight data in order to unblock the client.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
2011-03-29 12:21:28 -07:00
Junio C Hamano
4793b7e86d Revert "upload-pack: Implement no-done capability"
This reverts 3e63b21 (upload-pack: Implement no-done capability,
2011-03-14).  Together with 761ecf0 (fetch-pack: Implement no-done
capability, 2011-03-14) it seems to make the fetch-pack process out of
sync and makes it keep talking long after upload-pack stopped listening to
it, terminating the process with SIGPIPE.
2011-03-28 23:33:51 -07:00
Junio C Hamano
f59bf09678 Merge branch 'sp/maint-upload-pack-stop-early'
* sp/maint-upload-pack-stop-early:
  upload-pack: Implement no-done capability
  upload-pack: More aggressively send 'ACK %s ready'
2011-03-22 21:38:06 -07:00
Shawn O. Pearce
3e63b21ace upload-pack: Implement no-done capability
If the client requests both multi_ack_detailed and no-done then
upload-pack is free to immediately send a PACK following its first
'ACK %s ready' message.  The upload-pack response actually winds
up being:

  ACK %s common
  ... (maybe more) ...
  ACK %s ready
  NAK
  ACK %s
  PACK.... the pack stream ....

For smart HTTP connections this saves one HTTP RPC, reducing
the overall latency for a trivial fetch.  For git:// and ssh://
a no-done option slightly reduces latency by removing one
server->client->server round-trip at the end of the common
ancestor negotiation.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-15 12:14:35 -07:00
Shawn O. Pearce
49bee717f7 upload-pack: More aggressively send 'ACK %s ready'
If a client is merely following the remote (and has not made any
new commits itself), all "have %s" lines sent by the client will be
common to the server.  As all lines are common upload-pack never
calls ok_to_give_up() and does not compute if it has a good cut
point in the commit graph.

Without this computation the following client is going to send all
tagged commits, as these were determined to be COMMON_REF during the
initial advertisement, but the client does not parse their history
to transitively pass the COMMON flag and empty its queue of commits.

For git.git with 339 commit tags, it takes clients 11 rounds of
negotation to fully send all tagged commits and exhaust its queue
of things to send as common.  This is pretty slow for a client that
has not done any local development activity.

Force computing ok_to_give_up() and send "ACK %s ready" at the end
of the current round if this round only contained common objects
and ok_to_give_up() was therefore not called.  This may allow the
client to break early, avoiding transmission of the COMMON_REFs.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-14 17:27:25 -07:00
Jeff King
bbc30f9963 add packet tracing debug code
This shows a trace of all packets coming in or out of a given
program. This can help with debugging object negotiation or
other protocol issues.

To keep the code changes simple, we operate at the lowest
level, meaning we don't necessarily understand what's in the
packets. The one exception is a packet starting with "PACK",
which causes us to skip that packet and turn off tracing
(since the gigantic pack data will not be interesting to
read, at least not in the trace format).

We show both written and read packets. In the local case,
this may mean you will see packets twice (written by the
sender and read by the receiver). However, for cases where
the other end is remote, this allows you to see the full
conversation.

Packet tracing can be enabled with GIT_TRACE_PACKET=<foo>,
where <foo> takes the same arguments as GIT_TRACE.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-08 12:12:04 -08:00
Thiago Farina
47e44ed1dc commit: Add commit_list prefix in two function names.
Add commit_list prefix to insert_by_date function and to sort_by_date,
so it's clear that these functions refer to commit_list structure.

Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 14:01:52 -08:00
Štěpán Němec
62b4698e55 Use angles for placeholders consistently
Signed-off-by: Štěpán Němec <stepnem@gmail.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-08 12:29:52 -07:00
Thiago Farina
3cd474599f object.h: Add OBJECT_ARRAY_INIT macro and make use of it.
Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-29 22:42:49 -07:00
Elijah Newren
9f9aa76130 upload-pack: Improve error message when bad ref requested
When printing an error message saying a ref was requested that we do not
have, only print that ref, rather than the ref and everything sent to us
on the same packet line (e.g. protocol support specifications).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-02 15:31:59 -07:00
Nguyễn Thái Ngọc Duy
1e39d7deea upload-pack: remove unused "create_full_pack" code in do_rev_list
A bit of history in chronological order, the newest at bottom:

- 80ccaa7 (upload-pack: Move the revision walker into a separate function.)
   do_rev_list was introduced with create_full_pack argument

- 21edd3f (upload-pack: Run rev-list in an asynchronous function.)
   do_rev_list was now called by start_async, create_full_pack was
   passed by rev_list.data

- f0cea83 (Shift object enumeration out of upload-pack)
   rev_list.data was now zero permanently. Creating full pack was
   done by passing --all to pack-objects

- ae6a560 (run-command: support custom fd-set in async)
   rev_list.data = 0 was found out redudant and got rid of.

Get rid of the code as well, for less headache while reading do_rev_list.

[jc: noticed by Elijah Newren]

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-28 13:50:11 -07:00
Erik Faye-Lund
ae6a5609c0 run-command: support custom fd-set in async
This patch adds the possibility to supply a set of non-0 file
descriptors for async process communication instead of the
default-created pipe.

Additionally, we now support bi-directional communiction with the
async procedure, by giving the async function both read and write
file descriptors.

To retain compatiblity and similar "API feel" with start_command,
we require start_async callers to set .out = -1 to get a readable
file descriptor.  If either of .in or .out is 0, we supply no file
descriptor to the async process.

[sp: Note: Erik started this patch, and a huge bulk of it is
     his work.  All bugs were introduced later by Shawn.]

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-05 20:57:22 -08:00
Junio C Hamano
4cb51a65a4 Sync with 1.6.5.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-10 16:20:59 -08:00
Junio C Hamano
1456b043fc Remove post-upload-hook
This hook runs after "git fetch" in the repository the objects are
fetched from as the user who fetched, and has security implications.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-10 12:21:40 -08:00
Junio C Hamano
ef3a4fd670 Merge branch 'np/maint-sideband-favor-status' into maint
* np/maint-sideband-favor-status:
  give priority to progress messages
2009-12-03 13:50:24 -08:00
Junio C Hamano
905bf7742c Merge branch 'sp/smart-http'
* sp/smart-http: (37 commits)
  http-backend: Let gcc check the format of more printf-type functions.
  http-backend: Fix access beyond end of string.
  http-backend: Fix bad treatment of uintmax_t in Content-Length
  t5551-http-fetch: Work around broken Accept header in libcurl
  t5551-http-fetch: Work around some libcurl versions
  http-backend: Protect GIT_PROJECT_ROOT from /../ requests
  Git-aware CGI to provide dumb HTTP transport
  http-backend: Test configuration options
  http-backend: Use http.getanyfile to disable dumb HTTP serving
  test smart http fetch and push
  http tests: use /dumb/ URL prefix
  set httpd port before sourcing lib-httpd
  t5540-http-push: remove redundant fetches
  Smart HTTP fetch: gzip requests
  Smart fetch over HTTP: client side
  Smart push over HTTP: client side
  Discover refs via smart HTTP server when available
  http-backend: more explict LocationMatch
  http-backend: add example for gitweb on same URL
  http-backend: use mod_alias instead of mod_rewrite
  ...

Conflicts:
	.gitignore
	remote-curl.c
2009-11-20 23:51:23 -08:00
Junio C Hamano
e36e6c00cd Merge branch 'np/maint-sideband-favor-status'
* np/maint-sideband-favor-status:
  give priority to progress messages
2009-11-17 22:03:20 -08:00
Nicolas Pitre
6b59f51b31 give priority to progress messages
In theory it is possible for sideband channel #2 to be delayed if
pack data is quick to come up for sideband channel #1.  And because
data for channel #2 is read only 128 bytes at a time while pack data
is read 8192 bytes at a time, it is possible for many pack blocks to
be sent to the client before the progress message fifo is emptied,
making the situation even worse.  This would result in totally garbled
progress display on the client's console as local progress gets mixed
with partial remote progress lines.

Let's prevent such situations by giving transmission priority to
progress messages over pack data at all times.

Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-13 14:39:25 -08:00
Shawn O. Pearce
42526b478e Add stateless RPC options to upload-pack, receive-pack
When --stateless-rpc is passed as a command line parameter to
upload-pack or receive-pack the programs now assume they may
perform only a single read-write cycle with stdin and stdout.
This fits with the HTTP POST request processing model where a
program may read the request, write a response, and must exit.

When --advertise-refs is passed as a command line parameter only
the initial ref advertisement is output, and the program exits
immediately.  This fits with the HTTP GET request model, where
no request content is received but a response must be produced.

HTTP headers and/or environment are not processed here, but
instead are assumed to be handled by the program invoking
either service backend.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-04 17:58:14 -08:00
Shawn O. Pearce
78affc49de Add multi_ack_detailed capability to fetch-pack/upload-pack
When multi_ack_detailed is enabled the ACK continue messages returned
by the remote upload-pack are broken out to describe the different
states within the peer.  This permits the client to better understand
the server's in-memory state.

The fetch-pack/upload-pack protocol now looks like:

NAK
---------------------------------
  Always sent in response to "done" if there was no common base
  selected from the "have" lines (or no have lines were sent).

  * no multi_ack or multi_ack_detailed:

    Sent when the client has sent a pkt-line flush ("0000") and
    the server has not yet found a common base object.

  * either multi_ack or multi_ack_detailed:

    Always sent in response to a pkt-line flush.

ACK %s
-----------------------------------
  * no multi_ack or multi_ack_detailed:

    Sent in response to "have" when the object exists on the remote
    side and is therefore an object in common between the peers.
    The argument is the SHA-1 of the common object.

  * either multi_ack or multi_ack_detailed:

    Sent in response to "done" if there are common objects.
    The argument is the last SHA-1 determined to be common.

ACK %s continue
-----------------------------------
  * multi_ack only:

    Sent in response to "have".

    The remote side wants the client to consider this object as
    common, and immediately stop transmitting additional "have"
    lines for objects that are reachable from it.  The reason
    the client should stop is not given, but is one of the two
    cases below available under multi_ack_detailed.

ACK %s common
-----------------------------------
  * multi_ack_detailed only:

    Sent in response to "have".  Both sides have this object.
    Like with "ACK %s continue" above the client should stop
    sending have lines reachable for objects from the argument.

ACK %s ready
-----------------------------------
  * multi_ack_detailed only:

    Sent in response to "have".

    The client should stop transmitting objects which are reachable
    from the argument, and send "done" soon to get the objects.

    If the remote side has the specified object, it should
    first send an "ACK %s common" message prior to sending
    "ACK %s ready".

    Clients may still submit additional "have" lines if there are
    more side branches for the client to explore that might be added
    to the common set and reduce the number of objects to transfer.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-30 19:20:54 -07:00
Jim Meyering
41698375ad don't dereference NULL upon fdopen failure
There were several unchecked use of fdopen(); replace them with xfdopen()
that checks and dies.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13 01:32:20 -07:00
Jim Meyering
2b7ca830c6 use write_str_in_full helper to avoid literal string lengths
In 2d14d65 (Use a clearer style to issue commands to remote helpers,
2009-09-03) I happened to notice two changes like this:

-	write_in_full(helper->in, "list\n", 5);
+
+	strbuf_addstr(&buf, "list\n");
+	write_in_full(helper->in, buf.buf, buf.len);
+	strbuf_reset(&buf);

IMHO, it would be better to define a new function,

    static inline ssize_t write_str_in_full(int fd, const char *str)
    {
           return write_in_full(fd, str, strlen(str));
    }

and then use it like this:

-       strbuf_addstr(&buf, "list\n");
-       write_in_full(helper->in, buf.buf, buf.len);
-       strbuf_reset(&buf);
+       write_str_in_full(helper->in, "list\n");

Thus not requiring the added allocation, and still avoiding
the maintenance risk of literal string lengths.
These days, compilers are good enough that strlen("literal")
imposes no run-time cost.

Transformed via this:

    perl -pi -e \
        's/write_in_full\((.*?), (".*?"), \d+\)/write_str_in_full($1, $2)/'\
      $(git grep -l 'write_in_full.*"')

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-13 01:31:10 -07:00
Junio C Hamano
e4d1afbcf2 Merge branch 'jc/upload-pack-hook'
* jc/upload-pack-hook:
  upload-pack: feed "kind [clone|fetch]" to post-upload-pack hook
  upload-pack: add a trigger for post-upload-pack hook
2009-09-07 15:24:47 -07:00
Junio C Hamano
8e4384fd44 Merge branch 'np/maint-1.6.3-deepen'
* np/maint-1.6.3-deepen:
  pack-objects: free preferred base memory after usage
  make shallow repository deepening more network efficient
2009-09-07 15:23:50 -07:00
Nicolas Pitre
6523078b96 make shallow repository deepening more network efficient
First of all, I can't find any reason why thin pack generation is
explicitly disabled when dealing with a shallow repository.  The
possible delta base objects are collected from the edge commits which
are always obtained through history walking with the same shallow refs
as the client, Therefore the client is always going to have those base
objects available. So let's remove that restriction.

Then we can make shallow repository deepening much more efficient by
using the remote's unshallowed commits as edge commits to get preferred
base objects for thin pack generation.  On git.git, this makes the data
transfer for the deepening of a shallow repository from depth 1 to depth 2
around 134 KB instead of 3.68 MB.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-05 22:25:26 -07:00
Brian Gianforcaro
eeefa7c90e Style fixes, add a space after if/for/while.
The majority of code in core git appears to use a single
space after if/for/while. This is an attempt to bring more
code to this standard. These are entirely cosmetic changes.

Signed-off-by: Brian Gianforcaro <b.gianfo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-31 23:26:28 -07:00
Junio C Hamano
11cae066b2 upload-pack: feed "kind [clone|fetch]" to post-upload-pack hook
A request to clone the repository does not give any "have" but asks for
all the refs we offer with "want".  When a request does not ask to clone
the repository fully, but asks to fetch some refs into an empty
repository, it will not give any "have" but its "want" won't ask for all
the refs we offer.

If we suppose (and I would say this is a rather big if) that it makes
sense to distinguish these two cases, a hook cannot reliably do this
alone.  The hook can detect lack of "have" and bunch of "want", but there
is no direct way to tell if the other end asked for all refs we offered,
or merely most of them.

Between the time we talked with the other end and the time the hook got
called, we may have acquired more refs or lost some refs in the repository
by concurrent operations.  Given that we plan to introduce selective
advertisement of refs with a protocol extension, it would become even more
difficult for hooks to guess between these two cases.

This adds "kind [clone|fetch]" to hook's input, as a stable interface to
allow the hooks to tell these cases apart.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-28 22:39:24 -07:00
Junio C Hamano
a8563ec851 upload-pack: add a trigger for post-upload-pack hook
After upload-pack successfully finishes its operation, post-upload-pack
hook can be called for logging purposes.

The hook is passed various pieces of information, one per line, from its
standard input.  Currently the following items can be fed to the hook, but
more types of information may be added in the future:

    want SHA-1::
        40-byte hexadecimal object name the client asked to include in the
        resulting pack.  Can occur one or more times in the input.

    have SHA-1::
        40-byte hexadecimal object name the client asked to exclude from
        the resulting pack, claiming to have them already.  Can occur zero
        or more times in the input.

    time float::
        Number of seconds spent for creating the packfile.

    size decimal::
        Size of the resulting packfile in bytes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-28 22:39:17 -07:00