Commit Graph

72 Commits

Author SHA1 Message Date
Junio C Hamano
e013bdab0f Merge branch 'jk/pkt-line-cleanup'
Clean up pkt-line API, implementation and its callers to make them
more robust.

* jk/pkt-line-cleanup:
  do not use GIT_TRACE_PACKET=3 in tests
  remote-curl: always parse incoming refs
  remote-curl: move ref-parsing code up in file
  remote-curl: pass buffer straight to get_remote_heads
  teach get_remote_heads to read from a memory buffer
  pkt-line: share buffer/descriptor reading implementation
  pkt-line: provide a LARGE_PACKET_MAX static buffer
  pkt-line: move LARGE_PACKET_MAX definition from sideband
  pkt-line: teach packet_read_line to chomp newlines
  pkt-line: provide a generic reading function with options
  pkt-line: drop safe_write function
  pkt-line: move a misplaced comment
  write_or_die: raise SIGPIPE when we get EPIPE
  upload-archive: use argv_array to store client arguments
  upload-archive: do not copy repo name
  send-pack: prefer prefixcmp over memcmp in receive_status
  fetch-pack: fix out-of-bounds buffer offset in get_ack
  upload-pack: remove packet debugging harness
  upload-pack: do not add duplicate objects to shallow list
  upload-pack: use get_sha1_hex to parse "shallow" lines
2013-04-01 08:59:37 -07:00
Jeff King
85edf4f58b teach get_remote_heads to read from a memory buffer
Now that we can read packet data from memory as easily as a
descriptor, get_remote_heads can take either one as a
source. This will allow further refactoring in remote-curl.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24 00:17:38 -08:00
Jeff King
74543a0423 pkt-line: provide a LARGE_PACKET_MAX static buffer
Most of the callers of packet_read_line just read into a
static 1000-byte buffer (callers which handle arbitrary
binary data already use LARGE_PACKET_MAX). This works fine
in practice, because:

  1. The only variable-sized data in these lines is a ref
     name, and refs tend to be a lot shorter than 1000
     characters.

  2. When sending ref lines, git-core always limits itself
     to 1000 byte packets.

However, the only limit given in the protocol specification
in Documentation/technical/protocol-common.txt is
LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in
pack-protocol.txt, and then only describing what we write,
not as a specific limit for readers.

This patch lets us bump the 1000-byte limit to
LARGE_PACKET_MAX. Even though git-core will never write a
packet where this makes a difference, there are two good
reasons to do this:

  1. Other git implementations may have followed
     protocol-common.txt and used a larger maximum size. We
     don't bump into it in practice because it would involve
     very long ref names.

  2. We may want to increase the 1000-byte limit one day.
     Since packets are transferred before any capabilities,
     it's difficult to do this in a backwards-compatible
     way. But if we bump the size of buffer the readers can
     handle, eventually older versions of git will be
     obsolete enough that we can justify bumping the
     writers, as well. We don't have plans to do this
     anytime soon, but there is no reason not to start the
     clock ticking now.

Just bumping all of the reading bufs to LARGE_PACKET_MAX
would waste memory. Instead, since most readers just read
into a temporary buffer anyway, let's provide a single
static buffer that all callers can use. We can further wrap
this detail away by having the packet_read_line wrapper just
use the buffer transparently and return a pointer to the
static storage.  That covers most of the cases, and the
remaining ones already read into their own LARGE_PACKET_MAX
buffers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:22 -08:00
Jeff King
819b929d33 pkt-line: teach packet_read_line to chomp newlines
The packets sent during ref negotiation are all terminated
by newline; even though the code to chomp these newlines is
short, we end up doing it in a lot of places.

This patch teaches packet_read_line to auto-chomp the
trailing newline; this lets us get rid of a lot of inline
chomping code.

As a result, some call-sites which are not reading
line-oriented data (e.g., when reading chunks of packfiles
alongside sideband) transition away from packet_read_line to
the generic packet_read interface. This patch converts all
of the existing callsites.

Since the function signature of packet_read_line does not
change (but its behavior does), there is a possibility of
new callsites being introduced in later commits, silently
introducing an incompatibility.  However, since a later
patch in this series will change the signature, such a
commit would have to be merged directly into this commit,
not to the tip of the series; we can therefore ignore the
issue.

This is an internal cleanup and should produce no change of
behavior in the normal case. However, there is one corner
case to note. Callers of packet_read_line have never been
able to tell the difference between a flush packet ("0000")
and an empty packet ("0004"), as both cause packet_read_line
to return a length of 0. Readers treat them identically,
even though Documentation/technical/protocol-common.txt says
we must not; it also says that implementations should not
send an empty pkt-line.

By stripping out the newline before the result gets to the
caller, we will now treat the newline-only packet ("0005\n")
the same as an empty packet, which in turn gets treated like
a flush packet. In practice this doesn't matter, as neither
empty nor newline-only packets are part of git's protocols
(at least not for the line-oriented bits, and readers who
are not expecting line-oriented packets will be calling
packet_read directly, anyway). But even if we do decide to
care about the distinction later, it is orthogonal to this
patch.  The right place to tighten would be to stop treating
empty packets as flush packets, and this change does not
make doing so any harder.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20 13:42:21 -08:00
Junio C Hamano
f2db854d24 fetch: use struct ref to represent refs to be fetched
Even though "git fetch" has full infrastructure to parse refspecs to
be fetched and match them against the list of refs to come up with
the final list of refs to be fetched, the list of refs that are
requested to be fetched were internally converted to a plain list of
strings at the transport layer and then passed to the underlying
fetch-pack driver.

Stop this conversion and instead pass around an array of refs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-07 13:53:59 -08:00
Nguyễn Thái Ngọc Duy
745f7a8cac fetch-pack: move core code to libgit.a
fetch_pack() is used by transport.c, part of libgit.a while it stays
in builtin/fetch-pack.c. Move it to fetch-pack.c so that we won't get
undefined reference if a program that uses libgit.a happens to pull it
in.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 03:40:29 -04:00
Nguyễn Thái Ngọc Duy
f8eb3036d0 fetch-pack: remove global (static) configuration variable "args"
This helps removes the hack in fetch_pack() that copies my_args to args.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 03:08:30 -04:00
Michael Haggerty
5f0fc64513 fetch-pack: eliminate spurious error messages
It used to be that if "--all", "--depth", and also explicit references
were sought, then the explicit references were not handled correctly
in filter_refs() because the "--all --depth" code took precedence over
the explicit reference handling, and the explicit references were
never noted as having been found.  So check for explicitly sought
references before proceeding to the "--all --depth" logic.

This fixes two test cases in t5500.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:32 -07:00
Michael Haggerty
b285668dd2 cmd_fetch_pack(): simplify computation of return value
Set the final value at initialization rather than initializing it then
sometimes changing it.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:32 -07:00
Michael Haggerty
778e7543d2 fetch-pack: report missing refs even if no existing refs were received
This fixes a test in t5500.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:32 -07:00
Michael Haggerty
7418f1a037 cmd_fetch_pack(): return early if finish_connect() fails
This simplifies the logic without changing the behavior.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:31 -07:00
Michael Haggerty
f537cfa750 filter_refs(): simplify logic
Simplify flow within loop: first decide whether to keep the reference,
then keep/free it.  This makes it clearer that each ref has exactly
two possible destinies, and removes duplication of the code for
appending the reference to the linked list.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:31 -07:00
Michael Haggerty
5096e48753 filter_refs(): build refs list as we go
Instead of temporarily storing matched refs to temporary array
"return_refs", simply append them to newlist as we go.  This changes
the order of references in newlist to strictly sorted if "--all" and
"--depth" and named references are all specified, but that usage is
broken anyway (see the last two tests in t5500).

This changes the last test in t5500 from segfaulting into just
emitting a spurious error (this will be fixed in a moment).

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:31 -07:00
Michael Haggerty
4ba159996f filter_refs(): delete matched refs from sought list
Remove any references that are available from the remote from the
sought list (rather than overwriting their names with NUL characters,
as previously).  Mark matching entries by writing a non-NULL pointer
to string_list_item::util during the iteration, then use
filter_string_list() later to filter out the entries that have been
marked.

Document this aspect of fetch_pack() in a comment in the header file.
(More documentation is obviously still needed.)

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:31 -07:00
Michael Haggerty
4c58f13ba6 fetch_pack(): update sought->nr to reflect number of unique entries
fetch_pack() removes duplicates from the "sought" list, thereby
shrinking the list.  But previously, the caller was not informed about
the shrinkage.  This would cause a spurious error message to be
emitted by cmd_fetch_pack() if "git fetch-pack" is called with
duplicate refnames.

Instead, remove duplicates using string_list_remove_duplicates(),
which adjusts sought->nr to reflect the new length of the list.

The last test of t5500 inexplicably *required* "git fetch-pack" to
fail when fetching a list of references that contains duplicates;
i.e., it insisted on the buggy behavior.  So change the test to expect
the correct behavior.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:31 -07:00
Michael Haggerty
382a967114 filter_refs(): do not check the same sought_pos twice
Once a match has been found at sought_pos, the entry is zeroed and no
future attempts will match that entry.  So increment sought_pos to
avoid checking against the zeroed-out entry during the next iteration.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:31 -07:00
Michael Haggerty
8bee93dd24 Change fetch_pack() and friends to take string_list arguments
Instead of juggling <nr_heads,heads> (sometimes called
<nr_match,match>), pass around the list of references to be sought in
a single string_list variable called "sought".  Future commits will
make more use of string_list functionality.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:31 -07:00
Michael Haggerty
63c694534b fetch_pack(): reindent function decl and defn
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-12 11:46:31 -07:00
Jeff King
36c60f7a08 fetch-pack: mention server version with verbose output
Fetch-pack's verbose mode is more of a debugging mode (and
in fact takes two "-v" arguments to trigger via the
porcelain layer). Let's mention the server version as
another possible item of interest.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-13 21:56:05 -07:00
Junio C Hamano
74991a98df fetch-pack: do not ask for unadvertised capabilities
In the same spirit as the previous fix, stop asking for thin-pack, no-progress
and include-tag capabilities when the other end does not claim to support them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-10 14:27:52 -07:00
Jeff King
d50c387163 do not send client agent unless server does first
Commit ff5effdf taught both clients and servers of the git protocol
to send an "agent" capability that just advertises their version for
statistics and debugging purposes.  The protocol-capabilities.txt
document however indicates that the client's advertisement is
actually a response, and should never include capabilities not
mentioned in the server's advertisement.

Adding the unconditional advertisement in the server programs was
OK, then, but the clients broke the protocol.  The server
implementation of git-core itself does not care, but at least one
does: the Google Code git server (or any server using Dulwich), will
hang up with an internal error upon seeing an unknown capability.

Instead, each client must record whether we saw an agent string from
the server, and respond with its agent only if the server mentioned
it first.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-10 12:35:13 -07:00
Jeff King
ff5effdf45 include agent identifier in capability string
Instead of having the client advertise a particular version
number in the git protocol, we have managed extensions and
backwards compatibility by having clients and servers
advertise capabilities that they support. This is far more
robust than having each side consult a table of
known versions, and provides sufficient information for the
protocol interaction to complete.

However, it does not allow servers to keep statistics on
which client versions are being used. This information is
not necessary to complete the network request (the
capabilities provide enough information for that), but it
may be helpful to conduct a general survey of client
versions in use.

We already send the client version in the user-agent header
for http requests; adding it here allows us to gather
similar statistics for non-http requests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-03 13:03:34 -07:00
Junio C Hamano
12d7d15074 Merge branch 'jk/fetch-pack-remove-dups-optim'
The way "fetch-pack" that is given multiple references to fetch tried to
remove duplicates was very inefficient.

By Jeff King
* jk/fetch-pack-remove-dups-optim:
  fetch-pack: sort incoming heads list earlier
  fetch-pack: avoid quadratic loop in filter_refs
  fetch-pack: sort the list of incoming refs
  add sorting infrastructure for list refs
  fetch-pack: avoid quadratic behavior in remove_duplicates
  fetch-pack: sort incoming heads
2012-05-29 13:09:08 -07:00
Junio C Hamano
4dbfaee0c7 Merge branch 'mh/fetch-pack-constness'
Tighten constness of some local variables in a callchain.

By Michael Haggerty
* mh/fetch-pack-constness:
  cmd_fetch_pack(): respect constness of argv parameter
  cmd_fetch_pack(): combine the loop termination conditions
  cmd_fetch_pack(): handle non-option arguments outside of the loop
  cmd_fetch_pack(): declare dest to be const
2012-05-29 13:08:53 -07:00
Jeff King
3d2a33e57f fetch-pack: sort incoming heads list earlier
Commit 4435968 started sorting heads fed to fetch-pack so
that later commits could use more optimized algorithms;
commit 7db8d53 switched the remove_duplicates function to
such an algorithm.

Of course, the sorting is more effective if you do it
_before_ the algorithm in question.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-24 10:02:37 -07:00
Jeff King
a0de28805d fetch-pack: avoid quadratic loop in filter_refs
We have a list of refs that we want to compare against the
"match" array. The current code searches the match list
linearly, giving quadratic behavior over the number of refs
when you want to fetch all of them.

Instead, we can compare the lists as we go, giving us linear
behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22 13:31:03 -07:00
Jeff King
9e8e704f0b fetch-pack: sort the list of incoming refs
Having the list sorted means we can avoid some quadratic
algorithms when comparing lists.

These should typically be sorted already, but they do come
from the remote, so let's be extra careful. Our ref-sorting
implementation does a mergesort, so we do not have to care
about performance degrading in the common case that the list
is already sorted.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22 13:31:03 -07:00
Jeff King
7db8d5370f fetch-pack: avoid quadratic behavior in remove_duplicates
We remove duplicate entries from the list of refs we are
fed in fetch-pack. The original algorithm is quadratic over
the number of refs, but since the list is now guaranteed to
be sorted, we can do it in linear time.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22 13:31:03 -07:00
Jeff King
443596850f fetch-pack: sort incoming heads
There's no reason to preserve the incoming order of the
heads we're requested to fetch. By having them sorted, we
can replace some of the quadratic algorithms with linear
ones.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22 13:31:03 -07:00
Michael Haggerty
57e6fc6958 cmd_fetch_pack(): respect constness of argv parameter
The old code cast away the constness of the strings passed to the
function in argument argv[], which could result in their being
modified by filter_refs().  Fix by copying reference names from argv
and putting them into our own array (similarly to how refnames passed
to stdin were already handled).

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22 12:57:20 -07:00
Michael Haggerty
ff22ff9909 cmd_fetch_pack(): combine the loop termination conditions
If an argument that does not start with '-' is found, the loop is
terminated.  So move that check into the for-loop condition.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22 12:57:20 -07:00
Michael Haggerty
4cc00fcf5d cmd_fetch_pack(): handle non-option arguments outside of the loop
This makes it more obvious that the code is always executed unless
there is an error, and that the first initialization of nr_heads is
unnecessary.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22 12:57:19 -07:00
Michael Haggerty
9d19c6ea52 cmd_fetch_pack(): declare dest to be const
There is no need for it to be non-const, and this avoids the need
for casting away the constness of an argv element.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22 12:57:19 -07:00
Junio C Hamano
77cab8af4a Merge branch 'it/fetch-pack-many-refs'
When "git fetch" encounters repositories with too many references, the
command line of "fetch-pack" that is run by a helper e.g. remote-curl,
may fail to hold all of them. Now such an internal invocation can feed
the references through the standard input of "fetch-pack".

By Ivan Todoroski
* it/fetch-pack-many-refs:
  remote-curl: main test case for the OS command line overflow
  fetch-pack: test cases for the new --stdin option
  remote-curl: send the refs to fetch-pack on stdin
  fetch-pack: new --stdin option to read refs from stdin
2012-04-24 14:40:51 -07:00
Ivan Todoroski
078b895fef fetch-pack: new --stdin option to read refs from stdin
If a remote repo has too many tags (or branches), cloning it over the
smart HTTP transport can fail because remote-curl.c puts all the refs
from the remote repo on the fetch-pack command line. This can make the
command line longer than the global OS command line limit, causing
fetch-pack to fail.

This is especially a problem on Windows where the command line limit is
orders of magnitude shorter than Linux. There are already real repos out
there that msysGit cannot clone over smart HTTP due to this problem.

Here is an easy way to trigger this problem:

	git init too-many-refs
	cd too-many-refs
	echo bla > bla.txt
	git add .
	git commit -m test
	sha=$(git rev-parse HEAD)
	tag=$(perl -e 'print "bla" x 30')
	for i in `seq 50000`; do
		echo $sha refs/tags/$tag-$i >> .git/packed-refs
	done

Then share this repo over the smart HTTP protocol and try cloning it:

	$ git clone http://localhost/.../too-many-refs/.git
	Cloning into 'too-many-refs'...
	fatal: cannot exec 'fetch-pack': Argument list too long

50k tags is obviously an absurd number, but it is required to
demonstrate the problem on Linux because it has a much more generous
command line limit. On Windows the clone fails with as little as 500
tags in the above loop, which is getting uncomfortably close to the
number of tags you might see in real long lived repos.

This is not just theoretical, msysGit is already failing to clone our
company repo due to this. It's a large repo converted from CVS, nearly
10 years of history.

Four possible solutions were discussed on the Git mailing list (in no
particular order):

1) Call fetch-pack multiple times with smaller batches of refs.

This was dismissed as inefficient and inelegant.

2) Add option --refs-fd=$n to pass a an fd from where to read the refs.

This was rejected because inheriting descriptors other than
stdin/stdout/stderr through exec() is apparently problematic on Windows,
plus it would require changes to the run-command API to open extra
pipes.

3) Add option --refs-from=$tmpfile to pass the refs using a temp file.

This was not favored because of the temp file requirement.

4) Add option --stdin to pass the refs on stdin, one per line.

In the end this option was chosen as the most efficient and most
desirable from scripting perspective.

There was however a small complication when using stdin to pass refs to
fetch-pack. The --stateless-rpc option to fetch-pack also uses stdin for
communication with the remote server.

If we are going to sneak refs on stdin line by line, it would have to be
done very carefully in the presence of --stateless-rpc, because when
reading refs line by line we might read ahead too much data into our
buffer and eat some of the remote protocol data which is also coming on
stdin.

One way to solve this would be to refactor get_remote_heads() in
fetch-pack.c to accept a residual buffer from our stdin line parsing
above, but this function is used in several places so other callers
would be burdened by this residual buffer interface even when most of
them don't need it.

In the end we settled on the following solution:

If --stdin is specified without --stateless-rpc, fetch-pack would read
the refs from stdin one per line, in a script friendly format.

However if --stdin is specified together with --stateless-rpc,
fetch-pack would read the refs from stdin in packetized format
(pkt-line) with a flush packet terminating the list of refs. This way we
can read the exact number of bytes that we need from stdin, and then
get_remote_heads() can continue reading from the same fd without losing
a single byte of remote protocol data.

This way the --stdin option only loses generality and scriptability when
used together with --stateless-rpc, which is not easily scriptable
anyway because it also uses pkt-line when talking to the remote server.

Signed-off-by: Ivan Todoroski <grnch@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-02 13:47:15 -07:00
Junio C Hamano
592d051759 Merge branch 'cb/transfer-no-progress'
* cb/transfer-no-progress:
  push/fetch/clone --no-progress suppresses progress output
2012-02-20 00:14:55 -08:00
Clemens Buchacher
01fdc21f6e push/fetch/clone --no-progress suppresses progress output
By default, progress output is disabled if stderr is not a terminal.
The --progress option can be used to force progress output anyways.
Conversely, --no-progress does not force progress output. In particular,
if stderr is a terminal, progress output is enabled.

This is unintuitive. Change --no-progress to force output off.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-13 13:06:53 -08:00
Michael Haggerty
f257659132 everything_local(): mark alternate refs as complete
Objects in an alternate object database are already available to the
local repository and therefore don't need to be fetched.  So mark them
as complete in everything_local().

This fixes a test in t5700.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-12 19:50:39 -08:00
Michael Haggerty
c41a802fe9 fetch-pack.c: inline insert_alternate_refs()
The logic of the (single) caller is clearer without encapsulating this
one line in a function.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-12 19:50:39 -08:00
Michael Haggerty
65385ef7d4 fetch-pack.c: rename some parameters from "path" to "refname"
The parameters denote reference names, which are no longer 1:1 with
filesystem paths.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-12 19:50:39 -08:00
Jeff King
1e7ba0f9ca fetch-pack: match refs exactly
When we are determining the list of refs to fetch via
fetch-pack, we have two sets of refs to compare: those on
the remote side, and a "match" list of things we want to
fetch. We iterate through the remote refs alphabetically,
seeing if each one is wanted by the "match" list.

Since def88e9 (Commit first cut at "git-fetch-pack",
2005-07-04), we have used the "path_match" function to do a
suffix match, where a remote ref is considered wanted if
any of the "match" elements is a suffix of the remote
refname.

This enables callers of fetch-pack to specify unqualified
refs and have them matched up with remote refs (e.g., ask
for "A" and get remote's "refs/heads/A"). However, if you
provide a fully qualified ref, then there are corner cases
where we provide the wrong answer. For example, given a
remote with two refs:

   refs/foo/refs/heads/master
   refs/heads/master

asking for "refs/heads/master" will first match
"refs/foo/refs/heads/master" by the suffix rule, and we will
erroneously fetch it instead of refs/heads/master.

As it turns out, all callers of fetch_pack do provide
fully-qualified refs for the match list. There are two ways
fetch_pack can get match lists:

  1. Through the transport code (i.e., via git-fetch)

  2. On the command-line of git-fetch-pack

In the first case, we will always be providing the names of
fully-qualified refs from "struct ref" objects. We will have
pre-matched those ref objects already (since we have to
handle more advanced matching, like wildcard refspecs), and
are just providing a list of the refs whose objects we need.

In the second case, users could in theory be providing
non-qualified refs on the command-line. However, the
fetch-pack documentation claims that refs should be fully
qualified (and has always done so since it was written in
2005).

Let's change this path_match call to simply check for string
equality, matching what the callers of fetch_pack are
expecting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-13 10:17:50 -08:00
Jeff King
afe7c5ff1f drop "match" parameter from get_remote_heads
The get_remote_heads function reads the list of remote refs
during git protocol session. It dates all the way back to
def88e9 (Commit first cut at "git-fetch-pack", 2005-07-04).
At that time, the idea was to come up with a list of refs we
were interested in, and then filter the list as we got it
from the remote side.

Later, 1baaae5 (Make maximal use of the remote refs,
2005-10-28) stopped filtering at the get_remote_heads layer,
letting us use the non-matching refs to find common history.

As a result, all callers now simply pass an empty match
list (and any future callers will want to do the same). So
let's drop these now-useless parameters.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-13 10:08:24 -08:00
Junio C Hamano
9bd500048d Merge branch 'mh/check-ref-format-3'
* mh/check-ref-format-3: (23 commits)
  add_ref(): verify that the refname is formatted correctly
  resolve_ref(): expand documentation
  resolve_ref(): also treat a too-long SHA1 as invalid
  resolve_ref(): emit warnings for improperly-formatted references
  resolve_ref(): verify that the input refname has the right format
  remote: avoid passing NULL to read_ref()
  remote: use xstrdup() instead of strdup()
  resolve_ref(): do not follow incorrectly-formatted symbolic refs
  resolve_ref(): extract a function get_packed_ref()
  resolve_ref(): turn buffer into a proper string as soon as possible
  resolve_ref(): only follow a symlink that contains a valid, normalized refname
  resolve_ref(): use prefixcmp()
  resolve_ref(): explicitly fail if a symlink is not readable
  Change check_refname_format() to reject unnormalized refnames
  Inline function refname_format_print()
  Make collapse_slashes() allocate memory for its result
  Do not allow ".lock" at the end of any refname component
  Refactor check_refname_format()
  Change check_ref_format() to take a flags argument
  Change bad_ref_char() to return a boolean value
  ...
2011-10-10 15:56:18 -07:00
Michael Haggerty
8d9c50105f Change check_ref_format() to take a flags argument
Change check_ref_format() to take a flags argument that indicates what
is acceptable in the reference name (analogous to "git
check-ref-format"'s "--allow-onelevel" and "--refspec-pattern").  This
is more convenient for callers and also fixes a failure in the test
suite (and likely elsewhere in the code) by enabling "onelevel" and
"refspec-pattern" to be allowed independently of each other.

Also rename check_ref_format() to check_refname_format() to make it
obvious that it deals with refnames rather than references themselves.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-05 13:45:29 -07:00
Junio C Hamano
ca0c9764bf Merge branch 'jc/fetch-pack-fsck-objects'
* jc/fetch-pack-fsck-objects:
  test: fetch/receive with fsckobjects
  transfer.fsckobjects: unify fetch/receive.fsckobjects
  fetch.fsckobjects: verify downloaded objects

Conflicts:
	Documentation/config.txt
	builtin/fetch-pack.c
2011-10-05 12:36:20 -07:00
Junio C Hamano
dab76d3aa6 transfer.fsckobjects: unify fetch/receive.fsckobjects
This single variable can be used to set instead of setting fsckobjects
variable for fetch & receive independently.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-04 12:39:32 -07:00
Junio C Hamano
5e838ea7aa fetch.fsckobjects: verify downloaded objects
This corresponds to receive.fsckobjects configuration variable added (a
lot) earlier in 20dc001 (receive-pack: allow using --strict mode for
unpacking objects, 2008-02-25).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-04 12:27:17 -07:00
Junio C Hamano
3400c222d9 Merge branch 'nd/decorate-grafts'
* nd/decorate-grafts:
  log: Do not decorate replacements with --no-replace-objects
  log: decorate "replaced" on to replaced commits
  log: decorate grafted commits with "grafted"
  Move write_shallow_commits to fetch-pack.c
  Add for_each_commit_graft() to iterate all grafts
  decoration: do not mis-decorate refs with same prefix
2011-08-28 21:22:58 -07:00
Nguyễn Thái Ngọc Duy
ec099546a9 fetch-pack: check for valid commit from server
A malicious server can return ACK with non-existent SHA-1 or not a
commit. lookup_commit() in this case may return NULL. Do not let
fetch-pack crash by accessing NULL address in this case.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-18 12:25:54 -07:00
Nguyễn Thái Ngọc Duy
294e15fc19 Move write_shallow_commits to fetch-pack.c
This function produces network traffic and should be in fetch-pack. It
has been in commit.c because it needs to iterate (private) graft
list. It can now do so using for_each_commit_graft().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-18 11:01:18 -07:00