Commit Graph

29357 Commits

Author SHA1 Message Date
René Scharfe
c2df7585ef submodule: fix prototype of gitmodules_config
Add void to make it match its definition in submodule.c.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10 12:27:54 -07:00
Ross Lagerwall
658219f1c7 rev-parse --show-prefix: add in trailing newline
Print out a trailing newline when --show-prefix is run with cwd
at the top level of the tree which results in an empty prefix.
Behavior is now like --show-cdup.

Fixes an expected failure in t1501.

Signed-off-by: Ross Lagerwall <rosslagerwall@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10 09:25:35 -07:00
Jeff King
dfa1725a3e fix http auth with multiple curl handles
HTTP authentication is currently handled by get_refs and fetch_ref, but
not by fetch_object, fetch_pack or fetch_alternates. In the
single-threaded case, this is not an issue, since get_refs is always
called first. It recognizes the 401 and prompts the user for
credentials, which will then be used subsequently.

If the curl multi interface is used, however, only the multi handle used
by get_refs will have credentials configured. Requests made by other
handles fail with an authentication error.

Fix this by setting CURLOPT_USERPWD whenever a slot is requested.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10 09:12:13 -07:00
Clemens Buchacher
5a9681f46a http auth fails with multiple curl handles
Create a repo with multiple loose objects in order to demonstrate http
authentication breakage.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10 09:12:11 -07:00
David A. Greene
926b1ec63e Fix git-subtree install instructions
Update the install instructions to reflect the changes for an
integrated git-subtree.

Signed-off-by: David A. Greene <greened@obbligato.org>
2012-04-09 22:26:19 -05:00
David A. Greene
311391da90 Use git-subtree test Makefile
Use the Makefile in contrib/subtree/t to run git-subtree tests.

Signed-off-by: David A. Greene <greened@obbligato.org>
2012-04-09 22:26:19 -05:00
David A. Greene
c3d884a688 Add subtree test Makefile
Add a Makefile to run subtree tests.  This is largely copied
from the standard test suite with irrelevant targets removed
and some paths altered to account for where subtree tests live.

Signed-off-by: David A. Greene <greened@obbligato.org>
2012-04-09 22:26:10 -05:00
David A. Greene
7ff8463dba Install git-subtree from contrib
Build git-subtree in its contrib directory and install from there.
The main Makefile no longer discovers subcommands built in the main
build area so we cannot count on it to install git-subtree.  The user
should make && make install in contrib/subtree to install git-subtree.

Change the rule to install the git-subtree manpage.  The main
Documentation area doesn't directly support installing documentation
from other directories so the user will have to do that from within
contrib/subtree for now.

Signed-off-by: David A. Greene <greened@obbligato.org>
2012-04-09 22:26:10 -05:00
David A. Greene
187bc2da5b Use configure settings for git-subtree
Include config.make.autogen in the git-subtree contrib area to pick up
settings for prefix and other such things.

Signed-off-by: David A. Greene <greened@obbligato.org>
2012-04-09 22:26:10 -05:00
David A. Greene
5163d476d0 Use project config files
Use project-wide files to process documentation for git-subtree.

Signed-off-by: David A. Greene <greened@obbligato.org>
2012-04-09 22:25:58 -05:00
David A. Greene
c96c5383ff Remove unnecessary git-subtree files
Remove various files that simply duplicate functionality already
provided by the main project files.

Signed-off-by: David A. Greene <greened@obbligato.org>
2012-04-09 22:11:25 -05:00
David A. Greene
603ee0f0c3 Set TEST_DIRECTORY
Set TEST_DIRECTORY to the main git test area.  This allows the
git-subtree out-of-tree tests to run correctly.

Signed-off-by: David A. Greene <greened@obbligato.org>
2012-04-09 20:23:10 -05:00
David A. Greene
634392b262 Add 'contrib/subtree/' from commit 'd3a04e06c77d57978bb5230361c64946232cc346'
git-subtree-dir: contrib/subtree
git-subtree-mainline: e8dde3e5f9
git-subtree-split: d3a04e06c7
2012-04-09 20:22:55 -05:00
Thomas Rast
6942efcfa9 xdiff: load full words in the inner loop of xdl_hash_record
Redo the hashing loop in xdl_hash_record in a way that loads an entire
'long' at a time, using masking tricks to see when and where we found
the terminating '\n'.

I stole inspiration and code from the posts by Linus Torvalds around

  https://lkml.org/lkml/2012/3/2/452
  https://lkml.org/lkml/2012/3/5/6

His method reads the buffers in sizeof(long) increments, and may thus
overrun it by at most sizeof(long)-1 bytes before it sees the final
newline (or hits the buffer length check).  I considered padding out
all buffers by a suitable amount to "catch" the overrun, but

* this does not work for mmap()'d buffers: if you map 4096+8 bytes
  from a 4096 byte file, accessing the last 8 bytes results in a
  SIGBUS on my machine; and

* it would also be extremely ugly because it intrudes deep into the
  unpacking machinery.

So I adapted it to not read beyond the buffer at all.  Instead, it
reads the final partial word byte-by-byte and strings it together.
Then it can use the same logic as before to finish the hashing.

So far we enable this only on x86_64, where it provides nice speedup
for diff-related work:

  Test                                  origin/next      tr/xdiff-fast-hash
  -----------------------------------------------------------------------------
  4000.1: log -3000 (baseline)          0.07(0.05+0.02)  0.08(0.06+0.02) +14.3%
  4000.2: log --raw -3000 (tree-only)   0.37(0.33+0.04)  0.37(0.32+0.04) +0.0%
  4000.3: log -p -3000 (Myers)          1.75(1.65+0.09)  1.60(1.49+0.10) -8.6%
  4000.4: log -p -3000 --histogram      1.73(1.62+0.09)  1.58(1.49+0.08) -8.7%
  4000.5: log -p -3000 --patience       2.11(2.00+0.10)  1.94(1.80+0.11) -8.1%

Perhaps other platforms could also benefit.  However it does NOT work
on big-endian systems!
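
The masking trick described above can be illustrated outside of C. The following is a minimal Python sketch, not the code from xdl_hash_record: it treats eight bytes as one little-endian 64-bit word and locates the first '\n' without testing the bytes one at a time. The helper name first_newline_index and the constant names are assumptions chosen for illustration.

```python
ONES = 0x0101010101010101    # 0x01 in every byte
HIGHS = 0x8080808080808080   # 0x80 in every byte
MASK64 = 0xFFFFFFFFFFFFFFFF  # emulate 64-bit wraparound
NEWLINES = ONES * 0x0A       # '\n' replicated into every byte


def first_newline_index(chunk: bytes):
    """Return the index of the first b'\\n' in an 8-byte chunk, or None."""
    if len(chunk) != 8:
        raise ValueError("expected exactly 8 bytes")
    # Little-endian load: chunk[0] ends up in the lowest byte of the word.
    word = int.from_bytes(chunk, "little")
    # After the XOR, a byte is zero exactly where the chunk held '\n'.
    v = word ^ NEWLINES
    # "Has a zero byte" test. Bits above the first zero byte may be set
    # spuriously by borrows, but the lowest set bit is always exact.
    found = ((v - ONES) & MASK64) & ~v & HIGHS
    if found == 0:
        return None
    lowest = found & -found
    return (lowest.bit_length() - 1) // 8
```

Because only the lowest set bit is trusted, the sketch depends on the first byte of the chunk sitting in the least significant position, which is consistent with the warning above about big-endian systems.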

[jc: minimum style and compilation fixes]

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-09 17:03:25 -07:00
John Keeping
a6754cda43 rebase -i continue: don't skip commits that only change submodules
When git-rebase--interactive stops due to a conflict and the only change
to be committed is in a submodule, the test for whether there is
anything to be committed ignores the staged submodule change.  This
leads rebase to skip creating the commit for the change.

While unstaged submodule changes should be ignored to avoid needing to
update submodules during a rebase, it is safe to remove the
--ignore-submodules option to diff-index because --cached ensures that
it is only checking the index.  This was discussed in [1] and a test is
included to ensure that unstaged changes are still ignored correctly.

[1] http://thread.gmane.org/gmane.comp.version-control.git/188713

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-09 15:08:18 -07:00
Michael Schubert
31558fd48e remote: update builtin usage
Add missing options "--tags|--no-tags" and "--push".

Signed-off-by: Michael Schubert <mschub@elegosoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-09 15:06:50 -07:00
Pete Wyckoff
6ab1d76c3c git p4: use "git p4" directly in tests
Drop the $GITP4 variable that was used to specify the script in
contrib/fast-import/.  The command is called "git p4" now, not
"git-p4".

Note that configuration variables will remain in a section called
"git-p4".

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-09 15:00:33 -07:00
Pete Wyckoff
9dcb9f24f8 git p4: update name in script
In messages to the user and comments, change "git-p4" to "git p4".

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-09 14:59:40 -07:00
Pete Wyckoff
b6f9305764 git-p4: move to toplevel
Move git-p4 out of contrib/fast-import into the main code base,
alongside other foreign SCM tools.

Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-09 14:59:40 -07:00
Ben Walton
c5bc42b9b7 Avoid bug in Solaris xpg4/sed as used in submodule
The sed provided by Solaris in /usr/xpg4/bin has a bug whereby an
unanchored regex using * for zero or more repetitions sees two
separate matches fed to the substitution engine in some cases.

This is evidenced by:

$ for sed in /usr/xpg4/bin/sed /usr/bin/sed /opt/csw/gnu/sed; do \
echo 'ab' | $sed -e 's|[a]*|X|g'; \
done
XXbX
XbX
XbX

This bug was triggered during a git submodule clone operation as
exercised in the setup stage of t5526-fetch-submodules when using the
default SANE_TOOL_PATH for Solaris.  It led to paths such as
..../.. being used in the submodule .git gitdir reference.

Using the expression 's|\([^/]*\(/*\)\)|..\2|g' provides the desired
result with all three tested sed implementations but is harder
to read.  As we do not need to handle fully qualified paths though,
the expression could actually be [^/]+ which isn't properly handled
either.  Instead, use [^/][^/]*, as suggested by Andreas Schwab, which
works on all three tested sed implementations.

The new expression is semantically different than the original one.
It will not place a leading '..' on a fully qualified path as the
original expression did.  All of the paths being passed through this
regex are relative and did not rely on this behaviour so it's a safe
change.
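
The semantic difference can be seen with a small sketch. It uses Python's re module, which is not sed and does not reproduce the Solaris bug; it only shows how [^/][^/]* treats relative and fully qualified paths. The helper name to_parent_refs is hypothetical.

```python
import re


def to_parent_refs(path: str) -> str:
    """Replace every path component with '..', leaving slashes untouched."""
    # [^/][^/]* requires at least one non-slash character, so it can never
    # match the empty string at the start of a path or between slashes.
    return re.sub(r"[^/][^/]*", "..", path)
```

For a relative path such as a/b/c the result is ../../.., while a fully qualified path such as /a/b becomes /../.. with no leading '..', which is the behaviour the message describes.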

Signed-off-by: Ben Walton <bwalton@artsci.utoronto.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-09 14:49:32 -07:00
Junio C Hamano
b1bcfbe344 Merge branch 'jc/maint-verify-objects-remove-pessimism' into maint-1.7.8
* jc/maint-verify-objects-remove-pessimism:
  fetch/receive: remove over-pessimistic connectivity check
2012-04-09 13:43:16 -07:00
Junio C Hamano
795283c415 Merge branch 'dw/gitweb-doc-grammo' into maint-1.7.8
* dw/gitweb-doc-grammo:
  Documentation/gitweb: trivial English fixes
2012-04-09 13:42:56 -07:00
Junio C Hamano
6d5c16a90c Merge branch 'tr/cache-tree' into maint-1.7.8
* tr/cache-tree:
  t0090: be prepared that 'wc -l' writes leading blanks
  reset: update cache-tree data when appropriate
  commit: write cache-tree data when writing index anyway
  Refactor cache_tree_update idiom from commit
  Test the current state of the cache-tree optimization
  Add test-scrap-cache-tree
2012-04-09 13:40:32 -07:00
Junio C Hamano
00fb2d2563 Merge branch 'cb/maint-t5541-make-server-port-portable' into maint-1.7.8
* cb/maint-t5541-make-server-port-portable:
  t5541: check error message against the real port number used
  remote-curl: Fix push status report when all branches fail
2012-04-09 13:38:41 -07:00
Junio C Hamano
fc2d99f1e9 Merge branch 'cn/maint-rev-list-doc' into maint-1.7.8
* cn/maint-rev-list-doc:
  Documentation: use {asterisk} in rev-list-options.txt when needed
2012-04-09 13:36:44 -07:00
Junio C Hamano
50c9403284 Merge branch 'tr/maint-bundle-boundary' into maint-1.7.8
* tr/maint-bundle-boundary:
  bundle: keep around names passed to add_pending_object()
  t5510: ensure we stay in the toplevel test dir
  t5510: refactor bundle->pack conversion
2012-04-09 13:36:26 -07:00
Junio C Hamano
8502a779da Merge branch 'tr/maint-bundle-long-subject' into maint-1.7.8
* tr/maint-bundle-long-subject:
  t5704: match tests to modern style
  strbuf: improve strbuf_get*line documentation
  bundle: use a strbuf to scan the log for boundary commits
  bundle: put strbuf_readline_fd in strbuf.c with adjustments
2012-04-09 13:36:20 -07:00
Junio C Hamano
dbdc07fcbe Merge branch 'ph/rerere-doc' into maint-1.7.8
* ph/rerere-doc:
  rerere: Document 'rerere remaining'
2012-04-09 13:34:09 -07:00
Junio C Hamano
e8dde3e5f9 Git 1.7.10
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-06 10:47:58 -07:00
Felipe Contreras
e681a93a98 spec: add missing build dependency
Otherwise:

/usr/bin/perl Makefile.PL PREFIX='/opt/git' INSTALL_BASE=''
Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: ...) at Makefile.PL line 1.
BEGIN failed--compilation aborted at Makefile.PL line 1.
make[1]: *** [perl.mak] Error 2
make: *** [perl/perl.mak] Error 2

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-06 10:15:11 -07:00
Jeff King
38f865c27d run-command: treat inaccessible directories as ENOENT
When execvp reports EACCES, it can be one of two things:

  1. We found a file to execute, but did not have
     permissions to do so.

  2. We did not have permissions to look in some directory
     in the $PATH.

In the former case, we want to consider this a
permissions problem and report it to the user as such (since
getting this for something like "git foo" is likely a
configuration error).

In the latter case, there is a good chance that the
inaccessible directory does not contain anything of
interest. Reporting "permission denied" is confusing to the
user (and prevents our usual "did you mean...?" lookup). It
also prevents git from trying alias lookup, since we do so
only when an external command does not exist (not when it
exists but has an error).

This patch detects EACCES from execvp, checks whether we are
in case (2), and if so converts errno to ENOENT. This
behavior matches that of "bash" (but not of simpler shells
that use execvp more directly, like "dash").
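
The decision between the two cases might be sketched as follows. This is an illustration in Python under simplifying assumptions, not the code in run-command.c: the function name classify_exec_eacces is hypothetical, the PATH is passed in as a list, and commands that contain a slash are not handled specially.

```python
import errno
import os


def classify_exec_eacces(cmd: str, path_dirs):
    """Decide how to report an EACCES from a failed exec of `cmd`.

    Returns errno.EACCES when some PATH directory visibly contains `cmd`
    (case 1: found it, but may not run it). Otherwise returns errno.ENOENT
    (case 2: the error came from a directory we could not search).
    """
    for directory in path_dirs:
        # os.path.exists() is False for a missing file and also when the
        # directory cannot be searched, which is the case treated as ENOENT.
        if os.path.exists(os.path.join(directory, cmd)):
            return errno.EACCES
    return errno.ENOENT
```

A caller would consult this only after the exec attempt has already failed with EACCES, so that "did you mean...?" and alias lookup still run in case (2).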

Test stolen from Junio.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-05 16:24:13 -07:00
Ramsay Jones
1696d72321 compat/mingw.[ch]: Change return type of exec functions to int
The POSIX standard specifies a return type of int for all six exec
functions. In addition, all exec functions return -1 on error, and
simply do not return on success. However, the current emulation of
the exec functions on mingw is declared with a void return type.

This would cause a problem should any code attempt to call the
exec function in a non-void context. In particular, if an exec
function were used in a conditional it would fail to compile.

In order to improve the fidelity of the emulation, we change the
return type of the mingw_execv[p] functions to int and return -1
on error.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-05 16:22:48 -07:00
Junio C Hamano
135dadef71 push: error out when the "upstream" semantics does not make sense
The user can say "git push" without specifying any refspec.  When using
the "upstream" semantics via the push.default configuration, the user
wants to update the "upstream" branch of the current branch, which is the
branch at a remote repository the current branch is set to integrate with,
with this command.

However, there are cases where such a "git push" that uses the "upstream"
semantics does not make sense:

 - The current branch does not have branch.$name.remote configured.  By
   definition, "git push" that does not name where to push to will not
   know where to push to.  The user may explicitly say "git push $there",
   but again, by definition, no branch at repository $there is set to
   integrate with the current branch in this case and we wouldn't know
   which remote branch to update.

 - The current branch does have branch.$name.remote configured, but it
   does not specify branch.$name.merge that names what branch at the
   remote this branch integrates with. "git push" knows where to push in
   this case (or the user may explicitly say "git push $remote" to tell us
   where to push), but we do not know which remote branch to update.

 - The current branch does have its remote and upstream branch configured,
   but the user said "git push $there", where $there is not the remote
   named by "branch.$name.remote".  By definition, no branch at repository
   $there is set to integrate with the current branch in this case, and
   this push is not meant to update any branch at the remote repository
   $there.

The first two cases were already checked correctly, but the third case was
not checked and we ended up updating the branch named branch.$name.merge
at repository $there, which was totally bogus.
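
The three checks can be summarised in a short sketch. It is written in Python for illustration and is not the code in builtin/push.c: the function name upstream_push_target, the dict-based configuration, and the error wording are all assumptions.

```python
def upstream_push_target(config: dict, branch: str, remote_arg=None):
    """Return (remote, merge_ref) for an "upstream" push, or raise ValueError.

    `config` maps keys such as "branch.<name>.remote" to their values and
    `remote_arg` is the remote named on the command line, if any.
    """
    remote = config.get(f"branch.{branch}.remote")
    merge = config.get(f"branch.{branch}.merge")
    if remote is None:
        # Case 1: nothing says which repository this branch integrates with.
        raise ValueError("no remote is configured for the current branch")
    if merge is None:
        # Case 2: the remote is known, but not which branch there to update.
        raise ValueError("the current branch has no upstream branch")
    if remote_arg is not None and remote_arg != remote:
        # Case 3: the push goes somewhere other than the upstream remote.
        raise ValueError(
            f"'{remote_arg}' is not the upstream remote '{remote}'"
        )
    return remote, merge
```

The third branch corresponds to the case this commit starts rejecting; the first two mirror checks that the message says were already in place.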

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-05 13:35:57 -07:00
Jeff King
4066bd6797 add--interactive: ignore unmerged entries in patch mode
When "add -p" sees an unmerged entry, it shows the combined
diff and then immediately skips the hunk. This can be
confusing in a variety of ways, depending on whether there
are other changes to stage (in which case you get the
superfluous combined diff output in between other hunks) or
not (in which case you get the combined diff and the program
exits immediately, rather than seeing "No changes").

The current behavior was not planned, and is just what the
implementation happens to do. Instead, let's explicitly
remove unmerged entries from our list of modified files, and
print a warning that we are ignoring them.

We can cheaply find which entries are unmerged by adding
"--raw" output to the "diff-files --numstat" we already run.
There is one non-obvious thing we must change when parsing
this combined output. Before this patch, when we saw a
numstat line for a file that did not have index changes, we
would create a new record with 'unchanged' in the 'INDEX'
field.  Because "--raw" comes before "--numstat", we must
move this special-case down to the raw-line case (and it is
sufficient to move it rather than handle it in both places,
since any file which has a --numstat will also have a --raw
entry).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-05 09:01:03 -07:00
Junio C Hamano
69dec66b2f update-index: upgrade/downgrade on-disk index version
With the "--index-version <n>" parameter, write the index out in the
specified version.  With this, an index file that is written in newer
format (say v4) can be downgraded to be read by older versions of Git.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-04 09:57:50 -07:00
Junio C Hamano
9d227781b6 read-cache.c: write prefix-compressed names in the index
Teach the code to write the index in the v4 on-disk format.

Record the format version of the on-disk index we read from in the
index_state, and use the format when writing the new index out.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-04 09:57:49 -07:00
Ben Walton
b3e34dddc0 Use SHELL_PATH from build system in run_command.c:prepare_shell_cmd
During the testing of the 1.7.10 rc series on Solaris for OpenCSW, it
was discovered that t7006-pager was failing due to finding a bad "sh"
in PATH after a call to execvp("sh", ...).  This call was set up by
run_command.c:prepare_shell_cmd.

The PATH in use at the time saw /opt/csw/bin given precedence to
traditional Solaris paths such as /usr/bin and /usr/xpg4/bin.  A
package named schilyutils (Joerg Schilling's utilities) was installed
on the build system and it delivered a modified version of the
traditional Solaris /usr/bin/sh as /opt/csw/bin/sh.  This version of
sh suffers from many of the same problems as /usr/bin/sh.

The command-specific pager test failed due to the broken "sh" handling
^ as a pipe character.  It tried to fork two processes when it
encountered "sed s/^/foo:/" as the pager command.  This problem was
entirely dependent on the PATH of the user at runtime.

Possible fixes for this issue are:

1. Use the standard system() or popen() which both launch a POSIX
   shell on Solaris as long as _POSIX_SOURCE is defined.

2. The git wrapper could prepend SANE_TOOL_PATH to PATH thus forcing
   all unqualified commands run to use the known good tools on the
   system.

3. The run_command.c:prepare_shell_command() could use the same
   SHELL_PATH that is in the #! line of all scripts and not rely
   on PATH to find the sh to run.

Option 1 would preclude opening a bidirectional pipe to a filter
script and would also break git for Windows as cmd.exe is spawned from
system() (cf. v1.7.5-rc0~144^2, "alias: use run_command api to execute
aliases", 2011-01-07).

Option 2 is not friendly to users as it would negate their ability to
use tools of their choice in many cases.  Alternately, injecting
SANE_TOOL_PATH such that it takes precedence over /bin and /usr/bin
(and anything with lower precedence than those paths) as
git-sh-setup.sh does would not solve the problem either as the user
environment could still allow a bad sh to be found.  (Many OpenCSW
users will have /opt/csw/bin leading their PATH and some subset would
have schilyutils installed.)

Option 3 allows us to use a known good shell while still honouring the
users' PATH for the utilities being run.  Thus, it solves the problem
while not negatively impacting either users or git's ability to run
external commands in convenient ways.  Essentially, the shell is a
special case of tool that should not rely on SANE_TOOL_PATH and must
be called explicitly.

With this patch applied, any code path leading to
run_command.c:prepare_shell_cmd can count on using the same sane shell
that all shell scripts in the git suite use.  Both the build system
and run_command.c will default this shell to /bin/sh unless
overridden.
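
The idea behind option 3 can be sketched in a few lines. This is a Python illustration rather than the C code in run_command.c; the constant SHELL_PATH mirrors the build setting named above, while the function name prepare_shell_argv and the exact argv layout are assumptions.

```python
# Assumed build-time default; the message says it is /bin/sh unless overridden.
SHELL_PATH = "/bin/sh"


def prepare_shell_argv(command: str, shell_path: str = SHELL_PATH):
    """Build an argv that runs `command` through one fixed, known shell.

    Using an absolute path for argv[0] means no PATH search is done for the
    shell itself; utilities named inside `command` are still found via PATH.
    """
    return [shell_path, "-c", command]
```

Passing the resulting list to something like subprocess.run() would start exactly that shell, so a broken "sh" earlier in the user's PATH would not be picked up.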

Signed-off-by: Ben Walton <bwalton@artsci.utoronto.ca>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 17:24:20 -07:00
Junio C Hamano
1f08c2c825 Documentation/git-commit: rephrase the "initial-ness" of templates
The description of "commit -t <file>" said the file is used "as the
initial version" of the commit message, but in the context of an SCM,
"version" is a loaded word that can needlessly confuse readers.

Explain the purpose of the mechanism without using "version".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 16:41:21 -07:00
Junio C Hamano
6c9cd161d9 read-cache.c: read prefix-compressed names in index on-disk version v4
Because the entries are sorted by path, adjacent entries in the index tend
to share the leading components of them, and it makes sense to only store
the differences in later entries.  In the v4 on-disk format of the index,
each on-disk cache entry stores the number of bytes to be stripped from
the end of the previous name, and the bytes to append to the result, to
come up with its name.
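
The scheme can be illustrated with a small round-trip sketch. It is written in Python and models only the idea of "bytes to strip, bytes to append"; it does not attempt to reproduce the actual on-disk byte layout, and the function names compress_paths and expand_paths are hypothetical.

```python
def compress_paths(paths):
    """Encode sorted byte-string paths as (strip_count, suffix) pairs."""
    entries = []
    previous = b""
    for path in paths:
        # Length of the leading part shared with the previous name.
        common = 0
        limit = min(len(previous), len(path))
        while common < limit and previous[common] == path[common]:
            common += 1
        # Strip what is not shared from the previous name, then append the rest.
        entries.append((len(previous) - common, path[common:]))
        previous = path
    return entries


def expand_paths(entries):
    """Rebuild the full paths from (strip_count, suffix) pairs."""
    paths = []
    previous = b""
    for strip, suffix in entries:
        previous = previous[:len(previous) - strip] + suffix
        paths.append(previous)
    return paths
```

For example, builtin/apply.c following builtin/add.c would be stored as "strip 4 bytes, append pply.c", since the two names share the prefix builtin/a.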

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 16:24:46 -07:00
Junio C Hamano
f136f7bfe8 read-cache.c: move code to copy incore to ondisk cache to a helper function
This makes the change in a later patch look less scary.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 16:24:46 -07:00
Junio C Hamano
3fc22b5331 read-cache.c: move code to copy ondisk to incore cache to a helper function
This makes the change in a later patch look less scary.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 16:24:46 -07:00
Junio C Hamano
0136bac9b8 read-cache.c: report the header version we do not understand
Instead of just saying "bad index version", report the value we read
from the disk.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 16:24:45 -07:00
Junio C Hamano
936f53d055 read-cache.c: make create_from_disk() report number of bytes it consumed
The function is the one that is reading from the data stream. It only is
natural to make it responsible for reporting this number, not the caller.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 16:24:45 -07:00
Junio C Hamano
d60c49c2d7 read-cache.c: allow unaligned mapping of the index file
Both the on-disk format v2 and v3 pad the "name" field to a multiple of
eight to make sure that various quantities in network long/short type can
be accessed with ntohl/ntohs without having to worry about alignment, but
this forces us to waste disk I/O bandwidth.

Introduce ntoh_s()/ntoh_l() macros that the callers can use as if they were
the regular ntohs()/ntohl() on a field that may not be aligned correctly.
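
One way such an accessor might work is to assemble the value from individual bytes, which never requires an aligned load. The sketch below shows that idea in Python, where alignment is not a concern to begin with; it is not the definition of the ntoh_s()/ntoh_l() macros, and the names read_be16 and read_be32 are hypothetical.

```python
def read_be16(buf: bytes, offset: int) -> int:
    """Read a big-endian (network order) 16-bit value at any byte offset."""
    b0, b1 = buf[offset:offset + 2]
    return (b0 << 8) | b1


def read_be32(buf: bytes, offset: int) -> int:
    """Read a big-endian (network order) 32-bit value at any byte offset."""
    b0, b1, b2, b3 = buf[offset:offset + 4]
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3
```

Both helpers give the same result whether or not the offset is a multiple of the field size.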

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 16:24:45 -07:00
Junio C Hamano
db3b313c84 cache.h: hide on-disk index details
The on-disk format of the index file is a detail whose implementation is
neatly encapsulated in read-cache.c; there is no need to expose it to the
general public that include the cache.h header file.

Also add a prominent mark to read-cache.c to delineate the parts that deal
with the index file I/O routines from the remainder of the file.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 16:24:45 -07:00
Junio C Hamano
d2c1898571 varint: make it available outside the context of pack
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 16:24:44 -07:00
Junio C Hamano
e5056c05ec Git 1.7.10-rc4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-03 09:30:58 -07:00
Junio C Hamano
ca2b71a00b Merge branch 'pt/gitk'
* pt/gitk:
  gitk: fix setting font display with new tabbed dialog layout.
  gitk: fix tabbed preferences construction when using tcl 8.4
2012-04-02 15:06:25 -07:00
Ivan Todoroski
078b895fef fetch-pack: new --stdin option to read refs from stdin
If a remote repo has too many tags (or branches), cloning it over the
smart HTTP transport can fail because remote-curl.c puts all the refs
from the remote repo on the fetch-pack command line. This can make the
command line longer than the global OS command line limit, causing
fetch-pack to fail.

This is especially a problem on Windows where the command line limit is
orders of magnitude shorter than Linux. There are already real repos out
there that msysGit cannot clone over smart HTTP due to this problem.

Here is an easy way to trigger this problem:

	git init too-many-refs
	cd too-many-refs
	echo bla > bla.txt
	git add .
	git commit -m test
	sha=$(git rev-parse HEAD)
	tag=$(perl -e 'print "bla" x 30')
	for i in `seq 50000`; do
		echo $sha refs/tags/$tag-$i >> .git/packed-refs
	done

Then share this repo over the smart HTTP protocol and try cloning it:

	$ git clone http://localhost/.../too-many-refs/.git
	Cloning into 'too-many-refs'...
	fatal: cannot exec 'fetch-pack': Argument list too long

50k tags is obviously an absurd number, but it is required to
demonstrate the problem on Linux because it has a much more generous
command line limit. On Windows the clone fails with as little as 500
tags in the above loop, which is getting uncomfortably close to the
number of tags you might see in real long lived repos.

This is not just theoretical, msysGit is already failing to clone our
company repo due to this. It's a large repo converted from CVS, nearly
10 years of history.

Four possible solutions were discussed on the Git mailing list (in no
particular order):

1) Call fetch-pack multiple times with smaller batches of refs.

This was dismissed as inefficient and inelegant.

2) Add option --refs-fd=$n to pass an fd from where to read the refs.

This was rejected because inheriting descriptors other than
stdin/stdout/stderr through exec() is apparently problematic on Windows,
plus it would require changes to the run-command API to open extra
pipes.

3) Add option --refs-from=$tmpfile to pass the refs using a temp file.

This was not favored because of the temp file requirement.

4) Add option --stdin to pass the refs on stdin, one per line.

In the end this option was chosen as the most efficient and most
desirable from scripting perspective.

There was however a small complication when using stdin to pass refs to
fetch-pack. The --stateless-rpc option to fetch-pack also uses stdin for
communication with the remote server.

If we are going to sneak refs on stdin line by line, it would have to be
done very carefully in the presence of --stateless-rpc, because when
reading refs line by line we might read ahead too much data into our
buffer and eat some of the remote protocol data which is also coming on
stdin.

One way to solve this would be to refactor get_remote_heads() in
fetch-pack.c to accept a residual buffer from our stdin line parsing
above, but this function is used in several places so other callers
would be burdened by this residual buffer interface even when most of
them don't need it.

In the end we settled on the following solution:

If --stdin is specified without --stateless-rpc, fetch-pack would read
the refs from stdin one per line, in a script friendly format.

However if --stdin is specified together with --stateless-rpc,
fetch-pack would read the refs from stdin in packetized format
(pkt-line) with a flush packet terminating the list of refs. This way we
can read the exact number of bytes that we need from stdin, and then
get_remote_heads() can continue reading from the same fd without losing
a single byte of remote protocol data.

This way the --stdin option only loses generality and scriptability when
used together with --stateless-rpc, which is not easily scriptable
anyway because it also uses pkt-line when talking to the remote server.
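
Why the packetized form avoids reading ahead can be shown with a short sketch. It is written in Python and is not the fetch-pack code: the function names are hypothetical, and the assumption that each ref is sent with a trailing newline is made only for the example. What it relies on is the pkt-line framing itself, a four-digit hexadecimal length that counts its own four bytes, with "0000" as the flush packet.

```python
FLUSH_PKT = b"0000"


def write_refs_pkt_line(refs):
    """Encode refs as pkt-lines followed by a flush packet."""
    out = bytearray()
    for ref in refs:
        payload = ref.encode() + b"\n"
        # The 4-hex-digit length includes the length field itself.
        out += b"%04x" % (len(payload) + 4)
        out += payload
    out += FLUSH_PKT
    return bytes(out)


def read_refs_pkt_line(stream):
    """Read refs up to the flush packet, consuming exactly those bytes."""
    refs = []
    while True:
        header = stream.read(4)
        if len(header) != 4:
            raise ValueError("truncated pkt-line header")
        length = int(header, 16)
        if length == 0:
            return refs  # flush packet: the ref list is complete
        if length < 4:
            raise ValueError("invalid pkt-line length")
        payload = stream.read(length - 4)
        if len(payload) != length - 4:
            raise ValueError("truncated pkt-line payload")
        refs.append(payload.rstrip(b"\n").decode())
```

Because every read is for a length known in advance, whatever follows the flush packet on the same stream is left untouched for the next reader, which is the property the --stateless-rpc case needs.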

Signed-off-by: Ivan Todoroski <grnch@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-02 13:47:15 -07:00
Junio C Hamano
d82829b612 Sync with 1.7.9.6
2012-04-02 13:11:49 -07:00