34786 Commits

Author SHA1 Message Date
Felipe Contreras
3e9b9cb117 fast-export: refactor get_tags_and_duplicates()
Split into a separate helper function get_commit() so that the part that
finds the relevant commit, and the part that does something with it
(handle tag object, etc.) are in different places.

No functional changes.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 12:42:25 -07:00
Felipe Contreras
1d844ee7bd fast-export: make extra_refs global
There's no need to pass it around everywhere. Making it global will make
further refactoring that uses this variable easier.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 12:39:17 -07:00
Felipe Contreras
d0423ddd77 t: branch: fix broken && chains
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 12:14:29 -07:00
Felipe Contreras
002ba0376b t: branch: fix typo
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 12:14:28 -07:00
Felipe Contreras
140cd84593 t: branch: trivial style fix
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 12:14:26 -07:00
Matthieu Moy
f19f5e60f6 git-remote-mediawiki: no need to update private ref in non-dumb push
We used to update the private ref ourselves, but this update is now
done by default since 664059fb (transport-helper: update remote
helper namespace, 2013-04-17).

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 11:58:17 -07:00
Matthieu Moy
aa38dc68ea git-remote-mediawiki: use no-private-update capability on dumb push
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 11:58:12 -07:00
Matthieu Moy
597b831afb transport-helper: add no-private-update capability
Since 664059fb (transport-helper: update remote helper namespace,
2013-04-17), a 'push' operation on a remote helper updates the
private ref by default. This is often a good thing, but it can also
be desirable to disable this update to force the next 'pull' to
re-import the pushed revisions.

Allow remote-helpers to disable the automatic update by introducing a new
capability.
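
The effect of the new capability can be sketched roughly as follows. This is a toy model, not the transport-helper code itself: the struct and the `toy_*` function names are hypothetical, and only the capability string "no-private-update" comes from the commit above.

```c
#include <string.h>

/* Hypothetical subset of the state a transport helper might keep. */
struct toy_helper_caps {
    int push;               /* helper advertised "push" */
    int no_private_update;  /* helper advertised "no-private-update" */
};

/* Record one capability line advertised by a remote helper. */
void toy_parse_capability(struct toy_helper_caps *caps, const char *line)
{
    if (!strcmp(line, "push"))
        caps->push = 1;
    else if (!strcmp(line, "no-private-update"))
        caps->no_private_update = 1;
    /* Unknown capabilities are ignored in this sketch. */
}

/* After a successful push, should the private ref be updated? */
int toy_should_update_private_ref(const struct toy_helper_caps *caps)
{
    /* The update stays the default; the capability opts out of it. */
    return !caps->no_private_update;
}
```

Keeping the update as the default means helpers that do not know about the capability behave exactly as before.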

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 11:57:53 -07:00
Felipe Contreras
cf99a761d3 sha1-name: pass len argument to interpret_branch_name()
This is useful to make sure we don't step outside the boundaries of what
we are interpreting at the moment. For example while interpreting
foobar@{u}~1, the job of interpret_branch_name() ends right before ~1,
but there's no way to figure that out inside the function, unless the
len argument is passed.

So let's do that.
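
To illustrate why an explicit length helps, here is a minimal sketch of a bounded search. The function name `find_upstream_marker` is made up for this example and is not part of sha1-name; it only shows how a len argument keeps a parser from looking past the part of the string it owns.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the offset of an "@{u}" marker that lies entirely within the
 * first len bytes of name, or -1 if there is none in that range.
 * Bytes at or beyond name[len] are never examined.
 */
long find_upstream_marker(const char *name, size_t len)
{
    static const char marker[] = "@{u}";
    const size_t mlen = sizeof(marker) - 1;
    size_t i;

    for (i = 0; i + mlen <= len; i++) {
        if (!memcmp(name + i, marker, mlen))
            return (long)i;
    }
    return -1;
}
```

With the string foobar@{u}~1, a caller that passes len 10 hands over only foobar@{u}, so the trailing ~1 stays outside the range the function inspects.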

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 11:33:00 -07:00
Nguyễn Thái Ngọc Duy
487a2b7322 Make setup_git_env() resolve .git file when $GIT_DIR is not specified
This makes reinitializing on a .git file repository work.

This is probably the only case that setup_git_env() (via
set_git_dir()) is called on a .git file. Other cases in
setup_git_dir_gently() and enter_repo() both cover .git file case
explicitly because they need to verify the target repo is valid.

Reported-by: Ximin Luo <infinity0@gmx.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 11:14:58 -07:00
Jeff King
ed016612e6 pager: turn on "cat" optimization for DEFAULT_PAGER
If the user specifies a pager of "cat" (or the empty
string), whether it is in the environment or from config, we
automagically optimize it out to mean "no pager" and avoid
forking at all. We treat an empty pager variable similarly.

However, we did not apply this optimization when
DEFAULT_PAGER was set to "cat" (or the empty string). There
is no reason to treat DEFAULT_PAGER any differently. The
optimization should not be user-visible (unless the user has
a bizarre "cat" in their PATH). And even if it is, we are
better off behaving consistently between the compile-time
default and the environment and config settings.

The stray "else" we are removing from this code was
introduced by 402461a (pager: do not fork a pager if PAGER
is set to empty., 2006-04-16). At that time, the line
directly above used:

    if (!pager)
        pager = "less";

as a fallback, meaning that it could not possibly trigger
the optimization. Later, a3d023d (Provide a build time
default-pager setting, 2009-10-30) turned that constant into
a build-time setting which could be anything, but didn't
loosen the "else" to let DEFAULT_PAGER use the optimization.
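
The resulting control flow can be sketched as below. This is a simplified illustration, not the code in pager.c: `choose_pager` and its three parameters are hypothetical, and the real lookup consults more sources than the two shown before falling back to the build-time default.

```c
#include <stddef.h>
#include <string.h>

/*
 * Pick the pager to run, or return NULL for "no pager".
 * env_pager and config_pager may be NULL when unset;
 * default_pager stands in for the build-time DEFAULT_PAGER.
 */
const char *choose_pager(const char *env_pager,
                         const char *config_pager,
                         const char *default_pager)
{
    const char *pager = env_pager;

    if (!pager)
        pager = config_pager;
    if (!pager)
        pager = default_pager;
    /* No "else" here: the check covers the fallback value as well. */
    if (!pager || !*pager || !strcmp(pager, "cat"))
        return NULL;
    return pager;
}
```

Because the final check is no longer tied to an "else" branch, a default of "cat" or the empty string is optimized out in the same way as a user-supplied one.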

Noticed-by: Dale R. Worley <worley@alum.mit.edu>
Suggested-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 10:36:12 -07:00
Junio C Hamano
5d21adcbfe contrib/remote-helpers: quote variable references in redirection targets
Even though it is not required by POSIX to double-quote the
redirection target in a variable, our code does so because some
versions of bash issue a warning without the quotes.

Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 10:25:58 -07:00
Junio C Hamano
ff867963f0 contrib/remote-helpers: style updates for test scripts
During the review of the main series it was noticed that these test
scripts can use updates to conform to our coding style better, but
fixing the style should be done in a patch separate from the main
series.

This updates the test-*.sh scripts only for style issues:

 * We do not leave SP between a redirection operator and the
   filename;

 * We start a new line before "then", "do", etc. rather than terminating
   the condition for "if"/"while" and the list for "for" with a
   semicolon;

 * When HERE document does not use any expansion, we quote the end
   marker (e.g. "cat <<\EOF" not "cat <<EOF") to signal the readers
   that there is no funny substitution to worry about when reading
   the code.

 * We use "test" rather than "[".

Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03 10:25:19 -07:00
Felipe Contreras
d521abf890 add: trivial style cleanup
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 20:59:18 -07:00
Felipe Contreras
4e83ab3e8d reset: trivial style cleanup
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 20:59:04 -07:00
Felipe Contreras
82a0672f8e branch: trivial style fix
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 20:58:49 -07:00
Felipe Contreras
f38798f48d reset: trivial refactoring
After commit 3fde386 (reset [--mixed]: use diff-based reset whether or
not pathspec was given), some code can be moved to the 'reset_type ==
MIXED' check.

Let's move the code that is specific to MIXED.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 20:58:43 -07:00
Brad King
9bbb0fa1fd refs: report ref type from lock_any_ref_for_update
Expose lock_ref_sha1_basic's type_p argument to callers of
lock_any_ref_for_update.  Update all call sites to ignore it by passing
NULL for now.

Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 14:57:28 -07:00
Brad King
2be778a8ac reset: rename update_refs to reset_refs
The function resets refs rather than doing arbitrary updates.
Rename it to allow a future general-purpose update_refs function
to be added.

Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 14:57:27 -07:00
Ævar Arnfjörð Bjarmason
fd87004e51 gitweb: Fix the author initials in blame for non-ASCII names
Change the @author_initials feature Jakub added in
v1.6.4-rc2-14-ga36817b to match non-ASCII author initials as intended.

The regexp Jakub added was intended to match
non-ASCII (/\b([[:upper:]])\B/g). But in Perl this doesn't actually
match non-ASCII upper-case characters unless the string being matched
against has the UTF8 flag.

So when we open a pipe to "git blame" we need to mark the file
descriptor we're opening as utf8 explicitly.

So as a result it abbreviates me to "AB" not "ÆAB", entirely because "Æ"
isn't /[[:upper:]]/ unless the string being matched against has the UTF8
flag.

Here's something that demonstrates the issue:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    binmode STDOUT, ':utf8' if $ENV{UTF8};
    open my $fd, "-|", "git", "blame", "--incremental", "--", "Makefile" or die "Can't open: $!";
    binmode $fd, ":utf8" if $ENV{UTF8};
    while (my $line = <$fd>) {
    	next unless my ($author) = $line =~ /^author (.*)/;
    	my @author_initials = ($author =~ /\b([[:upper:]])\B/g);
    	printf "%s (%s)\n",  join("", @author_initials), $author;
    }

When that's run with and without UTF8 being true in the environment it
gives, on git.git:

    $ UTF8=0 perl author-initials.pl | sort | uniq -c |
    sort -nr | head -n 5
         99 JH (Junio C Hamano)
         35 JN (Jonathan Nieder)
         35 JK (Jeff King)
         20 JS (Johannes Schindelin)
         16 AB (Ævar Arnfjörð Bjarmason)
    $ UTF8=1 perl author-initials.pl | sort | uniq -c |
    sort -nr | head -n 5
         99 JH (Junio C Hamano)
         35 JN (Jonathan Nieder)
         35 JK (Jeff King)
         20 JS (Johannes Schindelin)
         16 ÆAB (Ævar Arnfjörð Bjarmason)

Acked-by: Jakub Narębski <jnareb@gmail.com>
Tested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Tested-by: Simon Ruderich <simon@ruderich.org>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 14:55:04 -07:00
Jeff King
45e8a74873 has_sha1_file: re-check pack directory before giving up
When we read a sha1 file, we first look for a packed
version, then a loose version, and then re-check the pack
directory again before concluding that we cannot find it.
This lets us handle a process that is writing to the
repository simultaneously (e.g., receive-pack writing a new
pack followed by a ref update, or git-repack packing
existing loose objects into a new pack).

However, we do not do the same trick with has_sha1_file; we
only check the packed objects once, followed by loose
objects. This means that we might incorrectly report that we
do not have an object, even though we could find it if we
simply re-checked the pack directory.

By itself, this is usually not a big deal. The other process
is running simultaneously, so we may run has_sha1_file
before it writes, anyway. It is a race whether we see the
object or not.  However, we may also see other things
the writing process has done (like updating refs); and in
that case, we must be able to also see the new objects.

For example, imagine we are doing a for_each_ref iteration,
and somebody simultaneously pushes. Receive-pack may write
the pack and update a ref after we have examined the
objects/pack directory, but before the iteration gets to the
updated ref. When we do finally see the updated ref,
for_each_ref will call has_sha1_file to check whether the
ref is broken. If has_sha1_file returns the wrong answer, we
erroneously will think that the ref is broken.

For a normal iteration without DO_FOR_EACH_INCLUDE_BROKEN,
this means that the caller does not see the ref at all
(neither the old nor the new value).  So not only will we
fail to see the new value of the ref (which is acceptable,
since we are running simultaneously with the writer, and we
might well read the ref before the writer commits its
write), but we will not see the old value either. For
programs that act on reachability like pack-objects or
prune, this can cause data loss, as we may see the objects
referenced by the original ref value as dangling (and either
omit them from the pack, or delete them via prune).
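
The lookup order described above can be modeled with a small toy. Everything here is hypothetical (the struct, its fields, and the `toy_*` names); it only illustrates the "packed, then loose, then re-scan the pack directory" sequence, not the real object-lookup code.

```c
#include <stddef.h>

/* Hypothetical toy object store; none of these names are git's own. */
struct toy_store {
    int packed_known;    /* packs we have already scanned contain the object */
    int packed_on_disk;  /* a pack on disk contains it (maybe not yet scanned) */
    int loose;           /* a loose object file exists */
    int rescans;         /* how many times we re-read the pack directory */
};

/* Refresh our view of the pack directory. */
static void toy_rescan_packs(struct toy_store *s)
{
    s->packed_known = s->packed_on_disk;
    s->rescans++;
}

/* Returns 1 if the object can be found, 0 otherwise. */
int toy_has_object(struct toy_store *s)
{
    if (s->packed_known)
        return 1;
    if (s->loose)
        return 1;
    /* A concurrent writer may have added a pack since our first scan. */
    toy_rescan_packs(s);
    return s->packed_known;
}
```

The re-scan happens only on the failure path, so lookups that succeed on the first try pay nothing extra.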

There's no test included here, because the success case is
two processes running simultaneously forever. But you can
replicate the issue with:

  # base.sh
  # run this in one terminal; it creates and pushes
  # repeatedly to a repository
  git init parent &&
  (cd parent &&

    # create a base commit that will trigger us looking at
    # the objects/pack directory before we hit the updated ref
    echo content >file &&
    git add file &&
    git commit -m base &&

    # set the unpack limit abnormally low, which
    # lets us simulate full-size pushes using tiny ones
    git config receive.unpackLimit 1
  ) &&
  git clone parent child &&
  cd child &&
  n=0 &&
  while true; do
    echo $n >file && git add file && git commit -m $n &&
    git push origin HEAD:refs/remotes/child/master &&
    n=$(($n + 1))
  done

  # fsck.sh
  # now run this simultaneously in another terminal; it
  # repeatedly fscks, looking for us to consider the
  # newly-pushed ref broken. We cannot use for-each-ref
  # here, as it uses DO_FOR_EACH_INCLUDE_BROKEN, which
  # skips the has_sha1_file check (and if it wants
  # more information on the object, it will actually read
  # the object, which does the proper two-step lookup)
  cd parent &&
  while true; do
    broken=`git fsck 2>&1 | grep remotes/child`
    if test -n "$broken"; then
      echo $broken
      exit 1
    fi
  done

Without this patch, the fsck loop fails within a few seconds
(and almost instantly if the test repository actually has a
large number of refs). With it, the two can run
indefinitely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 14:53:45 -07:00
Felipe Contreras
c587d65512 remote-hg: use notes to keep track of Hg revisions
Keep track of Mercurial revisions as Git notes under the 'refs/notes/hg'
ref.  This way, the user can easily see which Mercurial revision
corresponds to a certain Git commit.

Unfortunately, there's no way to efficiently update the notes after
doing an export (push), so they'll have to be updated when importing
(fetching).

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 10:37:23 -07:00
Junio C Hamano
992c38644a Start the post-1.8.4 cycle
It is tentatively called 1.8.5, but it should be an easy matter of
renaming the release-notes file and RelNotes symlink to later call
it 1.9 near the end of the cycle if we wanted to.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 10:16:16 -07:00
Junio C Hamano
f2be2a51f2 Merge branch 'bc/completion-for-bash-3.0'
Some people still use rather old versions of bash, which cannot
grok some constructs like 'printf -v varname' that the prompt and
completion code started to use recently.

* bc/completion-for-bash-3.0:
  contrib/git-prompt.sh: handle missing 'printf -v' more gracefully
  t9902-completion.sh: old Bash still does not support array+=('') notation
  git-completion.bash: use correct Bash/Zsh array length syntax
2013-08-30 10:10:55 -07:00
Junio C Hamano
36d80208c5 Merge branch 'sp/doc-smart-http'
* sp/doc-smart-http:
  Document the HTTP transport protocols
2013-08-30 10:10:52 -07:00
Junio C Hamano
9bb78de519 Merge branch 'mm/war-on-whatchanged'
* mm/war-on-whatchanged:
  whatchanged: document its historical nature
  core-tutorial: trim the section on Inspecting Changes
2013-08-30 10:08:26 -07:00
Junio C Hamano
482bd22d49 Merge branch 'rt/doc-merge-file-diff3'
* rt/doc-merge-file-diff3:
  Documentation/git-merge-file: document option "--diff3"
2013-08-30 10:08:23 -07:00
Junio C Hamano
04d0eb89e3 Merge branch 'mb/docs-favor-en-us'
Declare that the official grammar & spelling of the source of this
project is en_US, but strongly discourage patches only to "fix"
existing en_UK strings to avoid unnecessary churn.

* mb/docs-favor-en-us:
  Provide some linguistic guidance for the documentation.
2013-08-30 10:08:19 -07:00
Junio C Hamano
e30db6dbcf Merge branch 'rj/doc-rev-parse'
* rj/doc-rev-parse:
  rev-parse(1): logically group options
  rev-parse: remove restrictions on some options
2013-08-30 10:08:13 -07:00
Junio C Hamano
55fefe6bbb Merge branch 'hv/config-from-blob'
Portability fix.

* hv/config-from-blob:
  config: do not use C function names as struct members
2013-08-30 10:06:52 -07:00
Junio C Hamano
e250020cd0 Merge branch 'nd/fetch-pack-shallow-fix'
The recent "short-cut clone connectivity check" topic broke a
shallow repository when a fetch operation tries to auto-follow tags.

* nd/fetch-pack-shallow-fix:
  fetch-pack: do not remove .git/shallow file when --depth is not specified
2013-08-30 10:05:55 -07:00
Thorsten Glaser
6897a64b65 fix shell syntax error in template
An if clause must not be empty; add a "colon" command.

Signed-off-by: Thorsten Glaser <t.glaser@tarent.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-30 09:56:30 -07:00
Sebastien Helleu
21860882c8 l10n: fr.po: hotfix for commit 6b388fc
Fix many typos and add some new translations (1277/2080 messages
translated).

Closes git-l10n/git-po/pull/63.

Signed-off-by: Sebastien Helleu <flashcode@flashtux.org>
Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2013-08-30 16:59:29 +08:00
Matthieu Moy
8987cda9e1 git-remote-mediawiki: add test and check Makefile targets
There are a few level 4 and 2 perlcritic issues in the current code. We
make level 5 fatal, and keep level 2 as warnings.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-29 12:07:24 -07:00
Junio C Hamano
97d01f2a88 config: rewrite core.pager documentation
The text mentions core.pager and GIT_PAGER without giving the
overall picture of precedences.  Borrow a better description from
the git-var(1) documentation.

The use of the mechanism to allow system-wide, global and
per-repository configuration files is not limited to this particular
variable.  Remove it to clarify the paragraph.

Rewrite the part that explains how the environment variable LESS is
set to Git's default value, and how to selectively customize it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-29 12:03:08 -07:00
Felipe Contreras
641a2b5bee remote-helpers: cleanup more global variables
They don't need to be specified if they are not going to be set.

Suggested-by: Dusty Phillips <dusty@linux.ca>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-29 11:40:57 -07:00
Felipe Contreras
670dda85d6 remote-helpers: trivial style fixes
In accordance with pep8.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-29 11:40:56 -07:00
Felipe Contreras
2a6981833d remote-hg: improve basic test
It appears 'let' is not present in all shells.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-29 11:40:55 -07:00
Felipe Contreras
8493fd14b2 remote-hg: add missing &&s in the test
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-29 11:40:54 -07:00
Felipe Contreras
0fdc9b0939 remote-hg: fix test
It wasn't being checked properly before; those refs never existed.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-29 11:40:52 -07:00
Felipe Contreras
a11b0ac9e1 remote-bzr: make bzr branches configurable per-repo
Different repositories have different branches; some are even
branches themselves.

Reported-by: Peter Niederlag <netservice@niekom.de>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-29 11:40:51 -07:00
Felipe Contreras
a8c0b74718 remote-bzr: fix export of utf-8 authors
Reported-by: Joakim Verona <joakim@verona.se>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-29 11:39:45 -07:00
Jeff King
83bd7437ca write_index: optionally allow broken null sha1s
Commit 4337b58 (do not write null sha1s to on-disk index,
2012-07-28) added a safety check preventing git from writing
null sha1s into the index. The intent was to catch errors in
other parts of the code that might let such an entry slip
into the index (or worse, a tree).

Some existing repositories may have invalid trees that
contain null sha1s already, though.  Until 4337b58, a common
way to clean this up would be to use git-filter-branch's
index-filter to repair such broken entries.  That now fails
when filter-branch tries to write out the index.

Introduce a GIT_ALLOW_NULL_SHA1 environment variable to
relax this check and make it easier to recover from such a
history.
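
A rough sketch of such a relaxed check follows. The `toy_*` helpers and the `SHA1_RAWSZ` macro are hypothetical names for this example, and treating any set value of the variable as "allow" is an assumption; only the variable name GIT_ALLOW_NULL_SHA1 comes from the commit above.

```c
#include <stdlib.h>
#include <string.h>

#define SHA1_RAWSZ 20   /* hypothetical name for the 20-byte SHA-1 size */

/* Returns 1 if all 20 bytes of sha1 are zero. */
int toy_is_null_sha1(const unsigned char *sha1)
{
    static const unsigned char null_sha1[SHA1_RAWSZ];

    return !memcmp(sha1, null_sha1, SHA1_RAWSZ);
}

/* Assumption: the variable being set at all relaxes the check. */
int toy_allow_null_from_env(void)
{
    return getenv("GIT_ALLOW_NULL_SHA1") != NULL;
}

/* Returns 0 if the entry may be written, -1 if it must be refused. */
int toy_check_entry(const unsigned char *sha1, int allow_null)
{
    if (!toy_is_null_sha1(sha1))
        return 0;
    return allow_null ? 0 : -1;
}
```

A writer would call toy_check_entry(sha1, toy_allow_null_from_env()) for each entry and abort on -1, so the safety check stays on unless the user opts out explicitly.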

It is tempting to not involve filter-branch in this commit
at all, and instead require the user to manually invoke

	GIT_ALLOW_NULL_SHA1=1 git filter-branch ...

to perform an index-filter on a history with trees with null
sha1s.  That would be slightly safer, but requires some
specialized knowledge from the user.  So let's set the
GIT_ALLOW_NULL_SHA1 variable automatically when checking out
the to-be-filtered trees.  Advice on using filter-branch to
remove such entries already exists in places like
Stack Overflow, and this patch makes it Just Work again on
recent versions of git.

Further commands that touch the index will still notice and
fail, unless they actually remove the broken entries.  A
filter-branch whose filters do not touch the index at all
will not error out (since we complain of the null sha1 only
on writing, not when making a tree out of the index), but
this is acceptable, as we still print a loud warning, so the
problem is unlikely to go unnoticed.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28 20:54:43 -07:00
Ramsay Jones
0f73f8bd79 builtin/fetch.c: Fix a sparse warning
Sparse issues a "'prepare_transport' was not declared. Should it
be static?" warning. In order to suppress the warning, since this
symbol only requires file scope, we simply add the static modifier
to its declaration.

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28 16:55:23 -07:00
Junio C Hamano
286bc123cd diff --no-index: describe in a separate paragraph
The documentation for "diff-files" mode of "git diff" primarily
talks about how changes in the files in the working tree are shown
relative to the contents previously added to that index, and tucks in
an explanation of how "--no-index" mode, which works in a quite
different way, may be implicitly used instead.  Instead, add a
separate paragraph to explain what "--no-index" mode does, and also
mention when "--no-index" can be omitted from the command line
(essentially, when it is obvious from the context).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28 15:17:18 -07:00
Jiang Xin
f85f7947c3 documentation: clarify notes for clean.requireForce
Add "-i" (interactive clean option) to clarify the documentation for
"clean.requireForce" config variable.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28 12:51:46 -07:00
Jeff King
f972a1658a mailmap: handle mailmap blobs without trailing newlines
The read_mailmap_buf function reads each line of the mailmap
using strchrnul, like:

    const char *end = strchrnul(buf, '\n');
    unsigned long linelen = end - buf + 1;

But that's off-by-one when we actually hit the NUL byte; our
line does not have a terminator, and so is only "end - buf"
bytes long. As a result, when we subtract the linelen from
the total len, we end up with (unsigned long)-1 bytes left
in the buffer, and we start reading random junk from memory.

We could fix it with:

    unsigned long linelen = end - buf + !!*end;

but let's take a step back for a moment. It's questionable
in the first place for a function that takes a buffer and
length to be using strchrnul. But it works because we only
have one caller (and are only likely to ever have this one),
which is handing us data from read_sha1_file. Which means
that it's always NUL-terminated.

Instead of tightening the assumptions to make the
buffer/length pair work for a caller that doesn't actually
exist, let's loosen the assumptions to what the real
caller has: a modifiable, NUL-terminated string.

This makes the code simpler and shorter (because we don't
have to correlate strchrnul with the length calculation),
correct (because the code with the off-by-one just goes
away), and more efficient (we can drop the extra allocation
we needed to create NUL-terminated strings for each line,
and just terminate in place).
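
The in-place approach can be sketched like this. It is a minimal illustration under the stated assumption of a modifiable, NUL-terminated buffer; `for_each_line` and its callback signature are hypothetical and are not the mailmap code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split a modifiable, NUL-terminated buffer into lines in place and
 * call fn() on each one (fn may be NULL). Returns the number of lines.
 */
size_t for_each_line(char *buf,
                     void (*fn)(const char *line, void *data),
                     void *data)
{
    size_t count = 0;

    while (*buf) {
        char *end = strchr(buf, '\n');

        if (end)
            *end++ = '\0';            /* terminate this line in place */
        else
            end = buf + strlen(buf);  /* last line, no trailing newline */
        if (fn)
            fn(buf, data);
        count++;
        buf = end;
    }
    return count;
}
```

No length arithmetic is involved, so a final line without a trailing newline needs no special case beyond stopping at the NUL.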

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28 12:33:32 -07:00
Matthijs Kooijman
f21d2a786b Add testcase for needless objects during a shallow fetch
This is a testcase that checks for a problem where, during a specific
shallow fetch where the client does not have any commits that are a
successor of the new shallow root (i.e., the fetch creates a new
detached piece of history), the server would simply send over _all_
objects, instead of taking into account the objects already present in
the client.

The actual problem was fixed by a recent patch series by Nguyễn Thái
Ngọc Duy already.

Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28 11:57:28 -07:00
Nguyễn Thái Ngọc Duy
fbd4a7036d list-objects: mark more commits as edges in mark_edges_uninteresting
The purpose of edge commits is to let pack-objects know what objects
it can use as base, but does not need to include in the thin pack
because the other side is supposed to already have them. So far we
mark uninteresting parents of interesting commits as edges. But even
an unrelated uninteresting commit (that the other side has) may
become a good base for pack-objects and help produce more efficient
packs.

This is especially true for shallow clone, when the client issues a
fetch with a depth smaller than or equal to the number of commits the
server is ahead of the client. For example, in this commit history
the client has up to "A" and the server has up to "B":

    -------A---B
     have--^   ^
              /
       want--+

If depth 1 is requested, the commit list to send to the client
includes only B. The way m_e_u works, it checks whether the parent
commits of B are uninteresting and, if so, marks them as edges.  Due to
the shallow effect, commit B is grafted to have no parents and the
revision walker never sees A as the parent of B. In fact it marks no
edges at all in this simple case and sends everything B has to the
client even if it could have excluded what A and also the client
already have.

In a slightly different case where A is not a direct parent of B
(iow there are commits in between A and B), marking A as an edge can
still save some because B may still have stuff from the far ancestor
A.

There is another case from the earlier patch, when we deepen a ref
from C->E to A->E:

    ---A---B   C---D---E
     want--^   ^       ^
       shallow-+      /
          have-------+

In this case we need to send A and B to the client, and C (i.e. the
current shallow point that the client informs the server) is a very
good base because it's closest to A and B. Normal m_e_u won't recognize
C as an edge because it only looks back to parents (i.e. A<-B) not the
opposite way B->C even if C is already marked as uninteresting commit
by the previous patch.

This patch includes all uninteresting commits from command line as
edges and lets pack-objects decide what's best to do. The upside is we
have better chance of producing better packs in certain cases. The
downside is we may need to process some extra objects on the server
side.

For the shallow case on git.git, when the client is 5 commits behind
and does "fetch --depth=3", the result pack is 99.26 KiB instead of
4.92 MiB.

Reported-and-analyzed-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28 11:54:18 -07:00
Nguyễn Thái Ngọc Duy
e76a5fb459 list-objects: reduce one argument in mark_edges_uninteresting
mark_edges_uninteresting() is always called with this form

  mark_edges_uninteresting(revs->commits, revs, ...);

Remove the first argument and let mark_edges_uninteresting figure that
out by itself. It helps answer the question "are this commit list and
revs related in any way?" when looking at mark_edges_uninteresting
implementation.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28 11:54:18 -07:00