Commit Graph

21388 Commits

Author SHA1 Message Date
Junio C Hamano
0adc030615 Merge branch 'jp/maint-send-email-fold' into maint
* jp/maint-send-email-fold:
  git-send-email.perl: fold multiple entry "Cc:" and multiple single line "RCPT TO:"s
2009-10-23 22:30:42 -07:00
Junio C Hamano
64fb90b707 Merge branch 'jn/maint-1.6.3-check-ref-format-doc' into maint
* jn/maint-1.6.3-check-ref-format-doc:
  Documentation: describe check-ref-format --branch
2009-10-23 22:30:20 -07:00
Junio C Hamano
70ed433c2b Merge branch 'pv/maint-add-p-no-exclude' into maint
* pv/maint-add-p-no-exclude:
  git-add--interactive: never skip files included in index
2009-10-23 22:29:19 -07:00
Jens Lehmann
86140d56c1 add tests for git diff --submodule
Copied from the submodule summary test and changed to reflect the
differences in the output of git diff --submodule.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-23 17:07:33 -07:00
Junio C Hamano
024ab976ff Do not fail "describe --always" in a tag-less repository
This fixes a regression introduce by d68dc34 (git-describe: Die early if
there are no possible descriptions, 2009-08-06).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-23 12:00:41 -07:00
Johannes Sixt
9bccfcdbff Windows: use BLK_SHA1 again
Since NO_OPENSSL is no longer defined on Windows, BLK_SHA1 is not defined
anymore implicitly. Define it explicitly.

As a nice side-effect, we no longer link against libcrypto.dll, which has
non-trivial startup costs because it depends on 6 otherwise unneeded
DLLs.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-23 12:48:31 +02:00
Marius Storm-Olsen
c36e16385b MSVC: Enable OpenSSL, and translate -lcrypto
We don't use crypto, but rather require libeay32 and
ssleay32. handle it in both the Makefile msvc linker
script, and the buildsystem generator.

Signed-off-by: Marius Storm-Olsen <mstormo@gmail.com>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-23 12:48:04 +02:00
Erik Faye-Lund
4192e1cd02 mingw: enable OpenSSL
Since we have OpenSSL in msysgit now, enable it to support SSL
encryption for imap-send.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-23 12:47:37 +02:00
René Scharfe
02edd56b84 Implement wrap format %w() as if it is a mode switch
I always considered line wrapping to be more similar to a colour, i.e. a
state that one can change and that is applied to all following text until
the next state change, except that it's always reset at the end of the
format string.

Here's a patch to implement this behaviour, using Dscho's
strbuf_add_wrapped_text()

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-22 23:33:48 -07:00
Junio C Hamano
00d3947366 Teach --wrap to only indent without wrapping
When a zero or negative width is given to "shortlog -w<width>,<in1>,<in2>"
and --format=%[wrap(w,in1,in2)...%], just indent the text by in1 without
wrapping.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-22 23:20:16 -07:00
Thomas Rast
204d363f5a Quote ' as \(aq in manpages
The docbook/xmlto toolchain insists on quoting ' as \'.  This does
achieve the quoting goal, but modern 'man' implementations turn the
apostrophe into a unicode "proper" apostrophe (given the right
circumstances), breaking code examples in many of our manpages.

Quote them as \(aq instead, which is an "apostrophe quote" as per the
groff_char manpage.

Unfortunately, as Anders Kaseorg kindly pointed out, this is not
portable beyond groff, so we add an extra Makefile variable GNU_ROFF
which you need to enable to get the new quoting.

Thanks also to Miklos Vajna for documentation.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-22 12:59:50 -07:00
Jari Aalto
7c85d27429 Documentation/merge-options.txt: order options in alphabetical groups
Signed-off-by: Jari Aalto <jari.aalto@cante.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-22 12:39:18 -07:00
Jari Aalto
3f7a9b5ad1 Documentation/git-pull.txt: Add subtitles above included option files
Signed-off-by: Jari Aalto <jari.aalto@cante.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-22 12:39:10 -07:00
Junio C Hamano
39eea7bdd9 Fix incorrect error check while reading deflated pack data
The loop in get_size_from_delta() feeds a deflated delta data from the
pack stream _until_ we get inflated result of 20 bytes[*] or we reach the
end of stream.

    Side note. This magic number 20 does not have anything to do with the
    size of the hash we use, but comes from 1a3b55c (reduce delta head
    inflated size, 2006-10-18).

The loop reads like this:

    do {
        in = use_pack();
        stream.next_in = in;
        st = git_inflate(&stream, Z_FINISH);
        curpos += stream.next_in - in;
    } while ((st == Z_OK || st == Z_BUF_ERROR) &&
             stream.total_out < sizeof(delta_head));

This git_inflate() can return:

 - Z_STREAM_END, if use_pack() fed it enough input and the delta itself
   was smaller than 20 bytes;

 - Z_OK, when some progress has been made;

 - Z_BUF_ERROR, if no progress is possible, because we either ran out of
   input (due to corrupt pack), or we ran out of output before we saw the
   end of the stream.

The fix b3118bd (sha1_file: Fix infinite loop when pack is corrupted,
2009-10-14) attempted was against a corruption that appears to be a valid
stream that produces a result larger than the output buffer, but we are
not even trying to read the stream to the end in this loop.  If avail_out
becomes zero, total_out will be the same as sizeof(delta_head) so the loop
will terminate without the "fix".  There is no fix from b3118bd needed for
this loop, in other words.

The loop in unpack_compressed_entry() is quite a different story.  It
feeds a deflated stream (either delta or base) and allows the stream to
produce output up to what we expect but no more.

    do {
        in = use_pack();
        stream.next_in = in;
        st = git_inflate(&stream, Z_FINISH);
        curpos += stream.next_in - in;
    } while (st == Z_OK || st == Z_BUF_ERROR)

This _does_ risk falling into an endless interation, as we can exhaust
avail_out if the length we expect is smaller than what the stream wants to
produce (due to pack corruption).  In such a case, avail_out will become
zero and inflate() will return Z_BUF_ERROR, while avail_in may (or may
not) be zero.

But this is not a right fix:

    do {
        in = use_pack();
        stream.next_in = in;
        st = git_inflate(&stream, Z_FINISH);
+       if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out)
+               break; /* wants more input??? */
        curpos += stream.next_in - in;
    } while (st == Z_OK || st == Z_BUF_ERROR)

as Z_BUF_ERROR from inflate() may be telling us that avail_in has also run
out before reading the end of stream marker.  In such a case, both avail_in
and avail_out would be zero, and the loop should iterate to allow the end
of stream marker to be seen by inflate from the input stream.

The right fix for this loop is likely to be to increment the initial
avail_out by one (we allocate one extra byte to terminate it with NUL
anyway, so there is no risk to overrun the buffer), and break out if we
see that avail_out has become zero, in order to detect that the stream
wants to produce more than what we expect.  After the loop, we have a
check that exactly tests this condition:

    if ((st != Z_STREAM_END) || stream.total_out != size) {
        free(buffer);
        return NULL;
    }

So here is a patch (without my previous botched attempts) to fix this
issue.  The first hunk reverts the corresponding hunk from b3118bd, and
the second hunk is the same fix proposed earlier.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 23:19:47 -07:00
Junio C Hamano
3694209ca1 Merge branch 'maint'
* maint:
  Document `delta` attribute in "git help attributes".
  Mark files in t/t5100 as UTF-8
  Remove a left-over file from t/t5100
2009-10-21 17:33:15 -07:00
Junio C Hamano
8e850a4dbd Merge branch 'gb/maint-gitweb-esc-param'
* gb/maint-gitweb-esc-param:
  gitweb: fix esc_param
2009-10-21 17:32:59 -07:00
Johannes Schindelin
a5ca8367c2 blame: make sure that the last line ends in an LF
This is convenient when parsing multiple the blame of multiple files,
for example:

    git ls-files -z --exclude-standard -- "*.[ch]" |
    xargs --null -n 1 git blame -p > output

and then analyzing the 'output' file using a seperate script.

Currently the parsing is difficult when not all files have a newline
at EOF, this patch ensures that even such files have a newline at the
end of the blame output.

Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
CC: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 17:16:58 -07:00
Björn Gustavsson
550c66f3f9 git-clone.txt: Fix grammar and formatting
Add the missing definite article ("the") in several places.

Change "note to..." to "note for...", since "note to" means that
that the note is addressed to someone (source: Google search).

Change "progressbar" to "progress bar" (source: Wikipedia).

Format git commands, options, and file names consistently using
back quotes (i.e. a fixed font in the resulting HTML document).

Signed-off-by: Björn Gustavsson <bgustavsson@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 17:16:08 -07:00
Ingmar Vanhassel
2a94552887 import-tars: Add support for tarballs compressed with lzma, xz
Also handle the extensions .tlz and .txz, aliases for .tar.lzma and
.tar.xz respectively.

Signed-off-by: Ingmar Vanhassel <ingmar@exherbo.org>
Liked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 17:15:43 -07:00
Junio C Hamano
77e3efbf43 receive-pack: run "gc --auto --quiet" and optionally "update-server-info"
Introduce two new configuration variables, receive.autogc (defaults to
true) and receive.updateserverinfo (defaults to false).  When these are
set, receive-pack runs "gc --auto --quiet" and "update-server-info"
respectively after it finishes receiving data from "git push" and updating
refs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
2009-10-21 15:32:32 -07:00
Junio C Hamano
dad5f89fc5 gc --auto --quiet: make the notice a bit less verboase
When "gc --auto --quiet" decides there is something to do, it tells the
user what it is doing, as it is going to make the user wait for a bit.

But the message was a bit too long.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 15:28:42 -07:00
Junio C Hamano
46148dd7ea git checkout --no-guess
Porcelains may want to make sure their calls to "git checkout" will
reliably fail regardless of the presense of random remote tracking
branches by the new DWIMmery introduced.

Luckily all existing in-tree callers have extra checks to make sure they
feed local branch name when they want to switch, or they explicitly ask to
detach HEAD at the given commit, so there is no need to add this option
for them.

As this is strictly script-only option, do not even bother to document it,
and do bother to hide it from "git checkout -h".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 15:17:24 -07:00
Jari Aalto
6b276e19fa Documentation/fetch-options.txt: order options alphabetically
git-fetch.{1,html} will be helped with this patch

Signed-off-by: Jari Aalto <jari.aalto@cante.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 15:12:01 -07:00
Clemens Buchacher
d504f6975d modernize fetch/merge/pull examples
The "git pull" documentation has examples which follow an outdated
style. Update the examples to use "git merge" where appropriate and
move the examples to the corresponding manpages.

Furthermore,

 - show that pull is equivalent to fetch and merge, which is still a
   frequently asked question,

 - explain the default fetch refspec.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 14:20:50 -07:00
Nasser Grainawi
975457f185 Document delta attribute in "git help attributes".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 14:07:44 -07:00
Johannes Sixt
814a9bfee9 Mark files in t/t5100 as UTF-8
This enables gitk to show the patch text with correct glyphs if the locale
is not UTF-8.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 11:27:50 -07:00
Johannes Sixt
66bcb6a68f Remove a left-over file from t/t5100
This mbox file must have been added by accident in e9fe804 (git-mailinfo:
Fix getting the subject from the in-body [PATCH] line, 2008-07-14).

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-21 11:27:48 -07:00
Junio C Hamano
f29cd3938d fsck: default to "git fsck --full"
Linus and other git developers from the early days trained their fingers
to type the command, every once in a while even without thinking, to check
the consistency of the repository back when the lower core part of the git
was still being developed.  Developers who wanted to make sure that git
correctly dealt with packfiles could deliberately trigger their creation
and checked them after they were created carefully, but loose objects are
the ones that are written by various commands from random codepaths.  It
made some technical sense to have a mode that checked only loose objects
from the debugging point of view for that reason.

Even for git developers, there no longer is any reason to type "git fsck"
every five minutes these days, worried that some newly created objects
might be corrupt due to recent change to git.

The reason we did not make "--full" the default is probably we trust our
filesystems a bit too much.  At least, we trusted filesystems more than we
trusted the lower core part of git that was under development.

Once a packfile is created and we always use it read-only, there didn't
seem to be much point in suspecting that the underlying filesystems or
disks may corrupt them in such a way that is not caught by the SHA-1
checksum over the entire packfile and per object checksum.  That trust in
the filesystems might have been a good tradeoff between fsck performance
and reliability on platforms git was initially developed on and for, but
it may not be true anymore as we run on many more platforms these days.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-20 12:11:39 -07:00
Junio C Hamano
a9d7c9552e Merge branch 'maint'
* maint:
  Documentation/git-gc.txt: change "references" to "reference"
2009-10-20 00:13:13 -07:00
Stephen Boyd
e133d65c3e gitweb: linkify author/committer names with search
It's nice to search for an author by merely clicking on their name in
gitweb. This is usually faster than selecting the name, copying the
selection, pasting it into the search box, selecting between
author/committer and then hitting enter.

Linkify the avatar icon in log/shortlog view because the icon is directly
adjacent to the name and thus more related. The same is not true
when in commit/tag view where the icon is farther away.

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-20 00:08:16 -07:00
Matt Kraai
3ed0b11e7e Documentation/git-gc.txt: change "references" to "reference"
Signed-off-by: Matt Kraai <kraai@ftbfs.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-20 00:01:23 -07:00
Johannes Schindelin
752c0c2492 Add the --submodule option to the diff option family
When you use the option --submodule=log you can see the submodule
summaries inlined in the diff, instead of not-quite-helpful SHA-1 pairs.

The format imitates what "git submodule summary" shows.

To do that, <path>/.git/objects/ is added to the alternate object
databases (if that directory exists).

This option was requested by Jens Lehmann at the GitTogether in Berlin.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:31:00 -07:00
Thomas Rast
b7b10385a8 stash list: drop the default limit of 10 stashes
'git stash list' had an undocumented limit of 10 stashes, unless other
git-log arguments were specified.  This surprised at least one user,
but possibly served to cut the output below a screenful without using
a pager.

Since the last commit, 'git stash list' will fire up a pager according
to the same rules as the 'git log' it calls, so we can drop the limit.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:28:26 -07:00
Thomas Rast
391c53bdcd stash list: use new %g formats instead of sed
With the new formats, we can rewrite 'git stash list' in terms of an
appropriate pretty format, instead of hand-editing with sed.  This has
the advantage that it obeys the normal settings for git-log, notably
the pager.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:28:26 -07:00
Thomas Rast
8f8f5476cd Introduce new pretty formats %g[sdD] for reflog information
Add three new --pretty=format escapes:

  %gD  long  reflog descriptor (e.g. refs/stash@{0})
  %gd  short reflog descriptor (e.g. stash@{0})
  %gs  reflog message

This is achieved by passing down the reflog info, if any, inside the
pretty_print_context struct.

We use the newly refactored get_reflog_selector(), and give it some
extra functionality to extract a shortened ref.  The shortening is
cached inside the commit_reflogs struct; the only allocation of it
happens in read_complete_reflog(), where it is initialised to 0.  Also
add another helper get_reflog_message() for the message extraction.

Note that the --format="%h %gD: %gs" tests may not work in real
repositories, as the --pretty formatter doesn't know to leave away the
": " on the last commit in an incomplete (because git-gc removed the
old part) reflog.  This equivalence is nevertheless the main goal of
this patch.

Thanks to Jeff King for reviews, the %gd testcase and documentation.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:28:26 -07:00
Thomas Rast
72b103fec7 reflog-walk: refactor the branch@{num} formatting
We'll use the same output in an upcoming commit, so refactor its
formatting (which was duplicated anyway) into a separate function.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:28:25 -07:00
Thomas Rast
dd2e794a21 Refactor pretty_print_commit arguments into a struct
pretty_print_commit() has a bunch of rarely-used arguments, and
introducing more of them requires yet another update of all the call
sites.  Refactor most of them into a struct to make future extensions
easier.

The ones that stay "plain" arguments were chosen on the grounds that
all callers put real arguments there, whereas some callers have 0/NULL
for all arguments that were factored into the struct.

We declare the struct 'const' to ensure none of the callers are bitten
by the changed (no longer call-by-value) semantics.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:28:20 -07:00
Erik Faye-Lund
514213bf72 mingw: wrap SSL_set_(w|r)fd to call _get_osfhandle
SSL_set_fd (and friends) expects a OS file handle on Windows, not
a file descriptor as on UNIX(-ish).

This patch makes the Windows version of SSL_set_fd behave like the
UNIX versions, by calling _get_osfhandle on it's input.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:17:36 -07:00
Erik Faye-Lund
f9a88b70f9 imap-send: build imap-send on Windows
Since the POSIX-specific tunneling code has been replaced
by the run-command API (and a compile-error has been
cleaned away), we can now enable imap-send on Windows
builds.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:17:36 -07:00
Erik Faye-Lund
d23b1ecf11 imap-send: fix compilation-error on Windows
mmsystem.h (included from windows.h) defines DRV_OK to 1. To avoid
an error due to DRV_OK redefenition, this patch undefines the old
definition (i.e the one from mmsystem.h) before defining DRV_OK.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:17:36 -07:00
Erik Faye-Lund
c94d2dd080 imap-send: use run-command API for tunneling
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:17:36 -07:00
Erik Faye-Lund
7a7796e9a7 imap-send: use separate read and write fds
This is a patch that enables us to use the run-command
API, which is supported on Windows.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:17:36 -07:00
Jeff King
3a7cba95b7 imap-send: remove useless uid code
The imap-send code is based on code from isync, a program
for syncing imap mailboxes. Because of this, it has
inherited some code that makes sense for isync, but not for
imap-send.

In particular, when storing a message, it does one of:

  - if the server supports it, note the server-assigned
    unique identifier (UID) given to each message

  - otherwise, assigned a random UID and store it in the
    message header as X-TUID

Presumably this is used in isync to be able to synchronize
mailstores multiple times without duplication. But for
imap-send, the values are useless; we never do anything
with them and simply forget them at the end of the program.

This patch removes the useless code. Not only is it nice for
maintainability to get rid of dead code, but the removed
code relied on the existence of /dev/urandom, which made it
a portability problem for non-Unix platforms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:17:36 -07:00
Johan Herland
a099469bbc Add selftests verifying concatenation of multiple notes for the same commit
Also verify that multiple references to the _same_ note blob are _not_
concatenated.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 19:00:24 -07:00
Johan Herland
ef8db638cc Refactor notes code to concatenate multiple notes annotating the same object
Currently, having multiple notes referring to the same commit from various
locations in the notes tree is strongly discouraged, since only one of those
notes will be parsed and shown.

This patch teaches the notes code to _concatenate_ multiple notes that
annotate the same commit. Notes are concatenated by creating a new blob
object containing the concatenation of the notes in question, and
replacing them with the concatenated note in the internal notes tree
structure.

Getting the concatenation right requires being more proactive in unpacking
subtree entries in the internal notes tree structure, so that we don't return
a note prematurely (i.e. before having found all other notes that annotate
the same object). As such, this patch may incur a small performance penalty.

Suggested-by: Sam Vilain <sam@vilain.net>
Re-suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 19:00:24 -07:00
Johan Herland
0057c0917d Add selftests verifying that we can parse notes trees with various fanouts
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 19:00:24 -07:00
Johan Herland
23123aecf8 Teach the notes lookup code to parse notes trees with various fanout schemes
The semantics used when parsing notes trees (with regards to fanout subtrees)
follow Dscho's proposal fairly closely:
- No concatenation/merging of notes is performed. If there are several notes
  objects referencing a given commit, only one of those objects are used.
- If a notes object for a given commit is present in the "root" notes tree,
  no subtrees are consulted; the object in the root tree is used directly.
- If there are more than one subtree that prefix-matches the given commit,
  only the subtree with the longest matching prefix is consulted. This
  means that if the given commit is e.g. "deadbeef", and the notes tree have
  subtrees "de" and "dead", then the following paths in the notes tree are
  searched: "deadbeef", "dead/beef". Note that "de/adbeef" is NOT searched.
- Fanout directories (subtrees) must references a whole number of bytes
  from the SHA1 sum they subdivide. E.g. subtrees "dead" and "de" are
  acceptable; "d" and "dea" are not.
- Multiple levels of fanout are allowed. All the above rules apply
  recursively. E.g. "de/adbeef" is preferred over "de/adbe/ef", etc.

This patch changes the in-memory datastructure for holding parsed notes:
Instead of holding all note (and subtree) entries in a hash table, a
simple 16-tree structure is used instead. The tree structure consists of
16-arrays as internal nodes, and note/subtree entries as leaf nodes. The
tree is traversed by indexing subsequent nibbles of the search key until
a leaf node is encountered. If a subtree entry is encountered while
searching for a note, the subtree is unpacked into the 16-tree structure,
and the search continues into that subtree.

The new algorithm performs significantly better in the cases where only
a fraction of the notes need to be looked up (this is assumed to be the
common case for notes lookup). The new code even performs marginally
better in the worst case (where _all_ the notes are looked up).

In addition to this, comes the massive performance win associated with
organizing the notes tree according to some fanout scheme. Even a simple
2/38 fanout scheme is dramatically quicker to traverse (going from tens of
seconds to sub-second runtimes).

As for memory usage, the new code is marginally better than the old code in
the worst case, but in the case of looking up only some notes from a notes
tree with proper fanout, the new code uses only a small fraction of the
memory needed to hold the entire notes tree.

However, there is one casualty of this patch. The old notes lookup code was
able to parse notes that were associated with non-SHA1s (e.g. refs). The new
code requires the referenced object to be named by a SHA1 sum. Still, this
is not considered a major setback, since the notes infrastructure was not
originally intended to annotate objects outside the Git object database.

Cc: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 19:00:24 -07:00
Johan Herland
27d5756410 Teach notes code to free its internal data structures on request
There's no need to be rude to memory-concious callers...

This patch has been improved by the following contributions:
- Junio C Hamano: avoid old-style declaration

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 19:00:24 -07:00
Johannes Schindelin
8b208f0213 Add '%N'-format for pretty-printing commit notes
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 19:00:24 -07:00
Johan Herland
c56fcc89b9 Add flags to get_commit_notes() to control the format of the note string
This patch adds the following flags to get_commit_notes() for adjusting the
format of the produced note string:
- NOTES_SHOW_HEADER: Print "Notes:" line before the notes contents
- NOTES_INDENT: Indent notes contents by 4 spaces

Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 19:00:24 -07:00