Currently, having multiple notes referring to the same commit from various
locations in the notes tree is strongly discouraged, since only one of those
notes will be parsed and shown.
This patch teaches the notes code to _concatenate_ multiple notes that
annotate the same commit. Notes are concatenated by creating a new blob
object containing the concatenation of the notes in question, and
replacing them with the concatenated note in the internal notes tree
structure.
Getting the concatenation right requires being more proactive in unpacking
subtree entries in the internal notes tree structure, so that we don't return
a note prematurely (i.e. before having found all other notes that annotate
the same object). As such, this patch may incur a small performance penalty.
Suggested-by: Sam Vilain <sam@vilain.net>
Re-suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The semantics used when parsing notes trees (with regards to fanout subtrees)
follow Dscho's proposal fairly closely:
- No concatenation/merging of notes is performed. If there are several notes
objects referencing a given commit, only one of those objects are used.
- If a notes object for a given commit is present in the "root" notes tree,
no subtrees are consulted; the object in the root tree is used directly.
- If there are more than one subtree that prefix-matches the given commit,
only the subtree with the longest matching prefix is consulted. This
means that if the given commit is e.g. "deadbeef", and the notes tree have
subtrees "de" and "dead", then the following paths in the notes tree are
searched: "deadbeef", "dead/beef". Note that "de/adbeef" is NOT searched.
- Fanout directories (subtrees) must references a whole number of bytes
from the SHA1 sum they subdivide. E.g. subtrees "dead" and "de" are
acceptable; "d" and "dea" are not.
- Multiple levels of fanout are allowed. All the above rules apply
recursively. E.g. "de/adbeef" is preferred over "de/adbe/ef", etc.
This patch changes the in-memory datastructure for holding parsed notes:
Instead of holding all note (and subtree) entries in a hash table, a
simple 16-tree structure is used instead. The tree structure consists of
16-arrays as internal nodes, and note/subtree entries as leaf nodes. The
tree is traversed by indexing subsequent nibbles of the search key until
a leaf node is encountered. If a subtree entry is encountered while
searching for a note, the subtree is unpacked into the 16-tree structure,
and the search continues into that subtree.
The new algorithm performs significantly better in the cases where only
a fraction of the notes need to be looked up (this is assumed to be the
common case for notes lookup). The new code even performs marginally
better in the worst case (where _all_ the notes are looked up).
In addition to this, comes the massive performance win associated with
organizing the notes tree according to some fanout scheme. Even a simple
2/38 fanout scheme is dramatically quicker to traverse (going from tens of
seconds to sub-second runtimes).
As for memory usage, the new code is marginally better than the old code in
the worst case, but in the case of looking up only some notes from a notes
tree with proper fanout, the new code uses only a small fraction of the
memory needed to hold the entire notes tree.
However, there is one casualty of this patch. The old notes lookup code was
able to parse notes that were associated with non-SHA1s (e.g. refs). The new
code requires the referenced object to be named by a SHA1 sum. Still, this
is not considered a major setback, since the notes infrastructure was not
originally intended to annotate objects outside the Git object database.
Cc: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need to be rude to memory-concious callers...
This patch has been improved by the following contributions:
- Junio C Hamano: avoid old-style declaration
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds the following flags to get_commit_notes() for adjusting the
format of the produced note string:
- NOTES_SHOW_HEADER: Print "Notes:" line before the notes contents
- NOTES_INDENT: Indent notes contents by 4 spaces
Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Creating repos with 10/100/1000/10000 commits and notes takes a lot of time.
However, using git-fast-import to do the job is a lot more efficient than
using plumbing commands to do the same.
This patch decreases the overall run-time of this test on my machine from
~3 to ~1 minutes.
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a 'notemodify' subcommand of the 'commit' command. This subcommand
is similar to 'filemodify', except that no mode is supplied (all notes have
mode 0644), and the path is set to the hex SHA1 of the given "comittish".
This enables fast import of note objects along with their associated commits,
since the notes can now be named using the mark references of their
corresponding commits.
The patch also includes a test case of the added functionality.
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "-m" and "-F" options are already the established method
(in both git-commit and git-tag) to specify a commit/tag message
without invoking the editor. This patch teaches "git notes edit"
to respect the same options for specifying a notes message without
invoking the editor.
Multiple "-m" and/or "-F" options are concatenated as separate
paragraphs.
The patch also updates the "git notes" documentation and adds
selftests for the new functionality. Unfortunately, the added
selftests include a couple of lines with trailing whitespace
(without these the test will fail). This may cause git to warn
about "whitespace errors".
This patch has been improved by the following contributions:
- Thomas Rast: fix trailing whitespace in t3301
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-notes have the potential of being pretty expensive, so test with
a lot of commits. A lot. So to make things cheaper, you have to
opt-in explicitely, by setting the environment variable
GIT_NOTES_TIMING_TESTS.
This patch has been improved by the following contributions:
- Junio C Hamano: tests: fix "export var=val"
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To avoid looking up each and every commit in the notes ref's tree
object, which is very expensive, speed things up by slurping the tree
object's contents into a hash_map.
The idea for the hashmap singleton is from David Reiss, initial
benchmarking by Jeff King.
Note: the implementation allows for arbitrary entries in the notes
tree object, ignoring those that do not reference a valid object. This
allows you to annotate arbitrary branches, or objects.
This patch has been improved by the following contributions:
- Junio C Hamano: fixed an obvious error in initialize_hash_map()
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The script 'git notes' allows you to edit and show commit notes, by
calling either
git notes show <commit>
or
git notes edit <commit>
This patch has been improved by the following contributions:
- Tor Arne Vestbø: fix printing of multi-line notes
- Michael J Gruber: test and handle empty notes gracefully
- Thomas Rast:
- only clean up message file when editing
- use GIT_EDITOR and core.editor over VISUAL/EDITOR
- t3301: fix confusing quoting in test for valid notes ref
- t3301: use test_must_fail instead of !
- refuse to edit notes outside refs/notes/
- Junio C Hamano: tests: fix "export var=val"
- Christian Couder: documentation: fix 'linkgit' macro in "git-notes.txt"
- Johan Herland: minor cleanup and bugfixing in git-notes.sh (v2)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Tor Arne Vestbø <tavestbo@trolltech.com>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit notes are blobs which are shown together with the commit
message. These blobs are taken from the notes ref, which you can
configure by the config variable core.notesRef, which in turn can
be overridden by the environment variable GIT_NOTES_REF.
The notes ref is a branch which contains "files" whose names are
the names of the corresponding commits (i.e. the SHA-1).
The rationale for putting this information into a ref is this: we
want to be able to fetch and possibly union-merge the notes,
maybe even look at the date when a note was introduced, and we
want to store them efficiently together with the other objects.
This patch has been improved by the following contributions:
- Thomas Rast: fix core.notesRef documentation
- Tor Arne Vestbø: fix printing of multi-line notes
- Alex Riesen: Using char array instead of char pointer costs less BSS
- Johan Herland: Plug leak when msg is good, but msglen or type causes return
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Tor Arne Vestbø <tavestbo@trolltech.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_commit_notes(): Plug memory leak when 'if' triggers, but not because of read_sha1_file() failure
The newly added function can rewrap text according to a given first-line
indent, other-indent and text width.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
print_wrapped_text() will insert its own newlines. Up until now, if the
text passed to it contained newlines, they would not be handled properly
(the wrapping got confused after that).
The strategy is to replace a single new-line with a space, but keep double
new-lines so that already-wrapped text with empty lines between paragraphs
will be handled properly.
However, single new-line characters are only handled this way if the
character after it is an alphanumeric character, as per Linus' suggestion.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We don't want to use output() on git-commit --amend when rewording the
commit message. This leads to confusion as the editor is run in a
subshell with it's output saved away, leaving the user with a seemingly
frozen terminal.
Fix by removing the output part.
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The format template string was declared as "const void *" for some unknown
reason, even though it obviously is meant to be passed a string. Make it
"const char *".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of the first things that cvsimport does is chdir to the
newly created git repo. This means that any filenames given
to us on the command line will be looked up relative to the
git repo directory. This is probably not what the user
expects, so let's remember and prepend the original
directory for relative filenames.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of the static variable that was used to prevent loading all
the refnames multiple times by moving that code out of describe(),
simply making sure it is only run once that way.
Also change the error message that is shown in case no refnames are
found to not include a hash any more, as the error condition is not
specific to any particular revision.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
git push: say that --tag can't be used with --all or --mirror in help text
git push: remove incomplete options list from help text
document push's new quiet option
Makefile: clean block-sha1/ directory instead of mozilla-sha1/
This replaces an earlier patch by Björn Gustavsson,
Message-ID: <4AD75029.1010109@gmail.com>
Signed-off-by: しらいし ななこ <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git push -h' shows usage text with incomplete list of options and then
has a separate list of options that are supported. Imitate the way other
commands (I looked at 'git diff' for an example) show their options.
Signed-off-by: しらいし ななこ <nanako3@lavabit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'make clean' should remove the object files from block-sha1/
instead of the non-existent mozilla-sha1/ directory.
Signed-off-by: Carlos R. Mafra <crmafra@aei.mpg.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When flipping commits around on topic branches, I often end up doing
this sequence:
* Run "log --oneline next..jc/frotz" to find out the first commit
on 'jc/frotz' branch not yet merged to 'next';
* Run "checkout $that_commit^" to detach HEAD to the parent of it;
* Rebuild the series on top of that commit; and
* "show-branch jc/frotz HEAD" and "diff jc/frotz HEAD" to verify.
Introduce a new syntax to "git checkout" to name the commit to switch to,
to make the first two steps easier. When the branch to switch to is
specified as A...B (you can omit either A or B but not both, and HEAD
is used instead of the omitted side), the merge base between these two
commits are computed, and if there is one unique one, we detach the HEAD
at that commit.
With this, I can say "checkout next...jc/frotz".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'frotz' is not a valid object name and not a tracked filename,
we used to complain and failed this command. When there is only
one remote that has 'frotz' as one of its tracking branches, we can
DWIM it as a request to create a local branch 'frotz' forking from
the matching remote tracking branch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it possible to invole the logic of verify_filename() to make sure the
pathname arguments are unambiguous without actually dying. The caller may
want to do something different.
* jc/maint-blank-at-eof:
diff -B: colour whitespace errors
diff.c: emit_add_line() takes only the rest of the line
diff.c: split emit_line() from the first char and the rest of the line
diff.c: shuffling code around
diff --whitespace: fix blank lines at end
core.whitespace: split trailing-space into blank-at-{eol,eof}
diff --color: color blank-at-eof
diff --whitespace=warn/error: fix blank-at-eof check
diff --whitespace=warn/error: obey blank-at-eof
diff.c: the builtin_diff() deals with only two-file comparison
apply --whitespace: warn blank but not necessarily empty lines at EOF
apply --whitespace=warn/error: diagnose blank at EOF
apply.c: split check_whitespace() into two
apply --whitespace=fix: detect new blank lines at eof correctly
apply --whitespace=fix: fix handling of blank lines at the eof
"git grep" would segfault if its -f option was used because it would
try to use an uninitialized strbuf, so initialize the strbuf.
Thanks to Johannes Sixt <j.sixt@viscovery.net> for the help with the
test cases.
Signed-off-by: Matt Kraai <kraai@ftbfs.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When creating an info/grafts under windows, one typically gets a CRLF file.
There is no good reason to forbid trailing CR at the end of the line (for
that matter, any trailing whitespaces); the code allowed only LF simply
because that was good enough for the platforms with LF line endings.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some types of corruption to a pack may confuse the deflate stream
which stores an object. In Andy's reported case a 36 byte region
of the pack was overwritten, leading to what appeared to be a valid
deflate stream that was trying to produce a result larger than our
allocated output buffer could accept.
Z_BUF_ERROR is returned from inflate() if either the input buffer
needs more input bytes, or the output buffer has run out of space.
Previously we only considered the former case, as it meant we needed
to move the stream's input buffer to the next window in the pack.
We now abort the loop if inflate() returns Z_BUF_ERROR without
consuming the entire input buffer it was given, or has filled
the entire output buffer but has not yet returned Z_STREAM_END.
Either state is a clear indicator that this loop is not working
as expected, and should not continue.
This problem cannot occur with loose objects as we open the entire
loose object as a single buffer and treat Z_BUF_ERROR as an error.
Reported-by: Andy Isaacson <adi@hexapodia.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
change throughput display units with fast links
clone: Supply the right commit hash to post-checkout when -b is used
remote-curl: add missing initialization of argv0_path
Switch to MiB/s when the connection is fast enough (i.e. on a LAN).
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>