The goal of the patch is to introduce the GNU diff
-B/--ignore-blank-lines as closely as possible. The short option is not
available because it's already used for "break-rewrites".
When this option is used, git-diff will not create hunks that simply
add or remove empty lines, but will still show empty lines
addition/suppression if they are close enough to "valuable" changes.
There are two differences between this option and GNU diff -B option:
- GNU diff doesn't have "--inter-hunk-context", so this must be handled
- The following sequence looks like a bug (context is displayed twice):
$ seq 5 >file1
$ cat <<EOF >file2
change
1
2
3
4
5
change
EOF
$ diff -u -B file1 file2
--- file1 2013-06-08 22:13:04.471517834 +0200
+++ file2 2013-06-08 22:13:23.275517855 +0200
@@ -1,5 +1,7 @@
+change
1
2
+
3
4
5
@@ -3,3 +5,4 @@
3
4
5
+change
So here is a more thorough description of the option:
- real changes are interesting
- blank lines that are close enough (less than context size) to
interesting changes are considered interesting (recursive definition)
- "context" lines are used around each hunk of interesting changes
- If two hunks are separated by less than "inter-hunk-context", they
will be merged into one.
The implementation does the "interesting changes selection" in a single
pass.
Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally it doesn't matter if we show the pre-image or th post-image
for the common parts of a diff because they are the same. If
white-space changes are ignored they can differ, though. The
new text after applying the diff is more interesting in that case,
so show that instead of the old contents.
Note: GNU diff shows the pre-image.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the option -W/--function-context to git diff. It is similar to
the same option of git grep and expands the context of change hunks
so that the whole surrounding function is shown. This "natural"
context can allow changes to be understood better.
Note: GNU patch doesn't like diffs generated with the new option;
it seems to expect context lines to be the same before and after
changes. git apply doesn't complain.
This implementation has the same shortcoming as the one in grep,
namely that there is no way to explicitly find the end of a
function. That means that a few lines of extra context are shown,
right up to the next recognized function begins. It's already
useful in its current form, though.
The function get_func_line() in xdiff/xemit.c is extended to work
forward as well as backward to find post-context as well as
pre-context. It returns the position of the first found matching
line. The func_line parameter is made optional, as we don't need
it for -W.
The enhanced function is then used in xdl_emit_diff() to extend
the context as needed. If the added context overlaps with the
next change, it is merged into the current hunk.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the code to search for a function line to be shown in the hunk
header into its own function and to make returning the length-limited
result string easier, introduce struct func_line.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For each hunk, xdl_find_func searches the preimage for a function name
until the beginning of the file. If the file does not contain any
function names, this search has complexity O(n^2) in the number of
hunks n.
Instead, inline xdl_find_func() and keep track of up to which line we
have scanned already and the contents of the last funcname line that
we have found.
Noticed and a different approach proposed by Clemens Buchacher.
This alternative solution was done by René Scharfe.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
http-push.c::finish_request():
request is initialized by the for loop
index-pack.c::free_base_data():
b is initialized by the for loop
merge-recursive.c::process_renames():
move compare to narrower scope, and remove unused assignments to it
remove unused variable renames2
xdiff/xdiffi.c::xdl_recs_cmp():
remove unused variable ec
xdiff/xemit.c::xdl_emit_diff():
xche is always overwritten
Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge two hunks if there is only the specified number of otherwise unshown
context between them. For --inter-hunk-context=1, the resulting patch has
the same number of lines but shows uninterrupted context instead of a
context header line in between.
Patches generated with this option are easier to read but are also more
likely to conflict if the file to be patched contains other changes.
This patch keeps the default for this option at 0. It is intended to just
make the feature available in order to see its advantages and downsides.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For some users (e.g. git blame), getting textual patch output is just
extra work, as they can get all the information they need from the low-
level diff structures. Allow for an alternate low-level emit function
to be defined to allow bypassing the textual patch generation; set
xemitconf_t's emit_func member to enable this.
The (void (*)()) type is pretty ugly, but the alternative would be to
include most of the private xdiff headers in xdiff.h to get the types
required for the "proper" function prototype. Also, a (void *) won't
work, as ANSI C doesn't allow a function pointer to be cast to an
object pointer.
Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes"diff -p" hunk headers customizable via gitattributes mechanism.
It is based on Johannes's earlier patch that allowed to define a single
regexp to be used for everything.
The mechanism to arrive at the regexp that is used to define hunk header
is the same as other use of gitattributes. You assign an attribute, funcname
(because "diff -p" typically uses the name of the function the patch is about
as the hunk header), a simple string value. This can be one of the names of
built-in pattern (currently, "java" is defined) or a custom pattern name, to
be looked up from the configuration file.
(in .gitattributes)
*.java funcname=java
*.perl funcname=perl
(in .git/config)
[funcname]
java = ... # ugly and complicated regexp to override the built-in one.
perl = ... # another ugly and complicated regexp to define a new one.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time. There are a few files that need
to have trailing whitespaces (most notably, test vectors). The results
still passes the test, and build result in Documentation/ area is unchanged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In xemit.c:xdl_emit_diff() a buffer for showing the function name as
commentary is allocated; this buffer was 40 characters. This is a bit
small; particularly for C++ function names where there is often an
identical prefix (like void LongNamespace::LongClassName) on multiple
functions, which makes the context the same everywhere. In other words
the context is useless. This patch increases that buffer to 80
characters - which may still not be enough, but is better
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes the '#' and '(' tests and adds a '$' test instead although I have
no idea what it is actually good for - but hey, if that's what GNU diff does...
Pasky only went and did as Junio sayeth.
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes trailing blanks from git-generated diff headers
the same way a similar patch did that for GNU diff:
http://article.gmane.org/gmane.comp.gnu.utils.bugs/13839
That is, it removes trailing blanks on the hunk header line that
shows the function name.
Signed-off-by: Jim Meyering <jim@meyering.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fairly trivial patch adds a new XDL_EMIT_xxx flag to tell libxdiff
that we don't want to generate the _diff_ between two files, we want to
see the lines that are _common_ to two files.
So when you set XDL_EMIT_COMMON, xdl_diff() will do everything exactly
like it used to do, but the output records it generates just contain the
lines that aren't part of the diff.
This is for doing things like generating the common base case for a file
that was added in both branches.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The speed of the built-in diff generator is nice; but the function names
shown by `diff -p' are /really/ nice. And I hate having to choose. So,
we hack xdiff to find the function names and print them.
xdiff has grown a flag to say whether to dig up the function names. The
builtin_diff function passes this flag unconditionally. I suppose it
could parse GIT_DIFF_OPTS, but it doesn't at the moment. I've also
reintroduced the `function name' into the test suite, from which it was
removed in commit 3ce8f089.
The function names are parsed by a particularly stupid algorithm at the
moment: it just tries to find a line in the `old' file, from before the
start of the hunk, whose first character looks plausible. Still, it's
most definitely a start.
Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This uses a simplified libxdiff setup to generate unified diffs _without_
doing fork/execve of GNU "diff".
This has several huge advantages, for example:
Before:
[torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null
real 0m24.818s
user 0m13.332s
sys 0m8.664s
After:
[torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null
real 0m4.563s
user 0m2.944s
sys 0m1.580s
and the fact that this should be a lot more portable (ie we can ignore all
the issues with doing fork/execve under Windows).
Perhaps even more importantly, this allows us to do diffs without actually
ever writing out the git file contents to a temporary file (and without
any of the shell quoting issues on filenames etc etc).
NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the
current "diff-core" code actually will always write the temp-files,
because it used to be something that you simply had to do. So this current
one actually writes a temp-file like before, and then reads it into memory
again just to do the diff. Stupid.
But if this basic infrastructure is accepted, we can start switching over
diff-core to not write temp-files, which should speed things up even
further, especially when doing big tree-to-tree diffs.
Now, in the interest of full disclosure, I should also point out a few
downsides:
- the libxdiff algorithm is different, and I bet GNU diff has gotten a
lot more testing. And the thing is, generating a diff is not an exact
science - you can get two different diffs (and you will), and they can
both be perfectly valid. So it's not possible to "validate" the
libxdiff output by just comparing it against GNU diff.
- GNU diff does some nice eye-candy, like trying to figure out what the
last function was, and adding that information to the "@@ .." line.
libxdiff doesn't do that.
- The libxdiff thing has some known deficiencies. In particular, it gets
the "\No newline at end of file" case wrong. So this is currently for
the experimental branch only. I hope Davide will help fix it.
That said, I think the huge performance advantage, and the fact that it
integrates better is definitely worth it. But it should go into a
development branch at least due to the missing newline issue.
Technical note: this is based on libxdiff-0.17, but I did some surgery to
get rid of the extraneous fat - stuff that git doesn't need, and seriously
cutting down on mmfile_t, which had much more capabilities than the diff
algorithm either needed or used. In this version, "mmfile_t" is just a
trivial <pointer,length> tuple.
That said, I tried to keep the differences to simple removals, so that you
can do a diff between this and the libxdiff origin, and you'll basically
see just things getting deleted. Even the mmfile_t simplifications are
left in a state where the diffs should be readable.
Apologies to Davide, whom I'd love to get feedback on this all from (I
wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old
complex format had a helper function for that, but I did my surgery with
the goal in mind that eventually we _should_ just do
mmfile_t mf;
buf = read_sha1_file(sha1, type, &size);
mf->ptr = buf;
mf->size = size;
.. use "mf" directly ..
which was really a nightmare with the old "helpful" mmfile_t, and really
is that easy with the new cut-down interfaces).
[ Btw, as any hawk-eye can see from the diff, this was actually generated
with itself, so it is "self-hosting". That's about all the testing it
has gotten, along with the above kernel diff, which eye-balls correctly,
but shows the newline issue when you double-check it with "git-apply" ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>