When file renames/copies detection is turned on, the
second diffcore_std will degrade a 'C' pair to a 'R' pair.
And this may happen when we run 'git log --follow' with
hard copies finding. That is, the try_to_follow_renames()
will run diffcore_std to find the copies, and then
'git log' will issue another diffcore_std, which will reduce
'src->rename_used' and recognize this copy as a rename.
This is not what we want.
So, I think we really don't need to run diffcore_std more
than one time.
Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the diff_queue_struct code, this macro help
to reset the structure.
Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
xdi_diff_outf() overrides the structure members of its last parameter,
ignoring any value that callers pass in. It's no surprise then that all
callers pass a pointer to an uninitialized structure. They also don't
read it after the call, so the parameter is neither used for input nor
for output. Turn it into a local variable of xdi_diff_outf().
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since the xdiff library had been introduced to git, all its callers
have used the flag XDF_NEED_MINIMAL. It makes sure that the smallest
possible diff is produced, but that takes quite some time if there are
lots of differences that can be expressed in multiple ways.
This flag makes a difference for only 0.1% of the non-merge commits in
the git repo of Linux, both in terms of diff size and execution time.
The patches there are mostly nice and small.
SungHyun Nam however reported a case in a different repo where a diff
took more than 20 times longer to generate with XDF_NEED_MINIMAL than
without. Rebasing became really slow.
This patch removes this flag from all callers. The default of xdiff is
saner because it has minimal to no impact in the normal case of small
diffs and doesn't incur that much of a speed penalty for large ones.
A follow-up patch may introduce a command line option to set the flag if
the user needs it, similar to GNU diff's -d/--minimal.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diffstat "added" and "changed" fields generally store
line counts; however, for binary files, they store file
sizes. Since we store and print these values as ints, a
diffstat on a file larger than 2G can show a negative size.
Instead, let's use uintmax_t, which should be at least 64
bits on modern platforms.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This teaches the --color-words engine a more general interface that
supports two new modes:
* --word-diff=plain, inspired by the 'wdiff' utility (most similar to
'wdiff -n <old> <new>'): uses delimiters [-removed-] and {+added+}
* --word-diff=porcelain, which generates an ad-hoc machine readable
format:
- each diff unit is prefixed by [-+ ] and terminated by newline as
in unified diff
- newlines in the input are output as a line consisting only of a
tilde '~'
Both of these formats still support color if it is enabled, using it
to highlight the differences. --color-words becomes a synonym for
--word-diff=color, which is the color-only format. Also adds some
compatibility/convenience options.
Thanks to Junio C Hamano and Miles Bader for good ideas.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
builtin_diff calls fill_mmfile fairly early, which in turn
calls diff_populate_filespec, which actually retrieves the
file's blob contents into a buffer. Long ago, this was
sensible as we would need to look at the blobs eventually.
These days, however, we may not ever want those blobs if we
end up using a textconv cache, and for large binary files
(exactly the sort for which you might have a textconv
cache), just retrieving the objects can be costly.
This patch just pushes the fill_mmfile call a bit later, so
we can avoid populating the filespec in some cases. There
is one thing to note that looks like a bug but isn't. We
push the fill_mmfile down into the first branch of a
conditional. It seems like we would need it on the other
branch, too, but we don't; fill_textconv does it for us (in
fact, before this, we were just writing over the results of
the fill_mmfile on that branch).
Here's a timing sample on a commit with 45 changed jpgs and
avis. The result is fully textconv cached, but we still
wasted a lot of time just pulling the blobs from storage.
The total size of the blobs (source and dest) is about
180M.
[before]
$ time git show >/dev/null
real 0m0.352s
user 0m0.148s
sys 0m0.200s
[after]
$ time git show >/dev/null
real 0m0.009s
user 0m0.004s
sys 0m0.004s
And that's on a warm cache. On a cold cache, the "after"
case is not much worse, but the "before" case has to do an
extra 180M of I/O.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running a textconv filter can take a long time. It's
particularly bad for a large file which needs to be spooled
to disk, but even for small files, the fork+exec overhead
can add up for something like "git log -p".
This patch uses the notes-cache mechanism to keep a fast
cache of textconv output. Caches are stored in
refs/notes/textconv/$x, where $x is the userdiff driver
defined in gitattributes.
Caching is enabled only if diff.$x.cachetextconv is true.
In my test repo, on a commit with 45 jpg and avi files
changed and a textconv to show their exif tags:
[before]
$ time git show >/dev/null
real 0m13.724s
user 0m12.057s
sys 0m1.624s
[after, first run]
$ git config diff.mfo.cachetextconv true
$ time git show >/dev/null
real 0m14.252s
user 0m12.197s
sys 0m1.800s
[after, subsequent runs]
$ time git show >/dev/null
real 0m0.352s
user 0m0.148s
sys 0m0.200s
So for a slight (3.8%) cost on the first run, we achieve an
almost 40x speed up on subsequent runs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds a fill_textconv wrapper, which centralizes
some minor logic like error checking and handling the case
of no-textconv.
In addition to dropping the number of lines, this will make
it easier in future patches to handle multiple types of
textconv.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We correctly free() for the normal diff case, but leak for
rewrite diffs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To make the code simpler, run_textconv lumps all of its
error checking into one conditional. However, the
short-circuit means that an error in reading will prevent us
from calling finish_command, leaving a zombie child.
Clean up properly after errors.
Based-on-work-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jl/submodule-diff-dirtiness:
git status: ignoring untracked files must apply to submodules too
git status: Fix false positive "new commits" output for dirty submodules
Refactor dirty submodule detection in diff-lib.c
git status: Show detailed dirty status of submodules in long format
git diff --submodule: Show detailed dirty status of submodules
Testing if the output "new commits" should appear in the long format of
"git status" is done by comparing the hashes of the diffpair. This always
resulted in printing "new commits" for submodules that contained untracked
or modified content, even if they did not contain new commits. The reason
was that match_stat_with_submodule() did set the "changed" flag for dirty
submodules, resulting in two->sha1 being set to the null_sha1 at the call
sites, which indicates that new commits are present. This is changed so
that when no new commits are present, the same object names are in the
sha1 field for both sides of the filepair, and the working tree side will
have the "dirty_submodule" flag set when appropriate. For a submodule to
be seen as modified even when it just has a dirty work tree, some
conditions had to be extended to also check for the "dirty_submodule"
flag.
Unfortunately the test case that should have found this bug had been
changed incorrectly too. It is fixed and extended to test for other
combinations too.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Moving duplicated code into the new function match_stat_with_submodule().
Replacing the implicit activation of detailed checks for the dirtiness of
submodules when DIFF_FORMAT_PATCH was selected with explicitly setting
the recently added DIFF_OPT_DIRTY_SUBMODULES option in diff_setup_done().
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make git-branch, git-show-branch, git-grep, and all the diff-based
programs accept an optional argument <when> for --color. The argument
is a colorbool: "always", "never", or "auto". If no argument is given,
"always" is used; --no-color is an alias for --color=never. This makes
the command-line interface consistent with other GNU tools, such as `ls'
and `grep', and with the git-config color options. Note that, without
an argument, --color and --no-color work exactly as before.
To implement this, two internal changes were made:
1. Allow the first argument of git_config_colorbool() to be NULL,
in which case it returns -1 if the argument isn't "always", "never",
or "auto".
2. Add OPT_COLOR_FLAG(), OPT__COLOR(), and parse_opt_color_flag_cb()
to the option parsing library. The callback uses
git_config_colorbool(), so color.h is now a dependency
of parse-options.c.
Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.6.6:
dwim_ref: fix dangling symref warning
stash pop: remove 'apply' options during 'drop' invocation
diff: make sure --output=/bad/path is caught
Remove hyphen from "git-command" in two error messages
The option -w tells the diff machinery to inspect the contents to set the
exit status, instead of checking the blob object level difference alone.
However, --quiet tells the diff machinery not to look at the contents, which
means DIFF_FROM_CONTENTS has no chance to inspect the change.
Work it around by calling diff_flush_patch() with output sent to /dev/null.
Signed-off-by: Larry D'Anna <larry@elder-gods.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jl/diff-submodule-ignore:
Teach diff --submodule that modified submodule directory is dirty
git diff: Don't test submodule dirtiness with --ignore-submodules
Make ce_uptodate() trustworthy again
Since commit 8e08b4 git diff does append "-dirty" to the work tree side
if the working directory of a submodule contains new or modified files.
Lets do the same when the --submodule option is used.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/fix-tree-walk:
read-tree --debug-unpack
unpack-trees.c: look ahead in the index
unpack-trees.c: prepare for looking ahead in the index
Aggressive three-way merge: fix D/F case
traverse_trees(): handle D/F conflict case sanely
more D/F conflict tests
tests: move convenience regexp to match object names to test-lib.sh
Conflicts:
builtin-read-tree.c
unpack-trees.c
unpack-trees.h
* jl/submodule-diff:
Performance optimization for detection of modified submodules
git status: Show uncommitted submodule changes too when enabled
Teach diff that modified submodule directory is dirty
Show submodules as modified when they contain a dirty work tree
* maint-1.6.4:
Fix mis-backport of t7002
base85: Make the code more obvious instead of explaining the non-obvious
base85: encode_85() does not use the decode table
base85 debug code: Fix length byte calculation
checkout -m: do not try to fall back to --merge from an unborn branch
branch: die explicitly why when calling "git branch [-a|-r] branchname".
textconv: stop leaking file descriptors
commit: --cleanup is a message option
git count-objects: handle packs bigger than 4G
t7102: make the test fail if one of its check fails
* maint-1.6.3:
base85: Make the code more obvious instead of explaining the non-obvious
base85: encode_85() does not use the decode table
base85 debug code: Fix length byte calculation
checkout -m: do not try to fall back to --merge from an unborn branch
branch: die explicitly why when calling "git branch [-a|-r] branchname".
textconv: stop leaking file descriptors
commit: --cleanup is a message option
git count-objects: handle packs bigger than 4G
t7102: make the test fail if one of its check fails
Conflicts:
builtin-commit.c
* maint-1.6.2:
base85: Make the code more obvious instead of explaining the non-obvious
base85: encode_85() does not use the decode table
base85 debug code: Fix length byte calculation
checkout -m: do not try to fall back to --merge from an unborn branch
branch: die explicitly why when calling "git branch [-a|-r] branchname".
textconv: stop leaking file descriptors
commit: --cleanup is a message option
git count-objects: handle packs bigger than 4G
t7102: make the test fail if one of its check fails
Conflicts:
diff.c
In the worst case is_submodule_modified() got called three times for
each submodule. The information we got from scanning the whole
submodule tree the first time can be reused instead.
New parameters have been added to diff_change() and diff_addremove(),
the information is stored in a new member of struct diff_filespec. Its
value is then reused instead of calling is_submodule_modified() again.
When no explicit "-dirty" is needed in the output the call to
is_submodule_modified() is not necessary when the submodules HEAD
already disagrees with the ref of the superproject, as this alone
marks it as modified. To achieve that, get_stat_data() got an extra
argument.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/run-command-use-shell:
t4030, t4031: work around bogus MSYS bash path conversion
diff: run external diff helper with shell
textconv: use shell to run helper
editor: use run_command's shell feature
run-command: optimize out useless shell calls
run-command: convert simple callsites to use_shell
t0021: use $SHELL_PATH for the filter script
run-command: add "use shell" option
A diff run in superproject only compares the name of the commit object
bound at the submodule paths. When we compare with a work tree and the
checked out submodule directory is dirty (e.g. has either staged or
unstaged changes, or has new files the user forgot to add to the index),
show the work tree side as "dirty".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/sparse: (25 commits)
t7002: test for not using external grep on skip-worktree paths
t7002: set test prerequisite "external-grep" if supported
grep: do not do external grep on skip-worktree entries
commit: correctly respect skip-worktree bit
ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
tests: rename duplicate t1009
sparse checkout: inhibit empty worktree
Add tests for sparse checkout
read-tree: add --no-sparse-checkout to disable sparse checkout support
unpack-trees(): ignore worktree check outside checkout area
unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
unpack-trees.c: generalize verify_* functions
unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
Introduce "sparse checkout"
dir.c: export excluded_1() and add_excludes_from_file_1()
excluded_1(): support exclude files in index
unpack-trees(): carry skip-worktree bit over in merged_entry()
Read .gitignore from index if it is skip-worktree
Avoid writing to buffer in add_excludes_from_file_1()
...
Conflicts:
.gitignore
Documentation/config.txt
Documentation/git-update-index.txt
Makefile
entry.c
t/t7002-grep.sh
* maint-1.6.1:
base85: Make the code more obvious instead of explaining the non-obvious
base85: encode_85() does not use the decode table
base85 debug code: Fix length byte calculation
checkout -m: do not try to fall back to --merge from an unborn branch
branch: die explicitly why when calling "git branch [-a|-r] branchname".
textconv: stop leaking file descriptors
commit: --cleanup is a message option
git count-objects: handle packs bigger than 4G
t7102: make the test fail if one of its check fails
Conflicts:
diff.c
This makes the traversal of index be in sync with the tree traversal.
When unpack_callback() is fed a set of tree entries from trees, it
inspects the name of the entry and checks if the an index entry with
the same name could be hiding behind the current index entry, and
(1) if the name appears in the index as a leaf node, it is also
fed to the n_way_merge() callback function;
(2) if the name is a directory in the index, i.e. there are entries in
that are underneath it, then nothing is fed to the n_way_merge()
callback function;
(3) otherwise, if the name comes before the first eligible entry in the
index, the index entry is first unpacked alone.
When traverse_trees_recursive() descends into a subdirectory, the
cache_bottom pointer is moved to walk index entries within that directory.
All of these are omitted for diff-index, which does not even want to be
fed an index entry and a tree entry with D/F conflicts.
This fixes 3-way read-tree and exposes a bug in other parts of the system
in t6035, test #5. The test prepares these three trees:
O = HEAD^
100644 blob e69de29bb2 a/b-2/c/d
100644 blob e69de29bb2 a/b/c/d
100644 blob e69de29bb2 a/x
A = HEAD
100644 blob e69de29bb2 a/b-2/c/d
100644 blob e69de29bb2 a/b/c/d
100644 blob 587be6b4c3f93f93c489c0111bba5596147a26cb a/x
B = master
120000 blob a36b77384451ea1de7bd340ffca868249626bc52 a/b
100644 blob e69de29bb2 a/b-2/c/d
100644 blob e69de29bb2 a/x
With a clean index that matches HEAD, running
git read-tree -m -u --aggressive $O $A $B
now yields
120000 a36b77384451ea1de7bd340ffca868249626bc52 3 a/b
100644 e69de29bb2 0 a/b-2/c/d
100644 e69de29bb2 1 a/b/c/d
100644 e69de29bb2 2 a/b/c/d
100644 587be6b4c3f93f93c489c0111bba5596147a26cb 0 a/x
which is correct. "master" created "a/b" symlink that did not exist,
and removed "a/b/c/d" while HEAD did not do touch either path.
Before this series, read-tree did not notice the situation and resolved
addition of "a/b" and removal of "a/b/c/d" independently. If A = HEAD had
another path "a/b/c/e" added, this merge should conflict but instead it
silently resolved "a/b" and then immediately overwrote it to add
"a/b/c/e", which was quite bogus.
Tests in t1012 start to work with this.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is mostly to make it more consistent with the rest of
git, which uses the shell to exec helpers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently textconv helpers are run directly. Running through
the shell is useful because the user can provide a program
with command line arguments, like "antiword -f".
It also makes textconv more consistent with other parts of
git, most of which run their helpers using the shell.
The downside is that textconv helpers with shell
metacharacters (like space) in the filename will be broken.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
textconv: stop leaking file descriptors
commit: --cleanup is a message option
git count-objects: handle packs bigger than 4G
t7102: make the test fail if one of its check fails
Documentation: always respect core.worktree if set
* maint-1.6.1:
textconv: stop leaking file descriptors
commit: --cleanup is a message option
git count-objects: handle packs bigger than 4G
t7102: make the test fail if one of its check fails
Conflicts:
builtin-commit.c
diff.c
We read the output from textconv helpers over a pipe, but we
never actually closed our end of the pipe after using it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/1.7.0-diff-whitespace-only-status:
diff.c: fix typoes in comments
Make test case number unique
diff: Rename QUIET internal option to QUICK
diff: change semantics of "ignore whitespace" options
Conflicts:
diff.h
* maint:
Git 1.6.5.7
worktree: don't segfault with an absolute pathspec without a work tree
ignore unknown color configuration
help.autocorrect: do not run a command if the command given is junk
Illustrate "filter" attribute with an example
When parsing the config file, if there is a value that is
syntactically correct but unused, we generally ignore it.
This lets non-core porcelains store arbitrary information in
the config file, and it means that configuration files can
be shared between new and old versions of git (the old
versions might simply ignore certain configuration).
The one exception to this is color configuration; if we
encounter a color.{diff,branch,status}.$slot variable, we
die if it is not one of the recognized slots (presumably as
a safety valve for user misconfiguration). This behavior
has existed since 801235c (diff --color: use
$GIT_DIR/config, 2006-06-24), but hasn't yet caused a
problem. No porcelain has wanted to store extra colors, and
we once a color area (like color.diff) has been introduced,
we've never changed the set of color slots.
However, that changed recently with the addition of
color.diff.func. Now a user with color.diff.func in their
config can no longer freely switch between v1.6.6 and older
versions; the old versions will complain about the existence
of the variable.
This patch loosens the check to match the rest of
git-config; unknown color slots are simply ignored. This
doesn't fix this particular problem, as the older version
(without this patch) is the problem, but it at least
prevents it from happening again in the future.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inspired by the coloring of quilt.
Introduce a separate color and paint the hunk comment part, i.e. the name
of the function, in a separate color "diff.func" (defaults to plain).
Whitespace between hunk header and hunk comment is printed in plain color.
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When emit_line() is called with an empty line (but non-zero length, as we
send line terminating LF or CRLF to the function), it used to emit
<SET><RESET> followed by a newline. Stop the wastefulness.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change git-diff's whitespace-ignoring modes to generate
output only if a non-empty patch results, which git-apply
rejects.
Update the tests to look for the new behavior.
Signed-off-by: Greg Bacon <gbacon@dbresearch.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/maint-diff-color-words:
diff --color-words: bit of clean-up
diff --color-words -U0: fix the location of hunk headers
t4034-diff-words: add a test for word diff without context
Conflicts:
diff.c
* jc/maint-blank-at-eof:
diff -B: colour whitespace errors
diff.c: emit_add_line() takes only the rest of the line
diff.c: split emit_line() from the first char and the rest of the line
diff.c: shuffling code around
diff --whitespace: fix blank lines at end
core.whitespace: split trailing-space into blank-at-{eol,eof}
diff --color: color blank-at-eof
diff --whitespace=warn/error: fix blank-at-eof check
diff --whitespace=warn/error: obey blank-at-eof
diff.c: the builtin_diff() deals with only two-file comparison
apply --whitespace: warn blank but not necessarily empty lines at EOF
apply --whitespace=warn/error: diagnose blank at EOF
apply.c: split check_whitespace() into two
apply --whitespace=fix: detect new blank lines at eof correctly
apply --whitespace=fix: fix handling of blank lines at the eof
* js/maint-diff-color-words:
diff --color-words: bit of clean-up
diff --color-words -U0: fix the location of hunk headers
t4034-diff-words: add a test for word diff without context
Conflicts:
diff.c
When we introduced the "word diff" mode, we could have done one of three
things:
* change fn_out_consume() to "this is called every time a line worth of
diff becomes ready from the lower-level diff routine. This function
knows two sets of helpers (one for line-oriented diff, another for word
diff), and each set has various functions to be called at certain
places (e.g. hunk header, context, ...). The function's role is to
inspect the incoming line, and dispatch appropriate helpers to produce
either line- or word- oriented diff output."
* introduce fn_out_consume_word_diff() that is "this is called every time
a line worth of diff becomes ready from the lower-level diff routine,
and here is what we do to prepare word oriented diff using that line."
without touching fn_out_consume() at all.
* Do neither of the above, and keep fn_out_consume() to "this is called
every time a line worth of diff becomes ready from the lower-level diff
routine, and here is what we do to output line oriented diff using that
line." but sprinkle a handful of 'are we in word-diff mode? if so do
this totally different thing' at random places.
This patch is to at least abstract the details of "this totally different
thing" out from the main codepath, in order to improve readability.
We can later refactor it by introducing fn_out_consume_word_diff(), taking
the second route above, but that is a separate topic.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Colored word diff without context lines firstly printed all the hunk
headers among each other and then printed the diff.
This was due to the code relying on getting at least one context line at
the end of each hunk, where the colored words would be flushed (it is
done that way to be able to ignore rewrapped lines).
Noticed by Markus Heidelberg.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you use the option --submodule=log you can see the submodule
summaries inlined in the diff, instead of not-quite-helpful SHA-1 pairs.
The format imitates what "git submodule summary" shows.
To do that, <path>/.git/objects/ is added to the alternate object
databases (if that directory exists).
This option was requested by Jens Lehmann at the GitTogether in Berlin.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-blank-at-eof:
diff -B: colour whitespace errors
diff.c: emit_add_line() takes only the rest of the line
diff.c: split emit_line() from the first char and the rest of the line
diff.c: shuffling code around
diff --whitespace: fix blank lines at end
core.whitespace: split trailing-space into blank-at-{eol,eof}
diff --color: color blank-at-eof
diff --whitespace=warn/error: fix blank-at-eof check
diff --whitespace=warn/error: obey blank-at-eof
diff.c: the builtin_diff() deals with only two-file comparison
apply --whitespace: warn blank but not necessarily empty lines at EOF
apply --whitespace=warn/error: diagnose blank at EOF
apply.c: split check_whitespace() into two
apply --whitespace=fix: detect new blank lines at eof correctly
apply --whitespace=fix: fix handling of blank lines at the eof
Essentially; s/type* /type */ as per the coding guidelines.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'jc/maint-1.6.0-blank-at-eof' (early part):
diff.c: emit_add_line() takes only the rest of the line
diff.c: split emit_line() from the first char and the rest of the line
* 'jc/maint-1.6.0-blank-at-eof' (early part):
diff --whitespace: fix blank lines at end
core.whitespace: split trailing-space into blank-at-{eol,eof}
diff --color: color blank-at-eof
diff --whitespace=warn/error: fix blank-at-eof check
diff --whitespace=warn/error: obey blank-at-eof
diff.c: the builtin_diff() deals with only two-file comparison
apply --whitespace: warn blank but not necessarily empty lines at EOF
apply --whitespace=warn/error: diagnose blank at EOF
apply.c: split check_whitespace() into two
apply --whitespace=fix: detect new blank lines at eof correctly
apply --whitespace=fix: fix handling of blank lines at the eof
We used to send the old and new contents more or less straight out to the
output with only the original "old is red, new is green" colouring. Now
all the necessary support routines have been prepared, call them with a
line of data at a time from the output code and have them check and color
whitespace errors in exactly the same way as they are called from the low
level diff callback routines.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the first character on the line that is fed to this function is always
"+", it is pointless to send that along with the rest of the line.
This change will make it easier to reuse the logic when emitting the
rewrite diff, as we do not want to copy a line only to add "+"/"-"/" "
immediately before its first character when we produce rewrite diff
output.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new helper function emit_line_0() takes the first line of diff output
(typically "-", " ", or "+") separately from the remainder of the line.
No other functional changes.
This change will make it easier to reuse the logic when emitting the
rewrite diff, as we do not want to copy a line only to add "+"/"-"/" "
immediately before its first character when we produce rewrite diff
output.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move function, type, and structure definitions for fill_mmfile(),
count_trailing_blank(), check_blank_at_eof(), emit_line(),
new_blank_line_at_eof(), emit_add_line(), sane_truncate_fn, and
emit_callback up in the file, so that they can be refactored into helper
functions and reused by codepath for emitting rewrite patches.
This only moves the lines around to make the next two patches easier to
read.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The earlier logic tried to colour any and all blank lines that were added
beyond the last blank line in the original, but this was very wrong. If
you added 96 blank lines, a non-blank line, and then 3 blank lines at the
end, only the last 3 lines should trigger the error, not the earlier 96
blank lines.
We need to also make sure that the lines are after the last non-blank line
in the postimage as well before deciding to paint them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the coloring logic processed the patch output one line at a time, we
couldn't easily color code the new blank lines at the end of file.
Reuse the adds_blank_at_eof() function to find where the runs of such
blank lines start, keep track of the line number in the preimage while
processing the patch output one line at a time, and paint the new blank
lines that appear after that line to implement this.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "diff --check" logic used to share the same issue as the one fixed for
"git apply" earlier in this series, in that a patch that adds new blank
lines at end could appear as
@@ -l,5 +m,7 @@$
_context$
_context$
-deleted$
+$
+$
+$
_$
_$
where _ stands for SP and $ shows a end-of-line. Instead of looking at
each line in the patch in the callback, simply count the blank lines from
the end in two versions, and notice the presence of new ones.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "diff --check" code used to conflate trailing-space whitespace error
class with this, but now we have a proper separate error class, we should
check it under blank-at-eof, not trailing-space.
The whitespace error is not about _having_ blank lines at end, but about
adding _new_ blank lines. To keep the message consistent with what is
given by "git apply", call whitespace_error_string() to generate it,
instead of using a hardcoded custom message.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The combined diff is implemented in combine_diff() and fn_out_consume()
codepath never has to deal with anything but two-file comparision.
Drop nparents from the emit_callback structure and simplify the code.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The majority of code in core git appears to use a single
space after if/for/while. This is an attempt to bring more
code to this standard. These are entirely cosmetic changes.
Signed-off-by: Brian Gianforcaro <b.gianfo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
grep: turn on --cached for files that is marked skip-worktree
ls-files: do not check for deleted file that is marked skip-worktree
update-index: ignore update request if it's skip-worktree, while still allows removing
diff*: skip worktree version
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option "QUIET" primarily meant "find if we have _any_ difference as
quick as possible and report", which means we often do not even have to
look at blobs if we know the trees are different by looking at the higher
level (e.g. "diff-tree A B"). As a side effect, because there is no point
showing one change that we happened to have found first, it also enables
NO_OUTPUT and EXIT_WITH_STATUS options, making the end result look quiet.
Rename the internal option to QUICK to reflect this better; it also makes
grepping the source tree much easier, as there are other kinds of QUIET
option everywhere.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, the --ignore-whitespace* options have merely meant to tell
the diff output routine that some class of differences are not worth
showing in the textual diff output, so that the end user has easier time
to review the remaining (presumably more meaningful) changes. These
options never affected the outcome of the command, given as the exit
status when the --exit-code option was in effect (either directly or
indirectly).
When you have only whitespace changes, however, you might expect
git diff -b --exit-code
to report that there is _no_ change with zero exit status.
Change the semantics of --ignore-whitespace* options to mean more than
"omit showing the difference in text".
The exit status, when --exit-code is in effect, is computed by checking if
we found any differences at the path level, while diff frontends feed
filepairs to the diffcore engine. When "ignore whitespace" options are in
effect, we defer this determination until the very end of diffcore
transformation. We simply do not know until the textual diff is
generated, which comes very late in the pipeline.
When --quiet is in effect, various diff frontends optimize by breaking out
early from the loop that enumerates the filepairs, when we find the first
path level difference; when --ignore-whitespace* is used the above change
automatically disables this optimization.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rs/grep-p:
grep: simplify -p output
grep -p: support user defined regular expressions
grep: add option -p/--show-function
grep: handle pre context lines on demand
grep: print context hunk marks between files
grep: move context hunk mark handling into show_line()
userdiff: add xdiff_clear_find_func()
* tr/die_errno:
Use die_errno() instead of die() when checking syscalls
Convert existing die(..., strerror(errno)) to die_errno()
die_errno(): double % in strerror() output just in case
Introduce die_errno() that appends strerror(errno) to die()
xdiff_set_find_func() is used to set user defined regular expressions
for finding function signatures. Add xdiff_clear_find_func(), which
frees the memory allocated by the former, making the API complete.
Also, use the new function in diff.c (the only call site of
xdiff_set_find_func()) to clean up after ourselves.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Lots of die() calls did not actually report the kind of error, which
can leave the user confused as to the real problem. Use die_errno()
where we check a system/library call that sets errno on failure, or
one of the following that wrap such calls:
Function Passes on error from
-------- --------------------
odb_pack_keep open
read_ancestry fopen
read_in_full xread
strbuf_read xread
strbuf_read_file open or strbuf_read_file
strbuf_readlink readlink
write_in_full xwrite
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change calls to die(..., strerror(errno)) to use the new die_errno().
In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
diff.c: plug a memory leak in an error path
fetch-pack: close output channel after sideband demultiplexer terminates
builtin-remote: Make "remote show" display all urls
Naturally, prep_temp_blob() did not care about filenames.
As a result, GIT_EXTERNAL_DIFF and textconv generated
filenames such as ".diff_XXXXXX".
This modifies prep_temp_blob() to generate user-friendly
filenames when creating temporary files.
Diffing "name.ext" now generates "XXXXXX_name.ext".
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ar/unlink-err:
print unlink(2) errno in copy_or_link_directory
replace direct calls to unlink(2) with unlink_or_warn
Introduce an unlink(2) wrapper which gives warning if unlink failed
This particular readlink call never NUL-terminated its
result, making it a potential source of bugs (though there
is no bug now, as it currently always respects the length
field). Let's just switch it to strbuf_readlink which is
shorter and less error-prone.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ar/unlink-err:
print unlink(2) errno in copy_or_link_directory
replace direct calls to unlink(2) with unlink_or_warn
Introduce an unlink(2) wrapper which gives warning if unlink failed
* jc/maint-1.6.0-keep-pack:
pack-objects: don't loosen objects available in alternate or kept packs
t7700: demonstrate repack flaw which may loosen objects unnecessarily
Remove --kept-pack-only option and associated infrastructure
pack-objects: only repack or loosen objects residing in "local" packs
git-repack.sh: don't use --kept-pack-only option to pack-objects
t7700-repack: add two new tests demonstrating repacking flaws
is_kept_pack(): final clean-up
Simplify is_kept_pack()
Consolidate ignore_packed logic more
has_sha1_kept_pack(): take "struct rev_info"
has_sha1_pack(): refactor "pretend these packs do not exist" interface
git-repack: resist stray environment variable
Essentially; s/type* /type */ as per the coding guidelines.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helps to notice when something's going wrong, especially on
systems which lock open files.
I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
called from such a function
- it is in a static function, returning void and the function is only
called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diffstat used the color.diff.plain slot (context text) for coloring
filenames and the whole summary line. This didn't look nice and the
affected text isn't patch context at all.
Signed-off-by: Markus Heidelberg <markus.heidelberg@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I told people on the kernel mailing list to please use "-M" when sending
me rename patches, so that I can see what they do while reading email
rather than having to apply the patch and then look at the end result.
I also told them that if they want to make it the default, they can just
add
[diff]
renames
to their ~/.gitconfig file. And while I was thinking about that, I wanted
to also check whether you can then mark individual projects to _not_ have
that default in the per-repository .git/config file.
And you can't. Currently you cannot have a global "enable renames by
default" and then a local ".. but not for _this_ project". Why? Because if
somebody writes
[diff]
renames = no
we simply ignore it, rather than resetting "diff_detect_rename_default"
back to zero.
Fixed thusly.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-1.6.0-keep-pack:
pack-objects: don't loosen objects available in alternate or kept packs
t7700: demonstrate repack flaw which may loosen objects unnecessarily
Remove --kept-pack-only option and associated infrastructure
pack-objects: only repack or loosen objects residing in "local" packs
git-repack.sh: don't use --kept-pack-only option to pack-objects
t7700-repack: add two new tests demonstrating repacking flaws
is_kept_pack(): final clean-up
Simplify is_kept_pack()
Consolidate ignore_packed logic more
has_sha1_kept_pack(): take "struct rev_info"
has_sha1_pack(): refactor "pretend these packs do not exist" interface
git-repack: resist stray environment variable
Conflicts:
t/t7700-repack.sh
When the index says that the file in the work tree that corresponds to the
blob object that is used for comparison is known to be unchanged, "diff"
reads from the file and applies convert_to_git(), instead of inflating the
object, to feed the internal diff engine with, because an earlier
benchnark found that it tends to be faster to use this optimization.
However, the index can lie when the path is marked as assume-unchanged.
Disable the optimization for such paths.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When preparing temporary files for an external diff or textconv, it is
easier on the external tools, especially when they are implemented using
platform tools, if they are fed the input after convert_to_working_tree().
This fixes msysGit issue 177.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-1.6.0-keep-pack:
is_kept_pack(): final clean-up
Simplify is_kept_pack()
Consolidate ignore_packed logic more
has_sha1_kept_pack(): take "struct rev_info"
has_sha1_pack(): refactor "pretend these packs do not exist" interface
git-repack: resist stray environment variable
These variables were unused and can be removed safely:
builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
builtin-fetch-pack.c::find_common(): len
builtin-remote.c::mv(): symref
diff.c::show_stats():show_stats(): total
diffcore-break.c::should_break(): base_size
fast-import.c::validate_raw_date(): date, sign
fsck.c::fsck_tree(): o_sha1, sha1
xdiff-interface.c::parse_num(): read_some
Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the callers of this function except only one pass NULL to its last
parameter, ignore_packed.
Introduce has_sha1_kept_pack() function that has the function signature
and the semantics of this function, and convert the sole caller that does
not pass NULL to call this new function.
All other callers and has_sha1_pack() lose the ignore_packed parameter.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the literal ANSI escape sequences and replace them by readable
constants.
Signed-off-by: Arjen Laarhoven <arjen@yaph.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When there is more than one file that are changed, running git diff with
GIT_EXTERNAL_DIFF incorrectly diagnoses an programming error and dies.
The check introduced in 479b0ae (diff: refactor tempfile cleanup handling,
2009-01-22) to detect a temporary file slot that forgot to remove its
temporary file was inconsistent with the way the codepath to remove the
temporary to mark the slot that it is done with it.
This patch fixes this problem and adds a test case for it.
Signed-off-by: Nazri Ramliy <ayiehere@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/signal-cleanup:
t0005: use SIGTERM for sigchain test
pager: do wait_for_pager on signal death
refactor signal handling for cleanup functions
chain kill signals for cleanup functions
diff: refactor tempfile cleanup handling
Windows: Fix signal numbers
This is an evil merge, as a test added since 1.6.0 expects an incorrect
behaviour the merged commit fixes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A patch that changes the filetype (e.g. regular file to symlink) of a path
must be split into a deletion event followed by a creation event, which
means that we need to have two independent metainfo lines for each.
However, the code reused the single set of metainfo lines.
As the blob object names recorded on the index lines are usually not used
nor validated on the receiving end, this is not an issue with normal use
of the resulting patch. However, when accepting a binary patch to delete
a blob, git-apply verified that the postimage blob object name on the
index line is 0{40}, hence a patch that deletes a regular file blob that
records binary contents to create a blob with different filetype (e.g. a
symbolic link) failed to apply. "git am -3" also uses the blob object
names recorded on the index line, so it would also misbehave when
synthesizing a preimage tree.
This moves the code to generate metainfo lines around, so that two
independent sets of metainfo lines are used for the split halves.
Additional tests by Jeff King.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/diff-color-words:
Change the spelling of "wordregex".
color-words: Support diff.wordregex config option
color-words: make regex configurable via attributes
color-words: expand docs with precise semantics
color-words: enable REG_NEWLINE to help user
color-words: take an optional regular expression describing words
color-words: change algorithm to allow for 0-character word boundaries
color-words: refactor word splitting and use ALLOC_GROW()
Add color_fwrite_lines(), a function coloring each line individually
The current code is very inconsistent about which signals
are caught for doing cleanup of temporary files and lock
files. Some callsites checked only SIGINT, while others
checked a variety of death-dealing signals.
This patch factors out those signals to a single function,
and then calls it everywhere. For some sites, that means
this is a simple clean up. For others, it is an improvement
in that they will now properly clean themselves up after a
larger variety of signals.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a piece of code wanted to do some cleanup before exiting
(e.g., cleaning up a lockfile or a tempfile), our usual
strategy was to install a signal handler that did something
like this:
do_cleanup(); /* actual work */
signal(signo, SIG_DFL); /* restore previous behavior */
raise(signo); /* deliver signal, killing ourselves */
For a single handler, this works fine. However, if we want
to clean up two _different_ things, we run into a problem.
The most recently installed handler will run, but when it
removes itself as a handler, it doesn't put back the first
handler.
This patch introduces sigchain, a tiny library for handling
a stack of signal handlers. You sigchain_push each handler,
and use sigchain_pop to restore whoever was before you in
the stack.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two pieces of code that create tempfiles for diff:
run_external_diff and run_textconv. The former cleans up its
tempfiles in the face of premature death (i.e., by die() or
by signal), but the latter does not. After this patch, they
will both use the same cleanup routines.
To make clear what the change is, let me first explain what
happens now:
- run_external_diff uses a static global array of 2
diff_tempfile structs (since it knows it will always
need exactly 2 tempfiles). It calls prepare_temp_file
(which doesn't know anything about the global array) on
each of the structs, creating the tempfiles that need to
be cleaned up. It then registers atexit and signal
handlers to look through the global array and remove the
tempfiles. If it succeeds, it calls the handler manually
(which marks the tempfile structs as unused).
- textconv has its own tempfile struct, which it allocates
using prepare_temp_file and cleans up manually. No
signal or atexit handlers.
The new code moves the installation of cleanup handlers into
the prepare_temp_file function. Which means that that
function now has to understand that there is static tempfile
storage. So what happens now is:
- run_external_diff calls prepare_temp_file
- prepare_temp_file calls claim_diff_tempfile, which
allocates an unused slot from our global array
- prepare_temp_file installs (if they have not already
been installed) atexit and signal handlers for cleanup
- prepare_temp_file sets up the tempfile as usual
- prepare_temp_file returns a pointer to the allocated
tempfile
The advantage being that run_external_diff no longer has to
care about setting up cleanup handlers. Now by virtue of
calling prepare_temp_file, run_textconv gets the same
benefit, as will any future users of prepare_temp_file.
There are also a few side benefits to the specific
implementation:
- we now install cleanup handlers _before_ allocating the
tempfile, closing a race which could leave temp cruft
- when allocating a slot in the global array, we will now
detect a situation where the old slots were not properly
vacated (i.e., somebody forgot to call remove upon
leaving the function). In the old code, such a situation
would silently overwrite the tempfile names, meaning we
would forget to clean them up. The new code dies with a
bug warning.
- we make sure only to install the signal handler once.
This isn't a big deal, since we are just overwriting the
old handler, but will become an issue when a later patch
converts the code to use sigchain
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diff is invoked with --color-words (w/o =regex), use the regular
expression the user has configured as diff.wordregex.
diff drivers configured via attributes take precedence over the
diff.wordregex-words setting. If the user wants to change them, they have
their own configuration variables.
Signed-off-by: Boyd Stephen Smith Jr <bss@iguanasuicide.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All the other config variables use CamelCase. This config variable should
not be an exception.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the --color-words splitting regular expression configurable via
the diff driver's 'wordregex' attribute. The user can then set the
driver on a file in .gitattributes. If a regex is given on the
command line, it overrides the driver's setting.
We also provide built-in regexes for the languages that already had
funcname patterns, and add an appropriate diff driver entry for C/++.
(The patterns are designed to run UTF-8 sequences into a single chunk
to make sure they remain readable.)
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We silently truncate a match at the newline, which may lead to
unexpected behaviour, e.g., when matching "<[^>]*>" against
<foo
bar>
since then "<foo" becomes a word (and "bar>" doesn't!) even though the
regex said only angle-bracket-delimited things can be words.
To alleviate the problem slightly, use REG_NEWLINE so that negated
classes can't match a newline. Of course newlines can still be
matched explicitly.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some applications, words are not delimited by white space. To
allow for that, you can specify a regular expression describing
what makes a word with
git diff --color-words='[A-Za-z0-9]+'
Note that words cannot contain newline characters.
As suggested by Thomas Rast, the words are the exact matches of the
regular expression.
Note that a regular expression beginning with a '^' will match only
a word at the beginning of the hunk, not a word at the beginning of
a line, and is probably not what you want.
This commit contains a quoting fix by Thomas Rast.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Up until now, the color-words code assumed that word boundaries are
identical to white space characters.
Therefore, it could get away with a very simple scheme: it copied the
hunks, substituted newlines for each white space character, called
libxdiff with the processed text, and then identified the text to
output by the offsets (which agreed since the original text had the
same length).
This code was ugly, for a number of reasons:
- it was impossible to introduce 0-character word boundaries,
- we had to print everything word by word, and
- the code needed extra special handling of newlines in the removed part.
Fix all of these issues by processing the text such that
- we build word lists, separated by newlines,
- we remember the original offsets for every word, and
- after calling libxdiff on the wordlists, we parse the hunk headers, and
find the corresponding offsets, and then
- we print the removed/added parts in one go.
The pre and post samples in the test were provided by Santi Béjar.
Note that there is some strange special handling of hunk headers where
one line range is 0 due to POSIX: in this case, the start is one too
low. In other words a hunk header '@@ -1,0 +2 @@' actually means that
the line must be added after the _second_ line of the pre text, _not_
the first.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Word splitting is now performed by the function diff_words_fill(),
avoiding having the same code twice.
In the same spirit, avoid duplicating the code of ALLOC_GROW().
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit teaches Git to produce diff output using the patience diff
algorithm with the diff option '--patience'.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
LF at the end of format strings given to die() is redundant because
die already adds one on its own.
Signed-off-by: Alexander Potashev <aspotashev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Merge two hunks if there is only the specified number of otherwise unshown
context between them. For --inter-hunk-context=1, the resulting patch has
the same number of lines but shows uninterrupted context instead of a
context header line in between.
Patches generated with this option are easier to read but are also more
likely to conflict if the file to be patched contains other changes.
This patch keeps the default for this option at 0. It is intended to just
make the feature available in order to see its advantages and downsides.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The type of the size member of filespec is ulong, while strbuf_detach expects
a size_t pointer. This patch should fix the warning:
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code was already set up to not really need it, so this just massages
it a bit to remove the use entirely.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes all tests pass on a system where 'lstat()' has been hacked to
return bogus data in st_size for symlinks.
Of course, the test coverage isn't complete, but it's a good baseline.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently we just skip rewrite diffs for binary files; this
patch makes an exception for files which will be textconv'd,
and actually performs the textconv before generating the
diff.
Conceptually, rewrite diffs should be in the exact same
format as the a non-rewrite diff, except that we refuse to
share any context. Thus it makes very little sense for "git
diff" to show a textconv'd diff, but for "git diff -B" to
show "Binary files differ".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current emit_rewrite_diff code always writes a text patch without
checking whether the content is binary. This means that if you end up with
a rewrite diff for a binary file, you get lots of raw binary goo in your
patch.
Instead, if we have binary files, then let's just skip emit_rewrite_diff
altogether. We will already have shown the "dissimilarity index" line, so
it is really about the diff contents. If binary diffs are turned off, the
"Binary files a/file and b/file differ" message should be the same in
either case. If we do have binary patches turned on, there isn't much
point in making a less-efficient binary patch that does a total rewrite;
no human is going to read it, and since binary patches don't apply with
any fuzz anyway, the result of application should be the same.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some history viewers use the diff plumbing to generate diffs
rather than going through the "git diff" porcelain.
Currently, there is no way for them to specify that they
would like to see the text-converted version of the diff.
This patch adds a "--textconv" option to allow such a
plumbing user to allow text conversion. The user can then
tell the viewer whether or not they would like text
conversion enabled.
While it may be tempting add a configuration option rather
than requiring each plumbing user to be configured to pass
--textconv, that is somewhat dangerous. Text-converted diffs
generally cannot be applied directly, so each plumbing user
should "opt in" to generating such a diff, either by
explicit request of the user or by confirming that their
output will not be fed to patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* rs/blame:
blame: use xdi_diff_hunks(), get rid of struct patch
add xdi_diff_hunks() for callers that only need hunk lengths
Allow alternate "low-level" emit function from xdl_diff
Always initialize xpparam_t to 0
blame: inline get_patch()
We treat symlinks as text containing the results of the
symlink, so it doesn't make much sense to text-convert them.
Similarly gitlink components just end up as the text
"Subproject commit $sha1", which we should leave intact.
Note that a typechange may be broken into two parts: the
removal of the old part and the addition of the new. In that
case, we _do_ show the textconv for any part which is the
addition or removal of a file we would ordinarily textconv,
since it is purely acting on the file contents.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffs that have been produced with textconv almost certainly
cannot be applied, so we want to be careful not to generate
them in things like format-patch.
This introduces a new diff options, ALLOW_TEXTCONV, which
controls this behavior. It is off by default, but is
explicitly turned on for the "log" family of commands, as
well as the "diff" porcelain (but not diff-* plumbing).
Because both text conversion and external diffing are
controlled by these diff options, we can get rid of the
"plumbing versus porcelain" distinction when reading the
config. This was an attempt to control the same thing, but
suffered from being too coarse-grained.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>