git-commit-vandalism

Author	SHA1	Message	Date
Thomas Rast	6440d3417c	diff: tweak a _copy_ of diff_options with word-diff When using word diff, the code sets the word_regex from various defaults if it was not set already. The problem is that it does this on the original diff_options, which will also be used in subsequent diffs. This means that when the word_regex is not given on the command line, only the first diff for which a setting for word_regex (either from attributes or diff.wordRegex) ever takes effect. This value then propagates to the rest of the diff runs and in particular prevents further attribute lookups. Fix the problem of changing diff state once and for all, by working with a _copy_ of the diff_options. Noticed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-14 14:41:20 -07:00
Johannes Sixt	62d39359af	t4034: diff.*.wordregex should not be "sticky" in --word-diff The test case applies a custom wordRegex to one file in a diff, and expects that the default word splitting applies to the second file in the diff. But the custom wordRegex is also incorrectly used for the second file. Helped-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-14 14:39:01 -07:00
Junio C Hamano	05c65cb116	Merge branch 'tr/maint-word-diff-incomplete-line' * tr/maint-word-diff-incomplete-line: word-diff: ignore '\ No newline at eof' marker	2012-01-18 15:16:19 -08:00
Thomas Rast	c7c2bc0ac9	word-diff: ignore '\ No newline at eof' marker The word-diff logic accumulates + and - lines until another line type appears (normally [ @\]), at which point it generates the word diff. This is usually correct, but it breaks when the preimage does not have a newline at EOF: $ printf "%s" "a a a" >a $ printf "%s\n" "a ab a" >b $ git diff --no-index --word-diff a b diff --git 1/a 2/b index 9f68e94..6a7c02f 100644 --- 1/a +++ 2/b @@ -1 +1 @@ [-a a a-] No newline at end of file {+a ab a+} Because of the order of the lines in a unified diff @@ -1 +1 @@ -a a a \ No newline at end of file +a ab a the '\' line flushed the buffers, and the - and + lines were never matched with each other. A proper fix would defer such markers until the end of the hunk. However, word-diff is inherently whitespace-ignoring, so as a cheap fix simply ignore the marker (and hide it from the output). We use a prefix match for '\ ' to parallel the logic in apply.c:parse_fragment(). We currently do not localize this string (just accept other variants of it in git-apply), but this should be future-proof. Noticed-by: Ivan Shirokoff <shirokoff@yandex-team.ru> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-01-12 11:27:41 -08:00
Gustaf Hendeby	53b10a1405	Add built-in diff patterns for MATLAB code MATLAB is often used in industry and academia for scientific computations motivating it being included as a built-in pattern. Signed-off-by: Gustaf Hendeby <hendeby@isy.liu.se> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-11-15 16:11:52 -08:00
Jim Meyering	42536dd9b9	do not read beyond end of malloc'd buffer With diff.suppress-blank-empty=true, "git diff --word-diff" would output data that had been read from uninitialized heap memory. The problem was that fn_out_consume did not account for the possibility of a line with length 1, i.e., the empty context line that diff.suppress-blank-empty=true converts from " \n" to "\n". Since it assumed there would always be a prefix character (the space), it decremented "len" unconditionally, thus passing len=0 to emit_line, which would then blindly call emit_line_0 with len=-1 which would pass that value on to fwrite as SIZE_MAX. Boom. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-20 11:39:49 -07:00
Junio C Hamano	5269edf170	t4034 (diff --word-diff): add a minimum Perl driver test vector Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-01-18 09:44:22 -08:00
Jonathan Nieder	5094d15874	t4034 (diff --word-diff): style suggestions Rearrange code to be easier to browse: - first data - then functions - then test assertions Mark up inline test vectors as cat >vector <<-\EOF data data EOF for visual scannability. Use words like "set up" for tests that set up for other tests, to make it obvious which tests are safe to skip. Use repeated function calls instead of a loop for the language-specific tests, so the invocations can be easily tweaked individually (for example if one starts to fail). This means if you add a new subdirectory to t4034/, it will not be automatically used. I think that's worth it for the added explicitness. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-01-18 09:02:23 -08:00
Thomas Rast	8d96e7288f	t4034: bulk verify builtin word regex sanity The builtin word regexes should be tested with some simple examples against simple issues. Do this in bulk. Mainly due to a lack of language knowledge and inspiration, most of the test cases (cpp, csharp, java, objc, pascal, php, python, ruby) are directly based off a C operator precedence table to verify that all operators are split correctly. This means that they are probably incomplete or inaccurate except for 'cpp' itself. Still, they are good enough to already have uncovered a typo in the python and ruby patterns. 'fortran' is based on my anecdotal knowledge of the DO10I parsing rules, and thus probably useless. The rest (bibtex, html, tex) are an ad-hoc test of what I consider important splits in those languages. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-01-18 08:51:58 -08:00
Junio C Hamano	b3ff808b71	Merge branch 'en/and-cascade-tests' * en/and-cascade-tests: (25 commits) t4124 (apply --whitespace): use test_might_fail t3404: do not use 'describe' to implement test_cmp_rev t3404 (rebase -i): introduce helper to check position of HEAD t3404 (rebase -i): move comment to description t3404 (rebase -i): unroll test_commit loops t3301 (notes): use test_expect_code for clarity t1400 (update-ref): use test_must_fail t1502 (rev-parse --parseopt): test exit code from "-h" t6022 (renaming merge): chain test commands with && test-lib: introduce test_line_count to measure files tests: add missing &&, batch 2 tests: add missing && Introduce sane_unset and use it to ensure proper && chaining t7800 (difftool): add missing && t7601 (merge-pull-config): add missing && t7001 (mv): add missing && t6016 (rev-list-graph-simplify-history): add missing && t5602 (clone-remote-exec): add missing && t4026 (color): remove unneeded and unchained command t4019 (diff-wserror): add lots of missing && ... Conflicts: t/t7006-pager.sh	2010-11-24 15:51:49 -08:00
Jonathan Nieder	a48fcd8369	tests: add missing && Breaks in a test assertion's && chain can potentially hide failures from earlier commands in the chain. Commands intended to fail should be marked with !, test_must_fail, or test_might_fail. The examples in this patch do not require that. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-11-09 11:59:49 -08:00
Kevin Ballard	a471833d51	test-lib: extend test_decode_color to handle more color codes Enhance the test_decode_color function to handle all common color codes, including background colors and escapes that contain multiple codes. This change necessitates changing <WHITE> to <BOLD>, so update t4034 as well. This change is necessary for the next commit in order to test background colors properly. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-10-20 16:10:14 -07:00
Thomas Rast	882749a04f	diff: add --word-diff option that generalizes --color-words This teaches the --color-words engine a more general interface that supports two new modes: * --word-diff=plain, inspired by the 'wdiff' utility (most similar to 'wdiff -n <old> <new>'): uses delimiters [-removed-] and {+added+} * --word-diff=porcelain, which generates an ad-hoc machine readable format: - each diff unit is prefixed by [-+ ] and terminated by newline as in unified diff - newlines in the input are output as a line consisting only of a tilde '~' Both of these formats still support color if it is enabled, using it to highlight the differences. --color-words becomes a synonym for --word-diff=color, which is the color-only format. Also adds some compatibility/convenience options. Thanks to Junio C Hamano and Miles Bader for good ideas. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-04-14 10:56:53 -07:00
Junio C Hamano	c2ff10c98e	Merge branch 'jk/1.7.0-status' * jk/1.7.0-status: status/commit: do not suggest "reset HEAD <path>" while merging commit/status: "git add <path>" is not necessarily how to resolve commit/status: check $GIT_DIR/MERGE_HEAD only once t7508-status: test all modes with color t7508-status: status --porcelain ignores relative paths setting status: reduce duplicated setup code status: disable color for porcelain format status -s: obey color.status builtin-commit: refactor short-status code into wt-status.c t7508-status.sh: Add tests for status -s status -s: respect the status.relativePaths option docs: note that status configuration affects only long format commit: support alternate status formats status: add --porcelain output format status: refactor format option parsing status: refactor short-mode printing to its own function status: typo fix in usage git status: not "commit --dry-run" anymore git stat -s: short status output git stat: the beginning of "status that is not a dry-run of commit" Conflicts: t/t4034-diff-words.sh wt-status.c	2009-12-27 23:01:32 -08:00
Michael J Gruber	68cfc6f551	t7508-status: test all modes with color Move a useful script function to decode colored output to text form from t4034 and use it in this test as well. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-12-08 21:52:47 -08:00
Bert Wesarg	89cb73a19a	Give the hunk comment its own color Inspired by the coloring of quilt. Introduce a separate color and paint the hunk comment part, i.e. the name of the function, in a separate color "diff.func" (defaults to plain). Whitespace between hunk header and hunk comment is printed in plain color. Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-11-28 10:05:44 -08:00
Junio C Hamano	06a4755270	emit_line(): don't emit an empty <SET><RESET> followed by a newline When emit_line() is called with an empty line (but non-zero length, as we send line terminating LF or CRLF to the function), it used to emit <SET><RESET> followed by a newline. Stop the wastefulness. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-11-27 22:33:53 -08:00
Johannes Schindelin	a4ca1465ec	diff --color-words -U0: fix the location of hunk headers Colored word diff without context lines firstly printed all the hunk headers among each other and then printed the diff. This was due to the code relying on getting at least one context line at the end of each hunk, where the colored words would be flushed (it is done that way to be able to ignore rewrapped lines). Noticed by Markus Heidelberg. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-10-30 09:42:56 -07:00
Markus Heidelberg	168eff3c80	t4034-diff-words: add a test for word diff without context Signed-off-by: Markus Heidelberg <markus.heidelberg@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-10-30 09:42:52 -07:00
Boyd Stephen Smith Jr	ae3b970ac3	Change the spelling of "wordregex". Use "wordRegex" for configuration variable names. Use "word_regex" for C language tokens. Signed-off-by: Boyd Stephen Smith Jr. <bss@iguanasuicide.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-01-21 23:52:16 -08:00
Boyd Stephen Smith Jr	98a4d87b87	color-words: Support diff.wordregex config option When diff is invoked with --color-words (w/o =regex), use the regular expression the user has configured as diff.wordregex. diff drivers configured via attributes take precedence over the diff.wordregex-words setting. If the user wants to change them, they have their own configuration variables. Signed-off-by: Boyd Stephen Smith Jr <bss@iguanasuicide.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-01-21 00:51:12 -08:00
Thomas Rast	80c49c3de2	color-words: make regex configurable via attributes Make the --color-words splitting regular expression configurable via the diff driver's 'wordregex' attribute. The user can then set the driver on a file in .gitattributes. If a regex is given on the command line, it overrides the driver's setting. We also provide built-in regexes for the languages that already had funcname patterns, and add an appropriate diff driver entry for C/++. (The patterns are designed to run UTF-8 sequences into a single chunk to make sure they remain readable.) Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-01-17 10:44:21 -08:00
Johannes Schindelin	2b6a5417d7	color-words: take an optional regular expression describing words In some applications, words are not delimited by white space. To allow for that, you can specify a regular expression describing what makes a word with git diff --color-words='[A-Za-z0-9]+' Note that words cannot contain newline characters. As suggested by Thomas Rast, the words are the exact matches of the regular expression. Note that a regular expression beginning with a '^' will match only a word at the beginning of the hunk, not a word at the beginning of a line, and is probably not what you want. This commit contains a quoting fix by Thomas Rast. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-01-17 10:43:08 -08:00
Johannes Schindelin	2e5d2003b2	color-words: change algorithm to allow for 0-character word boundaries Up until now, the color-words code assumed that word boundaries are identical to white space characters. Therefore, it could get away with a very simple scheme: it copied the hunks, substituted newlines for each white space character, called libxdiff with the processed text, and then identified the text to output by the offsets (which agreed since the original text had the same length). This code was ugly, for a number of reasons: - it was impossible to introduce 0-character word boundaries, - we had to print everything word by word, and - the code needed extra special handling of newlines in the removed part. Fix all of these issues by processing the text such that - we build word lists, separated by newlines, - we remember the original offsets for every word, and - after calling libxdiff on the wordlists, we parse the hunk headers, and find the corresponding offsets, and then - we print the removed/added parts in one go. The pre and post samples in the test were provided by Santi Béjar. Note that there is some strange special handling of hunk headers where one line range is 0 due to POSIX: in this case, the start is one too low. In other words a hunk header '@@ -1,0 +2 @@' actually means that the line must be added after the _second_ line of the pre text, _not_ the first. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2009-01-17 10:42:41 -08:00

24 Commits
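
Several of the commits above (2e5d2003b2, 2b6a5417d7, 882749a04f) describe the same idea: split each side into words with a regular expression, remember every word's original offsets, diff the word lists, and print the removed and added runs in one go using the [-removed-] and {+added+} delimiters of --word-diff=plain. The following is only a rough sketch of that idea in Python, not git's implementation. It assumes Python's difflib.SequenceMatcher as a stand-in for libxdiff, and the names split_words and word_diff_plain are hypothetical helpers introduced here for illustration.

```python
import difflib
import re


def split_words(text, word_regex=r"\S+"):
    """Return (words, spans): every regex match plus its (start, end) offsets."""
    matches = list(re.finditer(word_regex, text))
    return [m.group() for m in matches], [m.span() for m in matches]


def word_diff_plain(old, new, word_regex=r"\S+"):
    """Render a word-level diff of two strings with [-removed-]{+added+} markers.

    Assumption: difflib stands in for libxdiff, so the exact grouping of
    changes may differ from what git itself would print.
    """
    old_words, old_spans = split_words(old, word_regex)
    new_words, new_spans = split_words(new, word_regex)
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)

    out = []
    pos = 0  # offset in `new` up to which text has already been emitted
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Common words: copy the new text verbatim, separators included.
            end = new_spans[j2 - 1][1]
            out.append(new[pos:end])
            pos = end
            continue
        if j2 > j1:
            # Emit the separator that precedes the first added word.
            out.append(new[pos:new_spans[j1][0]])
        if i2 > i1:
            # The saved offsets let us print the whole removed run at once.
            out.append("[-" + old[old_spans[i1][0]:old_spans[i2 - 1][1]] + "-]")
        if j2 > j1:
            end = new_spans[j2 - 1][1]
            out.append("{+" + new[new_spans[j1][0]:end] + "+}")
            pos = end
    out.append(new[pos:])  # trailing text after the last word
    return "".join(out)


if __name__ == "__main__":
    print(word_diff_plain("the quick brown fox", "the slow brown fox jumps"))
    # the [-quick-]{+slow+} brown fox {+jumps+}
```

Passing a custom pattern, for example word_diff_plain("foo(bar)", "foo(baz)", r"[A-Za-z0-9]+"), treats punctuation as separators rather than as part of a word, which is the use case the --color-words='[A-Za-z0-9]+' example in 2b6a5417d7 has in mind.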