Commit Graph

260 Commits

Author SHA1 Message Date
Junio C Hamano
4c90d8908a Merge branch 'jn/log-m-does-not-imply-p'
Earlier "git log -m" was changed to always produce patch output,
which would break existing scripts, which has been reverted.

* jn/log-m-does-not-imply-p:
  Revert 'diff-merges: let "-m" imply "-p"'
2021-08-11 12:36:18 -07:00
Jonathan Nieder
6a38e33331 Revert 'diff-merges: let "-m" imply "-p"'
This reverts commit f5bfcc823b, which
made "git log -m" imply "--patch" by default.  The logic was that
"-m", which makes diff generation for merges perform a diff against
each parent, has no use unless I am viewing the diff, so we could save
the user some typing by turning on display of the resulting diff
automatically.  That wasn't expected to adversely affect scripts
because scripts would either be using a command like "git diff-tree"
that already emits diffs by default or would be combining -m with a
diff generation option such as --name-status.  By saving typing for
interactive use without adversely affecting scripts in the wild, it
would be a pure improvement.

The problem is that although diff generation options are only relevant
for the displayed diff, a script author can imagine them affecting
path limiting.  For example, I might run

	git log -w --format=%H -- README

hoping to list commits that edited README, excluding whitespace-only
changes.  In fact, a whitespace-only change is not TREESAME so the use
of -w here has no effect (since we don't apply these diff generation
flags to the diff_options struct rev_info::pruning used for this
purpose), but the documentation suggests that it should work

	Suppose you specified foo as the <paths>. We shall call
	commits that modify foo !TREESAME, and the rest TREESAME. (In
	a diff filtered for foo, they look different and equal,
	respectively.)

and a script author who has not tested whitespace-only changes
wouldn't notice.

Similarly, a script author could include

	git log -m --first-parent --format=%H -- README

to filter the first-parent history for commits that modified README.
The -m is a no-op but it reflects the script author's intent.  For
example, until 1e20a407fe (stash list: stop passing "-m" to "git
log", 2021-05-21), "git stash list" did this.

As a result, we can't safely change "-m" to imply "-p" without fear of
breaking such scripts.  Restore the previous behavior.

Noticed because Rust's src/bootstrap/bootstrap.py made use of this
same construct: https://github.com/rust-lang/rust/pull/87513.  That
script has been updated to omit the unnecessary "-m" option, but we
can expect other scripts in the wild to have similar expectations.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 13:52:01 -07:00
Elijah Newren
9dd29dbef0 diffcore-rename: treat a rename_limit of 0 as unlimited
In commit 89973554b5 (diffcore-rename: make diff-tree -l0 mean
-l<large>, 2017-11-29), -l0 was given a special magical "large" value,
but one which was not large enough for some uses (as can be seen from
commit 9f7e4bfa3b (diff: remove silent clamp of renameLimit,
2017-11-13).  Make 0 (or a negative value) be treated as unlimited
instead and update the documentation to mention this.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 16:54:24 -07:00
Elijah Newren
6623a528e0 doc: clarify documentation for rename/copy limits
A few places in the docs implied that rename/copy detection is always
quadratic or that all (unpaired) files were involved in the quadratic
portion of rename/copy detection.  The following two commits each
introduced an exception to this:

    9027f53cb5 (Do linear-time/space rename logic for exact renames,
                  2007-10-25)
    bd24aa2f97 (diffcore-rename: guide inexact rename detection based
                  on basenames, 2021-02-14)

(As a side note, for copy detection, the basename guided inexact rename
detection is turned off and the exact renames will only result in
sources (without the dests) being removed from the set of files used in
quadratic detection.  So, for copy detection, the documentation was
closer to correct.)

Avoid implying that all files involved in rename/copy detection are
subject to the full quadratic algorithm.  While at it, also note the
default values for all these settings.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 16:54:24 -07:00
Junio C Hamano
8e444e66df Merge branch 'so/log-m-implies-p'
The "-m" option in "git log -m" that does not specify which format,
if any, of diff is desired did not have any visible effect; it now
implies some form of diff (by default "--patch") is produced.

* so/log-m-implies-p:
  diff-merges: let "-m" imply "-p"
  diff-merges: rename "combined_imply_patch" to "merges_imply_patch"
  stash list: stop passing "-m" to "git log"
  git-svn: stop passing "-m" to "git rev-list"
  diff-merges: move specific diff-index "-m" handling to diff-index
  t4013: test "git diff-index -m"
  t4013: test "git diff-tree -m"
  t4013: test "git log -m --stat"
  t4013: test "git log -m --raw"
  t4013: test that "-m" alone has no effect in "git log"
2021-06-14 13:33:27 +09:00
Sergey Organov
f5bfcc823b diff-merges: let "-m" imply "-p"
Fix long standing inconsistency between -c/--cc that do imply -p on
one side, and -m that did not imply -p on the other side.

Change corresponding test accordingly, as "log -m" output should now
match one from "log -m -p", rather than from just "log".

Change documentation accordingly.

NOTES:

After this patch

  git log -m

produces diffs without need to provide -p as well, that improves both
consistency and usability. It gets even more useful if one sets
"log.diffMerges" configuration variable to "first-parent" to force -m
produce usual diff with respect to first parent only.

This patch, however, does not change behavior when specific diff
format is explicitly provided on the command-line, so that commands
like

  git log -m --raw
  git log -m --stat

are not affected, nor does it change commands where specific diff
format is active by default, such as:

  git diff-tree -m

It's also worth to be noticed that exact historical semantics of -m is
still provided by --diff-merges=separate.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:14 +09:00
Junio C Hamano
93e0b28dbb Merge branch 'ab/pathname-encoding-doc'
Clarify that pathnames recorded in Git trees are most often (but
not necessarily) encoded in UTF-8.

* ab/pathname-encoding-doc:
  doc: clarify the filename encoding in git diff
2021-04-30 13:50:27 +09:00
Andrey Bienkowski
9364bf465d doc: clarify the filename encoding in git diff
AFAICT parsing the output of `git diff --name-only master...feature`
is the intended way of programmatically getting the list of files
modified
by a feature branch. It is impossible to parse text unless you know what
encoding it is in. The output encoding of diff --name-only and

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 12:57:26 -07:00
Sergey Organov
364bc11fe5 doc/diff-options: document new --diff-merges features
Document changes in -m and --diff-merges=m semantics, as well as new
--diff-merges=on option.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
Junio C Hamano
1eb4136ac2 diff: --{rotate,skip}-to=<path>
In the implementation of "git difftool", there is a case where the
user wants to start viewing the diffs at a specific path and
continue on to the rest, optionally wrapping around to the
beginning.  Since it is somewhat cumbersome to implement such a
feature as a post-processing step of "git diff" output, let's
support it internally with two new options.

 - "git diff --rotate-to=C", when the resulting patch would show
   paths A B C D E without the option, would "rotate" the paths to
   shows patch to C D E A B instead.  It is an error when there is
   no patch for C is shown.

 - "git diff --skip-to=C" would instead "skip" the paths before C,
   and shows patch to C D E.  Again, it is an error when there is no
   patch for C is shown.

 - "git log [-p]" also accepts these two options, but it is not an
   error if there is no change to the specified path.  Instead, the
   set of output paths are rotated or skipped to the specified path
   or the first path that sorts after the specified path.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:30:42 -08:00
Junio C Hamano
aac006aa99 Merge branch 'so/log-diff-merge'
"git log" learned a new "--diff-merges=<how>" option.

* so/log-diff-merge: (32 commits)
  t4013: add tests for --diff-merges=first-parent
  doc/git-show: include --diff-merges description
  doc/rev-list-options: document --first-parent changes merges format
  doc/diff-generate-patch: mention new --diff-merges option
  doc/git-log: describe new --diff-merges options
  diff-merges: add '--diff-merges=1' as synonym for 'first-parent'
  diff-merges: add old mnemonic counterparts to --diff-merges
  diff-merges: let new options enable diff without -p
  diff-merges: do not imply -p for new options
  diff-merges: implement new values for --diff-merges
  diff-merges: make -m/-c/--cc explicitly mutually exclusive
  diff-merges: refactor opt settings into separate functions
  diff-merges: get rid of now empty diff_merges_init_revs()
  diff-merges: group diff-merge flags next to each other inside 'rev_info'
  diff-merges: split 'ignore_merges' field
  diff-merges: fix -m to properly override -c/--cc
  t4013: add tests for -m failing to override -c/--cc
  t4013: support test_expect_failure through ':failure' magic
  diff-merges: revise revs->diff flag handling
  diff-merges: handle imply -p on -c/--cc logic for log.c
  ...
2021-02-05 16:40:44 -08:00
Sergey Organov
1d24509b7b doc/git-show: include --diff-merges description
Move description of --diff-merges option from git-log.txt to
diff-options.txt so that it is included in the git-show help.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:32 -08:00
Junio C Hamano
3f6dc9c366 Merge branch 'pb/blame-funcname-range-userdiff'
"git blame -L :funcname -- path" did not work well for a path for
which a userdiff driver is defined.

* pb/blame-funcname-range-userdiff:
  blame: simplify 'setup_blame_bloom_data' interface
  blame: simplify 'setup_scoreboard' interface
  blame: enable funcname blaming with userdiff driver
  line-log: mention both modes in 'blame' and 'log' short help
  doc: add more pointers to gitattributes(5) for userdiff
  blame-options.txt: also mention 'funcname' in '-L' description
  doc: line-range: improve formatting
  doc: log, gitk: move '-L' description to 'line-range-options.txt'
2020-11-18 13:32:53 -08:00
Junio C Hamano
ee13bebbd5 Merge branch 'jc/abbrev-doc'
The documentation on the "--abbrev=<n>" option did not say the
output may be longer than "<n>" hexdigits, which has been
clarified.

* jc/abbrev-doc:
  doc: clarify that --abbrev=<n> is about the minimum length
2020-11-11 13:18:38 -08:00
Junio C Hamano
c5a802f0ce Merge branch 'so/format-patch-doc-on-default-diff-format'
Docfix.

* so/format-patch-doc-on-default-diff-format:
  doc/diff-options: fix out of place mentions of '--patch/-p'
2020-11-11 13:18:37 -08:00
Junio C Hamano
cda34e0d0c doc: clarify that --abbrev=<n> is about the minimum length
Early text written in 2006 explains the "--abbrev=<n>" option to
"show only a partial prefix", without saying that the length of the
partial prefix is not necessarily the number given to the option to
ensure that the output names the object uniquely.

Update documentation for the diff family of commands, "blame",
"branch --verbose", "ls-files" and "ls-tree" to stress that the
short prefix must uniquely refer to an object, and <n> is merely
the mininum number of hexdigits used in the prefix.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:04:44 -08:00
Junio C Hamano
1ae0949a03 Merge branch 'mk/diff-ignore-regex'
"git diff" family of commands learned the "-I<regex>" option to
ignore hunks whose changed lines all match the given pattern.

* mk/diff-ignore-regex:
  diff: add -I<regex> that ignores matching changes
  merge-base, xdiff: zero out xpparam_t structures
2020-11-02 13:17:44 -08:00
Philippe Blain
0cce88f1e4 doc: add more pointers to gitattributes(5) for userdiff
Several Git commands can make use of the builtin userdiff patterns, but
it's not obvious in the documentation. Add pointers to the 'Defining a
custom hunk header' part of gitattributes(5) in the description of the
following options:

- the '--function-context' option of `git diff` and friends
- the '--function-context' option of `git grep`
- the '-L :<funcname>' option of `git log`, `gitk` and `git blame`

In 'git-grep.txt', take the opportunity to use backticks in the
description of '--show-function', and improve the wording of the
desription of '--function-context'.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:14 -08:00
Sergey Organov
714d491af0 doc/diff-options: fix out of place mentions of '--patch/-p'
First, references to --patch and -p appeared in the description of
git-format-patch, where the options themselves are not included.

Next, the description of --unified option elsewhere had duplicate implied
statements: "Implies --patch. Implies -p."

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-31 13:14:26 -07:00
Michał Kępień
296d4a94e7 diff: add -I<regex> that ignores matching changes
Add a new diff option that enables ignoring changes whose all lines
(changed, removed, and added) match a given regular expression.  This is
similar to the -I/--ignore-matching-lines option in standalone diff
utilities and can be used e.g. to ignore changes which only affect code
comments or to look for unrelated changes in commits containing a large
number of automatically applied modifications (e.g. a tree-wide string
replacement).  The difference between -G/-S and the new -I option is
that the latter filters output on a per-change basis.

Use the 'ignore' field of xdchange_t for marking a change as ignored or
not.  Since the same field is used by --ignore-blank-lines, identical
hunk emitting rules apply for --ignore-blank-lines and -I.  These two
options can also be used together in the same git invocation (they are
complementary to each other).

Rename xdl_mark_ignorable() to xdl_mark_ignorable_lines(), to indicate
that it is logically a "sibling" of xdl_mark_ignorable_regex() rather
than its "parent".

Signed-off-by: Michał Kępień <michal@isc.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20 12:53:26 -07:00
Junio C Hamano
096c948dab Merge branch 'dd/diff-customize-index-line-abbrev'
The output from the "diff" family of the commands had abbreviated
object names of blobs involved in the patch, but its length was not
affected by the --abbrev option.  Now it is.

* dd/diff-customize-index-line-abbrev:
  diff: index-line: respect --abbrev in object's name
  t4013: improve diff-post-processor logic
2020-08-31 15:49:46 -07:00
Đoàn Trần Công Danh
3046c7f69a diff: index-line: respect --abbrev in object's name
A handful of Git's commands respect `--abbrev' for customizing length
of abbreviation of object names.

For diff-family, Git supports 2 different options for 2 different
purposes, `--full-index' for showing diff-patch object's name in full,
and `--abbrev' to customize the length of object names in diff-raw and
diff-tree header lines, without any options to customise the length of
object names in diff-patch format. When working with diff-patch format,
we only have two options, either full index, or default abbrev length.

Although, that behaviour is documented, it doesn't stop users from
trying to use `--abbrev' with the hope of customising diff-patch's
objects' name's abbreviation.

Let's allow the blob object names shown on the "index" line to be
abbreviated to arbitrary length given via the "--abbrev" option.

To preserve backward compatibility with old script that specify both
`--full-index' and `--abbrev', always show full object id
if `--full-index' is specified.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-21 12:43:05 -07:00
Jeff King
9a6d515fc3 doc/git-log: move "-t" into diff-options list
The "-t" option is infrequently used; it doesn't deserve a spot near the
top of the options list. Let's push it down into the diff-options
include, near the definition of --raw.

We'll protect it with a git-log ifdef, since it doesn't make any sense
for non-tree diff commands. Note that this means it also shows up in
git-show, but that's a good thing; it applies equally well there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-29 13:44:03 -07:00
Laurent Arnoud
c28ded83fc diff: add config option relative
The `diff.relative` boolean option set to `true` shows only changes in
the current directory/value specified by the `path` argument of the
`relative` option and shows pathnames relative to the aforementioned
directory.

Teach `--no-relative` to override earlier `--relative`

Add for git-format-patch(1) options documentation `--relative` and
`--no-relative`

Signed-off-by: Laurent Arnoud <laurent@spkdev.net>
Acked-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-24 16:23:59 -07:00
Martin Ågren
9299f84921 diff-options.txt: avoid "regex" overload in example
When we exemplify the difference between `-G` and `-S` (using
`--pickaxe-regex`), we do so using an example diff and git-diff
invocation involving "regexec", "regexp", "regmatch", ...

The example is correct, but we can make it easier to untangle by
avoiding writing "regex.*" unless it's really needed to make our point.

Use some made-up, non-regexy words instead.

Reported-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-09 09:25:24 -08:00
Junio C Hamano
8a9a837a63 Merge branch 'nd/diff-parseopt-3'
Third batch to teach the diff machinery to use the parse-options
API.

* nd/diff-parseopt-3:
  diff-parseopt: convert --submodule
  diff-parseopt: convert --ignore-submodules
  diff-parseopt: convert --textconv
  diff-parseopt: convert --ext-diff
  diff-parseopt: convert --quiet
  diff-parseopt: convert --exit-code
  diff-parseopt: convert --color-words
  diff-parseopt: convert --word-diff-regex
  diff-parseopt: convert --word-diff
  diff-parseopt: convert --[no-]color
  diff-parseopt: convert --[no-]follow
  diff-parseopt: convert -R
  diff-parseopt: convert -a|--text
  diff-parseopt: convert --full-index
  diff-parseopt: convert --binary
  diff-parseopt: convert --anchored
  diff-parseopt: convert --diff-algorithm
  diff-parseopt: convert --histogram
  diff-parseopt: convert --patience
  diff-parseopt: convert --[no-]indent-heuristic
2019-04-16 19:28:03 +09:00
Junio C Hamano
4ab0f13857 Merge branch 'nd/diff-parseopt-2'
Second batch to teach the diff machinery to use the parse-options
API.

* nd/diff-parseopt-2: (21 commits)
  diff-parseopt: convert --ignore-some-changes
  diff-parseopt: convert --[no-]minimal
  diff-parseopt: convert --relative
  diff-parseopt: convert --no-renames|--[no--rename-empty
  diff-parseopt: convert --find-copies-harder
  diff-parseopt: convert -C|--find-copies
  diff-parseopt: convert -D|--irreversible-delete
  diff-parseopt: convert -M|--find-renames
  diff-parseopt: convert -B|--break-rewrites
  diff-parseopt: convert --output-*
  diff-parseopt: convert --[no-]compact-summary
  diff-parseopt: convert --stat*
  diff-parseopt: convert -s|--no-patch
  diff-parseopt: convert --name-status
  diff-parseopt: convert --name-only
  diff-parseopt: convert --patch-with-stat
  diff-parseopt: convert --summary
  diff-parseopt: convert --check
  diff-parseopt: convert --dirstat and friends
  diff-parseopt: convert --numstat and --shortstat
  ...
2019-03-07 09:59:58 +09:00
Junio C Hamano
54b469b9e9 Merge branch 'nd/diff-parseopt'
The diff machinery, one of the oldest parts of the system, which
long predates the parse-options API, uses fairly long and complex
handcrafted option parser.  This is being rewritten to use the
parse-options API.

* nd/diff-parseopt:
  diff.c: convert --raw
  diff.c: convert -W|--[no-]function-context
  diff.c: convert -U|--unified
  diff.c: convert -u|-p|--patch
  diff.c: prepare to use parse_options() for parsing
  diff.h: avoid bit fields in struct diff_flags
  diff.h: keep forward struct declarations sorted
  parse-options: allow ll_callback with OPTION_CALLBACK
  parse-options: avoid magic return codes
  parse-options: stop abusing 'callback' for lowlevel callbacks
  parse-options: add OPT_BITOP()
  parse-options: disable option abbreviation with PARSE_OPT_KEEP_UNKNOWN
  parse-options: add one-shot mode
  parse-options.h: remove extern on function prototypes
2019-03-07 09:59:52 +09:00
Nguyễn Thái Ngọc Duy
6d9af6f4da diff-parseopt: convert --binary
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-07 08:02:21 +09:00
Nguyễn Thái Ngọc Duy
cdc43eb0b8 diff-parseopt: convert --no-renames|--[no--rename-empty
For --rename-empty, see 90d43b0768 (teach diffcore-rename to
optionally ignore empty content - 2012-03-22) for more information.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-21 15:16:59 -08:00
Nguyễn Thái Ngọc Duy
af2f368091 diff-parseopt: convert --output-*
This also validates that the user specifies a single character in
--output-indicator-*, not a string.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-21 15:16:59 -08:00
Nguyễn Thái Ngọc Duy
4ce7aab5a5 diff-parseopt: convert --dirstat and friends
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-21 15:16:58 -08:00
Junio C Hamano
15b07cba0b Merge branch 'pw/diff-color-moved-ws-fix'
"git diff --color-moved-ws" updates.

* pw/diff-color-moved-ws-fix:
  diff --color-moved-ws: handle blank lines
  diff --color-moved-ws: modify allow-indentation-change
  diff --color-moved-ws: optimize allow-indentation-change
  diff --color-moved=zebra: be stricter with color alternation
  diff --color-moved-ws: fix false positives
  diff --color-moved-ws: demonstrate false positives
  diff: allow --no-color-moved-ws
  Use "whitespace" consistently
  diff: document --no-color-moved
2019-01-29 12:47:53 -08:00
Nguyễn Thái Ngọc Duy
d473e2e0e8 diff.c: convert -U|--unified
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-27 16:28:18 -08:00
Phillip Wood
b73bcbac4a diff: allow --no-color-moved-ws
Allow --no-color-moved-ws and --color-moved-ws=no to cancel any previous
--color-moved-ws option.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-10 10:37:59 -08:00
Phillip Wood
748aa1aa34 Use "whitespace" consistently
Most of the messages and documentation use 'whitespace' rather than
'white space' or 'white spaces' convert to latter two to the former for
consistency.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-10 10:37:42 -08:00
Phillip Wood
fbafb7c682 diff: document --no-color-moved
Add documentation for --no-color-moved.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-10 10:37:32 -08:00
Thomas Braun
e0e7cb8080 log -G: ignore binary files
The -G<regex> option of log looks for the differences whose patch text
contains added/removed lines that match regex.

Currently -G looks also into patches of binary files (which
according to [1]) is binary as well.

This has a couple of issues:

- It makes the pickaxe search slow. In a proprietary repository of the
  author with only ~5500 commits and a total .git size of ~300MB
  searching takes ~13 seconds

    $time git log -Gwave > /dev/null

    real    0m13,241s
    user    0m12,596s
    sys     0m0,644s

  whereas when we ignore binary files with this patch it takes ~4s

    $time ~/devel/git/git log -Gwave > /dev/null

    real    0m3,713s
    user    0m3,608s
    sys     0m0,105s

  which is a speedup of more than fourfold.

- The internally used algorithm for generating patch text is based on
  xdiff and its states in [1]

  > The output format of the binary patch file is proprietary
  > (and binary) and it is basically a collection of copy and insert
  > commands [..]

  which means that the current format could change once the internal
  algorithm is changed as the format is not standardized. In addition
  the git binary patch format used for preparing patches for git apply
  is *different* from the xdiff format as can be seen by comparing

  git log -p -a

    commit 6e95bf4bafccf14650d02ab57f3affe669be10cf
    Author: A U Thor <author@example.com>
    Date:   Thu Apr 7 15:14:13 2005 -0700

        modify binary file

    diff --git a/data.bin b/data.bin
    index f414c84..edfeb6f 100644
    --- a/data.bin
    +++ b/data.bin
    @@ -1,2 +1,4 @@
     a
     a^@a
    +a
    +a^@a

  with git log --binary

    commit 6e95bf4bafccf14650d02ab57f3affe669be10cf
    Author: A U Thor <author@example.com>
    Date:   Thu Apr 7 15:14:13 2005 -0700

        modify binary file

    diff --git a/data.bin b/data.bin
    index f414c84bd3aa25fa07836bb1fb73db784635e24b..edfeb6f501[..]
    GIT binary patch
    literal 12
    QcmYe~N@Pgn0zx1O01)N^ZvX%Q

    literal 6
    NcmYe~N@Pgn0ssWg0XP5v

  which seems unexpected.

To resolve these issues this patch makes -G<regex> ignore binary files
by default. Textconv filters are supported and also -a/--text for
getting the old and broken behaviour back.

The -S<block of text> option of log looks for differences that changes
the number of occurrences of the specified block of text (i.e.
addition/deletion) in a file. As we want to keep the current behaviour,
add a test to ensure it stays that way.

[1]: http://www.xmailserver.org/xdiff.html

Signed-off-by: Thomas Braun <thomas.braun@virtuell-zuhause.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-26 14:59:37 -08:00
Junio C Hamano
706b0b5e8d Merge branch 'es/diff-color-moved-fix'
One of the "diff --color-moved" mode "dimmed_zebra" that was named
in an unusual way has been deprecated and replaced by
"dimmed-zebra".

* es/diff-color-moved-fix:
  diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
2018-08-15 15:08:22 -07:00
Junio C Hamano
a81575aa91 Merge branch 'sb/diff-color-move-more'
"git diff --color-moved" feature has further been tweaked.

* sb/diff-color-move-more:
  diff.c: offer config option to control ws handling in move detection
  diff.c: add white space mode to move detection that allows indent changes
  diff.c: factor advance_or_nullify out of mark_color_as_moved
  diff.c: decouple white space treatment from move detection algorithm
  diff.c: add a blocks mode for moved code detection
  diff.c: adjust hash function signature to match hashmap expectation
  diff.c: do not pass diff options as keydata to hashmap
  t4015: avoid git as a pipe input
  xdiff/xdiffi.c: remove unneeded function declarations
  xdiff/xdiff.h: remove unused flags
2018-08-02 15:30:40 -07:00
Eric Sunshine
e3f2f5f9cd diff: --color-moved: rename "dimmed_zebra" to "dimmed-zebra"
The --color-moved "dimmed_zebra" mode (with an underscore) is an
anachronism. Most options and modes are hyphenated. It is more difficult
to type and somewhat more difficult to read than those which are
hyphenated. Therefore, rename it to "dimmed-zebra", and nominally
deprecate "dimmed_zebra".

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-25 14:23:52 -07:00
Stefan Beller
626c0b5d39 diff.c: offer config option to control ws handling in move detection
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-19 12:02:54 -07:00
Stefan Beller
ca1f4ae4df diff.c: add white space mode to move detection that allows indent changes
The option of --color-moved has proven to be useful as observed on the
mailing list. However when refactoring sometimes the indentation changes,
for example when partitioning a functions into smaller helper functions
the code usually mostly moved around except for a decrease in indentation.

To just review the moved code ignoring the change in indentation, a mode
to ignore spaces in the move detection as implemented in a previous patch
would be enough.  However the whole move coloring as motivated in commit
2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought
up the notion of the reviewer being able to trust the move of a "block".

As there are languages such as python, which depend on proper relative
indentation for the control flow of the program, ignoring any white space
change in a block would not uphold the promises of 2e2d5ac that allows
reviewers to pay less attention to the inside of a block, as inside
the reviewer wants to assume the same program flow.

This new mode of white space ignorance will take this into account and will
only allow the same white space changes per line in each block. This patch
even allows only for the same change at the beginning of the lines.

As this is a white space mode, it is made exclusive to other white space
modes in the move detection.

This patch brings some challenges, related to the detection of blocks.
We need a wide net to catch the possible moved lines, but then need to
narrow down to check if the blocks are still intact. Consider this
example (ignoring block sizes):

 - A
 - B
 - C
 +    A
 +    B
 +    C

At the beginning of a block when checking if there is a counterpart
for A, we have to ignore all space changes. However at the following
lines we have to check if the indent change stayed the same.

Checking if the indentation change did stay the same, is done by computing
the indentation change by the difference in line length, and then assume
the change is only in the beginning of the longer line, the common tail
is the same. That is why the test contains lines like:

 - <TAB> A
 ...
 + A <TAB>
 ...

As the first line starting a block is caught using a compare function that
ignores white spaces unlike the rest of the block, where the white space
delta is taken into account for the comparison, we also have to think about
the following situation:

 - A
 - B
 -   A
 -   B
 +    A
 +    B
 +      A
 +      B

When checking if the first A (both in the + and - lines) is a start of
a block, we have to check all 'A' and record all the white space deltas
such that we can find the example above to be just one block that is
indented.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-19 12:02:54 -07:00
Stefan Beller
b3095712f9 diff.c: decouple white space treatment from move detection algorithm
In the original implementation of the move detection logic the choice for
ignoring white space changes is the same for the move detection as it is
for the regular diff.  Some cases came up where different treatment would
have been nice.

Allow the user to specify that white space should be ignored differently
during detection of moved lines than during generation of added and removed
lines. This is done by providing analogs to the --ignore-space-at-eol,
-b, and -w options by introducing the option --color-moved-ws=<modes>
with the modes named "ignore-space-at-eol", "ignore-space-change" and
"ignore-all-space", which is used only during the move detection phase.

As we change the default, we'll adjust the tests.

For now we do not infer any options to treat white spaces in the move
detection from the generic white space options given to diff.
This can be tuned later to reasonable default.

As we plan on adding more white space related options in a later patch,
that interferes with the current white space options, use a flag field
and clamp it down to  XDF_WHITESPACE_FLAGS, as that (a) allows to easily
check at parse time if we give invalid combinations and (b) can reuse
parts of this patch.

By having the white space treatment in its own option, we'll also
make it easier for a later patch to have an config option for
spaces in the move detection.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17 11:25:31 -07:00
Stefan Beller
51da15eb23 diff.c: add a blocks mode for moved code detection
The new "blocks" mode provides a middle ground between plain and zebra.
It is as intuitive (few colors) as plain, but still has the requirement
for a minimum of lines/characters to count a block as moved.

Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
 (https://public-inbox.org/git/87o9j0uljo.fsf@evledraar.gmail.com/)
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-17 11:25:31 -07:00
Karthikeyan Singaravelan
1f2abe68d0 doc: fix typos in documentation and release notes
Signed-off-by: Karthikeyan Singaravelan <tir.karthi@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-19 09:01:12 -07:00
Junio C Hamano
d676cc512a Merge branch 'rd/diff-options-typofix'
Typofix.

* rd/diff-options-typofix:
  diff-options.txt: fix minor typos, font inconsistencies, in docs
2018-06-18 10:18:41 -07:00
Robert P. J. Day
7eedad15df diff-options.txt: fix minor typos, font inconsistencies, in docs
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-11 13:11:09 -07:00
Junio C Hamano
cb6462fe74 Merge branch 'en/doc-typoes'
Docfix.

* en/doc-typoes:
  Documentation: normalize spelling of 'normalised'
  Documentation: fix several one-character-off spelling errors
2018-04-25 13:28:58 +09:00
Elijah Newren
c30d4f1b84 Documentation: fix several one-character-off spelling errors
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-09 14:15:02 +09:00