The values have eluded documentation so far. While at it streamline
the wording by grouping relevant parts together.
Signed-off-by: Andreas Heiduk <asheiduk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation fix.
* rh/diff-orderfile-doc:
diff: document the format of the -O (diff.orderFile) file
diff: document behavior of relative diff.orderFile
Document that a relative pathname for diff.orderFile is interpreted as
relative to the top-level work directory.
Signed-off-by: Richard Hansen <hansenr@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --inter-hunk-context= option was added in commit 6d0e674a57
("diff: add option to show context between close hunks"). This patch
allows configuring a default for this option.
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a patch inserts a block of lines, whose last lines are the
same as the existing lines that appear before the inserted block,
"git diff" can choose any place between these existing lines as the
boundary between the pre-context and the added lines (adjusting the
end of the inserted block as appropriate) to come up with variants
of the same patch, and some variants are easier to read than others.
We have been trying to improve the choice of this boundary, and Git
2.11 shipped with an experimental "compaction-heuristic". Since
then another attempt to improve the logic further resulted in a new
"indent-heuristic" logic. It is agreed that the latter gives better
result overall, and the former outlived its usefulness.
Retire "compaction", and keep "indent" as an experimental feature.
The latter hopefully will be turned on by default in a future
release, but that should be done as a separate step.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff/log --ws-error-highlight=<kind>" lacked the corresponding
configuration variable to set it by default.
* jc/ws-error-highlight:
diff: introduce diff.wsErrorHighlight option
diff.c: move ws-error-highlight parsing helpers up
diff.c: refactor parse_ws_error_highlight()
t4015: split out the "setup" part of ws-error-highlight test
With the preparatory steps, it has become trivial to teach the
system a new diff.wsErrorHighlight configuration that gives the
default value for --ws-error-highlight command line option.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Output from "git diff" can be made easier to read by selecting
which lines are common and which lines are added/deleted
intelligently when the lines before and after the changed section
are the same. A command line option is added to help with the
experiment to find a good heuristics.
* mh/diff-indent-heuristic:
blame: honor the diff heuristic options and config
parse-options: add parse_opt_unknown_cb()
diff: improve positioning of add/delete blocks in diffs
xdl_change_compact(): introduce the concept of a change group
recs_match(): take two xrecord_t pointers as arguments
is_blank_line(): take a single xrecord_t as argument
xdl_change_compact(): only use heuristic if group can't be matched
xdl_change_compact(): fix compaction heuristic to adjust ixo
Some groups of added/deleted lines in diffs can be slid up or down,
because lines at the edges of the group are not unique. Picking good
shifts for such groups is not a matter of correctness but definitely has
a big effect on aesthetics. For example, consider the following two
diffs. The first is what standard Git emits:
--- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
+++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
@@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {
}
if (!$smtp_server) {
+ $smtp_server = $repo->config('sendemail.smtpserver');
+}
+if (!$smtp_server) {
foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
if (-x $_) {
$smtp_server = $_;
The following diff is equivalent, but is obviously preferable from an
aesthetic point of view:
--- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
+++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
@@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) {
$initial_reply_to =~ s/(^\s+|\s+$)//g;
}
+if (!$smtp_server) {
+ $smtp_server = $repo->config('sendemail.smtpserver');
+}
if (!$smtp_server) {
foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
if (-x $_) {
This patch teaches Git to pick better positions for such "diff sliders"
using heuristics that take the positions of nearby blank lines and the
indentation of nearby lines into account.
The existing Git code basically always shifts such "sliders" as far down
in the file as possible. The only exception is when the slider can be
aligned with a group of changed lines in the other file, in which case
Git favors depicting the change as one add+delete block rather than one
add and a slightly offset delete block. This naive algorithm often
yields ugly diffs.
Commit d634d61ed6 improved the situation somewhat by preferring to
position add/delete groups to make their last line a blank line, when
that is possible. This heuristic does more good than harm, but (1) it
can only help if there are blank lines in the right places, and (2)
always picks the last blank line, even if there are others that might be
better. The end result is that it makes perhaps 1/3 as many errors as
the default Git algorithm, but that still leaves a lot of ugly diffs.
This commit implements a new and much better heuristic for picking
optimal "slider" positions using the following approach: First observe
that each hypothetical positioning of a diff slider introduces two
splits: one between the context lines preceding the group and the first
added/deleted line, and the other between the last added/deleted line
and the first line of context following it. It tries to find the
positioning that creates the least bad splits.
Splits are evaluated based only on the presence and locations of nearby
blank lines, and the indentation of lines near the split. Basically, it
prefers to introduce splits adjacent to blank lines, between lines that
are indented less, and between lines with the same level of indentation.
In more detail:
1. It measures the following characteristics of a proposed splitting
position in a `struct split_measurement`:
* the number of blank lines above the proposed split
* whether the line directly after the split is blank
* the number of blank lines following that line
* the indentation of the nearest non-blank line above the split
* the indentation of the line directly below the split
* the indentation of the nearest non-blank line after that line
2. It combines the measured attributes using a bunch of
empirically-optimized weighting factors to derive a `struct
split_score` that measures the "badness" of splitting the text at
that position.
3. It combines the `split_score` for the top and the bottom of the
slider at each of its possible positions, and selects the position
that has the best `split_score`.
I determined the initial set of weighting factors by collecting a corpus
of Git histories from 29 open-source software projects in various
programming languages. I generated many diffs from this corpus, and
determined the best positioning "by eye" for about 6600 diff sliders. I
used about half of the repositories in the corpus (corresponding to
about 2/3 of the sliders) as a training set, and optimized the weights
against this corpus using a crude automated search of the parameter
space to get the best agreement with the manually-determined values.
Then I tested the resulting heuristic against the full corpus. The
results are summarized in the following table, in column `indent-1`:
| repository | count | Git 2.9.0 | compaction | compaction-fixed | indent-1 | indent-2 |
| --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
| afnetworking | 109 | 89 (81.7%) | 37 (33.9%) | 37 (33.9%) | 2 (1.8%) | 2 (1.8%) |
| alamofire | 30 | 18 (60.0%) | 14 (46.7%) | 15 (50.0%) | 0 (0.0%) | 0 (0.0%) |
| angular | 184 | 127 (69.0%) | 39 (21.2%) | 23 (12.5%) | 5 (2.7%) | 5 (2.7%) |
| animate | 313 | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) |
| ant | 380 | 356 (93.7%) | 152 (40.0%) | 148 (38.9%) | 15 (3.9%) | 15 (3.9%) | *
| bugzilla | 306 | 263 (85.9%) | 109 (35.6%) | 99 (32.4%) | 14 (4.6%) | 15 (4.9%) | *
| corefx | 126 | 91 (72.2%) | 22 (17.5%) | 21 (16.7%) | 6 (4.8%) | 6 (4.8%) |
| couchdb | 78 | 44 (56.4%) | 26 (33.3%) | 28 (35.9%) | 6 (7.7%) | 6 (7.7%) | *
| cpython | 937 | 158 (16.9%) | 50 (5.3%) | 49 (5.2%) | 5 (0.5%) | 5 (0.5%) | *
| discourse | 160 | 95 (59.4%) | 42 (26.2%) | 36 (22.5%) | 18 (11.2%) | 13 (8.1%) |
| docker | 307 | 194 (63.2%) | 198 (64.5%) | 253 (82.4%) | 8 (2.6%) | 8 (2.6%) | *
| electron | 163 | 132 (81.0%) | 38 (23.3%) | 39 (23.9%) | 6 (3.7%) | 6 (3.7%) |
| git | 536 | 470 (87.7%) | 73 (13.6%) | 78 (14.6%) | 16 (3.0%) | 16 (3.0%) | *
| gitflow | 127 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| ionic | 133 | 89 (66.9%) | 29 (21.8%) | 38 (28.6%) | 1 (0.8%) | 1 (0.8%) |
| ipython | 482 | 362 (75.1%) | 167 (34.6%) | 169 (35.1%) | 11 (2.3%) | 11 (2.3%) | *
| junit | 161 | 147 (91.3%) | 67 (41.6%) | 66 (41.0%) | 1 (0.6%) | 1 (0.6%) | *
| lighttable | 15 | 5 (33.3%) | 0 (0.0%) | 2 (13.3%) | 0 (0.0%) | 0 (0.0%) |
| magit | 88 | 75 (85.2%) | 11 (12.5%) | 9 (10.2%) | 1 (1.1%) | 0 (0.0%) |
| neural-style | 28 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) |
| nodejs | 781 | 649 (83.1%) | 118 (15.1%) | 111 (14.2%) | 4 (0.5%) | 5 (0.6%) | *
| phpmyadmin | 491 | 481 (98.0%) | 75 (15.3%) | 48 (9.8%) | 2 (0.4%) | 2 (0.4%) | *
| react-native | 168 | 130 (77.4%) | 79 (47.0%) | 81 (48.2%) | 0 (0.0%) | 0 (0.0%) |
| rust | 171 | 128 (74.9%) | 30 (17.5%) | 27 (15.8%) | 16 (9.4%) | 14 (8.2%) |
| spark | 186 | 149 (80.1%) | 52 (28.0%) | 52 (28.0%) | 2 (1.1%) | 2 (1.1%) |
| tensorflow | 115 | 66 (57.4%) | 48 (41.7%) | 48 (41.7%) | 5 (4.3%) | 5 (4.3%) |
| test-more | 19 | 15 (78.9%) | 2 (10.5%) | 2 (10.5%) | 1 (5.3%) | 1 (5.3%) | *
| test-unit | 51 | 34 (66.7%) | 14 (27.5%) | 8 (15.7%) | 2 (3.9%) | 2 (3.9%) | *
| xmonad | 23 | 22 (95.7%) | 2 (8.7%) | 2 (8.7%) | 1 (4.3%) | 1 (4.3%) | *
| --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
| totals | 6668 | 4391 (65.9%) | 1496 (22.4%) | 1491 (22.4%) | 150 (2.2%) | 144 (2.2%) |
| totals (training set) | 4552 | 3195 (70.2%) | 1053 (23.1%) | 1061 (23.3%) | 86 (1.9%) | 88 (1.9%) |
| totals (test set) | 2116 | 1196 (56.5%) | 443 (20.9%) | 430 (20.3%) | 64 (3.0%) | 56 (2.6%) |
In this table, the numbers are the count and percentage of human-rated
sliders that the corresponding algorithm got *wrong*. The columns are
* "repository" - the name of the repository used. I used the diffs
between successive non-merge commits on the HEAD branch of the
corresponding repository.
* "count" - the number of sliders that were human-rated. I chose most,
but not all, sliders to rate from those among which the various
algorithms gave different answers.
* "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0.
* "compaction" - the heuristic used by `git diff --compaction-heuristic`
in Git 2.9.0.
* "compaction-fixed" - the heuristic used by `git diff
--compaction-heuristic` after the fixes from earlier in this patch
series. Note that the results are not dramatically different than
those for "compaction". Both produce non-ideal diffs only about 1/3 as
often as the default `git diff`.
* "indent-1" - the new `--indent-heuristic` algorithm, using the first
set of weighting factors, determined as described above.
* "indent-2" - the new `--indent-heuristic` algorithm, using the final
set of weighting factors, determined as described below.
* `*` - indicates that repo was part of training set used to determine
the first set of weighting factors.
The fact that the heuristic performed nearly as well on the test set as
on the training set in column "indent-1" is a good indication that the
heuristic was not over-trained. Given that fact, I ran a second round of
optimization, using the entire corpus as the training set. The resulting
set of weights gave the results in column "indent-2". These are the
weights included in this patch.
The final result gives consistently and significantly better results
across the whole corpus than either `git diff` or `git diff
--compaction-heuristic`. It makes only about 1/30 as many errors as the
former and about 1/10 as many errors as the latter. (And a good fraction
of the remaining errors are for diffs that involve weirdly-formatted
code, sometimes apparently machine-generated.)
The tools that were used to do this optimization and analysis, along
with the human-generated data values, are recorded in a separate project
[1].
This patch adds a new command-line option `--indent-heuristic`, and a
new configuration setting `diff.indentHeuristic`, that activate this
heuristic. This interface is only meant for testing purposes, and should
be finalized before including this change in any release.
[1] https://github.com/mhagger/diff-slider-tools
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git-diff and friends a new format for displaying the difference
of a submodule. The new format is an inline diff of the contents of the
submodule between the commit range of the update. This allows the user
to see the actual code change caused by a submodule update.
Add tests for the new format and option.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
More mark-up updates to typeset strings that are expected to
literally typed by the end user in fixed-width font.
* mm/doc-tt:
doc: typeset HEAD and variants as literal
CodingGuidelines: formatting HEAD in documentation
doc: typeset long options with argument as literal
doc: typeset '--' as literal
doc: typeset long command-line options as literal
doc: typeset short command-line options as literal
Documentation/git-mv.txt: fix whitespace indentation
It was common in our documentation to surround short option names with
forward quotes, which renders as italic in HTML. Instead, use backquotes
which renders as monospace. This is one more step toward conformance to
Documentation/CodingGuidelines.
This was obtained with:
perl -pi -e "s/'(-[a-z])'/\`\$1\`/g" *.txt
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation set has been updated so that literal commands,
configuration variables and environment variables are consistently
typeset in fixed-width font and bold in manpages.
* tr/doc-tt:
doc: change configuration variables format
doc: more consistency in environment variables format
doc: change environment variables format
doc: clearer rule about formatting literals
It turns out that the earlier effort to update the heuristics may
want to use a bit more time to mature. Turn it off by default.
* jk/diff-compact-heuristic:
diff: disable compaction heuristic for now
http://lkml.kernel.org/g/20160610075043.GA13411@sigill.intra.peff.net
reports that a change to add a new "function" with common ending
with the existing one at the end of the file is shown like this:
def foo
do_foo_stuff()
+ common_ending()
+end
+
+def bar
+ do_bar_stuff()
+
common_ending()
end
when the new heuristic is in use. In reality, the change is to add
the blank line before "def bar" and everything below, which is what
the code without the new heuristic shows.
Disable the heuristics by default, and resurrect the documentation
for the option and the configuration variables, while clearly
marking the feature as still experimental.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change configuration variables that where in italic style
to monospace font according to the guideline. It was obtained with
grep '[[:alpha:]]*\.[[:alpha:]]*::$' config.txt | \
sed -e 's/::$//' -e 's/\./\\\\./' | \
xargs -iP perl -pi -e "s/\'P\'/\`P\`/g" ./*.txt
Signed-off-by: Tom Russello <tom.russello@grenoble-inp.org>
Signed-off-by: Erwan Mathoniere <erwan.mathoniere@grenoble-inp.org>
Signed-off-by: Samuel Groot <samuel.groot@grenoble-inp.org>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It seems that people around here are all happy with the updated
heuristics used to decide where the hunks are separated. Let's keep
that as the default. Even though we do not expect too much trouble
from the difference between the old and the new algorithms, just in
case let's leave the implementation of the knobs to turn it off for
emergencies. There is no longer need for documenting them, though.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to produce the smallest possible diff and combine several diff
hunks together, we implement a heuristic from GNU Diff which moves diff
hunks forward as far as possible when we find common context above and
below a diff hunk. This sometimes produces less readable diffs when
writing C, Shell, or other programming languages, ie:
...
/*
+ *
+ *
+ */
+
+/*
...
instead of the more readable equivalent of
...
+/*
+ *
+ *
+ */
+
/*
...
Implement the following heuristic to (optionally) produce the desired
output.
If there are diff chunks which can be shifted around, shift each hunk
such that the last common empty line is below the chunk with the rest
of the context above.
This heuristic appears to resolve the above example and several other
common issues without producing significantly weird results. However, as
with any heuristic it is not really known whether this will always be
more optimal. Thus, it can be disabled via diff.compactionHeuristic.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename detection is a very convenient feature, and new users shouldn't
have to dig in the documentation to benefit from it.
Potential objections to activating rename detection are that it
sometimes fail, and it is sometimes slow. But rename detection is
already activated by default in several cases like "git status" and "git
merge", so activating diff.renames does not fundamentally change the
situation. When the rename detection fails, it now fails consistently
between "git diff" and "git status".
This setting does not affect plumbing commands, hence well-written
scripts will not be affected.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description was misleading, since "set to any boolean value" include
"set to false", and diff.renames=false does not enable basic detection,
but actually disables it. Also, document that diff.renames only affects
Porcelain.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should improve readability. Compare "thislongname" and
"thisLongName". The following keys are left in unchanged. We can
decide what to do with them later.
- am.keepcr
- core.autocrlf .safecrlf .trustctime
- diff.dirstat .noprefix
- gitcvs.usecrlfattr
- gui.blamehistoryctx .trustmtime
- pull.twohead
- receive.autogc
- sendemail.signedoffbycc .smtpsslcertpath .suppresscc
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff.orderfile acts as a default for the -O command line option.
[sb: split up aw's original patch; rework tests and docs, treat option
as pathname]
Signed-off-by: Anders Waldenborg <anders@0x63.nu>
Signed-off-by: Samuel Bronson <naesten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git status' and 'git commit' can be told to also show the output of "git
submodule summary" by setting the "status.submodulesummary" config option.
But status and commit also honor the "diff.ignoreSubmodules" and the
"submodule.<name>.ignore" settings, which then disable the summary partly
or completely. This - and the fact that the last two settings do not
affect the "git submodule" commands at all - is not well documented.
Extend the documentation in those places where "status.submodulesummary",
"diff.ignoreSubmodules" and "submodule.<name>.ignore" are described to
better explain these dependencies.
Thanks-to: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add diff.algorithm configuration so that the user does not type
"diff --histogram".
* mp/diff-algo-config:
diff: Introduce --diff-algorithm command line option
config: Introduce diff.algorithm variable
git-completion.bash: Autocomplete --minimal and --histogram for git-diff
Build on top of the clean-up done by jk/mergetool and automatically
generate the list of mergetool and difftool backends the build
supports to be included in the documentation.
* da/mergetool-docs:
doc: generate a list of valid merge tools
mergetool--lib: list user configured tools in '--tool-help'
mergetool--lib: add functions for finding available tools
mergetool--lib: improve the help text in guess_merge_tool()
mergetool--lib: simplify command expressions
Use the show_tool_names() function to build lists of all
the built-in tools supported by difftool and mergetool.
This frees us from needing to update the documentation
whenever a new tool is added.
Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some users or projects prefer different algorithms over others, e.g.
patience over myers or similar. However, specifying appropriate
argument every time diff is to be used is impractical. Moreover,
creating an alias doesn't play nicely with other tools based on diff
(git-show for instance). Hence, a configuration variable which is able
to set specific algorithm is needed. For now, these four values are
accepted: 'myers' (which has the same effect as not setting the config
variable at all), 'minimal', 'patience' and 'histogram'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The line that happens to begin with indent followed by "3. " was
interpreted as if it was an enumerated list; just wrap the lines
differently to work it around for now.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a diff.submodule configuration variable corresponding to the
'--submodule' command-line option of 'git diff'.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
19299a8 (Documentation: Move diff.<driver>.* from config.txt to
diff-config.txt, 2011-04-07) moved the diff configuration options to
diff-config.txt, but forgot about diff.wordRegex, which was left
behind in config.txt. Fix this.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a configuration variable diff.context that tells
Porcelain commands to use a non-default number of context
lines instead of 3 (the default). With this variable, users
do not have to keep repeating "git log -U8" from the command
line; instead, it becomes sufficient to say "git config
diff.context 8" just once.
Signed-off-by: Jeff Muizelaar <jmuizelaar@mozilla.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Config option diff.statGraphWidth=<width> is equivalent to
--stat-graph-width=<width>, except that the config option is ignored
by format-patch.
For the graph-width limiting to be usable, it should happen
'automatically' once configured, hence the config option.
Nevertheless, graph width limiting only makes sense when used on a
wide terminal, so it should not influence the output of format-patch,
which adheres to the 80-column standard.
Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.
The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).
For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.
In linux-2.6.git, running the three different --dirstat modes:
time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null
yields the following average runtimes on my machine:
- "changes" (default): ~6.0 s
- "lines": ~9.6 s
- "files": ~0.1 s
So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.
The patch also includes documentation and tests for the new dirstat mode.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new diff.dirstat config variable takes the same arguments as
'--dirstat=<args>', and specifies the default arguments for --dirstat.
The config is obviously overridden by --dirstat arguments passed on the
command line.
When not specified, the --dirstat defaults are 'changes,noncumulative,3'.
The patch also adds several tests verifying the interaction between the
diff.dirstat config variable, and the --dirstat command line option.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apart from the list of "valid values", 'diff.tool' can take any value,
provided there is a corresponding 'difftool.<tool>.cmd' option. Also,
describe this option just before the 'difftool.*' options.
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although the gitattributes page contains comprehensive information
about these configuration options, they should be included in the
config documentation for completeness.
It may be better to rename the "driver" in "diff.<driver>.*" to
something like "content type" or "file type", but for now, let's keep
it consistent across this part of the documentation and the original
description in the gitattributes documentation.
Helped-by: Jakub Narebski <jnareb@gmail.com>
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>