This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.
The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).
For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.
In linux-2.6.git, running the three different --dirstat modes:
time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null
yields the following average runtimes on my machine:
- "changes" (default): ~6.0 s
- "lines": ~9.6 s
- "files": ~0.1 s
So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.
The patch also includes documentation and tests for the new dirstat mode.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new diff.dirstat config variable takes the same arguments as
'--dirstat=<args>', and specifies the default arguments for --dirstat.
The config is obviously overridden by --dirstat arguments passed on the
command line.
When not specified, the --dirstat defaults are 'changes,noncumulative,3'.
The patch also adds several tests verifying the interaction between the
diff.dirstat config variable, and the --dirstat command line option.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of having multiple interconnected dirstat-related options, teach
the --dirstat option itself to accept all behavior modifiers as parameters.
- Preserve the current --dirstat=<limit> (where <limit> is an integer
specifying a cut-off percentage)
- Add --dirstat=cumulative, replacing --cumulative
- Add --dirstat=files, replacing --dirstat-by-file
- Also add --dirstat=changes and --dirstat=noncumulative for specifying the
current default behavior. These allow the user to reset other --dirstat
parameters (e.g. 'cumulative' and 'files') occuring earlier on the
command line.
The deprecated options (--cumulative and --dirstat-by-file) are still
functional, although they have been removed from the documentation.
Allow multiple parameters to be separated by commas, e.g.:
--dirstat=files,10,cumulative
Update the documentation accordingly, and add testcases verifying the
behavior of the new syntax.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mz/rebase: (34 commits)
rebase: define options in OPTIONS_SPEC
Makefile: do not install sourced rebase scripts
rebase: use @{upstream} if no upstream specified
rebase -i: remove unnecessary state rebase-root
rebase -i: don't read unused variable preserve_merges
git-rebase--am: remove unnecessary --3way option
rebase -m: don't print exit code 2 when merge fails
rebase -m: remember allow_rerere_autoupdate option
rebase: remember strategy and strategy options
rebase: remember verbose option
rebase: extract code for writing basic state
rebase: factor out sub command handling
rebase: make -v a tiny bit more verbose
rebase -i: align variable names
rebase: show consistent conflict resolution hint
rebase: extract am code to new source file
rebase: extract merge code to new source file
rebase: remove $branch as synonym for $orig_head
rebase -i: support --stat
rebase: factor out call to pre-rebase hook
...
* jh/dirstat:
--dirstat: In case of renames, use target filename instead of source filename
Teach --dirstat not to completely ignore rearranged lines within a file
--dirstat-by-file: Make it faster and more correct
--dirstat: Describe non-obvious differences relative to --stat or regular diff
* rr/doc-content-type:
Documentation: Allow custom diff tools to be specified in 'diff.tool'
Documentation: Add diff.<driver>.* to config
Documentation: Move diff.<driver>.* from config.txt to diff-config.txt
Documentation: Add filter.<driver>.* to config
6abd933 (git-svn: allow the mergeinfo property to be set, 2010-09-24)
introduced the --mergeinfo option. Document it.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The local value of the config variable tar.umask is not passed to the
other side with --remote. We may want to change that, but for now just
document this fact.
Reported-by: Jacek Masiulaniec <jacek.masiulaniec@gmail.com>
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove a spurious empty line which prevented asciidoc from recognizing a
list continuation mark ('+'), so that it does not get output literally any
more.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I found that some doubled words had snuck back into projects from which
I'd already removed them, so now there's a "syntax-check" makefile rule in
gnulib to help prevent recurrence.
Running the command below spotted a few in git, too:
git ls-files | xargs perl -0777 -n \
-e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt])\s+\1\b/gims)' \
-e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g;' \
-e 'print "$ARGV:$n:$v\n"}'
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, the --dirstat analysis ignores when lines within a file are
rearranged, because the "damage" calculated by show_dirstat() is 0.
However, if the object name has changed, we already know that there is
some damage, and it is unintuitive to claim there is _no_ damage.
Teach show_dirstat() to assign a minimum amount of damage (== 1) to
entries for which the analysis otherwise yields zero damage, to still
represent that these files are changed, instead of saying that there
is no change.
Also, skip --dirstat analysis when the object names are the same (e.g. for
a pure file rename).
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also add a testcase documenting the current behavior.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to the 'remote.<name>.pushurl' config key for git remotes,
'pushurl' is designed to be used in cases where 'url' points to an SVN
repository via a read-only transport, to provide an alternate
read/write transport. It is assumed that both keys point to the same
repository.
The 'pushurl' key is distinct from the 'commiturl' key in that
'commiturl' is a full svn path while 'pushurl' (like 'url') is a base
path. 'commiturl' takes precendece over 'pushurl' in cases where
either might be used.
The 'pushurl' is used by git-svn's dcommit and branch commands.
Signed-off-by: Alejandro R. Sedeño <asedeno@mit.edu>
Reviewed-by: James Y Knight <jknight@itasoftware.com>
Acked-by: Eric Wong <normalperson@yhbt.net>
Apart from the list of "valid values", 'diff.tool' can take any value,
provided there is a corresponding 'difftool.<tool>.cmd' option. Also,
describe this option just before the 'difftool.*' options.
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although the gitattributes page contains comprehensive information
about these configuration options, they should be included in the
config documentation for completeness.
It may be better to rename the "driver" in "diff.<driver>.*" to
something like "content type" or "file type", but for now, let's keep
it consistent across this part of the documentation and the original
description in the gitattributes documentation.
Helped-by: Jakub Narebski <jnareb@gmail.com>
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although the gitattributes page contains comprehensive information
about these configuration options, they should be included in the
config documentation for completeness.
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mg/rev-list-n-reverse-doc:
git-log.txt,rev-list-options.txt: put option blocks in proper order
git-log.txt,rev-list-options.txt: -n/--max-count is commit limiting
* jk/maint-remote-mirror-safer:
remote: deprecate --mirror
remote: separate the concept of push and fetch mirrors
remote: disallow some nonsensical option combinations
The pack-objects command should take notice of the object file and
refrain from attempting to delta large ones, to be consistent with
the fast-import command.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If user sets config.abbrev option, use it as if --abbrev was given. This
is the default value and user can override different abbrev length by
specifying the --abbrev=N command line option.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jl/submodule-fetch-on-demand:
fetch/pull: Describe --recurse-submodule restrictions in the BUGS section
submodule update: Don't fetch when the submodule commit is already present
fetch/pull: Don't recurse into a submodule when commits are already present
Submodules: Add 'on-demand' value for the 'fetchRecurseSubmodule' option
config: teach the fetch.recurseSubmodules option the 'on-demand' value
fetch/pull: Add the 'on-demand' value to the --recurse-submodules option
fetch/pull: recurse into submodules when necessary
Conflicts:
builtin/fetch.c
submodule.c
As 1.7.4.3 has backmerged a handful of fixes from the master,
drop these entries from 1.7.5 release notes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>