When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.
This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).
The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).
The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.
The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).
For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.
In linux-2.6.git, running the three different --dirstat modes:
time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null
yields the following average runtimes on my machine:
- "changes" (default): ~6.0 s
- "lines": ~9.6 s
- "files": ~0.1 s
So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.
The patch also includes documentation and tests for the new dirstat mode.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only the first digit after the decimal point is kept, as the dirstat
calculations all happen in permille.
Selftests verifying floating-point percentage input has been added.
Improved-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new diff.dirstat config variable takes the same arguments as
'--dirstat=<args>', and specifies the default arguments for --dirstat.
The config is obviously overridden by --dirstat arguments passed on the
command line.
When not specified, the --dirstat defaults are 'changes,noncumulative,3'.
The patch also adds several tests verifying the interaction between the
diff.dirstat config variable, and the --dirstat command line option.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of having multiple interconnected dirstat-related options, teach
the --dirstat option itself to accept all behavior modifiers as parameters.
- Preserve the current --dirstat=<limit> (where <limit> is an integer
specifying a cut-off percentage)
- Add --dirstat=cumulative, replacing --cumulative
- Add --dirstat=files, replacing --dirstat-by-file
- Also add --dirstat=changes and --dirstat=noncumulative for specifying the
current default behavior. These allow the user to reset other --dirstat
parameters (e.g. 'cumulative' and 'files') occuring earlier on the
command line.
The deprecated options (--cumulative and --dirstat-by-file) are still
functional, although they have been removed from the documentation.
Allow multiple parameters to be separated by commas, e.g.:
--dirstat=files,10,cumulative
Update the documentation accordingly, and add testcases verifying the
behavior of the new syntax.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The expected output from --dirstat=0, is to include any directory with
changes, even if those changes contribute a minuscule portion of the total
changes. However, currently, directories that contribute less than 0.1% are
not included, since their 'permille' value is 0, and there is an
'if (permille)' check in gather_dirstat() that causes them to be ignored.
This test is obviously intended to exclude directories that contribute no
changes whatsoever, but in this case, it hits too broadly. The correct
check is against 'this_dir' from which the permille is calculated. Only if
this value is 0 does the directory truly contribute no changes, and should
be skipped from the output.
This patches fixes this issue, and updates corresponding testcases to
expect the new behvaior.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, t4013 is the only selftest that exercises the --dirstat machinery,
but it only does a superficial verification of --dirstat's output.
This patch adds a new selftest - t4047-diff-dirstat.sh - which prepares a
commit containing:
- unchanged files, changed files and files with rearranged lines
- copied files, moved files, and unmoved files
It then verifies the correct dirstat output for that commit in the following
dirstat modes:
- --dirstat
- -X
- --dirstat=0
- -X0
- --cumulative
- --dirstat-by-file
- (plus combinations of the above)
Each of the above tests are also run with:
- no rename detection
- rename detection (-M)
- expensive copy detection (-C -C)
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply parameter expansion. Also use here document to save
test results instead of appending each line with ">>".
Signed-off-by: Mathias Lafeldt <misfire@debugon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/rename-degrade-cc-to-c:
diffcore-rename: fall back to -C when -C -C busts the rename limit
diffcore-rename: record filepair for rename src
diffcore-rename: refactor "too many candidates" logic
builtin/diff.c: remove duplicated call to diff_result_code()
* mz/rebase: (34 commits)
rebase: define options in OPTIONS_SPEC
Makefile: do not install sourced rebase scripts
rebase: use @{upstream} if no upstream specified
rebase -i: remove unnecessary state rebase-root
rebase -i: don't read unused variable preserve_merges
git-rebase--am: remove unnecessary --3way option
rebase -m: don't print exit code 2 when merge fails
rebase -m: remember allow_rerere_autoupdate option
rebase: remember strategy and strategy options
rebase: remember verbose option
rebase: extract code for writing basic state
rebase: factor out sub command handling
rebase: make -v a tiny bit more verbose
rebase -i: align variable names
rebase: show consistent conflict resolution hint
rebase: extract am code to new source file
rebase: extract merge code to new source file
rebase: remove $branch as synonym for $orig_head
rebase -i: support --stat
rebase: factor out call to pre-rebase hook
...
* en/merge-recursive:
merge-recursive: tweak magic band-aid
merge-recursive: When we detect we can skip an update, actually skip it
t6022: New test checking for unnecessary updates of files in D/F conflicts
t6022: New test checking for unnecessary updates of renamed+modified files
* jh/dirstat:
--dirstat: In case of renames, use target filename instead of source filename
Teach --dirstat not to completely ignore rearranged lines within a file
--dirstat-by-file: Make it faster and more correct
--dirstat: Describe non-obvious differences relative to --stat or regular diff
* mg/reflog-with-options:
reflog: fix overriding of command line options
t/t1411: test reflog with formats
builtin/log.c: separate default and setup of cmd_log_init()
When relative dates are more than about a year ago, we start
writing them as "Y years, M months". At the point where we
calculate Y and M, we have the time delta specified as a
number of days. We calculate these integers as:
Y = days / 365
M = (days % 365 + 15) / 30
This rounds days in the latter half of a month up to the
nearest month, so that day 16 is "1 month" (or day 381 is "1
year, 1 month").
We don't round the year at all, though, meaning we can end
up with "1 year, 12 months", which is silly; it should just
be "2 years".
Implement this differently with months of size
onemonth = 365/12
so that
totalmonths = (long)( (days + onemonth/2)/onemonth )
years = totalmonths / 12
months = totalmonths % 12
In order to do this without floats, we write the first formula as
totalmonths = (days*12*2 + 365) / (365*2)
Tests and inspiration by Jeff King.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On systems where the local time and file modification time may be out of
sync (e.g. test directory on NFS) t3306 and t5305 can fail because prune
compares times such as "now" (client time) with file modification times
(server times for remote file systems). I.e., these are spurious test
failures.
Avoid this by setting the relevant modification times to the local time.
Noticed on a system with as little as 2s time skew.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_value in config.c has a static buffer of 1024 bytes that it
parse the value into. This can sometimes be a problem when a
config file contains very long values.
It's particularly amusing that git-config already is able to write
such files, so it should probably be able to read them as well.
Fix this by using a strbuf instead of a fixed-size buffer.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, the --dirstat analysis ignores when lines within a file are
rearranged, because the "damage" calculated by show_dirstat() is 0.
However, if the object name has changed, we already know that there is
some damage, and it is unintuitive to claim there is _no_ damage.
Teach show_dirstat() to assign a minimum amount of damage (== 1) to
entries for which the analysis otherwise yields zero damage, to still
represent that these files are changed, instead of saying that there
is no change.
Also, skip --dirstat analysis when the object names are the same (e.g. for
a pure file rename).
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, when using --dirstat-by-file, it first does the full --dirstat
analysis (using diffcore_count_changes()), and then resets 'damage' to 1,
if any damage was found by diffcore_count_changes().
But --dirstat-by-file is not interested in the file damage per se. It only
cares if the file changed at all. In that sense it only cares if the blob
object for a file has changed. We therefore only need to compare the
object names of each file pair in the diff queue and we can skip the
entire --dirstat analysis and simply set 'damage' to 1 for each entry
where the object name has changed.
This makes --dirstat-by-file faster, and also bypasses --dirstat's practice
of ignoring rearranged lines within a file.
The patch also contains an added testcase verifying that --dirstat-by-file
now detects changes that only rearrange lines within a file.
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also add a testcase documenting the current behavior.
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Don't assume one comes after the other on the command line. Use a
three-state variable to track and check its value accordingly.
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One of these passes just fine; the other one exposes a problem where
command line flag order matters for --no-keep-index and --patch
interaction.
Signed-off-by: Dan McGee <dpmcgee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/maint-remote-mirror-safer:
remote: deprecate --mirror
remote: separate the concept of push and fetch mirrors
remote: disallow some nonsensical option combinations
Before we apply a stash, we make sure there are no changes
in the worktree that are not in the index. This check dates
back to the original git-stash.sh, and is presumably
intended to prevent changes in the working tree from being
accidentally lost during the merge.
However, this check has two problems:
1. It is overly restrictive. If my stash changes only file
"foo", but "bar" is dirty in the working tree, it will
prevent us from applying the stash.
2. It is redundant. We don't touch the working tree at all
until we actually call merge-recursive. But it has its
own (much more accurate) checks to avoid losing working
tree data, and will abort the merge with a nicer
message telling us which paths were problems.
So we can simply drop the check entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King reported a problem with git stash apply incorrectly
applying an invalid stash reference.
There is an existing test that should have caught this, but
the test itself was broken, resulting in a false positive.
Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once upon a time, "git rev-parse ref@{9999999}" did not
generate an error. Therefore when we got an invalid stash
reference in "stash apply", we could end up not noticing
until quite late. Commit b0f0ecd (detached-stash: work
around git rev-parse failure to detect bad log refs,
2010-08-21) handled this by checking for the "Log for stash
has only %d entries" warning on stderr when we validated the
ref.
A few days later, e6eedc3 (rev-parse: exit with non-zero
status if ref@{n} is not valid., 2010-08-24) fixed the
original issue. That made the extra stderr test superfluous,
but also introduced a new bug. Now the early call to:
git rev-parse --symbolic "$@"
fails, but we don't notice the exit code. Worse, its empty
output means we think the user didn't provide us a ref, and
we try to apply stash@{0}.
This patch checks the rev-parse exit code and fails early in
the revision parsing process. We can also get rid of the
stderr test; as a bonus, this means that "stash apply" can
now run under GIT_TRACE=1 properly.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jl/submodule-fetch-on-demand:
fetch/pull: Describe --recurse-submodule restrictions in the BUGS section
submodule update: Don't fetch when the submodule commit is already present
fetch/pull: Don't recurse into a submodule when commits are already present
Submodules: Add 'on-demand' value for the 'fetchRecurseSubmodule' option
config: teach the fetch.recurseSubmodules option the 'on-demand' value
fetch/pull: Add the 'on-demand' value to the --recurse-submodules option
fetch/pull: recurse into submodules when necessary
Conflicts:
builtin/fetch.c
submodule.c
For a pull into an unborn branch, we do not use "git merge"
at all. Instead, we call read-tree directly. However, we
used the --reset parameter instead of "-m", which turns off
the safety features.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/format-patch-multiline-header:
format-patch: rfc2047-encode newlines in headers
format-patch: wrap long header lines
strbuf: add fixed-length version of add_wrapped_text
Currently, the "Did you mean..." message suggests "commit:fullpath"
only. Extend this to show the more convenient "commit:./file" form also.
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>