All other codepaths refrain from running textual diff when either the old
or the new side is binary, but this function only checks the new side. I
was almost going to change it to check both, but that would be a bad
change. Explain why to prevent future mistakes.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff --check" should return non-zero when there was any whitespace
error but the code only paid attention to the error status of the last
new line in the patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/test:
enable whitespace checking of test scripts
avoid trailing whitespace in zero-change diffstat lines
avoid whitespace on empty line in automatic usage message
mask necessary whitespace policy violations in test scripts
fix whitespace violations in test scripts
It worked that way since commit 50f575fc (Tweak diff colors,
2006-06-22), but commit c1795bb0 (Unify whitespace checking, 2007-12-13)
changed it. This patch restores the old behaviour.
Besides Linus' arguments in the log message of 50f575fc, resetting color
before printing newline is also important to keep 'git add --patch'
happy. If the last line(s) of a file are removed, then that hunk will
end with a colored line. However, if the newline comes before the color
reset, then the diff output will have an additional line at the end
containing only the reset sequence. This causes trouble in
git-add--interactive.perl's parse_diff function, because @colored will
have one more element than @diff, and that last element will contain the
color reset. The elements of these arrays will then be copied to @hunk,
but only as many as the number of elements in @diff. As a result the
last color reset is lost and all subsequent terminal output will be
printed in color.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some cases, we produce a diffstat line even though no
lines have changed (e.g., because of an exact rename). In
this case, there is no +/- "graph" after the number of
changed lines. However, we output the space separator
unconditionally, meaning that these lines contained a
trailing space character.
This isn't a huge problem, but in cleaning up the output we
are able to eliminate some trailing whitespace from a test
vector.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The new option --ignore-submodules can now be used to ignore changes in
submodules.
Why? Sometimes it is not interesting when a submodule changed.
For example, when reordering some commits in the superproject, a dirty
submodule is usually totally uninteresting. So we will use this option
in git-rebase to test for a dirty working tree.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_config() only had a function parameter, but no callback data
parameter. This assumes that all callback functions only modify
global variables.
With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current rename limit default of 100 was arbitrarily
chosen. Testing[1] has shown that on modern hardware, a
limit of 200 adds about a second of computation time, and a
limit of 500 adds about 5 seconds of computation time.
This patch bumps the default limit to 200 for viewing diffs,
and to 500 for performing a merge. The limit for generating
git-status templates is set independently; we bump it up to
200 here, as well, to match the diff limit.
[1]: See <20080211113516.GB6344@coredump.intra.peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These variables were made unnecessary by commit
3969cf7db1.
Signed-off-by: Adam Simpkins <adam@adamsimpkins.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of counting added and removed lines (and mixing the byte size
reported for binary files in the result), summarize the extent of damage
the same way as we count similarity for rename detection.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ping Yin noticed that "git diff-index --raw" shows 0{40} when work tree
has submodule difference, but "git diff --raw" didn't correctly do so.
There was a mistake in the diffcore_skip_stat_unmatch() that was meant to
clean up the stat-only difference for running diff between the index and
work tree and diff between the tree and the work tree, to cause it re-read
from the submodule repository HEAD. When ce_stat_match() says work tree
is different, we should always say 0{40} on the work tree side.
This patch fixes the issue, and adds tests.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now find_unique_abbrev() never returns NULL, there is no need for callers
to prepare for seeing NULL and fall back to giving the full 40-hexdigits.
While we are at it, drop "..." in the "git reset" output that reports the
location of the new HEAD, between the abbreviated commit object name and
the one line commit summary. Because we are always showing the HEAD
(which cannot be missing!), we never had a case where we show the full 40
hexdigits that is not followed by three dots, and these three dots were
stealing 3 columns from the precious horizontal screen real estate out of
80 that can better be used for the one line commit summary.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Documentation/git-am.txt: Pass -r in the example invocation of rm -f .dotest
timezone_names[]: fixed the tz offset for New Zealand.
filter-branch documentation: non-zero exit status in command abort the filter
rev-parse: fix potential bus error with --parseopt option spec handling
Use a single implementation and API for copy_file()
Documentation/git-filter-branch: add a new msg-filter example
Correct fast-export file mode strings to match fast-import standard
Originally by Kristian Hï¿œgsberg; I fixed the conversion of rerere, which
had a different API.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do not account binary nor unmerged files when --shortstat is
asked for (or the summary stat at the end of --stat).
The new option --dirstat should work the same way as it is about
summarizing the changes of multiple files by adding them up.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change removes all obvious useless if-before-free tests.
E.g., it replaces code like this:
if (some_expression)
free (some_expression);
with the now-equivalent:
free (some_expression);
It is equivalent not just because POSIX has required free(NULL)
to work for a long time, but simply because it has worked for
so long that no reasonable porting target fails the test.
Here's some evidence from nearly 1.5 years ago:
http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html
FYI, the change below was prepared by running the following:
git ls-files -z | xargs -0 \
perl -0x3b -pi -e \
's/\bif\s*\(\s*(\S+?)(?:\s*!=\s*NULL)?\s*\)\s+(free\s*\(\s*\1\s*\))/$2/s'
Note however, that it doesn't handle brace-enclosed blocks like
"if (x) { free (x); }". But that's ok, since there were none like
that in git sources.
Beware: if you do use the above snippet, note that it can
produce syntactically invalid C code. That happens when the
affected "if"-statement has a matching "else".
E.g., it would transform this
if (x)
free (x);
else
foo ();
into this:
free (x);
else
foo ();
There were none of those here, either.
If you're interested in automating detection of the useless
tests, you might like the useless-if-before-free script in gnulib:
[it *does* detect brace-enclosed free statements, and has a --name=S
option to make it detect free-like functions with different names]
http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free
Addendum:
Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Solaris regex library doesn't like having the '$' anchor
inside capture parentheses. It rejects the match, causing
t4018 to fail.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
commit: discard index after setting up partial commit
filter-branch: handle filenames that need quoting
diff: Fix miscounting of --check output
hg-to-git: fix parent analysis
mailinfo: feed only one line to handle_filter() for QP input
diff.c: add "const" qualifier to "char *cmd" member of "struct ll_diff_driver"
Add "const" qualifier to "char *excludes_file".
Add "const" qualifier to "char *editor_program".
Add "const" qualifier to "char *pager_program".
config: add 'git_config_string' to refactor string config variables.
diff.c: remove useless check for value != NULL
fast-import: check return value from unpack_entry()
Validate nicknames of remote branches to prohibit confusing ones
diff.c: replace a 'strdup' with 'xstrdup'.
diff.c: fixup garding of config parser from value=NULL
c1795bb (Unify whitespace checking) incorrectly made the
checking function return without incrementing the line numbers
when there is no whitespace problem is found on a '+' line.
This resurrects the earlier behaviour.
Noticed and reported by Jay Soffian. The test script was stolen
from Jay's independent fix.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also use "git_config_string" to simplify code where "cmd" is set.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not necessary to check if value != NULL before calling
'parse_lldiff_command' as there is already a check inside this
function.
By the way this patch also improves the existing check inside
'parse_lldiff_command' by using:
return config_error_nonbool(var);
instead of:
return error("%s: lacks value", var);
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Christian Couder noticed that there still were a handcrafted error()
call that we should have converted to config_error_nonbool() where
parse_lldiff_command() parses the configuration file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows the --relative option to say which subdirectory to
pretend to be in, so that in a bare repository, you can say:
$ git log --relative=drivers/ v2.6.20..v2.6.22 -- drivers/scsi/
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds --relative option to the diff family. When you start
from a subdirectory:
$ git diff --relative
shows only the diff that is inside your current subdirectory,
and without $prefix part. People who usually live in
subdirectories may like it.
There are a few things I should also mention about the change:
- This works not just with diff but also works with the log
family of commands, but the history pruning is not affected.
In other words, if you go to a subdirectory, you can say:
$ git log --relative -p
but it will show the log message even for commits that do not
touch the current directory. You can limit it by giving
pathspec yourself:
$ git log --relative -p .
This originally was not a conscious design choice, but we
have a way to affect diff pathspec and pruning pathspec
independently. IOW "git log --full-diff -p ." tells it to
prune history to commits that affect the current subdirectory
but show the changes with full context. I think it makes
more sense to leave pruning independent from --relative than
the obvious alternative of always pruning with the current
subdirectory, which would break the symmetry.
- Because this works also with the log family, you could
format-patch a single change, limiting the effect to your
subdirectory, like so:
$ cd gitk-git
$ git format-patch -1 --relative 911f1eb
But because that is a special purpose usage, this option will
never become the default, with or without repository or user
preference configuration. The risk of producing a partial
patch and sending it out by mistake is too great if we did
so.
- This is inherently incompatible with --no-index, which is a
bolted-on hack that does not have much to do with git
itself. I didn't bother checking and erroring out on the
combined use of the options, but probably I should.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds a new form of overview diffstat output, doing something that I
have occasionally ended up doing manually (and badly, because it's
actually pretty nasty to do), and that I think is very useful for an
project like the kernel that has a fairly deep and well-separated
directory structure with semantic meaning.
What I mean by that is that it's often interesting to see exactly which
sub-directories are impacted by a patch, and to what degree - even if you
don't perhaps care so much about the individual files themselves.
What makes the concept more interesting is that the "impact" is often
hierarchical: in the kernel, for example, something could either have a
very localized impact to "fs/ext3/" and then it's interesting to see that
such a patch changes mostly that subdirectory, but you could have another
patch that changes some generic VFS-layer issue which affects _many_
subdirectories that are all under "fs/", but none - or perhaps just a
couple of them - of the individual filesystems are interesting in
themselves.
So what commonly happens is that you may have big changes in a specific
sub-subdirectory, but still also significant separate changes to the
subdirectory leading up to that - maybe you have significant VFS-level
changes, but *also* changes under that VFS layer in the NFS-specific
directories, for example. In that case, you do want the low-level parts
that are significant to show up, but then the insignificant ones should
show up as under the more generic top-level directory.
This patch shows all of that with "--dirstat". The output can be either
something simple like
commit 81772fe...
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Sun Feb 10 23:57:36 2008 +0100
x86: remove over noisy debug printk
pageattr-test.c contains a noisy debug printk that people reported.
The condition under which it prints (randomly tapping into a mem_map[]
hole and not being able to c_p_a() there) is valid behavior and not
interesting to report.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
100.0% arch/x86/mm/
or something much more complex like
commit e231c2e...
Author: David Howells <dhowells@redhat.com>
Date: Thu Feb 7 00:15:26 2008 -0800
Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p)
20.5% crypto/
7.6% fs/afs/
7.6% fs/fuse/
7.6% fs/gfs2/
5.1% fs/jffs2/
5.1% fs/nfs/
5.1% fs/nfsd/
7.6% fs/reiserfs/
15.3% fs/
7.6% net/rxrpc/
10.2% security/keys/
where that latter example is an example of significant work in some
individual fs/*/ subdirectories (like the patches to reiserfs accounting
for 7.6% of the whole), but then discounting those individual filesystems,
there's also 15.3% other "random" things that weren't worth reporting on
their oen left over under fs/ in general (either in that directory itself,
or in subdirectories of fs/ that didn't have enough changes to be reported
individually).
I'd like to stress that the "15.3% fs/" mentioned above is the stuff that
is under fs/ but that was _not_ significant enough to report on its own.
So the above does _not_ mean that 15.3% of the work was under fs/ per se,
because that 15.3% does *not* include the already-reported 7.6% of afs,
7.6% of fuse etc.
If you want to enable "cumulative" directory statistics, you can use the
"--cumulative" flag, which adds up percentages recursively even when
they have been already reported for a sub-directory. That cumulative
output is disabled if *all* of the changes in one subdirectory come from
a deeper subdirectory, to avoid repeating subdirectories all the way to
the root.
For an example of the cumulative reporting, the above commit becomes
commit e231c2e...
Author: David Howells <dhowells@redhat.com>
Date: Thu Feb 7 00:15:26 2008 -0800
Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p)
20.5% crypto/
7.6% fs/afs/
7.6% fs/fuse/
7.6% fs/gfs2/
5.1% fs/jffs2/
5.1% fs/nfs/
5.1% fs/nfsd/
7.6% fs/reiserfs/
61.5% fs/
7.6% net/rxrpc/
10.2% security/keys/
in which the commit percentages now obviously add up to much more than
100%: now the changes that were already reported for the sub-directories
under fs/ are then cumulatively included in the whole percentage of fs/
(ie now shows 61.5% as opposed to the 15.3% without the cumulative
reporting).
The default reporting limit has been arbitrarily set at 3%, which seems
to be a pretty good cut-off, but you can specify the cut-off manually by
giving it as an option parameter (eg "--dirstat=5" makes the cut-off be
at 5% instead)
NOTE! The percentages are purely about the total lines added and removed,
not anything smarter (or dumber) than that. Also note that you should not
generally expect things to add up to 100%: not only does it round down, we
don't report leftover scraps (they add up to the top-level change count,
but we don't even bother reporting that, it only reports subdirectories).
Quite frankly, as a top-level manager this is really convenient for me,
but it's going to be very boring for git itself since there are few
subdirectories. Also, don't expect things to make tons of sense if you
combine this with "-M" and there are cross-directory renames etc.
But even for git itself, you can get some fun statistics. Try out
git log --dirstat
and see the occasional mentions of things like Documentation/, git-gui/,
gitweb/ and gitk-git/. Or try out something like
git diff --dirstat v1.5.0..v1.5.4
which does kind of git an overview that shows *something*. But in general,
the output is more exciting for big projects with deeper structure, and
doing a
git diff --dirstat v2.6.24..v2.6.25-rc1
on the kernel is what I actually wrote this for!
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* lt/in-core-index:
lazy index hashing
Create pathname-based hash-table lookup into index
read-cache.c: introduce is_racy_timestamp() helper
read-cache.c: fix a couple more CE_REMOVE conversion
Also use unpack_trees() in do_diff_cache()
Make run_diff_index() use unpack_trees(), not read_tree()
Avoid running lstat(2) on the same cache entry.
index: be careful when handling long names
Make on-disk index representation separate from in-core one
diff.external, diff.*.command, diff.color.*, color.diff.* and
diff.*.funcname configuration variables expect a string value.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
CRLF conversion bears a slight chance of corrupting data.
autocrlf=true will convert CRLF to LF during commit and LF to
CRLF during checkout. A file that contains a mixture of LF and
CRLF before the commit cannot be recreated by git. For text
files this is the right thing to do: it corrects line endings
such that we have only LF line endings in the repository.
But for binary files that are accidentally classified as text the
conversion can corrupt data.
If you recognize such corruption early you can easily fix it by
setting the conversion type explicitly in .gitattributes. Right
after committing you still have the original file in your work
tree and this file is not yet corrupted. You can explicitly tell
git that this file is binary and git will handle the file
appropriately.
Unfortunately, the desired effect of cleaning up text files with
mixed line endings and the undesired effect of corrupting binary
files cannot be distinguished. In both cases CRLFs are removed
in an irreversible way. For text files this is the right thing
to do because CRLFs are line endings, while for binary files
converting CRLFs corrupts data.
This patch adds a mechanism that can either warn the user about
an irreversible conversion or can even refuse to convert. The
mechanism is controlled by the variable core.safecrlf, with the
following values:
- false: disable safecrlf mechanism
- warn: warn about irreversible conversions
- true: refuse irreversible conversions
The default is to warn. Users are only affected by this default
if core.autocrlf is set. But the current default of git is to
leave core.autocrlf unset, so users will not see warnings unless
they deliberately chose to activate the autocrlf mechanism.
The safecrlf mechanism's details depend on the git command. The
general principles when safecrlf is active (not false) are:
- we warn/error out if files in the work tree can modified in an
irreversible way without giving the user a chance to backup the
original file.
- for read-only operations that do not modify files in the work tree
we do not not print annoying warnings.
There are exceptions. Even though...
- "git add" itself does not touch the files in the work tree, the
next checkout would, so the safety triggers;
- "git apply" to update a text file with a patch does touch the files
in the work tree, but the operation is about text files and CRLF
conversion is about fixing the line ending inconsistencies, so the
safety does not trigger;
- "git diff" itself does not touch the files in the work tree, it is
often run to inspect the changes you intend to next "git add". To
catch potential problems early, safety triggers.
The concept of a safety check was originally proposed in a similar
way by Linus Torvalds. Thanks to Dimitry Potapov for insisting
on getting the naked LF/autocrlf=true case right.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Aside from the lstat(2) done for work tree files, there are
quite many lstat(2) calls in refname dwimming codepath. This
patch is not about reducing them.
* It adds a new ce_flag, CE_UPTODATE, that is meant to mark the
cache entries that record a regular file blob that is up to
date in the work tree. If somebody later walks the index and
wants to see if the work tree has changes, they do not have
to be checked with lstat(2) again.
* fill_stat_cache_info() marks the cache entry it just added
with CE_UPTODATE. This has the effect of marking the paths
we write out of the index and lstat(2) immediately as "no
need to lstat -- we know it is up-to-date", from quite a lot
fo callers:
- git-apply --index
- git-update-index
- git-checkout-index
- git-add (uses add_file_to_index())
- git-commit (ditto)
- git-mv (ditto)
* refresh_cache_ent() also marks the cache entry that are clean
with CE_UPTODATE.
* write_index is changed not to write CE_UPTODATE out to the
index file, because CE_UPTODATE is meant to be transient only
in core. For the same reason, CE_UPDATE is not written to
prevent an accident from happening.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We truncate hunk-header line at 80 bytes, but that 80th byte
could be in the middle of a character, which is bad. This uses
pick_one_utf8_char() function to make sure we do not cut a character
in the middle.
This assumes that the internal representation of the text is
UTF-8. This needs to be extended in the future but the optimal
direction has not been decided yet.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no point to this. Either:
1. The program has already loaded git_diff_ui_config, in
which case this is a noop.
2. The program didn't, which means it is plumbing that
does not _want_ git_diff_ui_config to be loaded.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The funcname patterns influence the "comment" on @@ lines of
the diff. They are safe to use with plumbing since they
don't fundamentally change the meaning of the diff in any
way.
Since all diff users call either diff_ui_config or
diff_basic_config, we can get rid of the lazy reading of the
config.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The diff porcelain uses git_diff_ui_config to set
porcelain-ish config options, like automatically turning on
color. The plumbing specifically avoids calling this
function, since it doesn't want things like automatic color
or rename detection.
However, some diff options should be set for both plumbing
and porcelain. For example, one can still turn on color in
git-diff-files using the --color command line option. This
means we want the color config from color.diff.* (so that
once color is on, we use the user's preferred scheme), but
_not_ the color.diff variable.
We split the diff config into "ui" and "basic", where
"basic" is suitable for use by plumbing (so _most_ things
affecting the output should still go into the "ui" part).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This moves the logic to quote two paths (prefix + path) in
C-style introduced in the previous commit from the
dump_quoted_path() in combine-diff.c to quote.c, and uses it to
fix rewrite_diff() that never C-quoted the pathnames correctly.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the new options "--src-prefix=<prefix>", "--dst-prefix=<prefix>"
and "--no-prefix", you can now control the path prefixes of the diff
machinery. These used to by hardwired to "a/" for the source prefix
and "b/" for the destination prefix.
Initial patch by Pascal Obry. Sane option names suggested by Linus.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We had the diff.external variable in the documentation of the config
file since its conception, but failed to respect it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For consistency, make the two tools report whitespace errors in the
same way (the output of "diff --check" has been tweaked to match
that of "git apply").
Note that although the textual content is basically the same only
"git diff --check" provides a colorized version of the problematic
lines; making "git apply" do colorization will require more extensive
changes (figuring out the diff colorization preferences of the user)
and so that will be a subject for another commit.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit unifies three separate places where whitespace checking was
performed:
- the whitespace checking previously done in builtin-apply.c is
extracted into a function in ws.c
- the equivalent logic in "git diff" is removed
- the emit_line_with_ws() function is also removed because that also
rechecks the whitespace, and its functionality is rolled into ws.c
The new function is called check_and_emit_line() and it does two things:
checks a line for whitespace errors and optionally emits it. The checking
is based on lines of content rather than patch lines (in other words, the
caller must strip the leading "+" or "-"); this was suggested by Junio on
the mailing list to allow for a future extension to "git show" to display
whitespace errors in blobs.
At the same time we teach it to report all classes of whitespace errors
found for a given line rather than reporting only the first found error.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is no reason --exit-code and --check-diff must be mutually
exclusive, so assign different bits to different results and allow them
to be returned from the command. Introduce diff_result_code() to factor
out the common code to decide final status code based on diffopt
settings and use it everywhere.
Update tests to match the above fix.
Turning pager off when "diff --check" is used is a regression.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git diff" has a --check option that can be used to check for whitespace
problems but it only reported by printing warnings to the
console.
Now when the --check option is used we give a non-zero exit status,
making "git diff --check" nicer to use in scripts and hooks.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This inserts a new function xdi_diff() that currently does not
do anything other than calling the underlying xdl_diff() to the
callchain of current callers of xdl_diff() function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"diff --check" would only detect spaces before tabs if a tab was the
last character in the leading indent. Fix that and add a test case to
make sure the bug doesn't regress in the future.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "-z" format is all about machine parsability, but showing renamed
paths as "common/{a => b}/suffix" makes it impossible. The scripts would
never have successfully parsed "--numstat -z -M" in the old format.
This fixes the output format in a (hopefully minimally) backward
incompatible way.
* The output without -z is not changed. This has given a good way for
humans to view added and deleted lines separately, and showing the
path in combined, shorter way would preserve readability.
* The output with -z is unchanged for paths that do not involve renames.
Existing scripts that do not pass -M/-C are not affected at all.
* The output with -z for a renamed path is shown in a format that can
easily be distinguished from an unrenamed path.
This is based on Jakub Narebski's patch. Bugs and documentation typos
are mine.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For consistency, change "white space" and "whitespaces" to
"whitespace", fixing a couple of adjacent grammar problems in the
docs.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/spht:
Use gitattributes to define per-path whitespace rule
core.whitespace: documentation updates.
builtin-apply: teach whitespace_rules
builtin-apply: rename "whitespace" variables and fix styles
core.whitespace: add test for diff whitespace error highlighting
git-diff: complain about >=8 consecutive spaces in initial indent
War on whitespace: first, a bit of retreat.
Conflicts:
cache.h
config.c
diff.c
The `core.whitespace` configuration variable allows you to define what
`diff` and `apply` should consider whitespace errors for all paths in
the project (See gitlink:git-config[1]). This attribute gives you finer
control per path.
For example, if you have these in the .gitattributes:
frotz whitespace
nitfol -whitespace
xyzzy whitespace=-trailing
all types of whitespace problems known to git are noticed in path 'frotz'
(i.e. diff shows them in diff.whitespace color, and apply warns about
them), no whitespace problem is noticed in path 'nitfol', and the
default types of whitespace problems except "trailing whitespace" are
noticed for path 'xyzzy'. A project with mixed Python and C might want
to have:
*.c whitespace
*.py whitespace=-indent-with-non-tab
in its toplevel .gitattributes file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds an option to help scripts find out color settings from
the configuration file.
git config --get-colorbool color.diff
inspects color.diff variable, and exits with status 0 (i.e. success) if
color is to be used. It exits with status 1 otherwise.
If a script wants "true"/"false" answer to the standard output of the
command, it can pass an additional boolean parameter to its command
line, telling if its standard output is a terminal, like this:
git config --get-colorbool color.diff true
When called like this, the command outputs "true" to its standard output
if color is to be used (i.e. "color.diff" says "always", "auto", or
"true"), and "false" otherwise.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
663af3422a (Full rework of
quote_c_style and write_name_quoted.) mistakenly used puts()
when writing out a fixed string when it did not want to add a
terminating LF.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
reverse_diff was a bit-value in disguise, it's merged in the flags now.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This introduces a new whitespace error type, "indent-with-non-tab".
The error is about starting a line with 8 or more SP, instead of
indenting it with a HT.
This is not enabled by default, as some projects employ an
indenting policy to use only SPs and no HTs.
The kernel folks and git contributors may want to enable this
detection with:
[core]
whitespace = indent-with-non-tab
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This introduces core.whitespace configuration variable that lets
you specify the definition of "whitespace error".
Currently there are two kinds of whitespace errors defined:
* trailing-space: trailing whitespaces at the end of the line.
* space-before-tab: a SP appears immediately before HT in the
indent part of the line.
You can specify the desired types of errors to be detected by
listing their names (unique abbreviations are accepted)
separated by comma. By default, these two errors are always
detected, as that is the traditional behaviour. You can disable
detection of a particular type of error by prefixing a '-' in
front of the name of the error, like this:
[core]
whitespace = -trailing-space
This patch teaches the code to output colored diff with
DIFF_WHITESPACE color to highlight the detected whitespace
errors to honor the new configuration.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/forkexec:
Use the asyncronous function infrastructure to run the content filter.
Avoid a dup2(2) in apply_filter() - start_command() can do it for us.
t0021-conversion.sh: Test that the clean filter really cleans content.
upload-pack: Run rev-list in an asynchronous function.
upload-pack: Move the revision walker into a separate function.
Use the asyncronous function infrastructure in builtin-fetch-pack.c.
Add infrastructure to run a function asynchronously.
upload-pack: Use start_command() to run pack-objects in create_pack_file().
Have start_command() create a pipe to read the stderr of the child.
Use start_comand() in builtin-fetch-pack.c instead of explicit fork/exec.
Use run_command() to spawn external diff programs instead of fork/exec.
Use start_command() to run content filters instead of explicit fork/exec.
Use start_command() in git_connect() instead of explicit fork/exec.
Change git_connect() to return a struct child_process instead of a pid_t.
Conflicts:
builtin-fetch-pack.c
The core rename detection had some rather stupid code to check if a
pathname was used by a later modification or rename, which basically
walked the whole pathname space for all renames for each rename, in
order to tell whether it was a pure rename (no remaining users) or
should be considered a copy (other users of the source file remaining).
That's really silly, since we can just keep a count of users around, and
replace all those complex and expensive loops with just testing that
simple counter (but this all depends on the previous commit that shared
the diff_filespec data structure by using a separate reference count).
Note that the reference count is not the same as the rename count: they
behave otherwise rather similarly, but the reference count is tied to
the allocation (and decremented at de-allocation, so that when it turns
zero we can get rid of the memory), while the rename count is tied to
the renames and is decremented when we find a rename (so that when it
turns zero we know that it was a rename, not a copy).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than copy the filespecs when introducing new versions of them
(for rename or copy detection), use a refcount and increment the count
when reusing the diff_filespec.
This avoids unnecessary allocations, but the real reason behind this is
a future enhancement: we will want to track shared data across the
copy/rename detection. In order to efficiently notice when a filespec
is used by a rename, the rename machinery wants to keep track of a
rename usage count which is shared across all different users of the
filespec.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix size_t vs. unsigned long pointer mismatch warnings introduced
with the addition of strbuf_detach().
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* ph/strbuf: (44 commits)
Make read_patch_file work on a strbuf.
strbuf_read_file enhancement, and use it.
strbuf change: be sure ->buf is never ever NULL.
double free in builtin-update-index.c
Clean up stripspace a bit, use strbuf even more.
Add strbuf_read_file().
rerere: Fix use of an empty strbuf.buf
Small cache_tree_write refactor.
Make builtin-rerere use of strbuf nicer and more efficient.
Add strbuf_cmp.
strbuf_setlen(): do not barf on setting length of an empty buffer to 0
sq_quote_argv and add_to_string rework with strbuf's.
Full rework of quote_c_style and write_name_quoted.
Rework unquote_c_style to work on a strbuf.
strbuf API additions and enhancements.
nfv?asprintf are broken without va_copy, workaround them.
Fix the expansion pattern of the pseudo-static path buffer.
builtin-for-each-ref.c::copy_name() - do not overstep the buffer.
builtin-apply.c: fix a tiny leak introduced during xmemdupz() conversion.
Use xmemdupz() in many places.
...
We find rename candidates by computing a fingerprint hash of
each file, and then comparing those fingerprints. There are
inherently O(n^2) comparisons, so it pays in CPU time to
hoist the (rather expensive) computation of the fingerprint
out of that loop (or to cache it once we have computed it once).
Previously, we didn't keep the filespec information around
because then we had the potential to consume a great deal of
memory. However, instead of keeping all of the filespec
data, we can instead just keep the fingerprint.
This patch implements and uses diff_free_filespec_data_large
to accomplish that goal. We also have to change
estimate_similarity not to needlessly repopulate the
filespec data when we already have the hash.
Practical tests showed 4.5x speedup for a 10% memory usage
increase.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For that purpose, the ->buf is always initialized with a char * buf living
in the strbuf module. It is made a char * so that we can sloppily accept
things that perform: sb->buf[0] = '\0', and because you can't pass "" as an
initializer for ->buf without making gcc unhappy for very good reasons.
strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf
anymore.
as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying
->buf isn't an option anymore, if ->buf is going to escape from the scope,
and eventually be free'd.
API changes:
* strbuf_setlen now always works, so just make strbuf_reset a convenience
macro.
* strbuf_detatch takes a size_t* optional argument (meaning it can be
NULL) to copy the buffer's len, as it was needed for this refactor to
make the code more readable, and working like the callers.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* quote_c_style works on a strbuf instead of a wild buffer.
* quote_c_style is now clever enough to not add double quotes if not needed.
* write_name_quoted inherits those advantages, but also take a different
set of arguments. Now instead of asking for quotes or not, you pass a
"terminator". If it's \0 then we assume you don't want to escape, else C
escaping is performed. In any case, the terminator is also appended to the
stream. It also no longer takes the prefix/prefix_len arguments, as it's
seldomly used, and makes some optimizations harder.
* write_name_quotedpfx is created to work like write_name_quoted and take
the prefix/prefix_len arguments.
Thanks to those API changes, diff.c has somehow lost weight, thanks to the
removal of functions that were wrappers around the old write_name_quoted
trying to give it a semantics like the new one, but performing a lot of
allocations for this goal. Now we always write directly to the stream, no
intermediate allocation is performed.
As a side effect of the refactor in builtin-apply.c, the length of the bar
graphs in diffstats are not affected anymore by the fact that the path was
clipped.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
* master: (94 commits)
Fixed update-hook example allow-users format.
Documentation/git-svn: updated design philosophy notes
t/t4014: test "am -3" with mode-only change.
git-commit.sh: Shell script cleanup
preserve executable bits in zip archives
Fix lapsus in builtin-apply.c
git-push: documentation and tests for pushing only branches
git-svnimport: Use separate arguments in the pipe for git-rev-parse
contrib/fast-import: add perl version of simple example
contrib/fast-import: add simple shell example
rev-list --bisect: Bisection "distance" clean up.
rev-list --bisect: Move some bisection code into best_bisection.
rev-list --bisect: Move finding bisection into do_find_bisection.
Document ls-files --with-tree=<tree-ish>
git-commit: partial commit of paths only removed from the index
git-commit: Allow partial commit of file removal.
send-email: make message-id generation a bit more robust
git-apply: fix whitespace stripping
git-gui: Disable native platform text selection in "lists"
apply --index-info: fall back to current index for mode changes
...
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Now, those functions take an "out" strbuf argument, where they store their
result if any. In that case, it also returns 1, else it returns 0.
* those functions support "in place" editing, in the sense that it's OK to
call them this way:
convert_to_git(path, sb->buf, sb->len, sb);
When doable, conversions are done in place for real, else the strbuf
content is just replaced with the new one, transparentely for the caller.
If you want to create a new filter working this way, being the accumulation
of filter1, filter2, ... filtern, then your meta_filter would be:
int meta_filter(..., const char *src, size_t len, struct strbuf *sb)
{
int ret = 0;
ret |= filter1(...., src, len, sb);
if (ret) {
src = sb->buf;
len = sb->len;
}
ret |= filter2(...., src, len, sb);
if (ret) {
src = sb->buf;
len = sb->len;
}
....
return ret | filtern(..., src, len, sb);
}
That's why subfilters the convert_to_* functions called were also rewritten
to work this way.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds more proper rename detection limits. Instead of just checking
the limit against the number of potential rename destinations, we verify
that the rename matrix (which is what really matters) doesn't grow
ridiculously large, and we also make sure that we don't overflow when
doing the matrix size calculation.
This also changes the default limits from unlimited, to a rename matrix
that is limited to 100 entries on a side. You can raise it with the config
entry, or by using the "-l<n>" command line flag, but at least the default
is now a sane number that avoids spending lots of time (and memory) in
situations that likely don't merit it.
The choice of default value is of course very debatable. Limiting the
rename matrix to a 100x100 size will mean that even if you have just one
obvious rename, but you also create (or delete) 10,000 files, the rename
matrix will be so big that we disable the heuristics. Sounds reasonable to
me, but let's see if people hit this (and, perhaps more importantly,
actually *care*) in real life.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Add strbuf_rtrim to remove trailing spaces.
* Add strbuf_insert to insert data at a given position.
* Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final
\0 so the overflow test for snprintf is the strict comparison. This is
not critical as the growth mechanism chosen will always allocate _more_
memory than asked, so the second test will not fail. It's some kind of
miracle though.
* Add size extension hints for strbuf_init and strbuf_read. If 0, default
applies, else:
+ initial buffer has the given size for strbuf_init.
+ first growth checks it has at least this size rather than the
default 8192.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* master:
archive - leakfix for format_subst()
Make --no-thin the default in git-push to save server resources
fix doc for --compression argument to pack-objects
git-tag -s must fail if gpg cannot sign the tag.
git-svn: understand grafts when doing dcommit
git-diff: don't squelch the new SHA1 in submodule diffs
Define NO_MEMMEM on Darwin as it lacks the function
git-svn: fix "Malformed network data" with svn:// servers
(cvs|svn)import: Ask git-tag to overwrite old tags.
git-rebase: fix -C option
git-rebase: support --whitespace=<option>
Documentation / grammer nit
archive: rename attribute specfile to export-subst
archive: specfile syntax change: "$Format:%PLCHLDR$" instead of just "%PLCHLDR" (take 2)
add memmem()
Remove unused function convert_sha1_file()
archive: specfile support (--pretty=format: in archive files)
Export format_commit_message()
The code to squelch empty diffs introduced by commit
fb13227e08 would inadvertently
populate filespec "two" of a submodule change using the uninitialized
(null) SHA1, thereby replacing the submodule SHA1 by 0{40} in the output.
This change teaches diffcore_skip_stat_unmatch to handle
submodule changes correctly.
Signed-off-by: Sven Verdoolaege <skimo@kotnet.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The warning message to suggest "Consider running git-status" from
"git-diff" that we experimented with during the 1.5.3 cycle turns
out to be a bad idea. It robbed cache-dirty information from people
who valued it, while still asking users to run "update-index --refresh".
It was hoped that the new behaviour would at least have some educational
value, but not showing the cache-dirty paths like before meant that the
user would not even know easily which paths were cache-dirty, and it
made the need to refresh the index look like even more unnecessary chore.
This commit reinstates the traditional behaviour, but with a twist.
By default, the empty "diff --git" output is totally squelched out
from "git diff" output. At the end of the command, it automatically
runs "update-index --refresh" as needed, without even bothering the
user. In other words, people who do not care about the cache-dirtyness
do not even have to see the warning.
The traditional behaviour to see the stat-dirty output and to bypassing
the overhead of content comparison can be specified by setting the
configuration variable diff.autorefreshindex to false.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to not generate a patch ID for binary diffs, but that means that
some commits may be skipped as being identical to already-applied diffs
when doing a rebase.
So just delete the code that skips the binary diff. At the very least,
we'd want the filenames to be part of the patch ID, but we might also want
to generate some hash for the binary diff itself too.
This fixes an issue noticed by Torgil Svensson.
Tested-by: Torgil Svensson <torgil.svensson@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we compare two non-tracked files, or explicitly
specify --no-index, the suggestion to run git-status
is not helpful.
The patch adds a new diff_options bitfield member, no_index, that
is used instead of the special value of -2 of the rev_info field
max_count to indicate that the index is not to be used. This makes
it possible to pass that flag down to diffcore_skip_stat_unmatch(),
which only has one diff_options parameter.
This could even become a cleanup if we removed all assignments of
max_count to a value of -2 (viz. replacement of a magic value with
a self-documenting field name) but I didn't dare to do that so late
in the rc game..
The no_index bit, if set, then tells diffcore_skip_stat_unmatch()
to not account for any skipped stat-mismatches, which avoids the
suggestion to run git-status.
Signed-off-by: Rene Scharfe <rene.scharfe@lsfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After starting to edit a working tree file but later when your edit ends
up identical to the original (this can also happen when you ran a
wholesale regexp replace with something like "perl -i" that does not
actually modify many of the paths), "git diff" between the index and the
working tree outputs many "empty" diffs that show "diff --git" headers
and nothing else, because these paths are stat-dirty. While it was a
way to warn the user that the earlier action of the user made the index
ineffective as an optimization mechanism, it was felt too loud for the
purpose of warning even to experienced users, and also resulted in
confusing people new to git.
This replaces the "empty" diffs with a single warning message at the
end. Having many such paths hurts performance, and you can run
"git-update-index --refresh" to update the lstat(2) information recorded
in the index in such a case. "git-status" does so as a side effect, and
that is more familiar to the end-user, so we recommend it to them.
The change affects only "git diff" that outputs patch text, because that
is where the annoyance of too many "empty" diff is most strongly felt,
and because the warning message can be safely ignored by downstream
tools without getting mistaken as part of the patch. For the low-level
"git diff-files" and "git diff-index", the traditional behaviour is
retained.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If user's TMPDIR is insanely long, return negative after
setting errno to ENAMETOOLONG, pretending that the underlying
mkstemp() choked on a temporary file path that is too long.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This would hopefully make it easier to maintain. Initially we
would have "java" and "tex" defined, as they are the only ones
we already have.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code shuffling mistakenly lost binariness specified with the
attribute mecahnism and made it always guess from the data.
Noticed by Johannes, with two test cases to t4020.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This updates the hunk header customization syntax. The special
case 'funcname' attribute is gone.
You assign the name of the type of contents to path's "diff"
attribute as a string value in .gitattributes like this:
*.java diff=java
*.perl diff=perl
*.doc diff=doc
If you supply "diff.<name>.funcname" variable via the
configuration mechanism (e.g. in $HOME/.gitconfig), the value is
used as the regexp set to find the line to use for the hunk
header (the variable is called "funcname" because such a line
typically is the one that has the name of the function in
programming language source text).
If there is no such configuration, built-in default is used, if
any. Currently there are two default patterns: default and java.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes"diff -p" hunk headers customizable via gitattributes mechanism.
It is based on Johannes's earlier patch that allowed to define a single
regexp to be used for everything.
The mechanism to arrive at the regexp that is used to define hunk header
is the same as other use of gitattributes. You assign an attribute, funcname
(because "diff -p" typically uses the name of the function the patch is about
as the hunk header), a simple string value. This can be one of the names of
built-in pattern (currently, "java" is defined) or a custom pattern name, to
be looked up from the configuration file.
(in .gitattributes)
*.java funcname=java
*.perl funcname=perl
(in .git/config)
[funcname]
java = ... # ugly and complicated regexp to override the built-in one.
perl = ... # another ugly and complicated regexp to define a new one.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The instances of xdemitconf_t were initialized member by member.
Instead, initialize them to all zero, so we do not have
to update those places each time we introduce a new member.
[jc: minimally fixed by getting rid of a new global]
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This replaces an explicit initialization of filespec->is_binary
field used for rename/break followed by direct access to that
field with a wrapper function that lazily iniaitlizes and
accesses the field. We would add more attribute accesses for
the use of diff routines, and it would be better to make this
abstraction earlier.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Document -<n> for git-format-patch
glossary: add 'reflog'
diff --no-index: fix --name-status with added files
Don't smash stack when $GIT_ALTERNATE_OBJECT_DIRECTORIES is too long
To prevent funky games with external diff engines, git-log and
friends prevent external diff engines from being called. That makes
sense in the context of git-format-patch or git-rebase.
However, for "git log -p" it is not so nice to get the message
that binary files cannot be compared, while "git diff" has no
problems with them, if you provided an external diff driver.
With this patch, "git log --ext-diff -p" will do what you expect,
and the option "--no-ext-diff" can be used to override that
setting.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this patch, an added file would be reported as /dev/null.
Noticed by David Kastrup.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/diffcore:
diffcore-delta.c: Ignore CR in CRLF for text files
diffcore-delta.c: update the comment on the algorithm.
diffcore_filespec: add is_binary
diffcore_count_changes: pass diffcore_filespec
diffcore-break and diffcore-rename would want to behave slightly
differently depending on the binary-ness of the data, so add one
bit to the filespec, as the structure is now passed down to
diffcore_count_changes() function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rounding down the printed (dis)similarity index allows us to use
"100%" as a special value that indicates complete rewrites and
fully equal file contents, respectively.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ok, I've really held off doing this too damn long, because I'm lazy, and I
was always hoping that somebody else would do it.
But no, people keep asking for it, but nobody actually did anything, so I
decided I might as well bite the bullet, and instead of telling people
they could add a "--follow" flag to "git log" to do what they want to do,
I decided that it looks like I just have to do it for them..
The code wasn't actually that complicated, in that the diffstat for this
patch literally says "70 insertions(+), 1 deletions(-)", but I will have
to admit that in order to get to this fairly simple patch, you did have to
know and understand the internal git diff generation machinery pretty
well, and had to really be able to follow how commit generation interacts
with generating patches and generating the log.
So I suspect that while I was right that it wasn't that hard, I might have
been expecting too much of random people - this patch does seem to be
firmly in the core "Linus or Junio" territory.
To make a long story short: I'm sorry for it taking so long until I just
did it.
I'm not going to guarantee that this works for everybody, but you really
can just look at the patch, and after the appropriate appreciative noises
("Ooh, aah") over how clever I am, you can then just notice that the code
itself isn't really that complicated.
All the real new code is in the new "try_to_follow_renames()" function. It
really isn't rocket science: we notice that the pathname we were looking
at went away, so we start a full tree diff and try to see if we can
instead make that pathname be a rename or a copy from some other previous
pathname. And if we can, we just continue, except we show *that*
particular diff, and ever after we use the _previous_ pathname.
One thing to look out for: the "rename detection" is considered to be a
singular event in the _linear_ "git log" output! That's what people want
to do, but I just wanted to point out that this patch is *not* carrying
around a "commit,pathname" kind of pair and it's *not* going to be able to
notice the file coming from multiple *different* files in earlier history.
IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind
of "files have single identities" kind of semantics, and git log will just
pick the identity based on the normal move/copy heuristics _as_if_ the
history could be linearized.
Put another way: I think the model is broken, but given the broken model,
I think this patch does just about as well as you can do. If you have
merges with the same "file" having different filenames over the two
branches, git will just end up picking _one_ of the pathnames at the point
where the newer one goes away. It never looks at multiple pathnames in
parallel.
And if you understood all that, you probably didn't need it explained, and
if you didn't understand the above blathering, it doesn't really mtter to
you. What matters to you is that you can now do
git log -p --follow builtin-rev-list.c
and it will find the point where the old "rev-list.c" got renamed to
"builtin-rev-list.c" and show it as such.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have two instances where we want to determine if a buffer
contains binary data as opposed to text.
[jc: cherry-picked 6bfce93e from 'master']
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier, a second "-C" on the command line had no effect.
But "--find-copies-harder" is so long to type, let's make doubled -C
enable that option. It is in line with how "git blame" handles such
doubled options to mean "work harder".
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This uses "git-apply --whitespace=strip" to fix whitespace errors that have
crept in to our source files over time. There are a few files that need
to have trailing whitespaces (most notably, test vectors). The results
still passes the test, and build result in Documentation/ area is unchanged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have two instances where we want to determine if a buffer
contains binary data as opposed to text.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Fix git-svn to handle svn not reporting the md5sum of a file, and test.
Fix mishandling of $Id$ expanded in the repository copy in convert.c
More echo "$user_message" fixes.
Add tests for the last two fixes.
git-commit: use printf '%s\n' instead of echo on user-supplied strings
git-am: use printf instead of echo on user-supplied strings
Documentation: Add definition of "evil merge" to GIT Glossary
Replace the last 'dircache's by 'index'
Documentation: Clean up links in GIT Glossary
* maint-1.5.1:
Fix git-svn to handle svn not reporting the md5sum of a file, and test.
More echo "$user_message" fixes.
Add tests for the last two fixes.
git-commit: use printf '%s\n' instead of echo on user-supplied strings
git-am: use printf instead of echo on user-supplied strings
Documentation: Add definition of "evil merge" to GIT Glossary
Replace the last 'dircache's by 'index'
Documentation: Clean up links in GIT Glossary
* maint-1.5.1:
annotate: make it work from subdirectories.
git-config: Correct asciidoc documentation for --int/--bool
t1300: Add tests for git-config --bool --get
unpack-trees.c: verify_uptodate: remove dead code
Use PATH_MAX instead of TEMPFILE_PATH_LEN
branch: fix segfault when resolving an invalid HEAD
This patch fixes all calls to xread() where the return value is not
stored into an ssize_t. The patch should not have any effect whatsoever,
other than putting better/more appropriate type names on variables.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
diff_filespec has a slot to record the size of the data already,
so make use of it instead of a separate size cache.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This reduces the memory pressure when dealing with many paths.
An unscientific test of running "diff-tree --stat --summary -M"
between v2.6.19 and v2.6.20-rc1 in the linux kernel repository
indicates that the number of minor faults are reduced by 2/3.
Signed-off-by: Junio C Hamano <junkio@cox.net>
* maint:
gitweb: use decode_utf8 directly
posix compatibility for t4200
Document 'opendiff' value in config.txt and git-mergetool.txt
Allow PERL_PATH="/usr/bin/env perl"
Make xstrndup common
diff.c: fix "size cache" handling.
http-fetch: Disable use of curl multi support for libcurl < 7.16.
We broke the size-cache handling when we changed the function
signature of sha1_object_info() in 21666f1a. We obviously
wanted to cache the size we obtained when sha1_object_info()
succeeded, not when it failed.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This enhances the attributes mechanism so that external programs
meant for existing GIT_EXTERNAL_DIFF interface can be specifed
per path.
To configure such a custom diff driver, first define a custom
diff driver in the configuration:
[diff "my-c-diff"]
command = <<your command string comes here>>
Then mark the paths that you want to use this custom driver
using the attribute mechanism.
*.c diff=my-c-diff
The intent of this separation is that the attribute mechanism is
used for specifying the type of the contents, while the
configuration mechanism is used to define what needs to be done
to that type of the contents, which would be specific to both
platform and personal taste.
Signed-off-by: Junio C Hamano <junkio@cox.net>
* 'jc/attr': (28 commits)
lockfile: record the primary process.
convert.c: restructure the attribute checking part.
Fix bogus linked-list management for user defined merge drivers.
Simplify calling of CR/LF conversion routines
Document gitattributes(5)
Update 'crlf' attribute semantics.
Documentation: support manual section (5) - file formats.
Simplify code to find recursive merge driver.
Counto-fix in merge-recursive
Fix funny types used in attribute value representation
Allow low-level driver to specify different behaviour during internal merge.
Custom low-level merge driver: change the configuration scheme.
Allow the default low-level merge driver to be configured.
Custom low-level merge driver support.
Add a demonstration/test of customized merge.
Allow specifying specialized merge-backend per path.
merge-recursive: separate out xdl_merge() interface.
Allow more than true/false to attributes.
Document git-check-attr
Change attribute negation marker from '!' to '-'.
...
It was bothering me a lot that I abused small integer values
casted to (void *) to represent non string values in
gitattributes. This corrects it by making the type of attribute
values (const char *), and using the address of a few statically
allocated character buffer to denote true/false. Unset attributes
are represented as having NULLs as their values.
Added in-header documentation to explain how git_checkattr()
routine should be called.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows you to define three values (and possibly more) to
each attribute: true, false, and unset.
Typically the handlers that notice and act on attribute values
treat "unset" attribute to mean "do your default thing"
(e.g. crlf that is unset would trigger "guess from contents"),
so being able to override a setting to an unset state is
actually useful.
- If you want to set the attribute value to true, have an entry
in .gitattributes file that mentions the attribute name; e.g.
*.o binary
- If you want to set the attribute value explicitly to false,
use '-'; e.g.
*.a -diff
- If you want to make the attribute value _unset_, perhaps to
override an earlier entry, use '!'; e.g.
*.a -diff
c.i.a !diff
This also allows string values to attributes, with the natural
syntax:
attrname=attrvalue
but you cannot use it, as nobody takes notice and acts on
it yet.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is in the same spirit as the previous one. Earlier 'diff'
meant 'do the built-in binary heuristics and disable patch text
generation based on it' while '!diff' meant 'do not guess, do
not generate patch text'. There was no way to say 'do generate
patch text even when the heuristics says it has NUL in it'.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The same way we generate diffs on symlinks as the the diff of text of the
symlink, we can generate subproject diffs (when not recursing into them!)
as the diff of the text that describes the subproject.
Of course, since what descibes a subproject is just the SHA1, that's what
we'll use. Add some pretty-printing to make it a bit more obvious what is
going on, and we're done.
So with this, we can get both raw diffs and "textual" diffs of subproject
changes:
- git diff --raw:
:160000 160000 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5 0000000... M sub-A
- git diff:
diff --git a/sub-A b/sub-A
index 2de597b..e8f11a4 160000
--- a/sub-A
+++ b/sub-A
@@ -1 +1 @@
-Subproject commit 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5
+Subproject commit e8f11a45c5c6b9e2fec6d136d3fb5aff75393d42
NOTE! We'll also want to have the ability to recurse into the subproject
and actually diff it recursively, but that will involve a new command line
option (I'd suggest "--subproject" and "-S", but the latter is in use by
pickaxe), and some very different code.
But regardless of ay future recursive behaviour, we need the non-recursive
version too (and it should be the default, at least in the absense of
config options, so that large superprojects don't default to something
extremely expensive).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes paths that explicitly unset 'diff' attribute not to
produce "textual" diffs from 'git-diff' family.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Previously, a binary file in the diffstat would show as:
some-binary-file.bin | Bin
The space after the "Bin" was never used. This patch changes binary
lines in the diffstat to be:
some-binary-file.bin | Bin 12345 -> 123456 bytes
The very nice "->" notation was suggested by Johannes Schindelin, and
shows the before and after sizes more clearly than "+" and "-" would.
If a size is 0 it's not shown (although it would probably be better to
treat no-file differently from zero-byte-file).
The user can see what changed in the binary file, and how big the new
file is. This is in keeping with the information in the rest of the
diffstat.
The diffstat_t members "added" and "deleted" were unused when the file
was binary, so this patch loads them with the file sizes in
builtin_diffstat(). These figures are then read in show_stats() when
the file is marked binary.
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds the command line option 'quiet' to tell 'git diff-*'
that we are not interested in the actual diff contents but only
want to know if there is any change. This option automatically
turns --exit-code on, and turns off output formatting, as it
does not make much sense to show the first hit we happened to
have found.
The --quiet option is silently turned off (but --exit-code is
still in effect, so is silent output) if postprocessing filters
such as pickaxe and diff-filter are used. For all practical
purposes I do not think of a reason to want to use these filters
and not viewing the diff output.
The backends have not been taught about the option with this patch.
That is a topic for later rounds.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This introduces a new command-line option: --exit-code. The diff
programs will return 1 for differences, return 0 for equality, and
something else for errors.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* js/diff-ni:
Get rid of the dependency to GNU diff in the tests
diff --no-index: support /dev/null as filename
diff-ni: fix the diff with standard input
diff: support reading a file from stdin via "-"
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.
On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().
In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The earlier commit to read from stdin was full of problems, and
this corrects them.
- The mode bits should have been set to satisify S_ISREG(); we
forgot to the S_IFREG bits and hardcoded 0644;
- We did not give escape hatch to name a path whose name is
really "-". Allow users to say "./-" for that;
- Use of xread() was not prepared to see short read (e.g. reading
from tty) nor handing read errors.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This allows you to say
echo Hello World | git diff x -
to compare the contents of file "x" with the line "Hello World".
This automatically switches to --no-index mode.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* np/types:
Cleanup check_valid in commit-tree.
make sure enum object_type is signed
get rid of lookup_object_type()
convert object type handling from a string to a number
formalize typename(), and add its reverse type_from_string()
sha1_file.c: don't ignore an error condition in sha1_loose_object_info()
sha1_file.c: cleanup "offset" usage
sha1_file.c: cleanup hdr usage
We currently have two parallel notation for dealing with object types
in the code: a string and a numerical value. One of them is obviously
redundent, and the most used one requires more stack space and a bunch
of strcmp() all over the place.
This is an initial step for the removal of the version using a char array
found in object reading code paths. The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
diff sets the exit status to 0 when no changes were found, to 1
when changes were found, and 2 means error.
We imitate this to be able to use "git diff" in the test scripts.
(Actually, keeping in line with the rest of git, -1 is returned
on error, which corresponds to an exit status 255).
To find out if the diff is not empty, a member called
"found_changes" was introduced in struct diff_options, which is
set in builtin_diff() and fn_out_consume().
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* master: (201 commits)
Documentation: link in 1.5.0.2 material to the top documentation page.
Documentation: document remote.<name>.tagopt
GIT 1.5.0.2
git-remote: support remotes with a dot in the name
Documentation: describe "-f/-t/-m" options to "git-remote add"
diff --cc: fix display of symlink conflicts during a merge.
merge-recursive: fix longstanding bug in merging symlinks
merge-index: fix longstanding bug in merging symlinks
diff --cached: give more sensible error message when HEAD is yet to be created.
Update tests to use test-chmtime
Add test-chmtime: a utility to change mtime on files
Add Release Notes to prepare for 1.5.0.2
Allow arbitrary number of arguments to git-pack-objects
rerere: do not deal with symlinks.
rerere: do not skip two conflicted paths next to each other.
Don't modify CREDITS-FILE if it hasn't changed.
diff-patch: Avoid emitting double-slashes in textual patch.
Reword git-am 3-way fallback failure message.
Limit filename for format-patch
core.legacyheaders: Use the description used in RelNotes-1.5.0
...
* maint:
diff-patch: Avoid emitting double-slashes in textual patch.
Reword git-am 3-way fallback failure message.
Limit filename for format-patch
core.legacyheaders: Use the description used in RelNotes-1.5.0
git-show-ref --verify: Fail if called without a reference
Conflicts:
builtin-show-ref.c
diff.c
* lt/crlf:
Teach core.autocrlf to 'git apply'
t0020: add test for auto-crlf
Make AutoCRLF ternary variable.
Lazy man's auto-CRLF
* jc/apply-config:
t4119: test autocomputing -p<n> for traditional diff input.
git-apply: guess correct -p<n> value for non-git patches.
git-apply: notice "diff --git" patch again
Fix botched "leak fix"
t4119: add test for traditional patch and different p_value
apply: fix memory leak in prefix_one()
git-apply: require -p<n> when working in a subdirectory.
git-apply: do not lose cwd when run from a subdirectory.
Teach 'git apply' to look at $HOME/.gitconfig even outside of a repository
Teach 'git apply' to look at $GIT_DIR/config
With this flag and given two paths, git-diff-files behaves as a GNU diff
lookalike (plus the git goodies like --check, colour, etc.). This flag
is also available in git-diff. It also works outside of a git repository.
In addition, if git-diff{,-files} is called without revision or stage
parameter, and with exactly two paths at least one of which is not tracked,
the default is --no-index.
So, you can now say
git diff /etc/inittab /etc/fstab
and it actually works!
This also unifies the duplicated argument parsing between cmd_diff_files()
and builtin_diff_files().
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Matthias Lederhofer noticed that `diff -B` did not pick up on diff
colournig.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
There were instances of strncmp() that were formatted improperly
(e.g. whitespace around parameter before closing parenthesis)
that caused the earlier mechanical conversion step to miss
them. This step cleans them up.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This mechanically converts strncmp() to use prefixcmp(), but only when
the parameters match specific patterns, so that they can be verified
easily. Leftover from this will be fixed in a separate step, including
idiotic conversions like
if (!strncmp("foo", arg, 3))
=>
if (!(-prefixcmp(arg, "foo")))
This was done by using this script in px.perl
#!/usr/bin/perl -i.bak -p
if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) {
s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|;
}
if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) {
s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|;
}
and running:
$ git grep -l strncmp -- '*.c' | xargs perl px.perl
Signed-off-by: Junio C Hamano <junkio@cox.net>
Reuse the colour handling of the regular diff.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It currently does NOT know about file attributes, so it does its
conversion purely based on content. Maybe that is more in the "git
philosophy" anyway, since content is king, but I think we should try to do
the file attributes to turn it off on demand.
Anyway, BY DEFAULT it is off regardless, because it requires a
[core]
AutoCRLF = true
in your config file to be enabled. We could make that the default for
Windows, of course, the same way we do some other things (filemode etc).
But you can actually enable it on UNIX, and it will cause:
- "git update-index" will write blobs without CRLF
- "git diff" will diff working tree files without CRLF
- "git checkout" will write files to the working tree _with_ CRLF
and things work fine.
Funnily, it actually shows an odd file in git itself:
git clone -n git test-crlf
cd test-crlf
git config core.autocrlf true
git checkout
git diff
shows a diff for "Documentation/docbook-xsl.css". Why? Because we have
actually checked in that file *with* CRLF! So when "core.autocrlf" is
true, we'll always generate a *different* hash for it in the index,
because the index hash will be for the content _without_ CRLF.
Is this complete? I dunno. It seems to work for me. It doesn't use the
filename at all right now, and that's probably a deficiency (we could
certainly make the "is_binary()" heuristics also take standard filename
heuristics into account).
I don't pass in the filename at all for the "index_fd()" case
(git-update-index), so that would need to be passed around, but this
actually works fine.
NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours
truly. I will not guarantee that they work at all reasonable. Caveat
emptor. But it _is_ simple, and it _is_ safe, since it's all off by
default.
The patch is pretty simple - the biggest part is the new "convert.c" file,
but even that is really just basic stuff that anybody can write in
"Teaching C 101" as a final project for their first class in programming.
Not to say that it's bug-free, of course - but at least we're not talking
about rocket surgery here.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
`git diff --ignore-space-at-eol` will ignore whitespace at the
line ends.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Here's a patch that I think we can merge right now. There may be
other places that need this, but this at least points out the
three places that read/write working tree files for git
update-index, checkout and diff respectively. That should cover
a lot of it [jc: git-apply uses an entirely different codepath
both for reading and writing].
Some day we can actually implement it. In the meantime, this
points out a place for people to start. We *can* even start with
a really simple "we do CRLF conversion automatically, regardless
of filename" kind of approach, that just look at the data (all
three cases have the _full_ file data already in memory) and
says "ok, this is text, so let's convert to/from DOS format
directly".
THAT somebody can write in ten minutes, and it would already
make git much nicer on a DOS/Windows platform, I suspect.
And it would be totally zero-cost if you just make it a config
option (but please make it dynamic with the _default_ just being
0/1 depending on whether it's UNIX/Windows, just so that UNIX
people can _test_ it easily).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Quote both file names separately when printing a rename, yielding
something like
"foo" => "bar"
instead of the current
"foo => bar"
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This avoids some code duplication, and yields more readable results
for directory renames.
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Among the low-level output functions called from flush_one_pair(),
this was the only function that did not take (filepair, options)
as arguments.
Signed-off-by: Junio C Hamano <junkio@cox.net>
We have a number of badly checked write() calls. Often we are
expecting write() to write exactly the size we requested or fail,
this fails to handle interrupts or short writes. Switch to using
the new write_in_full(). Otherwise we at a minimum need to check
for EINTR and EAGAIN, where this is appropriate use xwrite().
Note, the changes to config handling are much larger and handled
in the next patch in the sequence.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* sp/mmap: (27 commits)
Spell default packedgitlimit slightly differently
Increase packedGit{Limit,WindowSize} on 64 bit systems.
Update packedGit config option documentation.
mmap: set FD_CLOEXEC for file descriptors we keep open for mmap()
pack-objects: fix use of use_pack().
Fix random segfaults in pack-objects.
Cleanup read_cache_from error handling.
Replace mmap with xmmap, better handling MAP_FAILED.
Release pack windows before reporting out of memory.
Default core.packdGitWindowSize to 1 MiB if NO_MMAP.
Test suite for sliding window mmap implementation.
Create pack_report() as a debugging aid.
Support unmapping windows on 'temporary' packfiles.
Improve error message when packfile mmap fails.
Ensure core.packedGitWindowSize cannot be less than 2 pages.
Load core configuration in git-verify-pack.
Fully activate the sliding window pack access.
Unmap individual windows rather than entire files.
Document why header parsing won't exceed a window.
Loop over pack_windows when inflating/accessing data.
...
Conflicts:
cache.h
pack-check.c
This updates the way diffcore represents an unmerged pair
somewhat. It used to be that entries with mode=0 on both sides
were used to represent an unmerged pair, but now it has an
explicit flag. This is to allow diff-index --cached to report
the entry from the tree when the path is unmerged in the index.
This is used in updating "git reset <tree> -- <path>" to restore
absense of the path in the index from the tree.
Signed-off-by: Junio C Hamano <junkio@cox.net>
In some cases we did not even bother to check the return value of
mmap() and just assume it worked. This is bad, because if we are
out of virtual address space the kernel returned MAP_FAILED and we
would attempt to dereference that address, segfaulting without any
real error output to the user.
We are replacing all calls to mmap() with xmmap() and moving all
MAP_FAILED checking into that single location. If a mmap call
fails we try to release enough least-recently-used pack windows
to possibly succeed, then retry the mmap() attempt. If we cannot
mmap even after releasing pack memory then we die() as none of our
callers have any reasonable recovery strategy for a failed mmap.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When parsing the diff line starting with '@@', the line number of the
'+' file is parsed. For the subsequent line parses, the line number
should therefore be incremented after the parse, not before it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is a mechanical clean-up of the way *.c files include
system header files.
(1) sources under compat/, platform sha-1 implementations, and
xdelta code are exempt from the following rules;
(2) the first #include must be "git-compat-util.h" or one of
our own header file that includes it first (e.g. config.h,
builtin.h, pkt-line.h);
(3) system headers that are included in "git-compat-util.h"
need not be included in individual C source files.
(4) "git-compat-util.h" does not have to include subsystem
specific header files (e.g. expat.h).
Signed-off-by: Junio C Hamano <junkio@cox.net>
It is nicer to let the user know when a commit succeeded all the time,
not only the first time. Also the commit sha1 is much more useful than
the tree sha1 in this case.
This patch also introduces a -q switch to supress this message as well
as the summary of created/deleted files.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The Cygwin folks have done a fine job at creating a POSIX layer
on Windows That Just Works(tm). However it comes with a penalty;
accessing files in the working tree by way of stat/open/mmap can
be slower for diffcore than inflating the data from a blob which
is stored in a packfile.
This performance problem is especially an issue in merge-recursive
when dealing with nearly 7000 added files, as we are loading
each file's content from the working directory to perform rename
detection. I have literally seen (and sadly watched) paint dry in
less time than it takes for merge-recursive to finish such a merge.
On the other hand this very same merge runs very fast on Solaris.
If Git is compiled with NO_FAST_WORKING_DIRECTORY set then we will
avoid looking at the working directory when the blob in question
is available within a packfile and the caller doesn't need the data
unpacked into a temporary file.
We don't use loose objects as they have the same open/mmap/close
costs as the working directory file access, but have the additional
CPU overhead of needing to inflate the content before use. So it
is still faster to use the working tree file over the loose object.
If the caller needs the file data unpacked into a temporary file
its likely because they are going to call an external diff program,
passing the file as a parameter. In this case reusing the working
tree file will be faster as we don't need to inflate the data and
write it out to a temporary file.
The NO_FAST_WORKING_DIRECTORY feature is enabled by default on
Cygwin, as that is the platform which currently appears to benefit
the most from this option.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
While adding colour to the branch command it was pointed out that a
config option like "branch.color" conflicts with the pre-existing
"branch.something" namespace used for specifying default merge urls and
branches. The suggested solution was to flip the order of the
components to "color.branch", which I did for colourising branch.
This patch does the same thing for
- git-log (color.diff)
- git-status (color.status)
- git-diff (color.diff)
- pager (color.pager)
I haven't removed the old config options; but they should probably be
deprecated and eventually removed to prevent future namespace
collisions. I've done this deprecation by changing the documentation
for the config file to match the new names; and adding the "color.XXX"
options to contrib/completion/git-completion.bash.
Unfortunately git-svn reads "diff.color" and "pager.color"; which I
don't like to change unilaterally.
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This changes the --numstat output for binary files from "0 0" to
"- -" to match what "apply --numstat" does.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Somebody was wondering on #git channel why a git generated diff
does not apply with GNU patch when the filename contains a SP.
It is because GNU patch expects to find TAB (and trailing timestamp)
on ---/+++ (old_name and new_name) lines after the filenames.
The "diff --git" output format was carefully designed to be
compatible with GNU patch where it can, but whitespace
characters were always a pain.
This adds an extra TAB (but not trailing timestamp) to old_name
and new_name lines of git-diff output when the filename has a SP
in it. An earlier patch updated git-apply to prepare for this.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes "git log/diff --summary" imply recursive behaviour,
whose effect is summarized in one test output:
--- a/t/t4013/diff.diff-tree_--pretty_--root_--summary_initial
+++ b/t/t4013/diff.diff-tree_--pretty_--root_--summary_initial
@@ -5,7 +5,7 @@ Date: Mon Jun 26 00:00:00 2006 +0000
Initial
- create mode 040000 dir
+ create mode 100644 dir/sub
create mode 100644 file0
create mode 100644 file2
$
When a file is created in a subdirectory, we used to say just
the directory name only when that directory also was created,
which did not make sense from two reasons. It is not any more
significant to create a new file in a new directory than to
create a new file in an existing directory, and even if it were,
reportinging the new directory name without saying the actual
filename is not useful.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jc/diff-stat:
diff --stat: ensure at least one '-' for deletions, and one '+' for additions
diff --stat=width[,name-width]: allow custom diffstat output width.
diff --stat: color output.
diff --stat: allow custom diffstat output width.
Geert noticed that complete rewrite diff missed the usual a/ and b/
leading paths. Pickaxe says it never worked, ever.
Embarrassing.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The number of '-' and '+' is still linear. The idea is that
scaled-length := floor(a * length + b) with the following constraints: if
length == 1, scaled-length == 1, and the combined length of plusses
and minusses should not be larger than the width by a small margin. Thus,
a + b == 1
and
a * max_plusses + b + a * max_minusses + b = width + 1
The solution is
a * x + b = ((width - 1) * (x - 1) + max_change - 1)
/ (max_change - 1)
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Under --color option, diffstat shows '+' and '-' in the graph
the same color as added and deleted lines.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds two parameters to "diff --stat".
. --stat-width=72 tells that the page should fit on 72-column output.
. --stat-name-width=30 tells that the filename part is limited
to 30 columns.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds DIFF_WHITESPACE color class (default = reverse red) to
colored diff output to let you catch common whitespace errors.
- trailing whitespaces at the end of line
- a space followed by a tab in the indent
Signed-off-by: Junio C Hamano <junkio@cox.net>
* jk/diff:
wt-status: remove extraneous newline from 'deleted:' output
git-status: document colorization config options
Teach runstatus about --untracked
git-commit.sh: convert run_status to a C builtin
Move color option parsing out of diff.c and into color.[ch]
diff: support custom callbacks for output
Users can request the DIFF_FORMAT_CALLBACK output format to get a callback
consisting of the whole diff_queue_struct.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Like xmalloc and xrealloc xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.
I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.
[jc: removed the part that deals with last_XXX, which I am
finding more and more dubious these days.]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This abstracts away the size of the hash values when copying them
from memory location to memory location, much as the introduction
of hashcmp abstracted away hash value comparsion.
A few call sites were using char* rather than unsigned char* so
I added the cast rather than open hashcpy to be void*. This is a
reasonable tradeoff as most call sites already use unsigned char*
and the existing hashcmp is also declared to be unsigned char*.
[jc: Splitted the patch to "master" part, to be followed by a
patch for merge-recursive.c which is not in "master" yet.
Fixed the cast in the latter hunk to combine-diff.c which was
wrong in the original.
Also converted ones left-over in combine-diff.c, diff-lib.c and
upload-pack.c ]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Introduces global inline:
hashcmp(const unsigned char *sha1, const unsigned char *sha2)
Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
[jc: I needed to hand merge the changes to the updated codebase,
so the result needs to be checked.]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Replace sha1 comparisons to null_sha1 with a global inline (which previously an
unused static inline in builtin-apply.c)
[jc: with a fix from Jonas Fonseca.]
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
With this option, the changed words are shown inline. For example,
if a file containing "This is foo" is changed to "This is bar", the diff
will now show "This is " in plain text, "foo" in red, and "bar" in green.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The final output from diff used to compare pathnames between
preimage and postimage to tell if the filepair is a rename/copy.
By explicitly marking the filepair created by diffcore_rename(),
the output routine, resolve_rename_copy(), does not have to do
so anymore. This helps feeding a filepair that has different
pathnames in one and two elements to the diff machinery (most
notably, comparing two blobs).
Signed-off-by: Junio C Hamano <junkio@cox.net>
enable/disable colored output when the pager is in use
Signed-off-by: Matthias Lederhofer <matled@gmx.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When paging through the output of git-whatchanged, the color cues help to
visually navigate within a diff. However, it is difficult to notice when a
new commit starts, because the commit and log are shown in the "normal"
color. This patch colorizes the 'commit' line, customizable through
diff.colors.commit and defaulting to yellow.
As a side effect, some of the diff color engine (slot enum, get_color) has
become accessible outside of diff.c.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add support for more than 8 colors. Colors can be specified as numbers
-1..255. -1 is same as "normal".
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Make it possible to set both colors and a attribute for diff colors.
Background colors are supported too.
Syntax is now:
[attr] [fg [bg]]
[fg [bg]] [attr]
Empty value is same as "normal normal", ie use default colors. The new
syntax is backwards compatible.
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
In a handful places, we use C99 structure and array
initializers, which some compilers do not support.
This can be handy when you are trying to compile GIT on a
Solaris system that has an older C compiler, for example.
Signed-off-by: Junio C Hamano <junkio@cox.net>
* ew/diff:
templates/hooks--update: replace diffstat calls with git diff --stat
diff: do not use configuration magic at the core-level
Update diff-options and config documentation.
diff.c: --no-color to defeat diff.color configuration.
diff.c: respect diff.renames config option
This allows you to say:
git -p diff v2.6.16-rc5..
and the command pipes the output of any git command to your pager.
[jc: this resurrects a month old RFC patch with improvement
suggested by Linus to call it --paginate instead of --less.]
Signed-off-by: Junio C Hamano <junkio@cox.net>
The Porcelainish has become so much usable as the UI that there
is not much reason people should be using the core programs by
hand anymore. At this point we are better off making the
behaviour of the core programs predictable by keeping them
unaffected by the configuration variables. Otherwise they will
become very hard to use as reliable building blocks.
For example, "git-commit -a" internally uses git-diff-files to
figure out the set of paths that need to be updated in the
index, and we should never allow diff.renames that happens to be
in the configuration to interfere (or slow down the process).
The UI level configuration such as showing renamed diff and
coloring are still honored by the Porcelainish ("git log" family
and "git diff"), but not by the core anymore.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Even if the standard output is connected to a tty, do not
colorize the diff if we are talking to a dumb terminal when
diff.color configuration variable is set to "auto".
Signed-off-by: Junio C Hamano <junkio@cox.net>
diff.renames is mentioned several times in the documentation,
but to my surprise it didn't do anything before this patch.
Also add the --no-renames option to override this from the
command-line.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add new item text to struct diff_options.
If set then do not try to detect binary files.
Signed-off-by: Stephan Feder <sf@b-i-t.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The binary file detection is just a heuristic which can well fail.
Do not produce garbage patches in these cases.
Signed-off-by: Stephan Feder <sf@b-i-t.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* th/diff:
builtin-diff: turn recursive on when defaulting to --patch format.
t4013: note improvements brought by the new output code.
t4013: add format-patch tests.
format-patch: fix diff format option implementation
combine-diff.c: type sanity.
t4013 test updates for new output code.
Fix some more diff options changes.
Fix diff-tree -s
log --raw: Don't descend into subdirectories by default
diff-tree: Use ---\n as a message separator
Print empty line between raw, stat, summary and patch
t4013: add more tests around -c and --cc
whatchanged: Default to DIFF_FORMAT_RAW
Don't xcalloc() struct diffstat_t
Add msg_sep to diff_options
DIFF_FORMAT_RAW is not default anymore
Set default diff output format after parsing command line
Make --raw option available for all diff commands
Merge with_raw, with_stat and summary variables to output_format
t4013: add tests for diff/log family output options.
With the change in default, "git add ." on kernel dir is about
twice as fast as before, with only minimal (0.5%) change in
object size. The speed difference is even more noticeable
when committing large files, which is now up to 8 times faster.
The configurability is through setting core.compression = [-1..9]
which maps to the zlib constants; -1 is the default, 0 is no
compression, and 1..9 are various speed/size tradeoffs, 9
being slowest.
Signed-off-by: Joachim B Haga (cjhaga@fys.uio.no)
Acked-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The function internally generated diff to get the patch id but
passed a wrong emit flags to the xdiff layer when it did so.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This fixes various problems in the new diff options code.
- Fix --cc/-c --patch; it showed two-tree diff used internally.
- Use "---\n" only where it matters -- that is, use it
immediately after the commit log text when we show a
commit log and something else before the patch text.
- Do not output spurious extra "\n"; have an extra newline
after the commit log text always when we have diff output and
we are not doing oneline.
- When running a pickaxe you need to go recursive.
Signed-off-by: Junio C Hamano <junkio@cox.net>
setup_revisions() calls diff_setup_done() before we can set default
value for output_format. Don't convert DIFF_FORMAT_NO_OUTPUT to 0 in
diff_setup_done(), it is useless and makes diff-tree believe no diff
format parameters were given and thus lets it reset output_format to
DIFF_FORMAT_RAW.
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Add msg_sep variable to struct diff_options. msg_sep is printed after
commit message. Default is "\n", format-patch sets it to "---\n".
This also removes the second argument from show_log() because all
callers derived it from the first argument:
show_log(rev, rev->loginfo, ...
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Initialize output_format to 0 instead of DIFF_FORMAT_RAW so that we can see
later if any command line options changed it. Default value is set only if
output format was not specified.
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
DIFF_FORMAT_* are now bit-flags instead of enumerated values.
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Call it like this:
unsigned char id[20];
if (diff_flush_patch_id(diff_options, id))
printf("And the patch id is: %s\n", sha1_to_hex(id));
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This lets you use something like this in your $GIT_DIR/config
file.
[diff]
color = auto
[diff.color]
new = blue
old = yellow
frag = reverse
When diff.color is set to "auto", colored diff is enabled when
the standard output is the terminal. Other choices are "always",
and "never". Usual boolean true/false can also be used.
The colormap entries can specify colors for the following slots:
plain - lines that appear in both old and new file (context)
meta - diff --git header and extended git diff headers
frag - @@ -n,m +l,k @@ lines (hunk header)
old - lines deleted from old file
new - lines added to new file
The following color names can be used:
normal, bold, dim, l, blink, reverse, reset,
black, red, green, yellow, blue, magenta, cyan,
white
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds -b (--ignore-space-change) and -w (--ignore-all-space) flags to
diff. The main part of the patch is teaching libxdiff about it.
[jc: renamed xdl_line_match() to xdl_recmatch() since the former is used
for different purposes in xpatchi.c which is in the parts of the upstream
source we do not use.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch does:
- always reset the color _before_ printing out the newline.
This is actually important. You (and Johannes) didn't see it, because
it only matters if you set the background, but if you don't do this,
you get some random and funky behaviour if you pick a color with a
non-default background (which still potentially has problems with tabs
etc, but less so).
- allow people to have a different color for the "file headers"
(DIFF_METAINFO) and for the "fragment header" (DIFF_FRAGINFO). Also,
make a difference between "normal color" and "reset colors"
- default to red/green for old/new lines. That's the norm, I'd think.
- instead of that eye-popping (and eye-ball-with-a-fondue-fork-popping)
purple color for metadata, use bold-face for file headers, and cyan for
the frag headers. I actually prefer the "gray background" for that, but
it only works well in xterms, so COLOR_CYAN it is..
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
ANSI C99 doesn't allow void-pointer arithmetic. This patch fixes this in
various ways. Usually the strategy that required the least changes was used.
Signed-off-by: Florian Forster <octo@verplant.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch is a slightly adjusted version of Junio's patch:
http://www.gelato.unsw.edu.au/archives/git/0604/19354.html
However, instead of using a config variable, this patch makes it available
as a diff option.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes "git format-patch" a built-in.
* js/fmt-patch:
git-rebase: use canonical A..B syntax to format-patch
git-format-patch: now built-in.
fmt-patch: Support --attach
fmt-patch: understand old <his> notation
Teach fmt-patch about --keep-subject
Teach fmt-patch about --numbered
fmt-patch: implement -o <dir>
fmt-patch: output file names to stdout
Teach fmt-patch to write individual files.
Use RFC2822 dates from "git fmt-patch".
git-fmt-patch: thinkofix to show [PATCH] properly.
rename internal format-patch wip
Minor tweak on subject line in --pretty=email
Tentative built-in format-patch.
Currently the summary is displayed after the patch. Fix this so
that the output order is stat-summary-patch. As a consequence of
the way this is coded, the --summary option will only actually
display summary data if combined with either the --stat or
--patch-with-stat option.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
output_format == DIFFSTAT and with_stat == true does not make sense, and
the way the code is structured it causes trouble. Avoid it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
The second parameter is not the end of string input; it is
the optional return value to retrieve where the parser stopped.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This patch touches a couple of files, because it adds options to print a
custom text just after the subject of a commit, and just after the
diffstat.
[jc: made "many dashes" used as the boundary leader into a single
variable, to reduce the possibility of later tweaks to miscount the
number of dashes to break it.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Actually, it is a diff option now, so you can say
git diff --check
to ask if what you are about to commit is a good patch.
[jc: this also would work for fmt-patch, but the point is that
the check is done before making a commit. format-patch is run
from an already created commit, and that is too late to catch
whitespace damaged change.]
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When renaming leading/a/filename to leading/b/filename (and
"filename" is sufficiently long), we tried to squash the rename
to "leading/{a => b}/filename". However, when "/a" or "/b" part
is empty, we underflowed and tried to print a substring of
length -1.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Remove the need to pipe git diff through git apply to
get the extended headers summary.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
We used to parse "-U" and "--unified" as part of the GIT_DIFF_OPTS
environment variable, but strangely enough we would _not_ parse them as
part of the normal diff command line (where we only accepted "-u").
This adds parsing of -U and --unified, both with an optional numeric
argument. So now you can just say
git diff --unified=5
to get a unified diff with a five-line context, instead of having to do
something silly like
GIT_DIFF_OPTS="--unified=5" git diff -u
(that silly format does continue to still work, of course).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>