Commit Graph

473 Commits

Author SHA1 Message Date
Junio C Hamano
9aae177a4a Merge branch 'maint'
* maint:
  gitweb: use decode_utf8 directly
  posix compatibility for t4200
  Document 'opendiff' value in config.txt and git-mergetool.txt
  Allow PERL_PATH="/usr/bin/env perl"
  Make xstrndup common
  diff.c: fix "size cache" handling.
  http-fetch: Disable use of curl multi support for libcurl < 7.16.
2007-05-03 23:26:54 -07:00
Junio C Hamano
cdda666201 diff.c: fix "size cache" handling.
We broke the size-cache handling when we changed the function
signature of sha1_object_info() in 21666f1a.  We obviously
wanted to cache the size we obtained when sha1_object_info()
succeeded, not when it failed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-03 22:12:40 -07:00
Junio C Hamano
f1af60bdba Support 'diff=pgm' attribute
This enhances the attributes mechanism so that external programs
meant for existing GIT_EXTERNAL_DIFF interface can be specifed
per path.

To configure such a custom diff driver, first define a custom
diff driver in the configuration:

	[diff "my-c-diff"]
		command = <<your command string comes here>>

Then mark the paths that you want to use this custom driver
using the attribute mechanism.

	*.c	diff=my-c-diff

The intent of this separation is that the attribute mechanism is
used for specifying the type of the contents, while the
configuration mechanism is used to define what needs to be done
to that type of the contents, which would be specific to both
platform and personal taste.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:16:14 -07:00
Junio C Hamano
a2d7c6c620 Merge branch 'jc/attr'
* 'jc/attr': (28 commits)
  lockfile: record the primary process.
  convert.c: restructure the attribute checking part.
  Fix bogus linked-list management for user defined merge drivers.
  Simplify calling of CR/LF conversion routines
  Document gitattributes(5)
  Update 'crlf' attribute semantics.
  Documentation: support manual section (5) - file formats.
  Simplify code to find recursive merge driver.
  Counto-fix in merge-recursive
  Fix funny types used in attribute value representation
  Allow low-level driver to specify different behaviour during internal merge.
  Custom low-level merge driver: change the configuration scheme.
  Allow the default low-level merge driver to be configured.
  Custom low-level merge driver support.
  Add a demonstration/test of customized merge.
  Allow specifying specialized merge-backend per path.
  merge-recursive: separate out xdl_merge() interface.
  Allow more than true/false to attributes.
  Document git-check-attr
  Change attribute negation marker from '!' to '-'.
  ...
2007-04-21 17:38:00 -07:00
Alex Riesen
ac78e54804 Simplify calling of CR/LF conversion routines
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-20 23:24:34 -07:00
Junio C Hamano
a5e92abde6 Fix funny types used in attribute value representation
It was bothering me a lot that I abused small integer values
casted to (void *) to represent non string values in
gitattributes.  This corrects it by making the type of attribute
values (const char *), and using the address of a few statically
allocated character buffer to denote true/false.  Unset attributes
are represented as having NULLs as their values.

Added in-header documentation to explain how git_checkattr()
routine should be called.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-18 16:17:13 -07:00
Junio C Hamano
515106fa13 Allow more than true/false to attributes.
This allows you to define three values (and possibly more) to
each attribute: true, false, and unset.

Typically the handlers that notice and act on attribute values
treat "unset" attribute to mean "do your default thing"
(e.g. crlf that is unset would trigger "guess from contents"),
so being able to override a setting to an unset state is
actually useful.

 - If you want to set the attribute value to true, have an entry
   in .gitattributes file that mentions the attribute name; e.g.

	*.o	binary

 - If you want to set the attribute value explicitly to false,
   use '-'; e.g.

	*.a	-diff

 - If you want to make the attribute value _unset_, perhaps to
   override an earlier entry, use '!'; e.g.

	*.a	-diff
	c.i.a	!diff

This also allows string values to attributes, with the natural
syntax:

	attrname=attrvalue

but you cannot use it, as nobody takes notice and acts on
it yet.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-17 01:04:59 -07:00
Junio C Hamano
40250af411 Fix 'diff' attribute semantics.
This is in the same spirit as the previous one.  Earlier 'diff'
meant 'do the built-in binary heuristics and disable patch text
generation based on it' while '!diff' meant 'do not guess, do
not generate patch text'.  There was no way to say 'do generate
patch text even when the heuristics says it has NUL in it'.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-15 14:35:11 -07:00
Linus Torvalds
04786756f9 Expose subprojects as special files to "git diff" machinery
The same way we generate diffs on symlinks as the the diff of text of the
symlink, we can generate subproject diffs (when not recursing into them!)
as the diff of the text that describes the subproject.

Of course, since what descibes a subproject is just the SHA1, that's what
we'll use. Add some pretty-printing to make it a bit more obvious what is
going on, and we're done.

So with this, we can get both raw diffs and "textual" diffs of subproject
changes:

 - git diff --raw:

	:160000 160000 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5 0000000... M    sub-A

 - git diff:

	diff --git a/sub-A b/sub-A
	index 2de597b..e8f11a4 160000
	--- a/sub-A
	+++ b/sub-A
	@@ -1 +1 @@
	-Subproject commit 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5
	+Subproject commit e8f11a45c5c6b9e2fec6d136d3fb5aff75393d42

NOTE! We'll also want to have the ability to recurse into the subproject
and actually diff it recursively, but that will involve a new command line
option (I'd suggest "--subproject" and "-S", but the latter is in use by
pickaxe), and some very different code.

But regardless of ay future recursive behaviour, we need the non-recursive
version too (and it should be the default, at least in the absense of
config options, so that large superprojects don't default to something
extremely expensive).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-15 12:52:48 -07:00
Junio C Hamano
8c701249d2 Teach 'diff' about 'diff' attribute.
This makes paths that explicitly unset 'diff' attribute not to
produce "textual" diffs from 'git-diff' family.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 08:57:06 -07:00
Andy Parkins
b18825876a Show binary file size change in diff --stat
Previously, a binary file in the diffstat would show as:

 some-binary-file.bin       |  Bin

The space after the "Bin" was never used.  This patch changes binary
lines in the diffstat to be:

 some-binary-file.bin       |  Bin 12345 -> 123456 bytes

The very nice "->" notation was suggested by Johannes Schindelin, and
shows the before and after sizes more clearly than "+" and "-" would.
If a size is 0 it's not shown (although it would probably be better to
treat no-file differently from zero-byte-file).

The user can see what changed in the binary file, and how big the new
file is.  This is in keeping with the information in the rest of the
diffstat.

The diffstat_t members "added" and "deleted" were unused when the file
was binary, so this patch loads them with the file sizes in
builtin_diffstat().  These figures are then read in show_stats() when
the file is marked binary.

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-05 15:07:16 -07:00
Junio C Hamano
68aacb2f3c diff --quiet
This adds the command line option 'quiet' to tell 'git diff-*'
that we are not interested in the actual diff contents but only
want to know if there is any change.  This option automatically
turns --exit-code on, and turns off output formatting, as it
does not make much sense to show the first hit we happened to
have found.

The --quiet option is silently turned off (but --exit-code is
still in effect, so is silent output) if postprocessing filters
such as pickaxe and diff-filter are used.  For all practical
purposes I do not think of a reason to want to use these filters
and not viewing the diff output.

The backends have not been taught about the option with this patch.
That is a topic for later rounds.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14 16:21:19 -07:00
Junio C Hamano
3161b4b521 Remove unused diffcore_std_no_resolve
This was only used by diff-tree-helper program, whose purpose
was to translate a raw diff to a patch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14 16:21:19 -07:00
Alex Riesen
41bbf9d585 Allow git-diff exit with codes similar to diff(1)
This introduces a new command-line option: --exit-code. The diff
programs will return 1 for differences, return 0 for equality, and
something else for errors.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14 16:21:19 -07:00
Junio C Hamano
cf6981d493 Merge branch 'js/diff-ni'
* js/diff-ni:
  Get rid of the dependency to GNU diff in the tests
  diff --no-index: support /dev/null as filename
  diff-ni: fix the diff with standard input
  diff: support reading a file from stdin via "-"
2007-03-10 23:26:33 -08:00
Shawn O. Pearce
dc49cd769b Cast 64 bit off_t to 32 bit size_t
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4.
This implies that we are able to access and work on files whose
maximum length is around 2^63-1 bytes, but we can only malloc or
mmap somewhat less than 2^32-1 bytes of memory.

On such a system an implicit conversion of off_t to size_t can cause
the size_t to wrap, resulting in unexpected and exciting behavior.
Right now we are working around all gcc warnings generated by the
-Wshorten-64-to-32 option by passing the off_t through xsize_t().

In the future we should make xsize_t on such problematic platforms
detect the wrapping and die if such a file is accessed.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-07 11:15:26 -08:00
Junio C Hamano
3afaa72d7d diff-ni: fix the diff with standard input
The earlier commit to read from stdin was full of problems, and
this corrects them.

 - The mode bits should have been set to satisify S_ISREG(); we
   forgot to the S_IFREG bits and hardcoded 0644;
 - We did not give escape hatch to name a path whose name is
   really "-".  Allow users to say "./-" for that;
 - Use of xread() was not prepared to see short read (e.g. reading
   from tty) nor handing read errors.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-04 00:17:27 -08:00
Johannes Schindelin
5332b2af10 diff: support reading a file from stdin via "-"
This allows you to say

	echo Hello World | git diff x -

to compare the contents of file "x" with the line "Hello World".
This automatically switches to --no-index mode.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-03 23:45:47 -08:00
Junio C Hamano
597388f6a1 Merge branch 'np/types'
* np/types:
  Cleanup check_valid in commit-tree.
  make sure enum object_type is signed
  get rid of lookup_object_type()
  convert object type handling from a string to a number
  formalize typename(), and add its reverse type_from_string()
  sha1_file.c: don't ignore an error condition in sha1_loose_object_info()
  sha1_file.c: cleanup "offset" usage
  sha1_file.c: cleanup hdr usage
2007-02-28 11:58:27 -08:00
Nicolas Pitre
21666f1aae convert object type handling from a string to a number
We currently have two parallel notation for dealing with object types
in the code: a string and a numerical value.  One of them is obviously
redundent, and the most used one requires more stack space and a bunch
of strcmp() all over the place.

This is an initial step for the removal of the version using a char array
found in object reading code paths.  The patch is unfortunately large but
there is no sane way to split it in smaller parts without breaking the
system.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-27 01:34:21 -08:00
Johannes Schindelin
34a5e1a2d9 diff --no-index: also imitate the exit status of diff(1)
diff sets the exit status to 0 when no changes were found, to 1
when changes were found, and 2 means error.

We imitate this to be able to use "git diff" in the test scripts.
(Actually, keeping in line with the rest of git, -1 is returned
on error, which corresponds to an exit status 255).

To find out if the diff is not empty, a member called
"found_changes" was introduced in struct diff_options, which is
set in builtin_diff() and fn_out_consume().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-26 01:20:55 -08:00
Junio C Hamano
048f48a2fd Merge branch 'master' into js/diff-ni
* master: (201 commits)
  Documentation: link in 1.5.0.2 material to the top documentation page.
  Documentation: document remote.<name>.tagopt
  GIT 1.5.0.2
  git-remote: support remotes with a dot in the name
  Documentation: describe "-f/-t/-m" options to "git-remote add"
  diff --cc: fix display of symlink conflicts during a merge.
  merge-recursive: fix longstanding bug in merging symlinks
  merge-index: fix longstanding bug in merging symlinks
  diff --cached: give more sensible error message when HEAD is yet to be created.
  Update tests to use test-chmtime
  Add test-chmtime: a utility to change mtime on files
  Add Release Notes to prepare for 1.5.0.2
  Allow arbitrary number of arguments to git-pack-objects
  rerere: do not deal with symlinks.
  rerere: do not skip two conflicted paths next to each other.
  Don't modify CREDITS-FILE if it hasn't changed.
  diff-patch: Avoid emitting double-slashes in textual patch.
  Reword git-am 3-way fallback failure message.
  Limit filename for format-patch
  core.legacyheaders: Use the description used in RelNotes-1.5.0
  ...
2007-02-26 01:20:42 -08:00
Junio C Hamano
8a13becc0d Merge branch 'maint'
* maint:
  diff-patch: Avoid emitting double-slashes in textual patch.
  Reword git-am 3-way fallback failure message.
  Limit filename for format-patch
  core.legacyheaders: Use the description used in RelNotes-1.5.0
  git-show-ref --verify: Fail if called without a reference

Conflicts:

	builtin-show-ref.c
	diff.c
2007-02-24 01:42:06 -08:00
Junio C Hamano
5089277718 diff-patch: Avoid emitting double-slashes in textual patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-24 01:26:52 -08:00
Junio C Hamano
ef1a5c2fa8 Merge branches 'lt/crlf' and 'jc/apply-config'
* lt/crlf:
  Teach core.autocrlf to 'git apply'
  t0020: add test for auto-crlf
  Make AutoCRLF ternary variable.
  Lazy man's auto-CRLF

* jc/apply-config:
  t4119: test autocomputing -p<n> for traditional diff input.
  git-apply: guess correct -p<n> value for non-git patches.
  git-apply: notice "diff --git" patch again
  Fix botched "leak fix"
  t4119: add test for traditional patch and different p_value
  apply: fix memory leak in prefix_one()
  git-apply: require -p<n> when working in a subdirectory.
  git-apply: do not lose cwd when run from a subdirectory.
  Teach 'git apply' to look at $HOME/.gitconfig even outside of a repository
  Teach 'git apply' to look at $GIT_DIR/config
2007-02-22 21:34:36 -08:00
Johannes Schindelin
d516c2d119 Teach git-diff-files the new option --no-index
With this flag and given two paths, git-diff-files behaves as a GNU diff
lookalike (plus the git goodies like --check, colour, etc.).  This flag
is also available in git-diff.  It also works outside of a git repository.

In addition, if git-diff{,-files} is called without revision or stage
parameter, and with exactly two paths at least one of which is not tracked,
the default is --no-index.

So, you can now say

	git diff /etc/inittab /etc/fstab

and it actually works!

This also unifies the duplicated argument parsing between cmd_diff_files()
and builtin_diff_files().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-22 20:59:55 -08:00
Johannes Schindelin
13e36ec51b Teach diff -B about colours
Matthias Lederhofer noticed that `diff -B` did not pick up on diff
colournig.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-21 00:03:37 -08:00
Junio C Hamano
1968d77dd6 prefixcmp(): fix-up leftover strncmp().
There were instances of strncmp() that were formatted improperly
(e.g. whitespace around parameter before closing parenthesis)
that caused the earlier mechanical conversion step to miss
them.  This step cleans them up.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 22:03:15 -08:00
Junio C Hamano
cc44c7655f Mechanical conversion to use prefixcmp()
This mechanically converts strncmp() to use prefixcmp(), but only when
the parameters match specific patterns, so that they can be verified
easily.  Leftover from this will be fixed in a separate step, including
idiotic conversions like

    if (!strncmp("foo", arg, 3))

  =>

    if (!(-prefixcmp(arg, "foo")))

This was done by using this script in px.perl

   #!/usr/bin/perl -i.bak -p
   if (/strncmp\(([^,]+), "([^\\"]*)", (\d+)\)/ && (length($2) == $3)) {
           s|strncmp\(([^,]+), "([^\\"]*)", (\d+)\)|prefixcmp($1, "$2")|;
   }
   if (/strncmp\("([^\\"]*)", ([^,]+), (\d+)\)/ && (length($1) == $3)) {
           s|strncmp\("([^\\"]*)", ([^,]+), (\d+)\)|(-prefixcmp($2, "$1"))|;
   }

and running:

   $ git grep -l strncmp -- '*.c' | xargs perl px.perl

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-20 22:03:15 -08:00
Junio C Hamano
0bce7a52f2 Merge branch 'js/diff-color-check'
* js/diff-color-check:
  diff --check: use colour
2007-02-19 18:30:59 -08:00
Johannes Schindelin
c5a8c3ecd7 diff --check: use colour
Reuse the colour handling of the regular diff.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-18 15:45:51 -08:00
Linus Torvalds
6c510bee20 Lazy man's auto-CRLF
It currently does NOT know about file attributes, so it does its
conversion purely based on content. Maybe that is more in the "git
philosophy" anyway, since content is king, but I think we should try to do
the file attributes to turn it off on demand.

Anyway, BY DEFAULT it is off regardless, because it requires a

	[core]
		AutoCRLF = true

in your config file to be enabled. We could make that the default for
Windows, of course, the same way we do some other things (filemode etc).

But you can actually enable it on UNIX, and it will cause:

 - "git update-index" will write blobs without CRLF
 - "git diff" will diff working tree files without CRLF
 - "git checkout" will write files to the working tree _with_ CRLF

and things work fine.

Funnily, it actually shows an odd file in git itself:

	git clone -n git test-crlf
	cd test-crlf
	git config core.autocrlf true
	git checkout
	git diff

shows a diff for "Documentation/docbook-xsl.css". Why? Because we have
actually checked in that file *with* CRLF! So when "core.autocrlf" is
true, we'll always generate a *different* hash for it in the index,
because the index hash will be for the content _without_ CRLF.

Is this complete? I dunno. It seems to work for me. It doesn't use the
filename at all right now, and that's probably a deficiency (we could
certainly make the "is_binary()" heuristics also take standard filename
heuristics into account).

I don't pass in the filename at all for the "index_fd()" case
(git-update-index), so that would need to be passed around, but this
actually works fine.

NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours
truly. I will not guarantee that they work at all reasonable. Caveat
emptor. But it _is_ simple, and it _is_ safe, since it's all off by
default.

The patch is pretty simple - the biggest part is the new "convert.c" file,
but even that is really just basic stuff that anybody can write in
"Teaching C 101" as a final project for their first class in programming.
Not to say that it's bug-free, of course - but at least we're not talking
about rocket surgery here.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-14 11:19:22 -08:00
Johannes Schindelin
859f9c4581 teach diff machinery about --ignore-space-at-eol
`git diff --ignore-space-at-eol` will ignore whitespace at the
line ends.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-13 21:40:42 -08:00
Junio C Hamano
3eee9c6dbe Merge branch 'jc/diff-apply-patch'
* jc/diff-apply-patch:
  git-diff/git-apply: make diff output a bit friendlier to GNU patch (part 2)
2007-02-13 19:18:16 -08:00
Linus Torvalds
bd3a5b5ee5 Mark places that need blob munging later for CRLF conversion.
Here's a patch that I think we can merge right now. There may be
other places that need this, but this at least points out the
three places that read/write working tree files for git
update-index, checkout and diff respectively. That should cover
a lot of it [jc: git-apply uses an entirely different codepath
both for reading and writing].

Some day we can actually implement it. In the meantime, this
points out a place for people to start. We *can* even start with
a really simple "we do CRLF conversion automatically, regardless
of filename" kind of approach, that just look at the data (all
three cases have the _full_ file data already in memory) and
says "ok, this is text, so let's convert to/from DOS format
directly".

THAT somebody can write in ten minutes, and it would already
make git much nicer on a DOS/Windows platform, I suspect.

And it would be totally zero-cost if you just make it a config
option (but please make it dynamic with the _default_ just being
0/1 depending on whether it's UNIX/Windows, just so that UNIX
people can _test_ it easily).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-13 10:12:37 -08:00
Alexandre Julliard
e5bfbf9b3e diff.c: More logical file name quoting for renames in diffstat.
Quote both file names separately when printing a rename, yielding
something like

  "foo" => "bar"

instead of the current

  "foo => bar"

Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-11 13:08:10 -08:00
Alexandre Julliard
0d26a64ece diff.c: Properly quote file names in diff --summary output.
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-11 12:53:05 -08:00
Alexandre Julliard
b9f441646c diff.c: Reuse the pprint_rename function for diff --summary output.
This avoids some code duplication, and yields more readable results
for directory renames.

Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-11 12:47:22 -08:00
Junio C Hamano
471efb09aa diff_flush_name(): take struct diff_options parameter.
Among the low-level output functions called from flush_one_pair(),
this was the only function that did not take (filepair, options)
as arguments.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-09 22:43:02 -08:00
Andy Whitcroft
93822c2239 short i/o: fix calls to write to use xwrite or write_in_full
We have a number of badly checked write() calls.  Often we are
expecting write() to write exactly the size we requested or fail,
this fails to handle interrupts or short writes.  Switch to using
the new write_in_full().  Otherwise we at a minimum need to check
for EINTR and EAGAIN, where this is appropriate use xwrite().

Note, the changes to config handling are much larger and handled
in the next patch in the sequence.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-08 15:44:47 -08:00
Junio C Hamano
cf2999eb4c Merge branch 'sp/mmap'
* sp/mmap: (27 commits)
  Spell default packedgitlimit slightly differently
  Increase packedGit{Limit,WindowSize} on 64 bit systems.
  Update packedGit config option documentation.
  mmap: set FD_CLOEXEC for file descriptors we keep open for mmap()
  pack-objects: fix use of use_pack().
  Fix random segfaults in pack-objects.
  Cleanup read_cache_from error handling.
  Replace mmap with xmmap, better handling MAP_FAILED.
  Release pack windows before reporting out of memory.
  Default core.packdGitWindowSize to 1 MiB if NO_MMAP.
  Test suite for sliding window mmap implementation.
  Create pack_report() as a debugging aid.
  Support unmapping windows on 'temporary' packfiles.
  Improve error message when packfile mmap fails.
  Ensure core.packedGitWindowSize cannot be less than 2 pages.
  Load core configuration in git-verify-pack.
  Fully activate the sliding window pack access.
  Unmap individual windows rather than entire files.
  Document why header parsing won't exceed a window.
  Loop over pack_windows when inflating/accessing data.
  ...

Conflicts:

	cache.h
	pack-check.c
2007-01-07 00:12:47 -08:00
Junio C Hamano
e9c8409900 diff-index --cached --raw: show tree entry on the LHS for unmerged entries.
This updates the way diffcore represents an unmerged pair
somewhat.  It used to be that entries with mode=0 on both sides
were used to represent an unmerged pair, but now it has an
explicit flag.  This is to allow diff-index --cached to report
the entry from the tree when the path is unmerged in the index.

This is used in updating "git reset <tree> -- <path>" to restore
absense of the path in the index from the tree.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-06 22:57:42 -08:00
Shawn O. Pearce
c4712e4553 Replace mmap with xmmap, better handling MAP_FAILED.
In some cases we did not even bother to check the return value of
mmap() and just assume it worked.  This is bad, because if we are
out of virtual address space the kernel returned MAP_FAILED and we
would attempt to dereference that address, segfaulting without any
real error output to the user.

We are replacing all calls to mmap() with xmmap() and moving all
MAP_FAILED checking into that single location.  If a mmap call
fails we try to release enough least-recently-used pack windows
to possibly succeed, then retry the mmap() attempt.  If we cannot
mmap even after releasing pack memory then we die() as none of our
callers have any reasonable recovery strategy for a failed mmap.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-29 11:36:45 -08:00
Junio C Hamano
27e4dd8de7 Merge branch 'maint'
* maint:
  diff --check: fix off by one error
  spurious .sp in manpages
2006-12-21 22:56:04 -08:00
Johannes Schindelin
e6d40d65df diff --check: fix off by one error
When parsing the diff line starting with '@@', the line number of the
'+' file is parsed. For the subsequent line parses, the line number
should therefore be incremented after the parse, not before it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-21 20:31:14 -08:00
Junio C Hamano
85023577a8 simplify inclusion of system header files.
This is a mechanical clean-up of the way *.c files include
system header files.

 (1) sources under compat/, platform sha-1 implementations, and
     xdelta code are exempt from the following rules;

 (2) the first #include must be "git-compat-util.h" or one of
     our own header file that includes it first (e.g. config.h,
     builtin.h, pkt-line.h);

 (3) system headers that are included in "git-compat-util.h"
     need not be included in individual C source files.

 (4) "git-compat-util.h" does not have to include subsystem
     specific header files (e.g. expat.h).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-20 09:51:35 -08:00
Junio C Hamano
5caf923223 fix populate-filespec
I hand munged the original patch when committing 1510fea78, and
screwed up the conversion.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-19 21:33:24 -08:00
Nicolas Pitre
ebd124c678 make commit message a little more consistent and conforting
It is nicer to let the user know when a commit succeeded all the time,
not only the first time.  Also the commit sha1 is much more useful than
the tree sha1 in this case.

This patch also introduces a -q switch to supress this message as well
as the summary of created/deleted files.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-15 22:29:54 -08:00
Shawn O. Pearce
1510fea781 Avoid accessing a slow working copy during diffcore operations.
The Cygwin folks have done a fine job at creating a POSIX layer
on Windows That Just Works(tm).  However it comes with a penalty;
accessing files in the working tree by way of stat/open/mmap can
be slower for diffcore than inflating the data from a blob which
is stored in a packfile.

This performance problem is especially an issue in merge-recursive
when dealing with nearly 7000 added files, as we are loading
each file's content from the working directory to perform rename
detection.  I have literally seen (and sadly watched) paint dry in
less time than it takes for merge-recursive to finish such a merge.
On the other hand this very same merge runs very fast on Solaris.

If Git is compiled with NO_FAST_WORKING_DIRECTORY set then we will
avoid looking at the working directory when the blob in question
is available within a packfile and the caller doesn't need the data
unpacked into a temporary file.

We don't use loose objects as they have the same open/mmap/close
costs as the working directory file access, but have the additional
CPU overhead of needing to inflate the content before use.  So it
is still faster to use the working tree file over the loose object.

If the caller needs the file data unpacked into a temporary file
its likely because they are going to call an external diff program,
passing the file as a parameter.  In this case reusing the working
tree file will be faster as we don't need to inflate the data and
write it out to a temporary file.

The NO_FAST_WORKING_DIRECTORY feature is enabled by default on
Cygwin, as that is the platform which currently appears to benefit
the most from this option.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-15 22:11:19 -08:00
Junio C Hamano
f5c589f1df Merge branch 'jc/numstat'
* jc/numstat:
  diff --numstat: show binary with '-' to match "apply --numstat"
2006-12-13 11:00:32 -08:00
Andy Parkins
a159ca0cb7 Allow subcommand.color and color.subcommand color configuration
While adding colour to the branch command it was pointed out that a
config option like "branch.color" conflicts with the pre-existing
"branch.something" namespace used for specifying default merge urls and
branches.  The suggested solution was to flip the order of the
components to "color.branch", which I did for colourising branch.

This patch does the same thing for
  - git-log (color.diff)
  - git-status (color.status)
  - git-diff (color.diff)
  - pager (color.pager)

I haven't removed the old config options; but they should probably be
deprecated and eventually removed to prevent future namespace
collisions.  I've done this deprecation by changing the documentation
for the config file to match the new names; and adding the "color.XXX"
options to contrib/completion/git-completion.bash.

Unfortunately git-svn reads "diff.color" and "pager.color"; which I
don't like to change unilaterally.

Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-13 01:47:36 -08:00
Junio C Hamano
bfddbc5e1e diff --numstat: show binary with '-' to match "apply --numstat"
This changes the --numstat output for binary files from "0 0" to
"- -" to match what "apply --numstat" does.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-11 14:17:19 -08:00
Junio C Hamano
1a9eb3b9d5 git-diff/git-apply: make diff output a bit friendlier to GNU patch (part 2)
Somebody was wondering on #git channel why a git generated diff
does not apply with GNU patch when the filename contains a SP.
It is because GNU patch expects to find TAB (and trailing timestamp)
on ---/+++ (old_name and new_name) lines after the filenames.

The "diff --git" output format was carefully designed to be
compatible with GNU patch where it can, but whitespace
characters were always a pain.

This adds an extra TAB (but not trailing timestamp) to old_name
and new_name lines of git-diff output when the filename has a SP
in it.  An earlier patch updated git-apply to prepare for this.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-21 21:27:51 -08:00
Junio C Hamano
65606f3530 Merge branch 'js/diff'
* js/diff:
  Turn on recursive with --summary
2006-10-18 22:09:03 -07:00
Junio C Hamano
74e2abe5b7 diff --numstat
[jc: with documentation from Jakub]

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-13 21:37:10 -07:00
Johannes Schindelin
d7014dc081 Turn on recursive with --summary
This makes "git log/diff --summary" imply recursive behaviour,
whose effect is summarized in one test output:

    --- a/t/t4013/diff.diff-tree_--pretty_--root_--summary_initial
    +++ b/t/t4013/diff.diff-tree_--pretty_--root_--summary_initial
    @@ -5,7 +5,7 @@ Date:   Mon Jun 26 00:00:00 2006 +0000

	 Initial

    - create mode 040000 dir
    + create mode 100644 dir/sub
      create mode 100644 file0
      create mode 100644 file2
     $

When a file is created in a subdirectory, we used to say just
the directory name only when that directory also was created,
which did not make sense from two reasons.  It is not any more
significant to create a new file in a new directory than to
create a new file in an existing directory, and even if it were,
reportinging the new directory name without saying the actual
filename is not useful.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-05 15:10:40 -07:00
Junio C Hamano
dd0c367e5e Merge branch 'jc/diff-stat'
* jc/diff-stat:
  diff --stat: ensure at least one '-' for deletions, and one '+' for additions
  diff --stat=width[,name-width]: allow custom diffstat output width.
  diff --stat: color output.
  diff --stat: allow custom diffstat output width.
2006-09-30 21:29:18 -07:00
Junio C Hamano
bc1a580757 git-diff -B output fix.
Geert noticed that complete rewrite diff missed the usual a/ and b/
leading paths.  Pickaxe says it never worked, ever.

Embarrassing.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-29 02:06:24 -07:00
Johannes Schindelin
3ed74e608a diff --stat: ensure at least one '-' for deletions, and one '+' for additions
The number of '-' and '+' is still linear. The idea is that
scaled-length := floor(a * length + b) with the following constraints: if
length == 1, scaled-length == 1, and the combined length of plusses
and minusses should not be larger than the width by a small margin. Thus,

	a + b == 1

and
	a * max_plusses + b + a * max_minusses + b = width + 1

The solution is

	a * x + b = ((width - 1) * (x - 1) + max_change - 1)
		 / (max_change - 1)

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-28 22:32:53 -07:00
Linus Torvalds
5c5b2ea9ab diff --stat=width[,name-width]: allow custom diffstat output width.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-28 22:27:29 -07:00
Junio C Hamano
785f743276 diff --stat: color output.
Under --color option, diffstat shows '+' and '-' in the graph
the same color as added and deleted lines.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 02:57:37 -07:00
Junio C Hamano
a2540023dc diff --stat: allow custom diffstat output width.
This adds two parameters to "diff --stat".

 . --stat-width=72 tells that the page should fit on 72-column output.

 . --stat-name-width=30 tells that the filename part is limited
   to 30 columns.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 02:55:08 -07:00
Junio C Hamano
448c3ef144 diff.c: second war on whitespace.
This adds DIFF_WHITESPACE color class (default = reverse red) to
colored diff output to let you catch common whitespace errors.

 - trailing whitespaces at the end of line
 - a space followed by a tab in the indent

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-24 00:12:44 -07:00
Junio C Hamano
b467fb0b90 Merge branch 'jk/diff'
* jk/diff:
  wt-status: remove extraneous newline from 'deleted:' output
  git-status: document colorization config options
  Teach runstatus about --untracked
  git-commit.sh: convert run_status to a C builtin
  Move color option parsing out of diff.c and into color.[ch]
  diff: support custom callbacks for output
2006-09-17 18:14:03 -07:00
Jeff King
7c92fe0eaa Move color option parsing out of diff.c and into color.[ch]
The intent is to lib-ify colorizing code so it can be reused.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-08 16:44:10 -07:00
Jeff King
0424558190 diff: support custom callbacks for output
Users can request the DIFF_FORMAT_CALLBACK output format to get a callback
consisting of the whole diff_queue_struct.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-07 15:40:25 -07:00
Junio C Hamano
82793c55e4 diff --binary generates full index on binary files.
... without --full-index.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-07 02:44:41 -07:00
Shawn Pearce
9befac470b Replace uses of strdup with xstrdup.
Like xmalloc and xrealloc xstrdup dies with a useful message if
the native strdup() implementation returns NULL rather than a
valid pointer.

I just tried to use xstrdup in new code and found it to be missing.
However I expected it to be present as xmalloc and xrealloc are
already commonly used throughout the code.

[jc: removed the part that deals with last_XXX, which I am
 finding more and more dubious these days.]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-02 03:24:37 -07:00
Junio C Hamano
b32d37a3a6 Merge branch 'jc/apply'
* jc/apply:
  git-apply --reject: finishing touches.
  apply --reject: count hunks starting from 1, not 0
  git-apply --verbose
  git-apply --reject: send rejects to .rej files.
  git-apply --reject
  apply --reverse: tie it all together.
  diff.c: make binary patch reversible.
  builtin-apply --reverse: two bugfixes.
2006-08-27 17:51:05 -07:00
Shawn Pearce
e702496e43 Convert memcpy(a,b,20) to hashcpy(a,b).
This abstracts away the size of the hash values when copying them
from memory location to memory location, much as the introduction
of hashcmp abstracted away hash value comparsion.

A few call sites were using char* rather than unsigned char* so
I added the cast rather than open hashcpy to be void*.  This is a
reasonable tradeoff as most call sites already use unsigned char*
and the existing hashcmp is also declared to be unsigned char*.

[jc: Splitted the patch to "master" part, to be followed by a
 patch for merge-recursive.c which is not in "master" yet.

 Fixed the cast in the latter hunk to combine-diff.c which was
 wrong in the original.

 Also converted ones left-over in combine-diff.c, diff-lib.c and
 upload-pack.c ]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23 13:53:10 -07:00
David Rientjes
a89fccd281 Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.
Introduces global inline:

	hashcmp(const unsigned char *sha1, const unsigned char *sha2)

Uses memcmp for comparison and returns the result based on the length of
the hash name (a future runtime decision).

Acked-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17 14:23:53 -07:00
Junio C Hamano
d4c452f03b diff.c: make binary patch reversible.
This matches the format previous "git-apply --reverse" update
expects.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-16 21:08:45 -07:00
David Rientjes
96f1e58f52 remove unnecessary initializations
[jc: I needed to hand merge the changes to the updated codebase,
 so the result needs to be checked.]

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 21:22:20 -07:00
David Rientjes
0bef57ee44 make inline is_null_sha1 global
Replace sha1 comparisons to null_sha1 with a global inline (which previously an
unused static inline in builtin-apply.c)

[jc: with a fix from Jonas Fonseca.]

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15 15:06:03 -07:00
David Rientjes
8c0b2bb636 diff.c cleanup
Removes conditional return.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-14 18:38:07 -07:00
Junio C Hamano
f3c5b39567 Merge branch 'th/diff-extra' 2006-08-12 19:34:41 -07:00
Johannes Schindelin
f59a59e22f Add the --color-words option to the diff options family
With this option, the changed words are shown inline. For example,
if a file containing "This is foo" is changed to "This is bar", the diff
will now show "This is " in plain text, "foo" in red, and "bar" in green.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-10 15:28:57 -07:00
Junio C Hamano
943d5b73e2 allow diff.renamelimit to be set regardless of -M/-C
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-09 14:05:23 -07:00
Junio C Hamano
03b9d560be make --find-copies-harder imply -C
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-09 13:17:19 -07:00
Junio C Hamano
ef677686ef diff.c: do not use pathname comparison to tell renames
The final output from diff used to compare pathnames between
preimage and postimage to tell if the filepair is a rename/copy.
By explicitly marking the filepair created by diffcore_rename(),
the output routine, resolve_rename_copy(), does not have to do
so anymore.  This helps feeding a filepair that has different
pathnames in one and two elements to the diff machinery (most
notably, comparing two blobs).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-03 14:41:53 -07:00
Matthias Lederhofer
aa086eb813 pager: config variable pager.color
enable/disable colored output when the pager is in use

Signed-off-by: Matthias Lederhofer <matled@gmx.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-31 15:32:24 -07:00
Jeff King
ce43697379 Colorize 'commit' lines in log ui
When paging through the output of git-whatchanged, the color cues help to
visually navigate within a diff. However, it is difficult to notice when a
new commit starts, because the commit and log are shown in the "normal"
color. This patch colorizes the 'commit' line, customizable through
diff.colors.commit and defaulting to yellow.

As a side effect, some of the diff color engine (slot enum, get_color) has
become accessible outside of diff.c.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-24 00:04:41 -07:00
Timo Hirvonen
f5b571fcf7 diff: Support 256 colors
Add support for more than 8 colors.  Colors can be specified as numbers
-1..255.  -1 is same as "normal".

Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-13 21:53:25 -07:00
Timo Hirvonen
f37399e6b0 diff: Support both attributes and colors
Make it possible to set both colors and a attribute for diff colors.
Background colors are supported too.

Syntax is now:

	[attr] [fg [bg]]
	[fg [bg]] [attr]

Empty value is same as "normal normal", ie use default colors.  The new
syntax is backwards compatible.

Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-13 21:53:23 -07:00
Shawn Pearce
344c52aee5 Avoid C99 initializers
In a handful places, we use C99 structure and array
initializers, which some compilers do not support.

This can be handy when you are trying to compile GIT on a
Solaris system that has an older C compiler, for example.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-10 00:13:28 -07:00
Junio C Hamano
fc93dbbfc9 Merge branch 'ew/diff'
* ew/diff:
  templates/hooks--update: replace diffstat calls with git diff --stat
  diff: do not use configuration magic at the core-level
  Update diff-options and config documentation.
  diff.c: --no-color to defeat diff.color configuration.
  diff.c: respect diff.renames config option
2006-07-09 23:47:39 -07:00
Junio C Hamano
85fb65ed6e "git -p cmd" to page anywhere
This allows you to say:

	git -p diff v2.6.16-rc5..

and the command pipes the output of any git command to your pager.

[jc: this resurrects a month old RFC patch with improvement
 suggested by Linus to call it --paginate instead of --less.]

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-09 03:27:03 -07:00
Junio C Hamano
88f0d5d7d9 Merge branch 'sf/diff' 2006-07-09 00:52:36 -07:00
Junio C Hamano
83ad63cfeb diff: do not use configuration magic at the core-level
The Porcelainish has become so much usable as the UI that there
is not much reason people should be using the core programs by
hand anymore.  At this point we are better off making the
behaviour of the core programs predictable by keeping them
unaffected by the configuration variables.  Otherwise they will
become very hard to use as reliable building blocks.

For example, "git-commit -a" internally uses git-diff-files to
figure out the set of paths that need to be updated in the
index, and we should never allow diff.renames that happens to be
in the configuration to interfere (or slow down the process).

The UI level configuration such as showing renamed diff and
coloring are still honored by the Porcelainish ("git log" family
and "git diff"), but not by the core anymore.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-08 03:11:01 -07:00
Junio C Hamano
a0c2089c1d colored diff: diff.color = auto fix
Even if the standard output is connected to a tty, do not
colorize the diff if we are talking to a dumb terminal when
diff.color configuration variable is set to "auto".

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-07 17:48:02 -07:00
Junio C Hamano
fef88bb013 diff.c: --no-color to defeat diff.color configuration.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-07 12:28:54 -07:00
Eric Wong
b68ea12e30 diff.c: respect diff.renames config option
diff.renames is mentioned several times in the documentation,
but to my surprise it didn't do anything before this patch.

Also add the --no-renames option to override this from the
command-line.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-07 12:28:53 -07:00
Stephan Feder
63ac450119 Teach diff -a as shorthand for --text
Signed-off-by: Stephan Feder <sf@b-i-t.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-07 12:28:04 -07:00
Stephan Feder
6d64ea965b Teach --text option to diff
Add new item text to struct diff_options.
If set then do not try to detect binary files.

Signed-off-by: Stephan Feder <sf@b-i-t.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-07 12:28:04 -07:00
Stephan Feder
c9c95bbc9c Do not drop data from '\0' until eol in patch output
The binary file detection is just a heuristic which can well fail.
Do not produce garbage patches in these cases.

Signed-off-by: Stephan Feder <sf@b-i-t.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-07 03:48:10 -07:00
Junio C Hamano
0c926a3d9c Merge branch 'th/diff'
* th/diff:
  builtin-diff: turn recursive on when defaulting to --patch format.
  t4013: note improvements brought by the new output code.
  t4013: add format-patch tests.
  format-patch: fix diff format option implementation
  combine-diff.c: type sanity.
  t4013 test updates for new output code.
  Fix some more diff options changes.
  Fix diff-tree -s
  log --raw: Don't descend into subdirectories by default
  diff-tree: Use ---\n as a message separator
  Print empty line between raw, stat, summary and patch
  t4013: add more tests around -c and --cc
  whatchanged: Default to DIFF_FORMAT_RAW
  Don't xcalloc() struct diffstat_t
  Add msg_sep to diff_options
  DIFF_FORMAT_RAW is not default anymore
  Set default diff output format after parsing command line
  Make --raw option available for all diff commands
  Merge with_raw, with_stat and summary variables to output_format
  t4013: add tests for diff/log family output options.
2006-07-05 16:31:24 -07:00
Joachim B Haga
12f6c308d5 Make zlib compression level configurable, and change default.
With the change in default, "git add ." on kernel dir is about
twice as fast as before, with only minimal (0.5%) change in
object size. The speed difference is even more noticeable
when committing large files, which is now up to 8 times faster.

The configurability is through setting core.compression = [-1..9]
which maps to the zlib constants; -1 is the default, 0 is no
compression, and 1..9 are various speed/size tradeoffs, 9
being slowest.

Signed-off-by: Joachim B Haga (cjhaga@fys.uio.no)
Acked-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-03 13:55:11 -07:00
Timo Hirvonen
d7de00f7e0 --name-only, --name-status, --check and -s are mutually exclusive
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-01 22:26:00 -07:00
Junio C Hamano
9fdc3bb5c2 diff.c: fix get_patch_id()
The function internally generated diff to get the patch id but
passed a wrong emit flags to the xdiff layer when it did so.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-28 22:49:42 -07:00
Junio C Hamano
3969cf7db1 Fix some more diff options changes.
This fixes various problems in the new diff options code.

 - Fix --cc/-c --patch; it showed two-tree diff used internally.

 - Use "---\n" only where it matters -- that is, use it
   immediately after the commit log text when we show a
   commit log and something else before the patch text.

 - Do not output spurious extra "\n"; have an extra newline
   after the commit log text always when we have diff output and
   we are not doing oneline.

 - When running a pickaxe you need to go recursive.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-27 15:33:40 -07:00
Timo Hirvonen
3223847a8f Fix diff-tree -s
setup_revisions() calls diff_setup_done() before we can set default
value for output_format.  Don't convert DIFF_FORMAT_NO_OUTPUT to 0 in
diff_setup_done(), it is useless and makes diff-tree believe no diff
format parameters were given and thus lets it reset output_format to
DIFF_FORMAT_RAW.

Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-27 11:04:47 -07:00
Timo Hirvonen
946c3784a3 Print empty line between raw, stat, summary and patch
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-27 10:59:34 -07:00
Timo Hirvonen
5e2b0636c7 Don't xcalloc() struct diffstat_t
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-26 14:58:41 -07:00
Timo Hirvonen
39bc9a6c20 Add msg_sep to diff_options
Add msg_sep variable to struct diff_options.  msg_sep is printed after
commit message.  Default is "\n", format-patch sets it to "---\n".

This also removes the second argument from show_log() because all
callers derived it from the first argument:

    show_log(rev, rev->loginfo, ...

Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-26 14:58:41 -07:00
Timo Hirvonen
c9b5ef998a Set default diff output format after parsing command line
Initialize output_format to 0 instead of DIFF_FORMAT_RAW so that we can see
later if any command line options changed it.  Default value is set only if
output format was not specified.

Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-26 14:58:40 -07:00
Timo Hirvonen
a610786f4b Make --raw option available for all diff commands
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-26 14:58:40 -07:00
Timo Hirvonen
c6744349df Merge with_raw, with_stat and summary variables to output_format
DIFF_FORMAT_* are now bit-flags instead of enumerated values.

Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-26 14:58:40 -07:00
Johannes Schindelin
fcb3d0adc1 add diff_flush_patch_id() to calculate the patch id
Call it like this:

unsigned char id[20];
if (diff_flush_patch_id(diff_options, id))
	printf("And the patch id is: %s\n", sha1_to_hex(id));

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-26 14:44:03 -07:00
Junio C Hamano
6a0dbb8a5c Merge branch 'jc/diff'
* jc/diff:
  diff --color: use $GIT_DIR/config
2006-06-26 14:36:02 -07:00
Junio C Hamano
2cc06a0500 Merge branch 'js/diff'
* js/diff:
  Teach diff about -b and -w flags
2006-06-26 14:28:42 -07:00
Junio C Hamano
801235c5e6 diff --color: use $GIT_DIR/config
This lets you use something like this in your $GIT_DIR/config
file.

	[diff]
		color = auto

	[diff.color]
		new = blue
		old = yellow
		frag = reverse

When diff.color is set to "auto", colored diff is enabled when
the standard output is the terminal.  Other choices are "always",
and "never".  Usual boolean true/false can also be used.

The colormap entries can specify colors for the following slots:

	plain	- lines that appear in both old and new file (context)
	meta	- diff --git header and extended git diff headers
	frag	- @@ -n,m +l,k @@ lines (hunk header)
	old	- lines deleted from old file
	new	- lines added to new file

The following color names can be used:

	normal, bold, dim, l, blink, reverse, reset,
	black, red, green, yellow, blue, magenta, cyan,
	white

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-25 00:39:13 -07:00
Timo Hirvonen
d2543b8ee3 Clean up diff.c
Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 20:16:55 -07:00
Junio C Hamano
0ec2f6b739 diff --color: use reset sequence when we mean reset.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-24 04:24:34 -07:00
Johannes Schindelin
0d21efa51c Teach diff about -b and -w flags
This adds -b (--ignore-space-change) and -w (--ignore-all-space) flags to
diff. The main part of the patch is teaching libxdiff about it.

[jc: renamed xdl_line_match() to xdl_recmatch() since the former is used
 for different purposes in xpatchi.c which is in the parts of the upstream
 source we do not use.]

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-23 17:35:27 -07:00
Linus Torvalds
50f575fc98 Tweak diff colors
This patch does:

 - always reset the color _before_ printing out the newline.

   This is actually important. You (and Johannes) didn't see it, because
   it only matters if you set the background, but if you don't do this,
   you get some random and funky behaviour if you pick a color with a
   non-default background (which still potentially has problems with tabs
   etc, but less so).

 - allow people to have a different color for the "file headers"
   (DIFF_METAINFO) and for the "fragment header" (DIFF_FRAGINFO). Also,
   make a difference between "normal color" and "reset colors"

 - default to red/green for old/new lines. That's the norm, I'd think.

 - instead of that eye-popping (and eye-ball-with-a-fondue-fork-popping)
   purple color for metadata, use bold-face for file headers, and cyan for
   the frag headers. I actually prefer the "gray background" for that, but
   it only works well in xterms, so COLOR_CYAN it is..

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-22 15:08:36 -07:00
Junio C Hamano
9d24ed4f01 Merge branch 'ff/c99' into next
* ff/c99:
  Remove all void-pointer arithmetic.
2006-06-21 03:51:59 -07:00
Florian Forster
1d7f171c3a Remove all void-pointer arithmetic.
ANSI C99 doesn't allow void-pointer arithmetic. This patch fixes this in
various ways. Usually the strategy that required the least changes was used.

Signed-off-by: Florian Forster <octo@verplant.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-20 01:59:46 -07:00
Johannes Schindelin
cd112cef99 diff options: add --color
This patch is a slightly adjusted version of Junio's patch:
http://www.gelato.unsw.edu.au/archives/git/0604/19354.html

However, instead of using a config variable, this patch makes it available
as a diff option.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-17 17:08:16 -07:00
Junio C Hamano
73f0a1577b Merge branch 'js/fmt-patch'
This makes "git format-patch" a built-in.

* js/fmt-patch:
  git-rebase: use canonical A..B syntax to format-patch
  git-format-patch: now built-in.
  fmt-patch: Support --attach
  fmt-patch: understand old <his> notation
  Teach fmt-patch about --keep-subject
  Teach fmt-patch about --numbered
  fmt-patch: implement -o <dir>
  fmt-patch: output file names to stdout
  Teach fmt-patch to write individual files.
  Use RFC2822 dates from "git fmt-patch".
  git-fmt-patch: thinkofix to show [PATCH] properly.
  rename internal format-patch wip
  Minor tweak on subject line in --pretty=email
  Tentative built-in format-patch.
2006-05-24 12:19:47 -07:00
Sean
fad70686b2 --summary output should print immediately after stats.
Currently the summary is displayed after the patch.  Fix this so
that the output order is stat-summary-patch.  As a consequence of
the way this is coded, the --summary option will only actually
display summary data if combined with either the --stat or
--patch-with-stat option.

Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-23 23:45:37 -07:00
Sean
e3008464e7 Avoid segfault in diff --stat rename output.
Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-22 18:00:38 -07:00
Junio C Hamano
5e363541d0 diff: minor option combination fix.
output_format == DIFFSTAT and with_stat == true does not make sense, and
the way the code is structured it causes trouble.  Avoid it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-22 00:31:02 -07:00
Junio C Hamano
9e848163ed checkdiff_consume: strtol parameter fix.
The second parameter is not the end of string input; it is
the optional return value to retrieve where the parser stopped.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-21 03:01:59 -07:00
Johannes Schindelin
698ce6f87e fmt-patch: Support --attach
This patch touches a couple of files, because it adds options to print a
custom text just after the subject of a commit, and just after the
diffstat.

[jc: made "many dashes" used as the boundary leader into a single
 variable, to reduce the possibility of later tweaks to miscount the
 number of dashes to break it.]

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-21 02:03:09 -07:00
Johannes Schindelin
8824689884 diff family: add --check option
Actually, it is a diff option now, so you can say

	git diff --check

to ask if what you are about to commit is a good patch.

[jc: this also would work for fmt-patch, but the point is that
 the check is done before making a commit.  format-patch is run
 from an already created commit, and that is too late to catch
 whitespace damaged change.]

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-21 01:16:09 -07:00
Junio C Hamano
638684824c Merge branch 'se/diff'
* se/diff:
  Convert some "apply --summary" users to "diff --summary".
  Add "--summary" option to git diff.
2006-05-15 23:42:37 -07:00
Junio C Hamano
05f743f328 Merge branch 'lt/diff'
* lt/diff:
  git diff: support "-U" and "--unified" options properly
2006-05-15 18:09:15 -07:00
Junio C Hamano
cc908b82a4 diffstat rename squashing fix.
When renaming leading/a/filename to leading/b/filename (and
"filename" is sufficiently long), we tried to squash the rename
to "leading/{a => b}/filename".  However, when "/a" or "/b" part
is empty, we underflowed and tried to print a substring of
length -1.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-14 22:07:28 -07:00
Sean
4bbd261bbd Add "--summary" option to git diff.
Remove the need to pipe git diff through git apply to
get the extended headers summary.

Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-14 16:28:45 -07:00
Linus Torvalds
ee1e5412a7 git diff: support "-U" and "--unified" options properly
We used to parse "-U" and "--unified" as part of the GIT_DIFF_OPTS
environment variable, but strangely enough we would _not_ parse them as
part of the normal diff command line (where we only accepted "-u").

This adds parsing of -U and --unified, both with an optional numeric
argument. So now you can just say

	git diff --unified=5

to get a unified diff with a five-line context, instead of having to do
something silly like

	GIT_DIFF_OPTS="--unified=5" git diff -u

(that silly format does continue to still work, of course).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-14 16:26:27 -07:00
Junio C Hamano
2fc240a7b2 Merge branch 'jc/bindiff'
* jc/bindiff:
  improve base85 generated assembly code
  binary diff and apply: testsuite.
  binary diff: further updates.
  binary patch.
2006-05-09 14:16:56 -07:00
Junio C Hamano
45f75a0167 Merge branch 'fix'
* fix:
  Separate object name errors from usage errors
  Documentation: {caret} fixes (git-rev-list.txt)
  Fix "git diff --stat" with long filenames
  Fix repo-config set-multivar error return path.
2006-05-08 16:40:23 -07:00
Linus Torvalds
5d6a9f45e1 Fix "git diff --stat" with long filenames
When we cut off the front of a filename to make it fit on the line, we add
a "..." in front. However, the way the "git diff" code was written, we
will never reset the prefix back to the empty string, so every single
filename afterwards will have the "..." prefix, whether appropriate or
not.

You can see this with "git diff v2.6.16.." on the current kernel tree,
since there are filenames with long names that changed there:

 [ snip snip ]
 Documentation/filesystems/vfs.txt                  |  229
 .../firmware_class/firmware_sample_driver.c        |    3
 .../firmware_sample_firmware_class.c               |    1
 ...Documentation/fujitsu/frv/kernel-ABI.txt           |  192
 ...Documentation/hwmon/w83627hf                       |    4
 [ snip snip ]

notice how the two Documentation/firmware** filenames caused the "..." to
be added, but then the later filenames don't want it, and it also screws
up the alignment of the line numbering afterwards.

Trivially fixed by moving the declaration (and initial setting) of the
"prefix" variable into the for-loop where it is used.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-08 10:54:30 -07:00
Junio C Hamano
0660626caf binary diff: further updates.
This updates the user interface and generated diff data format.

 * "diff --binary" is used to signal that we want an e-mailable
   binary patch.  It implies --full-index and -p.

 * "apply --allow-binary-replacement" acquired a short synonym
   "apply --binary".

 * After the "GIT binary patch\n" header line there is a token
   to record which binary patch mechanism was used, so that we
   can extend it later.  Currently there are two mechanisms
   defined: "literal" and "delta".  The former records the
   deflated postimage and the latter records the deflated delta
   from the preimage to postimage.

   For purely implementation convenience, I added the deflated
   length after these "literal/delta" tokens (otherwise the
   decoding side needs to guess and reallocate the buffer while
   inflating).  Improvement patches are very welcomed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-05 15:24:32 -07:00
Junio C Hamano
051308f6e9 binary patch.
This adds "binary patch" to the diff output and teaches apply
what to do with them.

On the diff generation side, traditionally, we said "Binary
files differ\n" without giving anything other than the preimage
and postimage object name on the index line.  This was good
enough for applying a patch generated from your own repository
(very useful while rebasing), because the postimage would be
available in such a case.  However, this was not useful when the
recipient of such a patch via e-mail were to apply it, even if
the preimage was available.

This patch allows the diff to generate "binary" patch when
operating under --full-index option.  The binary patch follows
the usual extended git diff headers, and looks like this:

	"GIT binary patch\n"
	<length byte><data>"\n"
	...
	"\n"

Each line is prefixed with a "length-byte", whose value is upper
or lowercase alphabet that encodes number of bytes that the data
on the line decodes to (1..52 -- 'A' means 1, 'B' means 2, ...,
'Z' means 26, 'a' means 27, ...).  <data> is 1 or more groups of
5-byte sequence, each of which encodes up to 4 bytes in base85
encoding.  Because 52 / 4 * 5 = 65 and we have the length byte,
an output line is capped to 66 characters.  The payload is the
same diff-delta as we use in the packfiles.

On the consumption side, git-apply now can decode and apply the
binary patch when --allow-binary-replacement is given, the diff
was generated with --full-index, and the receiving repository
has the preimage blob, which is the same condition as it always
required when accepting an "Binary files differ\n" patch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-05 15:24:32 -07:00
Linus Torvalds
dcb3450fd8 sha1_to_hex() usage cleanup
Somebody on the #git channel complained that the sha1_to_hex() thing uses
a static buffer which caused an error message to show the same hex output
twice instead of showing two different ones.

That's pretty easily rectified by making it uses a simple LRU of a few
buffers, which also allows some other users (that were aware of the buffer
re-use) to be written in a more straightforward manner.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-03 22:06:45 -07:00
Junio C Hamano
710158e3ca diff --stat: show complete rewrites consistently.
The patch format shows complete rewrite as deletion of all old lines
followed by addition of all new lines.  Count lines consistenly with
that when doing diffstat.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-25 23:40:09 -07:00
Junio C Hamano
6973dcaee7 Libify diff-files.
This is the first installment to libify diff brothers.

The updated diff-files uses revision.c::setup_revisions()
infrastructure to parse its command line arguments, which means
the pathname arguments are checked more strictly than before.
The tests are adjusted to separate possibly missing paths from
the rest of arguments with double-dashes, to show the kosher
way.

As Linus pointed out, renaming diff.c to diff-lib.c was simply
stupid, so I am renaming it back.  The new diff-lib.c is to
contain pieces extracted from diff brothers.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22 02:37:45 -07:00
Junio C Hamano
ba580aeafb diff: move diff.c to diff-lib.c to make room.
Now I am not doing any real "git-diff in C" yet, but this would
help before doing so.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-19 15:38:14 -07:00
Junio C Hamano
85e6326cc3 Merge branch 'fix'
* fix:
  Document git-clone --reference
  Fix filename scaling for binary files
2006-04-19 02:25:29 -07:00
Jonas Fonseca
8d6e10327d Fix filename scaling for binary files
Set maximum filename length for binary files so that scaling won't be
triggered and result in invalid string access.

Signed-off-by: Jonas Fonseca <fonseca@diku.dk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-18 14:44:58 -07:00
Junio C Hamano
3a624b346d Fix "git log --stat": make sure to set recursive with --stat.
Just like "patch" format always needs recursive, "diffstat"
format does not make sense without setting recursive.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-18 11:43:09 -07:00
Junio C Hamano
f56ef54174 diff --stat: make sure to set recursive.
Just like "patch" format always needs recursive, "diffstat"
format does not make sense without setting recursive.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-18 11:29:33 -07:00
Johannes Schindelin
2935327394 diff-options: add --patch-with-stat
With this option, git prepends a diffstat in front of the patch.

Since I really, really do not know what a diffstat of a combined diff
("merge diff") should look like, the diffstat is not generated for these.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-15 19:30:27 -07:00
Junio C Hamano
cbdda02404 diff-files --stat: do not dump core with unmerged index.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-15 19:22:48 -07:00
Junio C Hamano
6f4780f9df diff --stat: do not do its own three-dashes.
I missed that "git-diff-* --stat" spits out three-dash separator
on its own without being asked.  Remove it.

When we output commit log followed by diff, perhaps --patch-with-stat,
for downstream consumer, we _would_ want the three-dash between
the message and the diff material, but that logic belongs to the
caller, not diff generator.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-15 14:06:42 -07:00
Junio C Hamano
84981f9ad9 diff --stat: no need to ask funcnames nor context.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-13 21:35:54 -07:00
Johannes Schindelin
ece634d147 diff-options: add --stat (take 2)
... and a fix for an invalid free():


Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-13 16:50:02 -07:00
Johannes Schindelin
d75f7952ef diff-options: add --stat (take 2)
Now, you can say "git diff --stat" (to get an idea how many changes are
uncommitted), or "git log --stat".

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-13 16:48:24 -07:00
Petr Baudis
90c1b08c7d Separate the raw diff and patch with a newline
More friendly for human reading I believe, and possibly friendlier to some
parsers (although only by an epsilon).

Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-11 11:17:50 -07:00
Junio C Hamano
86ff1d2012 diff-* --patch-with-raw
This new flag outputs the diff-raw output and diff-patch output
at the same time.  Requested by Cogito.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-10 19:44:18 -07:00
Junio C Hamano
77882f60d9 Retire diffcore-pathspec.
Nobody except diff-stages used it -- the callers instead filtered
the input to diffcore themselves.  Make diff-stages do that as
well and retire diffcore-pathspec.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-10 15:57:24 -07:00
Junio C Hamano
a041d94f29 diff: fix output of total-rewrite diff.
We did not read in the file data before emitting the
total-rewrite diff.  Noticed by Pasky.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-08 20:32:40 -07:00
Junio C Hamano
b4196cf70a Merge branches 'master' and 'jc/combine' into next
* master:
  Add git-clean command
  diff_flush(): leakfix.
  parse_date(): fix parsing 03/10/2006

* jc/combine:
  combine-diff: refactor built-in xdiff interface.
2006-04-05 02:58:14 -07:00
Junio C Hamano
12d81ce598 Merge branch 'fix'
* fix:
  diff_flush(): leakfix.
  parse_date(): fix parsing 03/10/2006
2006-04-05 02:50:54 -07:00
Junio C Hamano
7d6c447145 diff_flush(): leakfix.
We were leaking filepairs when output-format was set to
NO_OUTPUT.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-05 02:48:41 -07:00
Petr Baudis
d01d8c6782 Support for pickaxe matching regular expressions
git-diff-* --pickaxe-regex will change the -S pickaxe to match
POSIX extended regular expressions instead of fixed strings.

The regex.h library is a rather stupid interface and I like pcre too, but
with any luck it will be everywhere we will want to run Git on, it being
POSIX.2 and all. I'm not sure if we can expect platforms like AIX to
conform to POSIX.2 or if win32 has regex.h. We might add a flag to
Makefile if there is a portability trouble potential.

Signed-off-by: Petr Baudis <pasky@suse.cz>
2006-04-04 13:44:15 -07:00
Junio C Hamano
1b0c7174a1 tree/diff header cleanup.
Introduce tree-walk.[ch] and move "struct tree_desc" and
associated functions from various places.

Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and
move it to cache.h.  This macro returns the canonicalized
st_mode value in the host byte order for files, symlinks and
directories -- to be compared with a tree_desc entry.
create_ce_mode(mode) in cache.h is similar but is intended to be
used for index entries (so it does not work for directories) and
returns the value in the network byte order.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-29 23:54:13 -08:00
Mark Wooding
acb7257729 xdiff: Show function names in hunk headers.
The speed of the built-in diff generator is nice; but the function names
shown by `diff -p' are /really/ nice.  And I hate having to choose.  So,
we hack xdiff to find the function names and print them.

xdiff has grown a flag to say whether to dig up the function names.  The
builtin_diff function passes this flag unconditionally.  I suppose it
could parse GIT_DIFF_OPTS, but it doesn't at the moment.  I've also
reintroduced the `function name' into the test suite, from which it was
removed in commit 3ce8f089.

The function names are parsed by a particularly stupid algorithm at the
moment: it just tries to find a line in the `old' file, from before the
start of the hunk, whose first character looks plausible.  Still, it's
most definitely a start.

Signed-off-by: Mark Wooding <mdw@distorted.org.uk>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-27 18:43:51 -08:00
Junio C Hamano
b9aa1f9e9d Merge branch 'lt/diffgen' into next
* lt/diffgen:
  true built-in diff: run everything in-core.
2006-03-26 00:15:44 -08:00
Junio C Hamano
cebff98dbe true built-in diff: run everything in-core.
This stops using temporary files when we are using the built-in
diff (including the complete rewrite).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-25 23:27:01 -08:00
Junio C Hamano
9acf322d69 Merge branch 'lt/diffgen' into next
* lt/diffgen:
  built-in diff: minimum tweaks
  builtin-diff: \No newline at end of file.
  Use a *real* built-in diff generator
2006-03-25 17:44:01 -08:00
Junio C Hamano
3ce8f08944 built-in diff: minimum tweaks
This fixes up a couple of minor issues with the real built-in
diff to be more usable:

 - Omit ---/+++ header unless we emit diff output;

 - Detect and punt binary diff like GNU does;

 - Honor GIT_DIFF_OPTS minimally (only -u<number> and
   --unified=<number> are currently supported);

 - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header
   (i.e. when k == 1 or n == 1)

 - Adjust testsuite for the lack of -p support.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-25 16:50:00 -08:00
Linus Torvalds
3443546f6e Use a *real* built-in diff generator
This uses a simplified libxdiff setup to generate unified diffs _without_
doing  fork/execve of GNU "diff".

This has several huge advantages, for example:

Before:

	[torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null

	real    0m24.818s
	user    0m13.332s
	sys     0m8.664s

After:

	[torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null

	real    0m4.563s
	user    0m2.944s
	sys     0m1.580s

and the fact that this should be a lot more portable (ie we can ignore all
the issues with doing fork/execve under Windows).

Perhaps even more importantly, this allows us to do diffs without actually
ever writing out the git file contents to a temporary file (and without
any of the shell quoting issues on filenames etc etc).

NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the
current "diff-core" code actually will always write the temp-files,
because it used to be something that you simply had to do. So this current
one actually writes a temp-file like before, and then reads it into memory
again just to do the diff. Stupid.

But if this basic infrastructure is accepted, we can start switching over
diff-core to not write temp-files, which should speed things up even
further, especially when doing big tree-to-tree diffs.

Now, in the interest of full disclosure, I should also point out a few
downsides:

 - the libxdiff algorithm is different, and I bet GNU diff has gotten a
   lot more testing. And the thing is, generating a diff is not an exact
   science - you can get two different diffs (and you will), and they can
   both be perfectly valid. So it's not possible to "validate" the
   libxdiff output by just comparing it against GNU diff.

 - GNU diff does some nice eye-candy, like trying to figure out what the
   last function was, and adding that information to the "@@ .." line.
   libxdiff doesn't do that.

 - The libxdiff thing has some known deficiencies. In particular, it gets
   the "\No newline at end of file" case wrong. So this is currently for
   the experimental branch only. I hope Davide will help fix it.

That said, I think the huge performance advantage, and the fact that it
integrates better is definitely worth it. But it should go into a
development branch at least due to the missing newline issue.

Technical note: this is based on libxdiff-0.17, but I did some surgery to
get rid of the extraneous fat - stuff that git doesn't need, and seriously
cutting down on mmfile_t, which had much more capabilities than the diff
algorithm either needed or used. In this version, "mmfile_t" is just a
trivial <pointer,length> tuple.

That said, I tried to keep the differences to simple removals, so that you
can do a diff between this and the libxdiff origin, and you'll basically
see just things getting deleted. Even the mmfile_t simplifications are
left in a state where the diffs should be readable.

Apologies to Davide, whom I'd love to get feedback on this all from (I
wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old
complex format had a helper function for that, but I did my surgery with
the goal in mind that eventually we _should_ just do

	mmfile_t mf;

	buf = read_sha1_file(sha1, type, &size);
	mf->ptr = buf;
	mf->size = size;
	.. use "mf" directly ..

which was really a nightmare with the old "helpful" mmfile_t, and really
is that easy with the new cut-down interfaces).

[ Btw, as any hawk-eye can see from the diff, this was actually generated
  with itself, so it is "self-hosting". That's about all the testing it
  has gotten, along with the above kernel diff, which eye-balls correctly,
  but shows the newline issue when you double-check it with "git-apply" ]

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-25 16:49:58 -08:00
Junio C Hamano
c06c79667c diffcore-rename: somewhat optimized.
This changes diffcore-rename to reuse statistics information
gathered during similarity estimation, and updates the hashtable
implementation used to keep track of the statistics to be
denser.  This seems to give better performance.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-12 03:22:10 -08:00
Linus Torvalds
8676eb4313 Make git diff-generation use a simpler spawn-like interface
Instead of depending of fork() and execve() and doing things in between
the two, make the git diff functions do everything up front, and then do
a single "spawn_prog()" invocation to run the actual external diff
program (if any is even needed).

This actually ends up simplifying the code, and should make it much
easier to make it efficient under broken operating systems (read: Windows).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-26 16:21:27 -08:00
Junio C Hamano
ee072260db Merge branch 'jc/nostat'
* jc/nostat:
  cache_name_compare() compares name and stage, nothing else.
  "assume unchanged" git: documentation.
  ls-files: split "show-valid-bit" into a different option.
  "Assume unchanged" git: --really-refresh fix.
  ls-files: debugging aid for CE_VALID changes.
  "Assume unchanged" git: do not set CE_VALID with --refresh
  "Assume unchanged" git
2006-02-21 22:33:21 -08:00
Junio C Hamano
297a1aadbe find_unique_abbrev() simplification.
Earlier it did not grok the 0{40} SHA1 very well, but what it
needed to do was to find the shortest 0{N} that is not used as a
valid object name to be consistent with the way names of valid
objects are abbreviated.  This makes some users simpler.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-10 01:51:12 -08:00
Junio C Hamano
5f73076c1a "Assume unchanged" git
This adds "assume unchanged" logic, started by this message in the list
discussion recently:

	<Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>

This is a workaround for filesystems that do not have lstat()
that is quick enough for the index mechanism to take advantage
of.  On the paths marked as "assumed to be unchanged", the user
needs to explicitly use update-index to register the object name
to be in the next commit.

You can use two new options to update-index to set and reset the
CE_VALID bit:

	git-update-index --assume-unchanged path...
	git-update-index --no-assume-unchanged path...

These forms manipulate only the CE_VALID bit; it does not change
the object name recorded in the index file.  Nor they add a new
entry to the index.

When the configuration variable "core.ignorestat = true" is set,
the index entries are marked with CE_VALID bit automatically
after:

 - update-index to explicitly register the current object name to the
   index file.

 - when update-index --refresh finds the path to be up-to-date.

 - when tools like read-tree -u and apply --index update the working
   tree file and register the current object name to the index file.

The flag is dropped upon read-tree that does not check out the index
entry.  This happens regardless of the core.ignorestat settings.

Index entries marked with CE_VALID bit are assumed to be
unchanged most of the time.  However, there are cases that
CE_VALID bit is ignored for the sake of safety and usability:

 - while "git-read-tree -m" or git-apply need to make sure
   that the paths involved in the merge do not have local
   modifications.  This sacrifices performance for safety.

 - when git-checkout-index -f -q -u -a tries to see if it needs
   to checkout the paths.  Otherwise you can never check
   anything out ;-).

 - when git-update-index --really-refresh (a new flag) tries to
   see if the index entry is up to date.  You can start with
   everything marked as CE_VALID and run this once to drop
   CE_VALID bit for paths that are modified.

Most notably, "update-index --refresh" honours CE_VALID and does
not actively stat, so after you modified a file in the working
tree, update-index --refresh would not notice until you tell the
index about it with "git-update-index path" or "git-update-index
--no-assume-unchanged path".

This version is not expected to be perfect.  I think diff
between index and/or tree and working files may need some
adjustment, and there probably needs other cases we should
automatically unmark paths that are marked to be CE_VALID.

But the basics seem to work, and ready to be tested by people
who asked for this feature.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-08 21:54:42 -08:00
Petr Baudis
6a1f79c1f1 Allow diff and index commands to be interrupted
So far, e.g. git-update-index --refresh was basically uninterruptable
by ctrl-c, since it hooked the SIGINT handler, but that handler would
only unlink the lockfile but not actually quit. This makes it propagate
the signal to the default handler.

Note that I expected it to work without resetting the signal handler to
SIG_DFL, but without that it ended in an infinite loop of tgkill()s -
is my glibc violating SUS or what?

Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-01 19:47:52 -08:00
Junio C Hamano
6b1ddbdd6e diff --abbrev=<n> option fix.
Earier specifying an abbreviation shorter than minimum fell back
to full 40 letters, which was nonsense.  Make it to fall back to
the minimum number (currently 4).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-28 00:09:39 -08:00
Junio C Hamano
46a6c2620b abbrev cleanup: use symbolic constants
The minimum length of abbreviated object name was hardcoded in
different places to be 4, risking inconsistencies in the future.
Also there were three different "default abbreviation
precision".  Use two C preprocessor symbols to clean up this
mess.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-28 00:09:38 -08:00
Junio C Hamano
82f9d58a39 code comments: spell
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-29 01:32:56 -08:00
Johannes Schindelin
975b31dc6e Handle symlinks graciously
This patch converts a stat() to an lstat() call, thereby fixing the case
when the date of a symlink was not the same as the one recorded in the
index. The included test case demonstrates this.

This is for the case that the symlink points to a non-existing file. If
the file exists, worse things than just an error message happen.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-26 18:33:27 -08:00
Junio C Hamano
7e4a2a8483 avoid asking ?alloc() for zero bytes.
Avoid asking for zero bytes when that change simplifies overall
logic.  Later we would change the wrapper to ask for 1 byte on
platforms that return NULL for zero byte request.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-26 17:23:59 -08:00
Junio C Hamano
47dd0d595d diff: --abbrev option
When I show transcripts to explain how something works, I often
find myself hand-editing the diff-raw output to shorten various
object names in the output.

This adds --abbrev option to the diff family, which shortens
diff-raw output and diff-tree commit id headers.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-19 18:32:44 -08:00
Junio C Hamano
9ce392f482 Move diff.renamelimit out of default configuration.
Otherwise we would end up linking all the unneeded stuff into git-daemon
only to link with git_default_config.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-21 23:00:50 -08:00
H. Peter Anvin
1b1480ff6a rename/copy score parsing updates.
Better variant, which handles stuff like "4.5%" and rejects
"192.168.0.1".  Additionally, make sure numbers are unsigned (I'm making
them unsigned long just for the hell of it), to make sure that
artificial wraparound scenarios don't cause harm.

	-hpa

[jc: with this, -M100 changes its meaning back to 10%.  People
wanting to say "pure renames only" should now say -M100% or
-M1.0; sounds a bit like an earthquake, but arguably things are
more consistent this way ;-)]

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-21 14:54:33 -08:00
Junio C Hamano
9f70b80692 rename detection with -M100 means "exact renames only".
When the user is interested in pure renames, there is no point
doing the similarity scores.  This changes the score argument
parsing to special case -M100 (otherwise, it is a precision
scaled value 0 <= v < 1 and would mean 0.1, not 1.0 --- if you
do mean 0.1, you can say -M1), and optimizes the diffcore_rename
transformation to only look at pure renames in that case.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-21 12:21:24 -08:00
Junio C Hamano
80b1e511d7 diff: --full-index
A new option, --full-index, is introduced to diff family.  This
causes the full object name of pre- and post-images to appear on
the index line of patch formatted output, to be used in
conjunction with --allow-binary-replacement option of git-apply.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-16 16:20:40 -08:00
Junio C Hamano
3299c6f6a8 diff: make default rename detection limit configurable.
A while ago, a rename-detection limit logic was implemented as a
response to this thread:

	http://marc.theaimsgroup.com/?l=git&m=112413080630175

where gitweb was found to be using a lot of time and memory to
detect renames on huge commits.  git-diff family takes -l<num>
flag, and if the number of paths that are rename destination
candidates (i.e. new paths with -M, or modified paths with -C)
are larger than that number, skips rename/copy detection even
when -M or -C is specified on the command line.

This commit makes the rename detection limit easier to use.  You
can have:

	[diff]
		renamelimit = 30

in your .git/config file to specify the default rename detection
limit.  You can override this from the command line; giving 0
means 'unlimited':

	git diff -M -l0

We might want to change the default behaviour, when you do not
have the configuration, to limit it to say 20 paths or so.  This
would also help the diffstat generation after a big 'git pull'.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-15 15:08:27 -08:00
Linus Torvalds
ac1b3d1248 Split up tree diff functions into tree-diff.c library
This makes the tree diff functionality independent of the "git-diff-tree"
program, by splitting the core functionality up into a library file.

This will be needed for when we teach git-rev-list to only follow a
specified set of pathnames, rather than the global revision history.

Most of it is a fairly straightforward code move, but it also involves
some calling convention cleanup, and moving some of the static variables
from diff-tree.c into the options structure.

The actual tree change callback routines also become paramterized by the
diff_options structure, allowing the library functionality to do something
else than just show the diff on stdout.

Right now the only user of this functionality remains git-diff-tree
itself.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-22 22:49:51 -07:00
Linus Torvalds
694a764fc2 Handle "-" at beginning of filenames, part 3
This fixes the default built-in exec() of "diff" to add a "--" before the
filenames, so that if a filename starts with a "-", the diff program won't
think it's an option.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-18 00:16:45 -07:00
Junio C Hamano
cf9dfc669e Update git-diff-* to use C-style quoting for funny pathnames.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-17 17:41:56 -07:00
Junio C Hamano
ec1fcc16af Show original and resulting blob object info in diff output.
This adds more cruft to diff --git header to record the blob SHA1 and
the mode the patch/diff is intended to be applied against, to help the
receiving end fall back on a three-way merge.  The new header looks
like this:

    diff --git a/apply.c b/apply.c
    index 7be5041..8366082 100644
    --- a/apply.c
    +++ b/apply.c
    @@ -14,6 +14,7 @@
     //    files that are being modified, but doesn't apply the patch
     //  --stat does just a diffstat, and doesn't actually apply
    +//  --show-index-info shows the old and new index info for...
    ...

Upon receiving such a patch, if the patch did not apply cleanly to the
target tree, the recipient can try to find the matching old objects in
her object database and create a temporary tree, apply the patch to
that temporary tree, and attempt a 3-way merge between the patched
temporary tree and the target tree using the original temporary tree
as the common ancestor.

The patch lifts the code to compute the hash for an on-filesystem
object from update-index.c and makes it available to the diff output
routine.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-07 03:42:00 -07:00
Junio C Hamano
88cd621dee Consolidate null_sha1[].
Signed-off-by: Junio C Hamano <junio@twinsun.com>
2005-09-30 22:12:01 -07:00
Junio C Hamano
946f5f7c24 Diff: --name-status output format.
The new output format shows only the status letter and paths.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-24 23:50:44 -07:00
Junio C Hamano
8082d8d305 Diff: -l<num> to limit rename/copy detection.
When many paths are modified, rename detection takes a lot of time.
The new option -l<num> can be used to disable rename detection when
more than <num> paths are possibly created as renames.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-24 23:50:44 -07:00
Junio C Hamano
6b5ee137e5 Diff clean-up.
This is a long overdue clean-up to the code for parsing and passing
diff options.  It also tightens some constness issues.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-24 23:50:43 -07:00
Junio C Hamano
5cfcd07c93 Retire diff-helper.
The textual diff generation with built-in '-p' in diff-* brothers has
proven to be useful enough that git-diff-helper outlived its usefulness.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-22 01:54:13 -07:00
Junio C Hamano
5098bafb75 Plug diff leaks.
It is a bit embarrassing that it took this long for a fix since the
problem was first reported on Aug 13th.

    Message-ID: <87y876gl1r.wl@mail2.atmark-techno.com>
    From: Yasushi SHOJI <yashi@atmark-techno.com>
    Newsgroups: gmane.comp.version-control.git
    Subject: [patch] possible memory leak in diff.c::diff_free_filepair()
    Date: Sat, 13 Aug 2005 19:58:56 +0900

This time I used valgrind to make sure that it does not overeagerly
discard memory that is still being used.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-15 16:13:43 -07:00
Junio C Hamano
19397b4521 Revert "[PATCH] plug memory leak in diff.c::diff_free_filepair()"
This reverts 068eac91ce commit.
2005-09-14 14:06:50 -07:00
Linus Torvalds
705a7148ba [PATCH] Fix alloc_filespec() initialization
This simplifies and fixes the initialization of a "diff_filespec" when
allocated.

The old code would not initialize "sha1_valid". Noticed by valgrind.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-14 13:57:34 -07:00
Junio C Hamano
6bac10d89d Fix copy marking from diffcore-rename.
When (A,B) ==> (B,C) rename-copy was detected, we incorrectly said
that C was created by copying B.  This is because we only check if the
path of rename/copy source still exists in the resulting tree to see
if the file is renamed out of existence.  In this case, the new B is
created by copying or renaming A, so the original B is lost and we
should say C is a rename of B not a copy of B.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-10 12:42:32 -07:00
Junio C Hamano
a9ab586a5d Retire support for old environment variables.
We have deprecated the old environment variable names for quite a
while and now it's time to remove them.  Gone are:

    SHA1_FILE_DIRECTORIES AUTHOR_DATE AUTHOR_EMAIL AUTHOR_NAME
    COMMIT_AUTHOR_EMAIL COMMIT_AUTHOR_NAME SHA1_FILE_DIRECTORY

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-09 14:48:54 -07:00
Junio C Hamano
5de36bfec9 Fix compilation warnings.
... found by compiling them with gcc 2.95.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-29 21:17:21 -07:00
Jason Riedy
c7c81b3a51 Fix ?: statements.
Omitting the first branch in ?: is a GNU extension.  Cute,
but not supported by other compilers.  Replaced mostly
by explicit tests.  Calls to getenv() simply are repeated
on non-GNU compilers.

Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu>
2005-08-23 20:41:12 -07:00
Yasushi SHOJI
90a734dc7f [PATCH] possible memory leak in diff.c::diff_free_filepair()
Here is a patch to fix the problem in the simplest way.
2005-08-21 03:48:33 -07:00
Yasushi SHOJI
068eac91ce [PATCH] plug memory leak in diff.c::diff_free_filepair()
When I run git-diff-tree on big change, it seems the command eats so
much memory.  so I just put git under valgrind to see what's going on.
diff_free_filespec_data() doesn't free diff_filespec itself.

[jc: I ended up doing things slightly differently from Yasushi's
patch.  The original idea was to use free_filespec_data() only to
free the data portion and keep useing the filespec itself, but
no existing code seems to do things that way, so I just yanked
that part out.]

Signed-off-by: Yasushi SHOJI <yashi@atmark-techno.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-13 18:28:55 -07:00
Junio C Hamano
79db12e8ba A bit more format warning squelching.
Inspired by patch from Timo Sirainen.  Most of them are not
strictly necessary but making warnings less chatty would help
spot real bugs later.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-09 22:28:19 -07:00
Holger Eitzenberger
64f8a631e1 [PATCH] git: use git_mkstemp() instead of mkstemp() for diff generation.
This lets you run git diff in a repository otherwise read-only
to you.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-05 23:06:58 -07:00
Pavel Roskin
e35f982415 [PATCH] mmap error handling
I have reviewed all occurrences of mmap() in git and fixed three types
of errors/defects:

1) The result is not checked.
2) The file descriptor is closed if mmap() succeeds, but not when it
fails.
3) Various casts applied to -1 are used instead of MAP_FAILED, which is
specifically defined to check mmap() return value.

[jc: This is a second round of Pavel's patch.  He fixed up the problem
that close() potentially clobbering the errno from mmap, which
the first round had.]

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-29 17:21:48 -07:00
Junio C Hamano
e7baa4f45f Use symbolic constants for diff-raw status indicators.
Both Cogito and StGIT prefer to see 'A' for new files.  The
current 'N' is visually harder to distinguish from 'M', which is
used for modified files.  Prepare the internals to use symbolic
constants to make the change easier.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-25 17:15:34 -07:00
Linus Torvalds
e68b6f1525 Split up "diff_format" into "format" and "line_termination".
This removes the separate "formats" for name and name-with-zero-
termination.

It also removes the difference between HUMAN and MACHINE formats, and
they both become DIFF_FORMAT_RAW, with the difference being just in the
line and inter-filename termination.

It also makes the code easier to understand.
2005-07-14 17:59:17 -07:00
Junio C Hamano
52f28529f4 [PATCH] git-diff-*: --name-only and --name-only-z.
Porcelain layers often want to find only names of changed files,
and even with diff-raw output format they end up having to pick
out only the filename.  Support --name-only (and --name-only-z
for xargs -0 and cpio -0 users that want to treat filenames with
embedded newlines sanely) flag to help them.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 12:55:07 -07:00
Junio C Hamano
6fb737be5e [PATCH] Make sq_expand() available as sq_quote().
A useful shell safety helper sq_expand() was hidden as a static
function in diff.c.  Extract it out and make it available as
sq_quote().

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-08 11:01:10 -07:00
Junio C Hamano
36e4d74a21 [PATCH] Enhance sha1_file_size() into sha1_object_info()
This lets us eliminate one use of map_sha1_file() outside
sha1_file.c, to bring us one step closer to the packed GIT.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:27:51 -07:00
Junio C Hamano
366175ef8c [PATCH] Rework -B output.
Patch for a completely rewritten file detected by the -B flag
was shown as a pair of creation followed by deletion in earlier
versions.  This was an misguided attempt to make reviewing such
a complete rewrite easier, and unnecessarily ended up confusing
git-apply.  Instead, show the entire contents of old version
prefixed with '-', followed by the entire contents of new
version prefixed with '+'.  This gives the same easy-to-review
for human consumer while keeping it a single, regular
modification patch for machine consumption, something that even
GNU patch can grok.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-19 20:13:18 -07:00
Junio C Hamano
f2ce9fde57 [PATCH] Add --diff-filter= output restriction to diff-* family.
This is a halfway between debugging aid and a helper to write an
ultra-smart merge scripts.  The new option takes a string that
consists of a list of "status" letters, and limits the diff
output to only those classes of changes, with two exceptions:

 - A broken pair (aka "complete rewrite"), does not match D
   (deleted) or N (created).  Use B to look for them.

 - The letter "A" in the diff-filter string does not match
   anything itself, but causes the entire diff that contains
   selected patches to be output (this behaviour is similar to
   that of --pickaxe-all for the -S option).

For example,

    $ git-rev-list HEAD |
      git-diff-tree --stdin -s -v -B -C --diff-filter=BCR

shows a list of commits that have complete rewrite, copy, or
rename.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-12 20:40:20 -07:00
Junio C Hamano
2210100ac0 [PATCH] Fix rename/copy when dealing with temporarily broken pairs.
When rename/copy uses a file that was broken by diffcore-break
as the source, and the broken filepair gets merged back later,
the output was mislabeled as a rename.  In this case, the source
file ends up staying in the output, so we should label it as a
copy instead.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-12 20:40:19 -07:00
Junio C Hamano
dc7090efbc [PATCH] Re-Fix SIGSEGV on unmerged files in git-diff-files -p
When an unmerged path was fed via diff_unmerged() into diffcore,
it eventually called run_diff() with "one" and "two" parameters
with NULL, but run_diff() was not written carefully enough to
notice this situation.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-12 20:29:31 -07:00
Linus Torvalds
dc93841715 diff 'rename' format change.
Clearly even Junio felt git "rename" header lines should say "from/to"
instead of "old/new", since he wrote the documentation that way.

This way it also matches "copy".

git-apply will accept both versions, at least for a while.
2005-06-05 15:31:52 -07:00
Junio C Hamano
49d9e85d11 [PATCH] diff.c: -B argument passing fix.
This fixes a bug that was preventing non-default parameter to -B
option to be passed correctly; you could not give more than 50%
break score.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-05 14:14:58 -07:00
Junio C Hamano
0601e131c9 [PATCH] diff.c: locate_size_cache() fix.
This fixes two bugs.

 - declaration of auto variable "cmp" was preceeded by a
   statement, causing compilation error on real C compilers;
   noticed and patch given by Yoichi Yuasa.

 - the function's calling convention was overloading its size
   parameter to mean "largest possible value means do not add
   entry", which was a bad taste.  Brought up during a
   discussion with Peter Baudis.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-05 14:14:58 -07:00
Junio C Hamano
eeaa460314 [PATCH] diff: Update -B heuristics.
As Linus pointed out on the mailing list discussion, -B should
break a files that has many inserts even if it still keeps
enough of the original contents, so that the broken pieces can
later be matched with other files by -M or -C.  However, if such
a broken pair does not get picked up by -M or -C, we would want
to apply different criteria; namely, regardless of the amount of
new material in the result, the determination of "rewrite"
should be done by looking at the amount of original material
still left in the result.  If you still have the original 97
lines from a 100-line document, it does not matter if you add
your own 13 lines to make a 110-line document, or if you add 903
lines to make a 1000-line document.  It is not a rewrite but an
in-place edit.  On the other hand, if you did lose 97 lines from
the original, it does not matter if you added 27 lines to make a
30-line document or if you added 997 lines to make a 1000-line
document.  You did a complete rewrite in either case.

This patch introduces a post-processing phase that runs after
diffcore-rename matches up broken pairs diffcore-break creates.
The purpose of this post-processing is to pick up these broken
pieces and merge them back into in-place modifications.  For
this, the score parameter -B option takes is changed into a pair
of numbers, and it takes "-B99/80" format when fully spelled
out.  The first number is the minimum amount of "edit" (same
definition as what diffcore-rename uses, which is "sum of
deletion and insertion") that a modification needs to have to be
broken, and the second number is the minimum amount of "delete"
a surviving broken pair must have to avoid being merged back
together.  It can be abbreviated to "-B" to use default for
both, "-B9" or "-B9/" to use 90% for "edit" but default (80%)
for merge avoidance, or "-B/75" to use default (99%) "edit" and
75% for merge avoidance.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-03 11:23:03 -07:00
Junio C Hamano
0e3994fa97 [PATCH] diff: Clean up diff_scoreopt_parse().
This cleans up diff_scoreopt_parse() function that is used to
parse the fractional notation -B, -C and -M option takes.  The
callers are modified to check for errors and complain.  Earlier
they silently ignored malformed input and falled back on the
default.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-03 11:23:03 -07:00
Junio C Hamano
65c2e0c349 [PATCH] Find size of SHA1 object without inflating everything.
This adds sha1_file_size() helper function and uses it in the
rename/copy similarity estimator.  The helper function handles
deltified object as well.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-02 15:48:33 -07:00
Junio C Hamano
67574c403f [PATCH] diff: mode bits fixes
The core GIT repository has trees that record regular file mode
in 0664 instead of normalized 0644 pattern.  Comparing such a
tree with another tree that records the same file in 0644
pattern without content changes with git-diff-tree causes it to
feed otherwise unmodified pairs to the diff_change() routine,
which triggers a sanity check routine and barfs.  This patch
fixes the problem, along with the fix to another caller that
uses unnormalized mode bits to call diff_change() routine in a
similar way.

Without this patch, you will see "fatal error" from diff-tree
when you run git-deltafy-script on the core GIT repository
itself.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-01 13:24:03 -07:00
Junio C Hamano
70aadac081 [PATCH] Show dissimilarity index for D and N case.
The way broken deletes and creates are shown in the -p
(diff-patch) output format has become consistent with how
rename/copy edits are shown.  They will show "dissimilarity
index" value, immediately following the "deleted file mode" and
"new file mode" lines.

The git-apply is taught to grok such an extended header.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-30 18:10:46 -07:00
Junio C Hamano
af5323e027 [PATCH] Add -O<orderfile> option to diff-* brothers.
A new diffcore filter diffcore-order is introduced.  This takes
a text file each of whose line is a shell glob pattern.  Patches
that match a glob pattern on an earlier line in the file are
output before patches that match a later line, and patches that
do not match any glob pattern are output last.

A typical orderfile for git project probably should look like
this:

    README
    Makefile
    Documentation
    *.h
    *.c

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 18:10:46 -07:00
Junio C Hamano
f345b0a066 [PATCH] Add -B flag to diff-* brothers.
A new diffcore transformation, diffcore-break.c, is introduced.

When the -B flag is given, a patch that represents a complete
rewrite is broken into a deletion followed by a creation.  This
makes it easier to review such a complete rewrite patch.

The -B flag takes the same syntax as the -M and -C flags to
specify the minimum amount of non-source material the resulting
file needs to have to be considered a complete rewrite, and
defaults to 99% if not specified.

As the new test t4008-diff-break-rewrite.sh demonstrates, if a
file is a complete rewrite, it is broken into a delete/create
pair, which can further be subjected to the usual rename
detection if -M or -C is used.  For example, if file0 gets
completely rewritten to make it as if it were rather based on
file1 which itself disappeared, the following happens:

    The original change looks like this:

	file0     --> file0' (quite different from file0)
	file1     --> /dev/null

    After diffcore-break runs, it would become this:

	file0     --> /dev/null
	/dev/null --> file0'
	file1     --> /dev/null

    Then diffcore-rename matches them up:

	file1     --> file0'

The internal score values are finer grained now.  Earlier
maximum of 10000 has been raised to 60000; there is no user
visible changes but there is no reason to waste available bits.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
2cd68882ee [PATCH] diff: fix the culling of unneeded delete record.
The commit 15d061b435

    [PATCH] Fix the way diffcore-rename records unremoved source.

still leaves unneeded delete records in its output stream by
mistake, which was covered up by having an extra check to turn
such a delete into a no-op downstream.  Fix the check in the
diffcore-rename to simplify the output routine.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
9d429ff6ff [PATCH] diff: further cleanup.
When preparing data to feed the external diff, we should give
the mode we obtained from the caller, even when we are dealing
with a file with 0{40} SHA1 (i.e. the caller said "look at the
filesystem"), since the mode passed by the caller via
diff_addremove() or diff_change() is always trustworthy.

This is _not_ a bugfix --- the existing code stat() on the file
ifself and does the same computation on st.st_mode to compute
the mode the same way the caller did to give the original mode.
We cannot remove the stat() call from here, but the extra
computation to create the mode value is unnecessary.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
01c4e70f63 [PATCH] diff: code clean-up and removal of rename hack.
A new macro, DIFF_PAIR_RENAME(), is introduced to distinguish a
filepair that is a rename/copy (the definition of which is src
and dst are different paths, of course).  This removes the hack
used in the record_rename_pair() to always put a non-zero value
in the score field.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
befe86392c [PATCH] diff: consolidate various calls into diffcore.
The three diff-* brothers had a sequence of calls into diffcore
that were almost identical.  Introduce a new diffcore_std()
function that takes all the necessary arguments to consolidate
it.  This will make later enhancements and changing the order of
diffcore application simpler.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
f0c6b2a2fd [PATCH] Optimize diff-tree -[CM] --stdin
This attempts to optimize "diff-tree -[CM] --stdin", which
compares successible tree pairs.  This optimization does not
make much sense for other commands in the diff-* brothers.

When reading from --stdin and using rename/copy detection, the
patch makes diff-tree to read the current index file first.
This is done to reuse the optimization used by diff-cache in the
non-cached case.  Similarity estimator can avoid expanding a
blob if the index says what is in the work tree has an exact
copy of that blob already expanded.

Another optimization the patch makes is to check only file sizes
first to terminate similarity estimation early.  In order for
this to work, it needs a way to tell the size of the blob
without expanding it.  Since an obvious way of doing it, which
is to keep all the blobs previously used in the memory, is too
costly, it does so by keeping the filesize for each object it
has already seen in memory.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:44 -07:00
Junio C Hamano
15d061b435 [PATCH] Fix the way diffcore-rename records unremoved source.
Earier version of diffcore-rename used to keep unmodified
filepair in its output so that the last stage of the processing
that tells renames from copies can make all of rename/copy to
copies.  However this had a bad interaction with other diffcore
filters that wanted to run after diffcore-rename, in that such
unmodified filepair must be retained for proper distinction
between renames and copies to happen.

This patch fixes the problem by changing the way diffcore-rename
records the information needed to distinguish "all are copies"
case and "the last one is a rename" case.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:43 -07:00
Junio C Hamano
be020332a1 [PATCH] Remove a function not used anymore.
Earlier rename/copy detection left unmodified filepair in the
output and forced downstream to keep them even when they are
filtering, and the diff_needs_to_stay() function was used for
the logic.  It is not used anymore, so remove it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:43 -07:00
Junio C Hamano
19feebc8c3 [PATCH] Clean up diff_setup() to make it more extensible.
This changes the argument of diff_setup() from an integer that
says if we are feeding reversed diff to a bitmask, so that later
global options can be added more easily.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:43 -07:00
Junio C Hamano
09d9d1a648 [PATCH] Remove final newline from the value of xfrm_msg variable.
This change makes the implementation of git-external-diff-script
cleaner.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:43 -07:00
Junio C Hamano
903d475a0b [PATCH] Do not expose internal scaling to diff-helper.
Instead we can normalize what diff-raw records at the diffcore
side.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:43 -07:00
Junio C Hamano
226406f693 [PATCH] Introduce diff_free_filepair() funcion.
This introduces a new function to free a common data structure,
and plugs some leaks.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:43 -07:00
Junio C Hamano
4130b99571 [PATCH] Diff updates to express type changes
With the introduction of type 'T' in the diff-raw output, and
the "apply-patch" program Linus has been quietly working on
without much advertisement, it started to make sense to emit
usable information in the "diff --git" patch output format as
well.  Earlier built-in diff driver punted and did not say
anything about a symbolic link changing into a file or vice
versa, but this version represents it as a pair of deletion
and creation.

It also fixes a minor problem dealing with old archive created
with ancient git.  The earlier code was reporting file mode
change between 100664 and 100644 (we shouldn't).  The linux-2.6
git tree has a good example that exposes this problem.  A good
test case is commit ce1dc02f76432a46db149241e015a4f782974623.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-26 15:18:55 -07:00
Junio C Hamano
9fdade0673 [PATCH] Mode only changes from diff.
This fixes another bug.

 - Mode-only changes were pruned incorrectly from the output.
 - Added test to catch the above problem.
 - Normalize rename/copy similarity score in the diff-raw output
   to per-cent, no matter what scale we internally use.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-25 16:06:24 -07:00
Junio C Hamano
96716a1976 [PATCH] Fix type-change handling when assigning the status code to filepairs.
The interim single-liner '?' fix resulted delete entries that
should not have emitted coming out in the output as an
unintended side effect; I caught this with the "rename" test in
the test suite.  This patch instead fixes the code that assigns
the status code to each filepair.

I verified this does not break the testcase in udev.git tree Kay
Sievers gave us, by running git-diff-tree on that tree which
showed 21 file to symlink changes.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-25 15:21:57 -07:00
Linus Torvalds
6f93a631ac diff.c: don't silently ignore unknown state changes in diffs.
Give them an "unknown" status, ie '?'.
2005-05-25 11:09:12 -07:00
Junio C Hamano
25d5ea410f [PATCH] Redo rename/copy detection logic.
Earlier implementation had a major screw-up in the memory
management area.  Rename/copy logic sometimes borrowed a pointer
to a structure without any provision for downstream to determine
which pointer is shared and which is not.  This resulted in the
later clean-up code to sometimes double free such structure,
resulting in a segfault.  This made -M and -C useless.

Another problem the earlier implementation had was that it
reordered the patches, and forced the logic to differentiate
renames and copies to depend on that particular order.  This
problem was fixed by teaching rename/copy detection logic not to
do any reordering, and rename-copy differentiator not to depend
on the order of the patches.  The diffs will leave rename/copy
detector in the same destination path order as the patch that
was fed into it.  Some test vectors have been reordered to
accommodate this change.

It also adds a sanity check logic to the human-readable diff-raw
output to detect paths with embedded TAB and LF characters,
which cannot be expressed with that format.  This idea came up
during a discussion with Chris Wedgwood.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-24 01:26:26 -07:00
Junio C Hamano
bceafe752c [PATCH] Fix diff-pruning logic which was running prune too early.
For later stages to reorder patches, pruning logic and rename detection
logic should not decide which delete to discard (because another entry
said it will take over the file as a rename) until the very end.

Also fix some tests that were assuming the earlier "last one is rename
or keep everything else is copy" semantics of diff-raw format, which no
longer is true.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 19:17:06 -07:00
Junio C Hamano
b6d8f309d9 [PATCH] diff-raw format update take #2.
This changes the diff-raw format again, following the mailing
list discussion.  The new format explicitly expresses which one
is a rename and which one is a copy.

The documentation and tests are updated to match this change.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 16:23:10 -07:00
Junio C Hamano
f7c1512af8 [PATCH] Rename/copy detection fix.
The rename/copy detection logic in earlier round was only good
enough to show patch output and discussion on the mailing list
about the diff-raw format updates revealed many problems with
it.  This patch fixes all the ones known to me, without making
things I want to do later impossible, mostly related to patch
reordering.

 (1) Earlier rename/copy detector determined which one is rename
     and which one is copy too early, which made it impossible
     to later introduce diffcore transformers to reorder
     patches.  This patch fixes it by moving that logic to the
     very end of the processing.

 (2) Earlier output routine diff_flush() was pruning all the
     "no-change" entries indiscriminatingly.  This was done due
     to my false assumption that one of the requirements in the
     diff-raw output was not to show such an entry (which
     resulted in my incorrect comment about "diff-helper never
     being able to be equivalent to built-in diff driver").  My
     special thanks go to Linus for correcting me about this.
     When we produce diff-raw output, for the downstream to be
     able to tell renames from copies, sometimes it _is_
     necessary to output "no-change" entries, and this patch
     adds diffcore_prune() function for doing it.

 (3) Earlier diff_filepair structure was trying to be not too
     specific about rename/copy operations, but the purpose of
     the structure was to record one or two paths, which _was_
     indeed about rename/copy.  This patch discards xfrm_msg
     field which was trying to be generic for this wrong reason,
     and introduces a couple of fields (rename_score and
     rename_rank) that are explicitly specific to rename/copy
     logic.  One thing to note is that the information in a
     single diff_filepair structure _still_ does not distinguish
     renames from copies, and it is deliberately so.  This is to
     allow patches to be reordered in later stages.

 (4) This patch also adds some tests about diff-raw format
     output and makes sure that necessary "no-change" entries
     appear on the output.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 11:49:30 -07:00
Linus Torvalds
09d74b3b5a Some more sparse warning fixes
Proper function declarations and NULL pointer usage.
2005-05-22 14:33:43 -07:00
Linus Torvalds
6b0c312106 Include file cleanups..
Add <limits.h> to the include files handled by "cache.h", and remove
extraneous #include directives from various .c files. The rule is that
"cache.h" gets all the basic stuff, so that we'll have as few system
dependencies as possible.
2005-05-22 11:54:17 -07:00
Junio C Hamano
6b14d7faf0 [PATCH] Diffcore updates.
This moves the path selection logic from individual programs to a new
diffcore transformer (diff-tree still needs to have its own for
performance reasons).  Also the header printing code in diff-tree was
tweaked not to produce anything when pickaxe is in effect and there is
nothing interesting to report.  An interesting example is the following
in the GIT archive itself:

    $ git-whatchanged -p -C -S'or something in a real script'

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 10:17:50 -07:00
Junio C Hamano
81e50eabf0 [PATCH] The diff-raw format updates.
Update the diff-raw format as Linus and I discussed, except that
it does not use sequence of underscore '_' letters to express
nonexistence.  All '0' mode is used for that purpose instead.

The new diff-raw format can express rename/copy, and the earlier
restriction that -M and -C _must_ be used with the patch format
output is no longer necessary.  The patch makes -M and -C flags
independent of -p flag, so you need to say git-whatchanged -M -p
to get the diff/patch format.

Updated are both documentations and tests.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 22:49:19 -07:00
Junio C Hamano
38c6f78059 [PATCH] Prepare diffcore interface for diff-tree header supression.
This does not actually supress the extra headers when pickaxe is
used, but prepares enough support for diff-tree to implement it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 22:49:19 -07:00
Junio C Hamano
057c7d3018 [PATCH] Constness fix for pickaxe option.
Constness fix for pickaxe option.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-21 15:17:16 -07:00
Junio C Hamano
52e9578985 [PATCH] Introducing software archaeologist's tool "pickaxe".
This steals the "pickaxe" feature from JIT and make it available
to the bare Plumbing layer.  From the command line, the user
gives a string he is intersted in.

Using the diff-core infrastructure previously introduced, it
filters the differences to limit the output only to the diffs
between <src> and <dst> where the string appears only in one but
not in the other.  For example:

 $ ./git-rev-list HEAD | ./git-diff-tree -Sdiff-tree-helper --stdin -M

would show the diffs that touch the string "diff-tree-helper".

In real software-archaeologist application, you would typically
look for a few to several lines of code and see where that code
came from.

The "pickaxe" module runs after "rename/copy detection" module,
so it even crosses the file rename boundary, as the above
example demonstrates.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 09:58:03 -07:00
Junio C Hamano
427dcb4bca [PATCH] Diff overhaul, adding half of copy detection.
This introduces the diff-core, the layer between the diff-tree
family and the external diff interface engine.  The calls to the
interface diff-tree family uses (diff_change and diff_addremove)
have not changed and will not change.  The purpose of the
diff-core layer is to provide an infrastructure to transform the
set of differences sent from the applications, before sending
them to the external diff interface.

The recently introduced rename detection code has been rewritten
to use the diff-core facility.  When applications send in
separate creates and deletes, matching ones are transformed into
a single rename-and-edit diff, and sent out to the external diff
interface as such.

This patch also enhances the rename detection code further to be
able to detect copies.  Currently this happens only as long as
copy sources appear as part of the modified files, but there
already is enough provision for callers to report unmodified
files to diff-core, so that they can be also used as copy source
candidates.  Extending the callers this way will be done in a
separate patch.

Please see and marvel at how well this works by trying out the
newly added t/t4003-diff-rename-1.sh test script.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 09:58:03 -07:00
Linus Torvalds
e99d59ff0b sparse cleanup
Fix various things that sparse complains about:
 - use NULL instead of 0
 - make sure we declare everything properly, or mark it static
 - use proper function declarations ("fn(void)" instead of "fn()")

Sparse is always right.
2005-05-20 11:46:10 -07:00
Junio C Hamano
7ca45252a3 [PATCH] Simplify "reverse-diff" logic in the diff core.
Instead of swapping the arguments just before output, this patch
makes the swapping happen on the input side of the diff core,
when "reverse-diff" is in effect.  This greatly simplifies the
logic, but more importantly it is necessary for upcoming "copy
detection" work.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 10:08:56 -07:00