Commit Graph

208 Commits

Author SHA1 Message Date
Junio C Hamano
e6cb314c08 Merge branch 'ph/diffopts'
* ph/diffopts:
  Reorder diff_opt_parse options more logically per topics.
  Make the diff_options bitfields be an unsigned with explicit masks.
  Use OPT_BIT in builtin-pack-refs
  Use OPT_BIT in builtin-for-each-ref
  Use OPT_SET_INT and OPT_BIT in builtin-branch
  parse-options new features.
2007-11-18 15:50:16 -08:00
Pierre Habouzit
8f67f8aefb Make the diff_options bitfields be an unsigned with explicit masks.
reverse_diff was a bit-value in disguise, it's merged in the flags now.

Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-11 16:54:15 -08:00
Junio C Hamano
fb63d7f889 git-add: make the entry stat-clean after re-adding the same contents
Earlier in commit 0781b8a9b2
(add_file_to_index: skip rehashing if the cached stat already
matches), add_file_to_index() were taught not to re-add the path
if it already matches the index.

The change meant well, but was not executed quite right.  It
used ie_modified() to see if the file on the work tree is really
different from the index, and skipped adding the contents if the
function says "not modified".

This was wrong.  There are three possible comparison results
between the index and the file in the work tree:

 - with lstat(2) we _know_ they are different.  E.g. if the
   length or the owner in the cached stat information is
   different from the length we just obtained from lstat(2), we
   can tell the file is modified without looking at the actual
   contents.

 - with lstat(2) we _know_ they are the same.  The same length,
   the same owner, the same everything (but this has a twist, as
   described below).

 - we cannot tell from lstat(2) information alone and need to go
   to the filesystem to actually compare.

The last case arises from what we call 'racy git' situation,
that can be caused with this sequence:

    $ echo hello >file
    $ git add file
    $ echo aeiou >file ;# the same length

If the second "echo" is done within the same filesystem
timestamp granularity as the first "echo", then the timestamp
recorded by "git add" and the timestamp we get from lstat(2)
will be the same, and we can mistakenly say the file is not
modified.  The path is called 'racily clean'.  We need to
reliably detect racily clean paths are in fact modified.

To solve this problem, when we write out the index, we mark the
index entry that has the same timestamp as the index file itself
(that is the time from the point of view of the filesystem) to
tell any later code that does the lstat(2) comparison not to
trust the cached stat info, and ie_modified() then actually goes
to the filesystem to compare the contents for such a path.

That's all good, but it should not be used for this "git add"
optimization, as the goal of "git add" is to actually update the
path in the index and make it stat-clean.  With the false
optimization, we did _not_ cause any data loss (after all, what
we failed to do was only to update the cached stat information),
but it made the following sequence leave the file stat dirty:

    $ echo hello >file
    $ git add file
    $ echo hello >file ;# the same contents
    $ git add file

The solution is not to use ie_modified() which goes to the
filesystem to see if it is really clean, but instead use
ie_match_stat() with "assume racily clean paths are dirty"
option, to force re-adding of such a path.

There was another problem with "git add -u".  The codepath
shares the same issue when adding the paths that are found to be
modified, but in addition, it asked "git diff-files" machinery
run_diff_files() function (which is "git diff-files") to list
the paths that are modified.  But "git diff-files" machinery
uses the same ie_modified() call so that it does not report
racily clean _and_ actually clean paths as modified, which is
not what we want.

The patch allows the callers of run_diff_files() to pass the
same "assume racily clean paths are dirty" option, and makes
"git-add -u" codepath to use that option, to discover and re-add
racily clean _and_ actually clean paths.

We could further optimize on top of this patch to differentiate
the case where the path really needs re-adding (i.e. the content
of the racily clean entry was indeed different) and the case
where only the cached stat information needs to be refreshed
(i.e. the racily clean entry was actually clean), but I do not
think it is worth it.

This patch applies to maint and all the way up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:37:39 -08:00
Junio C Hamano
4bd5b7dacc ce_match_stat, run_diff_files: use symbolic constants for readability
ce_match_stat() can be told:

 (1) to ignore CE_VALID bit (used under "assume unchanged" mode)
     and perform the stat comparison anyway;

 (2) not to perform the contents comparison for racily clean
     entries and report mismatch of cached stat information;

using its "option" parameter.  Give them symbolic constants.

Similarly, run_diff_files() can be told not to report anything
on removed paths.  Also give it a symbolic constant for that.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-10 00:24:51 -08:00
René Scharfe
6d2d9e8666 diff: squelch empty diffs even more
When we compare two non-tracked files, or explicitly
specify --no-index, the suggestion to run git-status
is not helpful.

The patch adds a new diff_options bitfield member, no_index, that
is used instead of the special value of -2 of the rev_info field
max_count to indicate that the index is not to be used.  This makes
it possible to pass that flag down to diffcore_skip_stat_unmatch(),
which only has one diff_options parameter.

This could even become a cleanup if we removed all assignments of
max_count to a value of -2 (viz. replacement of a magic value with
a self-documenting field name) but I didn't dare to do that so late
in the rc game..

The no_index bit, if set, then tells diffcore_skip_stat_unmatch()
to not account for any skipped stat-mismatches, which avoids the
suggestion to run git-status.

Signed-off-by: Rene Scharfe <rene.scharfe@lsfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-14 22:34:58 -07:00
Junio C Hamano
fb13227e08 git-diff: squelch "empty" diffs
After starting to edit a working tree file but later when your edit ends
up identical to the original (this can also happen when you ran a
wholesale regexp replace with something like "perl -i" that does not
actually modify many of the paths), "git diff" between the index and the
working tree outputs many "empty" diffs that show "diff --git" headers
and nothing else, because these paths are stat-dirty.  While it was a
way to warn the user that the earlier action of the user made the index
ineffective as an optimization mechanism, it was felt too loud for the
purpose of warning even to experienced users, and also resulted in
confusing people new to git.

This replaces the "empty" diffs with a single warning message at the
end.  Having many such paths hurts performance, and you can run
"git-update-index --refresh" to update the lstat(2) information recorded
in the index in such a case.  "git-status" does so as a side effect, and
that is more familiar to the end-user, so we recommend it to them.

The change affects only "git diff" that outputs patch text, because that
is where the annoyance of too many "empty" diff is most strongly felt,
and because the warning message can be safely ignored by downstream
tools without getting mistaken as part of the patch.  For the low-level
"git diff-files" and "git diff-index", the traditional behaviour is
retained.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-14 01:55:00 -07:00
Linus Torvalds
750f7b668f Finally implement "git log --follow"
Ok, I've really held off doing this too damn long, because I'm lazy, and I
was always hoping that somebody else would do it.

But no, people keep asking for it, but nobody actually did anything, so I
decided I might as well bite the bullet, and instead of telling people
they could add a "--follow" flag to "git log" to do what they want to do,
I decided that it looks like I just have to do it for them..

The code wasn't actually that complicated, in that the diffstat for this
patch literally says "70 insertions(+), 1 deletions(-)", but I will have
to admit that in order to get to this fairly simple patch, you did have to
know and understand the internal git diff generation machinery pretty
well, and had to really be able to follow how commit generation interacts
with generating patches and generating the log.

So I suspect that while I was right that it wasn't that hard, I might have
been expecting too much of random people - this patch does seem to be
firmly in the core "Linus or Junio" territory.

To make a long story short: I'm sorry for it taking so long until I just
did it.

I'm not going to guarantee that this works for everybody, but you really
can just look at the patch, and after the appropriate appreciative noises
("Ooh, aah") over how clever I am, you can then just notice that the code
itself isn't really that complicated.

All the real new code is in the new "try_to_follow_renames()" function. It
really isn't rocket science: we notice that the pathname we were looking
at went away, so we start a full tree diff and try to see if we can
instead make that pathname be a rename or a copy from some other previous
pathname. And if we can, we just continue, except we show *that*
particular diff, and ever after we use the _previous_ pathname.

One thing to look out for: the "rename detection" is considered to be a
singular event in the _linear_ "git log" output! That's what people want
to do, but I just wanted to point out that this patch is *not* carrying
around a "commit,pathname" kind of pair and it's *not* going to be able to
notice the file coming from multiple *different* files in earlier history.

IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind
of "files have single identities" kind of semantics, and git log will just
pick the identity based on the normal move/copy heuristics _as_if_ the
history could be linearized.

Put another way: I think the model is broken, but given the broken model,
I think this patch does just about as well as you can do. If you have
merges with the same "file" having different filenames over the two
branches, git will just end up picking _one_ of the pathnames at the point
where the newer one goes away. It never looks at multiple pathnames in
parallel.

And if you understood all that, you probably didn't need it explained, and
if you didn't understand the above blathering, it doesn't really mtter to
you. What matters to you is that you can now do

	git log -p --follow builtin-rev-list.c

and it will find the point where the old "rev-list.c" got renamed to
"builtin-rev-list.c" and show it as such.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-22 23:37:11 -07:00
Junio C Hamano
16befb8b7f Even more missing static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-08 02:54:57 -07:00
Junio C Hamano
f1af60bdba Support 'diff=pgm' attribute
This enhances the attributes mechanism so that external programs
meant for existing GIT_EXTERNAL_DIFF interface can be specifed
per path.

To configure such a custom diff driver, first define a custom
diff driver in the configuration:

	[diff "my-c-diff"]
		command = <<your command string comes here>>

Then mark the paths that you want to use this custom driver
using the attribute mechanism.

	*.c	diff=my-c-diff

The intent of this separation is that the attribute mechanism is
used for specifying the type of the contents, while the
configuration mechanism is used to define what needs to be done
to that type of the contents, which would be specific to both
platform and personal taste.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-22 22:16:14 -07:00
Junio C Hamano
68aacb2f3c diff --quiet
This adds the command line option 'quiet' to tell 'git diff-*'
that we are not interested in the actual diff contents but only
want to know if there is any change.  This option automatically
turns --exit-code on, and turns off output formatting, as it
does not make much sense to show the first hit we happened to
have found.

The --quiet option is silently turned off (but --exit-code is
still in effect, so is silent output) if postprocessing filters
such as pickaxe and diff-filter are used.  For all practical
purposes I do not think of a reason to want to use these filters
and not viewing the diff output.

The backends have not been taught about the option with this patch.
That is a topic for later rounds.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14 16:21:19 -07:00
Junio C Hamano
3161b4b521 Remove unused diffcore_std_no_resolve
This was only used by diff-tree-helper program, whose purpose
was to translate a raw diff to a patch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14 16:21:19 -07:00
Alex Riesen
41bbf9d585 Allow git-diff exit with codes similar to diff(1)
This introduces a new command-line option: --exit-code. The diff
programs will return 1 for differences, return 0 for equality, and
something else for errors.

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-14 16:21:19 -07:00
Johannes Schindelin
fcfa33ec90 diff: make more cases implicit --no-index
When specifying an absolute path, or a relative path pointing outside
the working tree, do not fail, but roll your own diffopt parsing,
and execute a --no-index diff.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-28 16:32:31 -08:00
Johannes Schindelin
34a5e1a2d9 diff --no-index: also imitate the exit status of diff(1)
diff sets the exit status to 0 when no changes were found, to 1
when changes were found, and 2 means error.

We imitate this to be able to use "git diff" in the test scripts.
(Actually, keeping in line with the rest of git, -1 is returned
on error, which corresponds to an exit status 255).

To find out if the diff is not empty, a member called
"found_changes" was introduced in struct diff_options, which is
set in builtin_diff() and fn_out_consume().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-26 01:20:55 -08:00
Johannes Schindelin
d516c2d119 Teach git-diff-files the new option --no-index
With this flag and given two paths, git-diff-files behaves as a GNU diff
lookalike (plus the git goodies like --check, colour, etc.).  This flag
is also available in git-diff.  It also works outside of a git repository.

In addition, if git-diff{,-files} is called without revision or stage
parameter, and with exactly two paths at least one of which is not tracked,
the default is --no-index.

So, you can now say

	git diff /etc/inittab /etc/fstab

and it actually works!

This also unifies the duplicated argument parsing between cmd_diff_files()
and builtin_diff_files().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-22 20:59:55 -08:00
Junio C Hamano
1cfe77333f git-blame: no rev means start from the working tree file.
Warning: this changes the semantics.

This makes "git blame" without any positive rev to start digging
from the working tree copy, which is made into a fake commit
whose sole parent is the HEAD.

It also adds --contents <file> option to pretend as if the
working tree copy has the contents of the named file.  You can
use '-' to make the command read from the standard input.

If you want the command to start annotating from the HEAD
commit, you need to explicitly give HEAD parameter.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-05 14:55:11 -08:00
Junio C Hamano
e9c8409900 diff-index --cached --raw: show tree entry on the LHS for unmerged entries.
This updates the way diffcore represents an unmerged pair
somewhat.  It used to be that entries with mode=0 on both sides
were used to represent an unmerged pair, but now it has an
explicit flag.  This is to allow diff-index --cached to report
the entry from the tree when the path is unmerged in the index.

This is used in updating "git reset <tree> -- <path>" to restore
absense of the path in the index from the tree.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-01-06 22:57:42 -08:00
Nicolas Pitre
ebd124c678 make commit message a little more consistent and conforting
It is nicer to let the user know when a commit succeeded all the time,
not only the first time.  Also the commit sha1 is much more useful than
the tree sha1 in this case.

This patch also introduces a -q switch to supress this message as well
as the summary of created/deleted files.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-15 22:29:54 -08:00
Junio C Hamano
24ad8e0ce2 Merge branch 'jc/pickaxe' 2006-11-07 16:33:59 -08:00
Junio C Hamano
2f3f8b218a git-pickaxe: rename detection optimization
The idea is that we are interested in renaming into only one path, so
we do not care about renames that happen elsewhere.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-11-04 12:18:12 -08:00
Rene Scharfe
2b60356da5 Make git-cherry handle root trees
This patch on top of 'next' makes built-in git-cherry handle root
commits.

It moves the static function log-tree.c::diff_root_tree() to
tree-diff.c and makes it more similar to diff_tree_sha1() by
shuffling around arguments and factoring out the call to
log_tree_diff_flush().  Consequently the name is changed to
diff_root_tree_sha1().  It is a version of diff_tree_sha1() that
compares the empty tree (= root tree) against a single 'real' tree.

This function is then used in get_patch_id() to compute patch IDs
for initial commits instead of SEGFAULTing, as the current code
does if confronted with parentless commits.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-26 18:31:17 -07:00
Junio C Hamano
74e2abe5b7 diff --numstat
[jc: with documentation from Jakub]

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-10-13 21:37:10 -07:00
Junio C Hamano
dd0c367e5e Merge branch 'jc/diff-stat'
* jc/diff-stat:
  diff --stat: ensure at least one '-' for deletions, and one '+' for additions
  diff --stat=width[,name-width]: allow custom diffstat output width.
  diff --stat: color output.
  diff --stat: allow custom diffstat output width.
2006-09-30 21:29:18 -07:00
Junio C Hamano
a2540023dc diff --stat: allow custom diffstat output width.
This adds two parameters to "diff --stat".

 . --stat-width=72 tells that the page should fit on 72-column output.

 . --stat-name-width=30 tells that the filename part is limited
   to 30 columns.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27 02:55:08 -07:00
Junio C Hamano
448c3ef144 diff.c: second war on whitespace.
This adds DIFF_WHITESPACE color class (default = reverse red) to
colored diff output to let you catch common whitespace errors.

 - trailing whitespaces at the end of line
 - a space followed by a tab in the indent

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-24 00:12:44 -07:00
Jeff King
0424558190 diff: support custom callbacks for output
Users can request the DIFF_FORMAT_CALLBACK output format to get a callback
consisting of the whole diff_queue_struct.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-07 15:40:25 -07:00
Johannes Schindelin
f59a59e22f Add the --color-words option to the diff options family
With this option, the changed words are shown inline. For example,
if a file containing "This is foo" is changed to "This is bar", the diff
will now show "This is " in plain text, "foo" in red, and "bar" in green.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-10 15:28:57 -07:00
Jeff King
ce43697379 Colorize 'commit' lines in log ui
When paging through the output of git-whatchanged, the color cues help to
visually navigate within a diff. However, it is difficult to notice when a
new commit starts, because the commit and log are shown in the "normal"
color. This patch colorizes the 'commit' line, customizable through
diff.colors.commit and defaulting to yellow.

As a side effect, some of the diff color engine (slot enum, get_color) has
become accessible outside of diff.c.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-24 00:04:41 -07:00
Junio C Hamano
83ad63cfeb diff: do not use configuration magic at the core-level
The Porcelainish has become so much usable as the UI that there
is not much reason people should be using the core programs by
hand anymore.  At this point we are better off making the
behaviour of the core programs predictable by keeping them
unaffected by the configuration variables.  Otherwise they will
become very hard to use as reliable building blocks.

For example, "git-commit -a" internally uses git-diff-files to
figure out the set of paths that need to be updated in the
index, and we should never allow diff.renames that happens to be
in the configuration to interfere (or slow down the process).

The UI level configuration such as showing renamed diff and
coloring are still honored by the Porcelainish ("git log" family
and "git diff"), but not by the core anymore.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-08 03:11:01 -07:00
Stephan Feder
ca49920f6f Add -a and --text to common diff options help
Signed-off-by: Stephan Feder <sf@b-i-t.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-07 12:28:04 -07:00
Stephan Feder
6d64ea965b Teach --text option to diff
Add new item text to struct diff_options.
If set then do not try to detect binary files.

Signed-off-by: Stephan Feder <sf@b-i-t.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-07 12:28:04 -07:00
Junio C Hamano
0c926a3d9c Merge branch 'th/diff'
* th/diff:
  builtin-diff: turn recursive on when defaulting to --patch format.
  t4013: note improvements brought by the new output code.
  t4013: add format-patch tests.
  format-patch: fix diff format option implementation
  combine-diff.c: type sanity.
  t4013 test updates for new output code.
  Fix some more diff options changes.
  Fix diff-tree -s
  log --raw: Don't descend into subdirectories by default
  diff-tree: Use ---\n as a message separator
  Print empty line between raw, stat, summary and patch
  t4013: add more tests around -c and --cc
  whatchanged: Default to DIFF_FORMAT_RAW
  Don't xcalloc() struct diffstat_t
  Add msg_sep to diff_options
  DIFF_FORMAT_RAW is not default anymore
  Set default diff output format after parsing command line
  Make --raw option available for all diff commands
  Merge with_raw, with_stat and summary variables to output_format
  t4013: add tests for diff/log family output options.
2006-07-05 16:31:24 -07:00
Timo Hirvonen
39bc9a6c20 Add msg_sep to diff_options
Add msg_sep variable to struct diff_options.  msg_sep is printed after
commit message.  Default is "\n", format-patch sets it to "---\n".

This also removes the second argument from show_log() because all
callers derived it from the first argument:

    show_log(rev, rev->loginfo, ...

Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-26 14:58:41 -07:00
Timo Hirvonen
c6744349df Merge with_raw, with_stat and summary variables to output_format
DIFF_FORMAT_* are now bit-flags instead of enumerated values.

Signed-off-by: Timo Hirvonen <tihirvon@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-26 14:58:40 -07:00
Johannes Schindelin
fcb3d0adc1 add diff_flush_patch_id() to calculate the patch id
Call it like this:

unsigned char id[20];
if (diff_flush_patch_id(diff_options, id))
	printf("And the patch id is: %s\n", sha1_to_hex(id));

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-26 14:44:03 -07:00
Johannes Schindelin
0d21efa51c Teach diff about -b and -w flags
This adds -b (--ignore-space-change) and -w (--ignore-all-space) flags to
diff. The main part of the patch is teaching libxdiff about it.

[jc: renamed xdl_line_match() to xdl_recmatch() since the former is used
 for different purposes in xpatchi.c which is in the parts of the upstream
 source we do not use.]

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-23 17:35:27 -07:00
Johannes Schindelin
cd112cef99 diff options: add --color
This patch is a slightly adjusted version of Junio's patch:
http://www.gelato.unsw.edu.au/archives/git/0604/19354.html

However, instead of using a config variable, this patch makes it available
as a diff option.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-06-17 17:08:16 -07:00
Johannes Schindelin
698ce6f87e fmt-patch: Support --attach
This patch touches a couple of files, because it adds options to print a
custom text just after the subject of a commit, and just after the
diffstat.

[jc: made "many dashes" used as the boundary leader into a single
 variable, to reduce the possibility of later tweaks to miscount the
 number of dashes to break it.]

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-21 02:03:09 -07:00
Johannes Schindelin
8824689884 diff family: add --check option
Actually, it is a diff option now, so you can say

	git diff --check

to ask if what you are about to commit is a good patch.

[jc: this also would work for fmt-patch, but the point is that
 the check is done before making a commit.  format-patch is run
 from an already created commit, and that is too late to catch
 whitespace damaged change.]

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-21 01:16:09 -07:00
Junio C Hamano
638684824c Merge branch 'se/diff'
* se/diff:
  Convert some "apply --summary" users to "diff --summary".
  Add "--summary" option to git diff.
2006-05-15 23:42:37 -07:00
Sean
4bbd261bbd Add "--summary" option to git diff.
Remove the need to pipe git diff through git apply to
get the extended headers summary.

Signed-off-by: Sean Estabrooks <seanlkml@sympatico.ca>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-14 16:28:45 -07:00
Linus Torvalds
ee1e5412a7 git diff: support "-U" and "--unified" options properly
We used to parse "-U" and "--unified" as part of the GIT_DIFF_OPTS
environment variable, but strangely enough we would _not_ parse them as
part of the normal diff command line (where we only accepted "-u").

This adds parsing of -U and --unified, both with an optional numeric
argument. So now you can just say

	git diff --unified=5

to get a unified diff with a five-line context, instead of having to do
something silly like

	GIT_DIFF_OPTS="--unified=5" git diff -u

(that silly format does continue to still work, of course).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-14 16:26:27 -07:00
Junio C Hamano
0660626caf binary diff: further updates.
This updates the user interface and generated diff data format.

 * "diff --binary" is used to signal that we want an e-mailable
   binary patch.  It implies --full-index and -p.

 * "apply --allow-binary-replacement" acquired a short synonym
   "apply --binary".

 * After the "GIT binary patch\n" header line there is a token
   to record which binary patch mechanism was used, so that we
   can extend it later.  Currently there are two mechanisms
   defined: "literal" and "delta".  The former records the
   deflated postimage and the latter records the deflated delta
   from the preimage to postimage.

   For purely implementation convenience, I added the deflated
   length after these "literal/delta" tokens (otherwise the
   decoding side needs to guess and reallocate the buffer while
   inflating).  Improvement patches are very welcomed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-05 15:24:32 -07:00
Junio C Hamano
0fe7c1de16 built-in diff: assorted updates.
"git diff(n)" without --base, --ours, etc. defaults to --cc,
which usually is the same as -p unless you are in the middle of
a conflicted merge, just like the shell script version.

"git diff(n) blobA blobB path" complains and dies.

"git diff(n) tree0 tree1 tree2...treeN" does combined diff that
shows a merge of tree1..treeN to result in tree0.

Giving "-c" option to any command that defaults to "--cc" turns
off dense-combined flag.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-29 01:32:53 -07:00
Junio C Hamano
5c21ac0e7c Libified diff-index: backward compatibility fix.
"diff-index -m" does not mean "do not ignore merges", but means
"pretend missing files match the index".

The previous round tried to address this, but failed because
setup_revisions() ate "-m" flag before the caller had a chance
to intervene.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22 04:03:32 -07:00
Junio C Hamano
e09ad6e1e3 Libify diff-index.
The second installment to libify diff brothers.  The pathname
arguments are checked more strictly than before because we now
use the revision.c::setup_revisions() infrastructure.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22 02:43:00 -07:00
Junio C Hamano
6973dcaee7 Libify diff-files.
This is the first installment to libify diff brothers.

The updated diff-files uses revision.c::setup_revisions()
infrastructure to parse its command line arguments, which means
the pathname arguments are checked more strictly than before.
The tests are adjusted to separate possibly missing paths from
the rest of arguments with double-dashes, to show the kosher
way.

As Linus pointed out, renaming diff.c to diff-lib.c was simply
stupid, so I am renaming it back.  The new diff-lib.c is to
contain pieces extracted from diff brothers.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-22 02:37:45 -07:00
Linus Torvalds
9153983310 Log message printout cleanups
On Sun, 16 Apr 2006, Junio C Hamano wrote:
>
> In the mid-term, I am hoping we can drop the generate_header()
> callchain _and_ the custom code that formats commit log in-core,
> found in cmd_log_wc().

Ok, this was nastier than expected, just because the dependencies between
the different log-printing stuff were absolutely _everywhere_, but here's
a patch that does exactly that.

The patch is not very easy to read, and the "--patch-with-stat" thing is
still broken (it does not call the "show_log()" thing properly for
merges). That's not a new bug. In the new world order it _should_ do
something like

	if (rev->logopt)
		show_log(rev, rev->logopt, "---\n");

but it doesn't. I haven't looked at the --with-stat logic, so I left it
alone.

That said, this patch removes more lines than it adds, and in particular,
the "cmd_log_wc()" loop is now a very clean:

	while ((commit = get_revision(rev)) != NULL) {
		log_tree_commit(rev, commit);
		free(commit->buffer);
		commit->buffer = NULL;
	}

so it doesn't get much prettier than this. All the complexity is entirely
hidden in log-tree.c, and any code that needs to flush the log literally
just needs to do the "if (rev->logopt) show_log(...)" incantation.

I had to make the combined_diff() logic take a "struct rev_info" instead
of just a "struct diff_options", but that part is pretty clean.

This does change "git whatchanged" from using "diff-tree" as the commit
descriptor to "commit", and I changed one of the tests to reflect that new
reality. Otherwise everything still passes, and my other tests look fine
too.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-17 15:18:25 -07:00
Johannes Schindelin
2935327394 diff-options: add --patch-with-stat
With this option, git prepends a diffstat in front of the patch.

Since I really, really do not know what a diffstat of a combined diff
("merge diff") should look like, the diffstat is not generated for these.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-15 19:30:27 -07:00
Johannes Schindelin
d75f7952ef diff-options: add --stat (take 2)
Now, you can say "git diff --stat" (to get an idea how many changes are
uncommitted), or "git log --stat".

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-13 16:48:24 -07:00
Junio C Hamano
4da8cbc234 Merge branch 'jc/diff' into next
* jc/diff:
  blame and friends: adjust to multiple pathspec change.
  git log --full-diff
  tree-diff: do not assume we use only one pathspec
2006-04-11 14:34:53 -07:00
Petr Baudis
5c91da25d7 Document --patch-with-raw
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-11 11:16:51 -07:00
Junio C Hamano
86ff1d2012 diff-* --patch-with-raw
This new flag outputs the diff-raw output and diff-patch output
at the same time.  Requested by Cogito.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-10 19:44:18 -07:00
Junio C Hamano
a8baa7b9f5 tree-diff: do not assume we use only one pathspec
The way tree-diff was set up assumed we would use only one set
of pathspec during the entire life of the program.  Move the
pathspec related static variables out to diff_options structure
so that we can filter commits with one set of paths while show
the actual diffs using different set of paths.

I suspect this breaks blame.c, and makes "git log paths..." to
default to the --full-diff, the latter of which is dealt with
the next commit.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-10 16:45:19 -07:00
Junio C Hamano
77882f60d9 Retire diffcore-pathspec.
Nobody except diff-stages used it -- the callers instead filtered
the input to diffcore themselves.  Make diff-stages do that as
well and retire diffcore-pathspec.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-04-10 15:57:24 -07:00
Petr Baudis
d01d8c6782 Support for pickaxe matching regular expressions
git-diff-* --pickaxe-regex will change the -S pickaxe to match
POSIX extended regular expressions instead of fixed strings.

The regex.h library is a rather stupid interface and I like pcre too, but
with any luck it will be everywhere we will want to run Git on, it being
POSIX.2 and all. I'm not sure if we can expect platforms like AIX to
conform to POSIX.2 or if win32 has regex.h. We might add a flag to
Makefile if there is a portability trouble potential.

Signed-off-by: Petr Baudis <pasky@suse.cz>
2006-04-04 13:44:15 -07:00
Junio C Hamano
1b0c7174a1 tree/diff header cleanup.
Introduce tree-walk.[ch] and move "struct tree_desc" and
associated functions from various places.

Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and
move it to cache.h.  This macro returns the canonicalized
st_mode value in the host byte order for files, symlinks and
directories -- to be compared with a tree_desc entry.
create_ce_mode(mode) in cache.h is similar but is intended to be
used for index entries (so it does not work for directories) and
returns the value in the network byte order.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-03-29 23:54:13 -08:00
Junio C Hamano
d416df8869 combine-diff: Record diff status a bit more faithfully
This shows "new file mode XXXX" and "deleted file mode XXXX"
lines like two-way diff-patch output does, by checking the
status from each parent.

The diff-raw output for combined diff is made a bit uglier by
showing diff status letters with each parent.  While most of the
case you would see "MM" in the output, an Evil Merge that
touches a path that was added by inheriting from one parent is
possible and it would be shown like these:

    $ git-diff-tree --abbrev -c HEAD
    2d7ca89675eb8888b0b88a91102f096d4471f09f
    ::000000 000000 100644 0000000... 0000000... 31dd686... AA	b
    ::000000 100644 100644 0000000... 6c884ae... c6d4fa8... AM	d
    ::100644 100644 100644 4f7cbe7... f8c295c... 19d5d80... RR	e

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-10 02:50:53 -08:00
Junio C Hamano
0a798076b8 combine-diff: move formatting logic to show_combined_diff()
This way, diff-files can make use of it.  Also implement the
full suite of what diff_flush_raw() supports just for
consistency.  With this, 'diff-tree -c -r --name-status' would
show what is expected.

There is no way to get the historical output (useful for
debugging and low-level Plumbing work) anymore, so tentatively
it makes '-m' to mean "do not combine and show individual diffs
with parents".

diff-files matches diff-tree to produce raw output for -c.  For
textual combined diff, use -p -c.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-09 15:23:06 -08:00
Linus Torvalds
ee63802422 diff-tree -c raw output
NOTE! This makes "-c" be the default, which effectively means that merges 
are never ignored any more, and "-m" is a no-op. So it changes semantics.

I would also like to make "--cc" the default if you do patches, but didn't 
actually do that.

The raw output format is not wonderfully pretty, but it's distinguishable 
from a "normal patch" in that a normal patch with just one parent has just 
one colon at the beginning, while a multi-parent raw diff has <n> colons 
for <n> parents.

So now, in the kernel, when you do

	git-diff-tree cce0cac125623f9b68f25dd1350f6d616220a8dd

(to see the manual ARM merge that had a conflict in arch/arm/Kconfig), you 
get

	cce0cac125623f9b68f25dd1350f6d616220a8dd
	::100644 100644 100644 4a63a8e2e45247a11c068c6ed66c6e7aba29ddd9 77eee38762d69d3de95ae45dd9278df9b8225e2c 2f61726d2f4b636f6e66696700dbf71a59dad287       arch/arm/Kconfig

ie you see two colons (two parents), then three modes (parent modes 
followed by result mode), then three sha1s (parent sha1s followed by
result sha1).

Which is pretty close to the normal raw diff output.

Cool/stupid exercise:

	$ git-whatchanged | grep '^::' | cut -f2- | sort |
	  uniq -c | sort -n | less -S

will show which files have needed the most file-level merge conflict
resolution. Useful? Probably not. But kind of interesting.

For the kernel, it's

     ....
     10 arch/ia64/Kconfig
     11 drivers/scsi/Kconfig
     12 drivers/net/Makefile
     17 include/linux/libata.h
     18 include/linux/pci_ids.h
     23 drivers/net/Kconfig
     24 drivers/scsi/libata-scsi.c
     28 drivers/scsi/libata-core.c
     43 MAINTAINERS

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-09 11:46:05 -08:00
Junio C Hamano
2454c962fb combine-diff: show mode changes as well.
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-06 13:06:49 -08:00
Junio C Hamano
e3c3a550d4 combine-diff: remove misguided --show-empty hack.
Now --always flag is available in diff-tree, there is no reason
to have that hack in the diffcore side.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-02-05 22:25:00 -08:00
Linus Torvalds
50f9a858ad Make the "struct tree_desc" operations available to others
We have operations to "extract" and "update" a "struct tree_desc", but we
only used them in tree-diff.c and they were static to that file.

But other tree traversal functions can use them to their advantage

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-31 16:07:01 -08:00
Junio C Hamano
addafaf92e Merge lt/revlist,jc/diff,jc/revparse,jc/abbrev 2006-01-28 00:16:09 -08:00
Junio C Hamano
46a6c2620b abbrev cleanup: use symbolic constants
The minimum length of abbreviated object name was hardcoded in
different places to be 4, risking inconsistencies in the future.
Also there were three different "default abbreviation
precision".  Use two C preprocessor symbols to clean up this
mess.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-28 00:09:38 -08:00
Junio C Hamano
ea726d02e9 diff-files: -c and --cc options.
This ports the "combined diff" to diff-files so that differences
to the working tree files since stage 2 and stage 3 are shown
the same way as combined diff output from diff-tree for the
merge commit would be shown if the current working tree files
are committed.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-28 00:08:29 -08:00
Junio C Hamano
d8f4790e6f diff-tree --cc: denser combined diff output for a merge commit.
Building on the previous '-c' (combined) option, '--cc' option
squelches the output further by omitting hunks that consist of
difference with solely one parent.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-28 00:08:28 -08:00
Junio C Hamano
af3feefa1d diff-tree -c: show a merge commit a bit more sensibly.
A new option '-c' to diff-tree changes the way a merge commit is
displayed when generating a patch output.  It shows a "combined
diff" (hence the option letter 'c'), which looks like this:

    $ git-diff-tree --pretty -c -p fec9ebf1 | head -n 18
    diff-tree fec9ebf... (from parents)
    Merge: 0620db3... 8a263ae...
    Author: Junio C Hamano <junkio@cox.net>
    Date:   Sun Jan 15 22:25:35 2006 -0800

	Merge fixes up to GIT 1.1.3

    diff --combined describe.c
    @@@ +98,7 @@@
	    return (a_date > b_date) ? -1 : (a_date == b_date) ? 0 : 1;
       }

    -  static void describe(char *arg)
     - static void describe(struct commit *cmit, int last_one)
    ++ static void describe(char *arg, int last_one)
       {
     +      unsigned char sha1[20];
     +      struct commit *cmit;

There are a few things to note about this feature:

 - The '-c' option implies '-p'.  It also implies '-m' halfway
   in the sense that "interesting" merges are shown, but not all
   merges.

 - When a blob matches one of the parents, we do not show a diff
   for that path at all.  For a merge commit, this option shows
   paths with real file-level merge (aka "interesting things").

 - As a concequence of the above, an "uninteresting" merge is
   not shown at all.  You can use '-m' in addition to '-c' to
   show the commit log for such a merge, but there will be no
   combined diff output.

 - Unlike "gitk", the output is monochrome.

A '-' character in the nth column means the line is from the nth
parent and does not appear in the merge result (i.e. removed
from that parent's version).

A '+' character in the nth column means the line appears in the
merge result, and the nth parent does not have that line
(i.e. added by the merge itself or inherited from another
parent).

The above example output shows that the function signature was
changed from either parents (hence two "-" lines and a "++"
line), and "unsigned char sha1[20]", prefixed by a " +", was
inherited from the first parent.

The code as sent to the list was buggy in few corner cases,
which I have fixed since then.

It does not bother to keep track of and show the line numbers
from parent commits, which it probably should.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-28 00:08:28 -08:00
Junio C Hamano
913419fcc6 diff --abbrev: document --abbrev=<n> form.
It was implemented there but was not advertised.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-19 18:32:44 -08:00
Junio C Hamano
47dd0d595d diff: --abbrev option
When I show transcripts to explain how something works, I often
find myself hand-editing the diff-raw output to shorten various
object names in the output.

This adds --abbrev option to the diff family, which shortens
diff-raw output and diff-tree commit id headers.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-12-19 18:32:44 -08:00
Junio C Hamano
9ce392f482 Move diff.renamelimit out of default configuration.
Otherwise we would end up linking all the unneeded stuff into git-daemon
only to link with git_default_config.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-21 23:00:50 -08:00
Junio C Hamano
80b1e511d7 diff: --full-index
A new option, --full-index, is introduced to diff family.  This
causes the full object name of pre- and post-images to appear on
the index line of patch formatted output, to be used in
conjunction with --allow-binary-replacement option of git-apply.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-11-16 16:20:40 -08:00
Chris Shoemaker
50b8e355b6 Documentation changes to recursive option for git-diff-tree
Update docs and usages regarding '-r' recursive option for git-diff-tree.
Remove '-r' from common diff options, mention it only for git-diff-tree.
Remove one extraneous use of '-r' with git-diff-files in get-merge.sh.
Sync the synopsis and usage string for git-diff-tree.

Signed-off-by: Chris Shoemaker <c.shoemaker at cox.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-28 13:37:38 -07:00
Linus Torvalds
ac1b3d1248 Split up tree diff functions into tree-diff.c library
This makes the tree diff functionality independent of the "git-diff-tree"
program, by splitting the core functionality up into a library file.

This will be needed for when we teach git-rev-list to only follow a
specified set of pathnames, rather than the global revision history.

Most of it is a fairly straightforward code move, but it also involves
some calling convention cleanup, and moving some of the static variables
from diff-tree.c into the options structure.

The actual tree change callback routines also become paramterized by the
diff_options structure, allowing the library functionality to do something
else than just show the diff on stdout.

Right now the only user of this functionality remains git-diff-tree
itself.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-22 22:49:51 -07:00
Junio C Hamano
0b34379a8d Fix diff-filter All-Or-None mark.
When we updated the marker for new files from 'N' to 'A', we forgot to
notice that the letter is already taken by the All-Or-None mark.
Change the All-Or-None marker to '*' to resolve this conflict.

	git-diff-tree -r --diff-filter='R*' -M

shows all the changes (not just renames) that are contained in commits
that have renames, in comparison with:

	git-diff-tree -r --diff-filter='R' -M

shows the same set of changes but the diff output are limited only to
renaming changes.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-10-04 17:44:17 -07:00
Junio C Hamano
946f5f7c24 Diff: --name-status output format.
The new output format shows only the status letter and paths.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-24 23:50:44 -07:00
Junio C Hamano
8082d8d305 Diff: -l<num> to limit rename/copy detection.
When many paths are modified, rename detection takes a lot of time.
The new option -l<num> can be used to disable rename detection when
more than <num> paths are possibly created as renames.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-24 23:50:44 -07:00
Junio C Hamano
6b5ee137e5 Diff clean-up.
This is a long overdue clean-up to the code for parsing and passing
diff options.  It also tightens some constness issues.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-24 23:50:43 -07:00
Junio C Hamano
5cfcd07c93 Retire diff-helper.
The textual diff generation with built-in '-p' in diff-* brothers has
proven to be useful enough that git-diff-helper outlived its usefulness.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-22 01:54:13 -07:00
Junio C Hamano
ca8c9156f8 diff-raw: Use 'A' instead of 'N' for added files.
This actually changes the diff-raw status letter from N to A
for added files.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-25 17:16:20 -07:00
Junio C Hamano
e7baa4f45f Use symbolic constants for diff-raw status indicators.
Both Cogito and StGIT prefer to see 'A' for new files.  The
current 'N' is visually harder to distinguish from 'M', which is
used for modified files.  Prepare the internals to use symbolic
constants to make the change easier.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-25 17:15:34 -07:00
Linus Torvalds
e68b6f1525 Split up "diff_format" into "format" and "line_termination".
This removes the separate "formats" for name and name-with-zero-
termination.

It also removes the difference between HUMAN and MACHINE formats, and
they both become DIFF_FORMAT_RAW, with the difference being just in the
line and inter-filename termination.

It also makes the code easier to understand.
2005-07-14 17:59:17 -07:00
Junio C Hamano
dda2d79af2 [PATCH] Clean up diff option descriptions.
I got tired of maintaining almost duplicated descriptions in
diff-* brothers, both in usage string and documentation.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 13:09:17 -07:00
Junio C Hamano
52f28529f4 [PATCH] git-diff-*: --name-only and --name-only-z.
Porcelain layers often want to find only names of changed files,
and even with diff-raw output format they end up having to pick
out only the filename.  Support --name-only (and --name-only-z
for xargs -0 and cpio -0 users that want to treat filenames with
embedded newlines sanely) flag to help them.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 12:55:07 -07:00
Junio C Hamano
f2ce9fde57 [PATCH] Add --diff-filter= output restriction to diff-* family.
This is a halfway between debugging aid and a helper to write an
ultra-smart merge scripts.  The new option takes a string that
consists of a list of "status" letters, and limits the diff
output to only those classes of changes, with two exceptions:

 - A broken pair (aka "complete rewrite"), does not match D
   (deleted) or N (created).  Use B to look for them.

 - The letter "A" in the diff-filter string does not match
   anything itself, but causes the entire diff that contains
   selected patches to be output (this behaviour is similar to
   that of --pickaxe-all for the -S option).

For example,

    $ git-rev-list HEAD |
      git-diff-tree --stdin -s -v -B -C --diff-filter=BCR

shows a list of commits that have complete rewrite, copy, or
rename.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-12 20:40:20 -07:00
Junio C Hamano
ce24067549 [PATCH] diff: Fix docs and add -O to diff-helper.
This patch updates diff documentation and usage strings:

 - clarify the semantics of -R.  It is not "output in reverse";
   rather, it is "I will feed diff backwards".  Semantically
   they are different when -C is involved.

 - describe -O in usage strings of diff-* brothers.  It was
   implemented, documented but not described in usage text.

Also it adds -O to diff-helper.  Like -S (and unlike -M/-C/-B),
this option can work on sanitized diff-raw output produced by
the diff-* brothers.  While we are at it, the call it makes to
diffcore is cleaned up to use the diffcore_std() like everybody
else, and the declaration for the low level diffcore routines
are moved from diff.h (public) to diffcore.h (private between
diff.c and diffcore backends).

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-03 11:23:03 -07:00
Junio C Hamano
67574c403f [PATCH] diff: mode bits fixes
The core GIT repository has trees that record regular file mode
in 0664 instead of normalized 0644 pattern.  Comparing such a
tree with another tree that records the same file in 0644
pattern without content changes with git-diff-tree causes it to
feed otherwise unmodified pairs to the diff_change() routine,
which triggers a sanity check routine and barfs.  This patch
fixes the problem, along with the fix to another caller that
uses unnormalized mode bits to call diff_change() routine in a
similar way.

Without this patch, you will see "fatal error" from diff-tree
when you run git-deltafy-script on the core GIT repository
itself.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-01 13:24:03 -07:00
Junio C Hamano
af5323e027 [PATCH] Add -O<orderfile> option to diff-* brothers.
A new diffcore filter diffcore-order is introduced.  This takes
a text file each of whose line is a shell glob pattern.  Patches
that match a glob pattern on an earlier line in the file are
output before patches that match a later line, and patches that
do not match any glob pattern are output last.

A typical orderfile for git project probably should look like
this:

    README
    Makefile
    Documentation
    *.h
    *.c

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 18:10:46 -07:00
Junio C Hamano
f345b0a066 [PATCH] Add -B flag to diff-* brothers.
A new diffcore transformation, diffcore-break.c, is introduced.

When the -B flag is given, a patch that represents a complete
rewrite is broken into a deletion followed by a creation.  This
makes it easier to review such a complete rewrite patch.

The -B flag takes the same syntax as the -M and -C flags to
specify the minimum amount of non-source material the resulting
file needs to have to be considered a complete rewrite, and
defaults to 99% if not specified.

As the new test t4008-diff-break-rewrite.sh demonstrates, if a
file is a complete rewrite, it is broken into a delete/create
pair, which can further be subjected to the usual rename
detection if -M or -C is used.  For example, if file0 gets
completely rewritten to make it as if it were rather based on
file1 which itself disappeared, the following happens:

    The original change looks like this:

	file0     --> file0' (quite different from file0)
	file1     --> /dev/null

    After diffcore-break runs, it would become this:

	file0     --> /dev/null
	/dev/null --> file0'
	file1     --> /dev/null

    Then diffcore-rename matches them up:

	file1     --> file0'

The internal score values are finer grained now.  Earlier
maximum of 10000 has been raised to 60000; there is no user
visible changes but there is no reason to waste available bits.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
befe86392c [PATCH] diff: consolidate various calls into diffcore.
The three diff-* brothers had a sequence of calls into diffcore
that were almost identical.  Introduce a new diffcore_std()
function that takes all the necessary arguments to consolidate
it.  This will make later enhancements and changing the order of
diffcore application simpler.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-30 10:35:49 -07:00
Junio C Hamano
f0c6b2a2fd [PATCH] Optimize diff-tree -[CM] --stdin
This attempts to optimize "diff-tree -[CM] --stdin", which
compares successible tree pairs.  This optimization does not
make much sense for other commands in the diff-* brothers.

When reading from --stdin and using rename/copy detection, the
patch makes diff-tree to read the current index file first.
This is done to reuse the optimization used by diff-cache in the
non-cached case.  Similarity estimator can avoid expanding a
blob if the index says what is in the work tree has an exact
copy of that blob already expanded.

Another optimization the patch makes is to check only file sizes
first to terminate similarity estimation early.  In order for
this to work, it needs a way to tell the size of the blob
without expanding it.  Since an obvious way of doing it, which
is to keep all the blobs previously used in the memory, is too
costly, it does so by keeping the filesize for each object it
has already seen in memory.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:44 -07:00
Junio C Hamano
367cec1c02 [PATCH] Add --pickaxe-all to diff-* brothers.
When --pickaxe-all is given in addition to -S, pickaxe shows the
entire diffs contained in the changeset, not just the diffs for
the filepair that touched the sought-after string.  This is
useful to see the changes in context.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:43 -07:00
Junio C Hamano
19feebc8c3 [PATCH] Clean up diff_setup() to make it more extensible.
This changes the argument of diff_setup() from an integer that
says if we are feeding reversed diff to a bitmask, so that later
global options can be added more easily.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-29 11:17:43 -07:00
Junio C Hamano
bceafe752c [PATCH] Fix diff-pruning logic which was running prune too early.
For later stages to reorder patches, pruning logic and rename detection
logic should not decide which delete to discard (because another entry
said it will take over the file as a rename) until the very end.

Also fix some tests that were assuming the earlier "last one is rename
or keep everything else is copy" semantics of diff-raw format, which no
longer is true.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 19:17:06 -07:00
Junio C Hamano
b6d8f309d9 [PATCH] diff-raw format update take #2.
This changes the diff-raw format again, following the mailing
list discussion.  The new format explicitly expresses which one
is a rename and which one is a copy.

The documentation and tests are updated to match this change.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 16:23:10 -07:00
Junio C Hamano
6b14d7faf0 [PATCH] Diffcore updates.
This moves the path selection logic from individual programs to a new
diffcore transformer (diff-tree still needs to have its own for
performance reasons).  Also the header printing code in diff-tree was
tweaked not to produce anything when pickaxe is in effect and there is
nothing interesting to report.  An interesting example is the following
in the GIT archive itself:

    $ git-whatchanged -p -C -S'or something in a real script'

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 10:17:50 -07:00
Junio C Hamano
81e50eabf0 [PATCH] The diff-raw format updates.
Update the diff-raw format as Linus and I discussed, except that
it does not use sequence of underscore '_' letters to express
nonexistence.  All '0' mode is used for that purpose instead.

The new diff-raw format can express rename/copy, and the earlier
restriction that -M and -C _must_ be used with the patch format
output is no longer necessary.  The patch makes -M and -C flags
independent of -p flag, so you need to say git-whatchanged -M -p
to get the diff/patch format.

Updated are both documentations and tests.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 22:49:19 -07:00
Junio C Hamano
38c6f78059 [PATCH] Prepare diffcore interface for diff-tree header supression.
This does not actually supress the extra headers when pickaxe is
used, but prepares enough support for diff-tree to implement it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 22:49:19 -07:00
Junio C Hamano
057c7d3018 [PATCH] Constness fix for pickaxe option.
Constness fix for pickaxe option.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-21 15:17:16 -07:00
Junio C Hamano
52e9578985 [PATCH] Introducing software archaeologist's tool "pickaxe".
This steals the "pickaxe" feature from JIT and make it available
to the bare Plumbing layer.  From the command line, the user
gives a string he is intersted in.

Using the diff-core infrastructure previously introduced, it
filters the differences to limit the output only to the diffs
between <src> and <dst> where the string appears only in one but
not in the other.  For example:

 $ ./git-rev-list HEAD | ./git-diff-tree -Sdiff-tree-helper --stdin -M

would show the diffs that touch the string "diff-tree-helper".

In real software-archaeologist application, you would typically
look for a few to several lines of code and see where that code
came from.

The "pickaxe" module runs after "rename/copy detection" module,
so it even crosses the file rename boundary, as the above
example demonstrates.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 09:58:03 -07:00
Junio C Hamano
57fe64a40d [PATCH] diff overhaul
This cleans up the way calls are made into the diff core from diff-tree
family and diff-helper.  Earlier, these programs had "if
(generating_patch)" sprinkled all over the place, but those ugliness are
gone and handled uniformly from the diff core, even when not generating
patch format.

This also allowed diff-cache and diff-files to acquire -R
(reverse) option to generate diff in reverse.  Users of
diff-tree can swap two trees easily so I did not add -R there.

[ Linus' note: I'll add -R to "diff-tree" too, since a "commit
  diff" doesn't have another tree to switch around: the other
  tree is always the parent(s) of the commit ]

Also -M<digits-as-mantissa> suggestion made by Linus has been
implemented.

Documentation updates are also included.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-19 22:33:07 -07:00
Junio C Hamano
5c97558c9a [PATCH] Detect renames in diff family.
This rips out the rename detection engine from diff-helper and moves it
to the diff core, and updates the internal calling convention used by
diff-tree family into the diff core.  In order to give the same option
name to diff-tree family as well as to diff-helper, I've changed the
earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the
natural abbreviation 'r' for 'rename' is already taken for 'recursive').

Although I did a fair amount of test with the git-diff-tree with
existing rename commits in the core GIT repository, this should still be
considered beta (preview) release.  This patch depends on the diff-delta
infrastructure just committed.

This implements almost everything I wanted to see in this series of
patch, except a few minor cleanups in the calling convention into diff
core, but that will be a separate cleanup patch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-19 08:59:40 -07:00
Junio C Hamano
915838c3cb [PATCH] Diff-helper update
This patch adds a framework and a stub implementation of rename
detection to diff-helper program.

The current stub code is just enough to detect pure renames in
diff-tree output and not fancier.  The plan is perhaps to use
the same delta code when Nico's delta storage patch is merged
for similarity evaluation purposes.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-18 11:16:24 -07:00
Junio C Hamano
99665af5c0 [PATCH 2/3] Rename git-diff-tree-helper to git-diff-helper.
It used to be that diff-tree needed helper support to parse its
raw output to generate diffs, but these days git-diff-* family
produces the same output and the helper is not tied to diff-tree
anymore.  Drop "tree" from its name.

This commit is done separately to record just the rename and no
file content changes. The changes in the renamed files are recorded
in the next commit.

Signed-off-by: Junio C Hamano <junkio@cox.net>

Bundled with the changes in the unrenamed files.

Signed-off-by: Petr Baudis <pasky@ucw.cz>
2005-05-15 02:05:03 +02:00
Junio C Hamano
b46f0b6dfd Optimize diff-cache -p --cached
This patch optimizes "diff-cache -p --cached" by avoiding to
inflate blobs into temporary files when the blob recorded in the
cache matches the corresponding file in the work tree.  The file
in the work tree is passed as the comparison source in such a
case instead.

This optimization kicks in only when we have already read the
cache this optimization and this is deliberate.  Especially,
diff-tree does not use this code, because changes are contained
in small number of files relative to the project size most of
the time, and reading cache is so expensive for a large project
that the cost of reading it outweighs the savings by not
inflating blobs.

Also this patch cleans up the structure passed from diff clients
by removing one unused structure member.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-04 01:45:24 -07:00
Junio C Hamano
77eb272046 [PATCH] Reworked external diff interface.
This introduces three public functions for diff-cache and friends can
use to call out to the GIT_EXTERNAL_DIFF program when they wish to.

A normal "add/remove/change" entry is turned into 7-parameter process
invocation of GIT_EXTERNAL_DIFF program as before.  In addition, the
program can now be called with a single parameter when diff-cache and
friends want to report an unmerged path. 

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-27 09:21:00 -07:00
Junio C Hamano
be3cfa85f4 [PATCH] Diff-tree-helper take two.
This reworks the diff-tree-helper and show-diff to further make external
diff command interface simpler.

These commands now honor GIT_EXTERNAL_DIFF environment variable which
can point at an arbitrary program that takes 7 parameters:

  name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode

The parameters for an external diff command are as follows:

  name        this invocation of the command is to emit diff
	      for the named cache/tree entry.

  file1       pathname that holds the contents of the first
	      file.  This can be a file inside the working
	      tree, or a temporary file created from the blob
	      object, or /dev/null.  The command should not
	      attempt to unlink it -- the temporary is
	      unlinked by the caller.

  file1-sha1  sha1 hash if file1 is a blob object, or "."
	      otherwise.

  file1-mode  mode bits for file1, or "." for a deleted file.

If GIT_EXTERNAL_DIFF environment variable is not set, the
default is to invoke diff with the set of parameters old
show-diff used to use.  This built-in implementation honors the
GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-26 09:25:05 -07:00
Junio C Hamano
86436c2828 [PATCH] Split external diff command interface to a separate file.
With this patch, the non-core'ish part of show-diff command that
invokes an external "diff" comand to obtain patches is split
into a separate file.  The next patch will introduce a new
command, diff-tree-helper, which uses this common diff interface
to format diff-tree and diff-cache output into a patch form.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-25 18:22:47 -07:00