Enhance git-ls-tree to allow optional 'match paths' that
restrict the output of git-ls-tree. This is useful to retrieve
a single file's SHA1 out of a tree without creating an index.
[JC: I added the test case]
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is a demonstration of GIT_EXTERNAL_DIFF mechanism, and a
testbed for tweaking and enhancing what the built-in diff should
do. This script is designed to output exactly the same output
as what the built-in diff driver produces when used as the
GIT_EXTERNAL_DIFF command.
I've run this and updated built-in diff on the entire history of
linux-2.6 git repository, and JG's udev.git repository which has
interesting symlink cases to make sure it is equivalent to the
built-in diff driver.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With the introduction of type 'T' in the diff-raw output, and
the "apply-patch" program Linus has been quietly working on
without much advertisement, it started to make sense to emit
usable information in the "diff --git" patch output format as
well. Earlier built-in diff driver punted and did not say
anything about a symbolic link changing into a file or vice
versa, but this version represents it as a pair of deletion
and creation.
It also fixes a minor problem in dealing with old archives created
with ancient git. The earlier code reported a file mode change
between 100664 and 100644, which it should not. The linux-2.6
git tree has a good example that exposes this problem. A good
test case is commit ce1dc02f76432a46db149241e015a4f782974623.
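
The mode comparison described above can be sketched roughly as follows.
This is only an illustration, assuming regular-file modes are collapsed
to 100644 or 100755 by looking at the owner-execute bit before they are
compared; canon_regular_mode() and mode_changed() are hypothetical names
and not necessarily what the diff code uses.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: collapse a regular-file mode to 0100644 or
 * 0100755, depending on the owner-execute bit.
 */
static unsigned int canon_regular_mode(unsigned int mode)
{
	return (mode & 0100) ? 0100755u : 0100644u;
}

/* Report a mode change only when the canonical modes differ. */
static bool mode_changed(unsigned int old_mode, unsigned int new_mode)
{
	return canon_regular_mode(old_mode) != canon_regular_mode(new_mode);
}
```

With this, a pair such as 100664 and 100644 compares equal, while 100644
and 100755 is still reported as a change.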
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make it a clear two-phase thing: first a read-only parse of
the patch itself (which is independent of any current index
information), and then the second phase actually uses the patch.
The second phase might not be a real apply, it could be just a
diffstat, for example. Which is trivial to do once the patch is
parsed.
This is the remainder of the test case fixes by Mark Allen to make
them work on his Darwin box. I was using "xargs -r" (GNU) where it
was not needed, sed -ne '/^\(author\|committer\)/s|>.*|>|p'
where some sed implementations do not know what to do with '\|',
and also "cmp - file" to compare standard input with a file, which
his cmp does not support.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The "end" commit is just faking it right now: it sorts things
purely by date, so this is _not_ a reachability analysis. Some day.
The "--header" flag causes the commit message to be printed out,
with a NUL character separator after it for parseability. This
allows you to do things like use "grep -z" to grep for certain
authors etc.
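
For a consumer written in C rather than "grep -z", walking such
NUL-separated output might look like the sketch below. It assumes the
whole output has been read into a buffer and that every record,
including the last one, is terminated by a NUL;
count_matching_records() is a hypothetical name used only for
illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count the NUL-terminated records in buf[0..len) that contain needle.
 * Assumes buf[len - 1] is a NUL, i.e. the last record is terminated.
 */
static int count_matching_records(const char *buf, size_t len,
				  const char *needle)
{
	int count = 0;
	size_t pos = 0;

	while (pos < len) {
		const char *rec = buf + pos;

		if (strstr(rec, needle))
			count++;
		pos += strlen(rec) + 1;	/* skip the record and its NUL */
	}
	return count;
}
```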
object.
A fair number of the users potentially want to look at the
commit objects more closely, and if you worry about memory
leaking in certain applications, you can always do a
	free(commit->buffer);
	commit->buffer = NULL;
by hand after parsing them.
This fixes another bug.
- Mode-only changes were pruned incorrectly from the output.
- Added test to catch the above problem.
- Normalize rename/copy similarity score in the diff-raw output
to percent, no matter what scale we internally use.
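
The normalization in the last item amounts to a simple rescale. A
minimal sketch, assuming an internal scale of 0..10000;
MAX_SCORE_SKETCH and score_to_percent() are made-up names, and the real
internal scale may differ:

```c
/* Assumed internal scale; the actual constant may differ. */
#define MAX_SCORE_SKETCH 10000

/* Convert an internal similarity score to a 0..100 percentage. */
static int score_to_percent(int score)
{
	return (int)((long)score * 100 / MAX_SCORE_SKETCH);
}
```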
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The interim single-liner '?' fix resulted in delete entries that
should not have been emitted coming out in the output as an
unintended side effect; I caught this with the "rename" test in
the test suite. This patch instead fixes the code that assigns
the status code to each filepair.
I verified this does not break the testcase in udev.git tree Kay
Sievers gave us, by running git-diff-tree on that tree which
showed 21 file-to-symlink changes.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The earlier test was relying on the fact that dotfiles do not
appear in the output to prepare expected test results, which
inevitably got broken when we started handling dotfiles. Change
the test to be honest about what "--other" file it creates.
The problem was originally pointed out by Mark Allen.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This also drops the common ".git" part from the end of the repo
name, and if a non-default head reference is given, makes a nicer
commit message about it.
This adds a "-t" flag to tell the raw diff output to include the tree
objects in the output when doing a recursive diff.
Since that's how the non-recursive output already handles trees and the
flag thus doesn't make sense without "-r", I made "-t" imply "-r".
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oh, I am an idiot. Repeating the same check against the first
element of pathspec array as many times as the pathspec array
has elements in it would not do us any good.
This patch allows you to specify more than one pathspec to
diff-tree family and have them actually used.
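
The shape of the mistake, and of the fix, can be illustrated with the
sketch below. It assumes a pathspec matches when it is a plain leading
prefix of the path, which may be simpler than what diff-tree really
does; path_matches_any() is a hypothetical name.

```c
#include <string.h>

/*
 * Return 1 if path starts with any of the nr prefixes in pathspec.
 * The bug described above amounts to testing pathspec[0] on every
 * iteration instead of pathspec[i].
 */
static int path_matches_any(const char *path, const char **pathspec, int nr)
{
	int i;

	for (i = 0; i < nr; i++) {
		const char *spec = pathspec[i];

		if (!strncmp(path, spec, strlen(spec)))
			return 1;
	}
	return 0;
}
```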
Signed-off-by: Junio C Hamano <junkio@cox.net>
;)
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This attempts to match "the directory '.git' anywhere in the
tree is ignored" approach taken in update-cache.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The second round similarity estimator simply used the size of
the xdelta itself to estimate the extent of damage. This patch
keeps that logic to detect big insertions to terminate the check
early, but otherwise looks at the generated delta in order to
estimate the extent of edit more accurately.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Not important, but I am a bit annoyed by gcc complaining about the
control falling out of the function without returning a value.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is what Linus wrote, improving what David Greaves
originally submitted.
I just added a test case and verified the patch works.
Author: David Greaves <david@dgreaves.com>
Author: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Instead of checking silent flag all over the place, simply use
the NO_OUTPUT option diffcore provides to suppress the diff
output.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We still refuse to add ".", ".." and ".git".
In theory, you could track another git-repository by allowing ".git",
but the potential for confusion is just too high.
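
A check along these lines could look like the sketch below. It assumes
the refusal applies to every '/'-separated component of the path, not
just the first one; bad_component() and path_ok() are hypothetical
names.

```c
#include <string.h>

/* Return 1 if the component s[0..len) is ".", ".." or ".git". */
static int bad_component(const char *s, size_t len)
{
	return (len == 1 && !memcmp(s, ".", 1)) ||
	       (len == 2 && !memcmp(s, "..", 2)) ||
	       (len == 4 && !memcmp(s, ".git", 4));
}

/* Return 1 if no '/'-separated component of path is refused. */
static int path_ok(const char *path)
{
	while (*path) {
		size_t len = strcspn(path, "/");

		/* An empty component (e.g. "a//b") is refused as well. */
		if (len == 0 || bad_component(path, len))
			return 0;
		path += len;
		if (*path == '/')
			path++;
	}
	return 1;
}
```

Other dotfiles, such as ".gitignore" or "src/.hidden", pass this check.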
We need to quote backslash and backtick too.
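
A rough sketch of such quoting, assuming the text ends up in an
unquoted shell here-document, where backslash, backtick and '$' are the
characters the shell still interprets. heredoc_escape() is a
hypothetical name, and including '$' in the set is an assumption made
for this illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, prefixing each backslash, backtick and '$' with
 * a backslash.  Returns 0 on success, or -1 if dst (dst_size bytes)
 * is too small to hold the result and its terminating NUL.
 */
static int heredoc_escape(const char *src, char *dst, size_t dst_size)
{
	size_t out = 0;

	for (; *src; src++) {
		int special = strchr("\\`$", *src) != NULL;

		if (out + (special ? 2 : 1) >= dst_size)
			return -1;
		if (special)
			dst[out++] = '\\';
		dst[out++] = *src;
	}
	if (out >= dst_size)
		return -1;
	dst[out] = '\0';
	return 0;
}
```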
And inform the user about our progress, since converting a
big archive can take time. Doing the full mutt history took
just under eight minutes.
This should also mean that the conversion is now completely
defined by the CVS tree, and that two people doing a cvs2git
conversion on the same base will always get the same results
regardless of when or in what timezone they do it.
This escapes '$' characters in <<-handling, and gives preference to
the new branch when cvsps incorrectly reports a commit as originating
on an old branch.
.. and tell 'co' to shut up about the rcs noise.
This still leaves some branch issues up in the air: it looks like
cvsps has some questionable originating branch information, but I
don't know whether that's a cvsps bug or an actual bug in the
syslinux archive I'm using to test.
I'll let David Mansfield answer my questions about CVS. I'm a
total idiot when it comes to branches under CVS ("I'm pure!").
The earlier implementation had a major screw-up in the memory
management area. Rename/copy logic sometimes borrowed a pointer
to a structure without any provision for downstream to determine
which pointer is shared and which is not. This caused the later
clean-up code to sometimes free such a structure twice, resulting
in a segfault. This made -M and -C useless.
Another problem the earlier implementation had was that it
reordered the patches, and forced the logic to differentiate
renames and copies to depend on that particular order. This
problem was fixed by teaching rename/copy detection logic not to
do any reordering, and rename-copy differentiator not to depend
on the order of the patches. The diffs now leave the rename/copy
detector in the same destination path order as the patch that
was fed into it. Some test vectors have been reordered to
accommodate this change.
It also adds sanity-check logic to the human-readable diff-raw
output to detect paths with embedded TAB and LF characters,
which cannot be expressed with that format. This idea came up
during a discussion with Chris Wedgwood.
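
The check itself is small. A minimal sketch, with has_unsafe_char() as
a hypothetical name:

```c
#include <string.h>

/*
 * Return 1 if name contains a TAB or LF, either of which would be
 * ambiguous in a TAB-separated, LF-terminated record.
 */
static int has_unsafe_char(const char *name)
{
	return strpbrk(name, "\t\n") != NULL;
}
```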
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It's very hacky, and it needs lots of work, but it seems to have converted
Peter's "syslinux" archive successfully. Whether the end result is correct
or not remains to be seen.
Tons of work still to do: do name conversion properly, and do tags etc.
And testing. Lots of testing.
There's some duplication of filenames when doing filename operations
(creates, deletes, renames and copies), and this makes us verify that
the pathnames match when they should.
Also prevent 'sort' from sorting on the sha1 which was screwing the
history listing.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The recent diff updates gave diff-cache the same ability to
filter paths, which was not properly documented.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For later stages to reorder patches, pruning logic and rename detection
logic should not decide which delete to discard (because another entry
said it will take over the file as a rename) until the very end.
Also fix some tests that were assuming the earlier "last one is rename
or keep, everything else is copy" semantics of the diff-raw format, which
no longer hold.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This changes the diff-raw format again, following the mailing
list discussion. The new format explicitly expresses which one
is a rename and which one is a copy.
The documentation and tests are updated to match this change.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
.. and print out the information. This sets up all the pathname
information, and whether it's a new file, deleted file, rename,
copy or whatever.
It's slowly getting to the point where it all comes together,
and we can actually apply all the information that we've gathered.
In particular, give line numbers when detecting corrupt patches.
This makes the tool a lot more friendly (indeed, much more so
than regular "patch", I think).