Commit Graph

55557 Commits

Author SHA1 Message Date
Junio C Hamano
25d5ea410f [PATCH] Redo rename/copy detection logic.
Earlier implementation had a major screw-up in the memory
management area.  Rename/copy logic sometimes borrowed a pointer
to a structure without any provision for downstream to determine
which pointer is shared and which is not.  This resulted in the
later clean-up code to sometimes double free such structure,
resulting in a segfault.  This made -M and -C useless.

Another problem the earlier implementation had was that it
reordered the patches, and forced the logic to differentiate
renames and copies to depend on that particular order.  This
problem was fixed by teaching rename/copy detection logic not to
do any reordering, and rename-copy differentiator not to depend
on the order of the patches.  The diffs will leave rename/copy
detector in the same destination path order as the patch that
was fed into it.  Some test vectors have been reordered to
accommodate this change.

It also adds a sanity check logic to the human-readable diff-raw
output to detect paths with embedded TAB and LF characters,
which cannot be expressed with that format.  This idea came up
during a discussion with Chris Wedgwood.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-24 01:26:26 -07:00
Linus Torvalds
3e91311ae7 Add "cvs2git" program to convert a CVS archive into a GIT archive
It's very hacky, and it needs lots of work, but it seems to have converted
Peter's "syslinux" archive successfully. Whether the end result is correct
or not is to be seen.

Tons of work still to do: do name conversion properly, and do tags etc.

And testing. Lots of testing.
2005-05-24 01:07:31 -07:00
Linus Torvalds
1e3f6b6e64 git-apply: more consistency checks on gitdiff filenames
There's some duplication of filenames when doing filename operations
(creates, deletes, renames and copies), and this makes us verify that
the pathnames match when they should.
2005-05-23 19:54:55 -07:00
Nicolas Pitre
587e49405b [PATCH] adjust git-deltafy-script to the new diff-tree output format
Also prevent 'sort' from sorting on the sha1 which was screwing the
history listing.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 19:17:06 -07:00
Junio C Hamano
b1c006dd65 [PATCH] Update git-diff-cache documentation.
The recent diff updates gave diff-cache the same ability to
filter paths, which was not properly documented.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 19:17:06 -07:00
Junio C Hamano
bceafe752c [PATCH] Fix diff-pruning logic which was running prune too early.
For later stages to reorder patches, pruning logic and rename detection
logic should not decide which delete to discard (because another entry
said it will take over the file as a rename) until the very end.

Also fix some tests that were assuming the earlier "last one is rename
or keep everything else is copy" semantics of diff-raw format, which no
longer is true.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 19:17:06 -07:00
Linus Torvalds
9a4a100eb4 git-apply: start using the index file information.
Right now we only use it to figure out what the filename might
be when that is ambiguous, but we'll get there..
2005-05-23 19:13:55 -07:00
Linus Torvalds
4dfdbe10dc git-apply: if no input files specified, apply stdin
This makes it act more like a traditional UNIX thing (eg "cat").
2005-05-23 16:42:21 -07:00
Linus Torvalds
9ab55bd29a diff-tree: don't write headers if the diff queue is empty
This is not a pickaxe-specific thing, we do this regardless of
what has pruned down the diff queue.
2005-05-23 16:37:47 -07:00
Linus Torvalds
881b07654c git-apply: unknown modes are zero, not -1 2005-05-23 16:32:19 -07:00
Junio C Hamano
b6d8f309d9 [PATCH] diff-raw format update take #2.
This changes the diff-raw format again, following the mailing
list discussion.  The new format explicitly expresses which one
is a rename and which one is a copy.

The documentation and tests are updated to match this change.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 16:23:10 -07:00
Linus Torvalds
a4acb0eb14 git-apply: parse the diff headers (both traditional and new)
.. and print out the information. This sets up all the pathname
information, and whether it's a new file, deleted file, rename,
copy or whatever.

It's slowly getting to the point where it all comes together,
and we can actually apply all the information that we've gathered.
2005-05-23 16:09:09 -07:00
Linus Torvalds
46979f567b git-apply: improve error detection and messages
In particular, give line numbers when detecting corrupt patches.
This makes the tool a lot more friendly (indeed, much more so
than regular "patch", I think).
2005-05-23 14:38:49 -07:00
Linus Torvalds
5e224a2ed0 git-apply: bad patch fragments are fatal
Don't just stop at them and look for the next header. Die,
die, die!
2005-05-23 12:31:59 -07:00
Junio C Hamano
5831b563a4 [PATCH] NUL terminate diff-tree header lines under -z.
Thomas Glanzmann noticed that diff-tree -z HEAD piped to
diff-helper -z did not work.  Since diff-helper -z expects NUL
terminated lines, we should generate such.

The output side of the diff-helper should always be using '\n'
termination; earlier it used the same line_termination used for
the input side, which was a mistake.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 12:17:09 -07:00
Junio C Hamano
046aa6440f [PATCH] Performance fix for pickaxe.
The pickaxe was expanding the blobs and searching in them even
when it should have already known that both sides are the same.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 11:49:30 -07:00
Junio C Hamano
f7c1512af8 [PATCH] Rename/copy detection fix.
The rename/copy detection logic in earlier round was only good
enough to show patch output and discussion on the mailing list
about the diff-raw format updates revealed many problems with
it.  This patch fixes all the ones known to me, without making
things I want to do later impossible, mostly related to patch
reordering.

 (1) Earlier rename/copy detector determined which one is rename
     and which one is copy too early, which made it impossible
     to later introduce diffcore transformers to reorder
     patches.  This patch fixes it by moving that logic to the
     very end of the processing.

 (2) Earlier output routine diff_flush() was pruning all the
     "no-change" entries indiscriminatingly.  This was done due
     to my false assumption that one of the requirements in the
     diff-raw output was not to show such an entry (which
     resulted in my incorrect comment about "diff-helper never
     being able to be equivalent to built-in diff driver").  My
     special thanks go to Linus for correcting me about this.
     When we produce diff-raw output, for the downstream to be
     able to tell renames from copies, sometimes it _is_
     necessary to output "no-change" entries, and this patch
     adds diffcore_prune() function for doing it.

 (3) Earlier diff_filepair structure was trying to be not too
     specific about rename/copy operations, but the purpose of
     the structure was to record one or two paths, which _was_
     indeed about rename/copy.  This patch discards xfrm_msg
     field which was trying to be generic for this wrong reason,
     and introduces a couple of fields (rename_score and
     rename_rank) that are explicitly specific to rename/copy
     logic.  One thing to note is that the information in a
     single diff_filepair structure _still_ does not distinguish
     renames from copies, and it is deliberately so.  This is to
     allow patches to be reordered in later stages.

 (4) This patch also adds some tests about diff-raw format
     output and makes sure that necessary "no-change" entries
     appear on the output.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 11:49:30 -07:00
Junio C Hamano
60896c7bfe [PATCH] Be careful with symlinks when detecting renames and copies.
Earlier round was not treating symbolic links carefully enough,
and would have produced diff output that renamed/copied then
edited the contents of a symbolic link, which made no practical
sense.  Change it to detect only pure renames.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23 11:49:30 -07:00
Linus Torvalds
c1bb935020 Start implementing "git-apply"
This applies git patches (and old-style unified diffs)
in the index, rather than doing it in the working directory.

That allows for a lot more flexibility, and means that if a
patch fails, we aren't going to mess up the working directory.

NOTE! This is just the first cut at it, and right now it only
parses the incoming patch, it doesn't actually apply it yet.
2005-05-23 10:52:17 -07:00
Linus Torvalds
2cb45e9543 Don't care about st_dev in the index file
Thomas Glanzmann points out that it doesn't work well with different
clients accessing the repository over NFS - they have different views
on what the "device" for the filesystem is.

Of course, other filesystems may not even have stable inode numbers.
But we don't care. At least for now.
2005-05-22 15:08:15 -07:00
Linus Torvalds
09d74b3b5a Some more sparse warning fixes
Proper function declarations and NULL pointer usage.
2005-05-22 14:33:43 -07:00
Linus Torvalds
6f97a894d6 Fix up git-fsck-cache documentation
Notably, the "--unreachable" flag no longer depends on specified heads,
and we should document what happens if no heads are given.
2005-05-22 14:32:26 -07:00
Linus Torvalds
6b0c312106 Include file cleanups..
Add <limits.h> to the include files handled by "cache.h", and remove
extraneous #include directives from various .c files. The rule is that
"cache.h" gets all the basic stuff, so that we'll have as few system
dependencies as possible.
2005-05-22 11:54:17 -07:00
Thomas Glanzmann
ca67f00219 [PATCH] Makefile: Solaris fix: call $(MAKE) instead of make for subdirectories
Signed-off-by: Thomas Glanzmann <sithglan@stud.uni-erlangen.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 11:40:07 -07:00
David Greaves
2aef5bbae9 [PATCH] Docs - delta object
Added delta documentation

Signed-off-by: David Greaves <david@dgreaves.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 11:07:22 -07:00
David Greaves
7096a645cd [PATCH] Docs - tag object, git- prefix and s/changeset/commit/g
Add docs for tag type
Rename commands to have git- prefix
Rename changeset to commit throughout

Signed-off-by: David Greaves <david@dgreaves.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 11:07:22 -07:00
David Greaves
6c84e2e0c7 [PATCH] Docs - include README in git.txt
Include the README in the git.txt

Signed-off-by: David Greaves <david@dgreaves.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 11:07:22 -07:00
David Greaves
8ac866a869 [PATCH] Docs - asciidoc changes
Whitespace and asciidoc formatting changes only in preparation for
content changes.

Signed-off-by: David Greaves <david@dgreaves.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 11:07:22 -07:00
David Greaves
b2bf34d6c5 [PATCH] Docs - Makefile update
A Makefile that works just fine when the 6 character patch is applied
to asciidoc

Signed-off-by: David Greaves <david@dgreaves.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 11:07:22 -07:00
Linus Torvalds
7ef76925d9 Split up git-pull-script into separate "fetch" and "merge" phases.
This allows you to just fetch stuff first, inspect it, and then
resolve the merge separately if everything looks good.
2005-05-22 11:03:24 -07:00
Junio C Hamano
6b14d7faf0 [PATCH] Diffcore updates.
This moves the path selection logic from individual programs to a new
diffcore transformer (diff-tree still needs to have its own for
performance reasons).  Also the header printing code in diff-tree was
tweaked not to produce anything when pickaxe is in effect and there is
nothing interesting to report.  An interesting example is the following
in the GIT archive itself:

    $ git-whatchanged -p -C -S'or something in a real script'

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 10:17:50 -07:00
Junio C Hamano
26dee0adfc [PATCH] Add the code to set default minimum score back in.
When the minimum score is specified as 0 (meaning "use default
value"), set it to the default as we are told.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 09:46:06 -07:00
Junio C Hamano
cd1870edb6 [PATCH] Fix tweak in similarity estimator.
There was a screwy math bug in the estimator that confused what
-C1 meant and what -C9 meant, only in one of the early "cheap"
check, which resulted in quite confusing behaviour.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22 09:38:26 -07:00
Junio C Hamano
81e50eabf0 [PATCH] The diff-raw format updates.
Update the diff-raw format as Linus and I discussed, except that
it does not use sequence of underscore '_' letters to express
nonexistence.  All '0' mode is used for that purpose instead.

The new diff-raw format can express rename/copy, and the earlier
restriction that -M and -C _must_ be used with the patch format
output is no longer necessary.  The patch makes -M and -C flags
independent of -p flag, so you need to say git-whatchanged -M -p
to get the diff/patch format.

Updated are both documentations and tests.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 22:49:19 -07:00
Junio C Hamano
38c6f78059 [PATCH] Prepare diffcore interface for diff-tree header supression.
This does not actually supress the extra headers when pickaxe is
used, but prepares enough support for diff-tree to implement it.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 22:49:19 -07:00
Junio C Hamano
58b103f55d [PATCH] Tweak diffcore-rename heuristics.
The heuristics so far was to compare file size change and xdelta
size against the average of file size before and after the
change.  This patch uses the smaller of pre- and post- change
file size instead.

It also makes a very small performance fix.  I didn't measure
it; I do not expect it to make any practical difference, but
while scanning an already sorted list, breaking out in the
middle is the right thing.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 16:22:57 -07:00
Linus Torvalds
d6db01075b diff-tree: don't print multiple headers for merges when silent.
Normally we show every facet of a merge, but when we're silent,
there's little point.
2005-05-21 15:42:53 -07:00
Junio C Hamano
057c7d3018 [PATCH] Constness fix for pickaxe option.
Constness fix for pickaxe option.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-21 15:17:16 -07:00
Linus Torvalds
3258c902e7 diff-tree: prettify output slightly
Make the commit explanation buffer larger, and make sure that if
we truncate it, we put a "..." marker there to visually tell people
about the truncation (tested with a much smaller buffer to make
sure it looks sane).

Also make sure that the explanation is properly line-terminated,
and add an extra newline iff we have a diff.
2005-05-21 11:04:19 -07:00
Linus Torvalds
da196b603e t/t4003-diff-rename-1: use modern options to "diff"
Don't do "-u0", use "--unified=0" which is accepted by modern GNU
diff versions.
2005-05-21 10:11:37 -07:00
Linus Torvalds
cebf03c4cd "make clean" should also clean up documentation
(Or, if somebody disagrees, we should have a "make distclean").
2005-05-21 09:59:37 -07:00
Junio C Hamano
c3e7fbcbd0 [PATCH] Diff overhaul, adding the other half of copy detection.
This patch extends diff-cache and diff-files to report the
unmodified files to diff-core as well when -C (copy detection)
is in effect, so that the unmodified files can also be used as
the source candidates.  The existing test t4003 has been
extended to cover this case.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 09:58:03 -07:00
Junio C Hamano
52e9578985 [PATCH] Introducing software archaeologist's tool "pickaxe".
This steals the "pickaxe" feature from JIT and make it available
to the bare Plumbing layer.  From the command line, the user
gives a string he is intersted in.

Using the diff-core infrastructure previously introduced, it
filters the differences to limit the output only to the diffs
between <src> and <dst> where the string appears only in one but
not in the other.  For example:

 $ ./git-rev-list HEAD | ./git-diff-tree -Sdiff-tree-helper --stdin -M

would show the diffs that touch the string "diff-tree-helper".

In real software-archaeologist application, you would typically
look for a few to several lines of code and see where that code
came from.

The "pickaxe" module runs after "rename/copy detection" module,
so it even crosses the file rename boundary, as the above
example demonstrates.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 09:58:03 -07:00
Junio C Hamano
427dcb4bca [PATCH] Diff overhaul, adding half of copy detection.
This introduces the diff-core, the layer between the diff-tree
family and the external diff interface engine.  The calls to the
interface diff-tree family uses (diff_change and diff_addremove)
have not changed and will not change.  The purpose of the
diff-core layer is to provide an infrastructure to transform the
set of differences sent from the applications, before sending
them to the external diff interface.

The recently introduced rename detection code has been rewritten
to use the diff-core facility.  When applications send in
separate creates and deletes, matching ones are transformed into
a single rename-and-edit diff, and sent out to the external diff
interface as such.

This patch also enhances the rename detection code further to be
able to detect copies.  Currently this happens only as long as
copy sources appear as part of the modified files, but there
already is enough provision for callers to report unmodified
files to diff-core, so that they can be also used as copy source
candidates.  Extending the callers this way will be done in a
separate patch.

Please see and marvel at how well this works by trying out the
newly added t/t4003-diff-rename-1.sh test script.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21 09:58:03 -07:00
Linus Torvalds
c8265ac096 git-whatchanged: allow other pagers
(but still try to use '-S' if using less)
2005-05-21 09:44:16 -07:00
Paul Mackerras
887fe3c474 Read tags from .git/refs/tags/* and mark commits with tags
with a label.
Allow SHA1 ids or tags to be entered in the SHA1 ID field.
2005-05-21 07:35:37 +00:00
Daniel Barkalow
ca3ebdf5b2 [PATCH] Fix use of wc in t0000-basic
The version of wc I have (GNU textutils-2.1) puts spaces at the beginning
of lines. This patch should work for any version of wc.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Acked-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 18:03:47 -07:00
Nicolas Pitre
e002a16ba5 [PATCH] delta creation
This adds the ability to actually create delta objects using a new tool:
git-mkdelta.  It uses an ordered list of potential objects to deltafy
against earlier objects in the list.  A cap on the depth of delta
references can be provided as well, otherwise the default is to not have
any limit.  A limit of 0 will also undeltafy any given object.

Also provided is the beginning of a script to deltafy an entire
repository.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:41:45 -07:00
Nicolas Pitre
d1af002dc6 [PATCH] delta check
This adds knowledge of delta objects to fsck-cache and various object
parsing code.  A new switch to git-fsck-cache is provided to display the
maximum delta depth found in a repository.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:41:45 -07:00
Nicolas Pitre
91d7b8afc2 [PATCH] delta read
This makes the core code aware of delta objects and undeltafy them as
needed.  The convention is to use read_sha1_file() to have
undeltafication done automatically (most users do that already so this
is transparent).

If the delta object itself has to be accessed then it must be done
through map_sha1_file() and unpack_sha1_file().

In that context mktag.c has been switched to read_sha1_file() as there
is no reason to do the full map+unpack manually.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20 15:41:45 -07:00