git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	d7afe648dc	Merge branch 'jc/refactor-diff-stdin' Due to the way "git diff --no-index" is bolted onto by touching the low level code that is shared with the rest of the "git diff" code, even though it has to work in a very different way, any comparison that involves a file "-" at the root level incorrectly tried to read from the standard input. This cleans up the no-index codepath further to remove code that reads from the standard input from the core side, which is never necessary when git is running its usual diff operation. * jc/refactor-diff-stdin: diff-index.c: "git diff" has no need to read blob from the standard input diff-index.c: unify handling of command line paths diff-index.c: do not pretend paths are pathspecs	2012-07-13 15:38:05 -07:00
Junio C Hamano	4682d8521c	diff-index.c: "git diff" has no need to read blob from the standard input Only "diff --no-index -" does. Bolting the logic into the low-level function diff_populate_filespec() was a layering violation from day one. Move populate_from_stdin() function out of the generic diff.c to its only user, diff-index.c. Also make sure "-" from the command line stays a special token "read from the standard input", even if we later decide to sanitize the result from prefix_filename() function in a few obvious ways, e.g. removing unnecessary "./" prefix, duplicated slashes "//" in the middle, etc. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-06-28 16:18:19 -07:00
Junio C Hamano	25e5e2bf85	combine-diff: support format_callback This teaches combine-diff machinery to feed a combined merge to a callback function when DIFF_FORMAT_CALLBACK is specified. So far, format callback functions are not used for anything but 2-way diffs. A callback is given a diff_queue_struct, which is an array of diff_filepair. As its name suggests, a diff_filepair is a _pair_ of diff_filespec that represents a single preimage and a single postimage. Since "diff -c" is to compare N parents with a single merge result and filter out any paths whose result match one (or more) of the parent(s), its output has to be able to represent N preimages and 1 postimage. For this reason, a callback function that inspects a diff_filepair that results from this new infrastructure can and is expected to view the preimage side (i.e. pair->one) as an array of diff_filespec. Each element in the array, except for the last one, is marked with "has_more_entries" bit, so that the same callback function can be used for 2-way diffs and combined diffs. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-08-20 23:03:06 -07:00
Junio C Hamano	382f013bc4	diff: pass the entire diff-options to diffcore_pickaxe() That would make it easier to give enhanced feature to the pickaxe transformation. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-31 14:30:28 -07:00
Junio C Hamano	44c48a909a	diff --follow: do call diffcore_std() as necessary Usually, diff frontends populate the output queue with filepairs without any rename information and call diffcore_std() to sort the renames out. When --follow is in effect, however, diff-tree family of frontend has a hack that looks like this: diff-tree frontend -> diff_tree_sha1() . populate diff_queued_diff . if --follow is in effect and there is only one change that creates the target path, then -> try_to_follow_renames() -> diff_tree_sha1() with no pathspec but with -C -> diffcore_std() to find renames . if rename is found, tweak diff_queued_diff and put a single filepair that records the found rename there -> diffcore_std() . tweak elements on diff_queued_diff by - rename detection - path ordering - pickaxe filtering We need to skip parts of the second call to diffcore_std() that is related to rename detection, and do so only when try_to_follow_renames() did find a rename. Earlier `1da6175` (Make diffcore_std only can run once before a diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it unconditionally disabled any second call to diffcore_std(). This hopefully fixes the breakage. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-13 12:17:45 -07:00
Jonathan Nieder	987460611a	Standardize do { ... } while (0) style Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-12 15:44:51 -07:00
Matthieu Moy	cf958afd83	Document -B<n>[/<m>], -M<n> and -C<n> variants of -B, -M and -C These options take an optional argument, but this optional argument was not documented. Original patch by Matthieu Moy, but documentation for -B mostly copied from the explanations of Junio C Hamano. While we're there, fix a typo in a comment in diffcore.h. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-09 09:16:11 -07:00
Bo Yang	1da6175d43	Make diffcore_std only can run once before a diff_flush When file renames/copies detection is turned on, the second diffcore_std will degrade a 'C' pair to a 'R' pair. And this may happen when we run 'git log --follow' with hard copies finding. That is, the try_to_follow_renames() will run diffcore_std to find the copies, and then 'git log' will issue another diffcore_std, which will reduce 'src->rename_used' and recognize this copy as a rename. This is not what we want. So, I think we really don't need to run diffcore_std more than one time. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-05-07 09:34:28 -07:00
Bo Yang	9ca5df9061	Add a macro DIFF_QUEUE_CLEAR. Refactor the diff_queue_struct code, this macro help to reset the structure. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-05-07 09:34:27 -07:00
Jens Lehmann	c7e1a73641	git diff --submodule: Show detailed dirty status of submodules When encountering a dirty submodule while doing "git diff --submodule" print an extra line for new untracked content and another for modified but already tracked content. And if the HEAD of the submodule is equal to the ref diffed against in the superproject, drop the output which would just show the same SHA1s and no commit message headlines. To achieve that, the dirty_submodule bitfield is expanded to two bits. The output of "git status" inside the submodule is parsed to set the according bits. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-03-04 22:16:33 -08:00
Jens Lehmann	e3d42c4773	Performance optimization for detection of modified submodules In the worst case is_submodule_modified() got called three times for each submodule. The information we got from scanning the whole submodule tree the first time can be reused instead. New parameters have been added to diff_change() and diff_addremove(), the information is stored in a new member of struct diff_filespec. Its value is then reused instead of calling is_submodule_modified() again. When no explicit "-dirty" is needed in the output the call to is_submodule_modified() is not necessary when the submodules HEAD already disagrees with the ref of the superproject, as this alone marks it as modified. To achieve that, get_stat_data() got an extra argument. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-18 17:28:21 -08:00
Junio C Hamano	84cdd3c635	Merge branch 'maint' * maint: Add reference for status letters in documentation. Document that git-log takes --all-match. Update draft 1.6.0.4 release notes	2008-11-02 16:36:40 -08:00
Yann Dirson	a5a323f33c	Add reference for status letters in documentation. Also fix error in diff_filepair::status documentation, and point to the in-code reference as well as the doc. Signed-off-by: Yann Dirson <ydirson@altern.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:57:10 -08:00
Jeff King	122aa6f9c0	diff: introduce diff.<driver>.binary The "diff" gitattribute is somewhat overloaded right now. It can say one of three things: 1. this file is definitely binary, or definitely not (i.e., diff or !diff) 2. this file should use an external diff engine (i.e., diff=foo, diff.foo.command = custom-script) 3. this file should use particular funcname patterns (i.e., diff=foo, diff.foo.(x?)funcname = some-regex) Most of the time, there is no conflict between these uses, since using one implies that the other is irrelevant (e.g., an external diff engine will decide for itself whether the file is binary). However, there is at least one conflicting situation: there is no way to say "use the regular rules to determine whether this file is binary, but if we do diff it textually, use this funcname pattern." That is, currently setting diff=foo indicates that the file is definitely text. This patch introduces a "binary" config option for a diff driver, so that one can explicitly set diff.foo.binary. We default this value to "don't know". That is, setting a diff attribute to "foo" and using "diff.foo.funcname" will have no effect on the binaryness of a file. To get the current behavior, one can set diff.foo.binary to true. This patch also has one additional advantage: it cleans up the interface to the userdiff code a bit. Before, calling code had to know more about whether attributes were false, true, or unset to determine binaryness. Now that binaryness is a property of a driver, we can represent these situations just by passing back a driver struct. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2008-10-18 08:02:26 -07:00
Yann Dirson	264e0b9a3c	Bust the ghost of long-defunct diffcore-pathspec. This concept was retired by `77882f6` (Retire diffcore-pathspec., 2006-04-10), more than 2 years ago. Signed-off-by: Yann Dirson <ydirson@altern.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-19 19:48:30 -07:00
Linus Torvalds	644797119d	copy vs rename detection: avoid unnecessary O(n*m) loops The core rename detection had some rather stupid code to check if a pathname was used by a later modification or rename, which basically walked the whole pathname space for all renames for each rename, in order to tell whether it was a pure rename (no remaining users) or should be considered a copy (other users of the source file remaining). That's really silly, since we can just keep a count of users around, and replace all those complex and expensive loops with just testing that simple counter (but this all depends on the previous commit that shared the diff_filespec data structure by using a separate reference count). Note that the reference count is not the same as the rename count: they behave otherwise rather similarly, but the reference count is tied to the allocation (and decremented at de-allocation, so that when it turns zero we can get rid of the memory), while the rename count is tied to the renames and is decremented when we find a rename (so that when it turns zero we know that it was a rename, not a copy). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:06 -07:00
Linus Torvalds	9fb88419ba	Ref-count the filespecs used by diffcore Rather than copy the filespecs when introducing new versions of them (for rename or copy detection), use a refcount and increment the count when reusing the diff_filespec. This avoids unnecessary allocations, but the real reason behind this is a future enhancement: we will want to track shared data across the copy/rename detection. In order to efficiently notice when a filespec is used by a rename, the rename machinery wants to keep track of a rename usage count which is shared across all different users of the filespec. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:05 -07:00
Junio C Hamano	8ae92e6389	rename diff_free_filespec_data_large() to diff_free_filespec_blob() Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-02 21:02:09 -07:00
Jeff King	eede7b7d11	diffcore-rename: cache file deltas We find rename candidates by computing a fingerprint hash of each file, and then comparing those fingerprints. There are inherently O(n^2) comparisons, so it pays in CPU time to hoist the (rather expensive) computation of the fingerprint out of that loop (or to cache it once we have computed it once). Previously, we didn't keep the filespec information around because then we had the potential to consume a great deal of memory. However, instead of keeping all of the filespec data, we can instead just keep the fingerprint. This patch implements and uses diff_free_filespec_data_large to accomplish that goal. We also have to change estimate_similarity not to needlessly repopulate the filespec data when we already have the hash. Practical tests showed 4.5x speedup for a 10% memory usage increase. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-02 21:02:03 -07:00
Junio C Hamano	e0e324a4dc	Fix configuration syntax to specify customized hunk header patterns. This updates the hunk header customization syntax. The special case 'funcname' attribute is gone. You assign the name of the type of contents to path's "diff" attribute as a string value in .gitattributes like this: .java diff=java .perl diff=perl *.doc diff=doc If you supply "diff.<name>.funcname" variable via the configuration mechanism (e.g. in $HOME/.gitconfig), the value is used as the regexp set to find the line to use for the hunk header (the variable is called "funcname" because such a line typically is the one that has the name of the function in programming language source text). If there is no such configuration, built-in default is used, if any. Currently there are two default patterns: default and java. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-07 01:49:58 -07:00
Junio C Hamano	f258475a6e	Per-path attribute based hunk header selection. This makes"diff -p" hunk headers customizable via gitattributes mechanism. It is based on Johannes's earlier patch that allowed to define a single regexp to be used for everything. The mechanism to arrive at the regexp that is used to define hunk header is the same as other use of gitattributes. You assign an attribute, funcname (because "diff -p" typically uses the name of the function the patch is about as the hunk header), a simple string value. This can be one of the names of built-in pattern (currently, "java" is defined) or a custom pattern name, to be looked up from the configuration file. (in .gitattributes) .java funcname=java .perl funcname=perl (in .git/config) [funcname] java = ... # ugly and complicated regexp to override the built-in one. perl = ... # another ugly and complicated regexp to define a new one. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-06 01:20:47 -07:00
Junio C Hamano	29a3eefde1	Introduce diff_filespec_is_binary() This replaces an explicit initialization of filespec->is_binary field used for rename/break followed by direct access to that field with a wrapper function that lazily iniaitlizes and accesses the field. We would add more attribute accesses for the use of diff routines, and it would be better to make this abstraction earlier. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-06 00:21:41 -07:00
Junio C Hamano	706098af6b	diffcore_filespec: add is_binary diffcore-break and diffcore-rename would want to behave slightly differently depending on the binary-ness of the data, so add one bit to the filespec, as the structure is now passed down to diffcore_count_changes() function. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-06-30 20:51:31 -07:00
Junio C Hamano	d8c3d03a0b	diffcore_count_changes: pass diffcore_filespec We may want to use richer information on the data we are dealing with in this function, so instead of passing a buffer address and length, just pass the diffcore_filespec structure. Existing callers always call this function with parameters taken from a filespec anyway, so there is no functionality changes. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-06-30 20:51:31 -07:00
Junio C Hamano	e0173ad9fc	Make macros to prevent double-inclusion in headers consistent. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-04-29 02:05:11 -07:00
Junio C Hamano	e9c8409900	diff-index --cached --raw: show tree entry on the LHS for unmerged entries. This updates the way diffcore represents an unmerged pair somewhat. It used to be that entries with mode=0 on both sides were used to represent an unmerged pair, but now it has an explicit flag. This is to allow diff-index --cached to report the entry from the tree when the path is unmerged in the index. This is used in updating "git reset <tree> -- <path>" to restore absense of the path in the index from the tree. Signed-off-by: Junio C Hamano <junkio@cox.net>	2007-01-06 22:57:42 -08:00
Junio C Hamano	ef677686ef	diff.c: do not use pathname comparison to tell renames The final output from diff used to compare pathnames between preimage and postimage to tell if the filepair is a rename/copy. By explicitly marking the filepair created by diffcore_rename(), the output routine, resolve_rename_copy(), does not have to do so anymore. This helps feeding a filepair that has different pathnames in one and two elements to the diff machinery (most notably, comparing two blobs). Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-08-03 14:41:53 -07:00
Junio C Hamano	c06c79667c	diffcore-rename: somewhat optimized. This changes diffcore-rename to reuse statistics information gathered during similarity estimation, and updates the hashtable implementation used to keep track of the statistics to be denser. This seems to give better performance. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-03-12 03:22:10 -08:00
Junio C Hamano	4d0f39cecf	diffcore-break: similarity estimator fix. This is a companion patch to the previous fix to diffcore-rename. The merging-back process should use a logic similar to what is used there. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-03-04 13:26:36 -08:00
Junio C Hamano	65416758cd	diffcore-rename: split out the delta counting code. This is to rework diffcore break/rename/copy detection code so that it does not affected when deltifier code gets improved. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-02-28 20:20:04 -08:00
Junio C Hamano	ee3d299e93	diffcore-break/diffcore-rename: integer overflow. While reviewing the end user tutorial rewrite by J. Bruce Fields, I noticed that "git-diff-tree -B -C" did not correctly break the total rewrite of Documentation/tutorial.txt. It turns out that we had integer overflow during the break score computations. Cop out by using floating point. This is not a kernel. Signed-off-by: Junio C Hamano <junkio@cox.net>	2006-01-15 21:08:42 -08:00
Junio C Hamano	8082d8d305	Diff: -l<num> to limit rename/copy detection. When many paths are modified, rename detection takes a lot of time. The new option -l<num> can be used to disable rename detection when more than <num> paths are possibly created as renames. Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-09-24 23:50:44 -07:00
Junio C Hamano	19397b4521	Revert "[PATCH] plug memory leak in diff.c::diff_free_filepair()" This reverts `068eac91ce` commit.	2005-09-14 14:06:50 -07:00
Yasushi SHOJI	068eac91ce	[PATCH] plug memory leak in diff.c::diff_free_filepair() When I run git-diff-tree on big change, it seems the command eats so much memory. so I just put git under valgrind to see what's going on. diff_free_filespec_data() doesn't free diff_filespec itself. [jc: I ended up doing things slightly differently from Yasushi's patch. The original idea was to use free_filespec_data() only to free the data portion and keep useing the filespec itself, but no existing code seems to do things that way, so I just yanked that part out.] Signed-off-by: Yasushi SHOJI <yashi@atmark-techno.com> Signed-off-by: Junio C Hamano <junkio@cox.net>	2005-08-13 18:28:55 -07:00
Junio C Hamano	dc7090efbc	[PATCH] Re-Fix SIGSEGV on unmerged files in git-diff-files -p When an unmerged path was fed via diff_unmerged() into diffcore, it eventually called run_diff() with "one" and "two" parameters with NULL, but run_diff() was not written carefully enough to notice this situation. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-12 20:29:31 -07:00
Linus Torvalds	f9e7750621	Fix SIGSEGV on unmerged files in git-diff-files -p NULL is not considered a VALID pathspec.	2005-06-08 11:31:53 -07:00
Junio C Hamano	f78c79c5d4	[PATCH] diffcore-break.c: various fixes. This fixes three bugs in the -B heuristics. - Although it was advertised that the initial break criteria used was the same as what diffcore-rename uses, it was using something different. Instead of using smaller of src and dst size to compare with "edit" size, (insertion and deletion), it was using larger of src and dst, unlike the rename/copy detection logic. This caused the parameter to -B to mean something different from the one to -M and -C. To compensate for this change, the default break score is also changed to match that of the default for rename/copy. - The code would have crashed with division by zero when trying to break an originally empty file. - Contrary to what the comment said, the algorithm was breaking small files, only to later merge them together. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-05 14:14:58 -07:00
Junio C Hamano	eeaa460314	[PATCH] diff: Update -B heuristics. As Linus pointed out on the mailing list discussion, -B should break a files that has many inserts even if it still keeps enough of the original contents, so that the broken pieces can later be matched with other files by -M or -C. However, if such a broken pair does not get picked up by -M or -C, we would want to apply different criteria; namely, regardless of the amount of new material in the result, the determination of "rewrite" should be done by looking at the amount of original material still left in the result. If you still have the original 97 lines from a 100-line document, it does not matter if you add your own 13 lines to make a 110-line document, or if you add 903 lines to make a 1000-line document. It is not a rewrite but an in-place edit. On the other hand, if you did lose 97 lines from the original, it does not matter if you added 27 lines to make a 30-line document or if you added 997 lines to make a 1000-line document. You did a complete rewrite in either case. This patch introduces a post-processing phase that runs after diffcore-rename matches up broken pairs diffcore-break creates. The purpose of this post-processing is to pick up these broken pieces and merge them back into in-place modifications. For this, the score parameter -B option takes is changed into a pair of numbers, and it takes "-B99/80" format when fully spelled out. The first number is the minimum amount of "edit" (same definition as what diffcore-rename uses, which is "sum of deletion and insertion") that a modification needs to have to be broken, and the second number is the minimum amount of "delete" a surviving broken pair must have to avoid being merged back together. It can be abbreviated to "-B" to use default for both, "-B9" or "-B9/" to use 90% for "edit" but default (80%) for merge avoidance, or "-B/75" to use default (99%) "edit" and 75% for merge avoidance. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-03 11:23:03 -07:00
Junio C Hamano	ce24067549	[PATCH] diff: Fix docs and add -O to diff-helper. This patch updates diff documentation and usage strings: - clarify the semantics of -R. It is not "output in reverse"; rather, it is "I will feed diff backwards". Semantically they are different when -C is involved. - describe -O in usage strings of diff-* brothers. It was implemented, documented but not described in usage text. Also it adds -O to diff-helper. Like -S (and unlike -M/-C/-B), this option can work on sanitized diff-raw output produced by the diff-* brothers. While we are at it, the call it makes to diffcore is cleaned up to use the diffcore_std() like everybody else, and the declaration for the low level diffcore routines are moved from diff.h (public) to diffcore.h (private between diff.c and diffcore backends). Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-03 11:23:03 -07:00
Junio C Hamano	355e76a4a3	[PATCH] Tweak count-delta interface Make it return copied source and insertion separately, so that later implementation of heuristics can use them more flexibly. This does not change the heuristics implemented in diffcore-rename nor diffcore-break in any way. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-03 11:23:03 -07:00
Junio C Hamano	67574c403f	[PATCH] diff: mode bits fixes The core GIT repository has trees that record regular file mode in 0664 instead of normalized 0644 pattern. Comparing such a tree with another tree that records the same file in 0644 pattern without content changes with git-diff-tree causes it to feed otherwise unmodified pairs to the diff_change() routine, which triggers a sanity check routine and barfs. This patch fixes the problem, along with the fix to another caller that uses unnormalized mode bits to call diff_change() routine in a similar way. Without this patch, you will see "fatal error" from diff-tree when you run git-deltafy-script on the core GIT repository itself. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-01 13:24:03 -07:00
Junio C Hamano	f345b0a066	[PATCH] Add -B flag to diff-* brothers. A new diffcore transformation, diffcore-break.c, is introduced. When the -B flag is given, a patch that represents a complete rewrite is broken into a deletion followed by a creation. This makes it easier to review such a complete rewrite patch. The -B flag takes the same syntax as the -M and -C flags to specify the minimum amount of non-source material the resulting file needs to have to be considered a complete rewrite, and defaults to 99% if not specified. As the new test t4008-diff-break-rewrite.sh demonstrates, if a file is a complete rewrite, it is broken into a delete/create pair, which can further be subjected to the usual rename detection if -M or -C is used. For example, if file0 gets completely rewritten to make it as if it were rather based on file1 which itself disappeared, the following happens: The original change looks like this: file0 --> file0' (quite different from file0) file1 --> /dev/null After diffcore-break runs, it would become this: file0 --> /dev/null /dev/null --> file0' file1 --> /dev/null Then diffcore-rename matches them up: file1 --> file0' The internal score values are finer grained now. Earlier maximum of 10000 has been raised to 60000; there is no user visible changes but there is no reason to waste available bits. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-30 10:35:49 -07:00
Junio C Hamano	01c4e70f63	[PATCH] diff: code clean-up and removal of rename hack. A new macro, DIFF_PAIR_RENAME(), is introduced to distinguish a filepair that is a rename/copy (the definition of which is src and dst are different paths, of course). This removes the hack used in the record_rename_pair() to always put a non-zero value in the score field. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-30 10:35:49 -07:00
Junio C Hamano	f0c6b2a2fd	[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-29 11:17:44 -07:00
Junio C Hamano	15d061b435	[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-29 11:17:43 -07:00
Junio C Hamano	be020332a1	[PATCH] Remove a function not used anymore. Earlier rename/copy detection left unmodified filepair in the output and forced downstream to keep them even when they are filtering, and the diff_needs_to_stay() function was used for the logic. It is not used anymore, so remove it. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-29 11:17:43 -07:00
Junio C Hamano	226406f693	[PATCH] Introduce diff_free_filepair() funcion. This introduces a new function to free a common data structure, and plugs some leaks. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-29 11:17:43 -07:00
Junio C Hamano	4130b99571	[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-26 15:18:55 -07:00
Junio C Hamano	96716a1976	[PATCH] Fix type-change handling when assigning the status code to filepairs. The interim single-liner '?' fix resulted delete entries that should not have emitted coming out in the output as an unintended side effect; I caught this with the "rename" test in the test suite. This patch instead fixes the code that assigns the status code to each filepair. I verified this does not break the testcase in udev.git tree Kay Sievers gave us, by running git-diff-tree on that tree which showed 21 file to symlink changes. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-25 15:21:57 -07:00
Junio C Hamano	25d5ea410f	[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-24 01:26:26 -07:00

1 2

58 Commits