git-commit-vandalism

Author	SHA1	Message	Date
Christian Couder	737c74ee42	Bisect: refactor some logging into "bisect_write". Also use "die" instead of "echo >&2 something ; exit 1". And simplify "bisect_replay". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:27:24 -07:00
Christian Couder	55624f9af4	Bisect: refactor "bisect_write_*" functions. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:27:24 -07:00
Christian Couder	97e1c51e15	Bisect: implement "bisect skip" to mark untestable revisions. When there are some "skip"ped revisions, we add the '--bisect-all' option to "git rev-list --bisect-vars". Then we filter out the "skip"ped revisions from the result of the rev-list command, and we modify the "bisect_rev" var accordingly. We don't always use "--bisect-all" because it is slower than "--bisect-vars" or "--bisect". When we cannot find for sure the first bad commit because of "skip"ped commits, we print the hash of each possible first bad commit and then we exit with code 2. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-26 23:27:23 -07:00
Christian Couder	8fe26f4481	Bisect: fix some white spaces and empty lines breakages. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-26 23:27:23 -07:00
Christian Couder	3ac9f612cb	rev-list documentation: add "--bisect-all". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-26 23:27:23 -07:00
Christian Couder	50e62a8e70	rev-list: implement --bisect-all This is Junio's patch with some stuff to make --bisect-all compatible with --bisect-vars. This option makes it possible to see all the potential bisection points. The best ones are displayed first. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-26 23:27:23 -07:00
Junio C Hamano	85b0045505	Merge branch 'ja/shorthelp' * ja/shorthelp: help: remove extra blank line after "See 'git --help'" message On error, do not list all commands, but point to --help option	2007-10-26 23:26:49 -07:00
Junio C Hamano	a238917ba4	help: remove extra blank line after "See 'git --help'" message The double LF were there only because we gave a list of common commands. With the list gone, there is no reason to have the extra blank line. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:26:41 -07:00
Linus Torvalds	42899ac898	Do the fuzzy rename detection limits with the exact renames removed When we do the fuzzy rename detection, we don't care about the destinations that we already handled with the exact rename detector. And, in fact, the code already knew that - but the rename limiter, which used to run before exact renames were detected, did not. This fixes it so that the rename detection limiter now bases its decisions on the remaining rename counts, rather than the original ones. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:06 -07:00
Linus Torvalds	81ac051d6a	Fix ugly magic special case in exact rename detection For historical reasons, the exact rename detection had populated the filespecs for the entries it compared, and the rest of the similarity analysis depended on that. I hadn't even bothered to debug why that was the case when I re-did the rename detection, I just made the new one have the same broken behaviour, with a note about this special case. This fixes that fixme. The reason the exact rename detector needed to fill in the file sizes of the files it checked was that the _inexact_ rename detector was broken, and started comparing file sizes before it filled them in. Fixing that allows the exact phase to do the sane thing of never even caring (since all it cares about is really just the SHA1 itself, not the size nor the contents). It turns out that this also indirectly fixes a bug: trying to populate all the filespecs will run out of virtual memory if there is tons and tons of possible rename options. The fuzzy similarity analysis does the right thing in this regard, and free's the blob info after it has generated the hash tables, so the special case code caused more trouble than just some extra illogical code. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:06 -07:00
Linus Torvalds	17559a643e	Do exact rename detection regardless of rename limits Now that the exact rename detection is linear-time (with a very small constant factor to boot), there is no longer any reason to limit it by the number of files involved. In some trivial testing, I created a repository with a directory that had a hundred thousand files in it (all with different contents), and then moved that directory to show the effects of renaming 100,000 files. With the new code, that resulted in [torvalds@woody big-rename]$ time ~/git/git show -C | wc -l 400006 real 0m2.071s user 0m1.520s sys 0m0.576s ie the code can correctly detect the hundred thousand renames in about 2 seconds (the number "400006" comes from four lines for each rename: diff --git a/really-big-dir/file-1-1-1-1-1 b/moved-big-dir/file-1-1-1-1-1 similarity index 100% rename from really-big-dir/file-1-1-1-1-1 rename to moved-big-dir/file-1-1-1-1-1 and the extra six lines is from a one-liner commit message and all the commit information and spacing). Most of those two seconds weren't even really the rename detection, it's really all the other stuff needed to get there. With the old code, this wouldn't have been practically possible. Doing a pairwise check of the ten billion possible pairs would have been prohibitively expensive. In fact, even with the rename limiter in place, the old code would waste a lot of time just on the diff_filespec checks, and despite not even trying to find renames, it used to look like: [torvalds@woody big-rename]$ time git show -C | wc -l 1400006 real 0m12.337s user 0m12.285s sys 0m0.192s ie we used to take 12 seconds for this load and not even do any rename detection! (The number 1400006 comes from fourteen lines per file moved: seven lines each for the delete and the create of a one-liner file, and the same extra six lines of commit information). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:06 -07:00
Linus Torvalds	9027f53cb5	Do linear-time/space rename logic for exact renames This implements a smarter rename detector for exact renames, which rather than doing a pairwise comparison (time O(m*n)) will just hash the files into a hash-table (size O(n+m)), and only do pairwise comparisons to renames that have the same hash (time O(n+m) except for unrealistic hash collisions, which we just cull aggressively). Admittedly the exact rename case is not nearly as interesting as the generic case, but it's an important case none-the-less. A similar general approach should work for the generic case too, but even then you do need to handle the exact renames/copies separately (to avoid the inevitable added cost factor that comes from the _size_ of the file), so this is worth doing. In the expectation that we will indeed do the same hashing trick for the general rename case, this code uses a generic hash-table implementation that can be used for other things too. In fact, we might be able to consolidate some of our existing hash tables with the new generic code in hash.[ch]. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:06 -07:00
Linus Torvalds	644797119d	copy vs rename detection: avoid unnecessary O(n*m) loops The core rename detection had some rather stupid code to check if a pathname was used by a later modification or rename, which basically walked the whole pathname space for all renames for each rename, in order to tell whether it was a pure rename (no remaining users) or should be considered a copy (other users of the source file remaining). That's really silly, since we can just keep a count of users around, and replace all those complex and expensive loops with just testing that simple counter (but this all depends on the previous commit that shared the diff_filespec data structure by using a separate reference count). Note that the reference count is not the same as the rename count: they behave otherwise rather similarly, but the reference count is tied to the allocation (and decremented at de-allocation, so that when it turns zero we can get rid of the memory), while the rename count is tied to the renames and is decremented when we find a rename (so that when it turns zero we know that it was a rename, not a copy). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:06 -07:00
Linus Torvalds	9fb88419ba	Ref-count the filespecs used by diffcore Rather than copy the filespecs when introducing new versions of them (for rename or copy detection), use a refcount and increment the count when reusing the diff_filespec. This avoids unnecessary allocations, but the real reason behind this is a future enhancement: we will want to track shared data across the copy/rename detection. In order to efficiently notice when a filespec is used by a rename, the rename machinery wants to keep track of a rename usage count which is shared across all different users of the filespec. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:05 -07:00
Linus Torvalds	cb1491b6bf	Split out "exact content match" phase of rename detection This makes the exact content match a separate function of its own. Partly to cut down a bit on the size of the diffcore_rename() function (which is too complex as it is), and partly because there are smarter ways to do this than an O(m*n) loop over it all, and that function should be rewritten to take that into account. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:05 -07:00
Linus Torvalds	505f297989	Add 'diffcore.h' to LIB_H The diffcore.h header file is included by more than just the internal diff generation files, and needs to be part of the proper dependencies. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:05 -07:00
Junio C Hamano	d633f702a0	Merge branch 'maint' * maint: Fix generation of perl/perl.mak git-remote: fix "Use of uninitialized value in string ne"	2007-10-26 23:17:23 -07:00
Christian Couder	15387e32ff	Test suite: reset TERM to its previous value after testing. Using konsole, I get no colored output at the end of "t7005-editor.sh" without this patch. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:17:19 -07:00
Junio C Hamano	dc2715554e	Merge branch 'ph/color-test' * ph/color-test: Support a --quiet option in the test-suite. Add some fancy colors in the test library when terminal supports it.	2007-10-26 23:17:14 -07:00
Jim Meyering	4a21d13db4	hooks-pre-commit: use \t, rather than a literal TAB in regexp Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:16:51 -07:00
Alex Riesen	d1a2057560	Fix generation of perl/perl.mak The code generating perl/Makefile from Makefile.PL was causing trouble because it didn't consider NO_PERL_MAKEMAKER and ran makemaker unconditionally, rewriting perl.mak. Makemaker is FUBAR in ActiveState Perl, and perl/Makefile has a replacement for it. Besides, a changed Git.pm is NOT a reason to rebuild all the perl scripts, so remove the dependency too. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 16:44:45 -07:00
Pierre Habouzit	c2e6b6d0d1	fast-import.c: fix regression due to strbuf conversion Without this strbuf_detach(), it yields a double free later, the command is in fact stashed, and this is not a memory leak. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 15:28:09 -07:00
Shawn O. Pearce	ab0d33c438	git-gui: Protect against bad translation strings If a translation string uses a format character we don't have an argument for then it may throw an error when we attempt to format the translation. In this case switch back to the default format that comes with the program (aka the English translation). Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-26 03:08:37 -04:00
Pierre Habouzit	1ece127467	Support a --quiet option in the test-suite. This shuts down the "* ok ##: `test description`" messages. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-24 22:44:14 -07:00
Pierre Habouzit	55db1df0c8	Add some fancy colors in the test library when terminal supports it. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-24 22:44:14 -07:00
David Symonds	d3cd249565	gitweb: Use chop_and_escape_str in more places. Signed-off-by: David Symonds <dsymonds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-24 22:07:05 -07:00
David Symonds	ce58ec9158	gitweb: Refactor abbreviation-with-title-attribute code. Signed-off-by: David Symonds <dsymonds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-24 22:06:57 -07:00
Junio C Hamano	d90a7fda35	Merge branch 'db/fetch-pack' * db/fetch-pack: (60 commits) Define compat version of mkdtemp for systems lacking it Avoid scary errors about tagged trees/blobs during git-fetch fetch: if not fetching from default remote, ignore default merge Support 'push --dry-run' for http transport Support 'push --dry-run' for rsync transport Fix 'push --all branch...' error handling Fix compilation when NO_CURL is defined Added a test for fetching remote tags when there is not tags. Fix a crash in ls-remote when refspec expands into nothing Remove duplicate ref matches in fetch Restore default verbosity for http fetches. fetch/push: readd rsync support Introduce remove_dir_recursively() bundle transport: fix an alloc_ref() call Allow abbreviations in the first refspec to be merged Prevent send-pack from segfaulting when a branch doesn't match Cleanup unnecessary break in remote.c Cleanup style nit of 'x == NULL' in remote.c Fix memory leaks when disconnecting transport instances Ensure builtin-fetch honors {fetch,transfer}.unpackLimit ...	2007-10-24 21:59:50 -07:00
Miklos Vajna	2db9b49c6c	git-send-email: add a new sendemail.to configuration variable Some projects prefer to receive patches via a given email address. In these cases, it's handy to configure that address once. Signed-off-by: Miklos Vajna <vmiklos@frugalware.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-24 20:13:07 -07:00
Junio C Hamano	59b2023fbb	git-remote: fix "Use of uninitialized value in string ne" martin f krafft <madduck@madduck.net> writes: > piper:~> git remote show origin > * remote origin > URL: ssh://git.madduck.net/~/git/etc/mailplate.git > Use of uninitialized value in string ne at /usr/local/stow/git/bin/git-remote line 248. This is because there might not be branch.<name>.remote defined but the code unconditionally dereferences $branch->{$name}{'REMOTE'} and compares with another string. Tested-by: Martin F Krafft <madduck@madduck.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-24 18:47:50 -07:00
Paul Mackerras	74a40c7110	gitk: Fix a couple more bugs in the path limiting First, paths ending in a slash were not matching anything. This fixes path_filter to handle paths ending in a slash (such entries have to match a directory, and can't match a file, e.g., foo/bar/ can't match a plain file called foo/bar). Secondly, clicking in the file list pane (bottom right) was broken because $treediffs($ids) contained all the files modified by the commit, not just those within the file list. This fixes that too. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-24 10:16:56 +10:00
Shawn O. Pearce	f4e9996b77	Merge branch 'maint' * maint: git-gui: Make sure we get errors from git-update-index Conflicts: lib/index.tcl	2007-10-23 18:50:19 -04:00
Shawn O. Pearce	d4e890e5de	git-gui: Make sure we get errors from git-update-index I'm seeing a lot of silent failures from git-update-index on Windows and this is leaving the index.lock file intact, which means users are later unable to perform additional operations. When the index is locked behind our back and we are unable to use it we may need to allow the user to delete the index lock and try again. However our UI state is probably not correct as we have assumed that some changes were applied but none of them actually did. A rescan is the easiest (in code anyway) solution to correct our UI to show what the index really has (or doesn't have). Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-23 18:49:27 -04:00
Junio C Hamano	8d863c98b2	k.org git toppage: Add link to 1.5.3 release notes. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-23 12:10:55 -07:00
Paul Mackerras	3de07118f0	Merge branch 'master' into dev	2007-10-23 22:40:50 +10:00
Paul Mackerras	bd8f677e1c	gitk: Fix some bugs with path limiting in the diff display First, we weren't putting "--" between the ids and the paths in the git diff-tree/diff-index/diff-files command, so if there was a tag and a file with the same name, we could get an ambiguity in the command. This puts the "--" in to make it clear that the paths are paths. Secondly, this implements the path limiting for merge diffs as well as the normal 2-way diffs. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-23 22:37:23 +10:00
Paul Mackerras	4570b7e9d7	gitk: Use the status window for other functions This sets the status window when reading commits, searching through commits, cherry-picking or checking out a head. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-23 21:19:06 +10:00
Paul Mackerras	a137a90f49	gitk: Integrate the reset progress bar in the main frame This makes the reset function use a progress bar in the same location as the progress bars for reading in commits and for finding commits, instead of a progress bar in a separate detached window. The progress bar for resetting is red. This also puts "Resetting" in the status window while the reset is in progress. The setting of the status window is done through an extension of the interface used for setting the watch cursor. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-23 21:12:49 +10:00
Alex Riesen	dec2b4aaa8	More updates and corrections to the russian translation of git-gui In particular many screw-ups after po regeneration were fixed. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-23 00:28:35 -04:00
Paul Mackerras	94503918e4	gitk: Ensure tabstop setting gets restored by Cancel button We weren't restoring the tabstop setting if the user pressed the Cancel button in the Edit/Preferences window. Also improved the label for the checkbox (made it "Tab spacing" rather than the laconic "tabstop") and moved it above the "Display nearby tags" checkbox. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-23 10:33:38 +10:00
Paul Mackerras	7a39a17a87	gitk: Limit diff display to listed paths by default When the user has specified a list of paths, either on the command line or when creating a view, gitk currently displays the diffs for all files that a commit has modified, not just the ones that match the path list. This is different from other git commands such as git log. This change makes gitk behave the same as these other git commands by default, that is, gitk only displays the diffs for files that match the path list. There is now a checkbox labelled "Limit diffs to listed paths" in the Edit/Preferences pane. If that is unchecked, gitk will display the diffs for all files as before. When gitk is run with the --merge flag, it will get the list of unmerged files at startup, intersect that with the paths listed on the command line (if any), and use that as the list of paths. Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-10-23 10:15:11 +10:00
Jari Aalto	b5d21a4b68	On error, do not list all commands, but point to --help option - Remove out call to list_common_cmds_help() - Send error message to stderr, not stdout. Signed-off-by: Jari Aalto <jari.aalto@cante.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-22 01:57:50 -04:00
David Symonds	e076a0e71f	gitweb: Provide title attributes for abbreviated author names. Signed-off-by: David Symonds <dsymonds@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-22 01:54:12 -04:00
Ralf Wildenhues	dd8175f83c	git-cherry-pick: improve description of -x. Reword the first sentence of the description of -x, in order to make it easier to read and understand. Signed-off-by: Ralf Wildenhues <Ralf.Wildenhues@gmx.de> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-22 01:38:19 -04:00
Kirill	c43ff43601	Updated Russian translation. The most important changes are: - Git version cannot be determined... (lost in `57364320bf`) - git-gui: fatal error Some changes need the second opinion (search for TOVERIFY), some changes are just copies (search for "carbon copy"). Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-22 00:03:42 -04:00
René Scharfe	c32f749fec	Correct some sizeof(size_t) != sizeof(unsigned long) typing errors Fix size_t vs. unsigned long pointer mismatch warnings introduced with the addition of strbuf_detach(). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-22 00:00:40 -04:00
Shawn O. Pearce	5be507fc95	Use PRIuMAX instead of 'unsigned long long' in show-index Elsewhere in Git we already use PRIuMAX and cast to uintmax_t when we need to display a value that is 'very big' and we're not exactly sure what the largest display size is for this platform. This particular fix is needed so we can do the incredibly crazy temporary hack of: diff --git a/cache.h b/cache.h index e0abcd6..6637fd8 100644 --- a/cache.h +++ b/cache.h @@ -6,6 +6,7 @@ #include SHA1_HEADER #include <zlib.h> +#define long long long #if ZLIB_VERNUM < 0x1200 #define deflateBound(c,s) ((s) + (((s) + 7) >> 3) + (((s) + 63) >> 6) + 11) allowing us to more easily look for locations where we are passing a pointer to an 8 byte value to a function that expects a 4 byte value. This can occur on some platforms where sizeof(long) == 8 and sizeof(size_t) == 4. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-21 02:16:57 -04:00
Shawn O. Pearce	8a37e21dab	Merge branch 'maint' * maint: Describe more 1.5.3.5 fixes in release notes Fix diffcore-break total breakage Fix directory scanner to correctly ignore files without d_type Improve receive-pack error message about funny ref creation fast-import: Fix argument order to die in file_change_m git-gui: Don't display CR within console windows git-gui: Handle progress bars from newer gits git-gui: Correctly report failures from git-write-tree gitk.txt: Fix markup. send-pack: respect '+' on wildcard refspecs git-gui: accept versions containing text annotations, like 1.5.3.mingw.1 git-gui: Don't crash when starting gitk from a browser session git-gui: Allow gitk to be started on Cygwin with native Tcl/Tk git-gui: Ensure .git/info/exclude is honored in Cygwin workdirs git-gui: Handle starting on mapped shares under Cygwin git-gui: Display message box when we cannot find git in $PATH git-gui: Avoid using bold text in entire gui for some fonts	2007-10-21 02:11:45 -04:00
Shawn O. Pearce	2ee52eb17c	Describe more 1.5.3.5 fixes in release notes Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-21 02:04:02 -04:00
Linus Torvalds	6dd4b66fde	Fix diffcore-break total breakage Ok, so on the kernel list, some people noticed that "git log --follow" doesn't work too well with some files in the x86 merge, because a lot of files got renamed in very special ways. In particular, there was a pattern of doing single commits with renames that looked basically like - rename "filename.h" -> "filename_64.h" - create new "filename.c" that includes "filename_32.h" or "filename_64.h" depending on whether we're 32-bit or 64-bit. which was preparatory for smushing the two trees together. Now, there's two issues here: - "filename.c" remained. Yes, it was a rename, but there was a new file created with the old name in the same commit. This was important, because we wanted each commit to compile properly, so that it was bisectable, so splitting the rename into one commit and the "create helper file" into another was not an option. So we need to break associations where the contents change too much. Fine. We have the -B flag for that. When we break things up, then the rename detection will be able to figure out whether there are better alternatives. - "git log --follow" didn't with with -B. Now, the second case was really simple: we use a different "diffopt" structure for the rename detection than the basic one (which we use for showing the diffs). So that second case is trivially fixed by a trivial one-liner that just copies the break_opt values from the "real" diffopts to the one used for rename following. 
So now "git log -B --follow" works fine: diff --git a/tree-diff.c b/tree-diff.c index 26bdbdd..7c261fd 100644 --- a/tree-diff.c +++ b/tree-diff.c @@ -319,6 +319,7 @@ static void try_to_follow_renames(struct tree_desc t1, struct tree_desc t2, co diff_opts.detect_rename = DIFF_DETECT_RENAME; diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT; diff_opts.single_follow = opt->paths[0]; + diff_opts.break_opt = opt->break_opt; paths[0] = NULL; diff_tree_setup_paths(paths, &diff_opts); if (diff_setup_done(&diff_opts) < 0) however, the end result does not work. Because our diffcore-break.c logic is totally bogus! In particular: - it used to do if (base_size < MINIMUM_BREAK_SIZE) return 0; /* we do not break too small filepair */ which basically says "don't bother to break small files". But that "base_size" is the *smaller* of the two sizes, which means that if some large file was rewritten into one that just includes another file, we would look at the (small) result, and decide that it's smaller than the break size, so it cannot be worth it to break it up! Even if the other side was ten times bigger and looked nothing like the small file! That's clearly bogus. I replaced "base_size" with "max_size", so that we compare the bigger of the filepair with the break size. - It calculated a "merge_score", which was the score needed to merge it back together if nothing else wanted it. But even if it was so different that we would never want to merge it back, we wouldn't consider it a break! That makes no sense. So I added if (*merge_score_p > break_score) return 1; to make it clear that if we wouldn't want to merge it at the end, it was *definitely* a break. 
- It compared the whole "extent of damage", counting all inserts and deletes, but it based this score on the "base_size", and generated the damage score with delta_size = src_removed + literal_added; damage_score = delta_size * MAX_SCORE / base_size; but that makes no sense either, since quite often, this will result in a number that is bigger than MAX_SCORE! Why? Because base_size is (again) the smaller of the two files we compare, and when you start out from a small file and add a lot (or start out from a large file and remove a lot), the base_size is going to be much smaller than the damage! Again, the fix was to replace "base_size" with "max_size", at which point the damage actually becomes a sane percentage of the whole. With these changes in place, not only does "git log -B --follow" work for the case that triggered this in the first place, ie now git log -B --follow arch/x86/kernel/vmlinux_64.lds.S actually gives reasonable results. But I also wanted to verify it in general, by doing a full-history git log --stat -B -C on my kernel tree with the old code and the new code. There's some tweaking to be done, but generally, the new code generates much better results wrt breaking up files (and then finding better rename candidates). 
Here's a few examples of the "--stat" output: - This: include/asm-x86/Kbuild | 2 - include/asm-x86/debugreg.h | 79 +++++++++++++++++++++++++++++++++++------ include/asm-x86/debugreg_32.h | 64 --------------------------------- include/asm-x86/debugreg_64.h | 65 --------------------------------- 4 files changed, 68 insertions(+), 142 deletions(-) Becomes: include/asm-x86/Kbuild | 2 - include/asm-x86/{debugreg_64.h => debugreg.h} | 9 +++- include/asm-x86/debugreg_32.h | 64 ------------------------- 3 files changed, 7 insertions(+), 68 deletions(-) - This: include/asm-x86/bug.h | 41 +++++++++++++++++++++++++++++++++++++++-- include/asm-x86/bug_32.h | 37 ------------------------------------- include/asm-x86/bug_64.h | 34 ---------------------------------- 3 files changed, 39 insertions(+), 73 deletions(-) Becomes include/asm-x86/{bug_64.h => bug.h} | 20 +++++++++++++----- include/asm-x86/bug_32.h | 37 ----------------------------------- 2 files changed, 14 insertions(+), 43 deletions(-) Now, in some other cases, it does actually turn a rename into a real "delete+create" pair, and then the diff is usually bigger, so truth in advertizing: it doesn't always generate a nicer diff. But for what -B was meant for, I think this is a big improvement, and I suspect those cases where it generates a bigger diff are tweakable. So I think this diff fixes a real bug, but we might still want to tweak the default values and perhaps the exact rules for when a break happens. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2007-10-21 01:59:42 -04:00
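
The damage-score arithmetic discussed in commit 6dd4b66fde can be illustrated with a small C sketch. This is not the code from diffcore-break.c: the function names, the integer types, and the MAX_SCORE value of 60000 are assumptions made for illustration only. It shows why dividing by the smaller size ("base_size") can push the score far past MAX_SCORE, while dividing by the larger size ("max_size") keeps it in range for the same input.

```c
#include <assert.h>

/* Assumed score scale, chosen for illustration only. */
#define MAX_SCORE 60000UL

/* Hypothetical helper: damage relative to the SMALLER of the two files,
 * which is the behaviour the commit message describes as broken. */
unsigned long damage_vs_base(unsigned long src_size, unsigned long dst_size,
                             unsigned long src_removed,
                             unsigned long literal_added)
{
	unsigned long base_size = src_size < dst_size ? src_size : dst_size;
	unsigned long delta_size = src_removed + literal_added;

	assert(base_size > 0); /* avoid dividing by zero in this sketch */
	return delta_size * MAX_SCORE / base_size;
}

/* Hypothetical helper: damage relative to the LARGER of the two files,
 * which is the replacement the commit message describes. */
unsigned long damage_vs_max(unsigned long src_size, unsigned long dst_size,
                            unsigned long src_removed,
                            unsigned long literal_added)
{
	unsigned long max_size = src_size > dst_size ? src_size : dst_size;
	unsigned long delta_size = src_removed + literal_added;

	assert(max_size > 0); /* avoid dividing by zero in this sketch */
	return delta_size * MAX_SCORE / max_size;
}
```

For a made-up case where a 10000-byte file is rewritten into a 200-byte file, with 9850 bytes removed and 50 bytes added, damage_vs_base() returns 2970000 (far above MAX_SCORE) while damage_vs_max() returns 59400.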


15424 Commits