Ok, so on the kernel list, some people noticed that "git log --follow"
doesn't work too well with some files in the x86 merge, because a lot of
files got renamed in very special ways.
In particular, there was a pattern of doing single commits with renames
that looked basically like
- rename "filename.h" -> "filename_64.h"
- create new "filename.c" that includes "filename_32.h" or
"filename_64.h" depending on whether we're 32-bit or 64-bit.
which was preparatory for smushing the two trees together.
Now, there's two issues here:
- "filename.c" *remained*. Yes, it was a rename, but there was a new file
created with the old name in the same commit. This was important,
because we wanted each commit to compile properly, so that it was
bisectable, so splitting the rename into one commit and the "create
helper file" into another was *not* an option.
So we need to break associations where the contents change too much.
Fine. We have the -B flag for that. When we break things up, then the
rename detection will be able to figure out whether there are better
alternatives.
- "git log --follow" didn't with with -B.
Now, the second case was really simple: we use a different "diffopt"
structure for the rename detection than the basic one (which we use for
showing the diffs). So that second case is trivially fixed by a trivial
one-liner that just copies the break_opt values from the "real" diffopts
to the one used for rename following. So now "git log -B --follow" works
fine:
diff --git a/tree-diff.c b/tree-diff.c
index 26bdbdd..7c261fd 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -319,6 +319,7 @@ static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, co
diff_opts.detect_rename = DIFF_DETECT_RENAME;
diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
diff_opts.single_follow = opt->paths[0];
+ diff_opts.break_opt = opt->break_opt;
paths[0] = NULL;
diff_tree_setup_paths(paths, &diff_opts);
if (diff_setup_done(&diff_opts) < 0)
however, the end result does *not* work. Because our diffcore-break.c
logic is totally bogus!
In particular:
- it used to do
if (base_size < MINIMUM_BREAK_SIZE)
return 0; /* we do not break too small filepair */
which basically says "don't bother to break small files". But that
"base_size" is the *smaller* of the two sizes, which means that if some
large file was rewritten into one that just includes another file, we
would look at the (small) result, and decide that it's smaller than the
break size, so it cannot be worth it to break it up! Even if the other
side was ten times bigger and looked *nothing* like the samell file!
That's clearly bogus. I replaced "base_size" with "max_size", so that
we compare the *bigger* of the filepair with the break size.
- It calculated a "merge_score", which was the score needed to merge it
back together if nothing else wanted it. But even if it was *so*
different that we would never want to merge it back, we wouldn't
consider it a break! That makes no sense. So I added
if (*merge_score_p > break_score)
return 1;
to make it clear that if we wouldn't want to merge it at the end, it
was *definitely* a break.
- It compared the whole "extent of damage", counting all inserts and
deletes, but it based this score on the "base_size", and generated the
damage score with
delta_size = src_removed + literal_added;
damage_score = delta_size * MAX_SCORE / base_size;
but that makes no sense either, since quite often, this will result in
a number that is *bigger* than MAX_SCORE! Why? Because base_size is
(again) the smaller of the two files we compare, and when you start out
from a small file and add a lot (or start out from a large file and
remove a lot), the base_size is going to be much smaller than the
damage!
Again, the fix was to replace "base_size" with "max_size", at which
point the damage actually becomes a sane percentage of the whole.
With these changes in place, not only does "git log -B --follow" work for
the case that triggered this in the first place, ie now
git log -B --follow arch/x86/kernel/vmlinux_64.lds.S
actually gives reasonable results. But I also wanted to verify it in
general, by doing a full-history
git log --stat -B -C
on my kernel tree with the old code and the new code.
There's some tweaking to be done, but generally, the new code generates
much better results wrt breaking up files (and then finding better rename
candidates). Here's a few examples of the "--stat" output:
- This:
include/asm-x86/Kbuild | 2 -
include/asm-x86/debugreg.h | 79 +++++++++++++++++++++++++++++++++++------
include/asm-x86/debugreg_32.h | 64 ---------------------------------
include/asm-x86/debugreg_64.h | 65 ---------------------------------
4 files changed, 68 insertions(+), 142 deletions(-)
Becomes:
include/asm-x86/Kbuild | 2 -
include/asm-x86/{debugreg_64.h => debugreg.h} | 9 +++-
include/asm-x86/debugreg_32.h | 64 -------------------------
3 files changed, 7 insertions(+), 68 deletions(-)
- This:
include/asm-x86/bug.h | 41 +++++++++++++++++++++++++++++++++++++++--
include/asm-x86/bug_32.h | 37 -------------------------------------
include/asm-x86/bug_64.h | 34 ----------------------------------
3 files changed, 39 insertions(+), 73 deletions(-)
Becomes
include/asm-x86/{bug_64.h => bug.h} | 20 +++++++++++++-----
include/asm-x86/bug_32.h | 37 -----------------------------------
2 files changed, 14 insertions(+), 43 deletions(-)
Now, in some other cases, it does actually turn a rename into a real
"delete+create" pair, and then the diff is usually bigger, so truth in
advertizing: it doesn't always generate a nicer diff. But for what -B was
meant for, I think this is a big improvement, and I suspect those cases
where it generates a bigger diff are tweakable.
So I think this diff fixes a real bug, but we might still want to tweak
the default values and perhaps the exact rules for when a break happens.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
On Fri, 19 Oct 2007, Todd T. Fries wrote:
> If DT_UNKNOWN exists, then we have to do a stat() of some form to
> find out the right type.
That happened in the case of a pathname that was ignored, and we did
not ask for "dir->show_ignored". That test used to be *together*
with the "DTYPE(de) != DT_DIR", but splitting the two tests up
means that we can do that (common) test before we even bother to
calculate the real dtype.
Of course, that optimization only matters for systems that don't
have, or don't fill in DTYPE properly.
I also clarified the real relationship between "exclude" and
"dir->show_ignored". It used to do
if (exclude != dir->show_ignored) {
..
which wasn't exactly obvious, because it triggers for two different
cases:
- the path is marked excluded, but we are not interested in ignored
files: ignore it
- the path is *not* excluded, but we *are* interested in ignored
files: ignore it unless it's a directory, in which case we might
have ignored files inside the directory and need to recurse
into it).
so this splits them into those two cases, since the first case
doesn't even care about the type.
I also made a the DT_UNKNOWN case a separate helper function,
and added some commentary to the cases.
Linus
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* 'maint' of git://repo.or.cz/git-gui:
git-gui: Don't display CR within console windows
git-gui: Handle progress bars from newer gits
git-gui: Correctly report failures from git-write-tree
git-gui: accept versions containing text annotations, like 1.5.3.mingw.1
git-gui: Don't crash when starting gitk from a browser session
git-gui: Allow gitk to be started on Cygwin with native Tcl/Tk
git-gui: Ensure .git/info/exclude is honored in Cygwin workdirs
git-gui: Handle starting on mapped shares under Cygwin
git-gui: Display message box when we cannot find git in $PATH
git-gui: Avoid using bold text in entire gui for some fonts
receive-pack is only executed remotely so when
reporting errors, say so.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The arguments to the "Not a blob" die call in file_change_m were
transposed, so that the command was printed as the type, and the type
as the command. Switch them around so that the error message comes
out correctly.
Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Git progress bars from tools like git-push and git-fetch use CR
to skip back to the start of the current line and redraw it with
an updated progress. We were doing this in our Tk widget but had
failed to skip the CR, which Tk doesn't draw well.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Post Git 1.5.3 a new style progress bar has been introduced that
uses only one line rather than two. The formatting of the completed
and total section is also slightly different so we must adjust our
regexp to match. Unfortunately both styles are in active use by
different versions of Git so we need to look for both.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If git-write-tree fails (such as if the index file is currently
locked and it wants to write to it) we were not getting the error
message as $tree_id was always the empty string so we shortcut
through the catch and never got the output from stderr.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
For the manpage, avoid generating an em dash in code.
Signed-off-by: Ralf Wildenhues <Ralf.Wildenhues@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When matching source and destination refs, we were failing
to pull the 'force' parameter from wildcard refspecs (but
not explicit ones) and attach it to the ref struct.
This adds a test for explicit and wildcard refspecs; the
latter fails without this patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In 89d750bf6f I got a little too
aggressive with changing "git diff" to "git diff-tree". This is
shown to the user, who expects to see a full diff on their console,
and will want to see the output of their custom diff drivers (if
any) as the whole point of this call site is to show the diff to
the end-user.
Noticed by Johannes Sixt <j.sixt@viscovery.net>.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
git-stash needs to restrict itself to plumbing when running automated
diffs as part of its operation as the user may have configured a
custom diff driver that opens an interactive UI for certain/all
files. Doing that during scripted actions is very unfriendly to
the end-user and may cause git-stash to fail to work.
Reported by Johannes Sixt
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
git may segfault if gitattributes contains an invalid
entry. A test is added to t0020 that triggers the segfault.
The parsing code is fixed to avoid the crash.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
I found I needed NI_MAXSERV as it is defined in netdb.h, which is
not included by daemon.c. Rather than including the whole header
we can define a reasonable fallback value.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CVS gets understandably upset if you try and add a subdirectory
before it's parent directory. This patch fixes that.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Ok, what is going on is:
- append_fetch_head() looks up the SHA1 for all heads (including tags):
if (get_sha1(head, sha1))
return error("Not a valid object name: %s", head);
- it then wants to check if it's a candidate for merging (because
fetching also does the whole "list which heads to merge" in case
it is going to be part of a "pull"):
commit = lookup_commit_reference(sha1);
if (!commit)
not_for_merge = 1;
- and that "lookup_commit_reference()" is just very vocal about the
case where it fails. It really shouldn't be, and it shouldn't
affect the actual end result, but that basically explains why
you get that scary warning.
In short, the warning is just bogus, and should be harmless, but
I agree that it's ugly. I think the appended patch should fix it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This was causing test failures because die was exiting 255.
Signed-off-by: Brian Gernhardt <benji@silverinsanity.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If we are in the middle of resolving a merge conflict there may be
one or more files whose entries in the index represent an unmerged
state (index entries in the higher-order stages).
Attempting to run git-blame on any file in such a working directory
resulted in "fatal: internal error: ce_mode is 0" as we use the magic
marker for an unmerged entry is 0 (set up by things like diff-lib.c's
do_diff_cache() and builtin-read-tree.c's read_tree_unmerged())
and the ce_match_stat_basic() function gets upset about this.
I'm not entirely sure that the whole "ce_mode = 0" case is a good
idea to begin with, and maybe the right thing to do is to remove
that horrid freakish special case, but removing the internal error
seems to be the simplest fix for now.
Linus
[sp: Thanks to Björn Steinbrink for the test case]
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Instead of simply exiting with 255, print an error message including
the reason why a config file specified through --file cannot be opened
or read.
The problem was noticed by Joey Hess, reported through
http://bugs.debian.org/445208
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
post-receive-email has one place where the variable fast_forward is not
spelled correctly. At the same place the logic was reversed. The
combination of both bugs made the script work correctly for fast-forward
commits but not for non-fast-forward ones. This change fixes this to
be correct in both cases.
Signed-off-by: Robert Schiele <rschiele@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This fixes git-instaweb so it can start an httpd without warning
about an invalid test command. Yes its ugly, but its also quite
portable.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Earlier, "git filter-branch --<options> HEAD" would not update the
working tree after rewriting the branch. This commit fixes it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The man page for filter-branch still talked about writing the result
to the branch "newbranch". This is hopefully the last place where the
old behaviour was described.
Noticed by Bill Lear.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
There are a few programs, such as config and diff, which allow running
without a git repository. Therefore, they have to call
setup_git_directory_gently().
However, when GIT_DIR and GIT_WORK_TREE were set, and the current
directory was a subdirectory of the work tree,
setup_git_directory_gently() would return a bogus NULL prefix.
This patch fixes that.
Noticed by REPLeffect on IRC.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When diff drivers are installed, calling "git diff <tree1>..<tree2>"
calls those drivers. This borks the patch generation of rebase -i.
So use "git diff-tree -p" instead, which does not call diff drivers.
Noticed by Johannes Sixt.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Before this patch the clean target has removed the
configure script that comes with Git tar file.
That made compiling Git for different architectures
inconvenient.
This patch excludes configure from the files to be
deleted by 'make clean' and adds new target 'distclean'
to preserve old functionality.
Signed-off-by: Mathias Megyei <mathias@mnet-mail.de>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When calling git-config not from the top level directory of a repository,
it changes directory before trying to open the config file specified
through the --file option, which then fails if the config file was
specified by a relative pathname. This patch adjusts the pathname to
the config file if applicable.
The problem was noticed by Joey Hess, reported through
http://bugs.debian.org/445208
Signed-off-by: Gerrit Pape <pape@smarden.org>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Before this patch, clear_commit_marks() recursed for each parent. This
could be potentially very expensive in terms of stack space. Probably
the only reason that this did not lead to problems is the fact that we
typically call clear_commit_marks() after marking a relatively small set
of commits.
Use (sort of) a tail recursion instead: first recurse on the parents
other than the first one, and then continue the loop with the first
parent.
Noticed by Shawn Pearce.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The unified diff format allows one-line ranges to be abbreviated
by omiting the size. The hunk header "@@ -10,1 +10,1 @@" can be
expressed as "@@ -10 +10 @@", but this wasn't properly parsed in
all cases.
Such abbreviated hunk headers are generated when a one-line change
(add, remove or modify) appears without context; for example
because the file is a one-liner itself or because GIT_DIFF_OPTS
was set to '-u0'. If the user then runs 'git add -i' and enters
the 'patch' command for that file, perl complains about undefined
variables.
Signed-off-by: Jean-Luc Herren <jlh@gmx.ch>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Error out if someone gives options after --list since that is
not a valid syntax.
Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This fixes an unnecessary empty line that we add to the log message when
we generate diffs, but don't actually end up printing any due to having
DIFF_FORMAT_NO_OUTPUT set.
This can happen with pickaxe or with rename following. The reason is that
we normally add an empty line between the commit and the diff, but we do
that even for the case where we've then suppressed the actual printing of
the diff.
This also updates a couple of tests that assumed the extraneous empty
line would exist at the end of output.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
It turns out that I completely broke "git log --follow" with my recent
patch to revision.c ("Fix revision log diff setup, avoid unnecessary diff
generation", commit b7bb760d5e).
Why? Because --follow obviously requires the diff machinery to function,
exactly the same way pickaxe does.
So everybody is away right now, but considering that nobody even noticed
this bug, I don't think it matters. But for the record, here's the trivial
one-liner fix (well, two, since I also fixed the comment).
Because of the nature of the bug, if you ask for patches when following
(which is one of the things I normally do), the bug is hidden, because
then the request for diff output will automatically also enable the diffs
themselves.
So while "git log --follow <filename>" didn't work, adding a "-p"
magically made it work again even without this fix.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This commit teaches git-gui to accept versions with annotations
that start with text and optionally end with a dot followed by
a number.
This is needed by the current versioning scheme of msysgit,
which uses versions like 1.5.3.mingw.1. However, the changes
is not limited to this use case. Any version of the form
<numeric version>.<anytext>.<number> would be parsed and only
the starting <numeric version> used for validation.
[sp: Minor edit to remove unnecessary group matching]
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This tests basic functionality and also exercises a bug noticed
by Keith Packard, (prune_cache followed by add_index_entry can
trigger an attempt to realloc a pointer into the middle of an
allocated buffer).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index cache is not static, growing as new entries are added. If
entries are added after prune_cache is called, cache will no longer
point at the base of the allocation, and realloc will not be happy.
I verified that this was the only place in the current source which
modified any index_state.cache elements aside from the alloc/realloc
calls in read-cache by changing the type of the element to 'struct
cache_entry ** const cache' and recompiling.
A more efficient patch would create a separate 'cache_base' value to
track the allocation and then fix things up when reallocation was
necessary, instead of the brute-force memmove used here.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user has started git-gui from the command line as a browser
we offer the gitk menu options but we didn't create the main status
bar widget in the "." toplevel. Trying to access it while starting
gitk just results in Tcl errors.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
gitk expects $env(GIT_DIR) to be valid as both a path that core Git
and Tcl/Tk can resolve to a valid directory, but it has no special
handling for Cygwin style UNIX paths and Windows style paths. So
we need to do that for gitk and ensure that only relative paths are
fed to it, thus allowing both Cygwin style and UNIX style paths to
be resolved.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some systems that have only installed the GNU toolchain (prefixed with "g")
do not provide "ar" but only "gar". Make configure find this tool as well.
Signed-off-by: Robert Schiele <rschiele@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>