In a path-limited bisection, when the $bad commit is not
changing the limited path, and the number of suspects is 1, the
code miscounted and returned $bad from find_bisection(), which
is not marked with TREECHANGE. This is of course filtered by
the output routine, resulting in an empty output, in turn
causing git-bisect driver to say "$bad was both good and bad".
Illustration. Suppose you have these four commits, and only C
changes path P. You know D is bad and A is good.
A---B---C*--D
git-bisect driver runs this to find a bisection point:
$ git rev-list --bisect A..D -- P
which calls find_bisection() with B, C and D. The set of
commits that is given to this function is the same set of
commits as rev-list without --bisect option and pathspec
returns. Among them, only C is marked with TREECHANGE. Let's
call the set of commits given to find_bisection() that are
marked with TREECHANGE (or all of them if no path limiter is in
effect) "the bisect set". In the above example, the size of the
bisect set is 1 (contains only "C").
For each commit in its input, find_bisection() computes the
number of commits it can reach in the bisect set. For a commit
in the bisect set, this number includes itself, so the number is
1 or more. This number is called "depth", and computed by
count_distance() function.
When you have a bisect set of N commits, and a commit has depth
D, how good is your bisection if you returned that commit? How
good this bisection is can be measured by how many commits are
effectively tested "together" by testing one commit.
Currently you have (N-1) untested commits (the tip of the bisect
set, although it is included in the bisect set, is already known
to be bad). If the commit with depth D turns out to be bad,
then your next bisect set will have D commits and you will have
(D-1) untested commits left, which means you tested (N-1)-(D-1)
= (N-D) commits with this bisection. If it turns out to be good, then
your next bisect set will have (N-D) commits, and you will have
(N-D-1) untested commits left, which means you tested
(N-1)-(N-D-1) = D commits with this bisection.
Therefore, the goodness of this bisection is is min(N-D, D), and
find_bisection() function tries to find a commit that maximizes
this, by initializing "closest" variable to 0 and whenever a
commit with the goodness that is larger than the current
"closest" is found, that commit and its goodness are remembered
by updating "closest" variable. The "the commit with the best
goodness so far" is kept in "best" variable, and is initialized
to a commit that happens to be at the beginning of the list of
commits given to this function (which may or may not be in the
bisect set when path-limit is in use).
However, when N is 1, then the sole tree-changing commit has
depth of 1, and min(N-D, D) evaluates to 0. This is not larger
than the initial value of "closest", and the "so far the best
one" commit is never replaced in the loop.
When path-limit is not in use, this is not a problem, as any
commit in the input set is tree-changing. But when path-limit
is in use, and when the starting "bad" commit does not change
the specified path, it is not correct to return it.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Fix copy'n'paste error in commit c9d193df which caused that "next"
link for merge commits in "commit" view
(merge: _commit_ _commit_ ...)
was to "commitdiff" view instead of being to "commit" view.
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The branch you are on while bisecting is always "bisect", and
checking for "refs/heads/bisect*" is wrong. Only check if it is
exactly "refs/heads/bisect".
Signed-off-by: Junio C Hamano <junkio@cox.net>
After "git reset" moves the HEAD around, it reports which commit
you are on, which gives the user a warm fuzzy feeling of
assurance. Give the same assurance from git-checkout when
moving the detached HEAD around.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This idea was suggested by Bill Lear
(Message-ID: <17920.38942.364466.642979@lisa.zopyra.com>)
and I think it is a very good one.
This patch adds a new test file for "git bisect run", but there
is currently only one basic test.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Without this the rev could be (e.g.) a tag and then the condition to end the
bisect might fail and you have to check the already known to be bad revision
once more.
Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Elias Pipping:
> I'm on a mac, hence /usr/bin/sed is not gnu sed, which makes
> t4118 fail.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Ack'd-by: Elias Pipping <pipping@macports.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When we write out the result of patch application, we sometimes
need to munge the data (e.g. under core.autocrlf). After doing
so, what we should free is the temporary buffer that holds the
converted data returned from convert_to_working_tree(), not the
original one.
This patch also moves the call to open() up in the function, as
the caller expects us to fail cheaply if leading directories
need to be created (and then the caller creates them and calls
us again). For that calling pattern, attempting conversion
before opening the file adds unnecessary overhead.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The HEAD reflog is updated as well as the reflog for the branch pointed
to by HEAD whenever it is referenced with "HEAD".
There are some cases where a specific branch may be modified directly.
In those cases, the HEAD reflog should be updated as well if it is a
symref to that branch in order to be consistent.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
It was annoying to always have the first email from a project be from
the "Unnamed repository; edit this file to name it for gitweb project";
just because it's so easy to forget to set it.
This patch checks to see if the description file is still default (or
empty) and aborts if so - allowing you to fix the problem before sending
out silly looking emails to every developer.
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes git-fetch <URL> && git-merge FETCH_HEAD produce the
same merge message as git-pull <URL>.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Junio C Hamano <junkio@cox.net>
When you want to amend the commit message of 3 commits before
the tip of the current branch, say 'master',
A--B--C--D--E(master)
it is sometimes handy to make your head detached at that commit
with:
$ git checkout HEAD~3 ;# check out B
$ git commit --amend ;# without modifying contents...
to create:
.B'(HEAD)
/
A--B--C--D--E(master)
and then rebase 'master' branch onto HEAD with this:
$ git rebase HEAD master
to result in:
.B'-C'-D'-E(master=HEAD)
/
A--B--C--D--E
However, the current code interprets HEAD after it switches to
the branch 'master', which means the rebase will not do
anything. You have to say something unwieldly like this
instead:
$ git rebase $(git rev-parse HEAD) master
This fixes it by expanding the $onto commit name before
switching to the target branch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This improves the performance of revision bisection.
The idea is to avoid rather expensive count_distance() function,
which counts the number of commits that are reachable from any
given commit (including itself) in the set. When a commit has
only one relevant parent commit, the number of commits the
commit can reach is exactly the number of commits that the
parent can reach plus one; instead of running count_distance()
on commits that are on straight single strand of pearls, we can
just add one to the parents' count.
On the other hand, for a merge commit, because the commits
reachable from one parent can be reachable from another parent,
you cannot just add the parents' counts up plus one for the
commit itself; that would overcount ancestors that are reachable
from more than one parents.
The algorithm used in the patch runs count_distance() on merge
commits, and uses the util field of commit objects to remember
them. After that, the number of commits reachable from each of
the remaining commits is counted by finding a commit whose count
is not yet known but the count for its (sole) parent is known,
and adding one to the parent's count, until we assign numbers to
everybody.
Another small optimization is whenever we find a half-way commit
(that is, a commit that can reach exactly half of the commits),
we stop giving counts to remaining commits, as we will not find
any better commit than we just found.
The performance to bisect between v1.0.0 and v1.5.0 in git.git
repository was improved by saying good and bad in turns from
3.68 seconds down to 1.26 seconds. Bisecting the kernel between
v2.6.18 and v2.6.20 was sped up from 21.84 seconds down to 4.22
seconds.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds --bisect-vars option to rev-list. The output is suitable
for `eval` in shell and defines five variables:
- bisect_rev is the next revision to test.
- bisect_nr is the expected number of commits to test after
bisect_rev is tested.
- bisect_good is the expected number of commits to test
if bisect_rev turns out to be good.
- bisect_bad is the expected number of commits to test
if bisect_rev turns out to be bad.
- bisect_all is the number of commits we are bisecting right now.
The documentation text was partly stolen from Johannes
Schindelin's patch.
Signed-off-by: Junio C Hamano <junkio@cox.net>
Currently all calls to the database backend make no
error checking or handling at all. At least abort
if the connection to the database failed since
there is really no way we could do anything useful
after that.
Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Make all the different parts of the database backend connection
configurable. This adds the following string configuration variables:
- gitcvs.dbdriver
- gitcvs.dbname
- gitcvs.dbuser
- gitcvs.dbpass
The default values emulate the current behavior exactly for
backwards compatibility.
All configuration variables can also be specified for a specific
access method (i.e. in the form gitcvs.<method>.<var>)
The dbdriver/dbuser/dbpass variables are added for completness.
No other backend than SQLite is tested yet.
The dbname variable on the other hand is useful with this backend
already (to not discriminate against other possible backends
it was not splitted in dbdir and dbfile).
Both dbname and dbuser support dynamic variable substitution where
the available variables are:
%m -- the CVS 'module' (i.e. GIT 'head') worked on
%a -- CVS access method used (i.e. 'ext' or 'pserver')
%u -- User name of the user invoking git-cvsserver
%G -- .git directory name
%g -- .git directory name, mangled to be used in a filename,
currently this substitutes all chars except for [\w.-]
with '_'
Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Allow to override the gitcvs.enabled and gitcvs.logfile configuration
variables for each access method (i.e. "ext" or "pserver") in the
form gitcvs.<method>.<var>
Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This is intended to be used in the form gitcvs.<method>.<var>
but this patch doesn't introduce any users yet.
Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
$state->{method} contains the CVS access method used,
either 'ext' or 'pserver'
Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The test expects --bisect option can be configured with by setting
$_bisect_option. So let's allow that uniformly.
Signed-off-by: Junio C Hamano <junkio@cox.net>
In addition to optimizing pathspecs that would never match,
which was done earlier, this optimizes pathspecs that would
always match (e.g. "arch/" while the traversal is already in
"arch/i386/" hierarchy).
This patch makes the worst case slightly more palatable, while
improving average case.
Signed-off-by: Junio C Hamano <junkio@cox.net>
If we already know that some of the pathspecs can match later
entries in the tree we are looking at, we do not have to do more
expensive strncmp() upfront before comparing the length of the
match pattern and the path, as a path longer than the match
pattern will not match it, and a path shorter than the match
pattern will match only if the path is a directory-component
wise prefix of the match pattern.
Signed-off-by: Junio C Hamano <junkio@cox.net>
When we are looking at a tree entry with pathspecs, if all the
pathspecs sort strictly earlier than the entry we are currently
looking at, there is no way later entries in the same tree would
match our pathspecs, because the entries are sorted.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This makes the tree descriptor contain a "struct name_entry" as part of
it, and it gets filled in so that it always contains a valid entry. On
some benchmarks, it improves performance by up to 15%.
That makes tree entry "extract" trivial, and means that we only actually
need to decode each tree entry just once: we decode the first one when
we initialize the tree descriptor, and each subsequent one when doing
"update_tree_entry()". In particular, this means that we don't need to
do strlen() both at extract time _and_ at update time.
Finally, it also allows more sharing of code (entry_extract(), that
wanted a "struct name_entry", just got totally trivial, along with the
"tree_entry()" function).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This removes slightly more lines than it adds, but the real reason for
doing this is that future optimizations will require more setup of the
tree descriptor, and so we want to do it in one place.
Also renamed the "desc.buf" field to "desc.buffer" just to trigger
compiler errors for old-style manual initializations, making sure I
didn't miss anything.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Since we have the "tree_entry_len()" helper function these days, and
don't need to do a full strlen(), there's no point in saving the path
length - it's just redundant information.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Preferring git _space_ COMMAND over git _dash_ COMMAND allows the
user to have only git and gitk in their path. e.g. when git and gitk
are symbolic links in a personal bin directory to the real git and gitk.
Signed-off-by: Paul Mackerras <paulus@samba.org>
The earlier round makes the function return "is it different"
and it does not return a value suitable for sorting anymore. Reverse
the logic to return "are they the same suspect" instead, and rename
it to "same_suspect()".
Signed-off-by: Junio C Hamano <junkio@cox.net>