Here's a test-patch. I don't guarantee anything, except that when I did
the timings I also did a "wc" on the result, and they matched..
Before:
[torvalds@woody linux]$ time git diff -l0 --stat -C v2.6.22.. | wc
7104 28574 438020
real 0m10.526s
user 0m10.401s
sys 0m0.136s
After:
[torvalds@woody linux]$ time ~/git/git diff -l0 --stat -C v2.6.22.. | wc
7104 28574 438020
real 0m8.876s
user 0m8.761s
sys 0m0.128s
but the diff is fairly simple, so if somebody will go over it and say
whether it's likely to be *correct* too, that 15% may well be worth it.
[ Side note, without rename detection, that diff takes just under three
seconds for me, so in that sense the improvement to the rename detection
itself is larger than the overall 15% - it brings the cost of just
rename detection from 7.5s to 5.9s, which would be on the order of just
over a 20% performance improvement. ]
Hmm. The patch depends on half-way subtle issues like the fact that the
hashtables are guaranteed to not be full => we're guaranteed to have zero
counts at the end => we don't need to do any steenking iterator count in
the loop. A few comments might in order.
Linus
In most cases of branching, the tree is copied unmodified from the trunk
to the branch. When that is done, we can simply start with the parent's
index and apply the changes on the branch as usual.
[ew: rewritten from Steven's original to use SVN::Client instead
of the command-line svn client.
Since SVN::Client connects separately, we'll share our
authentication providers array between our usages of
SVN::Client and SVN::Ra, too. Bypassing the high-level
SVN::Client library can avoid this, but the code will be
much more complex. Regardless, any implementation of this
seems to require restarting a connection to the remote
server.
Also of note is that SVN 1.4 and later allows a more
efficient diff_summary to be done instead of a full diff,
but since this code is only to support SVN < 1.4.4, we'll
ignore it for now.]
Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ph/strbuf: (44 commits)
Make read_patch_file work on a strbuf.
strbuf_read_file enhancement, and use it.
strbuf change: be sure ->buf is never ever NULL.
double free in builtin-update-index.c
Clean up stripspace a bit, use strbuf even more.
Add strbuf_read_file().
rerere: Fix use of an empty strbuf.buf
Small cache_tree_write refactor.
Make builtin-rerere use of strbuf nicer and more efficient.
Add strbuf_cmp.
strbuf_setlen(): do not barf on setting length of an empty buffer to 0
sq_quote_argv and add_to_string rework with strbuf's.
Full rework of quote_c_style and write_name_quoted.
Rework unquote_c_style to work on a strbuf.
strbuf API additions and enhancements.
nfv?asprintf are broken without va_copy, workaround them.
Fix the expansion pattern of the pseudo-static path buffer.
builtin-for-each-ref.c::copy_name() - do not overstep the buffer.
builtin-apply.c: fix a tiny leak introduced during xmemdupz() conversion.
Use xmemdupz() in many places.
...
* lh/merge:
git-merge: add --ff and --no-ff options
git-merge: add support for --commit and --no-squash
git-merge: add support for branch.<name>.mergeoptions
git-merge: refactor option parsing
git-merge: fix faulty SQUASH_MSG
Add test-script for git-merge porcelain
* jc/autogc:
git-gc --auto: run "repack -A -d -l" as necessary.
git-gc --auto: restructure the way "repack" command line is built.
git-gc --auto: protect ourselves from accumulated cruft
git-gc --auto: add documentation.
git-gc --auto: move threshold check to need_to_gc() function.
repack -A -d: use --keep-unreachable when repacking
pack-objects --keep-unreachable
Export matches_pack_name() and fix its return value
Invoke "git gc --auto" from commit, merge, am and rebase.
Implement git gc --auto
* ap/dateformat:
Add a test script for for-each-ref, including test of date formatting
dateformat: parse %(xxdate) %(yydate:format) correctly
Make for-each-ref's grab_date() support per-atom formatting
Make for-each-ref allow atom names like "<name>:<something>"
parse_date_format(): convert a format name to an enum date_mode
This tests basic functionality and also exercises a bug noticed
by Keith Packard, (prune_cache followed by add_index_entry can
trigger an attempt to realloc a pointer into the middle of an
allocated buffer).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index cache is not static, growing as new entries are added. If
entries are added after prune_cache is called, cache will no longer
point at the base of the allocation, and realloc will not be happy.
I verified that this was the only place in the current source which
modified any index_state.cache elements aside from the alloc/realloc
calls in read-cache by changing the type of the element to 'struct
cache_entry ** const cache' and recompiling.
A more efficient patch would create a separate 'cache_base' value to
track the allocation and then fix things up when reallocation was
necessary, instead of the brute-force memmove used here.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the user has started git-gui from the command line as a browser
we offer the gitk menu options but we didn't create the main status
bar widget in the "." toplevel. Trying to access it while starting
gitk just results in Tcl errors.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
gitk expects $env(GIT_DIR) to be valid as both a path that core Git
and Tcl/Tk can resolve to a valid directory, but it has no special
handling for Cygwin style UNIX paths and Windows style paths. So
we need to do that for gitk and ensure that only relative paths are
fed to it, thus allowing both Cygwin style and UNIX style paths to
be resolved.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This adds a verbosity level below 0 for suppressing default messages
with --quiet, and makes the default for http be verbose instead of
quiet. This matches the behavior of the shell script version of git-fetch.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some systems that have only installed the GNU toolchain (prefixed with "g")
do not provide "ar" but only "gar". Make configure find this tool as well.
Signed-off-by: Robert Schiele <rschiele@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We find rename candidates by computing a fingerprint hash of
each file, and then comparing those fingerprints. There are
inherently O(n^2) comparisons, so it pays in CPU time to
hoist the (rather expensive) computation of the fingerprint
out of that loop (or to cache it once we have computed it once).
Previously, we didn't keep the filespec information around
because then we had the potential to consume a great deal of
memory. However, instead of keeping all of the filespec
data, we can instead just keep the fingerprint.
This patch implements and uses diff_free_filespec_data_large
to accomplish that goal. We also have to change
estimate_similarity not to needlessly repopulate the
filespec data when we already have the hash.
Practical tests showed 4.5x speedup for a 10% memory usage
increase.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation used to say what the option does, but it
didn't mention a use case.
Signed-off-by: Federico Mena Quintero <federico@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There was an 'l' (ell) instead of a '1' (one) in one of the gitlinks.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The string value of %(numparent) was not returned correctly.
Also %(parent) misbehaved for the root commits (returned garbage)
and merge commits (returned first parent, followed by a space).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andy Parkins noticed that parsing of the above would not
correctly notice that xxdate does not have any format
specifier.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We rely on TMP_INDEX variable to decide if we are doing a partial commit,
as it is only set in the partial commit codepath. But the variable is
never initialized. A stray environment variable from outside could
ruin the day.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We lost rsync support when transitioning from shell to C. Support it
again (even if the transport is technically deprecated, some people just
do not have any chance to use anything else).
Also, add a test to t5510. Since rsync transport is not configured by
default on most machines, and especially not such that you can write to
rsync://127.0.0.1$(pwd)/, it is disabled by default; you can enable it by
setting the environment variable TEST_RSYNC.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Whip post 1.5.3.3 maintenance series into shape.
git stash: document apply's --index switch
post-receive-hook: Remove the From field from the generated email header so that the pusher's name is used
Using the name of the committer of the revision at the tip of the
updated ref is not sensible. That information is available in the email
itself should it be wanted, and by supplying a "From", we were
effectively hiding the person who performed the push - which is useful
information in itself.
Signed-off-by: Andy Parkins <andyparkins@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There was a function called remove_empty_dir_recursive() buried
in refs.c. Expose a slightly enhanced version in dir.h: it can now
optionally remove a non-empty directory.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently alloc_ref() expects the length of the refname plus 1
as its parameter, prepares that much space and returns a "ref"
structure for the caller to fill the refname. One caller in
transport.c::get_refs_from_bundle() however allocated one byte
less.
It may be a good idea to change the calling convention to give
alloc_ref() the length of the refname, but that clean-up can be
done in a separate patch. This patch only fixes the bug and
makes all callers consistent.
There was also one overallocation in connect.c, which would not
hurt but was wasteful. This patch fixes it as well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead print a single message around sequences of commands that can
potentially take some time.
Signed-off-by: Alexandre Julliard <julliard@winehq.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this, a non-path URL gets lost before the clone.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Exit with non-zero status when "git remote rm" was told to
remove a non-existing remote.
Signed-off-by: Jari Aalto <jari.aalto@cante.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
git-remote: exit with non-zero status after detecting errors.
rebase -i: squash should retain the authorship of the _first_ commit
git-add--interactive: Improve behavior on bogus input
git-add--interactive: Allow Ctrl-D to exit
Some subcommands of "git-remote" detected and issued error
messages but did not signal that to the calling process with
exit status.
Signed-off-by: Jari Aalto <jari.aalto@cante.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was determined on the mailing list, that it makes more sense for a
"squash" to keep the author of the first commit as the author for the
result of the squash.
Make it so.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you do many rebases, you can get annoyed by having to type out
the actions "edit" or "squash" in total.
This commit helps that, by allowing you to enter "e" instead of "edit",
"p" instead of "pick", or "s" instead of "squash".
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>