I think I have found a way to avoid the gcc crazyness.
Lookie here:
# TIME[s] SPEED[MB/s]
rfc3174 5.094 119.8
rfc3174 5.098 119.7
linus 1.462 417.5
linusas 2.008 304
linusas2 1.878 325
mozilla 5.566 109.6
mozillaas 5.866 104.1
openssl 1.609 379.3
spelvin 1.675 364.5
spelvina 1.601 381.3
nettle 1.591 383.6
notice? I outperform all the hand-tuned asm on 32-bit too. By quite a
margin, in fact.
Now, I didn't try a P4, and it's possible that it won't do that there, but
the 32-bit code generation sure looks impressive on my Nehalem box. The
magic? I force the stores to the 512-bit hash bucket to be done in order.
That seems to help a lot.
The diff is trivial (on top of the "rename registers with cpp" patch), as
appended. And it does seem to fix the P4 issues too, although I can
obviously (once again) only test Prescott, and only in 64-bit mode:
# TIME[s] SPEED[MB/s]
rfc3174 1.662 36.73
rfc3174 1.64 37.22
linus 0.2523 241.9
linusas 0.4367 139.8
linusas2 0.4487 136
mozilla 0.9704 62.9
mozillaas 0.9399 64.94
that's some really impressive improvement. All from just saying "do the
stores in the order I told you to, dammit!" to the compiler.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of letting the compiler to figure out the optimal way to rotate
register usage, explicitly rotate the register names with cpp.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When this option is given, the command does not verify the pack contents,
but shows the delta chain histogram. If used with --verbose, the usual
list of objects is also shown.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint: (95 commits)
verify-pack -v: do not report "chain length 0"
t5510: harden the way verify-pack is used
gitweb/README: Document $base_url
Documentation: git submodule: add missing options to synopsis
Better usage string for reflog.
hg-to-git: don't import the unused popen2 module
send-email: remove debug trace
config: Keep inner whitespace verbatim
GIT 1.6.4
GIT 1.6.3.4
config.txt: document add.ignore-errors
request-pull: allow ls-remote to notice remote.$nickname.uploadpack
Update the documentation of the raw diff output format
git-rerere.txt: Clarify ambiguity of the config variable
t9143: do not fail if Compress::Zlib is missing
Trivial path quoting fixes in git-instaweb
GIT 1.6.4-rc3
Documentation/config.txt: a variable can be defined on the section header line
git svn: make minimize URL more reliable over http(s)
Disable asciidoc 8.4.1+ semantics for `{plus}` and friends
...
When making a histogram of delta chain length in the pack, the program
collects number of objects whose delta depth exceeds the MAX_CHAIN limit
in histogram[0], and showed it as the number of items that exceeds the
limit correctly. HOWEVER, it also showed the same number labeled as
"chain length = 0".
In fact, we are not showing the number of objects whose chain length is
zero, i.e. the base objects. Correct this.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test ignored the exit status from verify pack command, and also relied
on not seeing any delta chain statistics.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a path is unmerged in the index, we used to always say "unmerged" in
the "Changed but not updated" section, even when the path was deleted in
the work tree.
Remove unmerged entries from the "Updated" section, and create a new
section "Unmerged paths". Describe how the different stages conflict
in more detail in this new section.
Note that with the current 3-way merge policy (with or without recursive),
certain combinations of index stages should never happen. For example,
having only stage #2 means that a path that did not exist in the common
ancestor was added by us while the other branch did not do anything to it,
which would have autoresolved to take our addition. The code nevertheless
prepares for the possibility that future merge policies may leave a path
in such a state.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid git ending with this message:
"Patch format is not supported."
With improved error message in the format detection failure case by
Giuseppe Bilotta.
Signed-off-by: Nicolas Sebrecht <ni.s@laposte.net>
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We traditionally allowed a mbox file or a directory name of a maildir (but
never an individual file inside a maildir) to be given to "git am". Even
though an individual file in a maildir (or more generally, a piece of
RFC2822 e-mail) is not a mbox file, it contains enough information to
create a commit out of it, so there is no reason to reject one. Running
mailsplit on such a file feels stupid, but it does not hurt.
This builds on top of a5a6755 (git-am foreign patch support: introduce
patch_format, 2009-05-27) that introduced mailbox format detection. The
codepath to deal with a mbox requires it to begin with "From " line and
also allows it to begin with "From: ", but a random piece of e-mail can
and often do begin with any valid RFC2822 header lines.
Instead of checking the first line, we extract all the lines up to the
first empty line, and make sure they look like e-mail headers.
A test is added to t4150 to demonstrate this feature.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The XZ compression format uses the LZMA2 compression algorithm, which
often yields higher compression ratios than both GZip and BZip2 at the
cost of using more CPU time and RAM. XZ is the slowest for compression,
but still much faster than BZip2 for decompression, almost comparable
to GZip (see benchmarks below).
Some simple benchmarks show the pros and cons of using XZ compression;
starting with an already tarball'd archive of the repos listed below.
Memory usage seemed to be consistent for any given algorithm at their
respective default compression levels.
CPU: AMD Sempron 3400+ (1 core @ 1.8GHz with 256K L2 cache)
Virtual Memory Usage
GZip: 4152K BZip2: 13352K XZ: 102M
Linux 2.6 series (f5886c7f96f2542382d3a983c5f13e03d7fc5259) 349M
gzip 23.70s user 0.47s system 99% cpu 24.227 total 76M
gunzip 3.74s user 0.74s system 94% cpu 4.741 total
bzip2 130.96s user 0.53s system 99% cpu 2:11.97 total 59M
bunzip2 31.05s user 1.02s system 99% cpu 32.355 total
xz 448.78s user 0.91s system 99% cpu 7:31.28 total 51M
unxz 7.67s user 0.80s system 98% cpu 8.607 total
Git (0a53e9ddea) 11M
gzip 0.77s user 0.03s system 99% cpu 0.792 total 2.5M
gunzip 0.12s user 0.02s system 98% cpu 0.142 total
bzip2 3.42s user 0.02s system 99% cpu 3.454 total 2.1M
bunzip2 0.95s user 0.03s system 99% cpu 0.984 total
xz 12.88s user 0.14s system 98% cpu 13.239 total 1.9M
unxz 0.27s user 0.03s system 99% cpu 0.298 total
XZ (669413bb2db954bbfde3c4542fddbbab53891eb4) 1.8M
gzip 0.12s user 0.00s system 95% cpu 0.132 total 442K
gunzip 0.02s user 0.00s system 97% cpu 0.027 total
bzip2 1.28s user 0.01s system 99% cpu 1.298 total 363K
bunzip2 0.15s user 0.01s system 100% cpu 0.157 total
xz 1.62s user 0.03s system 99% cpu 1.652 total 347K
unxz 0.05s user 0.00s system 99% cpu 0.058 total
From a time and memory perspective, nothing compares to GZip, but if
given an average upload speed of 20KB/s, it would take ~400 seconds
longer to transfer the BZip2'd kernel snapshot than the XZ snapshot;
the transfer time difference is even greater between GZip and XZ. The
real time savings are relatively the same for all test cases, but less
dramatic for smaller repositories.
XZ decompresses ~1.8-2 times slower than GZip, and ~2.7-3.75 times
faster than BZip2; XZ gets relatively faster as snapshots get larger.
However, XZ takes relatively longer to compress as snapshots get larger.
The downside for XZ'd snapshots is the large CPU and memory load put on
the server to generate the compressed snapshot, though XZ will
eventually
have threading support, and the real clock time for making XZ'd
snapshots
would decrease if the server had a beefy multi-core CPU.
XZ compression is disabled by default to allow upgrades to take place
without any surprises, as the CPU and memory requirements will be an
issue for high load or lightweight servers. Also, the XZ format is still
new (format declared stable ~6 months ago), and there have been no
"stable" releases of the utils yet.
Signed-off-by: Mark Rada <marada@uwaterloo.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This includes instructions on how to disable a snapshot format and how
to add options to a snapshot format (e.g. setting the compression
level).
Signed-off-by: Mark Rada <marada@uwaterloo.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow Gitweb administrators to set a 'disabled' key in the
%known_snapshot_formats hash to disable a specific snapshot format.
All formats are enabled by default to maintain backwards compatibility.
Signed-off-by: Mark Rada <marada@uwaterloo.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
.. and simplify the ctx->size logic.
We now count the size in bytes, which means that 'lenW' was always just
the low 6 bits of the total size, so we don't carry it around separately
any more. And we do the 'size in bits' shift at the end.
Suggested by Nicolas Pitre and linux@horizon.com.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's an equivalent expression, but the '+' gives us some freedom in
instruction selection (for example, we can use 'lea' rather than 'add'),
and associates with the other additions around it to give some minor
scheduling freedom.
Suggested-by: linux@horizon.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid repeating the shared parts of the different rounds by adding a
macro layer or two. It was already more cpp than C.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The mozilla-SHA1 code did this 80-word array for the 80 iterations. But
the SHA1 state is really just 512 bits, and you can actually keep it in
a kind of "circular queue" of just 16 words instead.
This requires us to do the xor updates as we go along (rather than as a
pre-phase), but that's really what we want to do anyway.
This gets me really close to the OpenSSL performance on my Nehalem.
Look ma, all C code (ok, there's the rol/ror hack, but that one doesn't
strictly even matter on my Nehalem, it's just a local optimization).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helps a teeny bit. But what I -really- want to do is to avoid the
whole 80-array loop, and do the xor updates as I go along..
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bert Wesarg noticed non-x86 version of SHA_ROT() had a typo.
Also spell in-line assembly as __asm__(), otherwise I seem to get
error: implicit declaration of function 'asm' from my compiler.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the one with the smaller constant. It _can_ generate slightly
smaller code (a constant of 1 is special), but perhaps more importantly
it's possibly faster on any uarch that does a rotate with a loop.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Undo the change I picked up from the mailing list discussion suggested
by Nico, not because it is wrong, but it will be done at the end of the
follow-up series.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we find no refs that may be used for git-describe with the current
options, then die early instead of pointlessly walking the whole
history.
In git.git with all the tags dropped, this makes "git describe" go down
from 0.244 to 0.003 seconds for me. This is especially noticeable with
"git submodule status" which calls describe with increasing levels of
allowed refs to be matched. For a submodule without tags, this means
that it walks the whole history in the submodule twice (first annotated,
then plain tags), just to find out that it can't describe the commit
anyway.
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previous version expose the output of the plumbing update-index to the
user, which novice users have difficulty to understand.
We still need to run update-index to refresh the cache (if
diff.autorefreshindex is false, git diff won't do it).
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Explain briefly what characters are prohibited in tag <name>
and point to git-check-ref-format(1) manual page for
further information.
Signed-off-by: Jari Aalto <jari.aalto@cante.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If one thinks of a revision as the set of commits which can be reached
from the rev, and of ^rev as the complement, then multiple arguments to
git rev-list can be neither understood as the intersection nor the union
of the individual sets.
But set language is the natural as well as logical language in which to
phrase this. So, add a paragraph which explains multiple arguments using
set language.
Suggested-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new infrastructure to find and summarize changes in a single
string list, and rewrite wt_status_print_{updated,changed} functions using
it.
The goal of this change is to give more information on conflicted paths in
the status output.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
I have 4GB of RAM on my system which should, in theory, be quite enough
to repack a 600 MB repository. However the unbounded delta cache size
always pushes it into swap, at which point everything virtually comes to
a halt. So unbounded caches are never a good idea.
A default of 256MB should be a good compromize between memory usage and
speed where medium sized repositories are still likely to fit in the
cache with a reasonable memory usage, and larger repositories are going
to take quite some time to repack already anyway.
While at it, clarify the associated config variable documentation
entries a bit.
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --quiet is given, the user generally only wants to see
errors. So let's suppress printing the ref status table
unless there is an error, in which case we print out the
whole table.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pushing over the git protocol, pack-objects gives
progress reports about the pack being sent. If "push" is
given the --quiet flag, it now passes "-q" to pack-objects,
suppressing this output.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some transports produce output even without "--verbose"
turned on. This provides a way to tell them to be more
quiet (whereas simply redirecting might lose error
messages).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Based on the mozilla SHA1 routine, but doing the input data accesses a
word at a time and with 'htonl()' instead of loading bytes and shifting.
It requires an architecture that is ok with unaligned 32-bit loads and a
fast htonl().
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sb/parse-options:
prune-packed: migrate to parse-options
verify-pack: migrate to parse-options
verify-tag: migrate to parse-options
write-tree: migrate to parse-options
Call to_utf8 when parsing author and committer names, otherwise they will appear
with bad encoding if they written by using chop_and_escape_str.
Signed-off-by: Zoltán Füzesi <zfuzesi@eaglet.hu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option --merge was missing for submodule update and --cached for
submodule summary.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>