19 Commits

Author SHA1 Message Date
Jeff King
c578e29ba0 bswap.h: drop unaligned loads
Our put_be32() routine and its variants (get_be32(), put_be64(), etc)
has two implementations: on some platforms we cast memory in place and
use ntohl()/htonl(), which can cause unaligned memory access. And on
others, we pick out the individual bytes using bitshifts.

This introduces extra complexity, and sometimes causes compilers to
generate warnings about type-punning. And it's not clear there's any
performance advantage.
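
For illustration, the two strategies can be sketched roughly as below.
This is not the code from bswap.h; the sketch_ names are hypothetical,
and <arpa/inet.h> is assumed to be available (POSIX).

```c
#include <stdint.h>
#include <arpa/inet.h>	/* ntohl(); assumes a POSIX system */

/*
 * Strategy A: cast the memory in place and byte-swap the whole word.
 * Only well-defined when ptr really points at an aligned uint32_t;
 * on arbitrary byte buffers this is the unaligned, type-punning load.
 */
static inline uint32_t sketch_get_be32_cast(const void *ptr)
{
	return ntohl(*(const uint32_t *)ptr);
}

/*
 * Strategy B: pick out the individual bytes with shifts. Works for
 * any alignment and any host byte order.
 */
static inline uint32_t sketch_get_be32_bytes(const void *ptr)
{
	const unsigned char *p = ptr;
	return (uint32_t)p[0] << 24 |
	       (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] <<  8 |
	       (uint32_t)p[3] <<  0;
}
```

Both return the same value for the same big-endian bytes; only the
second is safe to call on a pointer into the middle of a byte buffer.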

This split goes back to 660231aa97 (block-sha1: support for
architectures with memory alignment restrictions, 2009-08-12). The
unaligned versions were part of the original block-sha1 code in
d7c208a92e (Add new optimized C 'block-sha1' routines, 2009-08-05),
which says it is:

   Based on the mozilla SHA1 routine, but doing the input data accesses a
   word at a time and with 'htonl()' instead of loading bytes and shifting.

Back then, Linus provided timings versus the mozilla code which showed a
27% improvement:

  https://lore.kernel.org/git/alpine.LFD.2.01.0908051545000.3390@localhost.localdomain/

However, the unaligned loads were either not the useful part of that
speedup, or perhaps compilers and processors have changed since then.
Here are times for computing the sha1 of 4GB of random data, with and
without -DNO_UNALIGNED_LOADS (and BLK_SHA1=1, of course). This is with
gcc 10, -O2, and the processor is a Core i9-9880H.

  [stock]
  Benchmark #1: t/helper/test-tool sha1 <foo.rand
    Time (mean ± σ):      6.638 s ±  0.081 s    [User: 6.269 s, System: 0.368 s]
    Range (min … max):    6.550 s …  6.841 s    10 runs

  [-DNO_UNALIGNED_LOADS]
  Benchmark #1: t/helper/test-tool sha1 <foo.rand
    Time (mean ± σ):      6.418 s ±  0.015 s    [User: 6.058 s, System: 0.360 s]
    Range (min … max):    6.394 s …  6.447 s    10 runs

And here's the same test run on an AMD A8-7600, using gcc 8.

  [stock]
  Benchmark #1: t/helper/test-tool sha1 <foo.rand
    Time (mean ± σ):     11.721 s ±  0.113 s    [User: 10.761 s, System: 0.951 s]
    Range (min … max):   11.509 s … 11.861 s    10 runs

  [-DNO_UNALIGNED_LOADS]
  Benchmark #1: t/helper/test-tool sha1 <foo.rand
    Time (mean ± σ):     11.744 s ±  0.066 s    [User: 10.807 s, System: 0.928 s]
    Range (min … max):   11.637 s … 11.863 s    10 runs

So the unaligned loads don't seem to help at all: they are slightly
slower on the Intel machine and within the noise on the AMD one. It's
possible there are platforms where they provide more benefit, but:

  - the non-x86 platforms for which we use this code are old and obscure
    (powerpc and s390).

  - the main caller that cares about performance is block-sha1. But
    these days it is rarely used anyway, in favor of sha1dc (which is
    already much slower, and nobody seems to have cared that much).

Let's just drop unaligned versions entirely in the name of simplicity.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 12:30:09 -07:00
Jeff King
33aa579a55 compat/bswap: add include header guards
Our compat/bswap.h lacks the usual preprocessor guards against multiple
inclusion. This usually isn't an issue since it only gets included from
git-compat-util.h, which has its own guards. But it would produce
redeclaration errors if any file included it separately.

Our hdr-check target would complain about this, except that it currently
skips items in compat/ entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-07 07:42:14 +09:00
Ben Peart
b2e39d0067 bswap: add 64 bit endianness helper get_be64
Add a new get_be64 macro to enable 64 bit endian conversions on memory
that may or may not be aligned.
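
A minimal sketch of such a helper, written here as an inline function
rather than the macro this commit adds (the name sketch_get_be64 is
hypothetical):

```c
#include <stdint.h>

/*
 * Read a 64-bit big-endian value one byte at a time, so the pointer
 * does not need any particular alignment.
 */
static inline uint64_t sketch_get_be64(const void *ptr)
{
	const unsigned char *p = ptr;
	return (uint64_t)p[0] << 56 |
	       (uint64_t)p[1] << 48 |
	       (uint64_t)p[2] << 40 |
	       (uint64_t)p[3] << 32 |
	       (uint64_t)p[4] << 24 |
	       (uint64_t)p[5] << 16 |
	       (uint64_t)p[6] <<  8 |
	       (uint64_t)p[7] <<  0;
}
```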

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24 10:39:37 +09:00
René Scharfe
5b114f3bb0 bswap: convert get_be16, get_be32 and put_be32 to inline functions
Simplify the implementation and allow callers to use expressions with
side-effects by turning the macros get_be16, get_be32 and put_be32 into
inline functions.
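
To illustrate the side-effect problem, here is a rough sketch with
hypothetical names (neither definition is copied from bswap.h). The
macro mentions its argument twice, so an argument like "p++" would be
evaluated twice; the function evaluates it exactly once.

```c
#include <stdint.h>

/* Macro form: (p) is expanded twice, so SKETCH_GET_BE16_MACRO(p++) is unsafe. */
#define SKETCH_GET_BE16_MACRO(p) \
	((uint16_t)(((const unsigned char *)(p))[0] << 8 | \
		    ((const unsigned char *)(p))[1]))

/* Function form: the argument is evaluated once, before the call. */
static inline uint16_t sketch_get_be16(const void *ptr)
{
	const unsigned char *p = ptr;
	return (uint16_t)(p[0] << 8 | p[1]);
}
```

With the function, a caller can write sketch_get_be16(p++) and rely on
p advancing by exactly one.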

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:54:15 -07:00
René Scharfe
7780af1e8e bswap: convert to unsigned before shifting in get_be32
The pointer p is dereferenced and we get an unsigned char.  Before
shifting it's automatically promoted to int.  Left-shifting a signed
32-bit value bigger than 127 by 24 places is undefined.  Explicitly
convert to a 32-bit unsigned type to avoid undefined behaviour if
the highest bit is set.
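
A small sketch of the fix, assuming a 32-bit int (the helper name is
hypothetical):

```c
#include <stdint.h>

/*
 * Writing "b << 24" would first promote b to a signed int; for
 * b > 127 the result does not fit in that int, which is undefined.
 * Converting to uint32_t before shifting keeps the arithmetic unsigned.
 */
static inline uint32_t sketch_top_byte(unsigned char b)
{
	return (uint32_t)b << 24;
}
```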

Found with Clang's UBSan.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:54:13 -07:00
Jeff King
a0df2e5a7e bswap: add NO_UNALIGNED_LOADS define
The byte-swapping code automatically decides, based on the
platform, whether it is sensible to cast and do a potentially
unaligned ntohl(), or to pick individual bytes out of an
array.

It can be handy to override this decision, though, when
turning on compiler flags that will complain about unaligned
loads (such as -fsanitize=undefined). This patch adds a
macro check to make this possible.
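
Roughly, the check has the shape sketched below. The platform list is
abbreviated and illustrative, and sketch_get_be32 is a hypothetical
name, not the definition in bswap.h.

```c
#include <stdint.h>

#if !defined(NO_UNALIGNED_LOADS) && \
    (defined(__i386__) || defined(__x86_64__))
# include <arpa/inet.h>	/* ntohl(); assumes a POSIX system */
/* Cast-in-place path: a potentially unaligned load. */
static inline uint32_t sketch_get_be32(const void *ptr)
{
	return ntohl(*(const uint32_t *)ptr);
}
#else
/* Byte-picking path: chosen on other platforms, or when overridden. */
static inline uint32_t sketch_get_be32(const void *ptr)
{
	const unsigned char *p = ptr;
	return (uint32_t)p[0] << 24 |
	       (uint32_t)p[1] << 16 |
	       (uint32_t)p[2] <<  8 |
	       (uint32_t)p[3] <<  0;
}
#endif
```

Building with -DNO_UNALIGNED_LOADS forces the second definition
regardless of platform.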

There's no nice Makefile knob here; this is for prodding at
Git's internals, and anybody using it can set
"-DNO_UNALIGNED_LOADS" in the same place they are setting up
"-fsanitize".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-04 09:51:33 -08:00
David Michael
bfb0e6fcd2 compat/bswap.h: detect endianness from XL C compiler macros
There is no /usr/include/endian.h equivalent on z/OS, but the
compiler will define macros to indicate endianness on host and
target hardware.  This adds a test for these macros as a last
resort for determining byte order.

Signed-off-by: David Michael <fedora.dm0@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-27 11:51:12 -07:00
Ben Walton
9c65ee15ee compat/bswap.h: fix endianness detection
The changes to make detection of endianness more portable had a bug
that breaks on (at least) Solaris x86.

The bug appears to be a simple copy/paste typo. Both tests, the one
that decides the system is big endian and the one that decides it is
little endian, check for _BIG_ENDIAN and not _LITTLE_ENDIAN. Instead,
the second test should be for _LITTLE_ENDIAN and not _BIG_ENDIAN.

Two fixes were possible:

 1. Change the negation order of the conditions in the second test.
 2. Reverse the order of the conditions in the second test.

Use the second option so that the condition we expect is always a
positive check.
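
As a sketch of the corrected shape (the SKETCH_ macros are hypothetical
stand-ins for whatever the header actually defines, and the "unknown"
fallback is an assumption made so the fragment is self-contained):

```c
#define SKETCH_LITTLE_ENDIAN 1234
#define SKETCH_BIG_ENDIAN    4321

/*
 * The buggy second test repeated the first one:
 *   defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
 * Option 1 would have been:
 *   !defined(_BIG_ENDIAN) && defined(_LITTLE_ENDIAN)
 * Option 2, used here, keeps the expected macro as the positive check.
 */
#if defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
# define SKETCH_BYTE_ORDER SKETCH_BIG_ENDIAN
#elif defined(_LITTLE_ENDIAN) && !defined(_BIG_ENDIAN)
# define SKETCH_BYTE_ORDER SKETCH_LITTLE_ENDIAN
#else
# define SKETCH_BYTE_ORDER 0	/* unknown from these macros alone */
#endif
```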

Signed-off-by: Ben Walton <bdwalton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-30 11:48:28 -07:00
Junio C Hamano
839fa9c500 compat/bswap.h: restore preference __BIG_ENDIAN over BIG_ENDIAN
The previous commit swaps the order in which we check the macros
defined by the compiler and the system headers, relative to the
original.  The order of the checks should not matter (i.e. it is
insane to define both __BIG_ENDIAN and friends and BIG_ENDIAN and
friends in a conflicting way), so the most conservative thing to do
is not to change it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-02 12:36:10 -07:00
Charles Bailey
3cf6bb3406 compat/bswap.h: detect endianness on more platforms that don't use BYTE_ORDER
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-02 12:31:59 -07:00
Jeff King
c3d8da571f read-cache: use get_be32 instead of hand-rolled ntoh_l
Commit d60c49c (read-cache.c: allow unaligned mapping of the
index file, 2012-04-03) introduced helpers to access
unaligned data. However, we already have get_be32, which has
a few advantages:

  1. It's already written, so we avoid duplication.

  2. It's probably faster, since it does the endian
     conversion and the alignment fix at the same time.

  3. The get_be32 code is well-tested, having been in
     block-sha1 for a long time. By contrast, our custom
     helpers were probably almost never used, since the user
     needed to manually define a macro to enable them.

We have to add a get_be16 implementation to the existing
get_be32, but that is very simple to do.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-23 14:03:48 -08:00
Jeff King
802b123366 block-sha1: factor out get_be and put_be wrappers
The BLK_SHA1 code has optimized wrappers for doing endian
conversions on memory that may not be aligned. Let's pull
them out so that we can use them elsewhere, especially the
time-tested list of platforms that prefer each strategy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-23 14:03:21 -08:00
Vicent Marti
7e3dae4943 compat: add endianness helpers
The POSIX standard doesn't currently define a `ntohll`/`htonll`
function pair to perform network-to-host and host-to-network
swaps of 64-bit data. These 64-bit swaps are necessary for the on-disk
storage of EWAH bitmaps if they are not in native byte order.
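
A portable sketch of such a pair follows. The names are hypothetical,
and the runtime endianness probe is an assumption made to keep the
example self-contained; the real header decides at compile time.

```c
#include <stdint.h>

/* Reverse the eight bytes of a 64-bit value. */
static inline uint64_t sketch_bswap64(uint64_t x)
{
	return ((x & 0x00000000000000ffULL) << 56) |
	       ((x & 0x000000000000ff00ULL) << 40) |
	       ((x & 0x0000000000ff0000ULL) << 24) |
	       ((x & 0x00000000ff000000ULL) <<  8) |
	       ((x & 0x000000ff00000000ULL) >>  8) |
	       ((x & 0x0000ff0000000000ULL) >> 24) |
	       ((x & 0x00ff000000000000ULL) >> 40) |
	       ((x & 0xff00000000000000ULL) >> 56);
}

/* True when the lowest-addressed byte holds the least significant bits. */
static inline int sketch_host_is_little_endian(void)
{
	const uint16_t probe = 1;
	return *(const unsigned char *)&probe == 1;
}

/* Network (big-endian) to host order; a no-op on big-endian hosts. */
static inline uint64_t sketch_ntohll(uint64_t n)
{
	return sketch_host_is_little_endian() ? sketch_bswap64(n) : n;
}

/* Host to network order is the same operation in the other direction. */
static inline uint64_t sketch_htonll(uint64_t h)
{
	return sketch_ntohll(h);
}
```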

Many thanks to Ramsay Jones <ramsay@ramsay1.demon.co.uk> and
Torsten Bögershausen <tboegi@web.de> for cygwin/mingw/msvc
portability fixes.

Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-18 10:57:42 -08:00
Jonathan Nieder
c6c8d0b797 compat: make gcc bswap an inline function
Without this change, gcc -pedantic warns:

 cache.h: In function 'ce_to_dtype':
 cache.h:270:21: warning: ISO C forbids braced-groups within expressions [-pedantic]

An inline function is more readable anyway.
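
To show the difference in form, a rough sketch (hypothetical names, and
a plain shift-based body instead of whatever the gcc-specific code
does):

```c
#include <stdint.h>

/*
 * The macro form relied on a GNU "statement expression", a braced
 * group used as a value, which is what -pedantic complains about:
 *
 *   #define SKETCH_BSWAP32(x) ({ uint32_t res_ = ...; res_; })
 *
 * The inline-function form below is plain ISO C.
 */
static inline uint32_t sketch_bswap32(uint32_t x)
{
	return (x >> 24) |
	       ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) |
	       (x << 24);
}
```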

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-16 12:44:59 -07:00
Holger Weiß
21e403a7b9 Don't redefine htonl and ntohl on big-endian
Since commit 0fcabdeb52, compat/bswap.h
redefined htonl and ntohl to bswap32 not only if bswap32 has been
defined earlier in compat/bswap.h (which is done only on selected
platforms), but also if bswap32 has been defined anywhere else.  This
broke Git at least for NetBSD systems running on big-endian machines
(where ntohl and htonl should, of course, be NOOPs), since NetBSD
defines a bswap32 macro in the system headers.

So, we now undefine any previously defined bswap32 in compat/bswap.h
before defining our own.

Signed-off-by: Holger Weiß <holger@zedat.fu-berlin.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-31 14:21:39 -07:00
Benjamin Kramer
b073b7a990 Explicitly truncate bswap operand to uint32_t
There are some places in git where a long is passed to htonl/ntohl. llvm
intentionally doesn't support matching operands of different bit widths.
This patch fixes the build with llvm-gcc (and clang) on x86_64.

Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-25 00:16:38 -08:00
Sebastian Schuberth
0fcabdeb52 Use faster byte swapping when compiling with MSVC
When compiling with MSVC on x86-compatible platforms, use an intrinsic for
byte swapping. In contrast to the GCC path, we do not prefer inline assembly
here as it is not supported for the x64 platform.
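
A sketch of the idea, assuming the intrinsic in question is MSVC's
_byteswap_ulong() (the message above does not name it) and using a
hypothetical wrapper name with a portable fallback:

```c
#include <stdint.h>
#include <stdlib.h>	/* declares _byteswap_ulong() under MSVC */

static inline uint32_t sketch_bswap32(uint32_t x)
{
#if defined(_MSC_VER)
	/* Compiler intrinsic; no inline assembly needed on x86 or x64. */
	return _byteswap_ulong(x);
#else
	/* Portable fallback for other compilers. */
	return (x >> 24) |
	       ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) |
	       (x << 24);
#endif
}
```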

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-30 09:37:48 -07:00
Ramsay Jones
5322ef2006 Fix some printf format warnings
commit 51ea551 ("make sure byte swapping is optimal for git"
2009-08-18) introduced a "sane definition for ntohl()/htonl()"
for use on some GNU C platforms. Unfortunately, on some of these
platforms this introduces a problem that is essentially the reverse
of the one that commit 6e1c234 ("Fix some warnings (on cygwin) to
allow -Werror" 2008-07-03) was intended to fix.

In particular, on platforms where the uint32_t type is defined
to be unsigned long, the return type of the new ntohl()/htonl()
is causing gcc to issue printf format warnings, such as:

    warning: long unsigned int format, unsigned int arg (arg 3)

(nine such warnings, covering six different files). The earlier
commit (6e1c234) needed to suppress these same warnings, except
that the types were in the opposite direction; namely the format
specifier ("%u") was 'unsigned int' and the argument type (ie the
return type of ntohl()) was 'long unsigned int' (aka uint32_t).

In order to suppress these warnings, the earlier commit used the
(C99) PRIu32 format specifier, since the definition of this macro
is suitable for use with the uint32_t type on that platform.
This worked because the return type of the (original) platform
ntohl()/htonl() functions was uint32_t.

In order to suppress these warnings, we change the return type of
the new byte swapping functions in the compat/bswap.h header file
from 'unsigned int' to uint32_t.
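
For illustration, a minimal sketch of why the return type and the
format specifier need to agree (the helper name and message text are
hypothetical):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * PRIu32 expands to the right conversion for uint32_t on the current
 * platform, whether uint32_t is 'unsigned int' or 'unsigned long'.
 * Passing a uint32_t therefore never trips the printf format check.
 */
static int sketch_format_count(char *buf, size_t len, uint32_t count)
{
	return snprintf(buf, len, "count: %" PRIu32, count);
}
```

A byte-swapping helper that returns uint32_t can be passed straight to
such a call; one that returns 'unsigned int' draws the warning on
platforms where uint32_t is 'unsigned long'.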

Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Jeff King <peff@peff.net>
2009-10-02 03:32:51 -04:00
Nicolas Pitre
51ea55190b make sure byte swapping is optimal for git
We rely on ntohl() and htonl() to perform byte swapping in many places.
However, some platforms have libraries providing really poor
implementations of those which might cause significant performance
issues, especially with the block-sha1 code.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-18 14:16:37 -07:00