The ewah/ewok.h header pollutes the global namespace with
"BITS_IN_WORD", without any specific notion that we are
talking about the bits in an eword_t. We can give this the
more specific name "BITS_IN_EWORD".
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When buffer_grow changes the size of the buffer using realloc,
it first computes and saves the rlw pointer's offset into the
buffer using (uint8_t *) math before the realloc but then
restores it using (eword_t *) math.
In order to do this it's necessary to convert the (uint8_t *)
offset into an (eword_t *) offset. It was doing this by
dividing by the sizeof(size_t). Unfortunately sizeof(size_t)
is not same as sizeof(eword_t) on all platforms.
This causes illegal memory accesses and other bad things to
happen when attempting to use bitmaps on those platforms.
Fix this by dividing by the sizeof(eword_t) instead which
will always be correct for all platforms.
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
EWAH is a word-aligned compressed variant of a bitset (i.e. a data
structure that acts as a 0-indexed boolean array for many entries).
It uses a 64-bit run-length encoding (RLE) compression scheme,
trading some compression for better processing speed.
The goal of this word-aligned implementation is not to achieve
the best compression, but rather to improve query processing time.
As it stands right now, this EWAH implementation will always be more
efficient storage-wise than its uncompressed alternative.
EWAH arrays will be used as the on-disk format to store reachability
bitmaps for all objects in a repository while keeping reasonable sizes,
in the same way that JGit does.
This EWAH implementation is a mostly straightforward port of the
original `javaewah` library that JGit currently uses. The library is
self-contained and has been embedded whole (4 files) inside the `ewah`
folder to ease redistribution.
The library is re-licensed under the GPLv2 with the permission of Daniel
Lemire, the original author. The source code for the C version can
be found on GitHub:
https://github.com/vmg/libewok
The original Java implementation can also be found on GitHub:
https://github.com/lemire/javaewah
[jc: stripped debug-only code per Peff's $gmane/239768]
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>