Merge branch 'jk/pack-bitmap'

Borrow the bitmap index into packfiles from JGit to speed up
enumeration of objects involved in a commit range without having to
fully traverse the history.

* jk/pack-bitmap: (26 commits)
  ewah: unconditionally ntohll ewah data
  ewah: support platforms that require aligned reads
  read-cache: use get_be32 instead of hand-rolled ntoh_l
  block-sha1: factor out get_be and put_be wrappers
  do not discard revindex when re-preparing packfiles
  pack-bitmap: implement optional name_hash cache
  t/perf: add tests for pack bitmaps
  t: add basic bitmap functionality tests
  count-objects: recognize .bitmap in garbage-checking
  repack: consider bitmaps when performing repacks
  repack: handle optional files created by pack-objects
  repack: turn exts array into array-of-struct
  repack: stop using magic number for ARRAY_SIZE(exts)
  pack-objects: implement bitmap writing
  rev-list: add bitmap mode to speed up object lists
  pack-objects: use bitmaps when packing objects
  pack-objects: split add_object_entry
  pack-bitmap: add support for bitmap indexes
  documentation: add documentation for the bitmap format
  ewah: compressed bitmap implementation
  ...
commit 0f9e62e084
@@ -1870,6 +1870,31 @@ pack.packSizeLimit::
 	Common unit suffixes of 'k', 'm', or 'g' are
 	supported.
 
+pack.useBitmaps::
+	When true, git will use pack bitmaps (if available) when packing
+	to stdout (e.g., during the server side of a fetch). Defaults to
+	true. You should not generally need to turn this off unless
+	you are debugging pack bitmaps.
+
+pack.writebitmaps::
+	When true, git will write a bitmap index when packing all
+	objects to disk (e.g., when `git repack -a` is run). This
+	index can speed up the "counting objects" phase of subsequent
+	packs created for clones and fetches, at the cost of some disk
+	space and extra time spent on the initial repack. Defaults to
+	false.
+
+pack.writeBitmapHashCache::
+	When true, git will include a "hash cache" section in the bitmap
+	index (if one is written). This cache can be used to feed git's
+	delta heuristics, potentially leading to better deltas between
+	bitmapped and non-bitmapped objects (e.g., when serving a fetch
+	between an older, bitmapped pack and objects that have been
+	pushed since the last gc). The downside is that it consumes 4
+	bytes per object of disk space, and that JGit's bitmap
+	implementation does not understand it, causing it to complain if
+	Git and JGit are used on the same repository. Defaults to false.
+
 pager.<cmd>::
 	If the value is boolean, turns on or off pagination of the
 	output of a particular Git subcommand when writing to a tty.
@@ -9,7 +9,7 @@ git-repack - Pack unpacked objects in a repository
 SYNOPSIS
 --------
 [verse]
-'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [--window=<n>] [--depth=<n>]
+'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [--window=<n>] [--depth=<n>]
 
 DESCRIPTION
 -----------
@@ -110,6 +110,13 @@ other objects in that pack they already have locally.
 	The default is unlimited, unless the config variable
 	`pack.packSizeLimit` is set.
 
+-b::
+--write-bitmap-index::
+	Write a reachability bitmap index as part of the repack. This
+	only makes sense when used with `-a` or `-A`, as the bitmaps
+	must be able to refer to all reachable objects. This option
+	overrides the setting of `pack.writebitmaps`.
+
 
 Configuration
 -------------
@@ -55,6 +55,7 @@ SYNOPSIS
 	     [ \--reverse ]
 	     [ \--walk-reflogs ]
 	     [ \--no-walk ] [ \--do-walk ]
+	     [ \--use-bitmap-index ]
 	     <commit>... [ \-- <paths>... ]
 
 DESCRIPTION
@@ -257,6 +257,14 @@ See also linkgit:git-reflog[1].
 	Output excluded boundary commits. Boundary commits are
 	prefixed with `-`.
 
+ifdef::git-rev-list[]
+--use-bitmap-index::
+
+	Try to speed up the traversal using the pack bitmap index (if
+	one is available). Note that when traversing with `--objects`,
+	trees and blobs will not have their associated path printed.
+endif::git-rev-list[]
+
 --
 
 History Simplification
164
Documentation/technical/bitmap-format.txt
Normal file
@@ -0,0 +1,164 @@
+GIT bitmap v1 format
+====================
+
+	- A header appears at the beginning:
+
+		4-byte signature: {'B', 'I', 'T', 'M'}
+
+		2-byte version number (network byte order)
+			The current implementation only supports version 1
+			of the bitmap index (the same one as JGit).
+
+		2-byte flags (network byte order)
+
+			The following flags are supported:
+
+			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
+			This flag must always be present. It implies that the bitmap
+			index has been generated for a packfile with full closure
+			(i.e. where every single object in the packfile can find
+			 its parent links inside the same packfile). This is a
+			requirement for the bitmap index format, also present in JGit,
+			that greatly reduces the complexity of the implementation.
+
+			- BITMAP_OPT_HASH_CACHE (0x4)
+			If present, the end of the bitmap file contains
+			`N` 32-bit name-hash values, one per object in the
+			pack. The format and meaning of the name-hash is
+			described below.
+
+		4-byte entry count (network byte order)
+
+			The total count of entries (bitmapped commits) in this bitmap index.
+
+		20-byte checksum
+
+			The SHA1 checksum of the pack this bitmap index belongs to.
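As an illustration of the header layout just described, here is a minimal C sketch that reads the 32-byte header (4 + 2 + 2 + 4 + 20 bytes) from a memory buffer. The names `bitmap_header_view`, `read_be16`, `read_be32` and `parse_bitmap_header` are hypothetical and are not taken from this commit; only the field order, the byte order and the flag values come from the format text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITMAP_OPT_FULL_DAG   0x1
#define BITMAP_OPT_HASH_CACHE 0x4
#define BITMAP_HEADER_SIZE    32 /* 4 + 2 + 2 + 4 + 20 bytes */

/* Hypothetical in-memory view of the header; not a struct from git. */
struct bitmap_header_view {
	uint16_t version;
	uint16_t flags;
	uint32_t entry_count;
	unsigned char pack_checksum[20];
};

/* Read a 16-bit value stored in network (big-endian) byte order. */
static uint16_t read_be16(const unsigned char *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a 32-bit value stored in network (big-endian) byte order. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Return 0 on success, -1 if the buffer is not a usable v1 header. */
int parse_bitmap_header(const unsigned char *buf, size_t len,
			struct bitmap_header_view *out)
{
	assert(buf && out);
	if (len < BITMAP_HEADER_SIZE)
		return -1;
	if (memcmp(buf, "BITM", 4) != 0)
		return -1;
	out->version = read_be16(buf + 4);
	out->flags = read_be16(buf + 6);
	if (out->version != 1)
		return -1; /* only version 1 is described */
	if (!(out->flags & BITMAP_OPT_FULL_DAG))
		return -1; /* the REQUIRED flag is missing */
	out->entry_count = read_be32(buf + 8);
	memcpy(out->pack_checksum, buf + 12, 20);
	return 0;
}
```

A caller would test `flags & BITMAP_OPT_HASH_CACHE` afterwards to decide whether to look for the optional name-hash section.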
+
+	- 4 EWAH bitmaps that act as type indexes
+
+		Type indexes are serialized after the hash cache in the shape
+		of four EWAH bitmaps stored consecutively (see Appendix A for
+		the serialization format of an EWAH bitmap).
+
+		There is a bitmap for each Git object type, stored in the following
+		order:
+
+			- Commits
+			- Trees
+			- Blobs
+			- Tags
+
+		In each bitmap, the `n`th bit is set to true if the `n`th object
+		in the packfile is of that type.
+
+		The obvious consequence is that the OR of all 4 bitmaps will result
+		in a full set (all bits set), and the AND of all 4 bitmaps will
+		result in an empty bitmap (no bits set).
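The type-index property can be sketched in C as follows, assuming the four bitmaps have already been decompressed into plain arrays of 64-bit words with bit 0 being the lowest bit of word 0. The names `sketch_type`, `sketch_bit`, `sketch_type_of` and `sketch_types_consistent` are hypothetical. The check used here, exactly one type per object, is slightly stronger than the OR/AND statement above and implies it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels, listed in the on-disk order of the type bitmaps. */
enum sketch_type {
	SKETCH_INVALID = -1,
	SKETCH_COMMIT = 0,
	SKETCH_TREE = 1,
	SKETCH_BLOB = 2,
	SKETCH_TAG = 3
};

/* Test bit n of an uncompressed bitmap. */
static int sketch_bit(const uint64_t *words, size_t n)
{
	return (int)((words[n / 64] >> (n % 64)) & 1);
}

/*
 * Return the type of the n-th object in pack order, or SKETCH_INVALID
 * if the bitmaps do not mark it with exactly one type.
 */
enum sketch_type sketch_type_of(const uint64_t *types[4], size_t n)
{
	int found = SKETCH_INVALID;
	int t;

	assert(types);
	for (t = 0; t < 4; t++) {
		if (!sketch_bit(types[t], n))
			continue;
		if (found != SKETCH_INVALID)
			return SKETCH_INVALID; /* two types claim this object */
		found = t;
	}
	return (enum sketch_type)found;
}

/* Return 1 if every one of nr_objects objects has exactly one type. */
int sketch_types_consistent(const uint64_t *types[4], size_t nr_objects)
{
	size_t n;

	for (n = 0; n < nr_objects; n++)
		if (sketch_type_of(types, n) == SKETCH_INVALID)
			return 0;
	return 1;
}
```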
+
+	- N entries with compressed bitmaps, one for each indexed commit
+
+		Where `N` is the total amount of entries in this bitmap index.
+		Each entry contains the following:
+
+		- 4-byte object position (network byte order)
+			The position **in the index for the packfile** where the
+			bitmap for this commit is found.
+
+		- 1-byte XOR-offset
+			The xor offset used to compress this bitmap. For an entry
+			in position `x`, a XOR offset of `y` means that the actual
+			bitmap representing this commit is composed by XORing the
+			bitmap for this entry with the bitmap in entry `x-y` (i.e.
+			the bitmap `y` entries before this one).
+
+			Note that this compression can be recursive. In order to
+			XOR this entry with a previous one, the previous entry needs
+			to be decompressed first, and so on.
+
+			The hard-limit for this offset is 160 (an entry can only be
+			xor'ed against one of the 160 entries preceding it). This
+			number is always positive, and hence entries are always xor'ed
+			with **previous** bitmaps, not bitmaps that will come afterwards
+			in the index.
+
+		- 1-byte flags for this bitmap
+			At the moment the only available flag is `0x1`, which hints
+			that this bitmap can be re-used when rebuilding bitmap indexes
+			for the repository.
+
+		- The compressed bitmap itself, see Appendix A.
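The XOR-offset rule for entries can be illustrated with a small C sketch. It assumes each entry's bitmap has already been expanded from EWAH into a fixed number of plain 64-bit words, which is a simplification made only for this example. `sketch_entry`, `SKETCH_WORDS` and `resolve_xor_chain` are hypothetical names. Because an offset always points backwards, resolving the entries in index order handles the recursive case without actual recursion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WORDS   2   /* hypothetical fixed size: 128 bits per bitmap */
#define MAX_XOR_OFFSET 160 /* hard limit stated in the format description */

/* Hypothetical, already EWAH-decoded form of one bitmap entry. */
struct sketch_entry {
	uint8_t xor_offset;              /* 0 means "stored as-is" */
	uint64_t stored[SKETCH_WORDS];   /* bits as found in the entry */
	uint64_t resolved[SKETCH_WORDS]; /* actual bitmap for the commit */
};

/*
 * Resolve all entries in index order. Entry x with offset y is XORed
 * against the resolved bitmap of entry x - y, which has already been
 * processed because y is always positive. Returns 0 on success and -1
 * on an offset that is too large or points before the first entry.
 */
int resolve_xor_chain(struct sketch_entry *entries, size_t nr)
{
	size_t x, w;

	assert(entries || nr == 0);
	for (x = 0; x < nr; x++) {
		struct sketch_entry *e = &entries[x];
		size_t y = e->xor_offset;

		if (y > MAX_XOR_OFFSET || y > x)
			return -1;
		for (w = 0; w < SKETCH_WORDS; w++) {
			e->resolved[w] = e->stored[w];
			if (y)
				e->resolved[w] ^= entries[x - y].resolved[w];
		}
	}
	return 0;
}
```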
+
+== Appendix A: Serialization format for an EWAH bitmap
+
+Ewah bitmaps are serialized in the same protocol as the JAVAEWAH
+library, making them backwards compatible with the JGit
+implementation:
+
+	- 4-byte number of bits of the resulting UNCOMPRESSED bitmap
+
+	- 4-byte number of words of the COMPRESSED bitmap, when stored
+
+	- N x 8-byte words, as specified by the previous field
+
+		This is the actual content of the compressed bitmap.
+
+	- 4-byte position of the current RLW for the compressed
+		bitmap
+
+All words are stored in network byte order for their corresponding
+sizes.
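A minimal C sketch of reading one serialized EWAH bitmap, following the four fields listed above. The names `ewah_view`, `load_be32`, `ewah_view_word` and `parse_serialized_ewah` are hypothetical and do not come from the `ewah/` sources in this commit. The sketch only locates the fields and converts byte order; it does not interpret the compressed words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view over one serialized EWAH bitmap held in memory. */
struct ewah_view {
	uint32_t bit_size;          /* bits in the UNCOMPRESSED bitmap */
	uint32_t word_count;        /* 64-bit words in the COMPRESSED form */
	const unsigned char *words; /* word_count big-endian 64-bit words */
	uint32_t rlw_position;      /* position of the current RLW */
	size_t serialized_size;     /* total bytes occupied in the buffer */
};

static uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Fetch compressed word i, converting from network byte order. */
uint64_t ewah_view_word(const struct ewah_view *v, uint32_t i)
{
	const unsigned char *p;
	uint64_t w = 0;
	int k;

	assert(v && i < v->word_count);
	p = v->words + (size_t)i * 8;
	for (k = 0; k < 8; k++)
		w = (w << 8) | p[k];
	return w;
}

/* Return 0 on success, -1 if the buffer is too short for the bitmap. */
int parse_serialized_ewah(const unsigned char *buf, size_t len,
			  struct ewah_view *out)
{
	uint64_t needed;

	assert(buf && out);
	if (len < 8)
		return -1;
	out->bit_size = load_be32(buf);
	out->word_count = load_be32(buf + 4);
	/* two 4-byte counts, the words, then the 4-byte RLW position */
	needed = 8 + (uint64_t)out->word_count * 8 + 4;
	if (needed > len)
		return -1;
	out->words = buf + 8;
	out->rlw_position = load_be32(buf + 8 + (size_t)out->word_count * 8);
	out->serialized_size = (size_t)needed;
	return 0;
}
```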
+
+The compressed bitmap is stored in a form of run-length encoding, as
+follows. It consists of a concatenation of an arbitrary number of
+chunks. Each chunk consists of one or more 64-bit words
+
+     H  L_1  L_2  L_3 .... L_M
+
+H is called RLW (run length word). It consists of (from lower to higher
+order bits):
+
+     - 1 bit: the repeated bit B
+
+     - 32 bits: repetition count K (unsigned)
+
+     - 31 bits: literal word count M (unsigned)
+
+The bitstream represented by the above chunk is then:
+
+     - K repetitions of B
+
+     - The bits stored in `L_1` through `L_M`. Within a word, bits at
+       lower order come earlier in the stream than those at higher
+       order.
+
+The next word after `L_M` (if any) must again be a RLW, for the next
+chunk. For efficient appending to the bitstream, the EWAH stores a
+pointer to the last RLW in the stream.
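The chunk layout can be made concrete with a short C sketch that splits an RLW into its three fields and expands a compressed word stream into plain 64-bit words. The function names are hypothetical. One assumption is labeled in the code: the repetition count K is treated as a number of whole 64-bit words filled with bit B, which is how word-aligned EWAH implementations usually count runs; the text above does not state the unit explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0: the repeated bit B. */
static int rlw_run_bit(uint64_t rlw)
{
	return (int)(rlw & 1);
}

/* Bits 1..32: repetition count K. */
static uint32_t rlw_run_len(uint64_t rlw)
{
	return (uint32_t)((rlw >> 1) & 0xFFFFFFFFu);
}

/* Bits 33..63: literal word count M. */
static uint32_t rlw_literal_count(uint64_t rlw)
{
	return (uint32_t)(rlw >> 33);
}

/*
 * Expand a compressed stream (already converted to host byte order)
 * into uncompressed words. ASSUMPTION: K counts whole 64-bit words.
 * Stores the number of words produced in *out_words. Returns 0 on
 * success, -1 if the input is truncated or the output is too small.
 */
int ewah_sketch_decode(const uint64_t *in, size_t in_words,
		       uint64_t *out, size_t out_cap, size_t *out_words)
{
	size_t i = 0, o = 0;

	assert(in && out && out_words);
	while (i < in_words) {
		uint64_t rlw = in[i++];
		uint64_t fill = rlw_run_bit(rlw) ? ~(uint64_t)0 : 0;
		uint32_t k = rlw_run_len(rlw);
		uint32_t m = rlw_literal_count(rlw);
		uint32_t n;

		if (k > out_cap - o)
			return -1;
		for (n = 0; n < k; n++)
			out[o++] = fill;          /* K repetitions of B */
		if (m > in_words - i || m > out_cap - o)
			return -1;
		for (n = 0; n < m; n++)
			out[o++] = in[i++];       /* literals L_1 .. L_M */
	}
	*out_words = o;
	return 0;
}
```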
+
+
+== Appendix B: Optional Bitmap Sections
+
+These sections may or may not be present in the `.bitmap` file; their
+presence is indicated by the header flags section described above.
+
+Name-hash cache
+---------------
+
+If the BITMAP_OPT_HASH_CACHE flag is set, the end of the bitmap contains
+a cache of 32-bit values, one per object in the pack. The value at
+position `i` is the hash of the pathname at which the `i`th object
+(counting in index order) in the pack can be found. This can be fed
+into the delta heuristics to compare objects with similar pathnames.
+
+The hash algorithm used is:
+
+    hash = 0;
+    while ((c = *name++))
+	    if (!isspace(c))
+		    hash = (hash >> 2) + (c << 24);
+
+Note that this hashing scheme is tied to the BITMAP_OPT_HASH_CACHE flag.
+If implementations want to choose a different hashing scheme, they are
+free to do so, but MUST allocate a new header flag (because comparing
+hashes made under two different schemes would be pointless).
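The hash fragment above leaves the types of `c` and `hash` implicit. A complete C function might look like the sketch below, assuming both are unsigned 32-bit values and each input byte is read as `unsigned char`. The name `sketch_name_hash` is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

/*
 * Hash a pathname as in the fragment above: whitespace is skipped, and
 * each other byte is added into the top 8 bits after shifting the
 * running value right by two.
 */
uint32_t sketch_name_hash(const char *name)
{
	uint32_t c, hash = 0;

	assert(name);
	while ((c = (unsigned char)*name++) != 0) {
		if (isspace((int)c))
			continue;
		hash = (hash >> 2) + (c << 24);
	}
	return hash;
}
```

Each new character lands in the highest bits while earlier ones are shifted down, so the end of a pathname dominates the value and paths with the same trailing characters get numerically close hashes.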
16
Makefile
@@ -664,6 +664,8 @@ LIB_H += diff.h
 LIB_H += diffcore.h
 LIB_H += dir.h
 LIB_H += exec_cmd.h
+LIB_H += ewah/ewok.h
+LIB_H += ewah/ewok_rlw.h
 LIB_H += fetch-pack.h
 LIB_H += fmt-merge-msg.h
 LIB_H += fsck.h
@@ -691,8 +693,10 @@ LIB_H += notes-merge.h
 LIB_H += notes-utils.h
 LIB_H += notes.h
 LIB_H += object.h
+LIB_H += pack-objects.h
 LIB_H += pack-revindex.h
 LIB_H += pack.h
+LIB_H += pack-bitmap.h
 LIB_H += parse-options.h
 LIB_H += patch-ids.h
 LIB_H += pathspec.h
@@ -796,6 +800,10 @@ LIB_OBJS += dir.o
 LIB_OBJS += editor.o
 LIB_OBJS += entry.o
 LIB_OBJS += environment.o
+LIB_OBJS += ewah/bitmap.o
+LIB_OBJS += ewah/ewah_bitmap.o
+LIB_OBJS += ewah/ewah_io.o
+LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
@@ -827,7 +835,10 @@ LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
+LIB_OBJS += pack-bitmap.o
+LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-check.o
+LIB_OBJS += pack-objects.o
 LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
 LIB_OBJS += pager.o
@@ -2480,8 +2491,9 @@ profile-clean:
 	$(RM) $(addsuffix *.gcno,$(addprefix $(PROFILE_DIR)/, $(object_dirs)))
 
 clean: profile-clean coverage-clean
-	$(RM) *.o *.res block-sha1/*.o ppc/*.o compat/*.o compat/*/*.o xdiff/*.o vcs-svn/*.o \
-		builtin/*.o $(LIB_FILE) $(XDIFF_LIB) $(VCSSVN_LIB)
+	$(RM) *.o *.res block-sha1/*.o ppc/*.o compat/*.o compat/*/*.o
+	$(RM) xdiff/*.o vcs-svn/*.o ewah/*.o builtin/*.o
+	$(RM) $(LIB_FILE) $(XDIFF_LIB) $(VCSSVN_LIB)
 	$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) git$X
 	$(RM) $(TEST_PROGRAMS) $(NO_INSTALL)
 	$(RM) -r bin-wrappers $(dep_dirs)
@@ -62,38 +62,6 @@
 #define setW(x, val) (W(x) = (val))
 #endif
 
-/*
- * Performance might be improved if the CPU architecture is OK with
- * unaligned 32-bit loads and a fast ntohl() is available.
- * Otherwise fall back to byte loads and shifts which is portable,
- * and is faster on architectures with memory alignment issues.
- */
-
-#if defined(__i386__) || defined(__x86_64__) || \
-    defined(_M_IX86) || defined(_M_X64) || \
-    defined(__ppc__) || defined(__ppc64__) || \
-    defined(__powerpc__) || defined(__powerpc64__) || \
-    defined(__s390__) || defined(__s390x__)
-
-#define get_be32(p) ntohl(*(unsigned int *)(p))
-#define put_be32(p, v) do { *(unsigned int *)(p) = htonl(v); } while (0)
-
-#else
-
-#define get_be32(p) ( \
-	(*((unsigned char *)(p) + 0) << 24) | \
-	(*((unsigned char *)(p) + 1) << 16) | \
-	(*((unsigned char *)(p) + 2) << 8) | \
-	(*((unsigned char *)(p) + 3) << 0) )
-#define put_be32(p, v) do { \
-	unsigned int __v = (v); \
-	*((unsigned char *)(p) + 0) = __v >> 24; \
-	*((unsigned char *)(p) + 1) = __v >> 16; \
-	*((unsigned char *)(p) + 2) = __v >> 8; \
-	*((unsigned char *)(p) + 3) = __v >> 0; } while (0)
-
-#endif
-
 /* This "rolls" over the 512-bit array */
 #define W(x) (array[(x)&15])
 
@@ -14,10 +14,12 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "pack-objects.h"
 #include "progress.h"
 #include "refs.h"
 #include "streaming.h"
 #include "thread-utils.h"
+#include "pack-bitmap.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
@@ -25,42 +27,15 @@ static const char *pack_usage[] = {
 	NULL
 };
 
-struct object_entry {
-	struct pack_idx_entry idx;
-	unsigned long size; /* uncompressed size */
-	struct packed_git *in_pack; /* already in pack */
-	off_t in_pack_offset;
-	struct object_entry *delta; /* delta base object */
-	struct object_entry *delta_child; /* deltified objects who bases me */
-	struct object_entry *delta_sibling; /* other deltified objects who
-					     * uses the same base as me
-					     */
-	void *delta_data; /* cached delta (uncompressed) */
-	unsigned long delta_size; /* delta data size (uncompressed) */
-	unsigned long z_delta_size; /* delta data size (compressed) */
-	enum object_type type;
-	enum object_type in_pack_type; /* could be delta */
-	uint32_t hash; /* name hint hash */
-	unsigned char in_pack_header_size;
-	unsigned preferred_base:1; /*
-				    * we do not pack this, but is available
-				    * to be used as the base object to delta
-				    * objects against.
-				    */
-	unsigned no_try_delta:1;
-	unsigned tagged:1; /* near the very tip of refs */
-	unsigned filled:1; /* assigned write-order */
-};
-
 /*
- * Objects we are going to pack are collected in objects array (dynamically
- * expanded). nr_objects & nr_alloc controls this array. They are stored
- * in the order we see -- typically rev-list --objects order that gives us
- * nice "minimum seek" order.
+ * Objects we are going to pack are collected in the `to_pack` structure.
+ * It contains an array (dynamically expanded) of the object data, and a map
+ * that can resolve SHA1s to their position in the array.
  */
-static struct object_entry *objects;
+static struct packing_data to_pack;
 
 static struct pack_idx_entry **written_list;
-static uint32_t nr_objects, nr_alloc, nr_result, nr_written;
+static uint32_t nr_result, nr_written;
 
 static int non_empty;
 static int reuse_delta = 1, reuse_object = 1;
@@ -83,27 +58,43 @@ static struct progress *progress_state;
 static int pack_compression_level = Z_DEFAULT_COMPRESSION;
 static int pack_compression_seen;
 
+static struct packed_git *reuse_packfile;
+static uint32_t reuse_packfile_objects;
+static off_t reuse_packfile_offset;
+
+static int use_bitmap_index = 1;
+static int write_bitmap_index;
+static uint16_t write_bitmap_options;
+
 static unsigned long delta_cache_size = 0;
 static unsigned long max_delta_cache_size = 256 * 1024 * 1024;
 static unsigned long cache_max_small_delta_size = 1000;
 
 static unsigned long window_memory_limit = 0;
 
-/*
- * The object names in objects array are hashed with this hashtable,
- * to help looking up the entry by object name.
- * This hashtable is built after all the objects are seen.
- */
-static int *object_ix;
-static int object_ix_hashsz;
-static struct object_entry *locate_object_entry(const unsigned char *sha1);
-
 /*
  * stats
  */
 static uint32_t written, written_delta;
 static uint32_t reused, reused_delta;
 
+/*
+ * Indexed commits
+ */
+static struct commit **indexed_commits;
+static unsigned int indexed_commits_nr;
+static unsigned int indexed_commits_alloc;
+
+static void index_commit_for_bitmap(struct commit *commit)
+{
+	if (indexed_commits_nr >= indexed_commits_alloc) {
+		indexed_commits_alloc = (indexed_commits_alloc + 32) * 2;
+		indexed_commits = xrealloc(indexed_commits,
+			indexed_commits_alloc * sizeof(struct commit *));
+	}
+
+	indexed_commits[indexed_commits_nr++] = commit;
+}
 
 static void *get_delta(struct object_entry *entry)
 {
@@ -553,12 +544,12 @@ static int mark_tagged(const char *path, const unsigned char *sha1, int flag,
 		       void *cb_data)
 {
 	unsigned char peeled[20];
-	struct object_entry *entry = locate_object_entry(sha1);
+	struct object_entry *entry = packlist_find(&to_pack, sha1, NULL);
 
 	if (entry)
 		entry->tagged = 1;
 	if (!peel_ref(path, peeled)) {
-		entry = locate_object_entry(peeled);
+		entry = packlist_find(&to_pack, peeled, NULL);
 		if (entry)
 			entry->tagged = 1;
 	}
@@ -633,9 +624,10 @@ static struct object_entry **compute_write_order(void)
 {
 	unsigned int i, wo_end, last_untagged;
 
-	struct object_entry **wo = xmalloc(nr_objects * sizeof(*wo));
+	struct object_entry **wo = xmalloc(to_pack.nr_objects * sizeof(*wo));
+	struct object_entry *objects = to_pack.objects;
 
-	for (i = 0; i < nr_objects; i++) {
+	for (i = 0; i < to_pack.nr_objects; i++) {
 		objects[i].tagged = 0;
 		objects[i].filled = 0;
 		objects[i].delta_child = NULL;
@@ -647,7 +639,7 @@ static struct object_entry **compute_write_order(void)
 	 * Make sure delta_sibling is sorted in the original
 	 * recency order.
 	 */
-	for (i = nr_objects; i > 0;) {
+	for (i = to_pack.nr_objects; i > 0;) {
 		struct object_entry *e = &objects[--i];
 		if (!e->delta)
 			continue;
@@ -665,7 +657,7 @@ static struct object_entry **compute_write_order(void)
 	 * Give the objects in the original recency order until
 	 * we see a tagged tip.
 	 */
-	for (i = wo_end = 0; i < nr_objects; i++) {
+	for (i = wo_end = 0; i < to_pack.nr_objects; i++) {
 		if (objects[i].tagged)
 			break;
 		add_to_write_order(wo, &wo_end, &objects[i]);
@@ -675,7 +667,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * Then fill all the tagged tips.
 	 */
-	for (; i < nr_objects; i++) {
+	for (; i < to_pack.nr_objects; i++) {
 		if (objects[i].tagged)
 			add_to_write_order(wo, &wo_end, &objects[i]);
 	}
@@ -683,7 +675,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * And then all remaining commits and tags.
 	 */
-	for (i = last_untagged; i < nr_objects; i++) {
+	for (i = last_untagged; i < to_pack.nr_objects; i++) {
 		if (objects[i].type != OBJ_COMMIT &&
 		    objects[i].type != OBJ_TAG)
 			continue;
@@ -693,7 +685,7 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * And then all the trees.
 	 */
-	for (i = last_untagged; i < nr_objects; i++) {
+	for (i = last_untagged; i < to_pack.nr_objects; i++) {
 		if (objects[i].type != OBJ_TREE)
 			continue;
 		add_to_write_order(wo, &wo_end, &objects[i]);
@@ -702,17 +694,57 @@ static struct object_entry **compute_write_order(void)
 	/*
 	 * Finally all the rest in really tight order
 	 */
-	for (i = last_untagged; i < nr_objects; i++) {
+	for (i = last_untagged; i < to_pack.nr_objects; i++) {
 		if (!objects[i].filled)
 			add_family_to_write_order(wo, &wo_end, &objects[i]);
 	}
 
-	if (wo_end != nr_objects)
-		die("ordered %u objects, expected %"PRIu32, wo_end, nr_objects);
+	if (wo_end != to_pack.nr_objects)
+		die("ordered %u objects, expected %"PRIu32, wo_end, to_pack.nr_objects);
 
 	return wo;
 }
 
+static off_t write_reused_pack(struct sha1file *f)
+{
+	unsigned char buffer[8192];
+	off_t to_write;
+	int fd;
+
+	if (!is_pack_valid(reuse_packfile))
+		die("packfile is invalid: %s", reuse_packfile->pack_name);
+
+	fd = git_open_noatime(reuse_packfile->pack_name);
+	if (fd < 0)
+		die_errno("unable to open packfile for reuse: %s",
+			  reuse_packfile->pack_name);
+
+	if (lseek(fd, sizeof(struct pack_header), SEEK_SET) == -1)
+		die_errno("unable to seek in reused packfile");
+
+	if (reuse_packfile_offset < 0)
+		reuse_packfile_offset = reuse_packfile->pack_size - 20;
+
+	to_write = reuse_packfile_offset - sizeof(struct pack_header);
+
+	while (to_write) {
+		int read_pack = xread(fd, buffer, sizeof(buffer));
+
+		if (read_pack <= 0)
+			die_errno("unable to read from reused packfile");
+
+		if (read_pack > to_write)
+			read_pack = to_write;
+
+		sha1write(f, buffer, read_pack);
+		to_write -= read_pack;
+	}
+
+	close(fd);
+	written += reuse_packfile_objects;
+	return reuse_packfile_offset - sizeof(struct pack_header);
+}
+
 static void write_pack_file(void)
 {
 	uint32_t i = 0, j;
@@ -724,7 +756,7 @@ static void write_pack_file(void)
 
 	if (progress > pack_to_stdout)
 		progress_state = start_progress("Writing objects", nr_result);
-	written_list = xmalloc(nr_objects * sizeof(*written_list));
+	written_list = xmalloc(to_pack.nr_objects * sizeof(*written_list));
 	write_order = compute_write_order();
 
 	do {
@@ -737,8 +769,17 @@ static void write_pack_file(void)
 			f = create_tmp_packfile(&pack_tmp_name);
 
 		offset = write_pack_header(f, nr_remaining);
+
+		if (reuse_packfile) {
+			off_t packfile_size;
+			assert(pack_to_stdout);
+
+			packfile_size = write_reused_pack(f);
+			offset += packfile_size;
+		}
+
 		nr_written = 0;
-		for (; i < nr_objects; i++) {
+		for (; i < to_pack.nr_objects; i++) {
 			struct object_entry *e = write_order[i];
 			if (write_one(f, e, &offset) == WRITE_ONE_BREAK)
 				break;
@@ -789,9 +830,31 @@ static void write_pack_file(void)
 			if (sizeof(tmpname) <= strlen(base_name) + 50)
 				die("pack base name '%s' too long", base_name);
 			snprintf(tmpname, sizeof(tmpname), "%s-", base_name);
+
+			if (write_bitmap_index) {
+				bitmap_writer_set_checksum(sha1);
+				bitmap_writer_build_type_index(written_list, nr_written);
+			}
+
 			finish_tmp_packfile(tmpname, pack_tmp_name,
 					    written_list, nr_written,
 					    &pack_idx_opts, sha1);
+
+			if (write_bitmap_index) {
+				char *end_of_name_prefix = strrchr(tmpname, 0);
+				sprintf(end_of_name_prefix, "%s.bitmap", sha1_to_hex(sha1));
+
+				stop_progress(&progress_state);
+
+				bitmap_writer_show_progress(progress);
+				bitmap_writer_reuse_bitmaps(&to_pack);
+				bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1);
+				bitmap_writer_build(&to_pack);
+				bitmap_writer_finish(written_list, nr_written,
+						     tmpname, write_bitmap_options);
+				write_bitmap_index = 0;
+			}
+
 			free(pack_tmp_name);
 			puts(sha1_to_hex(sha1));
 		}
@@ -801,7 +864,7 @@ static void write_pack_file(void)
 			written_list[j]->offset = (off_t)-1;
 		}
 		nr_remaining -= nr_written;
-	} while (nr_remaining && i < nr_objects);
+	} while (nr_remaining && i < to_pack.nr_objects);
 
 	free(written_list);
 	free(write_order);
@ -811,73 +874,6 @@ static void write_pack_file(void)
|
|||||||
written, nr_result);
|
written, nr_result);
|
||||||
}
|
}
|
||||||
|
|
||||||
static int locate_object_entry_hash(const unsigned char *sha1)
|
|
||||||
{
|
|
||||||
int i;
|
|
||||||
unsigned int ui;
|
|
||||||
memcpy(&ui, sha1, sizeof(unsigned int));
|
|
||||||
i = ui % object_ix_hashsz;
|
|
||||||
while (0 < object_ix[i]) {
|
|
||||||
if (!hashcmp(sha1, objects[object_ix[i] - 1].idx.sha1))
|
|
||||||
return i;
|
|
||||||
if (++i == object_ix_hashsz)
|
|
||||||
i = 0;
|
|
||||||
}
|
|
||||||
return -1 - i;
|
|
||||||
}
|
|
||||||
|
|
||||||
static struct object_entry *locate_object_entry(const unsigned char *sha1)
|
|
||||||
{
|
|
||||||
int i;
|
|
||||||
|
|
||||||
if (!object_ix_hashsz)
|
|
||||||
return NULL;
|
|
||||||
|
|
||||||
i = locate_object_entry_hash(sha1);
|
|
||||||
if (0 <= i)
|
|
||||||
return &objects[object_ix[i]-1];
|
|
||||||
return NULL;
|
|
||||||
}
|
|
||||||
|
|
||||||
static void rehash_objects(void)
|
|
||||||
{
|
|
||||||
uint32_t i;
|
|
||||||
struct object_entry *oe;
|
|
||||||
|
|
||||||
object_ix_hashsz = nr_objects * 3;
|
|
||||||
if (object_ix_hashsz < 1024)
|
|
||||||
object_ix_hashsz = 1024;
|
|
||||||
object_ix = xrealloc(object_ix, sizeof(int) * object_ix_hashsz);
|
|
||||||
memset(object_ix, 0, sizeof(int) * object_ix_hashsz);
|
|
||||||
for (i = 0, oe = objects; i < nr_objects; i++, oe++) {
|
|
||||||
int ix = locate_object_entry_hash(oe->idx.sha1);
|
|
||||||
if (0 <= ix)
|
|
||||||
continue;
|
|
||||||
ix = -1 - ix;
|
|
||||||
object_ix[ix] = i + 1;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
static uint32_t name_hash(const char *name)
|
|
||||||
{
|
|
||||||
uint32_t c, hash = 0;
|
|
||||||
|
|
||||||
if (!name)
|
|
||||||
return 0;
|
|
||||||
|
|
||||||
/*
|
|
||||||
* This effectively just creates a sortable number from the
|
|
||||||
* last sixteen non-whitespace characters. Last characters
|
|
||||||
* count "most", so things that end in ".c" sort together.
|
|
||||||
*/
|
|
||||||
while ((c = *name++) != 0) {
|
|
||||||
if (isspace(c))
|
|
||||||
continue;
|
|
||||||
hash = (hash >> 2) + (c << 24);
|
|
||||||
}
|
|
||||||
return hash;
|
|
||||||
}
|
|
||||||
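The comment in the removed `name_hash()` says that the last characters count most. A minimal sketch of that property, assuming nothing beyond the loop shown above: the `sketch_` names are illustrative and not part of this commit, and the cast to `unsigned char` is an added assumption to keep `isspace()` well defined for non-ASCII bytes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative copy of the hash loop shown above (hypothetical name). */
static uint32_t sketch_name_hash(const char *name)
{
	uint32_t c, hash = 0;

	if (!name)
		return 0;
	/* Each new character lands in the top byte; older ones decay by 2 bits. */
	while ((c = (unsigned char)*name++) != 0) {
		if (isspace(c))
			continue;
		hash = (hash >> 2) + (c << 24);
	}
	return hash;
}

/* Checks the properties the comment describes. */
static void sketch_name_hash_check(void)
{
	assert(sketch_name_hash(NULL) == 0);
	/* A single character ends up in the top byte. */
	assert(sketch_name_hash("a") == 0x61000000u);
	/* Whitespace is skipped entirely. */
	assert(sketch_name_hash("a b") == sketch_name_hash("ab"));
	/* Names with the same trailing characters share their top byte. */
	assert((sketch_name_hash("x.c") >> 24) == (sketch_name_hash("y.c") >> 24));
}
```

Because the trailing characters dominate the high bits, sorting by this value tends to place files with the same extension next to each other, which is what the delta search relies on.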
|
|
||||||
static void setup_delta_attr_check(struct git_attr_check *check)
|
static void setup_delta_attr_check(struct git_attr_check *check)
|
||||||
{
|
{
|
||||||
static struct git_attr *attr_delta;
|
static struct git_attr *attr_delta;
|
||||||
@ -900,42 +896,69 @@ static int no_try_delta(const char *path)
|
|||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
static int add_object_entry(const unsigned char *sha1, enum object_type type,
|
/*
|
||||||
const char *name, int exclude)
|
* When adding an object, check whether we have already added it
|
||||||
|
* to our packing list. If so, we can skip. However, if we are
|
||||||
|
* being asked to exclude it, but the previous mention was to include
|
||||||
|
* it, make sure to adjust its flags and tweak our numbers accordingly.
|
||||||
|
*
|
||||||
|
* As an optimization, we pass out the index position where we would have
|
||||||
|
* found the item, since that saves us from having to look it up again a
|
||||||
|
* few lines later when we want to add the new entry.
|
||||||
|
*/
|
||||||
|
static int have_duplicate_entry(const unsigned char *sha1,
|
||||||
|
int exclude,
|
||||||
|
uint32_t *index_pos)
|
||||||
{
|
{
|
||||||
struct object_entry *entry;
|
struct object_entry *entry;
|
||||||
struct packed_git *p, *found_pack = NULL;
|
|
||||||
off_t found_offset = 0;
|
|
||||||
int ix;
|
|
||||||
uint32_t hash = name_hash(name);
|
|
||||||
|
|
||||||
ix = nr_objects ? locate_object_entry_hash(sha1) : -1;
|
entry = packlist_find(&to_pack, sha1, index_pos);
|
||||||
if (ix >= 0) {
|
if (!entry)
|
||||||
if (exclude) {
|
|
||||||
entry = objects + object_ix[ix] - 1;
|
|
||||||
if (!entry->preferred_base)
|
|
||||||
nr_result--;
|
|
||||||
entry->preferred_base = 1;
|
|
||||||
}
|
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
|
if (exclude) {
|
||||||
|
if (!entry->preferred_base)
|
||||||
|
nr_result--;
|
||||||
|
entry->preferred_base = 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Check whether we want the object in the pack (e.g., we do not want
|
||||||
|
* objects found in non-local stores if the "--local" option was used).
|
||||||
|
*
|
||||||
|
* As a side effect of this check, we will find the packed version of this
|
||||||
|
* object, if any. We therefore pass out the pack information to avoid having
|
||||||
|
* to look it up again later.
|
||||||
|
*/
|
||||||
|
static int want_object_in_pack(const unsigned char *sha1,
|
||||||
|
int exclude,
|
||||||
|
struct packed_git **found_pack,
|
||||||
|
off_t *found_offset)
|
||||||
|
{
|
||||||
|
struct packed_git *p;
|
||||||
|
|
||||||
if (!exclude && local && has_loose_object_nonlocal(sha1))
|
if (!exclude && local && has_loose_object_nonlocal(sha1))
|
||||||
return 0;
|
return 0;
|
||||||
|
|
||||||
|
*found_pack = NULL;
|
||||||
|
*found_offset = 0;
|
||||||
|
|
||||||
for (p = packed_git; p; p = p->next) {
|
for (p = packed_git; p; p = p->next) {
|
||||||
off_t offset = find_pack_entry_one(sha1, p);
|
off_t offset = find_pack_entry_one(sha1, p);
|
||||||
if (offset) {
|
if (offset) {
|
||||||
if (!found_pack) {
|
if (!*found_pack) {
|
||||||
if (!is_pack_valid(p)) {
|
if (!is_pack_valid(p)) {
|
||||||
warning("packfile %s cannot be accessed", p->pack_name);
|
warning("packfile %s cannot be accessed", p->pack_name);
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
found_offset = offset;
|
*found_offset = offset;
|
||||||
found_pack = p;
|
*found_pack = p;
|
||||||
}
|
}
|
||||||
if (exclude)
|
if (exclude)
|
||||||
break;
|
return 1;
|
||||||
if (incremental)
|
if (incremental)
|
||||||
return 0;
|
return 0;
|
||||||
if (local && !p->pack_local)
|
if (local && !p->pack_local)
|
||||||
@ -945,14 +968,21 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (nr_objects >= nr_alloc) {
|
return 1;
|
||||||
nr_alloc = (nr_alloc + 1024) * 3 / 2;
|
}
|
||||||
objects = xrealloc(objects, nr_alloc * sizeof(*entry));
|
|
||||||
}
|
|
||||||
|
|
||||||
entry = objects + nr_objects++;
|
static void create_object_entry(const unsigned char *sha1,
|
||||||
memset(entry, 0, sizeof(*entry));
|
enum object_type type,
|
||||||
hashcpy(entry->idx.sha1, sha1);
|
uint32_t hash,
|
||||||
|
int exclude,
|
||||||
|
int no_try_delta,
|
||||||
|
uint32_t index_pos,
|
||||||
|
struct packed_git *found_pack,
|
||||||
|
off_t found_offset)
|
||||||
|
{
|
||||||
|
struct object_entry *entry;
|
||||||
|
|
||||||
|
entry = packlist_alloc(&to_pack, sha1, index_pos);
|
||||||
entry->hash = hash;
|
entry->hash = hash;
|
||||||
if (type)
|
if (type)
|
||||||
entry->type = type;
|
entry->type = type;
|
||||||
@ -965,16 +995,43 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
|
|||||||
entry->in_pack_offset = found_offset;
|
entry->in_pack_offset = found_offset;
|
||||||
}
|
}
|
||||||
|
|
||||||
if (object_ix_hashsz * 3 <= nr_objects * 4)
|
entry->no_try_delta = no_try_delta;
|
||||||
rehash_objects();
|
}
|
||||||
else
|
|
||||||
object_ix[-1 - ix] = nr_objects;
|
|
||||||
|
|
||||||
display_progress(progress_state, nr_objects);
|
static int add_object_entry(const unsigned char *sha1, enum object_type type,
|
||||||
|
const char *name, int exclude)
|
||||||
|
{
|
||||||
|
struct packed_git *found_pack;
|
||||||
|
off_t found_offset;
|
||||||
|
uint32_t index_pos;
|
||||||
|
|
||||||
if (name && no_try_delta(name))
|
if (have_duplicate_entry(sha1, exclude, &index_pos))
|
||||||
entry->no_try_delta = 1;
|
return 0;
|
||||||
|
|
||||||
|
if (!want_object_in_pack(sha1, exclude, &found_pack, &found_offset))
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
create_object_entry(sha1, type, pack_name_hash(name),
|
||||||
|
exclude, name && no_try_delta(name),
|
||||||
|
index_pos, found_pack, found_offset);
|
||||||
|
|
||||||
|
display_progress(progress_state, to_pack.nr_objects);
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
static int add_object_entry_from_bitmap(const unsigned char *sha1,
|
||||||
|
enum object_type type,
|
||||||
|
int flags, uint32_t name_hash,
|
||||||
|
struct packed_git *pack, off_t offset)
|
||||||
|
{
|
||||||
|
uint32_t index_pos;
|
||||||
|
|
||||||
|
if (have_duplicate_entry(sha1, 0, &index_pos))
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
create_object_entry(sha1, type, name_hash, 0, 0, index_pos, pack, offset);
|
||||||
|
|
||||||
|
display_progress(progress_state, to_pack.nr_objects);
|
||||||
return 1;
|
return 1;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -1175,7 +1232,7 @@ static void add_preferred_base_object(const char *name)
|
|||||||
{
|
{
|
||||||
struct pbase_tree *it;
|
struct pbase_tree *it;
|
||||||
int cmplen;
|
int cmplen;
|
||||||
unsigned hash = name_hash(name);
|
unsigned hash = pack_name_hash(name);
|
||||||
|
|
||||||
if (!num_preferred_base || check_pbase_path(hash))
|
if (!num_preferred_base || check_pbase_path(hash))
|
||||||
return;
|
return;
|
||||||
@ -1327,7 +1384,7 @@ static void check_object(struct object_entry *entry)
|
|||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
|
||||||
if (base_ref && (base_entry = locate_object_entry(base_ref))) {
|
if (base_ref && (base_entry = packlist_find(&to_pack, base_ref, NULL))) {
|
||||||
/*
|
/*
|
||||||
* If base_ref was set above that means we wish to
|
* If base_ref was set above that means we wish to
|
||||||
* reuse delta data, and we even found that base
|
* reuse delta data, and we even found that base
|
||||||
@ -1401,12 +1458,12 @@ static void get_object_details(void)
|
|||||||
uint32_t i;
|
uint32_t i;
|
||||||
struct object_entry **sorted_by_offset;
|
struct object_entry **sorted_by_offset;
|
||||||
|
|
||||||
sorted_by_offset = xcalloc(nr_objects, sizeof(struct object_entry *));
|
sorted_by_offset = xcalloc(to_pack.nr_objects, sizeof(struct object_entry *));
|
||||||
for (i = 0; i < nr_objects; i++)
|
for (i = 0; i < to_pack.nr_objects; i++)
|
||||||
sorted_by_offset[i] = objects + i;
|
sorted_by_offset[i] = to_pack.objects + i;
|
||||||
qsort(sorted_by_offset, nr_objects, sizeof(*sorted_by_offset), pack_offset_sort);
|
qsort(sorted_by_offset, to_pack.nr_objects, sizeof(*sorted_by_offset), pack_offset_sort);
|
||||||
|
|
||||||
for (i = 0; i < nr_objects; i++) {
|
for (i = 0; i < to_pack.nr_objects; i++) {
|
||||||
struct object_entry *entry = sorted_by_offset[i];
|
struct object_entry *entry = sorted_by_offset[i];
|
||||||
check_object(entry);
|
check_object(entry);
|
||||||
if (big_file_threshold < entry->size)
|
if (big_file_threshold < entry->size)
|
||||||
@ -2032,7 +2089,7 @@ static int add_ref_tag(const char *path, const unsigned char *sha1, int flag, vo
|
|||||||
|
|
||||||
if (starts_with(path, "refs/tags/") && /* is a tag? */
|
if (starts_with(path, "refs/tags/") && /* is a tag? */
|
||||||
!peel_ref(path, peeled) && /* peelable? */
|
!peel_ref(path, peeled) && /* peelable? */
|
||||||
locate_object_entry(peeled)) /* object packed? */
|
packlist_find(&to_pack, peeled, NULL)) /* object packed? */
|
||||||
add_object_entry(sha1, OBJ_TAG, NULL, 0);
|
add_object_entry(sha1, OBJ_TAG, NULL, 0);
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
@ -2055,14 +2112,14 @@ static void prepare_pack(int window, int depth)
|
|||||||
if (!pack_to_stdout)
|
if (!pack_to_stdout)
|
||||||
do_check_packed_object_crc = 1;
|
do_check_packed_object_crc = 1;
|
||||||
|
|
||||||
if (!nr_objects || !window || !depth)
|
if (!to_pack.nr_objects || !window || !depth)
|
||||||
return;
|
return;
|
||||||
|
|
||||||
delta_list = xmalloc(nr_objects * sizeof(*delta_list));
|
delta_list = xmalloc(to_pack.nr_objects * sizeof(*delta_list));
|
||||||
nr_deltas = n = 0;
|
nr_deltas = n = 0;
|
||||||
|
|
||||||
for (i = 0; i < nr_objects; i++) {
|
for (i = 0; i < to_pack.nr_objects; i++) {
|
||||||
struct object_entry *entry = objects + i;
|
struct object_entry *entry = to_pack.objects + i;
|
||||||
|
|
||||||
if (entry->delta)
|
if (entry->delta)
|
||||||
/* This happens if we decided to reuse existing
|
/* This happens if we decided to reuse existing
|
||||||
@ -2140,6 +2197,20 @@ static int git_pack_config(const char *k, const char *v, void *cb)
|
|||||||
cache_max_small_delta_size = git_config_int(k, v);
|
cache_max_small_delta_size = git_config_int(k, v);
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
if (!strcmp(k, "pack.writebitmaps")) {
|
||||||
|
write_bitmap_index = git_config_bool(k, v);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
if (!strcmp(k, "pack.writebitmaphashcache")) {
|
||||||
|
if (git_config_bool(k, v))
|
||||||
|
write_bitmap_options |= BITMAP_OPT_HASH_CACHE;
|
||||||
|
else
|
||||||
|
write_bitmap_options &= ~BITMAP_OPT_HASH_CACHE;
|
||||||
|
}
|
||||||
|
if (!strcmp(k, "pack.usebitmaps")) {
|
||||||
|
use_bitmap_index = git_config_bool(k, v);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
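The `pack.writebitmaphashcache` handler above sets or clears one bit in an option word depending on a boolean config value. A minimal sketch of that pattern follows; the flag values and the `sketch_` names are hypothetical and are not taken from this commit.

```c
#include <assert.h>

/* Hypothetical flag values, chosen only for illustration. */
#define SKETCH_OPT_OTHER      0x1u
#define SKETCH_OPT_HASH_CACHE 0x4u

/* Return options with flag set when enabled is nonzero, cleared otherwise. */
static unsigned sketch_set_flag(unsigned options, unsigned flag, int enabled)
{
	return enabled ? (options | flag) : (options & ~flag);
}

/* Toggling one flag must leave the other bits untouched. */
static void sketch_set_flag_check(void)
{
	unsigned opts = SKETCH_OPT_OTHER;

	opts = sketch_set_flag(opts, SKETCH_OPT_HASH_CACHE, 1);
	assert(opts == (SKETCH_OPT_OTHER | SKETCH_OPT_HASH_CACHE));
	opts = sketch_set_flag(opts, SKETCH_OPT_HASH_CACHE, 0);
	assert(opts == SKETCH_OPT_OTHER);
}
```

Clearing with `& ~flag` rather than assigning zero matters as soon as the option word carries more than one flag.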
if (!strcmp(k, "pack.threads")) {
|
if (!strcmp(k, "pack.threads")) {
|
||||||
delta_search_threads = git_config_int(k, v);
|
delta_search_threads = git_config_int(k, v);
|
||||||
if (delta_search_threads < 0)
|
if (delta_search_threads < 0)
|
||||||
@ -2198,6 +2269,9 @@ static void show_commit(struct commit *commit, void *data)
|
|||||||
{
|
{
|
||||||
add_object_entry(commit->object.sha1, OBJ_COMMIT, NULL, 0);
|
add_object_entry(commit->object.sha1, OBJ_COMMIT, NULL, 0);
|
||||||
commit->object.flags |= OBJECT_ADDED;
|
commit->object.flags |= OBJECT_ADDED;
|
||||||
|
|
||||||
|
if (write_bitmap_index)
|
||||||
|
index_commit_for_bitmap(commit);
|
||||||
}
|
}
|
||||||
|
|
||||||
static void show_object(struct object *obj,
|
static void show_object(struct object *obj,
|
||||||
@ -2340,7 +2414,7 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
|
|||||||
|
|
||||||
for (i = 0; i < p->num_objects; i++) {
|
for (i = 0; i < p->num_objects; i++) {
|
||||||
sha1 = nth_packed_object_sha1(p, i);
|
sha1 = nth_packed_object_sha1(p, i);
|
||||||
if (!locate_object_entry(sha1) &&
|
if (!packlist_find(&to_pack, sha1, NULL) &&
|
||||||
!has_sha1_pack_kept_or_nonlocal(sha1))
|
!has_sha1_pack_kept_or_nonlocal(sha1))
|
||||||
if (force_object_loose(sha1, p->mtime))
|
if (force_object_loose(sha1, p->mtime))
|
||||||
die("unable to force loose object");
|
die("unable to force loose object");
|
||||||
@ -2348,6 +2422,29 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static int get_object_list_from_bitmap(struct rev_info *revs)
|
||||||
|
{
|
||||||
|
if (prepare_bitmap_walk(revs) < 0)
|
||||||
|
return -1;
|
||||||
|
|
||||||
|
if (!reuse_partial_packfile_from_bitmap(
|
||||||
|
&reuse_packfile,
|
||||||
|
&reuse_packfile_objects,
|
||||||
|
&reuse_packfile_offset)) {
|
||||||
|
assert(reuse_packfile_objects);
|
||||||
|
nr_result += reuse_packfile_objects;
|
||||||
|
|
||||||
|
if (progress) {
|
||||||
|
fprintf(stderr, "Reusing existing pack: %d, done.\n",
|
||||||
|
reuse_packfile_objects);
|
||||||
|
fflush(stderr);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
traverse_bitmap_commit_list(&add_object_entry_from_bitmap);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
static void get_object_list(int ac, const char **av)
|
static void get_object_list(int ac, const char **av)
|
||||||
{
|
{
|
||||||
struct rev_info revs;
|
struct rev_info revs;
|
||||||
@ -2367,6 +2464,7 @@ static void get_object_list(int ac, const char **av)
|
|||||||
if (*line == '-') {
|
if (*line == '-') {
|
||||||
if (!strcmp(line, "--not")) {
|
if (!strcmp(line, "--not")) {
|
||||||
flags ^= UNINTERESTING;
|
flags ^= UNINTERESTING;
|
||||||
|
write_bitmap_index = 0;
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
die("not a rev '%s'", line);
|
die("not a rev '%s'", line);
|
||||||
@ -2375,6 +2473,9 @@ static void get_object_list(int ac, const char **av)
|
|||||||
die("bad revision '%s'", line);
|
die("bad revision '%s'", line);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (use_bitmap_index && !get_object_list_from_bitmap(&revs))
|
||||||
|
return;
|
||||||
|
|
||||||
if (prepare_revision_walk(&revs))
|
if (prepare_revision_walk(&revs))
|
||||||
die("revision walk setup failed");
|
die("revision walk setup failed");
|
||||||
mark_edges_uninteresting(&revs, show_edge);
|
mark_edges_uninteresting(&revs, show_edge);
|
||||||
@ -2504,6 +2605,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
|
|||||||
N_("pack compression level")),
|
N_("pack compression level")),
|
||||||
OPT_SET_INT(0, "keep-true-parents", &grafts_replace_parents,
|
OPT_SET_INT(0, "keep-true-parents", &grafts_replace_parents,
|
||||||
N_("do not hide commits by grafts"), 0),
|
N_("do not hide commits by grafts"), 0),
|
||||||
|
OPT_BOOL(0, "use-bitmap-index", &use_bitmap_index,
|
||||||
|
N_("use a bitmap index if available to speed up counting objects")),
|
||||||
|
OPT_BOOL(0, "write-bitmap-index", &write_bitmap_index,
|
||||||
|
N_("write a bitmap index together with the pack index")),
|
||||||
OPT_END(),
|
OPT_END(),
|
||||||
};
|
};
|
||||||
|
|
||||||
@ -2570,6 +2675,12 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
|
|||||||
if (keep_unreachable && unpack_unreachable)
|
if (keep_unreachable && unpack_unreachable)
|
||||||
die("--keep-unreachable and --unpack-unreachable are incompatible.");
|
die("--keep-unreachable and --unpack-unreachable are incompatible.");
|
||||||
|
|
||||||
|
if (!use_internal_rev_list || !pack_to_stdout || is_repository_shallow())
|
||||||
|
use_bitmap_index = 0;
|
||||||
|
|
||||||
|
if (pack_to_stdout || !rev_list_all)
|
||||||
|
write_bitmap_index = 0;
|
||||||
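The two guards above can be read as pure predicates over the command-line state. The sketch below restates them under that reading; the struct and function names are hypothetical, and the real code mutates the globals `use_bitmap_index` and `write_bitmap_index` instead.

```c
#include <assert.h>

/* Hypothetical snapshot of the state the guards consult. */
struct sketch_pack_state {
	int use_internal_rev_list;
	int pack_to_stdout;
	int rev_list_all;
	int shallow;
};

/* Reading bitmaps: only with the internal rev-list, to stdout, non-shallow. */
static int sketch_may_use_bitmap(const struct sketch_pack_state *s, int requested)
{
	if (!s->use_internal_rev_list || !s->pack_to_stdout || s->shallow)
		return 0;
	return requested;
}

/* Writing bitmaps: only for an on-disk pack built from all refs. */
static int sketch_may_write_bitmap(const struct sketch_pack_state *s, int requested)
{
	if (s->pack_to_stdout || !s->rev_list_all)
		return 0;
	return requested;
}

static void sketch_guard_check(void)
{
	struct sketch_pack_state fetch = { 1, 1, 0, 0 };  /* server side of a fetch */
	struct sketch_pack_state repack = { 1, 0, 1, 0 }; /* full repack to disk */

	assert(sketch_may_use_bitmap(&fetch, 1) == 1);
	assert(sketch_may_write_bitmap(&fetch, 1) == 0);
	assert(sketch_may_use_bitmap(&repack, 1) == 0);
	assert(sketch_may_write_bitmap(&repack, 1) == 1);
}
```

In this reading the two conditions never hold together, because one requires `pack_to_stdout` and the other forbids it.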
|
|
||||||
if (progress && all_progress_implied)
|
if (progress && all_progress_implied)
|
||||||
progress = 2;
|
progress = 2;
|
||||||
|
|
||||||
|
@ -94,7 +94,7 @@ static void get_non_kept_pack_filenames(struct string_list *fname_list)
|
|||||||
|
|
||||||
static void remove_redundant_pack(const char *dir_name, const char *base_name)
|
static void remove_redundant_pack(const char *dir_name, const char *base_name)
|
||||||
{
|
{
|
||||||
const char *exts[] = {".pack", ".idx", ".keep"};
|
const char *exts[] = {".pack", ".idx", ".keep", ".bitmap"};
|
||||||
int i;
|
int i;
|
||||||
struct strbuf buf = STRBUF_INIT;
|
struct strbuf buf = STRBUF_INIT;
|
||||||
size_t plen;
|
size_t plen;
|
||||||
@ -115,7 +115,14 @@ static void remove_redundant_pack(const char *dir_name, const char *base_name)
|
|||||||
|
|
||||||
int cmd_repack(int argc, const char **argv, const char *prefix)
|
int cmd_repack(int argc, const char **argv, const char *prefix)
|
||||||
{
|
{
|
||||||
const char *exts[2] = {".pack", ".idx"};
|
struct {
|
||||||
|
const char *name;
|
||||||
|
unsigned optional:1;
|
||||||
|
} exts[] = {
|
||||||
|
{".pack"},
|
||||||
|
{".idx"},
|
||||||
|
{".bitmap", 1},
|
||||||
|
};
|
||||||
struct child_process cmd;
|
struct child_process cmd;
|
||||||
struct string_list_item *item;
|
struct string_list_item *item;
|
||||||
struct argv_array cmd_args = ARGV_ARRAY_INIT;
|
struct argv_array cmd_args = ARGV_ARRAY_INIT;
|
||||||
@ -137,6 +144,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
|
|||||||
int no_update_server_info = 0;
|
int no_update_server_info = 0;
|
||||||
int quiet = 0;
|
int quiet = 0;
|
||||||
int local = 0;
|
int local = 0;
|
||||||
|
int write_bitmap = -1;
|
||||||
|
|
||||||
struct option builtin_repack_options[] = {
|
struct option builtin_repack_options[] = {
|
||||||
OPT_BIT('a', NULL, &pack_everything,
|
OPT_BIT('a', NULL, &pack_everything,
|
||||||
@ -155,6 +163,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
|
|||||||
OPT__QUIET(&quiet, N_("be quiet")),
|
OPT__QUIET(&quiet, N_("be quiet")),
|
||||||
OPT_BOOL('l', "local", &local,
|
OPT_BOOL('l', "local", &local,
|
||||||
N_("pass --local to git-pack-objects")),
|
N_("pass --local to git-pack-objects")),
|
||||||
|
OPT_BOOL('b', "write-bitmap-index", &write_bitmap,
|
||||||
|
N_("write bitmap index")),
|
||||||
OPT_STRING(0, "unpack-unreachable", &unpack_unreachable, N_("approxidate"),
|
OPT_STRING(0, "unpack-unreachable", &unpack_unreachable, N_("approxidate"),
|
||||||
N_("with -A, do not loosen objects older than this")),
|
N_("with -A, do not loosen objects older than this")),
|
||||||
OPT_STRING(0, "window", &window, N_("n"),
|
OPT_STRING(0, "window", &window, N_("n"),
|
||||||
@ -196,6 +206,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
|
|||||||
argv_array_pushf(&cmd_args, "--no-reuse-delta");
|
argv_array_pushf(&cmd_args, "--no-reuse-delta");
|
||||||
if (no_reuse_object)
|
if (no_reuse_object)
|
||||||
argv_array_pushf(&cmd_args, "--no-reuse-object");
|
argv_array_pushf(&cmd_args, "--no-reuse-object");
|
||||||
|
if (write_bitmap >= 0)
|
||||||
|
argv_array_pushf(&cmd_args, "--%swrite-bitmap-index",
|
||||||
|
write_bitmap ? "" : "no-");
|
||||||
|
|
||||||
if (pack_everything & ALL_INTO_ONE) {
|
if (pack_everything & ALL_INTO_ONE) {
|
||||||
get_non_kept_pack_filenames(&existing_packs);
|
get_non_kept_pack_filenames(&existing_packs);
|
||||||
@ -256,17 +269,17 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
|
|||||||
*/
|
*/
|
||||||
failed = 0;
|
failed = 0;
|
||||||
for_each_string_list_item(item, &names) {
|
for_each_string_list_item(item, &names) {
|
||||||
for (ext = 0; ext < 2; ext++) {
|
for (ext = 0; ext < ARRAY_SIZE(exts); ext++) {
|
||||||
char *fname, *fname_old;
|
char *fname, *fname_old;
|
||||||
fname = mkpathdup("%s/pack-%s%s", packdir,
|
fname = mkpathdup("%s/pack-%s%s", packdir,
|
||||||
item->string, exts[ext]);
|
item->string, exts[ext].name);
|
||||||
if (!file_exists(fname)) {
|
if (!file_exists(fname)) {
|
||||||
free(fname);
|
free(fname);
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
|
|
||||||
fname_old = mkpath("%s/old-%s%s", packdir,
|
fname_old = mkpath("%s/old-%s%s", packdir,
|
||||||
item->string, exts[ext]);
|
item->string, exts[ext].name);
|
||||||
if (file_exists(fname_old))
|
if (file_exists(fname_old))
|
||||||
if (unlink(fname_old))
|
if (unlink(fname_old))
|
||||||
failed = 1;
|
failed = 1;
|
||||||
@ -313,19 +326,23 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
|
|||||||
|
|
||||||
/* Now the ones with the same name are out of the way... */
|
/* Now the ones with the same name are out of the way... */
|
||||||
for_each_string_list_item(item, &names) {
|
for_each_string_list_item(item, &names) {
|
||||||
for (ext = 0; ext < 2; ext++) {
|
for (ext = 0; ext < ARRAY_SIZE(exts); ext++) {
|
||||||
char *fname, *fname_old;
|
char *fname, *fname_old;
|
||||||
struct stat statbuffer;
|
struct stat statbuffer;
|
||||||
|
int exists = 0;
|
||||||
fname = mkpathdup("%s/pack-%s%s",
|
fname = mkpathdup("%s/pack-%s%s",
|
||||||
packdir, item->string, exts[ext]);
|
packdir, item->string, exts[ext].name);
|
||||||
fname_old = mkpathdup("%s-%s%s",
|
fname_old = mkpathdup("%s-%s%s",
|
||||||
packtmp, item->string, exts[ext]);
|
packtmp, item->string, exts[ext].name);
|
||||||
if (!stat(fname_old, &statbuffer)) {
|
if (!stat(fname_old, &statbuffer)) {
|
||||||
statbuffer.st_mode &= ~(S_IWUSR | S_IWGRP | S_IWOTH);
|
statbuffer.st_mode &= ~(S_IWUSR | S_IWGRP | S_IWOTH);
|
||||||
chmod(fname_old, statbuffer.st_mode);
|
chmod(fname_old, statbuffer.st_mode);
|
||||||
|
exists = 1;
|
||||||
|
}
|
||||||
|
if (exists || !exts[ext].optional) {
|
||||||
|
if (rename(fname_old, fname))
|
||||||
|
die_errno(_("renaming '%s' failed"), fname_old);
|
||||||
}
|
}
|
||||||
if (rename(fname_old, fname))
|
|
||||||
die_errno(_("renaming '%s' failed"), fname_old);
|
|
||||||
free(fname);
|
free(fname);
|
||||||
free(fname_old);
|
free(fname_old);
|
||||||
}
|
}
|
||||||
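The loop above renames a temporary file when it exists or when its extension is not marked optional, so only a missing `.pack` or `.idx` is fatal. A minimal sketch of that decision, assuming a caller that already knows which temporary files exist; the `sketch_` names and the `present` array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the exts[] table in cmd_repack(): name plus optional bit. */
struct sketch_ext {
	const char *name;
	unsigned optional:1;
};

static const struct sketch_ext sketch_exts[] = {
	{ ".pack", 0 },
	{ ".idx", 0 },
	{ ".bitmap", 1 },
};

#define SKETCH_NR_EXTS (sizeof(sketch_exts) / sizeof(sketch_exts[0]))

/*
 * present[i] is nonzero when the temporary file for sketch_exts[i] exists.
 * Returns the index of the first required file that is missing, or -1.
 */
static int sketch_first_missing_required(const int *present)
{
	size_t i;

	for (i = 0; i < SKETCH_NR_EXTS; i++)
		if (!present[i] && !sketch_exts[i].optional)
			return (int)i;
	return -1;
}

static void sketch_ext_check(void)
{
	const int no_bitmap[] = { 1, 1, 0 };
	const int no_idx[] = { 1, 0, 1 };

	assert(sketch_first_missing_required(no_bitmap) == -1);
	assert(sketch_first_missing_required(no_idx) == 1);
}
```

Keeping the optional bit in the table means a further optional file type only needs one more row, not another special case in each loop.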
@ -333,12 +350,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
|
|||||||
|
|
||||||
/* Remove the "old-" files */
|
/* Remove the "old-" files */
|
||||||
for_each_string_list_item(item, &names) {
|
for_each_string_list_item(item, &names) {
|
||||||
for (ext = 0; ext < 2; ext++) {
|
for (ext = 0; ext < ARRAY_SIZE(exts); ext++) {
|
||||||
char *fname;
|
char *fname;
|
||||||
fname = mkpath("%s/old-%s%s",
|
fname = mkpath("%s/old-%s%s",
|
||||||
packdir,
|
packdir,
|
||||||
item->string,
|
item->string,
|
||||||
exts[ext]);
|
exts[ext].name);
|
||||||
if (remove_path(fname))
|
if (remove_path(fname))
|
||||||
warning(_("removing '%s' failed"), fname);
|
warning(_("removing '%s' failed"), fname);
|
||||||
}
|
}
|
||||||
|
@ -3,6 +3,8 @@
|
|||||||
#include "diff.h"
|
#include "diff.h"
|
||||||
#include "revision.h"
|
#include "revision.h"
|
||||||
#include "list-objects.h"
|
#include "list-objects.h"
|
||||||
|
#include "pack.h"
|
||||||
|
#include "pack-bitmap.h"
|
||||||
#include "builtin.h"
|
#include "builtin.h"
|
||||||
#include "log-tree.h"
|
#include "log-tree.h"
|
||||||
#include "graph.h"
|
#include "graph.h"
|
||||||
@ -257,6 +259,18 @@ static int show_bisect_vars(struct rev_list_info *info, int reaches, int all)
|
|||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static int show_object_fast(
|
||||||
|
const unsigned char *sha1,
|
||||||
|
enum object_type type,
|
||||||
|
int exclude,
|
||||||
|
uint32_t name_hash,
|
||||||
|
struct packed_git *found_pack,
|
||||||
|
off_t found_offset)
|
||||||
|
{
|
||||||
|
fprintf(stdout, "%s\n", sha1_to_hex(sha1));
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
|
||||||
int cmd_rev_list(int argc, const char **argv, const char *prefix)
|
int cmd_rev_list(int argc, const char **argv, const char *prefix)
|
||||||
{
|
{
|
||||||
struct rev_info revs;
|
struct rev_info revs;
|
||||||
@ -265,6 +279,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
|
|||||||
int bisect_list = 0;
|
int bisect_list = 0;
|
||||||
int bisect_show_vars = 0;
|
int bisect_show_vars = 0;
|
||||||
int bisect_find_all = 0;
|
int bisect_find_all = 0;
|
||||||
|
int use_bitmap_index = 0;
|
||||||
|
|
||||||
git_config(git_default_config, NULL);
|
git_config(git_default_config, NULL);
|
||||||
init_revisions(&revs, prefix);
|
init_revisions(&revs, prefix);
|
||||||
@ -306,6 +321,14 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
|
|||||||
bisect_show_vars = 1;
|
bisect_show_vars = 1;
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
|
if (!strcmp(arg, "--use-bitmap-index")) {
|
||||||
|
use_bitmap_index = 1;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
if (!strcmp(arg, "--test-bitmap")) {
|
||||||
|
test_bitmap_walk(&revs);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
usage(rev_list_usage);
|
usage(rev_list_usage);
|
||||||
|
|
||||||
}
|
}
|
||||||
@ -333,6 +356,22 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
|
|||||||
if (bisect_list)
|
if (bisect_list)
|
||||||
revs.limited = 1;
|
revs.limited = 1;
|
||||||
|
|
||||||
|
if (use_bitmap_index) {
|
||||||
|
if (revs.count && !revs.left_right && !revs.cherry_mark) {
|
||||||
|
uint32_t commit_count;
|
||||||
|
if (!prepare_bitmap_walk(&revs)) {
|
||||||
|
count_bitmap_commit_list(&commit_count, NULL, NULL, NULL);
|
||||||
|
printf("%d\n", commit_count);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
} else if (revs.tag_objects && revs.tree_objects && revs.blob_objects) {
|
||||||
|
if (!prepare_bitmap_walk(&revs)) {
|
||||||
|
traverse_bitmap_commit_list(&show_object_fast);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
if (prepare_revision_walk(&revs))
|
if (prepare_revision_walk(&revs))
|
||||||
die("revision walk setup failed");
|
die("revision walk setup failed");
|
||||||
if (revs.tree_objects)
|
if (revs.tree_objects)
|
||||||
|
1
cache.h
@ -809,6 +809,7 @@ extern int hash_sha1_file(const void *buf, unsigned long len, const char *type,
|
|||||||
extern int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *return_sha1);
|
extern int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *return_sha1);
|
||||||
extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *);
|
extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *);
|
||||||
extern int force_object_loose(const unsigned char *sha1, time_t mtime);
|
extern int force_object_loose(const unsigned char *sha1, time_t mtime);
|
||||||
|
extern int git_open_noatime(const char *name);
|
||||||
extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
|
extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
|
||||||
extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz);
|
extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz);
|
||||||
extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
|
extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
|
||||||
|
112
compat/bswap.h
@ -17,7 +17,20 @@ static inline uint32_t default_swab32(uint32_t val)
|
|||||||
((val & 0x000000ff) << 24));
|
((val & 0x000000ff) << 24));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static inline uint64_t default_bswap64(uint64_t val)
|
||||||
|
{
|
||||||
|
return (((val & (uint64_t)0x00000000000000ffULL) << 56) |
|
||||||
|
((val & (uint64_t)0x000000000000ff00ULL) << 40) |
|
||||||
|
((val & (uint64_t)0x0000000000ff0000ULL) << 24) |
|
||||||
|
((val & (uint64_t)0x00000000ff000000ULL) << 8) |
|
||||||
|
((val & (uint64_t)0x000000ff00000000ULL) >> 8) |
|
||||||
|
((val & (uint64_t)0x0000ff0000000000ULL) >> 24) |
|
||||||
|
((val & (uint64_t)0x00ff000000000000ULL) >> 40) |
|
||||||
|
((val & (uint64_t)0xff00000000000000ULL) >> 56));
|
||||||
|
}
|
||||||
|
|
||||||
#undef bswap32
|
#undef bswap32
|
||||||
|
#undef bswap64
|
||||||
|
|
||||||
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
|
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
|
||||||
|
|
||||||
@ -32,15 +45,42 @@ static inline uint32_t git_bswap32(uint32_t x)
|
|||||||
return result;
|
return result;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#define bswap64 git_bswap64
|
||||||
|
#if defined(__x86_64__)
|
||||||
|
static inline uint64_t git_bswap64(uint64_t x)
|
||||||
|
{
|
||||||
|
uint64_t result;
|
||||||
|
if (__builtin_constant_p(x))
|
||||||
|
result = default_bswap64(x);
|
||||||
|
else
|
||||||
|
__asm__("bswap %q0" : "=r" (result) : "0" (x));
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
#else
|
||||||
|
static inline uint64_t git_bswap64(uint64_t x)
|
||||||
|
{
|
||||||
|
union { uint64_t i64; uint32_t i32[2]; } tmp, result;
|
||||||
|
if (__builtin_constant_p(x))
|
||||||
|
result.i64 = default_bswap64(x);
|
||||||
|
else {
|
||||||
|
tmp.i64 = x;
|
||||||
|
result.i32[0] = git_bswap32(tmp.i32[1]);
|
||||||
|
result.i32[1] = git_bswap32(tmp.i32[0]);
|
||||||
|
}
|
||||||
|
return result.i64;
|
||||||
|
}
|
||||||
|
#endif
|
||||||
|
|
||||||
#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
|
#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
|
||||||
|
|
||||||
#include <stdlib.h>
|
#include <stdlib.h>
|
||||||
|
|
||||||
#define bswap32(x) _byteswap_ulong(x)
|
#define bswap32(x) _byteswap_ulong(x)
|
||||||
|
#define bswap64(x) _byteswap_uint64(x)
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
#ifdef bswap32
|
#if defined(bswap32)
|
||||||
|
|
||||||
#undef ntohl
|
#undef ntohl
|
||||||
#undef htonl
|
#undef htonl
|
||||||
@ -48,3 +88,73 @@ static inline uint32_t git_bswap32(uint32_t x)
|
|||||||
#define htonl(x) bswap32(x)
|
#define htonl(x) bswap32(x)
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
#if defined(bswap64)
|
||||||
|
|
||||||
|
#undef ntohll
|
||||||
|
#undef htonll
|
||||||
|
#define ntohll(x) bswap64(x)
|
||||||
|
#define htonll(x) bswap64(x)
|
||||||
|
|
||||||
|
#else
|
||||||
|
|
||||||
|
#undef ntohll
|
||||||
|
#undef htonll
|
||||||
|
|
||||||
|
#if !defined(__BYTE_ORDER)
|
||||||
|
# if defined(BYTE_ORDER) && defined(LITTLE_ENDIAN) && defined(BIG_ENDIAN)
|
||||||
|
# define __BYTE_ORDER BYTE_ORDER
|
||||||
|
# define __LITTLE_ENDIAN LITTLE_ENDIAN
|
||||||
|
# define __BIG_ENDIAN BIG_ENDIAN
|
||||||
|
# endif
|
||||||
|
#endif
|
||||||
|
|
||||||
|
#if !defined(__BYTE_ORDER)
|
||||||
|
# error "Cannot determine endianness"
|
||||||
|
#endif
|
||||||
|
|
||||||
|
#if __BYTE_ORDER == __BIG_ENDIAN
|
||||||
|
# define ntohll(n) (n)
|
||||||
|
# define htonll(n) (n)
|
||||||
|
#else
|
||||||
|
# define ntohll(n) default_bswap64(n)
|
||||||
|
# define htonll(n) default_bswap64(n)
|
||||||
|
#endif
|
||||||
|
|
||||||
|
#endif
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Performance might be improved if the CPU architecture is OK with
|
||||||
|
* unaligned 32-bit loads and a fast ntohl() is available.
|
||||||
|
* Otherwise fall back to byte loads and shifts which is portable,
|
||||||
|
* and is faster on architectures with memory alignment issues.
|
||||||
|
*/
|
||||||
|
|
||||||
|
#if defined(__i386__) || defined(__x86_64__) || \
|
||||||
|
defined(_M_IX86) || defined(_M_X64) || \
|
||||||
|
defined(__ppc__) || defined(__ppc64__) || \
|
||||||
|
defined(__powerpc__) || defined(__powerpc64__) || \
|
||||||
|
defined(__s390__) || defined(__s390x__)
|
||||||
|
|
||||||
|
#define get_be16(p) ntohs(*(unsigned short *)(p))
|
||||||
|
#define get_be32(p) ntohl(*(unsigned int *)(p))
|
||||||
|
#define put_be32(p, v) do { *(unsigned int *)(p) = htonl(v); } while (0)
|
||||||
|
|
||||||
|
#else
|
||||||
|
|
||||||
|
#define get_be16(p) ( \
|
||||||
|
(*((unsigned char *)(p) + 0) << 8) | \
|
||||||
|
(*((unsigned char *)(p) + 1) << 0) )
|
||||||
|
#define get_be32(p) ( \
|
||||||
|
(*((unsigned char *)(p) + 0) << 24) | \
|
||||||
|
(*((unsigned char *)(p) + 1) << 16) | \
|
||||||
|
(*((unsigned char *)(p) + 2) << 8) | \
|
||||||
|
(*((unsigned char *)(p) + 3) << 0) )
|
||||||
|
#define put_be32(p, v) do { \
|
||||||
|
unsigned int __v = (v); \
|
||||||
|
*((unsigned char *)(p) + 0) = __v >> 24; \
|
||||||
|
*((unsigned char *)(p) + 1) = __v >> 16; \
|
||||||
|
*((unsigned char *)(p) + 2) = __v >> 8; \
|
||||||
|
*((unsigned char *)(p) + 3) = __v >> 0; } while (0)
|
||||||
|
|
||||||
|
#endif
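The byte-shift fallback for `get_be32`/`put_be32` above, and the 64-bit swap used for `ntohll`, can be sketched as ordinary functions. This is a minimal illustration rather than code from the commit: the `sketch_*` names are hypothetical, and the loop-based swap is a stand-in for the mask-and-shift expression in `default_bswap64`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a big-endian 32-bit value byte by byte,
 * so it works regardless of host byte order or pointer alignment. */
static uint32_t sketch_get_be32(const void *p)
{
	const unsigned char *b = p;

	assert(p != NULL);
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8)  | ((uint32_t)b[3] << 0);
}

/* Hypothetical helper: store a 32-bit value in big-endian order. */
static void sketch_put_be32(void *p, uint32_t v)
{
	unsigned char *b = p;

	assert(p != NULL);
	b[0] = (unsigned char)(v >> 24);
	b[1] = (unsigned char)(v >> 16);
	b[2] = (unsigned char)(v >> 8);
	b[3] = (unsigned char)(v >> 0);
}

/* Portable 64-bit byte swap: peel bytes off the low end of the input
 * and push them onto the low end of the result, reversing their order. */
static uint64_t sketch_bswap64(uint64_t val)
{
	uint64_t result = 0;
	int i;

	for (i = 0; i < 8; i++) {
		result = (result << 8) | (val & 0xff);
		val >>= 8;
	}
	return result;
}
```

Storing `0x11223344` with `sketch_put_be32` leaves the bytes `11 22 33 44` in memory on any host, which is the property the on-disk bitmap format relies on.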
|
||||||
|
221
ewah/bitmap.c
Normal file
@ -0,0 +1,221 @@
|
|||||||
|
/**
|
||||||
|
* Copyright 2013, GitHub, Inc
|
||||||
|
* Copyright 2009-2013, Daniel Lemire, Cliff Moon,
|
||||||
|
* David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
|
||||||
|
*
|
||||||
|
* This program is free software; you can redistribute it and/or
|
||||||
|
* modify it under the terms of the GNU General Public License
|
||||||
|
* as published by the Free Software Foundation; either version 2
|
||||||
|
* of the License, or (at your option) any later version.
|
||||||
|
*
|
||||||
|
* This program is distributed in the hope that it will be useful,
|
||||||
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||||
|
* GNU General Public License for more details.
|
||||||
|
*
|
||||||
|
* You should have received a copy of the GNU General Public License
|
||||||
|
* along with this program; if not, write to the Free Software
|
||||||
|
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
|
||||||
|
*/
|
||||||
|
#include "git-compat-util.h"
|
||||||
|
#include "ewok.h"
|
||||||
|
|
||||||
|
#define MASK(x) ((eword_t)1 << (x % BITS_IN_WORD))
|
||||||
|
#define BLOCK(x) (x / BITS_IN_WORD)
|
||||||
|
|
||||||
|
struct bitmap *bitmap_new(void)
|
||||||
|
{
|
||||||
|
struct bitmap *bitmap = ewah_malloc(sizeof(struct bitmap));
|
||||||
|
bitmap->words = ewah_calloc(32, sizeof(eword_t));
|
||||||
|
bitmap->word_alloc = 32;
|
||||||
|
return bitmap;
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_set(struct bitmap *self, size_t pos)
|
||||||
|
{
|
||||||
|
size_t block = BLOCK(pos);
|
||||||
|
|
||||||
|
if (block >= self->word_alloc) {
|
||||||
|
size_t old_size = self->word_alloc;
|
||||||
|
self->word_alloc = block ? block * 2 : 1;
|
||||||
|
self->words = ewah_realloc(self->words,
|
||||||
|
self->word_alloc * sizeof(eword_t));
|
||||||
|
|
||||||
|
memset(self->words + old_size, 0x0,
|
||||||
|
(self->word_alloc - old_size) * sizeof(eword_t));
|
||||||
|
}
|
||||||
|
|
||||||
|
self->words[block] |= MASK(pos);
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_clear(struct bitmap *self, size_t pos)
|
||||||
|
{
|
||||||
|
size_t block = BLOCK(pos);
|
||||||
|
|
||||||
|
if (block < self->word_alloc)
|
||||||
|
self->words[block] &= ~MASK(pos);
|
||||||
|
}
|
||||||
|
|
||||||
|
int bitmap_get(struct bitmap *self, size_t pos)
|
||||||
|
{
|
||||||
|
size_t block = BLOCK(pos);
|
||||||
|
return block < self->word_alloc &&
|
||||||
|
(self->words[block] & MASK(pos)) != 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
struct ewah_bitmap *bitmap_to_ewah(struct bitmap *bitmap)
|
||||||
|
{
|
||||||
|
struct ewah_bitmap *ewah = ewah_new();
|
||||||
|
size_t i, running_empty_words = 0;
|
||||||
|
eword_t last_word = 0;
|
||||||
|
|
||||||
|
for (i = 0; i < bitmap->word_alloc; ++i) {
|
||||||
|
if (bitmap->words[i] == 0) {
|
||||||
|
running_empty_words++;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (last_word != 0)
|
||||||
|
ewah_add(ewah, last_word);
|
||||||
|
|
||||||
|
if (running_empty_words > 0) {
|
||||||
|
ewah_add_empty_words(ewah, 0, running_empty_words);
|
||||||
|
running_empty_words = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
last_word = bitmap->words[i];
|
||||||
|
}
|
||||||
|
|
||||||
|
ewah_add(ewah, last_word);
|
||||||
|
return ewah;
|
||||||
|
}
|
||||||
|
|
||||||
|
struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
|
||||||
|
{
|
||||||
|
struct bitmap *bitmap = bitmap_new();
|
||||||
|
struct ewah_iterator it;
|
||||||
|
eword_t blowup;
|
||||||
|
size_t i = 0;
|
||||||
|
|
||||||
|
ewah_iterator_init(&it, ewah);
|
||||||
|
|
||||||
|
while (ewah_iterator_next(&blowup, &it)) {
|
||||||
|
if (i >= bitmap->word_alloc) {
|
||||||
|
bitmap->word_alloc *= 1.5;
|
||||||
|
bitmap->words = ewah_realloc(
|
||||||
|
bitmap->words, bitmap->word_alloc * sizeof(eword_t));
|
||||||
|
}
|
||||||
|
|
||||||
|
bitmap->words[i++] = blowup;
|
||||||
|
}
|
||||||
|
|
||||||
|
bitmap->word_alloc = i;
|
||||||
|
return bitmap;
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_and_not(struct bitmap *self, struct bitmap *other)
|
||||||
|
{
|
||||||
|
const size_t count = (self->word_alloc < other->word_alloc) ?
|
||||||
|
self->word_alloc : other->word_alloc;
|
||||||
|
|
||||||
|
size_t i;
|
||||||
|
|
||||||
|
for (i = 0; i < count; ++i)
|
||||||
|
self->words[i] &= ~other->words[i];
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other)
|
||||||
|
{
|
||||||
|
size_t original_size = self->word_alloc;
|
||||||
|
size_t other_final = (other->bit_size / BITS_IN_WORD) + 1;
|
||||||
|
size_t i = 0;
|
||||||
|
struct ewah_iterator it;
|
||||||
|
eword_t word;
|
||||||
|
|
||||||
|
if (self->word_alloc < other_final) {
|
||||||
|
self->word_alloc = other_final;
|
||||||
|
self->words = ewah_realloc(self->words,
|
||||||
|
self->word_alloc * sizeof(eword_t));
|
||||||
|
memset(self->words + original_size, 0x0,
|
||||||
|
(self->word_alloc - original_size) * sizeof(eword_t));
|
||||||
|
}
|
||||||
|
|
||||||
|
ewah_iterator_init(&it, other);
|
||||||
|
|
||||||
|
while (ewah_iterator_next(&word, &it))
|
||||||
|
self->words[i++] |= word;
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_each_bit(struct bitmap *self, ewah_callback callback, void *data)
|
||||||
|
{
|
||||||
|
size_t pos = 0, i;
|
||||||
|
|
||||||
|
for (i = 0; i < self->word_alloc; ++i) {
|
||||||
|
eword_t word = self->words[i];
|
||||||
|
uint32_t offset;
|
||||||
|
|
||||||
|
if (word == (eword_t)~0) {
|
||||||
|
for (offset = 0; offset < BITS_IN_WORD; ++offset)
|
||||||
|
callback(pos++, data);
|
||||||
|
} else {
|
||||||
|
for (offset = 0; offset < BITS_IN_WORD; ++offset) {
|
||||||
|
if ((word >> offset) == 0)
|
||||||
|
break;
|
||||||
|
|
||||||
|
offset += ewah_bit_ctz64(word >> offset);
|
||||||
|
callback(pos + offset, data);
|
||||||
|
}
|
||||||
|
pos += BITS_IN_WORD;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t bitmap_popcount(struct bitmap *self)
|
||||||
|
{
|
||||||
|
size_t i, count = 0;
|
||||||
|
|
||||||
|
for (i = 0; i < self->word_alloc; ++i)
|
||||||
|
count += ewah_bit_popcount64(self->words[i]);
|
||||||
|
|
||||||
|
return count;
|
||||||
|
}
|
||||||
|
|
||||||
|
int bitmap_equals(struct bitmap *self, struct bitmap *other)
|
||||||
|
{
|
||||||
|
struct bitmap *big, *small;
|
||||||
|
size_t i;
|
||||||
|
|
||||||
|
if (self->word_alloc < other->word_alloc) {
|
||||||
|
small = self;
|
||||||
|
big = other;
|
||||||
|
} else {
|
||||||
|
small = other;
|
||||||
|
big = self;
|
||||||
|
}
|
||||||
|
|
||||||
|
for (i = 0; i < small->word_alloc; ++i) {
|
||||||
|
if (small->words[i] != big->words[i])
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
for (; i < big->word_alloc; ++i) {
|
||||||
|
if (big->words[i] != 0)
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_reset(struct bitmap *bitmap)
|
||||||
|
{
|
||||||
|
memset(bitmap->words, 0x0, bitmap->word_alloc * sizeof(eword_t));
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_free(struct bitmap *bitmap)
|
||||||
|
{
|
||||||
|
if (bitmap == NULL)
|
||||||
|
return;
|
||||||
|
|
||||||
|
free(bitmap->words);
|
||||||
|
free(bitmap);
|
||||||
|
}
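The word-index arithmetic behind `BLOCK()` and `MASK()`, together with the zero-filled growth in `bitmap_set`, can be sketched in isolation. This is an illustrative toy and not the commit's code: `toy_bitmap`, `toy_set` and `toy_get` are hypothetical names, the word size is assumed to be 64 bits, and allocation failure is only caught by `assert`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BITS_IN_WORD 64

struct toy_bitmap {
	uint64_t *words;   /* backing storage, one bit per position */
	size_t word_alloc; /* number of words currently allocated */
};

static void toy_set(struct toy_bitmap *bm, size_t pos)
{
	size_t block = pos / TOY_BITS_IN_WORD;

	if (block >= bm->word_alloc) {
		size_t old = bm->word_alloc;
		/* Grow past the target block; an empty bitmap needs one word. */
		size_t want = block ? block * 2 : 1;
		uint64_t *grown = realloc(bm->words, want * sizeof(*grown));

		assert(grown != NULL);
		/* realloc() leaves the new tail uninitialized: clear it. */
		memset(grown + old, 0, (want - old) * sizeof(*grown));
		bm->words = grown;
		bm->word_alloc = want;
	}

	bm->words[block] |= (uint64_t)1 << (pos % TOY_BITS_IN_WORD);
}

static int toy_get(const struct toy_bitmap *bm, size_t pos)
{
	size_t block = pos / TOY_BITS_IN_WORD;

	/* Positions beyond the allocation are simply unset. */
	return (block < bm->word_alloc) &&
	       ((bm->words[block] >> (pos % TOY_BITS_IN_WORD)) & 1);
}
```

Starting from `{ NULL, 0 }`, setting bit 0 allocates a single word, and setting bit 130 afterwards grows the array to four words with every new word cleared.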
|
714
ewah/ewah_bitmap.c
Normal file
@ -0,0 +1,714 @@
|
|||||||
|
/**
|
||||||
|
* Copyright 2013, GitHub, Inc
|
||||||
|
* Copyright 2009-2013, Daniel Lemire, Cliff Moon,
|
||||||
|
* David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
|
||||||
|
*
|
||||||
|
* This program is free software; you can redistribute it and/or
|
||||||
|
* modify it under the terms of the GNU General Public License
|
||||||
|
* as published by the Free Software Foundation; either version 2
|
||||||
|
* of the License, or (at your option) any later version.
|
||||||
|
*
|
||||||
|
* This program is distributed in the hope that it will be useful,
|
||||||
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||||
|
* GNU General Public License for more details.
|
||||||
|
*
|
||||||
|
* You should have received a copy of the GNU General Public License
|
||||||
|
* along with this program; if not, write to the Free Software
|
||||||
|
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
|
||||||
|
*/
|
||||||
|
#include "git-compat-util.h"
|
||||||
|
#include "ewok.h"
|
||||||
|
#include "ewok_rlw.h"
|
||||||
|
|
||||||
|
static inline size_t min_size(size_t a, size_t b)
|
||||||
|
{
|
||||||
|
return a < b ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline size_t max_size(size_t a, size_t b)
|
||||||
|
{
|
||||||
|
return a > b ? a : b;
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
|
||||||
|
{
|
||||||
|
size_t rlw_offset = (uint8_t *)self->rlw - (uint8_t *)self->buffer;
|
||||||
|
|
||||||
|
if (self->alloc_size >= new_size)
|
||||||
|
return;
|
||||||
|
|
||||||
|
self->alloc_size = new_size;
|
||||||
|
self->buffer = ewah_realloc(self->buffer,
|
||||||
|
self->alloc_size * sizeof(eword_t));
|
||||||
|
self->rlw = self->buffer + (rlw_offset / sizeof(eword_t));
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline void buffer_push(struct ewah_bitmap *self, eword_t value)
|
||||||
|
{
|
||||||
|
if (self->buffer_size + 1 >= self->alloc_size)
|
||||||
|
buffer_grow(self, self->buffer_size * 3 / 2);
|
||||||
|
|
||||||
|
self->buffer[self->buffer_size++] = value;
|
||||||
|
}
|
||||||
|
|
||||||
|
static void buffer_push_rlw(struct ewah_bitmap *self, eword_t value)
|
||||||
|
{
|
||||||
|
buffer_push(self, value);
|
||||||
|
self->rlw = self->buffer + self->buffer_size - 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
static size_t add_empty_words(struct ewah_bitmap *self, int v, size_t number)
|
||||||
|
{
|
||||||
|
size_t added = 0;
|
||||||
|
eword_t runlen, can_add;
|
||||||
|
|
||||||
|
if (rlw_get_run_bit(self->rlw) != v && rlw_size(self->rlw) == 0) {
|
||||||
|
rlw_set_run_bit(self->rlw, v);
|
||||||
|
} else if (rlw_get_literal_words(self->rlw) != 0 ||
|
||||||
|
rlw_get_run_bit(self->rlw) != v) {
|
||||||
|
buffer_push_rlw(self, 0);
|
||||||
|
if (v) rlw_set_run_bit(self->rlw, v);
|
||||||
|
added++;
|
||||||
|
}
|
||||||
|
|
||||||
|
runlen = rlw_get_running_len(self->rlw);
|
||||||
|
can_add = min_size(number, RLW_LARGEST_RUNNING_COUNT - runlen);
|
||||||
|
|
||||||
|
rlw_set_running_len(self->rlw, runlen + can_add);
|
||||||
|
number -= can_add;
|
||||||
|
|
||||||
|
while (number >= RLW_LARGEST_RUNNING_COUNT) {
|
||||||
|
buffer_push_rlw(self, 0);
|
||||||
|
added++;
|
||||||
|
if (v) rlw_set_run_bit(self->rlw, v);
|
||||||
|
rlw_set_running_len(self->rlw, RLW_LARGEST_RUNNING_COUNT);
|
||||||
|
number -= RLW_LARGEST_RUNNING_COUNT;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (number > 0) {
|
||||||
|
buffer_push_rlw(self, 0);
|
||||||
|
added++;
|
||||||
|
|
||||||
|
if (v) rlw_set_run_bit(self->rlw, v);
|
||||||
|
rlw_set_running_len(self->rlw, number);
|
||||||
|
}
|
||||||
|
|
||||||
|
return added;
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t ewah_add_empty_words(struct ewah_bitmap *self, int v, size_t number)
|
||||||
|
{
|
||||||
|
if (number == 0)
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
self->bit_size += number * BITS_IN_WORD;
|
||||||
|
return add_empty_words(self, v, number);
|
||||||
|
}
|
||||||
|
|
||||||
|
static size_t add_literal(struct ewah_bitmap *self, eword_t new_data)
|
||||||
|
{
|
||||||
|
eword_t current_num = rlw_get_literal_words(self->rlw);
|
||||||
|
|
||||||
|
if (current_num >= RLW_LARGEST_LITERAL_COUNT) {
|
||||||
|
buffer_push_rlw(self, 0);
|
||||||
|
|
||||||
|
rlw_set_literal_words(self->rlw, 1);
|
||||||
|
buffer_push(self, new_data);
|
||||||
|
return 2;
|
||||||
|
}
|
||||||
|
|
||||||
|
rlw_set_literal_words(self->rlw, current_num + 1);
|
||||||
|
|
||||||
|
/* sanity check */
|
||||||
|
assert(rlw_get_literal_words(self->rlw) == current_num + 1);
|
||||||
|
|
||||||
|
buffer_push(self, new_data);
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_add_dirty_words(
|
||||||
|
struct ewah_bitmap *self, const eword_t *buffer,
|
||||||
|
size_t number, int negate)
|
||||||
|
{
|
||||||
|
size_t literals, can_add;
|
||||||
|
|
||||||
|
while (1) {
|
||||||
|
literals = rlw_get_literal_words(self->rlw);
|
||||||
|
can_add = min_size(number, RLW_LARGEST_LITERAL_COUNT - literals);
|
||||||
|
|
||||||
|
rlw_set_literal_words(self->rlw, literals + can_add);
|
||||||
|
|
||||||
|
if (self->buffer_size + can_add >= self->alloc_size)
|
||||||
|
buffer_grow(self, (self->buffer_size + can_add) * 3 / 2);
|
||||||
|
|
||||||
|
if (negate) {
|
||||||
|
size_t i;
|
||||||
|
for (i = 0; i < can_add; ++i)
|
||||||
|
self->buffer[self->buffer_size++] = ~buffer[i];
|
||||||
|
} else {
|
||||||
|
memcpy(self->buffer + self->buffer_size,
|
||||||
|
buffer, can_add * sizeof(eword_t));
|
||||||
|
self->buffer_size += can_add;
|
||||||
|
}
|
||||||
|
|
||||||
|
self->bit_size += can_add * BITS_IN_WORD;
|
||||||
|
|
||||||
|
if (number - can_add == 0)
|
||||||
|
break;
|
||||||
|
|
||||||
|
buffer_push_rlw(self, 0);
|
||||||
|
buffer += can_add;
|
||||||
|
number -= can_add;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
static size_t add_empty_word(struct ewah_bitmap *self, int v)
|
||||||
|
{
|
||||||
|
int no_literal = (rlw_get_literal_words(self->rlw) == 0);
|
||||||
|
eword_t run_len = rlw_get_running_len(self->rlw);
|
||||||
|
|
||||||
|
if (no_literal && run_len == 0) {
|
||||||
|
rlw_set_run_bit(self->rlw, v);
|
||||||
|
assert(rlw_get_run_bit(self->rlw) == v);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (no_literal && rlw_get_run_bit(self->rlw) == v &&
|
||||||
|
run_len < RLW_LARGEST_RUNNING_COUNT) {
|
||||||
|
rlw_set_running_len(self->rlw, run_len + 1);
|
||||||
|
assert(rlw_get_running_len(self->rlw) == run_len + 1);
|
||||||
|
return 0;
|
||||||
|
} else {
|
||||||
|
buffer_push_rlw(self, 0);
|
||||||
|
|
||||||
|
assert(rlw_get_running_len(self->rlw) == 0);
|
||||||
|
assert(rlw_get_run_bit(self->rlw) == 0);
|
||||||
|
assert(rlw_get_literal_words(self->rlw) == 0);
|
||||||
|
|
||||||
|
rlw_set_run_bit(self->rlw, v);
|
||||||
|
assert(rlw_get_run_bit(self->rlw) == v);
|
||||||
|
|
||||||
|
rlw_set_running_len(self->rlw, 1);
|
||||||
|
assert(rlw_get_running_len(self->rlw) == 1);
|
||||||
|
assert(rlw_get_literal_words(self->rlw) == 0);
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
size_t ewah_add(struct ewah_bitmap *self, eword_t word)
|
||||||
|
{
|
||||||
|
self->bit_size += BITS_IN_WORD;
|
||||||
|
|
||||||
|
if (word == 0)
|
||||||
|
return add_empty_word(self, 0);
|
||||||
|
|
||||||
|
if (word == (eword_t)(~0))
|
||||||
|
return add_empty_word(self, 1);
|
||||||
|
|
||||||
|
return add_literal(self, word);
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_set(struct ewah_bitmap *self, size_t i)
|
||||||
|
{
|
||||||
|
const size_t dist =
|
||||||
|
(i + BITS_IN_WORD) / BITS_IN_WORD -
|
||||||
|
(self->bit_size + BITS_IN_WORD - 1) / BITS_IN_WORD;
|
||||||
|
|
||||||
|
assert(i >= self->bit_size);
|
||||||
|
|
||||||
|
self->bit_size = i + 1;
|
||||||
|
|
||||||
|
if (dist > 0) {
|
||||||
|
if (dist > 1)
|
||||||
|
add_empty_words(self, 0, dist - 1);
|
||||||
|
|
||||||
|
add_literal(self, (eword_t)1 << (i % BITS_IN_WORD));
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (rlw_get_literal_words(self->rlw) == 0) {
|
||||||
|
rlw_set_running_len(self->rlw,
|
||||||
|
rlw_get_running_len(self->rlw) - 1);
|
||||||
|
add_literal(self, (eword_t)1 << (i % BITS_IN_WORD));
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
self->buffer[self->buffer_size - 1] |=
|
||||||
|
((eword_t)1 << (i % BITS_IN_WORD));
|
||||||
|
|
||||||
|
/* check if we just completed a stream of 1s */
|
||||||
|
if (self->buffer[self->buffer_size - 1] == (eword_t)(~0)) {
|
||||||
|
self->buffer[--self->buffer_size] = 0;
|
||||||
|
rlw_set_literal_words(self->rlw,
|
||||||
|
rlw_get_literal_words(self->rlw) - 1);
|
||||||
|
add_empty_word(self, 1);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_each_bit(struct ewah_bitmap *self, void (*callback)(size_t, void*), void *payload)
|
||||||
|
{
|
||||||
|
size_t pos = 0;
|
||||||
|
size_t pointer = 0;
|
||||||
|
size_t k;
|
||||||
|
|
||||||
|
while (pointer < self->buffer_size) {
|
||||||
|
eword_t *word = &self->buffer[pointer];
|
||||||
|
|
||||||
|
if (rlw_get_run_bit(word)) {
|
||||||
|
size_t len = rlw_get_running_len(word) * BITS_IN_WORD;
|
||||||
|
for (k = 0; k < len; ++k, ++pos)
|
||||||
|
callback(pos, payload);
|
||||||
|
} else {
|
||||||
|
pos += rlw_get_running_len(word) * BITS_IN_WORD;
|
||||||
|
}
|
||||||
|
|
||||||
|
++pointer;
|
||||||
|
|
||||||
|
for (k = 0; k < rlw_get_literal_words(word); ++k) {
|
||||||
|
int c;
|
||||||
|
|
||||||
|
/* todo: zero count optimization */
|
||||||
|
for (c = 0; c < BITS_IN_WORD; ++c, ++pos) {
|
||||||
|
if ((self->buffer[pointer] & ((eword_t)1 << c)) != 0)
|
||||||
|
callback(pos, payload);
|
||||||
|
}
|
||||||
|
|
||||||
|
++pointer;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
struct ewah_bitmap *ewah_new(void)
|
||||||
|
{
|
||||||
|
struct ewah_bitmap *self;
|
||||||
|
|
||||||
|
self = ewah_malloc(sizeof(struct ewah_bitmap));
|
||||||
|
if (self == NULL)
|
||||||
|
return NULL;
|
||||||
|
|
||||||
|
self->buffer = ewah_malloc(32 * sizeof(eword_t));
|
||||||
|
self->alloc_size = 32;
|
||||||
|
|
||||||
|
ewah_clear(self);
|
||||||
|
return self;
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_clear(struct ewah_bitmap *self)
|
||||||
|
{
|
||||||
|
self->buffer_size = 1;
|
||||||
|
self->buffer[0] = 0;
|
||||||
|
self->bit_size = 0;
|
||||||
|
self->rlw = self->buffer;
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_free(struct ewah_bitmap *self)
|
||||||
|
{
|
||||||
|
if (!self)
|
||||||
|
return;
|
||||||
|
|
||||||
|
if (self->alloc_size)
|
||||||
|
free(self->buffer);
|
||||||
|
|
||||||
|
free(self);
|
||||||
|
}
|
||||||
|
|
||||||
|
static void read_new_rlw(struct ewah_iterator *it)
|
||||||
|
{
|
||||||
|
const eword_t *word = NULL;
|
||||||
|
|
||||||
|
it->literals = 0;
|
||||||
|
it->compressed = 0;
|
||||||
|
|
||||||
|
while (1) {
|
||||||
|
word = &it->buffer[it->pointer];
|
||||||
|
|
||||||
|
it->rl = rlw_get_running_len(word);
|
||||||
|
it->lw = rlw_get_literal_words(word);
|
||||||
|
it->b = rlw_get_run_bit(word);
|
||||||
|
|
||||||
|
if (it->rl || it->lw)
|
||||||
|
return;
|
||||||
|
|
||||||
|
if (it->pointer < it->buffer_size - 1) {
|
||||||
|
it->pointer++;
|
||||||
|
} else {
|
||||||
|
it->pointer = it->buffer_size;
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
int ewah_iterator_next(eword_t *next, struct ewah_iterator *it)
|
||||||
|
{
|
||||||
|
if (it->pointer >= it->buffer_size)
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
if (it->compressed < it->rl) {
|
||||||
|
it->compressed++;
|
||||||
|
*next = it->b ? (eword_t)(~0) : 0;
|
||||||
|
} else {
|
||||||
|
assert(it->literals < it->lw);
|
||||||
|
|
||||||
|
it->literals++;
|
||||||
|
it->pointer++;
|
||||||
|
|
||||||
|
assert(it->pointer < it->buffer_size);
|
||||||
|
|
||||||
|
*next = it->buffer[it->pointer];
|
||||||
|
}
|
||||||
|
|
||||||
|
if (it->compressed == it->rl && it->literals == it->lw) {
|
||||||
|
if (++it->pointer < it->buffer_size)
|
||||||
|
read_new_rlw(it);
|
||||||
|
}
|
||||||
|
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent)
|
||||||
|
{
|
||||||
|
it->buffer = parent->buffer;
|
||||||
|
it->buffer_size = parent->buffer_size;
|
||||||
|
it->pointer = 0;
|
||||||
|
|
||||||
|
it->lw = 0;
|
||||||
|
it->rl = 0;
|
||||||
|
it->compressed = 0;
|
||||||
|
it->literals = 0;
|
||||||
|
it->b = 0;
|
||||||
|
|
||||||
|
if (it->pointer < it->buffer_size)
|
||||||
|
read_new_rlw(it);
|
||||||
|
}
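The iterator above walks a buffer in which each running-length word (RLW) is followed by its literal words. A rough sketch of that decoding is shown here. The field layout is an assumption, since `ewok_rlw.h` is not shown in this part of the diff: bit 0 is taken to be the run bit, the next 32 bits the run length, and the top 31 bits the literal count. All `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field widths; not taken from the lines shown in this diff. */
#define SKETCH_RUN_BITS 32
#define SKETCH_RUN_MAX  (((uint64_t)1 << SKETCH_RUN_BITS) - 1)

static int sketch_run_bit(uint64_t rlw)
{
	return (int)(rlw & 1);
}

static uint64_t sketch_run_len(uint64_t rlw)
{
	return (rlw >> 1) & SKETCH_RUN_MAX;
}

static uint64_t sketch_literals(uint64_t rlw)
{
	return rlw >> (1 + SKETCH_RUN_BITS);
}

static uint64_t sketch_make_rlw(int bit, uint64_t run_len, uint64_t literals)
{
	assert(run_len <= SKETCH_RUN_MAX);
	assert(literals < ((uint64_t)1 << 31));
	return (uint64_t)(bit & 1) | (run_len << 1) |
	       (literals << (1 + SKETCH_RUN_BITS));
}

/*
 * Expand a compressed buffer into plain words. Returns the number of
 * words the stream describes; at most out_cap of them are stored.
 */
static size_t sketch_expand(const uint64_t *buf, size_t buf_len,
			    uint64_t *out, size_t out_cap)
{
	size_t in = 0, n = 0;

	while (in < buf_len) {
		uint64_t rlw = buf[in++];
		uint64_t fill = sketch_run_bit(rlw) ? ~(uint64_t)0 : 0;
		uint64_t k;

		/* A run of identical all-zero or all-one words. */
		for (k = 0; k < sketch_run_len(rlw); k++) {
			if (n < out_cap)
				out[n] = fill;
			n++;
		}
		/* Literal words stored verbatim after the RLW. */
		for (k = 0; k < sketch_literals(rlw) && in < buf_len; k++) {
			if (n < out_cap)
				out[n] = buf[in];
			n++;
			in++;
		}
	}
	return n;
}
```

Under these assumptions, a three-word buffer holding "three zero words, one literal" followed by "two all-ones words" expands to six plain words, which is where the format saves space when bitmaps contain long uniform runs.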
|
||||||
|
|
||||||
|
void ewah_not(struct ewah_bitmap *self)
|
||||||
|
{
|
||||||
|
size_t pointer = 0;
|
||||||
|
|
||||||
|
while (pointer < self->buffer_size) {
|
||||||
|
eword_t *word = &self->buffer[pointer];
|
||||||
|
size_t literals, k;
|
||||||
|
|
||||||
|
rlw_xor_run_bit(word);
|
||||||
|
++pointer;
|
||||||
|
|
||||||
|
literals = rlw_get_literal_words(word);
|
||||||
|
for (k = 0; k < literals; ++k) {
|
||||||
|
self->buffer[pointer] = ~self->buffer[pointer];
|
||||||
|
++pointer;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_xor(
|
||||||
|
struct ewah_bitmap *ewah_i,
|
||||||
|
struct ewah_bitmap *ewah_j,
|
||||||
|
struct ewah_bitmap *out)
|
||||||
|
{
|
||||||
|
struct rlw_iterator rlw_i;
|
||||||
|
struct rlw_iterator rlw_j;
|
||||||
|
size_t literals;
|
||||||
|
|
||||||
|
rlwit_init(&rlw_i, ewah_i);
|
||||||
|
rlwit_init(&rlw_j, ewah_j);
|
||||||
|
|
||||||
|
while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
|
||||||
|
while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
|
||||||
|
struct rlw_iterator *prey, *predator;
|
||||||
|
size_t index;
|
||||||
|
int negate_words;
|
||||||
|
|
||||||
|
if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
|
||||||
|
prey = &rlw_i;
|
||||||
|
predator = &rlw_j;
|
||||||
|
} else {
|
||||||
|
prey = &rlw_j;
|
||||||
|
predator = &rlw_i;
|
||||||
|
}
|
||||||
|
|
||||||
|
negate_words = !!predator->rlw.running_bit;
|
||||||
|
index = rlwit_discharge(prey, out,
|
||||||
|
predator->rlw.running_len, negate_words);
|
||||||
|
|
||||||
|
ewah_add_empty_words(out, negate_words,
|
||||||
|
predator->rlw.running_len - index);
|
||||||
|
|
||||||
|
rlwit_discard_first_words(predator,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
}
|
||||||
|
|
||||||
|
literals = min_size(
|
||||||
|
rlw_i.rlw.literal_words,
|
||||||
|
rlw_j.rlw.literal_words);
|
||||||
|
|
||||||
|
if (literals) {
|
||||||
|
size_t k;
|
||||||
|
|
||||||
|
for (k = 0; k < literals; ++k) {
|
||||||
|
ewah_add(out,
|
||||||
|
rlw_i.buffer[rlw_i.literal_word_start + k] ^
|
||||||
|
rlw_j.buffer[rlw_j.literal_word_start + k]
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
rlwit_discard_first_words(&rlw_i, literals);
|
||||||
|
rlwit_discard_first_words(&rlw_j, literals);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (rlwit_word_size(&rlw_i) > 0)
|
||||||
|
rlwit_discharge(&rlw_i, out, ~0, 0);
|
||||||
|
else
|
||||||
|
rlwit_discharge(&rlw_j, out, ~0, 0);
|
||||||
|
|
||||||
|
out->bit_size = max_size(ewah_i->bit_size, ewah_j->bit_size);
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_and(
|
||||||
|
struct ewah_bitmap *ewah_i,
|
||||||
|
struct ewah_bitmap *ewah_j,
|
||||||
|
struct ewah_bitmap *out)
|
||||||
|
{
|
||||||
|
struct rlw_iterator rlw_i;
|
||||||
|
struct rlw_iterator rlw_j;
|
||||||
|
size_t literals;
|
||||||
|
|
||||||
|
rlwit_init(&rlw_i, ewah_i);
|
||||||
|
rlwit_init(&rlw_j, ewah_j);
|
||||||
|
|
||||||
|
while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
|
||||||
|
while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
|
||||||
|
struct rlw_iterator *prey, *predator;
|
||||||
|
|
||||||
|
if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
|
||||||
|
prey = &rlw_i;
|
||||||
|
predator = &rlw_j;
|
||||||
|
} else {
|
||||||
|
prey = &rlw_j;
|
||||||
|
predator = &rlw_i;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (predator->rlw.running_bit == 0) {
|
||||||
|
ewah_add_empty_words(out, 0,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
rlwit_discard_first_words(prey,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
rlwit_discard_first_words(predator,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
} else {
|
||||||
|
size_t index = rlwit_discharge(prey, out,
|
||||||
|
predator->rlw.running_len, 0);
|
||||||
|
ewah_add_empty_words(out, 0,
|
||||||
|
predator->rlw.running_len - index);
|
||||||
|
rlwit_discard_first_words(predator,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
literals = min_size(
|
||||||
|
rlw_i.rlw.literal_words,
|
||||||
|
rlw_j.rlw.literal_words);
|
||||||
|
|
||||||
|
if (literals) {
|
||||||
|
size_t k;
|
||||||
|
|
||||||
|
for (k = 0; k < literals; ++k) {
|
||||||
|
ewah_add(out,
|
||||||
|
rlw_i.buffer[rlw_i.literal_word_start + k] &
|
||||||
|
rlw_j.buffer[rlw_j.literal_word_start + k]
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
rlwit_discard_first_words(&rlw_i, literals);
|
||||||
|
rlwit_discard_first_words(&rlw_j, literals);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (rlwit_word_size(&rlw_i) > 0)
|
||||||
|
rlwit_discharge_empty(&rlw_i, out);
|
||||||
|
else
|
||||||
|
rlwit_discharge_empty(&rlw_j, out);
|
||||||
|
|
||||||
|
out->bit_size = max_size(ewah_i->bit_size, ewah_j->bit_size);
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_and_not(
|
||||||
|
struct ewah_bitmap *ewah_i,
|
||||||
|
struct ewah_bitmap *ewah_j,
|
||||||
|
struct ewah_bitmap *out)
|
||||||
|
{
|
||||||
|
struct rlw_iterator rlw_i;
|
||||||
|
struct rlw_iterator rlw_j;
|
||||||
|
size_t literals;
|
||||||
|
|
||||||
|
rlwit_init(&rlw_i, ewah_i);
|
||||||
|
rlwit_init(&rlw_j, ewah_j);
|
||||||
|
|
||||||
|
while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
|
||||||
|
while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
|
||||||
|
struct rlw_iterator *prey, *predator;
|
||||||
|
|
||||||
|
if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
|
||||||
|
prey = &rlw_i;
|
||||||
|
predator = &rlw_j;
|
||||||
|
} else {
|
||||||
|
prey = &rlw_j;
|
||||||
|
predator = &rlw_i;
|
||||||
|
}
|
||||||
|
|
||||||
|
if ((predator->rlw.running_bit && prey == &rlw_i) ||
|
||||||
|
(!predator->rlw.running_bit && prey != &rlw_i)) {
|
||||||
|
ewah_add_empty_words(out, 0,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
rlwit_discard_first_words(prey,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
rlwit_discard_first_words(predator,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
} else {
|
||||||
|
size_t index;
|
||||||
|
int negate_words;
|
||||||
|
|
||||||
|
negate_words = (&rlw_i != prey);
|
||||||
|
index = rlwit_discharge(prey, out,
|
||||||
|
predator->rlw.running_len, negate_words);
|
||||||
|
ewah_add_empty_words(out, negate_words,
|
||||||
|
predator->rlw.running_len - index);
|
||||||
|
rlwit_discard_first_words(predator,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
literals = min_size(
|
||||||
|
rlw_i.rlw.literal_words,
|
||||||
|
rlw_j.rlw.literal_words);
|
||||||
|
|
||||||
|
if (literals) {
|
||||||
|
size_t k;
|
||||||
|
|
||||||
|
for (k = 0; k < literals; ++k) {
|
||||||
|
ewah_add(out,
|
||||||
|
rlw_i.buffer[rlw_i.literal_word_start + k] &
|
||||||
|
~(rlw_j.buffer[rlw_j.literal_word_start + k])
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
rlwit_discard_first_words(&rlw_i, literals);
|
||||||
|
rlwit_discard_first_words(&rlw_j, literals);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (rlwit_word_size(&rlw_i) > 0)
|
||||||
|
rlwit_discharge(&rlw_i, out, ~0, 0);
|
||||||
|
else
|
||||||
|
rlwit_discharge_empty(&rlw_j, out);
|
||||||
|
|
||||||
|
out->bit_size = max_size(ewah_i->bit_size, ewah_j->bit_size);
|
||||||
|
}
|
||||||
|
|
||||||
|
void ewah_or(
|
||||||
|
struct ewah_bitmap *ewah_i,
|
||||||
|
struct ewah_bitmap *ewah_j,
|
||||||
|
struct ewah_bitmap *out)
|
||||||
|
{
|
||||||
|
struct rlw_iterator rlw_i;
|
||||||
|
struct rlw_iterator rlw_j;
|
||||||
|
size_t literals;
|
||||||
|
|
||||||
|
rlwit_init(&rlw_i, ewah_i);
|
||||||
|
rlwit_init(&rlw_j, ewah_j);
|
||||||
|
|
||||||
|
while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
|
||||||
|
while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
|
||||||
|
struct rlw_iterator *prey, *predator;
|
||||||
|
|
||||||
|
if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
|
||||||
|
prey = &rlw_i;
|
||||||
|
predator = &rlw_j;
|
||||||
|
} else {
|
||||||
|
prey = &rlw_j;
|
||||||
|
predator = &rlw_i;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (predator->rlw.running_bit) {
|
||||||
|
ewah_add_empty_words(out, 1,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
rlwit_discard_first_words(prey,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
rlwit_discard_first_words(predator,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
} else {
|
||||||
|
size_t index = rlwit_discharge(prey, out,
|
||||||
|
predator->rlw.running_len, 0);
|
||||||
|
ewah_add_empty_words(out, 0,
|
||||||
|
predator->rlw.running_len - index);
|
||||||
|
rlwit_discard_first_words(predator,
|
||||||
|
predator->rlw.running_len);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
literals = min_size(
|
||||||
|
rlw_i.rlw.literal_words,
|
||||||
|
rlw_j.rlw.literal_words);
|
||||||
|
|
||||||
|
if (literals) {
|
||||||
|
size_t k;
|
||||||
|
|
||||||
|
for (k = 0; k < literals; ++k) {
|
||||||
|
ewah_add(out,
|
||||||
|
rlw_i.buffer[rlw_i.literal_word_start + k] |
|
||||||
|
rlw_j.buffer[rlw_j.literal_word_start + k]
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
rlwit_discard_first_words(&rlw_i, literals);
|
||||||
|
rlwit_discard_first_words(&rlw_j, literals);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (rlwit_word_size(&rlw_i) > 0)
|
||||||
|
rlwit_discharge(&rlw_i, out, ~0, 0);
|
||||||
|
else
|
||||||
|
rlwit_discharge(&rlw_j, out, ~0, 0);
|
||||||
|
|
||||||
|
out->bit_size = max_size(ewah_i->bit_size, ewah_j->bit_size);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
#define BITMAP_POOL_MAX 16
static struct ewah_bitmap *bitmap_pool[BITMAP_POOL_MAX];
static size_t bitmap_pool_size;

struct ewah_bitmap *ewah_pool_new(void)
{
	if (bitmap_pool_size)
		return bitmap_pool[--bitmap_pool_size];

	return ewah_new();
}

void ewah_pool_free(struct ewah_bitmap *self)
{
	if (self == NULL)
		return;

	if (bitmap_pool_size == BITMAP_POOL_MAX ||
		self->alloc_size == 0) {
		ewah_free(self);
		return;
	}

	ewah_clear(self);
	bitmap_pool[bitmap_pool_size++] = self;
}

uint32_t ewah_checksum(struct ewah_bitmap *self)
{
	const uint8_t *p = (uint8_t *)self->buffer;
	uint32_t crc = (uint32_t)self->bit_size;
	size_t size = self->buffer_size * sizeof(eword_t);

	while (size--)
		crc = (crc << 5) - crc + (uint32_t)*p++;

	return crc;
}
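The loop in `ewah_checksum` is a multiply-by-31 rolling hash: `(crc << 5) - crc` equals `crc * 31`, and each byte of the buffer is then added in. The following is a minimal stand-alone sketch of that step only; `demo_checksum` and its `seed` parameter are hypothetical names used for illustration (in the code above the seed is the bitmap's `bit_size`), and unsigned 32-bit wraparound is assumed to be the intended overflow behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the rolling hash: every step
 * computes crc * 31 + byte, with crc * 31 written as (crc << 5) - crc.
 * Arithmetic wraps modulo 2^32 because crc is an unsigned 32-bit value.
 */
static uint32_t demo_checksum(uint32_t seed, const uint8_t *p, size_t size)
{
	uint32_t crc = seed;

	assert(p != NULL || size == 0);

	while (size--)
		crc = (crc << 5) - crc + (uint32_t)*p++;

	return crc;
}
```

For instance, hashing the two bytes `{1, 2}` from a seed of 0 gives `(0 * 31 + 1) * 31 + 2`, and an empty buffer returns the seed unchanged.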
204
ewah/ewah_io.c
Normal file
@ -0,0 +1,204 @@
/**
 * Copyright 2013, GitHub, Inc
 * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
 *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
 */
#include "git-compat-util.h"
#include "ewok.h"

int ewah_serialize_native(struct ewah_bitmap *self, int fd)
{
	uint32_t write32;
	size_t to_write = self->buffer_size * 8;

	/* 32 bit -- bit size for the map */
	write32 = (uint32_t)self->bit_size;
	if (write(fd, &write32, 4) != 4)
		return -1;

	/** 32 bit -- number of compressed 64-bit words */
	write32 = (uint32_t)self->buffer_size;
	if (write(fd, &write32, 4) != 4)
		return -1;

	if (write(fd, self->buffer, to_write) != to_write)
		return -1;

	/** 32 bit -- position for the RLW */
	write32 = self->rlw - self->buffer;
	if (write(fd, &write32, 4) != 4)
		return -1;

	return (3 * 4) + to_write;
}

int ewah_serialize_to(struct ewah_bitmap *self,
		      int (*write_fun)(void *, const void *, size_t),
		      void *data)
{
	size_t i;
	eword_t dump[2048];
	const size_t words_per_dump = sizeof(dump) / sizeof(eword_t);
	uint32_t bitsize, word_count, rlw_pos;

	const eword_t *buffer;
	size_t words_left;

	/* 32 bit -- bit size for the map */
	bitsize =  htonl((uint32_t)self->bit_size);
	if (write_fun(data, &bitsize, 4) != 4)
		return -1;

	/** 32 bit -- number of compressed 64-bit words */
	word_count =  htonl((uint32_t)self->buffer_size);
	if (write_fun(data, &word_count, 4) != 4)
		return -1;

	/** 64 bit x N -- compressed words */
	buffer = self->buffer;
	words_left = self->buffer_size;

	while (words_left >= words_per_dump) {
		for (i = 0; i < words_per_dump; ++i, ++buffer)
			dump[i] = htonll(*buffer);

		if (write_fun(data, dump, sizeof(dump)) != sizeof(dump))
			return -1;

		words_left -= words_per_dump;
	}

	if (words_left) {
		for (i = 0; i < words_left; ++i, ++buffer)
			dump[i] = htonll(*buffer);

		if (write_fun(data, dump, words_left * 8) != words_left * 8)
			return -1;
	}

	/** 32 bit -- position for the RLW */
	rlw_pos = (uint8_t*)self->rlw - (uint8_t *)self->buffer;
	rlw_pos = htonl(rlw_pos / sizeof(eword_t));

	if (write_fun(data, &rlw_pos, 4) != 4)
		return -1;

	return (3 * 4) + (self->buffer_size * 8);
}

static int write_helper(void *fd, const void *buf, size_t len)
{
	return write((intptr_t)fd, buf, len);
}

int ewah_serialize(struct ewah_bitmap *self, int fd)
{
	return ewah_serialize_to(self, write_helper, (void *)(intptr_t)fd);
}

int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len)
{
	uint8_t *ptr = map;
	size_t i;

	self->bit_size = get_be32(ptr);
	ptr += sizeof(uint32_t);

	self->buffer_size = self->alloc_size = get_be32(ptr);
	ptr += sizeof(uint32_t);

	self->buffer = ewah_realloc(self->buffer,
		self->alloc_size * sizeof(eword_t));

	if (!self->buffer)
		return -1;

	/*
	 * Copy the raw data for the bitmap as a whole chunk;
	 * if we're on a little-endian platform, we'll perform
	 * the endianness conversion in a separate pass to ensure
	 * we're loading 8-byte aligned words.
	 */
	memcpy(self->buffer, ptr, self->buffer_size * sizeof(uint64_t));
	ptr += self->buffer_size * sizeof(uint64_t);

	for (i = 0; i < self->buffer_size; ++i)
		self->buffer[i] = ntohll(self->buffer[i]);

	self->rlw = self->buffer + get_be32(ptr);

	return (3 * 4) + (self->buffer_size * 8);
}

int ewah_deserialize(struct ewah_bitmap *self, int fd)
{
	size_t i;
	eword_t dump[2048];
	const size_t words_per_dump = sizeof(dump) / sizeof(eword_t);
	uint32_t bitsize, word_count, rlw_pos;

	eword_t *buffer = NULL;
	size_t words_left;

	ewah_clear(self);

	/* 32 bit -- bit size for the map */
	if (read(fd, &bitsize, 4) != 4)
		return -1;

	self->bit_size = (size_t)ntohl(bitsize);

	/** 32 bit -- number of compressed 64-bit words */
	if (read(fd, &word_count, 4) != 4)
		return -1;

	self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
	self->buffer = ewah_realloc(self->buffer,
		self->alloc_size * sizeof(eword_t));

	if (!self->buffer)
		return -1;

	/** 64 bit x N -- compressed words */
	buffer = self->buffer;
	words_left = self->buffer_size;

	while (words_left >= words_per_dump) {
		if (read(fd, dump, sizeof(dump)) != sizeof(dump))
			return -1;

		for (i = 0; i < words_per_dump; ++i, ++buffer)
			*buffer = ntohll(dump[i]);

		words_left -= words_per_dump;
	}

	if (words_left) {
		if (read(fd, dump, words_left * 8) != words_left * 8)
			return -1;

		for (i = 0; i < words_left; ++i, ++buffer)
			*buffer = ntohll(dump[i]);
	}

	/** 32 bit -- position for the RLW */
	if (read(fd, &rlw_pos, 4) != 4)
		return -1;

	self->rlw = self->buffer + ntohl(rlw_pos);
	return 0;
}
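The portable serializers above all agree on one on-disk layout: a 32-bit bit size, a 32-bit word count, the compressed 64-bit words, and a 32-bit RLW position, each in network (big-endian) byte order. The sketch below writes that layout into a caller-supplied byte buffer so the offsets are easy to see. It is an illustration under stated assumptions, not the code from this commit: `demo_put_be32`, `demo_put_be64` and `demo_serialize` are hypothetical helpers, and the caller is assumed to provide at least `12 + 8 * word_count` bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value most-significant byte first (network order). */
static void demo_put_be32(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v >> 24);
	dst[1] = (uint8_t)(v >> 16);
	dst[2] = (uint8_t)(v >> 8);
	dst[3] = (uint8_t)v;
}

/* Store a 64-bit value most-significant byte first. */
static void demo_put_be64(uint8_t *dst, uint64_t v)
{
	demo_put_be32(dst, (uint32_t)(v >> 32));
	demo_put_be32(dst + 4, (uint32_t)v);
}

/*
 * Hypothetical in-memory serializer. Layout:
 *   bit_size (4 bytes) | word_count (4) | words (8 * n) | rlw_pos (4)
 * Returns the number of bytes written, i.e. 12 + 8 * word_count.
 */
static size_t demo_serialize(uint8_t *out, uint32_t bit_size,
			     const uint64_t *words, uint32_t word_count,
			     uint32_t rlw_pos)
{
	size_t off = 0;
	uint32_t i;

	assert(out != NULL);
	assert(words != NULL || word_count == 0);

	demo_put_be32(out + off, bit_size);
	off += 4;
	demo_put_be32(out + off, word_count);
	off += 4;

	for (i = 0; i < word_count; i++) {
		demo_put_be64(out + off, words[i]);
		off += 8;
	}

	demo_put_be32(out + off, rlw_pos);
	off += 4;

	return off;
}
```

Writing byte by byte sidesteps both host endianness and alignment, which is the same concern the comment in `ewah_read_mmap` raises about loading 8-byte aligned words.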
115
ewah/ewah_rlw.c
Normal file
@ -0,0 +1,115 @@
/**
 * Copyright 2013, GitHub, Inc
 * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
 *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
 */
#include "git-compat-util.h"
#include "ewok.h"
#include "ewok_rlw.h"

static inline int next_word(struct rlw_iterator *it)
{
	if (it->pointer >= it->size)
		return 0;

	it->rlw.word = &it->buffer[it->pointer];
	it->pointer += rlw_get_literal_words(it->rlw.word) + 1;

	it->rlw.literal_words = rlw_get_literal_words(it->rlw.word);
	it->rlw.running_len = rlw_get_running_len(it->rlw.word);
	it->rlw.running_bit = rlw_get_run_bit(it->rlw.word);
	it->rlw.literal_word_offset = 0;

	return 1;
}

void rlwit_init(struct rlw_iterator *it, struct ewah_bitmap *from_ewah)
{
	it->buffer = from_ewah->buffer;
	it->size = from_ewah->buffer_size;
	it->pointer = 0;

	next_word(it);

	it->literal_word_start = rlwit_literal_words(it) +
		it->rlw.literal_word_offset;
}

void rlwit_discard_first_words(struct rlw_iterator *it, size_t x)
{
	while (x > 0) {
		size_t discard;

		if (it->rlw.running_len > x) {
			it->rlw.running_len -= x;
			return;
		}

		x -= it->rlw.running_len;
		it->rlw.running_len = 0;

		discard = (x > it->rlw.literal_words) ? it->rlw.literal_words : x;

		it->literal_word_start += discard;
		it->rlw.literal_words -= discard;
		x -= discard;

		if (x > 0 || rlwit_word_size(it) == 0) {
			if (!next_word(it))
				break;

			it->literal_word_start =
				rlwit_literal_words(it) + it->rlw.literal_word_offset;
		}
	}
}

size_t rlwit_discharge(
	struct rlw_iterator *it, struct ewah_bitmap *out, size_t max, int negate)
{
	size_t index = 0;

	while (index < max && rlwit_word_size(it) > 0) {
		size_t pd, pl = it->rlw.running_len;

		if (index + pl > max)
			pl = max - index;

		ewah_add_empty_words(out, it->rlw.running_bit ^ negate, pl);
		index += pl;

		pd = it->rlw.literal_words;
		if (pd + index > max)
			pd = max - index;

		ewah_add_dirty_words(out,
			it->buffer + it->literal_word_start, pd, negate);

		rlwit_discard_first_words(it, pd + pl);
		index += pd;
	}

	return index;
}

void rlwit_discharge_empty(struct rlw_iterator *it, struct ewah_bitmap *out)
{
	while (rlwit_word_size(it) > 0) {
		ewah_add_empty_words(out, 0, rlwit_word_size(it));
		rlwit_discard_first_words(it, rlwit_word_size(it));
	}
}
233
ewah/ewok.h
Normal file
@ -0,0 +1,233 @@
/**
 * Copyright 2013, GitHub, Inc
 * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
 *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
 */
#ifndef __EWOK_BITMAP_H__
#define __EWOK_BITMAP_H__

#ifndef ewah_malloc
#	define ewah_malloc xmalloc
#endif
#ifndef ewah_realloc
#	define ewah_realloc xrealloc
#endif
#ifndef ewah_calloc
#	define ewah_calloc xcalloc
#endif

typedef uint64_t eword_t;
#define BITS_IN_WORD (sizeof(eword_t) * 8)

/**
 * Do not use __builtin_popcountll. The GCC implementation
 * is notoriously slow on all platforms.
 *
 * See: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36041
 */
static inline uint32_t ewah_bit_popcount64(uint64_t x)
{
	x = (x & 0x5555555555555555ULL) + ((x >>  1) & 0x5555555555555555ULL);
	x = (x & 0x3333333333333333ULL) + ((x >>  2) & 0x3333333333333333ULL);
	x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >>  4) & 0x0F0F0F0F0F0F0F0FULL);
	return (x * 0x0101010101010101ULL) >> 56;
}

#ifdef __GNUC__
#define ewah_bit_ctz64(x) __builtin_ctzll(x)
#else
static inline int ewah_bit_ctz64(uint64_t x)
{
	int n = 0;
	if ((x & 0xffffffff) == 0) { x >>= 32; n += 32; }
	if ((x &     0xffff) == 0) { x >>= 16; n += 16; }
	if ((x &       0xff) == 0) { x >>=  8; n +=  8; }
	if ((x &        0xf) == 0) { x >>=  4; n +=  4; }
	if ((x &        0x3) == 0) { x >>=  2; n +=  2; }
	if ((x &        0x1) == 0) { x >>=  1; n +=  1; }
	return n + !x;
}
#endif

struct ewah_bitmap {
	eword_t *buffer;
	size_t buffer_size;
	size_t alloc_size;
	size_t bit_size;
	eword_t *rlw;
};

typedef void (*ewah_callback)(size_t pos, void *);

struct ewah_bitmap *ewah_pool_new(void);
void ewah_pool_free(struct ewah_bitmap *self);

/**
 * Allocate a new EWAH Compressed bitmap
 */
struct ewah_bitmap *ewah_new(void);

/**
 * Clear all the bits in the bitmap. Does not free or resize
 * memory.
 */
void ewah_clear(struct ewah_bitmap *self);

/**
 * Free all the memory of the bitmap
 */
void ewah_free(struct ewah_bitmap *self);

int ewah_serialize_to(struct ewah_bitmap *self,
		      int (*write_fun)(void *out, const void *buf, size_t len),
		      void *out);
int ewah_serialize(struct ewah_bitmap *self, int fd);
int ewah_serialize_native(struct ewah_bitmap *self, int fd);

int ewah_deserialize(struct ewah_bitmap *self, int fd);
int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len);
int ewah_read_mmap_native(struct ewah_bitmap *self, void *map, size_t len);

uint32_t ewah_checksum(struct ewah_bitmap *self);

/**
 * Logical not (bitwise negation) in-place on the bitmap
 *
 * This operation is linear time based on the size of the bitmap.
 */
void ewah_not(struct ewah_bitmap *self);

/**
 * Call the given callback with the position of every single bit
 * that has been set on the bitmap.
 *
 * This is an efficient operation that does not fully decompress
 * the bitmap.
 */
void ewah_each_bit(struct ewah_bitmap *self, ewah_callback callback, void *payload);

/**
 * Set a given bit on the bitmap.
 *
 * The bit at position `pos` will be set to true. Because of the
 * way that the bitmap is compressed, a set bit cannot be unset
 * later on.
 *
 * Furthermore, since the bitmap uses streaming compression, bits
 * can only be set incrementally.
 *
 * E.g.
 *		ewah_set(bitmap, 1); // ok
 *		ewah_set(bitmap, 76); // ok
 *		ewah_set(bitmap, 77); // ok
 *		ewah_set(bitmap, 8712800127); // ok
 *		ewah_set(bitmap, 25); // failed, assert raised
 */
void ewah_set(struct ewah_bitmap *self, size_t i);

struct ewah_iterator {
	const eword_t *buffer;
	size_t buffer_size;

	size_t pointer;
	eword_t compressed, literals;
	eword_t rl, lw;
	int b;
};

/**
 * Initialize a new iterator to run through the bitmap in uncompressed form.
 *
 * The iterator can be stack allocated. The underlying bitmap must not be freed
 * before the iteration is over.
 *
 * E.g.
 *
 *		struct ewah_bitmap *bitmap = ewah_new();
 *		struct ewah_iterator it;
 *
 *		ewah_iterator_init(&it, bitmap);
 */
void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent);

/**
 * Yield every single word in the bitmap in uncompressed form. This is:
 * yield single words (32-64 bits) where each bit represents an actual
 * bit from the bitmap.
 *
 * Return: true if a word was yielded, false if there are no words left
 */
int ewah_iterator_next(eword_t *next, struct ewah_iterator *it);

void ewah_or(
	struct ewah_bitmap *ewah_i,
	struct ewah_bitmap *ewah_j,
	struct ewah_bitmap *out);

void ewah_and_not(
	struct ewah_bitmap *ewah_i,
	struct ewah_bitmap *ewah_j,
	struct ewah_bitmap *out);

void ewah_xor(
	struct ewah_bitmap *ewah_i,
	struct ewah_bitmap *ewah_j,
	struct ewah_bitmap *out);

void ewah_and(
	struct ewah_bitmap *ewah_i,
	struct ewah_bitmap *ewah_j,
	struct ewah_bitmap *out);

/**
 * Direct word access
 */
size_t ewah_add_empty_words(struct ewah_bitmap *self, int v, size_t number);
void ewah_add_dirty_words(
	struct ewah_bitmap *self, const eword_t *buffer, size_t number, int negate);
size_t ewah_add(struct ewah_bitmap *self, eword_t word);


/**
 * Uncompressed, old-school bitmap that can be efficiently compressed
 * into an `ewah_bitmap`.
 */
struct bitmap {
	eword_t *words;
	size_t word_alloc;
};

struct bitmap *bitmap_new(void);
void bitmap_set(struct bitmap *self, size_t pos);
void bitmap_clear(struct bitmap *self, size_t pos);
int bitmap_get(struct bitmap *self, size_t pos);
void bitmap_reset(struct bitmap *self);
void bitmap_free(struct bitmap *self);
int bitmap_equals(struct bitmap *self, struct bitmap *other);
int bitmap_is_subset(struct bitmap *self, struct bitmap *super);

struct ewah_bitmap *bitmap_to_ewah(struct bitmap *bitmap);
struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah);

void bitmap_and_not(struct bitmap *self, struct bitmap *other);
void bitmap_or_ewah(struct bitmap *self, struct ewah_bitmap *other);
void bitmap_or(struct bitmap *self, const struct bitmap *other);

void bitmap_each_bit(struct bitmap *self, ewah_callback callback, void *data);
size_t bitmap_popcount(struct bitmap *self);

#endif
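The two bit-twiddling helpers in `ewok.h` are easy to check in isolation. The sketch below copies the population-count arithmetic and the portable count-trailing-zeros fallback into stand-alone functions; the `demo_` names are hypothetical, and the `assert` on a non-zero argument is an assumption added here because `__builtin_ctzll(0)`, which the GCC branch of the header uses, is undefined.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Population count by parallel summation: first sum adjacent bits,
 * then adjacent 2-bit fields, then adjacent 4-bit fields, so that each
 * byte holds its own bit count. Multiplying by 0x0101...01 adds all
 * eight byte counts into the top byte, which the shift extracts.
 */
static uint32_t demo_popcount64(uint64_t x)
{
	x = (x & 0x5555555555555555ULL) + ((x >> 1) & 0x5555555555555555ULL);
	x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
	x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
	return (uint32_t)((x * 0x0101010101010101ULL) >> 56);
}

/*
 * Count trailing zeros by binary search: if the low half of the
 * remaining window is all zero, skip it and halve the window.
 */
static int demo_ctz64(uint64_t x)
{
	int n = 0;

	assert(x != 0); /* mirrors the undefined case of the GCC builtin */

	if ((x & 0xffffffff) == 0) { x >>= 32; n += 32; }
	if ((x & 0xffff) == 0) { x >>= 16; n += 16; }
	if ((x & 0xff) == 0) { x >>= 8; n += 8; }
	if ((x & 0xf) == 0) { x >>= 4; n += 4; }
	if ((x & 0x3) == 0) { x >>= 2; n += 2; }
	if ((x & 0x1) == 0) { x >>= 1; n += 1; }
	return n;
}
```

Together these two operations are enough to walk the set bits of a 64-bit word: find the lowest set bit with the trailing-zero count, clear it, and repeat until the word is zero.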
114
ewah/ewok_rlw.h
Normal file
@ -0,0 +1,114 @@
/**
 * Copyright 2013, GitHub, Inc
 * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
 *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version 2
 * of the License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
 */
#ifndef __EWOK_RLW_H__
#define __EWOK_RLW_H__

#define RLW_RUNNING_BITS (sizeof(eword_t) * 4)
#define RLW_LITERAL_BITS (sizeof(eword_t) * 8 - 1 - RLW_RUNNING_BITS)

#define RLW_LARGEST_RUNNING_COUNT (((eword_t)1 << RLW_RUNNING_BITS) - 1)
#define RLW_LARGEST_LITERAL_COUNT (((eword_t)1 << RLW_LITERAL_BITS) - 1)

#define RLW_LARGEST_RUNNING_COUNT_SHIFT (RLW_LARGEST_RUNNING_COUNT << 1)

#define RLW_RUNNING_LEN_PLUS_BIT (((eword_t)1 << (RLW_RUNNING_BITS + 1)) - 1)

static int rlw_get_run_bit(const eword_t *word)
{
	return *word & (eword_t)1;
}

static inline void rlw_set_run_bit(eword_t *word, int b)
{
	if (b) {
		*word |= (eword_t)1;
	} else {
		*word &= (eword_t)(~1);
	}
}

static inline void rlw_xor_run_bit(eword_t *word)
{
	if (*word & 1) {
		*word &= (eword_t)(~1);
	} else {
		*word |= (eword_t)1;
	}
}

static inline void rlw_set_running_len(eword_t *word, eword_t l)
{
	*word |= RLW_LARGEST_RUNNING_COUNT_SHIFT;
	*word &= (l << 1) | (~RLW_LARGEST_RUNNING_COUNT_SHIFT);
}

static inline eword_t rlw_get_running_len(const eword_t *word)
{
	return (*word >> 1) & RLW_LARGEST_RUNNING_COUNT;
}

static inline eword_t rlw_get_literal_words(const eword_t *word)
{
	return *word >> (1 + RLW_RUNNING_BITS);
}

static inline void rlw_set_literal_words(eword_t *word, eword_t l)
{
	*word |= ~RLW_RUNNING_LEN_PLUS_BIT;
	*word &= (l << (RLW_RUNNING_BITS + 1)) | RLW_RUNNING_LEN_PLUS_BIT;
}

static inline eword_t rlw_size(const eword_t *self)
{
	return rlw_get_running_len(self) + rlw_get_literal_words(self);
}

struct rlw_iterator {
	const eword_t *buffer;
	size_t size;
	size_t pointer;
	size_t literal_word_start;

	struct {
		const eword_t *word;
		int literal_words;
		int running_len;
		int literal_word_offset;
		int running_bit;
	} rlw;
};

void rlwit_init(struct rlw_iterator *it, struct ewah_bitmap *bitmap);
void rlwit_discard_first_words(struct rlw_iterator *it, size_t x);
size_t rlwit_discharge(
	struct rlw_iterator *it, struct ewah_bitmap *out, size_t max, int negate);
void rlwit_discharge_empty(struct rlw_iterator *it, struct ewah_bitmap *out);

static inline size_t rlwit_word_size(struct rlw_iterator *it)
{
	return it->rlw.running_len + it->rlw.literal_words;
}

static inline size_t rlwit_literal_words(struct rlw_iterator *it)
{
	return it->pointer - it->rlw.literal_words;
}

#endif
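The macros and accessors in `ewok_rlw.h` imply a fixed layout for each 64-bit running-length word: bit 0 holds the run bit, bits 1 to 32 hold the running length (`RLW_RUNNING_BITS` is 32 when `eword_t` is 64 bits wide), and bits 33 to 63 hold the count of literal words that follow. The sketch below packs and unpacks a word under that reading. The `demo_` names are hypothetical and the field widths are hard-coded for a 64-bit word, so treat it as an illustration of the layout and not as a replacement for the accessors above.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RUNNING_BITS 32
#define DEMO_MAX_RUNNING 0xFFFFFFFFULL /* 32-bit running length field */
#define DEMO_MAX_LITERAL 0x7FFFFFFFULL /* 31-bit literal count field */

/* Build a running-length word from its three fields. */
static uint64_t demo_rlw_pack(int run_bit, uint64_t running_len,
			      uint64_t literal_words)
{
	assert(running_len <= DEMO_MAX_RUNNING);
	assert(literal_words <= DEMO_MAX_LITERAL);

	return (uint64_t)(run_bit & 1) |
		(running_len << 1) |
		(literal_words << (DEMO_RUNNING_BITS + 1));
}

/* Bit 0: value of the bits in the run (all zeros or all ones). */
static int demo_rlw_run_bit(uint64_t word)
{
	return (int)(word & 1);
}

/* Bits 1..32: how many whole words the run covers. */
static uint64_t demo_rlw_running_len(uint64_t word)
{
	return (word >> 1) & DEMO_MAX_RUNNING;
}

/* Bits 33..63: how many literal words follow this header word. */
static uint64_t demo_rlw_literal_words(uint64_t word)
{
	return word >> (DEMO_RUNNING_BITS + 1);
}
```

Under this layout a header built with `demo_rlw_pack(1, 5, 3)` describes five all-ones words followed by three literal words, which is the quantity `rlw_size` adds up.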
338
khash.h
Normal file
@ -0,0 +1,338 @@
/* The MIT License

   Copyright (c) 2008, 2009, 2011 by Attractive Chaos <attractor@live.co.uk>

   Permission is hereby granted, free of charge, to any person obtaining
   a copy of this software and associated documentation files (the
   "Software"), to deal in the Software without restriction, including
   without limitation the rights to use, copy, modify, merge, publish,
   distribute, sublicense, and/or sell copies of the Software, and to
   permit persons to whom the Software is furnished to do so, subject to
   the following conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
   BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
   ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
   CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
   SOFTWARE.
*/

#ifndef __AC_KHASH_H
#define __AC_KHASH_H

#define AC_VERSION_KHASH_H "0.2.8"

typedef uint32_t khint32_t;
typedef uint64_t khint64_t;

typedef khint32_t khint_t;
typedef khint_t khiter_t;

#define __ac_isempty(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&2)
#define __ac_isdel(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&1)
#define __ac_iseither(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&3)
#define __ac_set_isdel_false(flag, i) (flag[i>>4]&=~(1ul<<((i&0xfU)<<1)))
#define __ac_set_isempty_false(flag, i) (flag[i>>4]&=~(2ul<<((i&0xfU)<<1)))
#define __ac_set_isboth_false(flag, i) (flag[i>>4]&=~(3ul<<((i&0xfU)<<1)))
#define __ac_set_isdel_true(flag, i) (flag[i>>4]|=1ul<<((i&0xfU)<<1))

#define __ac_fsize(m) ((m) < 16? 1 : (m)>>4)

#define kroundup32(x) (--(x), (x)|=(x)>>1, (x)|=(x)>>2, (x)|=(x)>>4, (x)|=(x)>>8, (x)|=(x)>>16, ++(x))

static inline khint_t __ac_X31_hash_string(const char *s)
{
	khint_t h = (khint_t)*s;
	if (h) for (++s ; *s; ++s) h = (h << 5) - h + (khint_t)*s;
	return h;
}

#define kh_str_hash_func(key) __ac_X31_hash_string(key)
#define kh_str_hash_equal(a, b) (strcmp(a, b) == 0)

static const double __ac_HASH_UPPER = 0.77;

#define __KHASH_TYPE(name, khkey_t, khval_t) \
	typedef struct { \
		khint_t n_buckets, size, n_occupied, upper_bound; \
		khint32_t *flags; \
		khkey_t *keys; \
		khval_t *vals; \
	} kh_##name##_t;

#define __KHASH_PROTOTYPES(name, khkey_t, khval_t) \
	extern kh_##name##_t *kh_init_##name(void); \
	extern void kh_destroy_##name(kh_##name##_t *h); \
	extern void kh_clear_##name(kh_##name##_t *h); \
	extern khint_t kh_get_##name(const kh_##name##_t *h, khkey_t key); \
	extern int kh_resize_##name(kh_##name##_t *h, khint_t new_n_buckets); \
	extern khint_t kh_put_##name(kh_##name##_t *h, khkey_t key, int *ret); \
	extern void kh_del_##name(kh_##name##_t *h, khint_t x);

#define __KHASH_IMPL(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
	SCOPE kh_##name##_t *kh_init_##name(void) { \
		return (kh_##name##_t*)xcalloc(1, sizeof(kh_##name##_t)); \
	} \
	SCOPE void kh_destroy_##name(kh_##name##_t *h) \
	{ \
		if (h) { \
			free((void *)h->keys); free(h->flags); \
			free((void *)h->vals); \
			free(h); \
		} \
	} \
	SCOPE void kh_clear_##name(kh_##name##_t *h) \
	{ \
		if (h && h->flags) { \
			memset(h->flags, 0xaa, __ac_fsize(h->n_buckets) * sizeof(khint32_t)); \
			h->size = h->n_occupied = 0; \
		} \
	} \
	SCOPE khint_t kh_get_##name(const kh_##name##_t *h, khkey_t key) \
	{ \
		if (h->n_buckets) { \
			khint_t k, i, last, mask, step = 0; \
			mask = h->n_buckets - 1; \
			k = __hash_func(key); i = k & mask; \
			last = i; \
			while (!__ac_isempty(h->flags, i) && (__ac_isdel(h->flags, i) || !__hash_equal(h->keys[i], key))) { \
				i = (i + (++step)) & mask; \
				if (i == last) return h->n_buckets; \
			} \
			return __ac_iseither(h->flags, i)? h->n_buckets : i; \
		} else return 0; \
	} \
	SCOPE int kh_resize_##name(kh_##name##_t *h, khint_t new_n_buckets) \
	{ /* This function uses 0.25*n_buckets bytes of working space instead of [sizeof(key_t+val_t)+.25]*n_buckets. */ \
		khint32_t *new_flags = NULL; \
		khint_t j = 1; \
		{ \
			kroundup32(new_n_buckets); \
			if (new_n_buckets < 4) new_n_buckets = 4; \
			if (h->size >= (khint_t)(new_n_buckets * __ac_HASH_UPPER + 0.5)) j = 0; /* requested size is too small */ \
			else { /* hash table size to be changed (shrink or expand); rehash */ \
				new_flags = (khint32_t*)xmalloc(__ac_fsize(new_n_buckets) * sizeof(khint32_t)); \
				if (!new_flags) return -1; \
				memset(new_flags, 0xaa, __ac_fsize(new_n_buckets) * sizeof(khint32_t)); \
				if (h->n_buckets < new_n_buckets) { /* expand */ \
					khkey_t *new_keys = (khkey_t*)xrealloc((void *)h->keys, new_n_buckets * sizeof(khkey_t)); \
					if (!new_keys) return -1; \
					h->keys = new_keys; \
					if (kh_is_map) { \
						khval_t *new_vals = (khval_t*)xrealloc((void *)h->vals, new_n_buckets * sizeof(khval_t)); \
						if (!new_vals) return -1; \
						h->vals = new_vals; \
					} \
				} /* otherwise shrink */ \
			} \
		} \
		if (j) { /* rehashing is needed */ \
			for (j = 0; j != h->n_buckets; ++j) { \
				if (__ac_iseither(h->flags, j) == 0) { \
					khkey_t key = h->keys[j]; \
					khval_t val; \
					khint_t new_mask; \
					new_mask = new_n_buckets - 1; \
					if (kh_is_map) val = h->vals[j]; \
					__ac_set_isdel_true(h->flags, j); \
					while (1) { /* kick-out process; sort of like in Cuckoo hashing */ \
						khint_t k, i, step = 0; \
						k = __hash_func(key); \
						i = k & new_mask; \
						while (!__ac_isempty(new_flags, i)) i = (i + (++step)) & new_mask; \
						__ac_set_isempty_false(new_flags, i); \
						if (i < h->n_buckets && __ac_iseither(h->flags, i) == 0) { /* kick out the existing element */ \
|
||||||
|
{ khkey_t tmp = h->keys[i]; h->keys[i] = key; key = tmp; } \
|
||||||
|
if (kh_is_map) { khval_t tmp = h->vals[i]; h->vals[i] = val; val = tmp; } \
|
||||||
|
__ac_set_isdel_true(h->flags, i); /* mark it as deleted in the old hash table */ \
|
||||||
|
} else { /* write the element and jump out of the loop */ \
|
||||||
|
h->keys[i] = key; \
|
||||||
|
if (kh_is_map) h->vals[i] = val; \
|
||||||
|
break; \
|
||||||
|
} \
|
||||||
|
} \
|
||||||
|
} \
|
||||||
|
} \
|
||||||
|
if (h->n_buckets > new_n_buckets) { /* shrink the hash table */ \
|
||||||
|
h->keys = (khkey_t*)xrealloc((void *)h->keys, new_n_buckets * sizeof(khkey_t)); \
|
||||||
|
if (kh_is_map) h->vals = (khval_t*)xrealloc((void *)h->vals, new_n_buckets * sizeof(khval_t)); \
|
||||||
|
} \
|
||||||
|
free(h->flags); /* free the working space */ \
|
||||||
|
h->flags = new_flags; \
|
||||||
|
h->n_buckets = new_n_buckets; \
|
||||||
|
h->n_occupied = h->size; \
|
||||||
|
h->upper_bound = (khint_t)(h->n_buckets * __ac_HASH_UPPER + 0.5); \
|
||||||
|
} \
|
||||||
|
return 0; \
|
||||||
|
} \
|
||||||
|
SCOPE khint_t kh_put_##name(kh_##name##_t *h, khkey_t key, int *ret) \
|
||||||
|
{ \
|
||||||
|
khint_t x; \
|
||||||
|
if (h->n_occupied >= h->upper_bound) { /* update the hash table */ \
|
||||||
|
if (h->n_buckets > (h->size<<1)) { \
|
||||||
|
if (kh_resize_##name(h, h->n_buckets - 1) < 0) { /* clear "deleted" elements */ \
|
||||||
|
*ret = -1; return h->n_buckets; \
|
||||||
|
} \
|
||||||
|
} else if (kh_resize_##name(h, h->n_buckets + 1) < 0) { /* expand the hash table */ \
|
||||||
|
*ret = -1; return h->n_buckets; \
|
||||||
|
} \
|
||||||
|
} /* TODO: implement automatic shrinking; resize() already supports shrinking */ \
|
||||||
|
{ \
|
||||||
|
khint_t k, i, site, last, mask = h->n_buckets - 1, step = 0; \
|
||||||
|
x = site = h->n_buckets; k = __hash_func(key); i = k & mask; \
|
||||||
|
if (__ac_isempty(h->flags, i)) x = i; /* for speed up */ \
|
||||||
|
else { \
|
||||||
|
last = i; \
|
||||||
|
while (!__ac_isempty(h->flags, i) && (__ac_isdel(h->flags, i) || !__hash_equal(h->keys[i], key))) { \
|
||||||
|
if (__ac_isdel(h->flags, i)) site = i; \
|
||||||
|
i = (i + (++step)) & mask; \
|
||||||
|
if (i == last) { x = site; break; } \
|
||||||
|
} \
|
||||||
|
if (x == h->n_buckets) { \
|
||||||
|
if (__ac_isempty(h->flags, i) && site != h->n_buckets) x = site; \
|
||||||
|
else x = i; \
|
||||||
|
} \
|
||||||
|
} \
|
||||||
|
} \
|
||||||
|
if (__ac_isempty(h->flags, x)) { /* not present at all */ \
|
||||||
|
h->keys[x] = key; \
|
||||||
|
__ac_set_isboth_false(h->flags, x); \
|
||||||
|
++h->size; ++h->n_occupied; \
|
||||||
|
*ret = 1; \
|
||||||
|
} else if (__ac_isdel(h->flags, x)) { /* deleted */ \
|
||||||
|
h->keys[x] = key; \
|
||||||
|
__ac_set_isboth_false(h->flags, x); \
|
||||||
|
++h->size; \
|
||||||
|
*ret = 2; \
|
||||||
|
} else *ret = 0; /* Don't touch h->keys[x] if present and not deleted */ \
|
||||||
|
return x; \
|
||||||
|
} \
|
||||||
|
SCOPE void kh_del_##name(kh_##name##_t *h, khint_t x) \
|
||||||
|
{ \
|
||||||
|
if (x != h->n_buckets && !__ac_iseither(h->flags, x)) { \
|
||||||
|
__ac_set_isdel_true(h->flags, x); \
|
||||||
|
--h->size; \
|
||||||
|
} \
|
||||||
|
}
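The lookup and insert loops above advance with `i = (i + (++step)) & mask`, so the probe offsets are the triangular numbers 0, 1, 3, 6, ... A minimal sketch, assuming a power-of-two bucket count as enforced by `kroundup32`, that checks this sequence reaches every bucket exactly once. The function name `distinct_buckets_probed` is hypothetical and not part of khash.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Count how many distinct buckets the khash-style probe sequence
 * touches in n_buckets steps. n_buckets must be a power of two.
 */
static uint32_t distinct_buckets_probed(uint32_t start, uint32_t n_buckets)
{
	uint32_t mask = n_buckets - 1;
	uint32_t i = start & mask, step = 0, seen = 0, n;
	unsigned char *visited = calloc(n_buckets, 1);

	if (!visited)
		return 0;
	for (n = 0; n < n_buckets; n++) {
		if (!visited[i]) {
			visited[i] = 1;
			seen++;
		}
		i = (i + (++step)) & mask; /* same update as kh_get/kh_put */
	}
	free(visited);
	return seen;
}
```

If the result equals `n_buckets`, a lookup that wraps back to its starting bucket has inspected the whole table, which is the case the `if (i == last)` checks guard against.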
|
||||||
|
|
||||||
|
#define KHASH_DECLARE(name, khkey_t, khval_t) \
|
||||||
|
__KHASH_TYPE(name, khkey_t, khval_t) \
|
||||||
|
__KHASH_PROTOTYPES(name, khkey_t, khval_t)
|
||||||
|
|
||||||
|
#define KHASH_INIT2(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
|
||||||
|
__KHASH_TYPE(name, khkey_t, khval_t) \
|
||||||
|
__KHASH_IMPL(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal)
|
||||||
|
|
||||||
|
#define KHASH_INIT(name, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
|
||||||
|
KHASH_INIT2(name, static inline, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal)
|
||||||
|
|
||||||
|
/* Other convenient macros... */
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Test whether a bucket contains data.
|
||||||
|
@param h Pointer to the hash table [khash_t(name)*]
|
||||||
|
@param x Iterator to the bucket [khint_t]
|
||||||
|
@return 1 if containing data; 0 otherwise [int]
|
||||||
|
*/
|
||||||
|
#define kh_exist(h, x) (!__ac_iseither((h)->flags, (x)))
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Get key given an iterator
|
||||||
|
@param h Pointer to the hash table [khash_t(name)*]
|
||||||
|
@param x Iterator to the bucket [khint_t]
|
||||||
|
@return Key [type of keys]
|
||||||
|
*/
|
||||||
|
#define kh_key(h, x) ((h)->keys[x])
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Get value given an iterator
|
||||||
|
@param h Pointer to the hash table [khash_t(name)*]
|
||||||
|
@param x Iterator to the bucket [khint_t]
|
||||||
|
@return Value [type of values]
|
||||||
|
@discussion For hash sets, calling this results in a segfault.
|
||||||
|
*/
|
||||||
|
#define kh_val(h, x) ((h)->vals[x])
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Alias of kh_val()
|
||||||
|
*/
|
||||||
|
#define kh_value(h, x) ((h)->vals[x])
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Get the start iterator
|
||||||
|
@param h Pointer to the hash table [khash_t(name)*]
|
||||||
|
@return The start iterator [khint_t]
|
||||||
|
*/
|
||||||
|
#define kh_begin(h) (khint_t)(0)
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Get the end iterator
|
||||||
|
@param h Pointer to the hash table [khash_t(name)*]
|
||||||
|
@return The end iterator [khint_t]
|
||||||
|
*/
|
||||||
|
#define kh_end(h) ((h)->n_buckets)
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Get the number of elements in the hash table
|
||||||
|
@param h Pointer to the hash table [khash_t(name)*]
|
||||||
|
@return Number of elements in the hash table [khint_t]
|
||||||
|
*/
|
||||||
|
#define kh_size(h) ((h)->size)
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Get the number of buckets in the hash table
|
||||||
|
@param h Pointer to the hash table [khash_t(name)*]
|
||||||
|
@return Number of buckets in the hash table [khint_t]
|
||||||
|
*/
|
||||||
|
#define kh_n_buckets(h) ((h)->n_buckets)
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Iterate over the entries in the hash table
|
||||||
|
@param h Pointer to the hash table [khash_t(name)*]
|
||||||
|
@param kvar Variable to which key will be assigned
|
||||||
|
@param vvar Variable to which value will be assigned
|
||||||
|
@param code Block of code to execute
|
||||||
|
*/
|
||||||
|
#define kh_foreach(h, kvar, vvar, code) { khint_t __i; \
|
||||||
|
for (__i = kh_begin(h); __i != kh_end(h); ++__i) { \
|
||||||
|
if (!kh_exist(h,__i)) continue; \
|
||||||
|
(kvar) = kh_key(h,__i); \
|
||||||
|
(vvar) = kh_val(h,__i); \
|
||||||
|
code; \
|
||||||
|
} }
|
||||||
|
|
||||||
|
/*! @function
|
||||||
|
@abstract Iterate over the values in the hash table
|
||||||
|
@param h Pointer to the hash table [khash_t(name)*]
|
||||||
|
@param vvar Variable to which value will be assigned
|
||||||
|
@param code Block of code to execute
|
||||||
|
*/
|
||||||
|
#define kh_foreach_value(h, vvar, code) { khint_t __i; \
|
||||||
|
for (__i = kh_begin(h); __i != kh_end(h); ++__i) { \
|
||||||
|
if (!kh_exist(h,__i)) continue; \
|
||||||
|
(vvar) = kh_val(h,__i); \
|
||||||
|
code; \
|
||||||
|
} }
|
||||||
|
|
||||||
|
static inline khint_t __kh_oid_hash(const unsigned char *oid)
|
||||||
|
{
|
||||||
|
khint_t hash;
|
||||||
|
memcpy(&hash, oid, sizeof(hash));
|
||||||
|
return hash;
|
||||||
|
}
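`__kh_oid_hash` reuses the leading bytes of the object ID as the hash, which is reasonable because SHA-1 output is already well distributed. A minimal sketch of the same idea outside the macro machinery; `oid_prefix_hash` and `bucket_for` are hypothetical names, and the numeric hash value depends on the host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __kh_oid_hash: take the first 4 digest bytes. */
static uint32_t oid_prefix_hash(const unsigned char *oid)
{
	uint32_t hash;

	/* memcpy avoids an unaligned read through a cast pointer */
	memcpy(&hash, oid, sizeof(hash));
	return hash;
}

/* Map the hash to a bucket; n_buckets is assumed to be a power of two. */
static uint32_t bucket_for(const unsigned char *oid, uint32_t n_buckets)
{
	return oid_prefix_hash(oid) & (n_buckets - 1);
}
```

Two IDs that share their first four bytes land in the same bucket, so the equality check (`hashcmp` in the source) is still needed to tell them apart.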
|
||||||
|
|
||||||
|
#define __kh_oid_cmp(a, b) (hashcmp(a, b) == 0)
|
||||||
|
|
||||||
|
KHASH_INIT(sha1, const unsigned char *, void *, 1, __kh_oid_hash, __kh_oid_cmp)
|
||||||
|
typedef kh_sha1_t khash_sha1;
|
||||||
|
|
||||||
|
KHASH_INIT(sha1_pos, const unsigned char *, int, 1, __kh_oid_hash, __kh_oid_cmp)
|
||||||
|
typedef kh_sha1_pos_t khash_sha1_pos;
|
||||||
|
|
||||||
|
#endif /* __AC_KHASH_H */
|
552
pack-bitmap-write.c
Normal file
@ -0,0 +1,552 @@
|
|||||||
|
#include "cache.h"
|
||||||
|
#include "commit.h"
|
||||||
|
#include "tag.h"
|
||||||
|
#include "diff.h"
|
||||||
|
#include "revision.h"
|
||||||
|
#include "list-objects.h"
|
||||||
|
#include "progress.h"
|
||||||
|
#include "pack-revindex.h"
|
||||||
|
#include "pack.h"
|
||||||
|
#include "pack-bitmap.h"
|
||||||
|
#include "sha1-lookup.h"
|
||||||
|
#include "pack-objects.h"
|
||||||
|
|
||||||
|
struct bitmapped_commit {
|
||||||
|
struct commit *commit;
|
||||||
|
struct ewah_bitmap *bitmap;
|
||||||
|
struct ewah_bitmap *write_as;
|
||||||
|
int flags;
|
||||||
|
int xor_offset;
|
||||||
|
uint32_t commit_pos;
|
||||||
|
};
|
||||||
|
|
||||||
|
struct bitmap_writer {
|
||||||
|
struct ewah_bitmap *commits;
|
||||||
|
struct ewah_bitmap *trees;
|
||||||
|
struct ewah_bitmap *blobs;
|
||||||
|
struct ewah_bitmap *tags;
|
||||||
|
|
||||||
|
khash_sha1 *bitmaps;
|
||||||
|
khash_sha1 *reused;
|
||||||
|
struct packing_data *to_pack;
|
||||||
|
|
||||||
|
struct bitmapped_commit *selected;
|
||||||
|
unsigned int selected_nr, selected_alloc;
|
||||||
|
|
||||||
|
struct progress *progress;
|
||||||
|
int show_progress;
|
||||||
|
unsigned char pack_checksum[20];
|
||||||
|
};
|
||||||
|
|
||||||
|
static struct bitmap_writer writer;
|
||||||
|
|
||||||
|
void bitmap_writer_show_progress(int show)
|
||||||
|
{
|
||||||
|
writer.show_progress = show;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Build the initial type index for the packfile
|
||||||
|
*/
|
||||||
|
void bitmap_writer_build_type_index(struct pack_idx_entry **index,
|
||||||
|
uint32_t index_nr)
|
||||||
|
{
|
||||||
|
uint32_t i;
|
||||||
|
|
||||||
|
writer.commits = ewah_new();
|
||||||
|
writer.trees = ewah_new();
|
||||||
|
writer.blobs = ewah_new();
|
||||||
|
writer.tags = ewah_new();
|
||||||
|
|
||||||
|
for (i = 0; i < index_nr; ++i) {
|
||||||
|
struct object_entry *entry = (struct object_entry *)index[i];
|
||||||
|
enum object_type real_type;
|
||||||
|
|
||||||
|
entry->in_pack_pos = i;
|
||||||
|
|
||||||
|
switch (entry->type) {
|
||||||
|
case OBJ_COMMIT:
|
||||||
|
case OBJ_TREE:
|
||||||
|
case OBJ_BLOB:
|
||||||
|
case OBJ_TAG:
|
||||||
|
real_type = entry->type;
|
||||||
|
break;
|
||||||
|
|
||||||
|
default:
|
||||||
|
real_type = sha1_object_info(entry->idx.sha1, NULL);
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
switch (real_type) {
|
||||||
|
case OBJ_COMMIT:
|
||||||
|
ewah_set(writer.commits, i);
|
||||||
|
break;
|
||||||
|
|
||||||
|
case OBJ_TREE:
|
||||||
|
ewah_set(writer.trees, i);
|
||||||
|
break;
|
||||||
|
|
||||||
|
case OBJ_BLOB:
|
||||||
|
ewah_set(writer.blobs, i);
|
||||||
|
break;
|
||||||
|
|
||||||
|
case OBJ_TAG:
|
||||||
|
ewah_set(writer.tags, i);
|
||||||
|
break;
|
||||||
|
|
||||||
|
default:
|
||||||
|
die("Missing type information for %s (%d/%d)",
|
||||||
|
sha1_to_hex(entry->idx.sha1), real_type, entry->type);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
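The type index built above is four bitmaps, one per object type, with bit `i` set for the object at pack position `i`. A minimal sketch of that layout using plain uncompressed bit arrays instead of EWAH bitmaps; every `sketch_`/`SK_` name is hypothetical, and the fixed capacity is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum sketch_type { SK_COMMIT, SK_TREE, SK_BLOB, SK_TAG, SK_NTYPES };

#define SK_MAX_OBJECTS 256 /* arbitrary capacity for the sketch */

struct sketch_type_index {
	uint64_t bits[SK_NTYPES][SK_MAX_OBJECTS / 64];
};

/* Set bit i in the bitmap that matches the type of object i. */
static void sketch_build_type_index(struct sketch_type_index *idx,
				    const enum sketch_type *types, size_t nr)
{
	size_t i;

	memset(idx, 0, sizeof(*idx));
	for (i = 0; i < nr && i < SK_MAX_OBJECTS; i++)
		idx->bits[types[i]][i / 64] |= (uint64_t)1 << (i % 64);
}

/* Return 1 if the object at position pos has type t, else 0. */
static int sketch_has_type(const struct sketch_type_index *idx,
			   enum sketch_type t, size_t pos)
{
	return (int)((idx->bits[t][pos / 64] >> (pos % 64)) & 1);
}
```

Each position is set in exactly one of the four bitmaps, so ANDing a reachability bitmap with one of them filters the result by type.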
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Compute the actual bitmaps
|
||||||
|
*/
|
||||||
|
static struct object **seen_objects;
|
||||||
|
static unsigned int seen_objects_nr, seen_objects_alloc;
|
||||||
|
|
||||||
|
static inline void push_bitmapped_commit(struct commit *commit, struct ewah_bitmap *reused)
|
||||||
|
{
|
||||||
|
if (writer.selected_nr >= writer.selected_alloc) {
|
||||||
|
writer.selected_alloc = (writer.selected_alloc + 32) * 2;
|
||||||
|
writer.selected = xrealloc(writer.selected,
|
||||||
|
writer.selected_alloc * sizeof(struct bitmapped_commit));
|
||||||
|
}
|
||||||
|
|
||||||
|
writer.selected[writer.selected_nr].commit = commit;
|
||||||
|
writer.selected[writer.selected_nr].bitmap = reused;
|
||||||
|
writer.selected[writer.selected_nr].flags = 0;
|
||||||
|
|
||||||
|
writer.selected_nr++;
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline void mark_as_seen(struct object *object)
|
||||||
|
{
|
||||||
|
ALLOC_GROW(seen_objects, seen_objects_nr + 1, seen_objects_alloc);
|
||||||
|
seen_objects[seen_objects_nr++] = object;
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline void reset_all_seen(void)
|
||||||
|
{
|
||||||
|
unsigned int i;
|
||||||
|
for (i = 0; i < seen_objects_nr; ++i) {
|
||||||
|
seen_objects[i]->flags &= ~(SEEN | ADDED | SHOWN);
|
||||||
|
}
|
||||||
|
seen_objects_nr = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
static uint32_t find_object_pos(const unsigned char *sha1)
|
||||||
|
{
|
||||||
|
struct object_entry *entry = packlist_find(writer.to_pack, sha1, NULL);
|
||||||
|
|
||||||
|
if (!entry) {
|
||||||
|
die("Failed to write bitmap index. Packfile doesn't have full closure "
|
||||||
|
"(object %s is missing)", sha1_to_hex(sha1));
|
||||||
|
}
|
||||||
|
|
||||||
|
return entry->in_pack_pos;
|
||||||
|
}
|
||||||
|
|
||||||
|
static void show_object(struct object *object, const struct name_path *path,
|
||||||
|
const char *last, void *data)
|
||||||
|
{
|
||||||
|
struct bitmap *base = data;
|
||||||
|
bitmap_set(base, find_object_pos(object->sha1));
|
||||||
|
mark_as_seen(object);
|
||||||
|
}
|
||||||
|
|
||||||
|
static void show_commit(struct commit *commit, void *data)
|
||||||
|
{
|
||||||
|
mark_as_seen((struct object *)commit);
|
||||||
|
}
|
||||||
|
|
||||||
|
static int
|
||||||
|
add_to_include_set(struct bitmap *base, struct commit *commit)
|
||||||
|
{
|
||||||
|
khiter_t hash_pos;
|
||||||
|
uint32_t bitmap_pos = find_object_pos(commit->object.sha1);
|
||||||
|
|
||||||
|
if (bitmap_get(base, bitmap_pos))
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
hash_pos = kh_get_sha1(writer.bitmaps, commit->object.sha1);
|
||||||
|
if (hash_pos < kh_end(writer.bitmaps)) {
|
||||||
|
struct bitmapped_commit *bc = kh_value(writer.bitmaps, hash_pos);
|
||||||
|
bitmap_or_ewah(base, bc->bitmap);
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
bitmap_set(base, bitmap_pos);
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
static int
|
||||||
|
should_include(struct commit *commit, void *_data)
|
||||||
|
{
|
||||||
|
struct bitmap *base = _data;
|
||||||
|
|
||||||
|
if (!add_to_include_set(base, commit)) {
|
||||||
|
struct commit_list *parent = commit->parents;
|
||||||
|
|
||||||
|
mark_as_seen((struct object *)commit);
|
||||||
|
|
||||||
|
while (parent) {
|
||||||
|
parent->item->object.flags |= SEEN;
|
||||||
|
mark_as_seen((struct object *)parent->item);
|
||||||
|
parent = parent->next;
|
||||||
|
}
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
return 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
static void compute_xor_offsets(void)
|
||||||
|
{
|
||||||
|
static const int MAX_XOR_OFFSET_SEARCH = 10;
|
||||||
|
|
||||||
|
int i, next = 0;
|
||||||
|
|
||||||
|
while (next < writer.selected_nr) {
|
||||||
|
struct bitmapped_commit *stored = &writer.selected[next];
|
||||||
|
|
||||||
|
int best_offset = 0;
|
||||||
|
struct ewah_bitmap *best_bitmap = stored->bitmap;
|
||||||
|
struct ewah_bitmap *test_xor;
|
||||||
|
|
||||||
|
for (i = 1; i <= MAX_XOR_OFFSET_SEARCH; ++i) {
|
||||||
|
int curr = next - i;
|
||||||
|
|
||||||
|
if (curr < 0)
|
||||||
|
break;
|
||||||
|
|
||||||
|
test_xor = ewah_pool_new();
|
||||||
|
ewah_xor(writer.selected[curr].bitmap, stored->bitmap, test_xor);
|
||||||
|
|
||||||
|
if (test_xor->buffer_size < best_bitmap->buffer_size) {
|
||||||
|
if (best_bitmap != stored->bitmap)
|
||||||
|
ewah_pool_free(best_bitmap);
|
||||||
|
|
||||||
|
best_bitmap = test_xor;
|
||||||
|
best_offset = i;
|
||||||
|
} else {
|
||||||
|
ewah_pool_free(test_xor);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
stored->xor_offset = best_offset;
|
||||||
|
stored->write_as = best_bitmap;
|
||||||
|
|
||||||
|
next++;
|
||||||
|
}
|
||||||
|
}
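`compute_xor_offsets` tries each of the previous ten selected bitmaps as an XOR base and keeps the one whose result is strictly smaller than the best found so far. A minimal sketch of that selection loop, assuming each bitmap fits in a single 64-bit word and using the number of set bits as a stand-in for the compressed `buffer_size`; both simplifications, and the function names, are hypothetical.

```c
#include <stdint.h>

/* Count set bits; used here as a rough proxy for compressed size. */
static int sk_popcount64(uint64_t v)
{
	int n = 0;

	while (v) {
		v &= v - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Return how many entries back the best XOR base lies,
 * or 0 if storing maps[pos] as-is is at least as cheap.
 */
static int sk_best_xor_offset(const uint64_t *maps, int pos, int max_search)
{
	int best_offset = 0;
	int best_cost = sk_popcount64(maps[pos]);
	int i;

	for (i = 1; i <= max_search; i++) {
		int curr = pos - i;
		int cost;

		if (curr < 0)
			break;
		cost = sk_popcount64(maps[curr] ^ maps[pos]);
		if (cost < best_cost) { /* strict: ties keep the earlier choice */
			best_cost = cost;
			best_offset = i;
		}
	}
	return best_offset;
}
```

An offset of 0 means "no XOR base", which matches how `xor_offset` starts at 0 in the source and is only changed when a candidate is strictly smaller.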
|
||||||
|
|
||||||
|
void bitmap_writer_build(struct packing_data *to_pack)
|
||||||
|
{
|
||||||
|
static const double REUSE_BITMAP_THRESHOLD = 0.2;
|
||||||
|
|
||||||
|
int i, reuse_after, need_reset;
|
||||||
|
struct bitmap *base = bitmap_new();
|
||||||
|
struct rev_info revs;
|
||||||
|
|
||||||
|
writer.bitmaps = kh_init_sha1();
|
||||||
|
writer.to_pack = to_pack;
|
||||||
|
|
||||||
|
if (writer.show_progress)
|
||||||
|
writer.progress = start_progress("Building bitmaps", writer.selected_nr);
|
||||||
|
|
||||||
|
init_revisions(&revs, NULL);
|
||||||
|
revs.tag_objects = 1;
|
||||||
|
revs.tree_objects = 1;
|
||||||
|
revs.blob_objects = 1;
|
||||||
|
revs.no_walk = 0;
|
||||||
|
|
||||||
|
revs.include_check = should_include;
|
||||||
|
reset_revision_walk();
|
||||||
|
|
||||||
|
reuse_after = writer.selected_nr * REUSE_BITMAP_THRESHOLD;
|
||||||
|
need_reset = 0;
|
||||||
|
|
||||||
|
for (i = writer.selected_nr - 1; i >= 0; --i) {
|
||||||
|
struct bitmapped_commit *stored;
|
||||||
|
struct object *object;
|
||||||
|
|
||||||
|
khiter_t hash_pos;
|
||||||
|
int hash_ret;
|
||||||
|
|
||||||
|
stored = &writer.selected[i];
|
||||||
|
object = (struct object *)stored->commit;
|
||||||
|
|
||||||
|
if (stored->bitmap == NULL) {
|
||||||
|
if (i < writer.selected_nr - 1 &&
|
||||||
|
(need_reset ||
|
||||||
|
!in_merge_bases(writer.selected[i + 1].commit,
|
||||||
|
stored->commit))) {
|
||||||
|
bitmap_reset(base);
|
||||||
|
reset_all_seen();
|
||||||
|
}
|
||||||
|
|
||||||
|
add_pending_object(&revs, object, "");
|
||||||
|
revs.include_check_data = base;
|
||||||
|
|
||||||
|
if (prepare_revision_walk(&revs))
|
||||||
|
die("revision walk setup failed");
|
||||||
|
|
||||||
|
traverse_commit_list(&revs, show_commit, show_object, base);
|
||||||
|
|
||||||
|
revs.pending.nr = 0;
|
||||||
|
revs.pending.alloc = 0;
|
||||||
|
revs.pending.objects = NULL;
|
||||||
|
|
||||||
|
stored->bitmap = bitmap_to_ewah(base);
|
||||||
|
need_reset = 0;
|
||||||
|
} else
|
||||||
|
need_reset = 1;
|
||||||
|
|
||||||
|
if (i >= reuse_after)
|
||||||
|
stored->flags |= BITMAP_FLAG_REUSE;
|
||||||
|
|
||||||
|
hash_pos = kh_put_sha1(writer.bitmaps, object->sha1, &hash_ret);
|
||||||
|
if (hash_ret == 0)
|
||||||
|
die("Duplicate entry when writing index: %s",
|
||||||
|
sha1_to_hex(object->sha1));
|
||||||
|
|
||||||
|
kh_value(writer.bitmaps, hash_pos) = stored;
|
||||||
|
display_progress(writer.progress, writer.selected_nr - i);
|
||||||
|
}
|
||||||
|
|
||||||
|
bitmap_free(base);
|
||||||
|
stop_progress(&writer.progress);
|
||||||
|
|
||||||
|
compute_xor_offsets();
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Select the commits that will be bitmapped
|
||||||
|
*/
|
||||||
|
static inline unsigned int next_commit_index(unsigned int idx)
|
||||||
|
{
|
||||||
|
static const unsigned int MIN_COMMITS = 100;
|
||||||
|
static const unsigned int MAX_COMMITS = 5000;
|
||||||
|
|
||||||
|
static const unsigned int MUST_REGION = 100;
|
||||||
|
static const unsigned int MIN_REGION = 20000;
|
||||||
|
|
||||||
|
unsigned int offset, next;
|
||||||
|
|
||||||
|
if (idx <= MUST_REGION)
|
||||||
|
return 0;
|
||||||
|
|
||||||
|
if (idx <= MIN_REGION) {
|
||||||
|
offset = idx - MUST_REGION;
|
||||||
|
return (offset < MIN_COMMITS) ? offset : MIN_COMMITS;
|
||||||
|
}
|
||||||
|
|
||||||
|
offset = idx - MIN_REGION;
|
||||||
|
next = (offset < MAX_COMMITS) ? offset : MAX_COMMITS;
|
||||||
|
|
||||||
|
return (next > MIN_COMMITS) ? next : MIN_COMMITS;
|
||||||
|
}
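The spacing rule in `next_commit_index` is easier to follow with concrete inputs. The block below is a standalone copy of the same arithmetic under the hypothetical name `sketch_next_commit_index`, intended only for experimenting with values.

```c
/* Standalone copy of the selection-gap rule, for experimentation only. */
static unsigned int sketch_next_commit_index(unsigned int idx)
{
	const unsigned int MIN_COMMITS = 100;
	const unsigned int MAX_COMMITS = 5000;
	const unsigned int MUST_REGION = 100;
	const unsigned int MIN_REGION = 20000;
	unsigned int offset, next;

	/* The newest commits are all selected: no gap. */
	if (idx <= MUST_REGION)
		return 0;

	/* The gap grows linearly, capped at MIN_COMMITS. */
	if (idx <= MIN_REGION) {
		offset = idx - MUST_REGION;
		return (offset < MIN_COMMITS) ? offset : MIN_COMMITS;
	}

	/* Older history: gap between MIN_COMMITS and MAX_COMMITS. */
	offset = idx - MIN_REGION;
	next = (offset < MAX_COMMITS) ? offset : MAX_COMMITS;
	return (next > MIN_COMMITS) ? next : MIN_COMMITS;
}
```

With these constants, index 150 yields a gap of 50, index 300 yields 100, index 21000 yields 1000, and index 30000 yields the 5000 cap. Since the caller sorts commits newest first, bitmaps are dense near the tips and sparse deep in history.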
|
||||||
|
|
||||||
|
static int date_compare(const void *_a, const void *_b)
|
||||||
|
{
|
||||||
|
struct commit *a = *(struct commit **)_a;
|
||||||
|
struct commit *b = *(struct commit **)_b;
|
||||||
|
return (a->date < b->date) - (a->date > b->date);
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack)
|
||||||
|
{
|
||||||
|
if (prepare_bitmap_git() < 0)
|
||||||
|
return;
|
||||||
|
|
||||||
|
writer.reused = kh_init_sha1();
|
||||||
|
rebuild_existing_bitmaps(to_pack, writer.reused, writer.show_progress);
|
||||||
|
}
|
||||||
|
|
||||||
|
static struct ewah_bitmap *find_reused_bitmap(const unsigned char *sha1)
|
||||||
|
{
|
||||||
|
khiter_t hash_pos;
|
||||||
|
|
||||||
|
if (!writer.reused)
|
||||||
|
return NULL;
|
||||||
|
|
||||||
|
hash_pos = kh_get_sha1(writer.reused, sha1);
|
||||||
|
if (hash_pos >= kh_end(writer.reused))
|
||||||
|
return NULL;
|
||||||
|
|
||||||
|
return kh_value(writer.reused, hash_pos);
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_writer_select_commits(struct commit **indexed_commits,
|
||||||
|
unsigned int indexed_commits_nr,
|
||||||
|
int max_bitmaps)
|
||||||
|
{
|
||||||
|
unsigned int i = 0, j, next;
|
||||||
|
|
||||||
|
qsort(indexed_commits, indexed_commits_nr, sizeof(indexed_commits[0]),
|
||||||
|
date_compare);
|
||||||
|
|
||||||
|
if (writer.show_progress)
|
||||||
|
writer.progress = start_progress("Selecting bitmap commits", 0);
|
||||||
|
|
||||||
|
if (indexed_commits_nr < 100) {
|
||||||
|
for (i = 0; i < indexed_commits_nr; ++i)
|
||||||
|
push_bitmapped_commit(indexed_commits[i], NULL);
|
||||||
|
stop_progress(&writer.progress);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
for (;;) {
|
||||||
|
struct ewah_bitmap *reused_bitmap = NULL;
|
||||||
|
struct commit *chosen = NULL;
|
||||||
|
|
||||||
|
next = next_commit_index(i);
|
||||||
|
|
||||||
|
if (i + next >= indexed_commits_nr)
|
||||||
|
break;
|
||||||
|
|
||||||
|
if (max_bitmaps > 0 && writer.selected_nr >= max_bitmaps) {
|
||||||
|
writer.selected_nr = max_bitmaps;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (next == 0) {
|
||||||
|
chosen = indexed_commits[i];
|
||||||
|
reused_bitmap = find_reused_bitmap(chosen->object.sha1);
|
||||||
|
} else {
|
||||||
|
chosen = indexed_commits[i + next];
|
||||||
|
|
||||||
|
for (j = 0; j <= next; ++j) {
|
||||||
|
struct commit *cm = indexed_commits[i + j];
|
||||||
|
|
||||||
|
reused_bitmap = find_reused_bitmap(cm->object.sha1);
|
||||||
|
if (reused_bitmap || (cm->object.flags & NEEDS_BITMAP) != 0) {
|
||||||
|
chosen = cm;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (cm->parents && cm->parents->next)
|
||||||
|
chosen = cm;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
push_bitmapped_commit(chosen, reused_bitmap);
|
||||||
|
|
||||||
|
i += next + 1;
|
||||||
|
display_progress(writer.progress, i);
|
||||||
|
}
|
||||||
|
|
||||||
|
stop_progress(&writer.progress);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
static int sha1write_ewah_helper(void *f, const void *buf, size_t len)
|
||||||
|
{
|
||||||
|
/* sha1write will die on error */
|
||||||
|
sha1write(f, buf, len);
|
||||||
|
return len;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Write the bitmap index to disk
|
||||||
|
*/
|
||||||
|
static inline void dump_bitmap(struct sha1file *f, struct ewah_bitmap *bitmap)
|
||||||
|
{
|
||||||
|
if (ewah_serialize_to(bitmap, sha1write_ewah_helper, f) < 0)
|
||||||
|
die("Failed to write bitmap index");
|
||||||
|
}
|
||||||
|
|
||||||
|
static const unsigned char *sha1_access(size_t pos, void *table)
|
||||||
|
{
|
||||||
|
struct pack_idx_entry **index = table;
|
||||||
|
return index[pos]->sha1;
|
||||||
|
}
|
||||||
|
|
||||||
|
static void write_selected_commits_v1(struct sha1file *f,
|
||||||
|
struct pack_idx_entry **index,
|
||||||
|
uint32_t index_nr)
|
||||||
|
{
|
||||||
|
int i;
|
||||||
|
|
||||||
|
for (i = 0; i < writer.selected_nr; ++i) {
|
||||||
|
struct bitmapped_commit *stored = &writer.selected[i];
|
||||||
|
struct bitmap_disk_entry on_disk;
|
||||||
|
|
||||||
|
int commit_pos =
|
||||||
|
sha1_pos(stored->commit->object.sha1, index, index_nr, sha1_access);
|
||||||
|
|
||||||
|
if (commit_pos < 0)
|
||||||
|
die("BUG: trying to write commit not in index");
|
||||||
|
|
||||||
|
on_disk.object_pos = htonl(commit_pos);
|
||||||
|
on_disk.xor_offset = stored->xor_offset;
|
||||||
|
on_disk.flags = stored->flags;
|
||||||
|
|
||||||
|
sha1write(f, &on_disk, sizeof(on_disk));
|
||||||
|
dump_bitmap(f, stored->write_as);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
static void write_hash_cache(struct sha1file *f,
|
||||||
|
struct pack_idx_entry **index,
|
||||||
|
uint32_t index_nr)
|
||||||
|
{
|
||||||
|
uint32_t i;
|
||||||
|
|
||||||
|
for (i = 0; i < index_nr; ++i) {
|
||||||
|
struct object_entry *entry = (struct object_entry *)index[i];
|
||||||
|
uint32_t hash_value = htonl(entry->hash);
|
||||||
|
sha1write(f, &hash_value, sizeof(hash_value));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_writer_set_checksum(unsigned char *sha1)
|
||||||
|
{
|
||||||
|
hashcpy(writer.pack_checksum, sha1);
|
||||||
|
}
|
||||||
|
|
||||||
|
void bitmap_writer_finish(struct pack_idx_entry **index,
|
||||||
|
uint32_t index_nr,
|
||||||
|
const char *filename,
|
||||||
|
uint16_t options)
|
||||||
|
{
|
||||||
|
static char tmp_file[PATH_MAX];
|
||||||
|
static uint16_t default_version = 1;
|
||||||
|
static uint16_t flags = BITMAP_OPT_FULL_DAG;
|
||||||
|
struct sha1file *f;
|
||||||
|
|
||||||
|
struct bitmap_disk_header header;
|
||||||
|
|
||||||
|
int fd = odb_mkstemp(tmp_file, sizeof(tmp_file), "pack/tmp_bitmap_XXXXXX");
|
||||||
|
|
||||||
|
if (fd < 0)
|
||||||
|
die_errno("unable to create '%s'", tmp_file);
|
||||||
|
f = sha1fd(fd, tmp_file);
|
||||||
|
|
||||||
|
memcpy(header.magic, BITMAP_IDX_SIGNATURE, sizeof(BITMAP_IDX_SIGNATURE));
|
||||||
|
header.version = htons(default_version);
|
||||||
|
header.options = htons(flags | options);
|
||||||
|
header.entry_count = htonl(writer.selected_nr);
|
||||||
|
memcpy(header.checksum, writer.pack_checksum, 20);
|
||||||
|
|
||||||
|
sha1write(f, &header, sizeof(header));
|
||||||
|
dump_bitmap(f, writer.commits);
|
||||||
|
dump_bitmap(f, writer.trees);
|
||||||
|
dump_bitmap(f, writer.blobs);
|
||||||
|
dump_bitmap(f, writer.tags);
|
||||||
|
write_selected_commits_v1(f, index, index_nr);
|
||||||
|
|
||||||
|
if (options & BITMAP_OPT_HASH_CACHE)
|
||||||
|
write_hash_cache(f, index, index_nr);
|
||||||
|
|
||||||
|
sha1close(f, NULL, CSUM_FSYNC);
|
||||||
|
|
||||||
|
if (adjust_shared_perm(tmp_file))
|
||||||
|
die_errno("unable to make temporary bitmap file readable");
|
||||||
|
|
||||||
|
if (rename(tmp_file, filename))
|
||||||
|
die_errno("unable to rename temporary bitmap file to '%s'", filename);
|
||||||
|
}
|
1073
pack-bitmap.c
Normal file
File diff suppressed because it is too large
64
pack-bitmap.h
Normal file
@ -0,0 +1,64 @@
|
|||||||
|
#ifndef PACK_BITMAP_H
|
||||||
|
#define PACK_BITMAP_H
|
||||||
|
|
||||||
|
#include "ewah/ewok.h"
|
||||||
|
#include "khash.h"
|
||||||
|
#include "pack-objects.h"
|
||||||
|
|
||||||
|
struct bitmap_disk_entry {
|
||||||
|
uint32_t object_pos;
|
||||||
|
uint8_t xor_offset;
|
||||||
|
uint8_t flags;
|
||||||
|
} __attribute__((packed));
|
||||||
|
|
||||||
|
struct bitmap_disk_header {
|
||||||
|
char magic[4];
|
||||||
|
uint16_t version;
|
||||||
|
uint16_t options;
|
||||||
|
uint32_t entry_count;
|
||||||
|
unsigned char checksum[20];
|
||||||
|
};
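`bitmap_writer_finish` fills this header with `htons`/`htonl`, so the multi-byte fields are stored big-endian in the order the struct declares them. A minimal sketch that writes the same 32 bytes one field at a time, which avoids depending on struct padding; the `sk_` helpers are hypothetical and not part of the commit.

```c
#include <stdint.h>
#include <string.h>

#define SK_HEADER_SIZE 32 /* 4 + 2 + 2 + 4 + 20 bytes */

static void sk_put_be16(unsigned char *out, uint16_t v)
{
	out[0] = (unsigned char)(v >> 8);
	out[1] = (unsigned char)(v & 0xff);
}

static void sk_put_be32(unsigned char *out, uint32_t v)
{
	out[0] = (unsigned char)(v >> 24);
	out[1] = (unsigned char)((v >> 16) & 0xff);
	out[2] = (unsigned char)((v >> 8) & 0xff);
	out[3] = (unsigned char)(v & 0xff);
}

/* Serialize the header fields into buf, which must hold SK_HEADER_SIZE bytes. */
static void sk_write_header(unsigned char *buf, uint16_t version,
			    uint16_t options, uint32_t entry_count,
			    const unsigned char *checksum)
{
	memcpy(buf, "BITM", 4);             /* magic */
	sk_put_be16(buf + 4, version);      /* 1 in this commit */
	sk_put_be16(buf + 6, options);      /* BITMAP_OPT_* flags */
	sk_put_be32(buf + 8, entry_count);  /* number of selected commits */
	memcpy(buf + 12, checksum, 20);     /* SHA-1 of the packfile */
}
```

With `options` set to 5, the header advertises both `BITMAP_OPT_FULL_DAG` (1) and `BITMAP_OPT_HASH_CACHE` (4).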
|
||||||
|
|
||||||
|
static const char BITMAP_IDX_SIGNATURE[] = {'B', 'I', 'T', 'M'};
|
||||||
|
|
||||||
|
#define NEEDS_BITMAP (1u<<22)
|
||||||
|
|
||||||
|
enum pack_bitmap_opts {
|
||||||
|
BITMAP_OPT_FULL_DAG = 1,
|
||||||
|
BITMAP_OPT_HASH_CACHE = 4,
|
||||||
|
};
|
||||||
|
|
||||||
|
enum pack_bitmap_flags {
|
||||||
|
BITMAP_FLAG_REUSE = 0x1
|
||||||
|
};
|
||||||
|
|
||||||
|
typedef int (*show_reachable_fn)(
|
||||||
|
const unsigned char *sha1,
|
||||||
|
enum object_type type,
|
||||||
|
int flags,
|
||||||
|
uint32_t hash,
|
||||||
|
struct packed_git *found_pack,
|
||||||
|
off_t found_offset);
|
||||||
|
|
||||||
|
int prepare_bitmap_git(void);
|
||||||
|
void count_bitmap_commit_list(uint32_t *commits, uint32_t *trees, uint32_t *blobs, uint32_t *tags);
|
||||||
|
void traverse_bitmap_commit_list(show_reachable_fn show_reachable);
|
||||||
|
void test_bitmap_walk(struct rev_info *revs);
|
||||||
|
char *pack_bitmap_filename(struct packed_git *p);
|
||||||
|
int prepare_bitmap_walk(struct rev_info *revs);
|
||||||
|
int reuse_partial_packfile_from_bitmap(struct packed_git **packfile, uint32_t *entries, off_t *up_to);
|
||||||
|
int rebuild_existing_bitmaps(struct packing_data *mapping, khash_sha1 *reused_bitmaps, int show_progress);
|
||||||
|
|
||||||
|
void bitmap_writer_show_progress(int show);
|
||||||
|
void bitmap_writer_set_checksum(unsigned char *sha1);
|
||||||
|
void bitmap_writer_build_type_index(struct pack_idx_entry **index, uint32_t index_nr);
|
||||||
|
void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack);
|
||||||
|
void bitmap_writer_select_commits(struct commit **indexed_commits,
|
||||||
|
unsigned int indexed_commits_nr, int max_bitmaps);
|
||||||
|
void bitmap_writer_build(struct packing_data *to_pack);
|
||||||
|
void bitmap_writer_finish(struct pack_idx_entry **index,
|
||||||
|
uint32_t index_nr,
|
||||||
|
const char *filename,
|
||||||
|
uint16_t options);
|
||||||
|
|
||||||
|
#endif
|
111
pack-objects.c
Normal file
@@ -0,0 +1,111 @@
+#include "cache.h"
+#include "object.h"
+#include "pack.h"
+#include "pack-objects.h"
+
+static uint32_t locate_object_entry_hash(struct packing_data *pdata,
+					 const unsigned char *sha1,
+					 int *found)
+{
+	uint32_t i, hash, mask = (pdata->index_size - 1);
+
+	memcpy(&hash, sha1, sizeof(uint32_t));
+	i = hash & mask;
+
+	while (pdata->index[i] > 0) {
+		uint32_t pos = pdata->index[i] - 1;
+
+		if (!hashcmp(sha1, pdata->objects[pos].idx.sha1)) {
+			*found = 1;
+			return i;
+		}
+
+		i = (i + 1) & mask;
+	}
+
+	*found = 0;
+	return i;
+}
+
+static inline uint32_t closest_pow2(uint32_t v)
+{
+	v = v - 1;
+	v |= v >> 1;
+	v |= v >> 2;
+	v |= v >> 4;
+	v |= v >> 8;
+	v |= v >> 16;
+	return v + 1;
+}
+
+static void rehash_objects(struct packing_data *pdata)
+{
+	uint32_t i;
+	struct object_entry *entry;
+
+	pdata->index_size = closest_pow2(pdata->nr_objects * 3);
+	if (pdata->index_size < 1024)
+		pdata->index_size = 1024;
+
+	pdata->index = xrealloc(pdata->index, sizeof(uint32_t) * pdata->index_size);
+	memset(pdata->index, 0, sizeof(int) * pdata->index_size);
+
+	entry = pdata->objects;
+
+	for (i = 0; i < pdata->nr_objects; i++) {
+		int found;
+		uint32_t ix = locate_object_entry_hash(pdata, entry->idx.sha1, &found);
+
+		if (found)
+			die("BUG: Duplicate object in hash");
+
+		pdata->index[ix] = i + 1;
+		entry++;
+	}
+}
+
+struct object_entry *packlist_find(struct packing_data *pdata,
+				   const unsigned char *sha1,
+				   uint32_t *index_pos)
+{
+	uint32_t i;
+	int found;
+
+	if (!pdata->index_size)
+		return NULL;
+
+	i = locate_object_entry_hash(pdata, sha1, &found);
+
+	if (index_pos)
+		*index_pos = i;
+
+	if (!found)
+		return NULL;
+
+	return &pdata->objects[pdata->index[i] - 1];
+}
+
+struct object_entry *packlist_alloc(struct packing_data *pdata,
+				    const unsigned char *sha1,
+				    uint32_t index_pos)
+{
+	struct object_entry *new_entry;
+
+	if (pdata->nr_objects >= pdata->nr_alloc) {
+		pdata->nr_alloc = (pdata->nr_alloc + 1024) * 3 / 2;
+		pdata->objects = xrealloc(pdata->objects,
+					  pdata->nr_alloc * sizeof(*new_entry));
+	}
+
+	new_entry = pdata->objects + pdata->nr_objects++;
+
+	memset(new_entry, 0, sizeof(*new_entry));
+	hashcpy(new_entry->idx.sha1, sha1);
+
+	if (pdata->index_size * 3 <= pdata->nr_objects * 4)
+		rehash_objects(pdata);
+	else
+		pdata->index[index_pos] = pdata->nr_objects;
+
+	return new_entry;
+}
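Note on the lookup in pack-objects.c: `locate_object_entry_hash()` is open addressing with linear probing over a power-of-two table, and each slot stores "position plus one" so that 0 can mean "empty". The following is only a minimal standalone sketch of that idea, assuming plain 32-bit integer keys in place of SHA-1s and a fixed-size table with no rehashing. The names `toy_table`, `toy_locate`, `toy_insert` and `TOY_SLOTS` are hypothetical and do not exist in git.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SLOTS 8	/* must be a power of two so that (size - 1) is a bit mask */

/* Hypothetical stand-in for struct packing_data: keys[] plays the role of objects[]. */
struct toy_table {
	uint32_t keys[TOY_SLOTS];
	uint32_t nr_keys;
	uint32_t index[TOY_SLOTS];	/* 0 = empty slot, otherwise position in keys[] plus one */
};

/* Return the slot holding `key`, or the empty slot where it would be stored. */
static uint32_t toy_locate(const struct toy_table *t, uint32_t key, int *found)
{
	uint32_t mask = TOY_SLOTS - 1;
	uint32_t i = key & mask;

	while (t->index[i] > 0) {
		if (t->keys[t->index[i] - 1] == key) {
			*found = 1;
			return i;
		}
		i = (i + 1) & mask;	/* linear probing, wrapping at the end of the table */
	}
	*found = 0;
	return i;
}

/* Append `key` unless it is already present; return its position in keys[]. */
static uint32_t toy_insert(struct toy_table *t, uint32_t key)
{
	int found;
	uint32_t slot = toy_locate(t, key, &found);

	if (found)
		return t->index[slot] - 1;
	/* keep at least one slot free so the probe loop always terminates */
	assert(t->nr_keys < TOY_SLOTS - 1);
	t->keys[t->nr_keys] = key;
	t->index[slot] = ++t->nr_keys;
	return t->nr_keys - 1;
}
```

With a zero-initialised `struct toy_table`, inserting 3 and then 11 makes both keys start probing at slot 3 (11 & 7 is 3), so the second key lands in slot 4. The real code avoids a full table by rehashing once the index is no more than 4/3 the number of objects; the sketch simply asserts instead.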
68
pack-objects.h
Normal file
@@ -0,0 +1,68 @@
+#ifndef PACK_OBJECTS_H
+#define PACK_OBJECTS_H
+
+struct object_entry {
+	struct pack_idx_entry idx;
+	unsigned long size;	/* uncompressed size */
+	struct packed_git *in_pack;	/* already in pack */
+	off_t in_pack_offset;
+	struct object_entry *delta;	/* delta base object */
+	struct object_entry *delta_child; /* deltified objects who bases me */
+	struct object_entry *delta_sibling; /* other deltified objects who
+					     * uses the same base as me
+					     */
+	void *delta_data;	/* cached delta (uncompressed) */
+	unsigned long delta_size;	/* delta data size (uncompressed) */
+	unsigned long z_delta_size;	/* delta data size (compressed) */
+	enum object_type type;
+	enum object_type in_pack_type;	/* could be delta */
+	uint32_t hash;			/* name hint hash */
+	unsigned int in_pack_pos;
+	unsigned char in_pack_header_size;
+	unsigned preferred_base:1; /*
+				    * we do not pack this, but is available
+				    * to be used as the base object to delta
+				    * objects against.
+				    */
+	unsigned no_try_delta:1;
+	unsigned tagged:1; /* near the very tip of refs */
+	unsigned filled:1; /* assigned write-order */
+};
+
+struct packing_data {
+	struct object_entry *objects;
+	uint32_t nr_objects, nr_alloc;
+
+	int32_t *index;
+	uint32_t index_size;
+};
+
+struct object_entry *packlist_alloc(struct packing_data *pdata,
+				    const unsigned char *sha1,
+				    uint32_t index_pos);
+
+struct object_entry *packlist_find(struct packing_data *pdata,
+				   const unsigned char *sha1,
+				   uint32_t *index_pos);
+
+static inline uint32_t pack_name_hash(const char *name)
+{
+	uint32_t c, hash = 0;
+
+	if (!name)
+		return 0;
+
+	/*
+	 * This effectively just creates a sortable number from the
+	 * last sixteen non-whitespace characters. Last characters
+	 * count "most", so things that end in ".c" sort together.
+	 */
+	while ((c = *name++) != 0) {
+		if (isspace(c))
+			continue;
+		hash = (hash >> 2) + (c << 24);
+	}
+	return hash;
+}
+
+#endif
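Note on `pack_name_hash()`: every step shifts the running value right by two bits and adds the newest character at the top byte, so later characters weigh more and names with a common suffix tend to sort next to each other. The sketch below repeats the same arithmetic under a hypothetical name, `toy_name_hash`, so that it builds outside of git. It reads bytes as `unsigned char`, which is an assumption of the sketch and only matters for non-ASCII names. The helper `toy_check_suffix_grouping` is likewise hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

/*
 * Same arithmetic as pack_name_hash(), under a hypothetical name so that it
 * builds outside of git. Bytes are read as unsigned char here, which only
 * matters for non-ASCII names.
 */
static uint32_t toy_name_hash(const char *name)
{
	uint32_t c, hash = 0;

	if (!name)
		return 0;

	while ((c = (unsigned char)*name++) != 0) {
		if (isspace(c))
			continue;
		/* older characters slide down two bits; the newest is added at the top byte */
		hash = (hash >> 2) + (c << 24);
	}
	return hash;
}

/* Names sharing a suffix end up close together when sorted by hash. */
static void toy_check_suffix_grouping(void)
{
	uint32_t bar_c = toy_name_hash("bar.c");
	uint32_t foo_c = toy_name_hash("foo.c");
	uint32_t foo_h = toy_name_hash("foo.h");

	assert(bar_c < foo_c && foo_c < foo_h);
	/* the two ".c" names are closer to each other than "foo.c" is to "foo.h" */
	assert(foo_c - bar_c < foo_h - foo_c);
}
```

Calling `toy_check_suffix_grouping()` from a `main()` is enough to exercise the sketch. Whitespace is skipped entirely, so `"a b"` and `"ab"` hash to the same value.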
@@ -16,11 +16,6 @@
  * get the object sha1 from the main index.
  */
 
-struct pack_revindex {
-	struct packed_git *p;
-	struct revindex_entry *revindex;
-};
-
 static struct pack_revindex *pack_revindex;
 static int pack_revindex_hashsz;
 
@@ -201,15 +196,14 @@ static void create_pack_revindex(struct pack_revindex *rix)
 	sort_revindex(rix->revindex, num_ent, p->pack_size);
 }
 
-struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs)
+struct pack_revindex *revindex_for_pack(struct packed_git *p)
 {
 	int num;
-	unsigned lo, hi;
 	struct pack_revindex *rix;
-	struct revindex_entry *revindex;
 
 	if (!pack_revindex_hashsz)
 		init_pack_revindex();
+
 	num = pack_revindex_ix(p);
 	if (num < 0)
 		die("internal error: pack revindex fubar");
@@ -217,30 +211,37 @@ struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs)
 	rix = &pack_revindex[num];
 	if (!rix->revindex)
 		create_pack_revindex(rix);
-	revindex = rix->revindex;
 
-	lo = 0;
-	hi = p->num_objects + 1;
+	return rix;
+}
+
+int find_revindex_position(struct pack_revindex *pridx, off_t ofs)
+{
+	int lo = 0;
+	int hi = pridx->p->num_objects + 1;
+	struct revindex_entry *revindex = pridx->revindex;
+
 	do {
 		unsigned mi = lo + (hi - lo) / 2;
 		if (revindex[mi].offset == ofs) {
-			return revindex + mi;
+			return mi;
 		} else if (ofs < revindex[mi].offset)
 			hi = mi;
 		else
 			lo = mi + 1;
 	} while (lo < hi);
+
 	error("bad offset for revindex");
-	return NULL;
+	return -1;
 }
 
-void discard_revindex(void)
+struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs)
 {
-	if (pack_revindex_hashsz) {
-		int i;
-		for (i = 0; i < pack_revindex_hashsz; i++)
-			free(pack_revindex[i].revindex);
-		free(pack_revindex);
-		pack_revindex_hashsz = 0;
-	}
+	struct pack_revindex *pridx = revindex_for_pack(p);
+	int pos = find_revindex_position(pridx, ofs);
+
+	if (pos < 0)
+		return NULL;
+
+	return pridx->revindex + pos;
 }
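Note on the revindex change: the lookup is split into a helper that returns a position (or -1) and a thin wrapper that turns the position into a pointer (or NULL). A rough standalone sketch of that shape follows, assuming a plain sorted array of offsets instead of `struct revindex_entry`. The names `toy_find_position` and `toy_find_entry` are hypothetical, and the loop is written as a `while` so that an empty array is handled too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Binary search over offsets[0..nr-1], sorted ascending; returns the index or -1. */
static int toy_find_position(const uint64_t *offsets, int nr, uint64_t ofs)
{
	int lo = 0;
	int hi = nr;

	assert(nr >= 0);
	while (lo < hi) {
		int mi = lo + (hi - lo) / 2;

		if (offsets[mi] == ofs)
			return mi;
		else if (ofs < offsets[mi])
			hi = mi;
		else
			lo = mi + 1;
	}
	return -1;
}

/* Pointer-returning wrapper, in the spirit of find_pack_revindex(). */
static const uint64_t *toy_find_entry(const uint64_t *offsets, int nr, uint64_t ofs)
{
	int pos = toy_find_position(offsets, nr, ofs);

	if (pos < 0)
		return NULL;
	return offsets + pos;
}
```

Returning a position lets a caller that needs the neighbouring entry reach it by index, while existing pointer-based callers keep going through the wrapper.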
@@ -6,7 +6,14 @@ struct revindex_entry {
 	unsigned int nr;
 };
 
+struct pack_revindex {
+	struct packed_git *p;
+	struct revindex_entry *revindex;
+};
+
+struct pack_revindex *revindex_for_pack(struct packed_git *p);
+int find_revindex_position(struct pack_revindex *pridx, off_t ofs);
+
 struct revindex_entry *find_pack_revindex(struct packed_git *p, off_t ofs);
-void discard_revindex(void);
 
 #endif
@@ -364,5 +364,7 @@ void finish_tmp_packfile(char *name_buffer,
 	if (rename(idx_tmp_name, name_buffer))
 		die_errno("unable to rename temporary index file");
 
+	*end_of_name_prefix = '\0';
+
 	free((void *)idx_tmp_name);
 }
44
read-cache.c
@@ -1314,26 +1314,6 @@ int read_index(struct index_state *istate)
 	return read_index_from(istate, get_index_file());
 }
 
-#ifndef NEEDS_ALIGNED_ACCESS
-#define ntoh_s(var) ntohs(var)
-#define ntoh_l(var) ntohl(var)
-#else
-static inline uint16_t ntoh_s_force_align(void *p)
-{
-	uint16_t x;
-	memcpy(&x, p, sizeof(x));
-	return ntohs(x);
-}
-static inline uint32_t ntoh_l_force_align(void *p)
-{
-	uint32_t x;
-	memcpy(&x, p, sizeof(x));
-	return ntohl(x);
-}
-#define ntoh_s(var) ntoh_s_force_align(&(var))
-#define ntoh_l(var) ntoh_l_force_align(&(var))
-#endif
-
 static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *ondisk,
 						   unsigned int flags,
 						   const char *name,
@@ -1341,16 +1321,16 @@ static struct cache_entry *cache_entry_from_ondisk(struct ondisk_cache_entry *on
 {
 	struct cache_entry *ce = xmalloc(cache_entry_size(len));
 
-	ce->ce_stat_data.sd_ctime.sec = ntoh_l(ondisk->ctime.sec);
-	ce->ce_stat_data.sd_mtime.sec = ntoh_l(ondisk->mtime.sec);
-	ce->ce_stat_data.sd_ctime.nsec = ntoh_l(ondisk->ctime.nsec);
-	ce->ce_stat_data.sd_mtime.nsec = ntoh_l(ondisk->mtime.nsec);
-	ce->ce_stat_data.sd_dev = ntoh_l(ondisk->dev);
-	ce->ce_stat_data.sd_ino = ntoh_l(ondisk->ino);
-	ce->ce_mode = ntoh_l(ondisk->mode);
-	ce->ce_stat_data.sd_uid = ntoh_l(ondisk->uid);
-	ce->ce_stat_data.sd_gid = ntoh_l(ondisk->gid);
-	ce->ce_stat_data.sd_size = ntoh_l(ondisk->size);
+	ce->ce_stat_data.sd_ctime.sec = get_be32(&ondisk->ctime.sec);
+	ce->ce_stat_data.sd_mtime.sec = get_be32(&ondisk->mtime.sec);
+	ce->ce_stat_data.sd_ctime.nsec = get_be32(&ondisk->ctime.nsec);
+	ce->ce_stat_data.sd_mtime.nsec = get_be32(&ondisk->mtime.nsec);
+	ce->ce_stat_data.sd_dev = get_be32(&ondisk->dev);
+	ce->ce_stat_data.sd_ino = get_be32(&ondisk->ino);
+	ce->ce_mode = get_be32(&ondisk->mode);
+	ce->ce_stat_data.sd_uid = get_be32(&ondisk->uid);
+	ce->ce_stat_data.sd_gid = get_be32(&ondisk->gid);
+	ce->ce_stat_data.sd_size = get_be32(&ondisk->size);
 	ce->ce_flags = flags & ~CE_NAMEMASK;
 	ce->ce_namelen = len;
 	hashcpy(ce->sha1, ondisk->sha1);
@@ -1390,14 +1370,14 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 	unsigned int flags;
 
 	/* On-disk flags are just 16 bits */
-	flags = ntoh_s(ondisk->flags);
+	flags = get_be16(&ondisk->flags);
 	len = flags & CE_NAMEMASK;
 
 	if (flags & CE_EXTENDED) {
 		struct ondisk_cache_entry_extended *ondisk2;
 		int extended_flags;
 		ondisk2 = (struct ondisk_cache_entry_extended *)ondisk;
-		extended_flags = ntoh_s(ondisk2->flags2) << 16;
+		extended_flags = get_be16(&ondisk2->flags2) << 16;
 		/* We do not yet understand any bit out of CE_EXTENDED_FLAGS */
 		if (extended_flags & ~CE_EXTENDED_FLAGS)
 			die("Unknown index entry format %08x", extended_flags);
@@ -774,6 +774,10 @@ static int add_parents_to_list(struct rev_info *revs, struct commit *commit,
 		return 0;
 	commit->object.flags |= ADDED;
 
+	if (revs->include_check &&
+	    !revs->include_check(commit, revs->include_check_data))
+		return 0;
+
 	/*
 	 * If the commit is uninteresting, don't try to
 	 * prune parents - we want the maximal uninteresting
@@ -172,6 +172,8 @@ struct rev_info {
 	unsigned long min_age;
 	int min_parents;
 	int max_parents;
+	int (*include_check)(struct commit *, void *);
+	void *include_check_data;
 
 	/* diff info for patches and for paths limiting */
 	struct diff_options diffopt;
@@ -252,8 +252,6 @@ char *sha1_pack_index_name(const unsigned char *sha1)
 struct alternate_object_database *alt_odb_list;
 static struct alternate_object_database **alt_odb_tail;
 
-static int git_open_noatime(const char *name);
-
 /*
  * Prepare alternate object database registry.
  *
@@ -1232,6 +1230,7 @@ static void prepare_packed_git_one(char *objdir, int local)
 
 		if (has_extension(de->d_name, ".idx") ||
 		    has_extension(de->d_name, ".pack") ||
+		    has_extension(de->d_name, ".bitmap") ||
 		    has_extension(de->d_name, ".keep"))
 			string_list_append(&garbage, path);
 		else
@@ -1316,7 +1315,6 @@ void prepare_packed_git(void)
 
 void reprepare_packed_git(void)
 {
-	discard_revindex();
 	prepare_packed_git_run_once = 0;
 	prepare_packed_git();
 }
@@ -1393,7 +1391,7 @@ int check_sha1_signature(const unsigned char *sha1, void *map,
 	return hashcmp(sha1, real_sha1) ? -1 : 0;
 }
 
-static int git_open_noatime(const char *name)
+int git_open_noatime(const char *name)
 {
 	static int sha1_file_open_flag = O_NOATIME;
 
57
t/perf/p5310-pack-bitmaps.sh
Executable file
@@ -0,0 +1,57 @@
+#!/bin/sh
+
+test_description='Tests pack performance using bitmaps'
+. ./perf-lib.sh
+
+test_perf_large_repo
+
+# note that we do everything through config,
+# since we want to be able to compare bitmap-aware
+# git versus non-bitmap git
+test_expect_success 'setup bitmap config' '
+	git config pack.writebitmaps true &&
+	git config pack.writebitmaphashcache true
+'
+
+test_perf 'repack to disk' '
+	git repack -ad
+'
+
+test_perf 'simulated clone' '
+	git pack-objects --stdout --all </dev/null >/dev/null
+'
+
+test_perf 'simulated fetch' '
+	have=$(git rev-list HEAD~100 -1) &&
+	{
+		echo HEAD &&
+		echo ^$have
+	} | git pack-objects --revs --stdout >/dev/null
+'
+
+test_expect_success 'create partial bitmap state' '
+	# pick a commit to represent the repo tip in the past
+	cutoff=$(git rev-list HEAD~100 -1) &&
+	orig_tip=$(git rev-parse HEAD) &&
+
+	# now kill off all of the refs and pretend we had
+	# just the one tip
+	rm -rf .git/logs .git/refs/* .git/packed-refs
+	git update-ref HEAD $cutoff
+
+	# and then repack, which will leave us with a nice
+	# big bitmap pack of the "old" history, and all of
+	# the new history will be loose, as if it had been pushed
+	# up incrementally and exploded via unpack-objects
+	git repack -Ad
+
+	# and now restore our original tip, as if the pushes
+	# had happened
+	git update-ref HEAD $orig_tip
+'
+
+test_perf 'partial bitmap' '
+	git pack-objects --stdout --all </dev/null >/dev/null
+'
+
+test_done
139
t/t5310-pack-bitmaps.sh
Executable file
@@ -0,0 +1,139 @@
+#!/bin/sh
+
+test_description='exercise basic bitmap functionality'
+. ./test-lib.sh
+
+test_expect_success 'setup repo with moderate-sized history' '
+	for i in $(test_seq 1 10); do
+		test_commit $i
+	done &&
+	git checkout -b other HEAD~5 &&
+	for i in $(test_seq 1 10); do
+		test_commit side-$i
+	done &&
+	git checkout master &&
+	blob=$(echo tagged-blob | git hash-object -w --stdin) &&
+	git tag tagged-blob $blob &&
+	git config pack.writebitmaps true &&
+	git config pack.writebitmaphashcache true
+'
+
+test_expect_success 'full repack creates bitmaps' '
+	git repack -ad &&
+	ls .git/objects/pack/ | grep bitmap >output &&
+	test_line_count = 1 output
+'
+
+test_expect_success 'rev-list --test-bitmap verifies bitmaps' '
+	git rev-list --test-bitmap HEAD
+'
+
+rev_list_tests() {
+	state=$1
+
+	test_expect_success "counting commits via bitmap ($state)" '
+		git rev-list --count HEAD >expect &&
+		git rev-list --use-bitmap-index --count HEAD >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "counting partial commits via bitmap ($state)" '
+		git rev-list --count HEAD~5..HEAD >expect &&
+		git rev-list --use-bitmap-index --count HEAD~5..HEAD >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "counting non-linear history ($state)" '
+		git rev-list --count other...master >expect &&
+		git rev-list --use-bitmap-index --count other...master >actual &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "enumerate --objects ($state)" '
+		git rev-list --objects --use-bitmap-index HEAD >tmp &&
+		cut -d" " -f1 <tmp >tmp2 &&
+		sort <tmp2 >actual &&
+		git rev-list --objects HEAD >tmp &&
+		cut -d" " -f1 <tmp >tmp2 &&
+		sort <tmp2 >expect &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "bitmap --objects handles non-commit objects ($state)" '
+		git rev-list --objects --use-bitmap-index HEAD tagged-blob >actual &&
+		grep $blob actual
+	'
+}
+
+rev_list_tests 'full bitmap'
+
+test_expect_success 'clone from bitmapped repository' '
+	git clone --no-local --bare . clone.git &&
+	git rev-parse HEAD >expect &&
+	git --git-dir=clone.git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup further non-bitmapped commits' '
+	for i in $(test_seq 1 10); do
+		test_commit further-$i
+	done
+'
+
+rev_list_tests 'partial bitmap'
+
+test_expect_success 'fetch (partial bitmap)' '
+	git --git-dir=clone.git fetch origin master:master &&
+	git rev-parse HEAD >expect &&
+	git --git-dir=clone.git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'incremental repack cannot create bitmaps' '
+	test_commit more-1 &&
+	test_must_fail git repack -d
+'
+
+test_expect_success 'incremental repack can disable bitmaps' '
+	test_commit more-2 &&
+	git repack -d --no-write-bitmap-index
+'
+
+test_expect_success 'full repack, reusing previous bitmaps' '
+	git repack -ad &&
+	ls .git/objects/pack/ | grep bitmap >output &&
+	test_line_count = 1 output
+'
+
+test_expect_success 'fetch (full bitmap)' '
+	git --git-dir=clone.git fetch origin master:master &&
+	git rev-parse HEAD >expect &&
+	git --git-dir=clone.git rev-parse HEAD >actual &&
+	test_cmp expect actual
+'
+
+test_lazy_prereq JGIT '
+	type jgit
+'
+
+test_expect_success JGIT 'we can read jgit bitmaps' '
+	git clone . compat-jgit &&
+	(
+		cd compat-jgit &&
+		rm -f .git/objects/pack/*.bitmap &&
+		jgit gc &&
+		git rev-list --test-bitmap HEAD
+	)
+'
+
+test_expect_success JGIT 'jgit can read our bitmaps' '
+	git clone . compat-us &&
+	(
+		cd compat-us &&
+		git repack -adb &&
+		# jgit gc will barf if it does not like our bitmaps
+		jgit gc
+	)
+'
+
+test_done