This is the technical documentation for the JGit-compatible Bitmap v1
on-disk format.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
EWAH is a word-aligned compressed variant of a bitset (i.e. a data
structure that acts as a 0-indexed boolean array for many entries).
It uses a 64-bit run-length encoding (RLE) compression scheme,
trading some compression for better processing speed.
The goal of this word-aligned implementation is not to achieve
the best compression, but rather to improve query processing time.
As it stands right now, this EWAH implementation will always be more
efficient storage-wise than its uncompressed alternative.
EWAH arrays will be used as the on-disk format to store reachability
bitmaps for all objects in a repository while keeping reasonable sizes,
in the same way that JGit does.
This EWAH implementation is a mostly straightforward port of the
original `javaewah` library that JGit currently uses. The library is
self-contained and has been embedded whole (4 files) inside the `ewah`
folder to ease redistribution.
The library is re-licensed under the GPLv2 with the permission of Daniel
Lemire, the original author. The source code for the C version can
be found on GitHub:
https://github.com/vmg/libewok
The original Java implementation can also be found on GitHub:
https://github.com/lemire/javaewah
[jc: stripped debug-only code per Peff's $gmane/239768]
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The POSIX standard doesn't currently define a `ntohll`/`htonll`
function pair to perform network-to-host and host-to-network
swaps of 64-bit data. These 64-bit swaps are necessary for the on-disk
storage of EWAH bitmaps if they are not in native byte order.
Many thanks to Ramsay Jones <ramsay@ramsay1.demon.co.uk> and
Torsten Bögershausen <tboegi@web.de> for cygwin/mingw/msvc
portability fixes.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `git_open_noatime` helper can be of general interest for other
consumers of git's different on-disk formats.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit enables users of `struct rev_info` to peform custom limiting
during a revision walk (i.e. `get_revision`).
If the field `include_check` has been set to a callback, this callback
will be issued once for each commit before it is added to the "pending"
list of the revwalk. If the include check returns 0, the commit will be
marked as added but won't be pushed to the pending list, effectively
limiting the walk.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the pack-objects system grows beyond the single
pack-objects.c file, more parts (like the soon-to-exist
bitmap code) will need to compute hashes for matching
deltas. Factor out name_hash to make it available to other
files.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The hash table that stores the packing list for a given `pack-objects`
run was tightly coupled to the pack-objects code.
In this commit, we refactor the hash table and the underlying storage
array into a `packing_data` struct. The functionality for accessing and
adding entries to the packing list is hence accessible from other parts
of Git besides the `pack-objects` builtin.
This refactoring is a requirement for further patches in this series
that will require accessing the commit packing list from outside of
`pack-objects`.
The hash table implementation has been minimally altered: we now
use table sizes which are always a power of two, to ensure a uniform
index distribution in the array.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow users to efficiently lookup consecutive entries that are expected
to be found on the same revindex by exporting `find_revindex_position`:
this function takes a pointer to revindex itself, instead of looking up
the proper revindex for a given packfile on each call.
Signed-off-by: Vicent Marti <tanoku@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are passed a "void *" and write it out without ever
touching it; let's indicate that by using "const".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git ls-files -k" needs to crawl only the part of the working tree
that may overlap the paths in the index to find killed files, but
shared code with the logic to find all the untracked files, which
made it unnecessarily inefficient.
* jc/ls-files-killed-optim:
dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage
t3010: update to demonstrate "ls-files -k" optimization pitfalls
ls-files -k: a directory only can be killed if the index has a non-directory
dir.c: use the cache_* macro to access the current index
"git branch --track" had a minor regression in v1.8.3.2 and later
that made it impossible to base your local work on anything but a
local branch of the upstream repository you are tracking from.
* jh/checkout-auto-tracking:
t3200: fix failure on case-insensitive filesystems
branch.c: Relax unnecessary requirement on upstream's remote ref name
t3200: Add test demonstrating minor regression in 41c21f2
Refer to branch.<name>.remote/merge when documenting --track
t3200: Minor fix when preparing for tracking failure
t2024: Fix &&-chaining and a couple of typos
When there is no sufficient overlap between old and new history
during a "git fetch" into a shallow repository, objects that the
sending side knows the receiving end has were unnecessarily sent.
* nd/fetch-into-shallow:
Add testcase for needless objects during a shallow fetch
list-objects: mark more commits as edges in mark_edges_uninteresting
list-objects: reduce one argument in mark_edges_uninteresting
upload-pack: delegate rev walking in shallow fetch to pack-objects
shallow: add setup_temporary_shallow()
shallow: only add shallow graft points to new shallow file
move setup_alternate_shallow and write_shallow_commits to shallow.c
Cleanups and tweaks for credential handling to work with ancient versions
of the gnome-keyring library that are still in use.
* bc/gnome-keyring:
contrib/git-credential-gnome-keyring.c: support really ancient gnome-keyring
contrib/git-credential-gnome-keyring.c: support ancient gnome-keyring
contrib/git-credential-gnome-keyring.c: report failure to store password
contrib/git-credential-gnome-keyring.c: use glib messaging functions
contrib/git-credential-gnome-keyring.c: use glib memory allocation functions
contrib/git-credential-gnome-keyring.c: use secure memory for reading passwords
contrib/git-credential-gnome-keyring.c: use secure memory functions for passwds
contrib/git-credential-gnome-keyring.c: use gnome helpers in keyring_object()
contrib/git-credential-gnome-keyring.c: set Gnome application name
contrib/git-credential-gnome-keyring.c: ensure buffer is non-empty before accessing
contrib/git-credential-gnome-keyring.c: strlen() returns size_t, not ssize_t
contrib/git-credential-gnome-keyring.c: exit non-zero when called incorrectly
contrib/git-credential-gnome-keyring.c: add static where applicable
contrib/git-credential-gnome-keyring.c: *style* use "if ()" not "if()" etc.
contrib/git-credential-gnome-keyring.c: remove unused die() function
contrib/git-credential-gnome-keyring.c: remove unnecessary pre-declarations
Explain how '.' can be used to refer to the "current repository"
in the documentation.
* po/dot-url:
doc/cli: make "dot repository" an independent bullet point
config doc: update dot-repository notes
doc: command line interface (cli) dot-repository dwimmery
"git cherry-pick" without further options would segfault.
Could use a follow-up to handle '-' after argv[1] better.
* hu/cherry-pick-previous-branch:
cherry-pick: handle "-" after parsing options
Make "git grep" and "git show" pay attention to --textconv when
dealing with blob objects.
* mg/more-textconv:
grep: honor --textconv for the case rev:path
grep: allow to use textconv filters
t7008: demonstrate behavior of grep with textconv
cat-file: do not die on --textconv without textconv filters
show: honor --textconv for blobs
diff_opt: track whether flags have been set explicitly
t4030: demonstrate behavior of show with textconv
Document rules to use GIT_REFLOG_ACTION variable in the scripted
Porcelain. git-rebase--interactive locally violates them, but it
is a leaf user that does not call out to or dot-source other
scripts, so it does not urgently need to be fixed.
* jc/reflog-doc:
setup_reflog_action: document the rules for using GIT_REFLOG_ACTION
Rewrite "git repack" in C.
* sb/repack-in-c:
repack: improve warnings about failure of renaming and removing files
repack: retain the return value of pack-objects
repack: rewrite the shell script in C
Some progress and diagnostic messages from "git clone" were
incorrectly sent to the standard output stream, not to the standard
error stream.
* jk/clone-progress-to-stderr:
clone: always set transport options
clone: treat "checking connectivity" like other progress
clone: send diagnostic messages to stderr
The option to gpg sign a merge commit is available but was not
documented. Use wording from the git-commit(1) manpage.
Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"format-patch --from=<whom>" forgot to omit unnecessary in-body
from line, i.e. when <whom> is the same as the real author.
* jk/format-patch-from:
format-patch: print in-body "From" only when needed
Clean up the internal of the name-hash mechanism used to work
around case insensitivity on some filesystems to cleanly fix a
long-standing API glitch where the caller of cache_name_exists()
that ask about a directory with a counted string was required to
have '/' at one location past the end of the string.
* es/name-hash-no-trailing-slash-in-dirs:
dir: revert work-around for retired dangerous behavior
name-hash: stop storing trailing '/' on paths in index_state.dir_hash
employ new explicit "exists in index?" API
name-hash: refactor polymorphic index_name_exists()
"git filter-branch" in a repository with many refs blew limit of
command line length.
* lc/filter-branch-too-many-refs:
Allow git-filter-branch to process large repositories with lots of branches.
"git checkout [--detach] <commit>" was listed poorly in the
synopsis section of its documentation.
* jc/checkout-detach-doc:
checkout: update synopsys and documentation on detaching HEAD
* es/rebase-i-no-abbrev:
rebase -i: fix short SHA-1 collision
t3404: rebase -i: demonstrate short SHA-1 collision
t3404: make tests more self-contained
Conflicts:
t/t3404-rebase-interactive.sh
- Don't start tests with 'test $? = 0' to catch preparation done
outside the test_expect_success block.
- Move writing the bogus patch and the expected output into the
appropriate test_expect_success blocks.
- Use the test_must_fail helper instead of manually checking for
non-zero exit code.
- Use the debug-friendly test_path_is_file helper instead of 'test -f'.
- No space after '>'.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git://git.bogomips.org/git-svn:
git-svn: Warn about changing default for --prefix in Git v2.0
Documentation/git-svn: Promote the use of --prefix in docs + examples
git-svn.txt: elaborate on rev_map files
git-svn.txt: replace .git with $GIT_DIR
git-svn.txt: reword description of gc command
git-svn.txt: fix AsciiDoc formatting error
git-svn: fix signed commit parsing