Due to how Windows handles replacing a lockfile when there is an open
handle, create the close_midx() method to close the existing midx before
writing the new one.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core.multiPackIndex config setting controls the multi-pack-
index (MIDX) feature. If false, the setting will disable all reads
from the multi-pack-index file.
Read this config setting in the new prepare_multi_pack_index_one()
which is called during prepare_packed_git(). This check is run once
per repository.
Add comparison commands in t5319-multi-pack-index.sh to check
typical Git behavior remains the same as the config setting is turned
on and off. This currently includes 'git rev-list' and 'git log'
commands to trigger several object database reads. Currently, these
would only catch an error in the prepare_multi_pack_index_one(), but
with later commits will catch errors in object lookups, abbreviations,
and approximate object counts.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The final pair of chunks for the multi-pack-index file stores the object
offsets. We default to using 32-bit offsets as in the pack-index version
1 format, but if there exists an offset larger than 32-bits, we use a
trick similar to the pack-index version 2 format by storing all offsets
at least 2^31 in a 64-bit table; we use the 32-bit table to point into
that 64-bit table as necessary.
We only store these 64-bit offsets if necessary, so create a test that
manipulates a version 2 pack-index to fake a large offset. This allows
us to test that the large offset table is created, but the data does not
match the actual packfile offsets. The multi-pack-index offset does match
the (corrupted) pack-index offset, so a future feature will compare these
offsets during a 'verify' step.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before writing a list of objects and their offsets to a multi-pack-index,
we need to collect the list of objects contained in the packfiles. There
may be multiple copies of some objects, so this list must be deduplicated.
It is possible to artificially get into a state where there are many
duplicate copies of objects. That can create high memory pressure if we
are to create a list of all objects before de-duplication. To reduce
this memory pressure without a significant performance drop,
automatically group objects by the first byte of their object id. Use
the IDX fanout tables to group the data, copy to a local array, then
sort.
Copy only the de-duplicated entries. Select the duplicate based on the
most-recent modified time of a packfile containing the object.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index needs to track which packfiles it indexes. Store
these in our first required chunk. Since filenames are not well
structured, add padding to keep good alignment in later chunks.
Modify the 'git multi-pack-index read' subcommand to output the
existence of the pack-file name chunk. Modify t5319-multi-pack-index.sh
to reflect this new output and the new expected number of chunks.
Defense in depth: A pattern we are using in the multi-pack-index feature
is to verify the data as we write it. We want to ensure we never write
invalid data to the multi-pack-index. There are many checks that verify
that the values we are writing fit the format definitions. This mainly
helps developers while working on the feature, but it can also identify
issues that only appear when dealing with very large data sets. These
large sets are hard to encode into test cases.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When constructing a multi-pack-index file for a given object directory,
read the files within the enclosed pack directory and find matches that
end with ".idx" and find the correct paired packfile using
add_packed_git().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of sharing the pack directory listing with the
multi-pack-index, generalize prepare_packed_git_one() into
for_each_file_in_pack_dir().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we build the multi-pack-index file format, we want to test the format
on real repositories. Add tests that create repository data including
multiple packfiles with both version 1 and version 2 formats.
The current 'git multi-pack-index write' command will always write the
same file with no "real" data. This will be expanded in future commits,
along with the test expectations.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new multi_pack_index struct for loading multi-pack-indexes into
memory. Create a test-tool builtin for reading basic information about
that multi-pack-index to verify the correct data is written.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we begin writing the multi-pack-index format to disk, start with
the basics: the 12-byte header and the 20-byte checksum footer. Start
with these basics so we can add the rest of the format in small
increments.
As we implement the format, we will use a technique to check that our
computed offsets within the multi-pack-index file match what we are
actually writing. Each method that writes to the hashfile will return
the number of bytes written, and we will track that those values match
our expectations.
Currently, write_midx_header() returns 12, but is not checked. We will
check the return value in a later commit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In anticipation of writing multi-pack-indexes, add a skeleton
'git multi-pack-index write' subcommand and send the options to a
write_midx_file() method. Also create a skeleton test script that
tests the 'write' subcommand.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new 'git multi-pack-index' builtin will be the plumbing access
for writing, reading, and checking multi-pack-index files. The
initial implementation is a no-op.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The multi-pack-index feature generalizes the existing pack-index
feature by indexing objects across multiple pack-files.
Describe the basic file format, using a 12-byte header followed by
a lookup table for a list of "chunks" which will be described later.
The file ends with a footer containing a checksum using the hash
algorithm.
The header allows later versions to create breaking changes by
advancing the version number. We can also change the hash algorithm
using a different version value.
We will add the individual chunk format information as we introduce
the code that writes that information.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running t7400 in a shell you observe more output than expected:
...
ok 8 - setup - hide init subdirectory
ok 9 - setup - repository to add submodules to
ok 10 - submodule add
[master (root-commit) d79ce16] one
Author: A U Thor <author@example.com>
1 file changed, 1 insertion(+)
create mode 100644 one.t
ok 11 - redirected submodule add does not show progress
ok 12 - redirected submodule add --progress does show progress
ok 13 - submodule add to .gitignored path fails
...
Fix the output by encapsulating the setup code in test_expect_success
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When testing a reworded root commit, ensure that the squash-onto commit
which is created and amended is still the root commit.
Suggested-by: Phillip Wood <phillip.wood@talktalk.net>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"make NO_ICONV=NoThanks" did not override NEEDS_LIBICONV
(i.e. linkage of -lintl, -liconv, etc. that are platform-specific
tweaks), which has been corrected.
* es/make-no-iconv:
Makefile: make NO_ICONV really mean "no iconv"
A regression to "rebase -i --root" introduced during this cycle has
been fixed.
* js/rebase-i-root-fix:
rebase --root: fix amending root commit messages
rebase --root: demonstrate a bug while amending root commit messages
The code to read compressed bitmap was not careful to avoid reading
past the end of the file, which has been corrected.
* jk/ewah-bounds-check:
ewah: adjust callers of ewah_read_mmap()
ewah_read_mmap: bounds-check mmap reads
Update the Korean translation and change the team leader to Gwan-gyeong
Mun.
Signed-off-by: Gwan-gyeong Mun <elongbug@gmail.com>
Signed-off-by: Changwoo Ryu <cwryu@debian.org>
Reviewed-by: Gwan-gyeong Mun <elongbug@gmail.com>
Newly added codepath in merge-recursive had potential buffer
overrun, which has been fixed.
* en/rename-directory-detection:
merge-recursive: use xstrdup() instead of fixed buffer
Make zlib inflate codepath more robust against versions of zlib
that clobber unused portion of outbuf.
* jl/zlib-restore-nul-termination:
packfile: correct zlib buffer handling
"git p4" updates.
* ld/git-p4-updates:
git-p4: auto-size the block
git-p4: narrow the scope of exceptions caught when parsing an int
git-p4: raise exceptions from p4CmdList based on error from p4 server
git-p4: better error reporting when p4 fails
git-p4: add option to disable syncing of p4/master with p4
git-p4: disable-rebase: allow setting this via configuration
git-p4: add options --commit and --disable-rebase
Paths can be longer than PATH_MAX. Avoid a buffer overrun in
check_dir_renamed() by using xstrdup() to make a private copy safely.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was not "newer versions of bash" but newer versions of
bash-completion that made commit 085e2ee0e6 (completion: load
completion file for external subcommand, 2018-04-29) both necessary
and possible.
Update the corresponding RelNotes entry accordingly.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Three tests in 't7406-submodule-update' contain broken &&-chains, but
since they are all in subshells, chain-lint couldn't notice them.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code path that triggered that "BUG" really does not want to run
without an explicit commit message. In the case where we want to amend a
commit message, we have an *implicit* commit message, though: the one of
the commit to amend. Therefore, this code path should not even be
entered.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>