"sparse-checkout" learns to work well with the sparse-index
feature.
* ds/sparse-sparse-checkout:
sparse-checkout: integrate with sparse index
p2000: add test for 'git sparse-checkout [add|set]'
sparse-index: complete partial expansion
sparse-index: partially expand directories
sparse-checkout: --no-sparse-index needs a full index
cache-tree: implement cache_tree_find_path()
sparse-index: introduce partially-sparse indexes
sparse-index: create expand_index()
t1092: stress test 'git sparse-checkout set'
t1092: refactor 'sparse-index contents' test
The multi-pack-index code did not protect the packfile it is going
to depend on from getting removed while in use, which has been
corrected.
* tb/midx-race-in-pack-objects:
builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects
builtin/pack-objects.c: ensure included `--stdin-packs` exist
builtin/pack-objects.c: avoid redundant NULL check
pack-bitmap.c: check preferred pack validity when opening MIDX bitmap
Preliminary code refactoring around transport and bundle code.
* ds/bundle-uri:
bundle.h: make "fd" version of read_bundle_header() public
remote: allow relative_url() to return an absolute url
remote: move relative_url()
http: make http_get_file() external
fetch-pack: move --keep=* option filling to a function
fetch-pack: add a deref_without_lazy_fetch_extended()
dir API: add a generalized path_match_flags() function
connect.c: refactor sending of agent & object-format
Introduce a filesystem-dependent mechanism to optimize the way the
bits for many loose object files are ensured to hit the disk
platter.
* ns/batch-fsync:
core.fsyncmethod: performance tests for batch mode
t/perf: add iteration setup mechanism to perf-lib
core.fsyncmethod: tests for batch mode
test-lib-functions: add parsing helpers for ls-files and ls-tree
core.fsync: use batch mode and sync loose objects by default on Windows
unpack-objects: use the bulk-checkin infrastructure
update-index: use the bulk-checkin infrastructure
builtin/add: add ODB transaction around add_files_to_cache
cache-tree: use ODB transaction around writing a tree
core.fsyncmethod: batched disk flushes for loose-objects
bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
bulk-checkin: rename 'state' variable and separate 'plugged' boolean
Deprecate non-cone mode of the sparse-checkout feature.
* en/sparse-cone-becomes-default:
Documentation: some sparsity wording clarifications
git-sparse-checkout.txt: mark non-cone mode as deprecated
git-sparse-checkout.txt: flesh out pattern set sections a bit
git-sparse-checkout.txt: add a new EXAMPLES section
git-sparse-checkout.txt: shuffle some sections and mark as internal
git-sparse-checkout.txt: update docs for deprecation of 'init'
git-sparse-checkout.txt: wording updates for the cone mode default
sparse-checkout: make --cone the default
tests: stop assuming --no-cone is the default mode for sparse-checkout
Fixes real problems noticed by gcc 12 and works around false
positives.
* js/ci-gcc-12-fixes:
dir.c: avoid "exceeds maximum object size" error with GCC v12.x
nedmalloc: avoid new compile error
compat/win32/syslog: fix use-after-realloc
"git add -i" was rewritten in C some time ago and has been in
testing; the reimplementation is now exposed to general public by
default.
* js/use-builtin-add-i:
add -i: default to the built-in implementation
t2016: require the PERL prereq only when necessary
The tests that ensured merges stop when interfering local changes
are present did not make sure that local changes are preserved; now
they do.
* jc/t6424-failing-merge-preserve-local-changes:
t6424: make sure a failed merge preserves local changes
With the new http.curloptResolve configuration, the CURLOPT_RESOLVE
mechanism that allows cURL based applications to use pre-resolved
IP addresses for the requests is exposed to the scripts.
* cc/http-curlopt-resolve:
http: add custom hostname to IP address resolutions
In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.
Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request). The loop terminating condition mistakenly
thought that the original request has yet to be completed.
Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() clears the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed. It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.
One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns. Clear the finished member before the control leaves the
function, which has a side effect of unconfusing compilers like
recent GCC 12 that is over-eager to warn against such an assignment.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote -v" now shows the list-objects-filter used during
fetching from the remote, if available.
* ac/remote-v-with-object-list-filters:
builtin/remote.c: teach `-v` to list filters for promisor remotes
With a recent update to refuse access to repositories of other
people by default, "sudo make install" and "sudo git describe"
stopped working. This series intends to loosen it while keeping
the safety.
* cb/path-owner-check-with-sudo:
t0034: add negative tests and allow git init to mostly work under sudo
git-compat-util: avoid failing dir ownership checks if running privileged
t: regression git needs safe.directory when using sudo
"git -c branch.autosetupmerge=simple branch $A $B" will set the $B
as $A's upstream only when $A and $B shares the same name, and "git
-c push.default=simple" on branch $A would push to update the
branch $A at the remote $B came from. Also more places use the
sole remote, if exists, before defaulting to 'origin'.
* tk/simple-autosetupmerge:
push: new config option "push.autoSetupRemote" supports "simple" push
push: default to single remote even when not named origin
branch: new autosetupmerge option 'simple' for matching branches
"git fetch --recurse-submodules" from multiple remotes (either from
a remote group, or "--all") used to make one extra "git fetch" in
the submodules, which has been corrected.
* jc/avoid-redundant-submodule-fetch:
fetch: do not run a redundant fetch from submodule
The way "git fetch" without "--update-head-ok" ensures that HEAD in
no worktree points at any ref being updated was too wasteful, which
has been optimized a bit.
* os/fetch-check-not-current-branch:
fetch: limit shared symref check only for local branches
Documentation update.
* pb/ggg-in-mfc-doc:
MyFirstContribution: drop PR description for GGG single-patch contributions
MyFirstContribution: reference "The cover letter" in GitGitGadget section
MyFirstContribution: reference "The cover letter" in "Preparing Email"
MyFirstContribution: add standalone section on cover letter
MyFirstContribution: add "Anatomy of a Patch Series" section
"git fetch" unnecessarily failed when an unexpected optional
section appeared in the output, which has been corrected.
* jt/fetch-peek-optional-section:
fetch-pack: make unexpected peek result non-fatal
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
* jc/show-branch-g-current:
show-branch: -g and --current are incompatible
"make coverage-report" without first running "make coverage" did
not produce any meaningful result, which has been corrected.
* ep/coverage-report-wants-test-to-have-run:
Makefile: add a prerequisite to the coverage-report target
The FreeBSD CI build (on Cirrus-CI) has been failing in
't9001-send-email.sh' for quite some time, with an error from the
runtime linker relating to the Perl installation:
$ GIT_SEND_EMAIL_NOTTY=1 git send-email \
'--from=Example <from@example.com>' '--to=nobody@example.com' \
'--smtp-server=/tmp/cirrus-ci-build/t/trash directory.t9001-send-email/fake.sendmail' \
--compose '--subject=foo' 0001-Second.patch
ld-elf.so.1: /usr/local/lib/perl5/5.32/mach/CORE/libperl.so.5.32: Undefined symbol "strerror_l@FBSD_1.6"
This first instance is in t9001.6 but it fails similarly in several tests
in this file.
The FreeBSD image we use is FreeBSD 12.2, which is unsupported since
March 31st, 2022 [1]. Switching to a supported version, 12.3,
makes this error disappear [2].
Change the image we use to FreeBSD 12.3.
[1] https://www.freebsd.org/security/unsupported/
[2] https://lore.kernel.org/git/9cc31276-ab78-fa8a-9fb4-b19266911211@gmail.com/
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Technically, the pointer difference `end - start` _could_ be negative,
and when cast to an (unsigned) `size_t` that would cause problems. In
this instance, the symptom is:
dir.c: In function 'git_url_basename':
dir.c:3087:13: error: 'memchr' specified bound [9223372036854775808, 0]
exceeds maximum object size 9223372036854775807
[-Werror=stringop-overread]
CC ewah/bitmap.o
3087 | if (memchr(start, '/', end - start) == NULL
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
While it is a bit far-fetched to think that `end` (which is defined as
`repo + strlen(repo)`) and `start` (which starts at `repo` and never
steps beyond the NUL terminator) could result in such a negative
difference, GCC has no way of knowing that.
See also https://gcc.gnu.org/bugzilla//show_bug.cgi?id=85783.
Let's just add a safety check, primarily for GCC's benefit.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GCC v12.x complains thusly:
compat/nedmalloc/nedmalloc.c: In function 'DestroyCaches':
compat/nedmalloc/nedmalloc.c:326:12: error: the comparison will always
evaluate as 'true' for the address of 'caches'
will never be NULL [-Werror=address]
326 | if(p->caches)
| ^
compat/nedmalloc/nedmalloc.c:196:22: note: 'caches' declared here
196 | threadcache *caches[THREADCACHEMAXCACHES];
| ^~~~~~
... and it is correct, of course.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows' SDK recently upgraded to GCC v12.x which points out
that the `pos` variable might be used even after the corresponding
memory was `realloc()`ed and therefore potentially no longer valid.
Since a subset of this SDK is used in Git's CI/PR builds, we need to fix
this to continue to be able to benefit from the CI/PR runs.
Note: This bug has been with us since 2a6b149c64 (mingw: avoid using
strbuf in syslog, 2011-10-06), and while it looks tempting to replace
the hand-rolled string manipulation with a `strbuf`-based one, that
commit's message explains why we cannot do that: The `syslog()` function
is called as part of the function in `daemon.c` which is set as the
`die()` routine, and since `strbuf_grow()` can call that function if it
runs out of memory, this would cause a nasty infinite loop that we do
not want to re-introduce.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using a multi-pack bitmap, pack-objects will try to perform its
traversal using a call to `traverse_bitmap_commit_list()`, which calls
`add_object_entry_from_bitmap()` to add each object it finds to its
packing list.
This path can cause pack-objects to add objects from packs that don't
have open pack_fds on them, by avoiding a call to `is_pack_valid()`.
This is because we only call `is_pack_valid()` on the preferred pack (in
order to do verbatim reuse via `reuse_partial_packfile_from_bitmap()`)
and not others when loading a MIDX bitmap.
In this case, `add_object_entry_from_bitmap()` will check whether it
wants each object entry by calling `want_object_in_pack()`, which will
call `want_found_object` (since its caller already supplied a
`found_pack`). In most cases (particularly without `--local`, and when
`ignored_packed_keep_on_disk` and `ignored_packed_keep_in_core` are
both "0"), we'll take the entry from the pack contained in the MIDX
bitmap, all without an open pack_fd.
When we then try to use that entry later to assemble the actual pack,
we'll be susceptible to any simultaneous writers moving that pack out of
the way (e.g., due to a concurrent repack) without having an open file
descriptor, causing races that result in errors like:
remote: Enumerating objects: 1498802, done.
remote: fatal: packfile ./objects/pack/pack-e57d433b5a588daa37fbe946e2b28dfaec03a93e.pack cannot be accessed
remote: aborting due to possible repository corruption on the remote side.
This race can happen even with multi-pack bitmaps, since we may open a
MIDX bitmap that is being rewritten long before its packs are actually
unlinked.
Work around this by calling `is_pack_valid()` from within
`want_found_object()`, matching the behavior in
`want_object_in_pack_one()` (which has an analogous call). Most calls to
`is_pack_valid()` should be basically no-ops, since only the first call
requires us to open a file (subsequent calls realize the file is already
open, and return immediately).
Importantly, when `want_object_in_pack()` is given a non-NULL
`*found_pack`, but `want_found_object()` rejects the copy of the object
in that pack, we must reset `*found_pack` and `*found_offset` to NULL
and 0, respectively. Failing to do so could lead to other checks in
`want_object_in_pack()` (such as `want_object_in_pack_one()`) using the
same (invalid) pack as `*found_pack`, meaning that we don't call
`is_pack_valid()` because `p == *found_pack`. This can lead the caller
to believe it can use a copy of an object from an invalid pack.
An alternative approach to closing this race would have been to call
`is_pack_valid()` on _all_ packs in a multi-pack bitmap on load. This
has a couple of problems:
- it is unnecessarily expensive in the cases where we don't actually
need to open any packs (e.g., in `git rev-list --use-bitmap-index
--count`)
- more importantly, it means any time we would have hit this race,
we'll avoid using bitmaps altogether, leading to significant
slowdowns by forcing a full object traversal
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent patch will teach `want_object_in_pack()` to set its
`*found_pack` and `*found_offset` poitners to NULL when the provided
pack does not pass the `is_pack_valid()` check.
The `--stdin-packs` mode of `pack-objects` is not quite prepared to
handle this. To prepare it for this change, do the following two things:
- Ensure provided packs pass the `is_pack_valid()` check when
collecting the caller-provided packs into the "included" and
"excluded" lists.
- Gracefully handle any _invalid_ packs being passed to
`want_object_in_pack()`.
Calling `is_pack_valid()` early on makes it substantially less likely
that we will have to deal with a pack going away, since we'll have an
open file descriptor on its contents much earlier.
But even packs with open descriptors can become invalid in the future if
we (a) hit our open descriptor limit, forcing us to close some open
packs, and (b) one of those just-closed packs has gone away in the
meantime.
`add_object_entry_from_pack()` depends on having a non-NULL
`*found_pack`, since it passes that pointer to `packed_object_info()`,
meaning that we would SEGV if the pointer became NULL (like we propose
to do in `want_object_in_pack()` in the following patch).
But avoiding calling `packed_object_info()` entirely is OK, too, since
its only purpose is to identify which objects in the included packs are
commits, so that they can form the tips of the advisory traversal used
to discover the object namehashes.
Failing to do this means that at worst we will produce lower-quality
deltas, but it does not prevent us from generating the pack as long as
we can find a copy of each object from the disappearing pack in some
other part of the repository.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before calling `for_each_object_in_pack()`, the caller
`read_packs_list_from_stdin()` loops through each of the `include_packs`
and checks that its `->util` pointer (which is used to store the `struct
packed_git *` itself) is non-NULL.
This check is redundant, because `read_packs_list_from_stdin()` already
checks that the included packs are non-NULL earlier on in the same
function (and it does not add any new entries in between).
Remove this check, since it is not doing anything in the meantime.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When pack-objects adds an entry to its packing list, it marks the
packfile and offset containing the object, which we may later use during
verbatim reuse (c.f., `write_reused_pack_verbatim()`).
If the packfile in question is deleted in the background (e.g., due to a
concurrent `git repack`), we'll die() as a result of calling use_pack(),
unless we have an open file descriptor on the pack itself. 4c08018204
(pack-objects: protect against disappearing packs, 2011-10-14) worked
around this by opening the pack ahead of time before recording it as a
valid source for reuse.
4c08018204's treatment meant that we could tolerate disappearing packs,
since it ensures we always have an open file descriptor on any pack that
we mark as a valid source for reuse. This tightens the race to only
happen when we need to close an open pack's file descriptor (c.f., the
caller of `packfile.c::get_max_fd_limit()`) _and_ that pack was deleted,
in which case we'll complain that a pack could not be accessed and
die().
The pack bitmap code does this, too, since prior to dc1daacdcc
(pack-bitmap: check pack validity when opening bitmap, 2021-07-23) it
was vulnerable to the same race.
The MIDX bitmap code does not do this, and is vulnerable to the same
race. Apply the same treatment as dc1daacdcc to the routine responsible
for opening the multi-pack bitmap's preferred pack to close this race.
This patch handles the "preferred" pack (c.f., the section
"multi-pack-index reverse indexes" in
Documentation/technical/pack-format.txt) specially, since pack-objects
depends on reusing exact chunks of that pack verbatim in
reuse_partial_packfile_from_bitmap(). So if that pack cannot be loaded,
the utility of a bitmap is significantly diminished.
Similar to dc1daacdcc, we could technically just add this check in
reuse_partial_packfile_from_bitmap(), since it's possible to use a MIDX
.bitmap without needing to open any of its packs. But it's simpler to do
the check as early as possible, covering all direct uses of the
preferred pack. Note that doing this check early requires us to call
prepare_midx_pack() early, too, so move the relevant part of that loop
from load_reverse_index() into open_midx_bitmap_1().
Subsequent patches handle the non-preferred packs in a slightly
different fashion.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git archive --add-file=<path>" picked up the raw permission bits
from the path and propagated to zip output in some cases, without
normalization, which has been corrected (tar output did not have
this issue).
* jc/archive-add-file-normalize-mode:
archive: do not let on-disk mode leak to zip archives
A bit of test framework fixes with a few fixes to issues found by
valgrind.
* ab/valgrind-fixes:
commit-graph.c: don't assume that stat() succeeds
object-file: fix a unpack_loose_header() regression in 3b6a8db3b0
log test: skip a failing mkstemp() test under valgrind
tests: using custom GIT_EXEC_PATH breaks --valgrind tests
When modifying the sparse-checkout definition, the sparse-checkout
builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all
cache entries in the index. Before, we needed the index to be fully
expanded in order to ensure we had the full list of files necessary that
match the new patterns.
Insert a call to reset_sparse_directories() that expands sparse
directories that are within the new pattern list, but only far enough
that every necessary file path now exists as a cache entry. The
remaining logic within update_sparsity() will modify the SKIP_WORKTREE
bits appropriately.
This allows us to disable command_requires_full_index within the
sparse-checkout builtin. Add tests that demonstrate that we are not
expanding to a full index unnecessarily.
We can see the improved performance in the p2000 test script:
Test HEAD~1 HEAD
------------------------------------------------------------------------
2000.24: git ... (sparse-v3) 2.14(1.55+0.58) 1.57(1.03+0.53) -26.6%
2000.25: git ... (sparse-v4) 2.20(1.62+0.57) 1.58(0.98+0.59) -28.2%
These reductions of 26-28% are small compared to most examples, but the
time is dominated by writing a new copy of the base repository to the
worktree and then deleting it again. The fact that the previous index
expansion was such a large portion of the time is telling how important
it is to complete this sparse index integration.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout builtin is almost completely integrated with the
sparse index, allowing the sparse-checkout boundary to be modified
without expanding a sparse index to a full one. Add a test to
p2000-sparse-operations.sh that adds a directory to the sparse-checkout
definition, then removes it. Using both operations is important to
ensure that the operation is doing the same work in each repetition as
well as leaving the test repo in a good state for later tests.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To complete the implementation of expand_to_pattern_list(), we need to
detect when a sparse directory entry should remain sparse. This avoids a
full expansion, so we now need to use the PARTIALLY_SPARSE mode to
indicate this state.
There still are no callers to this method, but we will add one in the
next change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The expand_to_pattern_list() method expands sparse directory entries
to their list of contained files when either the pattern list is NULL or
the directory is contained in the new pattern list's cone mode patterns.
It is possible that the pattern list has a recursive match with a
directory 'A/B/C/' and so an existing sparse directory 'A/B/' would need
to be expanded. If there exists a directory 'A/B/D/', then that
directory should not be expanded and instead we can create a sparse
directory.
To implement this, we plug into the add_path_to_index() callback for the
call to read_tree_at(). Since we now need access to both the index we
are writing and the pattern list we are comparing, create a 'struct
modify_index_context' to use as a data transfer object. It is important
that we use the given pattern list since we will use this pattern list
to change the sparse-checkout patterns and cannot use
istate->sparse_checkout_patterns.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the --no-sparse-index option is supplied, the sparse-checkout
builtin should explicitly ask to expand a sparse index to a full one.
This is currently done implicitly due to the command_requires_full_index
protection, but that will be removed in an upcoming change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Given a 'struct cache_tree', it may be beneficial to navigate directly
to a node within that corresponds to a given path name. Create
cache_tree_find_path() for this function. It returns NULL when no such
path exists.
The implementation is adapted from do_invalidate_path() which does a
similar search but also modifies the nodes it finds along the way. The
method could be implemented simply using tail-recursion, but this while
loop does the same thing.
This new method is not currently used, but will be in an upcoming
change.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future change will present a temporary, in-memory mode where the index
can both contain sparse directory entries but also not be completely
collapsed to the smallest possible sparse directories. This will be
necessary for modifying the sparse-checkout definition while using a
sparse index.
For now, convert the single-bit member 'sparse_index' in 'struct
index_state' to be a an 'enum sparse_index_mode' with three modes:
* INDEX_EXPANDED (0): No sparse directories exist. This is always the
case for repositories that do not use cone-mode sparse-checkout.
* INDEX_COLLAPSED: Sparse directories may exist. Files outside the
sparse-checkout cone are reduced to sparse directory entries whenever
possible.
* INDEX_PARTIALLY_SPARSE: Sparse directories may exist. Some file
entries outside the sparse-checkout cone may exist. Running
convert_to_sparse() may further reduce those files to sparse directory
entries.
The main reason to store this extra information is to allow
convert_to_sparse() to short-circuit when the index is already in
INDEX_EXPANDED mode but to actually do the necessary work when in
INDEX_PARTIALLY_SPARSE mode.
The INDEX_PARTIALLY_SPARSE mode will be used in an upcoming change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the first change in a series to allow modifying the
sparse-checkout pattern set without expanding a sparse index to a full
one in the process. Here, we focus on the problem of expanding the
pattern set through a command like 'git sparse-checkout add <path>'
which needs to create new index entries for the paths now being written
to the worktree.
To achieve this, we need to be able to replace sparse directory entries
with their contained files and subdirectories. Once this is complete,
other code paths can discover those cache entries and write the
corresponding files to disk before committing the index.
We already have logic in ensure_full_index() that expands the index
entries, so we will use that as our base. Create a new method,
expand_index(), which takes a pattern list, but for now mostly ignores
it. The current implementation is only correct when the pattern list is
NULL as that does the same as ensure_full_index(). In fact,
ensure_full_index() is converted to a shim over expand_index().
A future update will actually implement expand_index() to its full
capabilities. For now, it is created and documented.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'sparse-index contents' test checks that the sparse index has the
correct set of sparse directories in the index after modifying the cone
mode patterns using 'git sparse-checkout set'. Add to the coverage here
by adding more complicated scenarios that were not previously tested.
In order to check paths that do not exist at HEAD, we need to modify the
test_sparse_checkout_set helper slightly:
1. Add the --skip-checks argument to the 'set' command to avoid failures
when passing paths that do not exist at HEAD.
2. When looking for the non-existence of sparse directories for the
paths in $CONE_DIRS, allow the rev-list command to fail because the
path does not exist at HEAD.
This allows us to add some interesting test cases.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before expanding this test with more involved cases, first extract the
repeated logic into a new test_sparse_checkout_set helper. This helper
checks that 'git sparse-checkout set ...' succeeds and then verifies
that certain directories have sparse directory entries in the sparse
index. It also verifies that the in-cone directories are _not_ sparse
directory entries in the sparse index.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>