sparse-checkout: add 'cone' mode

The sparse-checkout feature can have quadratic performance as
the number of patterns and number of entries in the index grow.
If there are 1,000 patterns and 1,000,000 entries, this time can
be very significant.

Create a new Boolean config option, core.sparseCheckoutCone, to
indicate that we expect the sparse-checkout file to contain a
more limited set of patterns. This is a separate config setting
from core.sparseCheckout to avoid breaking older clients by
introducing a tri-state option.

The config option does nothing right now, but will be expanded
upon in a later commit.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee 2019-11-21 22:04:40 +00:00 committed by Junio C Hamano
parent e6152e35ff
commit 879321eb0b
6 changed files with 85 additions and 4 deletions

@@ -593,8 +593,14 @@ core.multiPackIndex::
 	multi-pack-index design document].

 core.sparseCheckout::
-	Enable "sparse checkout" feature. See section "Sparse checkout" in
-	linkgit:git-read-tree[1] for more information.
+	Enable "sparse checkout" feature. See linkgit:git-sparse-checkout[1]
+	for more information.
+
+core.sparseCheckoutCone::
+	Enables the "cone mode" of the sparse checkout feature. When the
+	sparse-checkout file contains a limited set of patterns, then this
+	mode provides significant performance advantages. See
+	linkgit:git-sparse-checkout[1] for more information.

 core.abbrev::
 	Set the length object names are abbreviated to. If

@@ -80,7 +80,9 @@ the sparse-checkout file.
 To repopulate the working directory with all files, use the
 `git sparse-checkout disable` command.

-## FULL PATTERN SET
+
+FULL PATTERN SET
+----------------

 By default, the sparse-checkout file uses the same syntax as `.gitignore`
 files.
@@ -95,6 +97,57 @@ using negative patterns. For example, to remove the file `unwanted`:
 ----------------

+CONE PATTERN SET
+----------------
+
+The full pattern set allows for arbitrary pattern matches and complicated
+inclusion/exclusion rules. These can result in O(N*M) pattern matches when
+updating the index, where N is the number of patterns and M is the number
+of paths in the index. To combat this performance issue, a more restricted
+pattern set is allowed when `core.sparseCheckoutCone` is enabled.
+
+The accepted patterns in the cone pattern set are:
+
+1. *Recursive:* All paths inside a directory are included.
+
+2. *Parent:* All files immediately inside a directory are included.
+
+In addition to the above two patterns, we also expect that all files in the
+root directory are included. If a recursive pattern is added, then all
+leading directories are added as parent patterns.
+
+By default, when running `git sparse-checkout init`, the root directory is
+added as a parent pattern. At this point, the sparse-checkout file contains
+the following patterns:
+
+----------------
+/*
+!/*/
+----------------
+
+This says "include everything in root, but nothing two levels below root."
+If we then add the folder `A/B/C` as a recursive pattern, the folders `A` and
+`A/B` are added as parent patterns. The resulting sparse-checkout file is
+now
+
+----------------
+/*
+!/*/
+/A/
+!/A/*/
+/A/B/
+!/A/B/*/
+/A/B/C/
+----------------
+
+Here, order matters, so the negative patterns are overridden by the positive
+patterns that appear lower in the file.
+
+If `core.sparseCheckoutCone=true`, then Git will parse the sparse-checkout file
+expecting patterns of these types. Git will warn if the patterns do not match.
+If the patterns do match the expected format, then Git will use faster
+hash-based algorithms to compute inclusion in the sparse-checkout.
+
 SEE ALSO
 --------

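The expansion rule described in the CONE PATTERN SET text above (root parent pattern, one parent pattern per leading directory, then the recursive pattern) can be sketched in a few lines of POSIX shell. This is only an illustration under stated assumptions: `cone_patterns` and `cone_includes` are hypothetical helper names, not Git commands or functions, the sketch handles a single recursive directory, and it uses plain prefix comparison rather than the hash-based lookup the documentation mentions.

```shell
#!/bin/sh
# Illustrative helpers only; these are NOT part of Git.

# Print the sparse-checkout patterns that result from adding one
# directory (for example A/B/C) as a recursive pattern.
cone_patterns () {
	dir=${1#/}
	dir=${dir%/}

	# Root parent pattern: files in the root, but no subdirectories.
	printf '%s\n' '/*' '!/*/'
	[ -n "$dir" ] || return 0

	prefix=
	rest=$dir
	# Each leading directory becomes a parent pattern: include the
	# directory, then exclude the directories directly below it.
	while [ "${rest#*/}" != "$rest" ]
	do
		prefix="$prefix/${rest%%/*}"
		rest=${rest#*/}
		printf '%s/\n' "$prefix"
		printf '!%s/*/\n' "$prefix"
	done

	# The requested directory itself is included recursively.
	printf '%s/%s/\n' "$prefix" "$rest"
}

# Succeed if path $1 is selected when directory $2 is the only
# recursive pattern (plain prefix comparison, no hashing).
cone_includes () {
	path=${1#/}
	dir=${2#/}
	dir=${dir%/}

	# Recursive match: anything below the recursive directory.
	case "$path" in
	"$dir"/*) return 0 ;;
	esac

	# Files directly in the root are always included.
	case "$path" in
	*/*) ;;
	*) return 0 ;;
	esac

	# Parent match: the file sits directly inside a leading directory.
	parent=${path%/*}
	case "$dir/" in
	"$parent"/*) return 0 ;;
	esac
	return 1
}

cone_patterns A/B/C
```

Run as written, the final call prints the same seven patterns as the second example file in the documentation hunk above. `cone_includes A/x.txt A/B/C` succeeds because `A` is a leading directory of `A/B/C`, while `cone_includes A/D/z.txt A/B/C` fails because `A/D` is neither a leading directory nor inside the recursive one.
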
@@ -918,12 +918,14 @@ extern char *git_replace_ref_base;

 extern int fsync_object_files;
 extern int core_preload_index;
-extern int core_apply_sparse_checkout;
 extern int precomposed_unicode;
 extern int protect_hfs;
 extern int protect_ntfs;
 extern const char *core_fsmonitor;

+int core_apply_sparse_checkout;
+int core_sparse_checkout_cone;
+
 /*
  * Include broken refs in all ref iterations, which will
  * generally choke dangerous operations rather than letting

@@ -1364,6 +1364,11 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 		return 0;
 	}

+	if (!strcmp(var, "core.sparsecheckoutcone")) {
+		core_sparse_checkout_cone = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (!strcmp(var, "core.precomposeunicode")) {
 		precomposed_unicode = git_config_bool(var, value);
 		return 0;

@@ -67,6 +67,7 @@ enum object_creation_mode object_creation_mode = OBJECT_CREATION_MODE;
 char *notes_ref_name;
 int grafts_replace_parents = 1;
 int core_apply_sparse_checkout;
+int core_sparse_checkout_cone;
 int merge_log_config = -1;
 int precomposed_unicode = -1; /* see probe_utf8_pathname_composition() */
 unsigned long pack_size_limit_cfg;

@@ -148,6 +148,20 @@ test_expect_success 'set sparse-checkout using --stdin' '
 	test_cmp expect dir
 '

+test_expect_success 'cone mode: match patterns' '
+	git -C repo config --worktree core.sparseCheckoutCone true &&
+	rm -rf repo/a repo/folder1 repo/folder2 &&
+	git -C repo read-tree -mu HEAD &&
+	git -C repo reset --hard &&
+	ls repo >dir &&
+	cat >expect <<-EOF &&
+		a
+		folder1
+		folder2
+	EOF
+	test_cmp expect dir
+'
+
 test_expect_success 'sparse-checkout disable' '
 	git -C repo sparse-checkout disable &&
 	test_path_is_missing repo/.git/info/sparse-checkout &&