5d4b293340
Improve the wording for a couple paragraphs in two different manuals relating to sparse behavior. Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
461 lines
20 KiB
Plaintext
461 lines
20 KiB
Plaintext
git-sparse-checkout(1)
|
|
======================
|
|
|
|
NAME
|
|
----
|
|
git-sparse-checkout - Reduce your working tree to a subset of tracked files
|
|
|
|
|
|
SYNOPSIS
|
|
--------
|
|
[verse]
|
|
'git sparse-checkout <subcommand> [<options>]'
|
|
|
|
|
|
DESCRIPTION
|
|
-----------
|
|
|
|
This command is used to create sparse checkouts, which change the
|
|
working tree from having all tracked files present to only having a
|
|
subset of those files. It can also switch which subset of files are
|
|
present, or undo and go back to having all tracked files present in
|
|
the working copy.
|
|
|
|
The subset of files is chosen by providing a list of directories in
|
|
cone mode (the default), or by providing a list of patterns in
|
|
non-cone mode.
|
|
|
|
When in a sparse-checkout, other Git commands behave a bit differently.
|
|
For example, switching branches will not update paths outside the
|
|
sparse-checkout directories/patterns, and `git commit -a` will not record
|
|
paths outside the sparse-checkout directories/patterns as deleted.
|
|
|
|
THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
|
|
COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
|
|
THE FUTURE.
|
|
|
|
|
|
COMMANDS
|
|
--------
|
|
'list'::
|
|
Describe the directories or patterns in the sparse-checkout file.
|
|
|
|
'set'::
|
|
Enable the necessary sparse-checkout config settings
|
|
(`core.sparseCheckout`, `core.sparseCheckoutCone`, and
|
|
`index.sparse`) if they are not already set to the desired values,
|
|
populate the sparse-checkout file from the list of arguments
|
|
following the 'set' subcommand, and update the working directory to
|
|
match.
|
|
+
|
|
To ensure that adjusting the sparse-checkout settings within a worktree
|
|
does not alter the sparse-checkout settings in other worktrees, the 'set'
|
|
subcommand will upgrade your repository config to use worktree-specific
|
|
config if not already present. The sparsity defined by the arguments to
|
|
the 'set' subcommand are stored in the worktree-specific sparse-checkout
|
|
file. See linkgit:git-worktree[1] and the documentation of
|
|
`extensions.worktreeConfig` in linkgit:git-config[1] for more details.
|
|
+
|
|
When the `--stdin` option is provided, the directories or patterns are
|
|
read from standard in as a newline-delimited list instead of from the
|
|
arguments.
|
|
+
|
|
By default, the input list is considered a list of directories, matching
|
|
the output of `git ls-tree -d --name-only`. This includes interpreting
|
|
pathnames that begin with a double quote (") as C-style quoted strings.
|
|
Note that all files under the specified directories (at any depth) will
|
|
be included in the sparse checkout, as well as files that are siblings
|
|
of either the given directory or any of its ancestors (see 'CONE PATTERN
|
|
SET' below for more details). In the past, this was not the default,
|
|
and `--cone` needed to be specified or `core.sparseCheckoutCone` needed
|
|
to be enabled.
|
|
+
|
|
When `--no-cone` is passed, the input list is considered a list of
|
|
patterns. This mode has a number of drawbacks, including not working
|
|
with some options like `--sparse-index`. As explained in the
|
|
"Non-cone Problems" section below, we do not recommend using it.
|
|
+
|
|
Use the `--[no-]sparse-index` option to use a sparse index (the
|
|
default is to not use it). A sparse index reduces the size of the
|
|
index to be more closely aligned with your sparse-checkout
|
|
definition. This can have significant performance advantages for
|
|
commands such as `git status` or `git add`. This feature is still
|
|
experimental. Some commands might be slower with a sparse index until
|
|
they are properly integrated with the feature.
|
|
+
|
|
**WARNING:** Using a sparse index requires modifying the index in a way
|
|
that is not completely understood by external tools. If you have trouble
|
|
with this compatibility, then run `git sparse-checkout init --no-sparse-index`
|
|
to rewrite your index to not be sparse. Older versions of Git will not
|
|
understand the sparse directory entries index extension and may fail to
|
|
interact with your repository until it is disabled.
|
|
|
|
'add'::
|
|
Update the sparse-checkout file to include additional directories
|
|
(in cone mode) or patterns (in non-cone mode). By default, these
|
|
directories or patterns are read from the command-line arguments,
|
|
but they can be read from stdin using the `--stdin` option.
|
|
|
|
'reapply'::
|
|
Reapply the sparsity pattern rules to paths in the working tree.
|
|
Commands like merge or rebase can materialize paths to do their
|
|
work (e.g. in order to show you a conflict), and other
|
|
sparse-checkout commands might fail to sparsify an individual file
|
|
(e.g. because it has unstaged changes or conflicts). In such
|
|
cases, it can make sense to run `git sparse-checkout reapply` later
|
|
after cleaning up affected paths (e.g. resolving conflicts, undoing
|
|
or committing changes, etc.).
|
|
+
|
|
The `reapply` command can also take `--[no-]cone` and `--[no-]sparse-index`
|
|
flags, with the same meaning as the flags from the `set` command, in order
|
|
to change which sparsity mode you are using without needing to also respecify
|
|
all sparsity paths.
|
|
|
|
'disable'::
|
|
Disable the `core.sparseCheckout` config setting, and restore the
|
|
working directory to include all files.
|
|
|
|
'init'::
|
|
Deprecated command that behaves like `set` with no specified paths.
|
|
May be removed in the future.
|
|
+
|
|
Historically, `set` did not handle all the necessary config settings,
|
|
which meant that both `init` and `set` had to be called. Invoking
|
|
both meant the `init` step would first remove nearly all tracked files
|
|
(and in cone mode, ignored files too), then the `set` step would add
|
|
many of the tracked files (but not ignored files) back. In addition
|
|
to the lost files, the performance and UI of this combination was
|
|
poor.
|
|
+
|
|
Also, historically, `init` would not actually initialize the
|
|
sparse-checkout file if it already existed. This meant it was
|
|
possible to return to a sparse-checkout without remembering which
|
|
paths to pass to a subsequent 'set' or 'add' command. However,
|
|
`--cone` and `--sparse-index` options would not be remembered across
|
|
the disable command, so the easy restore of calling a plain `init`
|
|
decreased in utility.
|
|
|
|
EXAMPLES
|
|
--------
|
|
`git sparse-checkout set MY/DIR1 SUB/DIR2`::
|
|
|
|
Change to a sparse checkout with all files (at any depth) under
|
|
MY/DIR1/ and SUB/DIR2/ present in the working copy (plus all
|
|
files immediately under MY/ and SUB/ and the toplevel
|
|
directory). If already in a sparse checkout, change which files
|
|
are present in the working copy to this new selection. Note
|
|
that this command will also delete all ignored files in any
|
|
directory that no longer has either tracked or
|
|
non-ignored-untracked files present.
|
|
|
|
`git sparse-checkout disable`::
|
|
|
|
Repopulate the working directory with all files, disabling sparse
|
|
checkouts.
|
|
|
|
`git sparse-checkout add SOME/DIR/ECTORY`::
|
|
|
|
Add all files under SOME/DIR/ECTORY/ (at any depth) to the
|
|
sparse checkout, as well as all files immediately under
|
|
SOME/DIR/ and immediately under SOME/. Must already be in a
|
|
sparse checkout before using this command.
|
|
|
|
`git sparse-checkout reapply`::
|
|
|
|
It is possible for commands to update the working tree in a
|
|
way that does not respect the selected sparsity directories.
|
|
This can come from tools external to Git writing files, or
|
|
even affect Git commands because of either special cases (such
|
|
as hitting conflicts when merging/rebasing), or because some
|
|
commands didn't fully support sparse checkouts (e.g. the old
|
|
`recursive` merge backend had only limited support). This
|
|
command reapplies the existing sparse directory specifications
|
|
to make the working directory match.
|
|
|
|
INTERNALS -- SPARSE CHECKOUT
|
|
----------------------------
|
|
|
|
"Sparse checkout" allows populating the working directory sparsely. It
|
|
uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell Git
|
|
whether a file in the working directory is worth looking at. If the
|
|
skip-worktree bit is set, and the file is not present in the working tree,
|
|
then its absence is ignored. Git will avoid populating the contents of
|
|
those files, which makes a sparse checkout helpful when working in a
|
|
repository with many files, but only a few are important to the current
|
|
user.
|
|
|
|
The `$GIT_DIR/info/sparse-checkout` file is used to define the
|
|
skip-worktree reference bitmap. When Git updates the working
|
|
directory, it updates the skip-worktree bits in the index based
|
|
on this file. The files matching the patterns in the file will
|
|
appear in the working directory, and the rest will not.
|
|
|
|
INTERNALS -- NON-CONE PROBLEMS
|
|
------------------------------
|
|
|
|
The `$GIT_DIR/info/sparse-checkout` file populated by the `set` and
|
|
`add` subcommands is defined to be a bunch of patterns (one per line)
|
|
using the same syntax as `.gitignore` files. In cone mode, these
|
|
patterns are restricted to matching directories (and users only ever
|
|
need supply or see directory names), while in non-cone mode any
|
|
gitignore-style pattern is permitted. Using the full gitignore-style
|
|
patterns in non-cone mode has a number of shortcomings:
|
|
|
|
* Fundamentally, it makes various worktree-updating processes (pull,
|
|
merge, rebase, switch, reset, checkout, etc.) require O(N*M) pattern
|
|
matches, where N is the number of patterns and M is the number of
|
|
paths in the index. This scales poorly.
|
|
|
|
* Avoiding the scaling issue has to be done via limiting the number
|
|
of patterns via specifying leading directory name or glob.
|
|
|
|
* Passing globs on the command line is error-prone as users may
|
|
forget to quote the glob, causing the shell to expand it into all
|
|
matching files and pass them all individually along to
|
|
sparse-checkout set/add. While this could also be a problem with
|
|
e.g. "git grep -- *.c", mistakes with grep/log/status appear in
|
|
the immediate output. With sparse-checkout, the mistake gets
|
|
recorded at the time the sparse-checkout command is run and might
|
|
not be problematic until the user later switches branches or rebases
|
|
or merges, thus putting a delay between the user's error and when
|
|
they have a chance to catch/notice it.
|
|
|
|
* Related to the previous item, sparse-checkout has an 'add'
|
|
subcommand but no 'remove' subcommand. Even if a 'remove'
|
|
subcommand were added, undoing an accidental unquoted glob runs
|
|
the risk of "removing too much", as it may remove entries that had
|
|
been included before the accidental add.
|
|
|
|
* Non-cone mode uses gitignore-style patterns to select what to
|
|
*include* (with the exception of negated patterns), while
|
|
.gitignore files use gitignore-style patterns to select what to
|
|
*exclude* (with the exception of negated patterns). The
|
|
documentation on gitignore-style patterns usually does not talk in
|
|
terms of matching or non-matching, but on what the user wants to
|
|
"exclude". This can cause confusion for users trying to learn how
|
|
to specify sparse-checkout patterns to get their desired behavior.
|
|
|
|
* Every other git subcommand that wants to provide "special path
|
|
pattern matching" of some sort uses pathspecs, but non-cone mode
|
|
for sparse-checkout uses gitignore patterns, which feels
|
|
inconsistent.
|
|
|
|
* It has edge cases where the "right" behavior is unclear. Two examples:
|
|
|
|
First, two users are in a subdirectory, and the first runs
|
|
git sparse-checkout set '/toplevel-dir/*.c'
|
|
while the second runs
|
|
git sparse-checkout set relative-dir
|
|
Should those arguments be transliterated into
|
|
current/subdirectory/toplevel-dir/*.c
|
|
and
|
|
current/subdirectory/relative-dir
|
|
before inserting into the sparse-checkout file? The user who typed
|
|
the first command is probably aware that arguments to set/add are
|
|
supposed to be patterns in non-cone mode, and probably would not be
|
|
happy with such a transliteration. However, many gitignore-style
|
|
patterns are just paths, which might be what the user who typed the
|
|
second command was thinking, and they'd be upset if their argument
|
|
wasn't transliterated.
|
|
|
|
Second, what should bash-completion complete on for set/add commands
|
|
for non-cone users? If it suggests paths, is it exacerbating the
|
|
problem above? Also, if it suggests paths, what if the user has a
|
|
file or directory that begins with either a '!' or '#' or has a '*',
|
|
'\', '?', '[', or ']' in its name? And if it suggests paths, will
|
|
it complete "/pro" to "/proc" (in the root filesytem) rather than to
|
|
"/progress.txt" in the current directory? (Note that users are
|
|
likely to want to start paths with a leading '/' in non-cone mode,
|
|
for the same reason that .gitignore files often have one.)
|
|
Completing on files or directories might give nasty surprises in
|
|
all these cases.
|
|
|
|
* The excessive flexibility made other extensions essentially
|
|
impractical. `--sparse-index` is likely impossible in non-cone
|
|
mode; even if it is somehow feasible, it would have been far more
|
|
work to implement and may have been too slow in practice. Some
|
|
ideas for adding coupling between partial clones and sparse
|
|
checkouts are only practical with a more restricted set of paths
|
|
as well.
|
|
|
|
For all these reasons, non-cone mode is deprecated. Please switch to
|
|
using cone mode.
|
|
|
|
|
|
INTERNALS -- CONE MODE HANDLING
|
|
-------------------------------
|
|
|
|
The "cone mode", which is the default, lets you specify only what
|
|
directories to include. For any directory specified, all paths below
|
|
that directory will be included, and any paths immediately under
|
|
leading directories (including the toplevel directory) will also be
|
|
included. Thus, if you specified the directory
|
|
Documentation/technical/
|
|
then your sparse checkout would contain:
|
|
|
|
* all files in the toplevel-directory
|
|
* all files immediately under Documentation/
|
|
* all files at any depth under Documentation/technical/
|
|
|
|
Also, in cone mode, even if no directories are specified, then the
|
|
files in the toplevel directory will be included.
|
|
|
|
When changing the sparse-checkout patterns in cone mode, Git will inspect each
|
|
tracked directory that is not within the sparse-checkout cone to see if it
|
|
contains any untracked files. If all of those files are ignored due to the
|
|
`.gitignore` patterns, then the directory will be deleted. If any of the
|
|
untracked files within that directory is not ignored, then no deletions will
|
|
occur within that directory and a warning message will appear. If these files
|
|
are important, then reset your sparse-checkout definition so they are included,
|
|
use `git add` and `git commit` to store them, then remove any remaining files
|
|
manually to ensure Git can behave optimally.
|
|
|
|
See also the "Internals -- Cone Pattern Set" section to learn how the
|
|
directories are transformed under the hood into a subset of the
|
|
Full Pattern Set of sparse-checkout.
|
|
|
|
|
|
INTERNALS -- FULL PATTERN SET
|
|
-----------------------------
|
|
|
|
The full pattern set allows for arbitrary pattern matches and complicated
|
|
inclusion/exclusion rules. These can result in O(N*M) pattern matches when
|
|
updating the index, where N is the number of patterns and M is the number
|
|
of paths in the index. To combat this performance issue, a more restricted
|
|
pattern set is allowed when `core.sparseCheckoutCone` is enabled.
|
|
|
|
The sparse-checkout file uses the same syntax as `.gitignore` files;
|
|
see linkgit:gitignore[5] for details. Here, though, the patterns are
|
|
usually being used to select which files to include rather than which
|
|
files to exclude. (However, it can get a bit confusing since
|
|
gitignore-style patterns have negations defined by patterns which
|
|
begin with a '!', so you can also select files to _not_ include.)
|
|
|
|
For example, to select everything, and then to remove the file
|
|
`unwanted` (so that every file will appear in your working tree except
|
|
the file named `unwanted`):
|
|
|
|
git sparse-checkout set --no-cone '/*' '!unwanted'
|
|
|
|
These patterns are just placed into the
|
|
`$GIT_DIR/info/sparse-checkout` as-is, so the contents of that file
|
|
at this point would be
|
|
|
|
----------------
|
|
/*
|
|
!unwanted
|
|
----------------
|
|
|
|
See also the "Sparse Checkout" section of linkgit:git-read-tree[1] to
|
|
learn more about the gitignore-style patterns used in sparse
|
|
checkouts.
|
|
|
|
|
|
INTERNALS -- CONE PATTERN SET
|
|
-----------------------------
|
|
|
|
In cone mode, only directories are accepted, but they are translated into
|
|
the same gitignore-style patterns used in the full pattern set. We refer
|
|
to the particular patterns used in those mode as being of one of two types:
|
|
|
|
1. *Recursive:* All paths inside a directory are included.
|
|
|
|
2. *Parent:* All files immediately inside a directory are included.
|
|
|
|
Since cone mode always includes files at the toplevel, when running
|
|
`git sparse-checkout set` with no directories specified, the toplevel
|
|
directory is added as a parent pattern. At this point, the
|
|
sparse-checkout file contains the following patterns:
|
|
|
|
----------------
|
|
/*
|
|
!/*/
|
|
----------------
|
|
|
|
This says "include everything immediately under the toplevel
|
|
directory, but nothing at any level below that."
|
|
|
|
When in cone mode, the `git sparse-checkout set` subcommand takes a
|
|
list of directories. The command `git sparse-checkout set A/B/C` sets
|
|
the directory `A/B/C` as a recursive pattern, the directories `A` and
|
|
`A/B` are added as parent patterns. The resulting sparse-checkout file
|
|
is now
|
|
|
|
----------------
|
|
/*
|
|
!/*/
|
|
/A/
|
|
!/A/*/
|
|
/A/B/
|
|
!/A/B/*/
|
|
/A/B/C/
|
|
----------------
|
|
|
|
Here, order matters, so the negative patterns are overridden by the positive
|
|
patterns that appear lower in the file.
|
|
|
|
Unless `core.sparseCheckoutCone` is explicitly set to `false`, Git will
|
|
parse the sparse-checkout file expecting patterns of these types. Git will
|
|
warn if the patterns do not match. If the patterns do match the expected
|
|
format, then Git will use faster hash-based algorithms to compute inclusion
|
|
in the sparse-checkout. If they do not match, git will behave as though
|
|
`core.sparseCheckoutCone` was false, regardless of its setting.
|
|
|
|
In the cone mode case, despite the fact that full patterns are written
|
|
to the $GIT_DIR/info/sparse-checkout file, the `git sparse-checkout
|
|
list` subcommand will list the directories that define the recursive
|
|
patterns. For the example sparse-checkout file above, the output is as
|
|
follows:
|
|
|
|
--------------------------
|
|
$ git sparse-checkout list
|
|
A/B/C
|
|
--------------------------
|
|
|
|
If `core.ignoreCase=true`, then the pattern-matching algorithm will use a
|
|
case-insensitive check. This corrects for case mismatched filenames in the
|
|
'git sparse-checkout set' command to reflect the expected cone in the working
|
|
directory.
|
|
|
|
|
|
INTERNALS -- SUBMODULES
|
|
-----------------------
|
|
|
|
If your repository contains one or more submodules, then submodules
|
|
are populated based on interactions with the `git submodule` command.
|
|
Specifically, `git submodule init -- <path>` will ensure the submodule
|
|
at `<path>` is present, while `git submodule deinit [-f] -- <path>`
|
|
will remove the files for the submodule at `<path>` (including any
|
|
untracked files, uncommitted changes, and unpushed history). Similar
|
|
to how sparse-checkout removes files from the working tree but still
|
|
leaves entries in the index, deinitialized submodules are removed from
|
|
the working directory but still have an entry in the index.
|
|
|
|
Since submodules may have unpushed changes or untracked files,
|
|
removing them could result in data loss. Thus, changing sparse
|
|
inclusion/exclusion rules will not cause an already checked out
|
|
submodule to be removed from the working copy. Said another way, just
|
|
as `checkout` will not cause submodules to be automatically removed or
|
|
initialized even when switching between branches that remove or add
|
|
submodules, using `sparse-checkout` to reduce or expand the scope of
|
|
"interesting" files will not cause submodules to be automatically
|
|
deinitialized or initialized either.
|
|
|
|
Further, the above facts mean that there are multiple reasons that
|
|
"tracked" files might not be present in the working copy: sparsity
|
|
pattern application from sparse-checkout, and submodule initialization
|
|
state. Thus, commands like `git grep` that work on tracked files in
|
|
the working copy may return results that are limited by either or both
|
|
of these restrictions.
|
|
|
|
|
|
SEE ALSO
|
|
--------
|
|
|
|
linkgit:git-read-tree[1]
|
|
linkgit:gitignore[5]
|
|
|
|
GIT
|
|
---
|
|
Part of the linkgit:git[1] suite
|