The git-request-pull.sh script invokes Perl, so it requires Perl to be
available, but the associated test t5150 does not skip itself when Perl
has been disabled, which then makes subtest 4 through 10 fail. Subtest 3
still passes, but for the wrong reasons (it expects git-request-pull to
fail, and it does fail when Perl is not available). The initial two
subtests that do pass are only doing setup.
To prevent t5150 from failing the build when NO_PERL=1, add a check that
sets skip_all when "! test_have_prereq PERL", just like how for example
t3701-add-interactive skips itself when Perl has been disabled.
Signed-off-by: Ruud van Asseldonk <dev@veniogames.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a commit-graph, we show progress along several commit
walks. When we use start_delayed_progress(), the progress line will
only appear if that step takes a decent amount of time.
However, one place was missed: computing generation numbers. This is
normally a very fast operation as all commits have been parsed in a
previous step. But, this is showing up for all users no matter how few
commits are being added.
The tests that check for the progress output have already been updated
to use GIT_PROGRESS_DELAY=0 to force the expected output.
Helped-by: Jeff King <peff@peff.net>
Reported-by: ryenus <ryenus@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The start_delayed_progress() method is a preferred way to show
optional progress to users as it ignores steps that take less
than two seconds. However, this makes testing unreliable as tests
expect to be very fast.
In addition, users may want to decrease or increase this time
interval depending on their preferences for terminal noise.
Create the GIT_PROGRESS_DELAY environment variable to control
the delay set during start_delayed_progress(). Set the value
in some tests to guarantee their output remains consistent.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The perf suite's aggregate.perl depends on Git.pm, which is a mild
annoyance if you've built git with NO_PERL. It turns out that the only
thing we use it for is a single call of the command_oneline() helper.
We can just replace this with backticks or similar.
Annoyingly, perl has no backtick equivalent that avoids a shell eval,
which means our $arg would require quoting. This probably doesn't matter
for our purposes, but it's better to be safe and model good style. So
we'll just provide a short helper around open(), which takes its
arguments as a list.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The perf tests write files recording the results of tests. These
results are later aggregated by 'aggregate.perl'. If the tests are run
multiple times, those results are overwritten by the new results. This
works just fine as long as there are only perf tests measuring the
times, whose results are stored in "$base".times files.
However 22bec79d1a ("t/perf: add infrastructure for measuring sizes",
2018-08-17) introduced a new type of test for measuring the size of
something. The results of this are written to "$base".size files.
"$base" is essentially made up of the basename of the script plus the
test number. So if test numbers shift because a new test was
introduced earlier in the script we might end up with both a ".times"
and a ".size" file for the same test. In the aggregation script the
".times" file is preferred over the ".size" file, so some size tests
might end with performance numbers from a previous run of the test.
This is mainly relevant when writing perf tests that check both
performance and sizes, and can get quite confusing during
developement.
We could fix this by doing a more thorough job of cleaning out old
".times" and ".size" files before running each test. However, an even
easier solution is to just use the same filename for both types of
measurement, meaning we'll always overwrite the previous result. We
don't even need to change the file format to distinguish the two;
aggregate.perl already decides which is which based on a regex of the
content (this may become ambiguous if we add new types in the future,
but we could easily add a header field to the file at that point).
Based on an initial patch from Thomas Gummerer, who discovered the
problem and did all of the analysis (which I stole for the commit
message above):
https://public-inbox.org/git/20191119185047.8550-1-t.gummerer@gmail.com/
Helped-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'test_commit_bulk' is invoked in an empty test repository, it
prints a "fatal: Needed a single revision" error, but still does what
it's supposed to do. A test helper function displaying a fatal error
and still succeeding is always suspect to be buggy, but luckily that's
not the case here: that error comes from a 'git rev-parse --verify
HEAD' command invoked in a condition, which doesn't have anything to
verify in an empty repository.
Use the '--quiet' option to suppress that error message.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify the way the `--reset-author-date` option is described,
and mark its usage string translatable.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling `git submodule status` while in a subdirectory, we are
incorrectly not detecting modified submodules and
thus reporting that all of the submodules are unchanged.
This is because the submodule helper is calling `diff-index` with the
submodule path assuming the path is relative to the current prefix
directory, however the submodule path used is actually relative to the root.
Always pass NULL as the `prefix` when running diff-files on the
submodule, to make sure the submodule's path is interpreted as relative
to the superproject's repository root.
Signed-off-by: Manish Goregaokar <manishsmail@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, complete_action(), used by builtin/rebase.c to start a new
rebase, calls sequencer_continue() to do it. Before the former calls
pick_commits(), it
- calls read_and_refresh_cache() -- this is unnecessary here as we've
just called require_clean_work_tree() in complete_action()
- calls read_populate_opts() -- this is unnecessary as we're starting a
new rebase, so `opts' is fully populated
- loads the todo list -- this is unnecessary as we've just populated
the todo list in complete_action()
- commits any staged changes -- this is unnecessary as we're starting a
new rebase, so there are no staged changes
- calls record_in_rewritten() -- this is unnecessary as we're starting
a new rebase.
This changes complete_action() to directly call pick_commits() to avoid
these unnecessary steps.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When sequencer_continue() is called by complete_action(), `opts' has
been filled by get_replay_opts(). Currently, it does not initialise the
`squash_onto' field (used by the `--root' mode), only
read_populate_opts() does. It’s not a problem yet since
sequencer_continue() calls it before pick_commits(), but it would lead
to incorrect results once complete_action() is modified to call
pick_commits() directly.
Let’s change that.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The total number of commands can be used to show the progression of the
rebasing in a shell. It is written to the disk by read_populate_todo()
when the todo list is loaded from sequencer_continue() or
pick_commits(), but not by complete_action().
This moves the part writing total_nr to a new function so it can be
called from complete_action().
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a todo list, `done_nr' is the number of commands that were executed
or skipped, but skip_unnecessary_picks() did not update it.
This variable is mostly used by command prompts (ie. git-prompt.sh and
the like). As in the previous commit, this inconsistent behaviour is
not a problem yet, but it would start to matter at the end of this
series the same reason.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`total_nr' is the total number of items, counting both done and todo,
that are in a todo list. But unlike `nr', it was not updated when an
item was appended to the list.
This variable is mostly used by command prompts (ie. git-prompt.sh and
the like). By forgetting to update it, the original code made it not
reflect the reality, but this flaw was masked by the code calling
unnecessarily read_populate_todo() again to update the variable to its
correct value. At the end of this series, the unnecessary call will be
removed, and the inconsistency addressed by this patch would start to
matter.
Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
init_topo_walk doesn't reuse an existing topo_walk_info, and currently
leaks the one that might exist on the current rev_info if it was already
used for a topo walk beforehand.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Not doing so can lead to wrong topo-walks when using the revision walk API
consecutively.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's code base already seems to be using `PRIdMAX` without any such
fallback definition for quite a while (75459410ed (json_writer: new
routines to create JSON data, 2018-07-13), to be precise, and the
first Git version to include that commit was v2.19.0). Having a
fallback definition only for `PRIuMAX` is a bit inconsistent.
We do sometimes get portability reports more than a year after the
problem was introduced. This one should be fairly safe. PRIuMAX is
in C99 (for that matter, SCNuMAX, PRIu32 and others also are), and
we've been picking up other C99-isms without complaint.
The PRIuMAX fallback definition was originally added in 3efb1f343a
(Check for PRIuMAX rather than NO_C99_FORMAT in fast-import.c.,
2007-02-20). But it was replacing a construct that was introduced in
an even earlier commit, 579d1fbfaf (Add NO_C99_FORMAT to support
older compilers., 2006-07-30), which talks about gcc 2.95.
That's pretty ancient at this point.
Signed-off-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Jeff King <peff@peff.net>
[jc: tweaked both message and code, taking what peff wrote]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Based on commit a5cd71ca4a ("l10n: zh_CN: for git v2.24.0 l10n round
1~2", 2019-10-29).
Signed-off-by: Yi-Jyun Pan <pan93412@gmail.com>
Reviewed-by: Franklin Weng <franklin@goodhorse.idv.tw>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Since 2f328c3d ("reset $sha1 $pathspec: require $sha1 only to be
treeish", 2013-01-14), we allowed "git reset $object -- $path" to reset
individual paths that match the pathspec to take the blob from a tree
object, not necessarily a commit, while the form to reset the tip of the
current branch to some other commit still must be given a commit.
Like resetting with paths, "git reset --patch" does not update HEAD, and
need not require a commit. The path-filtered form, "git reset --patch
$object -- $pathspec", has accepted a tree-ish since 2f328c3d.
"git reset --patch" is documented as accepting a <tree-ish> since
bf44142f ("reset: update documentation to require only tree-ish with
paths", 2013-01-16). Documentation changes are not required.
Loosen the restriction that requires a commit for the unfiltered "git
reset --patch $object".
Signed-off-by: Nika Layzell <nika@thelayzells.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git submodule update' will fetch new commits from the submodule remote
if the SHA-1 recorded in the superproject is not found. This was not
mentioned in the documentation.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git submodule status --cached' reports the SHAs recorded in the
index of the superproject, instead of the SHAs that are checked out
in the submodule.
Signed-off-by: Manish Goregaokar <manishsmail@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 'git revert' or 'git cherry-pick --edit' is invoked with multiple
commits, then after editing the first commit message is finished both
these commands should continue with processing the second commit and
launch another editor for its commit message, assuming there are
no conflicts, of course.
Alas, this inadvertently changed with commit a47ba3c777 (rebase -i:
check for updated todo after squash and reword, 2019-08-19): after
editing the first commit message is finished, both 'git revert' and
'git cherry-pick --edit' exit with error, claiming that "nothing to
commit, working tree clean".
The reason for the changed behaviour is twofold:
- Prior to a47ba3c777 the up-to-dateness of the todo list file was
only checked after 'exec' instructions, and that commit moved
those checks to the common code path. The intention was that this
check should be performed after instructions spawning an editor
('squash' and 'reword') as well, so the ongoing 'rebase -i'
notices when the user runs a 'git rebase --edit-todo' while
squashing/rewording a commit message.
However, as it happened that check is now performed even after
'revert' and 'pick' instructions when they involved editing the
commit message. And 'revert' by default while 'pick' optionally
(with 'git cherry-pick --edit') involves editing the commit
message.
- When invoking 'git revert' or 'git cherry-pick --edit' with
multiple commits they don't read a todo list file but assemble the
todo list in memory, thus the associated stat data used to check
whether the file has been updated is all zeroed out initially.
Then the sequencer writes all instructions (including the very
first) to the todo file, executes the first 'revert/pick'
instruction, and after the user finished editing the commit
message the changes of a47ba3c777 kick in, and it checks whether
the todo file has been modified. The initial all-zero stat data
obviously differs from the todo file's current stat data, so the
sequencer concludes that the file has been modified. Technically
it is not wrong, of course, because the file just has been written
indeed by the sequencer itself, though the file's contents still
match what the sequencer was invoked with in the beginning.
Consequently, after re-reading the todo file the sequencer
executes the same first instruction _again_, thus ending up in
that "nothing to commit" situation.
The todo list was never meant to be edited during multi-commit 'git
revert' or 'cherry-pick' operations, so perform that "has the todo
file been modified" check only when the sequencer was invoked as part
of an interactive rebase.
Reported-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Turns out that it don't work so well on Vista, see
https://github.com/git-for-windows/git/issues/1742 for details.
According to https://devblogs.microsoft.com/oldnewthing/?p=8873, it
*should* work on Windows Vista and later.
But apparently there are issues on Windows Vista when pipes are
involved. Given that Windows Vista is past its end of life (official
support ended on April 11th, 2017), let's not spend *too* much time on
this issue and just disable the file handle inheritance restriction on
any Windows version earlier than Windows 7.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, CreateProcess() does not inherit any open file handles,
unless the bInheritHandles parameter is set to TRUE. Which we do need to
set because we need to pass in stdin/stdout/stderr to talk to the child
processes. Sadly, this means that all file handles (unless marked via
O_NOINHERIT) are inherited.
This lead to problems in VFS for Git, where a long-running read-object
hook is used to hydrate missing objects, and depending on the
circumstances, might only be called *after* Git opened a file handle.
Ideally, we would not open files without O_NOINHERIT unless *really*
necessary (i.e. when we want to pass the opened file handle as standard
handle into a child process), but apparently it is all-too-easy to
introduce incorrect open() calls: this happened, and prevented updating
a file after the read-object hook was started because the hook still
held a handle on said file.
Happily, there is a solution: as described in the "Old New Thing"
https://blogs.msdn.microsoft.com/oldnewthing/20111216-00/?p=8873 there
is a way, starting with Windows Vista, that lets us define precisely
which handles should be inherited by the child process.
And since we bumped the minimum Windows version for use with Git for
Windows to Vista with v2.10.1 (i.e. a *long* time ago), we can use this
method. So let's do exactly that.
We need to make sure that the list of handles to inherit does not
contain duplicates; Otherwise CreateProcessW() would fail with
ERROR_INVALID_ARGUMENT.
While at it, stop setting errno to ENOENT unless it really is the
correct value.
Also, fall back to not limiting handle inheritance under certain error
conditions (e.g. on Windows 7, which is a lot stricter in what handles
you can specify to limit to).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For some reason, when being called via TortoiseGit the standard handles,
or at least what is returned by _get_osfhandle(0) for standard input,
can take on the value (HANDLE)-2 (which is not a legal value, according
to the documentation).
Even if this value is not documented anywhere, CreateProcess() seems to
work fine without complaints if hStdInput set to this value.
In contrast, the upcoming code to restrict which file handles get
inherited by spawned processes would result in `ERROR_INVALID_PARAMETER`
when including such handle values in the list.
To help this, special-case the value (HANDLE)-2 returned by
_get_osfhandle() and replace it with INVALID_HANDLE_VALUE, which will
hopefully let the handle inheritance restriction work even when called
from TortoiseGit.
This fixes https://github.com/git-for-windows/git/issues/1481
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When spawning child processes, we really should be careful which file
handles we let them inherit.
This is doubly important on Windows, where we cannot rename, delete, or
modify files if there is still a file handle open.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The GIT_TEST_CLONE_2GB environment variable is only ever checked with
'test -z' in 't5608-clone-2gb.sh', so any non-empty value is
interpreted as "yes, run these expensive tests", even
'GIT_TEST_CLONE_2GB=NoThanks'.
Similar GIT_TEST_* environment variables have already been turned into
bools in 3b072c577b (tests: replace test_tristate with "git
env--helper", 2019-06-21), so let's turn GIT_TEST_CLONE_2GB into a
bool as well, to follow suit.
Our CI builds set GIT_TEST_CLONE_2GB=YesPlease, so adjust them
accordingly, thus removing the last 'YesPlease' from our CI scripts.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 3b072c577b (tests: replace test_tristate with "git env--helper",
2019-06-21) we get the normalized bool values of various GIT_TEST_*
environment variables via 'git env--helper'. Now, while the 'git
env--helper' command itself does catch invalid values in the
environment variable or in the given --default and exits with error
(exit code 128 or 129, respectively), it's invoked in conditions like
'if ! git env--helper ...', which means that all invalid bool values
are interpreted the same as the ordinary 'false' (exit code 1). This
has led to inadvertently skipped httpd tests in our CI builds for a
couple of weeks, see 3960290675 (ci: restore running httpd tests,
2019-09-06).
Let's be more careful about what the test suite accepts as bool values
in GIT_TEST_* environment variables, and error out loud and clear on
invalid values instead of simply skipping tests. Add the
'test_bool_env' helper function to encapsulate the invocation of 'git
env--helper' and the verification of its exit code, and replace all
invocations of that command in our test framework and test suite with
a call to this new helper (except in 't0017-env-helper.sh', of
course).
$ GIT_TEST_GIT_DAEMON=YesPlease ./t5570-git-daemon.sh
fatal: bad numeric config value 'YesPlease' for 'GIT_TEST_GIT_DAEMON': invalid unit
error: test_bool_env requires bool values both for $GIT_TEST_GIT_DAEMON and for the default fallback
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes a regression introduced in 356ee4659b ("sequencer: try to
commit without forking 'git commit'", 2017-11-24). When amending a
commit try_to_commit() was using the wrong parent when checking if the
commit would be empty. When amending we need to check against HEAD^ not
HEAD.
t3403 may not seem like the natural home for the new tests but a further
patch series will improve the advice printed by `git commit`. That
series will mutate these tests to check that the advice includes
suggesting `rebase --skip` to skip the fixup that would empty the
commit.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We deprecated `--preserve-merges` in favor of `--rebase-merges`; Let's
reflect that in `git svn`.
Note: Even when the user asks for `--preserve-merges`, we now silently
pass `--rebase-merges` to `git rebase` instead. Technically, this is a
change of behavior. But practically, `git svn` only ever asks for a
non-interactive rebase, and `--preserve-merges` and `--rebase-merges`
are on par with regard to that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The VALIDSIG status line from GnuPG with --status-fd is documented to
have 9 required and 1 optional fields [1]. The final, and optional,
field is used to specify the fingerprint of the primary key that made
the signature in case it was made by a subkey. However, this field is
only available for OpenPGP signatures; not for CMS/X.509.
If the VALIDSIG status line does not have the optional 10th field, the
current code will continue reading onto the next status line. And this
is the case for non-OpenPGP signatures [1].
The consequence is that a subsequent status line may be considered as
the "primary key" for signatures that does not have an actual primary
key.
Limit the search of these 9 or 10 fields to the single line to avoid
this problem. If the 10th field is missing, report that there is no
primary key fingerprint.
[Reference]
[1] GnuPG Details, General status codes
https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=blob;f=doc/DETAILS;h=6ce340e8c04794add995e84308bb3091450bd28f;hb=HEAD#l483
The documentation says:
VALIDSIG <args>
The args are:
- <fingerprint_in_hex>
- <sig_creation_date>
- <sig-timestamp>
- <expire-timestamp>
- <sig-version>
- <reserved>
- <pubkey-algo>
- <hash-algo>
- <sig-class>
- [ <primary-key-fpr> ]
This status indicates that the signature is cryptographically
valid. [...] PRIMARY-KEY-FPR is the fingerprint of the primary key
or identical to the first argument.
The primary-key-fpr parameter is used for OpenPGP and not available
for CMS signatures. [...]
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a static replace_cstring() function to simplify repeated
pattern of free-and-xmemdupz() for GPG status line parsing.
This also helps us avoid potential memleaks if parsing of new status
lines are introduced in the future.
Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index-merge performed by 'git sparse-checkout' will erase any staged
changes, which can lead to data loss. Prevent these attempts by requiring
a clean 'git status' output.
Helped-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout init' subcommand previously wrote directly
to the sparse-checkout file and then updated the working directory.
This may fail if there are modified files not included in the initial
pattern set. However, that left a populated sparse-checkout file.
Use the in-process working directory update to guarantee that the
init subcommand only changes the sparse-checkout file if the working
directory update succeeds.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During the development of the sparse-checkout "cone mode" feature,
an incorrect placement of the initializer for "use_cone_patterns = 1"
caused warnings to show up when a .gitignore file was present with
non-cone-mode patterns. This was fixed in the original commit
introducing the cone mode, but now we should add a test to avoid
hitting this problem again in the future.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If two 'git sparse-checkout set' subcommands are launched at the
same time, the behavior can be unexpected as they compete to write
the sparse-checkout file and update the working directory.
Take a lockfile around the writes to the sparse-checkout file. In
addition, acquire this lock around the working directory update
to avoid two commands updating the working directory in different
ways.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout disable' subcommand returns a user to a
full working directory. The old process for doing this required
updating the sparse-checkout file with the "/*" pattern and then
updating the working directory with core.sparseCheckout enabled.
Finally, the sparse-checkout file could be removed and the config
setting disabled.
However, it is valuable to keep a user's sparse-checkout file
intact so they can re-enable the sparse-checkout they previously
used with 'git sparse-checkout init'. This is now possible with
the in-process mechanism for updating the working directory.
Reported-by: Szeder Gábor <szeder.dev@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout builtin used 'git read-tree -mu HEAD' to update the
skip-worktree bits in the index and to update the working directory.
This extra process is overly complex, and prone to failure. It also
requires that we write our changes to the sparse-checkout file before
trying to update the index.
Remove this extra process call by creating a direct call to
unpack_trees() in the same way 'git read-tree -mu HEAD' does. In
addition, provide an in-memory list of patterns so we can avoid
reading from the sparse-checkout file. This allows us to test a
proposed change to the file before writing to it.
An earlier version of this patch included a bug when the 'set' command
failed due to the "Sparse checkout leaves no entry on working directory"
error. It would not rollback the index.lock file, so the replay of the
old sparse-checkout specification would fail. A test in t1091 now
covers that scenario.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a user provides folders A/ and A/B/ for inclusion in a cone-mode
sparse-checkout file, the parsing logic will notice that A/ appears
both as a "parent" type pattern and as a "recursive" type pattern.
This is unexpected and hence will complain via a warning and revert
to the old logic for checking sparse-checkout patterns.
Prevent this from happening accidentally by sanitizing the folders
for this type of inclusion in the 'git sparse-checkout' builtin.
This happens in two ways:
1. Do not include any parent patterns that also appear as recursive
patterns.
2. Do not include any recursive patterns deeper than other recursive
patterns.
In order to minimize duplicate code for scanning parents, create
hashmap_contains_parent() method. It takes a strbuf buffer to
avoid reallocating a buffer when calling in a tight loop.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a large repository has many sparse-checkout patterns, the
process for updating the skip-worktree bits can take long enough
that a user gets confused why nothing is happening. Update the
clear_ce_flags() method to write progress.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout feature in "cone mode" can use the fact that
the recursive patterns are "connected" to the root via parent
patterns to decide if a directory is entirely contained in the
sparse-checkout or entirely removed.
In these cases, we can skip hashing the paths within those
directories and simply set the skipworktree bit to the correct
value.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To make the cone pattern set easy to use, update the behavior of
'git sparse-checkout (init|set)'.
Add '--cone' flag to 'git sparse-checkout init' to set the config
option 'core.sparseCheckoutCone=true'.
When running 'git sparse-checkout set' in cone mode, a user only
needs to supply a list of recursive folder matches. Git will
automatically add the necessary parent matches for the leading
directories.
When testing 'git sparse-checkout set' in cone mode, check the
error stream to ensure we do not see any errors. Specifically,
we want to avoid the warning that the patterns do not match
the cone-mode patterns.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The parent and recursive patterns allowed by the "cone mode"
option in sparse-checkout are restrictive enough that we
can avoid using the regex parsing. Everything is based on
prefix matches, so we can use hashsets to store the prefixes
from the sparse-checkout file. When checking a path, we can
strip path entries from the path and check the hashset for
an exact match.
As a test, I created a cone-mode sparse-checkout file for the
Linux repository that actually includes every file. This was
constructed by taking every folder in the Linux repo and creating
the pattern pairs here:
/$folder/
!/$folder/*/
This resulted in a sparse-checkout file sith 8,296 patterns.
Running 'git read-tree -mu HEAD' on this file had the following
performance:
core.sparseCheckout=false: 0.21 s (0.00 s)
core.sparseCheckout=true: 3.75 s (3.50 s)
core.sparseCheckoutCone=true: 0.23 s (0.01 s)
The times in parentheses above correspond to the time spent
in the first clear_ce_flags() call, according to the trace2
performance traces.
While this example is contrived, it demonstrates how these
patterns can slow the sparse-checkout feature.
Helped-by: Eric Wong <e@80x24.org>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout feature can have quadratic performance as
the number of patterns and number of entries in the index grow.
If there are 1,000 patterns and 1,000,000 entries, this time can
be very significant.
Create a new Boolean config option, core.sparseCheckoutCone, to
indicate that we expect the sparse-checkout file to contain a
more limited set of patterns. This is a separate config setting
from core.sparseCheckout to avoid breaking older clients by
introducing a tri-state option.
The config option does nothing right now, but will be expanded
upon in a later commit.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Git updates the working directory with the sparse-checkout
feature enabled, the unpack_trees() method calls clear_ce_flags()
to update the skip-wortree bits on the cache entries. This
check can be expensive, depending on the patterns used.
Add trace2 regions around the method, including some flag
information, so we can get granular performance data during
experiments. This data will be used to measure improvements
to the pattern-matching algorithms for sparse-checkout.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The instructions for disabling a sparse-checkout to a full
working directory are complicated and non-intuitive. Add a
subcommand, 'git sparse-checkout disable', to perform those
steps for the user.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout set' subcommand takes a list of patterns
and places them in the sparse-checkout file. Then, it updates the
working directory to match those patterns. For a large list of
patterns, the command-line call can get very cumbersome.
Add a '--stdin' option to instead read patterns over standard in.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git sparse-checkout set' subcommand takes a list of patterns
as arguments and writes them to the sparse-checkout file. Then, it
updates the working directory using 'git read-tree -mu HEAD'.
The 'set' subcommand will replace the entire contents of the
sparse-checkout file. The write_patterns_and_update() method is
extracted from cmd_sparse_checkout() to make it easier to implement
'add' and/or 'remove' subcommands in the future.
If the core.sparseCheckout config setting is disabled, then enable
the config setting in the worktree config. If we set the config
this way and the sparse-checkout fails, then re-disable the config
setting.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When someone wants to clone a large repository, but plans to work
using a sparse-checkout file, they either need to do a full
checkout first and then reduce the patterns they included, or
clone with --no-checkout, set up their patterns, and then run
a checkout manually. This requires knowing a lot about the repo
shape and how sparse-checkout works.
Add a new '--sparse' option to 'git clone' that initializes the
sparse-checkout file to include the following patterns:
/*
!/*/
These patterns include every file in the root directory, but
no directories. This allows a repo to include files like a
README or a bootstrapping script to grow enlistments from that
point.
During the 'git sparse-checkout init' call, we must first look
to see if HEAD is valid, since 'git clone' does not have a valid
HEAD at the point where it initializes the sparse-checkout. The
following checkout within the clone command will create the HEAD
ref and update the working directory correctly.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Getting started with a sparse-checkout file can be daunting. Help
users start their sparse enlistment using 'git sparse-checkout init'.
This will set 'core.sparseCheckout=true' in their config, write
an initial set of patterns to the sparse-checkout file, and update
their working directory.
Make sure to use the `extensions.worktreeConfig` setting and write
the sparse checkout config to the worktree-specific config file.
This avoids confusing interactions with other worktrees.
The use of running another process for 'git read-tree' is sub-
optimal. This will be removed in a later change.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout feature is mostly hidden to users, as its
only documentation is supplementary information in the docs for
'git read-tree'. In addition, users need to know how to edit the
.git/info/sparse-checkout file with the right patterns, then run
the appropriate 'git read-tree -mu HEAD' command. Keeping the
working directory in sync with the sparse-checkout file requires
care.
Begin an effort to make the sparse-checkout feature a porcelain
feature by creating a new 'git sparse-checkout' builtin. This
builtin will be the preferred mechanism for manipulating the
sparse-checkout file and syncing the working directory.
The documentation provided is adapted from the "git read-tree"
documentation with a few edits for clarity in the new context.
Extra sections are added to hint toward a future change to
a more restricted pattern set.
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>