Various optimization for "git fetch".
* ps/fetch-mirror-optim:
refs/files-backend: optimize reading of symbolic refs
remote: read symbolic refs via `refs_read_symbolic_ref()`
refs: add ability for backends to special-case reading of symbolic refs
fetch: avoid lookup of commits when not appending to FETCH_HEAD
upload-pack: look up "want" lines via commit-graph
The untracked cache newly computed weren't written back to the
on-disk index file when there is no other change to the index,
which has been corrected.
* tk/empty-untracked-cache:
untracked-cache: write index when populating empty untracked cache
t7519: populate untracked cache before test
t7519: avoid file to index mtime race for untracked cache
When check_has_commit() is called on a missing submodule, initialization
of the struct repository fails, but it attempts to clear the struct
anyway (which is a fatal error). This bug is masked by its only caller,
submodule_has_commits(), first calling add_submodule_odb(). The latter
fails if the submodule does not exist, making submodule_has_commits()
exit early and not invoke check_has_commit().
Fix this bug, and because calling add_submodule_odb() is no longer
necessary as of 13a2f620b2 (submodule: pass repo to
check_has_commit(), 2021-10-08), remove that call too.
This is the last caller of add_submodule_odb(), so remove that
function. (Submodule ODBs are still added as alternates via
add_submodule_odb_by_path().)
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch --recurse-submodules" only considers populated
submodules (i.e. submodules that can be found by iterating the index),
which makes "git fetch" behave differently based on which commit is
checked out. As a result, even if the user has initialized all submodules
correctly, they may not fetch the necessary submodule commits, and
commands like "git checkout --recurse-submodules" might fail.
Teach "git fetch" to fetch cloned, changed submodules regardless of
whether they are populated. This is in addition to the current behavior
of fetching populated submodules (which is always attempted regardless
of what was fetched in the superproject, or even if nothing was fetched
in the superproject).
A submodule may be encountered multiple times (via the list of
populated submodules or via the list of changed submodules). When this
happens, "git fetch" only reads the 'populated copy' and ignores the
'changed copy'. Amend the verify_fetch_result() test helper so that we
can assert on which 'copy' is being read.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rearrange functions so that submodule_update() no longer needs to be
forward declared.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch completes the conversion past the flag parsing of
`submodule update` by introducing a helper subcommand called
`submodule--helper update`. The behaviour of `submodule update` should
remain the same after this patch.
Prior to this patch, `submodule update` was implemented by piping the
output of `update-clone` (which clones all missing submodules, then
prints relevant information for all submodules) into
`run-update-procedure` (which reads the information and updates the
submodule tree).
With `submodule--helper update`, we iterate over the submodules and
update the submodule tree in the same process. This reuses most of
existing code structure, except that `update_submodule()` now updates
the submodule tree (instead of printing submodule information to be
consumed by another process).
Recursing on a submodule is done by calling a subprocess that launches
`submodule--helper update`, with a modified `--recursive-prefix` and
`--prefix` parameter.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A later commit will combine the "update-clone" and
"run-update-procedure" commands, so run_update_procedure() will be
removed. Prepare for this by moving as much logic as possible out of
run_update_procedure() and into update_submodule2().
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor 'struct update_data' to hold the parsed args needed by "git
submodule--helper update" and refactor "update-clone" and
"run-update-procedure" (the functions that will be combined to form
"update") to use these options.
For "run-update-procedure", 'struct update_data' already holds its args,
so only arg parsing code needs to be updated.
For "update-clone", move its args from 'struct submodule_update_clone'
into 'struct update_data', and replace them with a pointer to 'struct
update_data'. Its other members hold the submodule iteration state of
"update-clone", so those are unchanged.
Incidentally, since we reformat the designated initializers of the
affected structs, also reformat MODULE_CLONE_DATA_INIT for consistency.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later commit, update_clone()'s options will be stored in a struct
update_data instead of submodule_update_clone. Preemptively rename the
options struct to "opt" to shrink that commit's diff.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use die_message() to print the "fatal: " prefix instead of doing it in
git-submodule.sh and remove a now-unnecessary exit code from "git
submodule--helper run-update-procedure".
Also, since die_message() adds the newline for us, replace an invocation
of die_with_status() with printf + exit invocations that do not add a
newline, but are otherwise identical to die_with_status().
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We switch to using the run-command API function that takes a
'struct child process', since we are using a lot of the options. This
will also make it simple to switch over to using 'capture_command()'
when we start handling the output of the command completely in C.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* gc/submodule-update-part1:
submodule--helper update-clone: check for --filter and --init
submodule update: add tests for --filter
submodule--helper: remove ensure-core-worktree
submodule--helper update-clone: learn --init
submodule--helper: allow setting superprefix for init_submodule()
submodule--helper: refactor get_submodule_displaypath()
submodule--helper run-update-procedure: learn --remote
submodule--helper: don't use bitfield indirection for parse_options()
submodule--helper: get remote names from any repository
submodule--helper run-update-procedure: remove --suboid
submodule--helper: reorganize code for sh to C conversion
submodule--helper: remove update-module-mode
submodule tests: test for init and update failure output
If the user suspends git while it is waiting for a keypress reset the
terminal before stopping and restore the settings when git resumes. If
the user tries to resume in the background print an error
message (taking care to use async safe functions) before stopping
again. Ideally we would reprint the prompt for the user when git
resumes but this patch just restarts the read().
The signal handler is established with sigaction() rather than using
sigchain_push() as this allows us to control the signal mask when the
handler is invoked and ensure SA_RESTART is used to restart the
read() when resuming.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On macos the builtin "add -p" does not handle keys that generate
escape sequences because poll() does not work with terminals
there. Switch to using select() on non-windows platforms to work
around this.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
read_key_without_echo() reads from stdin but uses /dev/tty when it
disables echo. This is unfortunate as there no guarantee that stdin is
the same device as /dev/tty. The perl version of "add -p" uses stdin
when it sets the terminal mode, this commit does the same for the
builtin version. There is still a difference between the perl and
builtin versions though - the perl version will ignore any errors when
setting the terminal mode[1] and will still read single bytes when
stdin is not a terminal. The builtin version displays a warning if
setting the terminal mode fails and switches to reading a line at a
time.
[1] b061c913bb/ReadKey.xs (L1090)
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next commit will add another flag in addition to the existing
full_duplex so change the function signature to take a flags
argument. Also alter the functions that call save_term() so that they
can pass flags down to it.
The choice to use an enum for tho bitwise flags is because gdb will
display the symbolic names of all the flags that are set rather than
the integer value.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a blobless-cloned repo, `git log --follow -- <path>` (`<path>` have
an exact OID rename) shouldn't download blob of the file from where the
new file is renamed.
Add a test case to verify it.
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of creating a new allocation, reverse the original list
in-place by calling the reverse_commit_list() helper.
The original code discards the list "bases" after storing its
reverse copy in a newly created list "reversed". If the code that
followed from here used both "bases" and "reversed", the
modification would not have worked, but since the original list
"bases" gets discarded, we can simply reverse "bases" in-place with
the reverse_commit_list() helper and reuse the same variable in the
code that follows.
builtin/merge.c has been left unmodified, since in its case, the
original list is needed separately from its reverse copy by the
code.
Signed-off-by: Jayati Shrivastava <gaurijove@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If no --args are present after 'git restore', it assumes that you
want to tab-complete one of the files with unstaged uncommitted
changes.
If a file has been staged, we don't want to list it, as restoring those
requires a slightly more complex `git restore --staged`, so we only list
those files that are --modified. While --committable also looks like
a good candidate, that includes changes that have been staged.
Signed-off-by: David Cantrell <david@cantrell.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing both loose and packed references to disk we first create a
lockfile, write the updated values into that lockfile, and on commit we
rename the file into place. According to filesystem developers, this
behaviour is broken because applications should always sync data to disk
before doing the final rename to ensure data consistency [1][2][3]. If
applications fail to do this correctly, a hard crash of the machine can
easily result in corrupted on-disk data.
This kind of corruption can in fact be easily observed with Git when the
machine hard-resets shortly after writing references to disk. On
machines with ext4, this will likely lead to the "empty files" problem:
the file has been renamed, but its data has not been synced to disk. The
result is that the reference is corrupt, and in the worst case this can
lead to data loss.
Implement a new option to harden references so that users and admins can
avoid this scenario by syncing locked loose and packed references to
disk before we rename them into place.
[1]: https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/
[2]: https://btrfs.wiki.kernel.org/index.php/FAQ (What are the crash guarantees of overwrite-by-rename)
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/ext4.rst (see auto_da_alloc)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ns/core-fsyncmethod:
core.fsync: documentation and user-friendly aggregate options
core.fsync: new option to harden the index
core.fsync: add configuration parsing
core.fsync: introduce granular fsync control infrastructure
core.fsyncmethod: add writeout-only mode
wrapper: make inclusion of Windows csprng header tightly scoped
This commit adds aggregate options for the core.fsync setting that are
more user-friendly. These options are specified in terms of 'levels of
safety', indicating which Git operations are considered to be sync
points for durability.
The new documentation is also included here in its entirety for ease of
review.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The synopsis for 'git maintenance' did not include the commands other
than the 'run' command. Update this to include the others. The 'start'
command is the only one of these that parses additional options, and
then only the --scheduler option.
Also move the 'register' command down after 'stop' and before
'unregister' for a logical grouping of the commands instead of an
alphabetical one. The diff makes it look as three other commands are
moved up.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When format is passed into --batch, --batch-check, --batch-command,
the format gets expanded. When nothing is passed in, the default format
is set and the expand_format() gets called.
We can save on these cycles by hardcoding how to print the
information when nothing is passed as the format, or when the default
format is passed. There is no need for the fully expanded format with
the default. Since batch_object_write() happens on every object provided
in batch mode, we get a nice performance improvement.
git rev-list --all > /tmp/all-obj.txt
git cat-file --batch-check </tmp/all-obj.txt
with HEAD^:
Time (mean ± σ): 57.6 ms ± 1.7 ms [User: 51.5 ms, System: 6.2 ms]
Range (min … max): 54.6 ms … 64.7 ms 50 runs
with HEAD:
Time (mean ± σ): 49.8 ms ± 1.7 ms [User: 42.6 ms, System: 7.3 ms]
Range (min … max): 46.9 ms … 55.9 ms 56 runs
If nothing is provided as a format argument, or if the default format is
passed, skip expanding of the format and print the object info with a
default format.
See https://lore.kernel.org/git/87eecf8ork.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the options '-q' and '--refresh' to the 'git reset' executed in
'reset_head()', and '--refresh' to the 'git reset -q' executed in
'do_push_stash(...)'.
'stash' is implemented such that git commands invoked as part of it (e.g.,
'clean', 'read-tree', 'reset', etc.) have their informational output
silenced. However, the 'reset' in 'reset_head()' is *not* called with '-q',
leading to the potential for a misleading printout from 'git stash apply
--index' if the stash included a removed file:
Unstaged changes after reset: D <deleted file>
Not only is this confusing in its own right (since, after the reset, 'git
stash' execution would stage the deletion in the index), it would be printed
even when the stash was applied with the '-q' option. As a result, the
messaging is removed entirely by calling 'git status' with '-q'.
Additionally, because the default behavior of 'git reset -q' is to skip
refreshing the index, but later operations in 'git stash' subcommands expect
a non-stale index, enable '--refresh' as well.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If using '--quiet' or 'reset.quiet=true', do not print the 'resetnoRefresh'
advice string. For applications that rely on '--quiet' disabling all
non-error logs, the advice message should be suppressed accordingly.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace references to '--quiet' with '--no-refresh' in the advice on how to
skip refreshing the index. When the advice was introduced, '--quiet' was the
only way to avoid the expensive 'refresh_index(...)' at the end of a mixed
reset. After introducing '--no-refresh', however, '--quiet' became only a
fallback option for determining refresh behavior, overridden by
'--[no-]refresh' or 'reset.refresh' if either is set. To ensure users are
advised to use the most reliable option for avoiding 'refresh_index(...)',
replace recommendation of '--quiet' with '--[no-]refresh'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new --[no-]refresh option that is intended to explicitly determine
whether a mixed reset should end in an index refresh.
Starting at 9ac8125d1a (reset: don't compute unstaged changes after reset
when --quiet, 2018-10-23), using the '--quiet' option results in skipping
the call to 'refresh_index(...)' at the end of a mixed reset with the goal
of improving performance. However, by coupling behavior that modifies the
index with the option that silences logs, there is no way for users to have
one without the other (i.e., silenced logs with a refreshed index) without
incurring the overhead of a separate call to 'git update-index --refresh'.
Furthermore, there is minimal user-facing documentation indicating that
--quiet skips the index refresh, potentially leading to unexpected issues
executing commands after 'git reset --quiet' that do not themselves refresh
the index (e.g., internals of 'git stash', 'git read-tree').
To mitigate these issues, '--[no-]refresh' and 'reset.refresh' are
introduced to provide a dedicated mechanism for refreshing the index. When
either is set, '--quiet' and 'reset.quiet' revert to controlling only
whether logs are silenced and do not affect index refresh.
Helped-by: Derrick Stolee <derrickstolee@github.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the advice describing index refresh from "enumerate unstaged changes"
to "refresh the index." Describing 'refresh_index(...)' as "enumerating
unstaged changes" is not fully representative of what an index refresh is
doing; more generally, it updates the properties of index entries that are
affected by outside-of-index state, e.g. CE_UPTODATE, which is affected by
the file contents on-disk. This distinction is relevant to operations that
read the index but do not refresh first - e.g., 'git read-tree' - where a
stale index may cause incorrect behavior.
In addition to changing the advice message, use the "advise" function to
print advice.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, git-repack(1) will update server info that is required by
the dumb HTTP transport. This can be skipped by passing the `-n` flag,
but what we're noticably missing is a config option to permanently
disable updating this information.
Add a new option "repack.updateServerInfo" which can be used to disable
the logic. Most hosting providers have turned off the dumb HTTP protocol
anyway, and on the client-side it woudln't typically be useful either.
Giving a persistent way to disable this feature thus makes quite some
sense to avoid wasting compute cycles and storage.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, git-repack(1) runs `update_server_info()` to generate info
required for the dumb HTTP protocol. This can be disabled via the `-n`
flag, which then sets the `no_update_server_info` flag. Further down the
code this leads to some double-negation logic, which is about to become
more confusing as we're about to add a new config which allows the user
to permanently disable generation of the info.
Refactor the code to avoid the double-negation and add some tests which
verify that the flag continues to work as expected.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug random memory leaks.
* ab/plug-random-leaks:
repository.c: free the "path cache" in repo_clear()
range-diff: plug memory leak in read_patches()
range-diff: plug memory leak in common invocation
lockfile API users: simplify and don't leak "path"
commit-graph: stop fill_oids_from_packs() progress on error and free()
commit-graph: fix memory leak in misused string_list API
submodule--helper: fix trivial leak in module_add()
transport: stop needlessly copying bundle header references
bundle: call strvec_clear() on allocated strvec
remote-curl.c: free memory in cmd_main()
urlmatch.c: add and use a *_release() function
diff.c: free "buf" in diff_words_flush()
merge-base: free() allocated "struct commit **" list
index-pack: fix memory leaks
Newer version of GPGSM changed its output in a backward
incompatible way to break our code that parses its output. It also
added more processes our tests need to kill when cleaning up.
Adjustments have been made to accommodate these changes.
* fs/gpgsm-update:
t/lib-gpg: kill all gpg components, not just gpg-agent
t/lib-gpg: reload gpg components after updating trustlist
gpg-interface/gpgsm: fix for v2.3
Check the return value from parse_tree_indirect() to turn segfaults
into calls to die().
* gc/parse-tree-indirect-errors:
checkout, clone: die if tree cannot be parsed
Align the level of verbose output from the ort backend during inner
merge to that of the recursive backend.
* en/merge-ort-align-verbosity-with-recursive:
merge-ort: exclude messages from inner merges by default
Makefile refactoring with a bit of suffixes rule stripping to
optimize the runtime overhead.
* ab/make-optim-noop:
Makefiles: add and use wildcard "mkdir -p" template
Makefile: add "$(QUIET)" boilerplate to shared.mak
Makefile: move $(comma), $(empty) and $(space) to shared.mak
Makefile: move ".SUFFIXES" rule to shared.mak
Makefile: define $(LIB_H) in terms of $(FIND_SOURCE_FILES)
Makefile: disable GNU make built-in wildcard rules
Makefiles: add "shared.mak", move ".DELETE_ON_ERROR" to it
scalar Makefile: use "The default target of..." pattern
"git fetch" can make two separate fetches, but ref updates coming
from them were in two separate ref transactions under "--atomic",
which has been corrected.
* ps/fetch-atomic:
fetch: make `--atomic` flag cover pruning of refs
fetch: make `--atomic` flag cover backfilling of tags
refs: add interface to iterate over queued transactional updates
fetch: report errors when backfilling tags fails
fetch: control lifecycle of FETCH_HEAD in a single place
fetch: backfill tags before setting upstream
fetch: increase test coverage of fetches
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.
The backquoted form is the traditional method for command
substitution, and is supported by POSIX. However, all but the
simplest uses become complicated quickly. In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.
The patch was generated by:
for _f in $(find . -name "*.sh")
do
shellcheck -i SC2006 -f diff ${_f} | ifne git apply -p2
done
and then carefully proof-read.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a commit in a sequence of linear history has a non-monotonically
increasing commit timestamp, git name-rev might not properly name the
commit.
This occurs because name-rev uses a heuristic of the commit date to
avoid searching down tags which lead to commits that are older than the
named commit. This is intended to avoid work on larger repositories.
This heuristic impacts git name-rev, and by extension git describe
--contains which is built on top of name-rev.
Further more, if --all or --annotate-stdin is used, the heuristic is not
enabled because the full history has to be analyzed anyways. This
results in some confusion if a user sees that --annotate-stdin works but
a normal name-rev does not.
If the repository has a commit graph, we can use the generation numbers
instead of using the commit dates. This is essentially the same check
except that generation numbers make it exact, where the commit date
heuristic could be incorrect due to clock errors.
Since we're extending the notion of cutoff to more than one variable,
create a series of functions for setting and checking the cutoff. This
avoids duplication and moves access of the global cutoff and
generation_cutoff to as few functions as possible.
Add several test cases including a test that covers the new commitGraph
behavior, as well as tests for --all and --annotate-stdin with and
without commitGraphs.
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>