"git commit --fixup" now works with "--edit" again, after it was
broken in v2.32.
* jk/commit-edit-fixup-fix:
commit: restore --edit when combined with --fixup
The revision traversal API has been optimized by taking advantage
of the commit-graph, when available, to determine if a commit is
reachable from any of the existing refs.
* ps/connectivity-optim:
revision: avoid hitting packfiles when commits are in commit-graph
commit-graph: split out function to search commit position
revision: stop retrieving reference twice
connected: do not sort input revisions
revision: separate walk and unsorted flags
When executing git-update-ref(1) with the `--stdin` flag, then the user
can queue updates and, since e48cf33b61 (update-ref: implement
interactive transaction handling, 2020-04-02), interactively drive the
transaction's state via a set of transactional verbs. This interactivity
is somewhat broken though: while the caller can use these verbs to drive
the transaction's state, the status messages which confirm that a verb
has been processed is not flushed. The caller may thus be left hanging
waiting for the acknowledgement.
Fix the bug by flushing stdout after writing the status update. Add a
test which catches this bug.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the trailing dot from the warning we emit about gc.log. It's
common for various terminal UX's to allow the user to select "words",
and by including the trailing dot a user wanting to select the path to
gc.log will need to manually remove the trailing dot.
Such a user would also probably need to adjust the path if it e.g. had
spaces in it, but this should address this very common case.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Suggested-by: Jan Judas <snugar.i@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment
variable to also write a multi-pack bitmap when
'GIT_TEST_MULTI_PACK_INDEX' is set.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Write multi-pack bitmaps in the format described by
Documentation/technical/bitmap-format.txt, inferring their presence with
the absence of '--bitmap'.
To write a multi-pack bitmap, this patch attempts to reuse as much of
the existing machinery from pack-objects as possible. Specifically, the
MIDX code prepares a packing_data struct that pretends as if a single
packfile has been generated containing all of the objects contained
within the MIDX.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This prepares the code in pack-bitmap to interpret the new multi-pack
bitmaps described in Documentation/technical/bitmap-format.txt, which
mostly involves converting bit positions to accommodate looking them up
in a MIDX.
Note that there are currently no writers who write multi-pack bitmaps,
and that this will be implemented in the subsequent commit. Note also
that get_midx_checksum() and get_midx_filename() are made non-static so
they can be called from pack-bitmap.c.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Opening multiple instance of the same MIDX can lead to problems like two
separate packed_git structures which represent the same pack being added
to the repository's object store.
The above scenario can happen because prepare_midx_pack() checks if
`m->packs[pack_int_id]` is NULL in order to determine if a pack has been
opened and installed in the repository before. But a caller can
construct two copies of the same MIDX by calling get_multi_pack_index()
and load_multi_pack_index() since the former manipulates the
object store directly but the latter is a lower-level routine which
allocates a new MIDX for each call.
So if prepare_midx_pack() is called on multiple MIDXs with the same
pack_int_id, then that pack will be installed twice in the object
store's packed_git pointer.
This can lead to problems in, for e.g., the pack-bitmap code, which does
something like the following (in pack-bitmap.c:open_pack_bitmap()):
struct bitmap_index *bitmap_git = ...;
for (p = get_all_packs(r); p; p = p->next) {
if (open_pack_bitmap_1(bitmap_git, p) == 0)
ret = 0;
}
which is a problem if two copies of the same pack exist in the
packed_git list because pack-bitmap.c:open_pack_bitmap_1() contains a
conditional like the following:
if (bitmap_git->pack || bitmap_git->midx) {
/* ignore extra bitmap file; we can only handle one */
warning("ignoring extra bitmap file: %s", packfile->pack_name);
close(fd);
return -1;
}
Avoid this scenario by not letting write_midx_internal() open a MIDX
that isn't also pointed at by the object store. So long as this is the
case, other routines should prefer to open MIDXs with
get_multi_pack_index() or reprepare_packed_git() instead of creating
instances on their own. Because get_multi_pack_index() returns
`r->object_store->multi_pack_index` if it is non-NULL, we'll only have
one instance of a MIDX open at one time, avoiding these problems.
To encourage this, drop the `struct multi_pack_index *` parameter from
`write_midx_internal()`, and rely instead on the `object_dir` to find
(or initialize) the correct MIDX instance.
Likewise, replace the call to `close_midx()` with
`close_object_store()`, since we're about to replace the MIDX with a new
one and should invalidate the object store's memory of any MIDX that
might have existed beforehand.
Note that this now forbids passing object directories that don't belong
to alternate repositories over `--object-dir`, since before we would
have happily opened a MIDX in any directory, but now restrict ourselves
to only those reachable by `r->objects->multi_pack_index` (and alternate
MIDXs that we can see by walking the `next` pointer).
As far as I can tell, supporting arbitrary directories with
`--object-dir` was a historical accident, since even the documentation
says `<alt>` when referring to the value passed to this option.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching refs, we are doing two connectivity checks:
- The first one is done such that we can skip fetching refs in the
case where we already have all objects referenced by the updated
set of refs.
- The second one verifies that we have all objects after we have
fetched objects.
We always execute both connectivity checks, but this is wasteful in case
the first connectivity check already notices that we have all objects
locally available.
Skip the second connectivity check in case we already had all objects
available. This gives us a nice speedup when doing a mirror-fetch in a
repository with about 2.3M refs where the fetching repo already has all
objects:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 30.025 s ± 0.081 s [User: 27.070 s, System: 4.933 s]
Range (min … max): 29.900 s … 30.111 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 25.574 s ± 0.177 s [User: 22.855 s, System: 4.683 s]
Range (min … max): 25.399 s … 25.765 s 5 runs
Summary
'HEAD: git-fetch' ran
1.17 ± 0.01 times faster than 'HEAD~: git-fetch'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functions `fetch_refs()` and `consume_refs()` must always be called
together such that we first obtain all missing objects and then update
our local refs to match the remote refs. In a subsequent patch, we'll
further require that `fetch_refs()` must always be called before
`consume_refs()` such that it can correctly assert that we have all
objects after the fetch given that we're about to move the connectivity
check.
Make this requirement explicit by merging both functions into a single
`fetch_and_consume_refs()` function.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor `fetch_refs()` code to make it more extendable by explicitly
handling error cases. The refactored code should behave the same.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The object ID iterator used by the connectivity checks returns the next
object ID via an out-parameter and then uses a return code to indicate
whether an item was found. This is a bit roundabout: instead of a
separate error code, we can just return the next object ID directly and
use `NULL` pointers as indicator that the iterator got no items left.
Furthermore, this avoids a copy of the object ID.
Refactor the iterator and all its implementations to return object IDs
directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs:
Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch
Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s]
Range (min … max): 29.934 s … 30.406 s 10 runs
Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch
Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s]
Range (min … max): 29.696 s … 29.996 s 10 runs
Summary
'328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran
1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch'
While this 1% speedup could be labelled as statistically insignificant,
the speedup is consistent on my machine. Furthermore, this is an end to
end test, so it is expected that the improvement in the connectivity
check itself is more significant.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When updating local refs after the fetch has transferred all objects, we
do an object existence test as a safety guard to avoid updating a ref to
an object which we don't have. We do so via `oid_object_info()`: if it
returns an error, then we know the object does not exist.
One side effect of `oid_object_info()` is that it parses the object's
type, and to do so it must unpack the object header. This is completely
pointless: we don't care for the type, but only want to assert that the
object exists.
Refactor the code to use `repo_has_object_file()`, which both makes the
code's intent clearer and is also faster because it does not unpack
object headers. In a real-world repo with 2.3M refs, this results in a
small speedup when doing a mirror-fetch:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 33.686 s ± 0.176 s [User: 30.119 s, System: 5.262 s]
Range (min … max): 33.512 s … 33.944 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 31.247 s ± 0.195 s [User: 28.135 s, System: 5.066 s]
Range (min … max): 30.948 s … 31.472 s 5 runs
Summary
'HEAD: git-fetch' ran
1.08 ± 0.01 times faster than 'HEAD~: git-fetch'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When updating our local refs based on the refs fetched from the remote,
we need to iterate through all requested refs and load their respective
commits such that we can determine whether they need to be appended to
FETCH_HEAD or not. In cases where we're fetching from a remote with
exceedingly many refs, resolving these refs can be quite expensive given
that we repeatedly need to unpack object headers for each of the
referenced objects.
Speed this up by opportunistically trying to resolve object IDs via the
commit graph. We only do so for any refs which are not in "refs/tags":
more likely than not, these are going to be a commit anyway, and this
lets us avoid having to unpack object headers completely in case the
object is a commit that is part of the commit-graph. This significantly
speeds up mirror-fetches in a real-world repository with
2.3M refs:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 56.482 s ± 0.384 s [User: 53.340 s, System: 5.365 s]
Range (min … max): 56.050 s … 57.045 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 33.727 s ± 0.170 s [User: 30.252 s, System: 5.194 s]
Range (min … max): 33.452 s … 33.871 s 5 runs
Summary
'HEAD: git-fetch' ran
1.67 ± 0.01 times faster than 'HEAD~: git-fetch'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 7f40759496 (fast-export: tighten anonymize_mem() interface to
handle only strings, 2020-06-23) changed the interface used in anonymizing
strings, but failed to update the size of annotated tag messages to match
the new anonymized string.
As a result, exporting tags having messages longer than 13 characters
would create output that couldn't be parsed by fast-import,
as the data length indicated was larger than the data output.
Reset the message size when anonymizing, and add a tag with a "long"
message to the test.
Signed-off-by: Tal Kelrich <hasturkun@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bring the "commit-graph" command in line with the error output and
general pattern in cmd_multi_pack_index().
Let's test for that output, and also cover the same potential bug as
was fixed in the multi-pack-index command in
88617d11f9 (multi-pack-index: fix potential segfault without
sub-command, 2021-07-19).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the parse_options() invocation in the commit-graph code to
error on unknown leftover argv elements, in addition to the existing
and implicit erroring via parse_options() on unknown options.
We'd already error in cmd_commit_graph() on e.g.:
git commit-graph unknown verify
git commit-graph --unknown verify
But here we're calling parse_options() twice more for the "write" and
"verify" subcommands. We did not do the same checking for leftover
argv elements there. As a result we'd silently accept garbage in these
subcommands, let's not do that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than guarding all of the !argc with an additional "if" arm
let's do an early goto to "usage". This also makes it clear that
"save_commit_buffer" is not needed in this case.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the "goto usage" pattern added in
cd57bc41bb (builtin/multi-pack-index.c: display usage on unrecognized
command, 2021-03-30) and 88617d11f9 (multi-pack-index: fix potential
segfault without sub-command, 2021-07-19) to maintain the same
brevity, but in a form that doesn't run afoul of the recommendation in
CodingGuidelines about braces:
When there are multiple arms to a conditional and some of them
require braces, enclose even a single line block in braces for
consistency[...]
Let's also change "argv == 0" to juts "!argv", per:
Do not explicitly compare an integral value with constant 0 or
'\0', or a pointer value with constant NULL[...]
I'm changing this because in a subsequent commit I'll make
builtin/commit-graph.c use the same pattern, having the two similarly
structured commands match aids readability.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make use of the parse_options_concat() so we don't need to copy/paste
common options like --object-dir.
This is inspired by a similar change to "checkout" in 2087182272
(checkout: split options[] array in three pieces, 2019-03-29), and the
same pattern in the multi-pack-index command, see
60ca94769c (builtin/multi-pack-index.c: split sub-commands,
2021-03-30).
A minor behavior change here is that now we're going to list both
--object-dir and --progress first, before we'd list --progress along
with other options.
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we don't handle the -h option here like most parse_options() users
we'll fall through and it'll do the right thing for us.
I think this code added in 4ce58ee38d (commit-graph: create
git-commit-graph builtin, 2018-04-02) was always redundant,
parse_options() did this at the time, and the commit-graph code never
used PARSE_OPT_NO_INTERNAL_HELP.
We don't need a test for this, it's tested by the t0012-help.sh test
added in d691551192 (t0012: test "-h" with builtins, 2017-05-30).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Share the usage message between these three variables by using a
macro. Before this new options needed to copy/paste the usage
information, see e.g. 809e0327f5 (builtin/commit-graph.c: introduce
'--max-new-filters=<n>', 2020-09-18).
See b25b727494 (builtin/multi-pack-index.c: define common usage with
a macro, 2021-03-30) for another use of this pattern (but on-list this
one came first).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Silently skipping commits when rebasing with --no-reapply-cherry-picks
(currently the default behavior) can cause user confusion. Issue
warnings when this happens, as well as advice on how to preserve the
skipped commits.
These warnings and advice are displayed only when using the (default)
"merge" rebase backend.
Update the git-rebase docs to mention the warnings and advice.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use `ort` instead of `recursive` as the default merge strategy.
* en/ort-becomes-the-default:
Update docs for change of default merge backend
Change default merge backend from recursive to ort
Documentation updates.
* en/merge-strategy-docs:
Update error message and code comment
merge-strategies.txt: add coverage of the `ort` merge strategy
git-rebase.txt: correct out-of-date and misleading text about renames
merge-strategies.txt: fix simple capitalization error
merge-strategies.txt: avoid giving special preference to patience algorithm
merge-strategies.txt: do not imply using copy detection is desired
merge-strategies.txt: update wording for the resolve strategy
Documentation: edit awkward references to `git merge-recursive`
directory-rename-detection.txt: small updates due to merge-ort optimizations
git-rebase.txt: correct antiquated claims about --rebase-merges
"git pull" had various corner cases that were not well thought out
around its --rebase backend, e.g. "git pull --ff-only" did not stop
but went ahead and rebased when the history on other side is not a
descendant of our history. The series tries to fix them up.
* en/pull-conflicting-options:
pull: fix handling of multiple heads
pull: update docs & code for option compatibility with rebasing
pull: abort by default when fast-forwarding is not possible
pull: make --rebase and --no-rebase override pull.ff=only
pull: since --ff-only overrides, handle it first
pull: abort if --ff-only is given and fast-forwarding is impossible
t7601: add tests of interactions with multiple merge heads and config
t7601: test interaction of merge/rebase/fast-forward flags and options
Based on current experience, when running git clone --recurse-submodules,
developers do not expect other commands such as pull or checkout to run
recursively into active submodules. However, setting submodule.recurse=true
at this step could make for a simpler workflow by eliminating the need for
the --recurse-submodules option in subsequent commands. To collect more
data on developers' preference in regards to making submodule.recurse=true
a default config value in the future, deploy this feature under the opt in
submodule.stickyRecursiveClone flag.
Signed-off-by: Mahi Kolla <mkolla2@illinois.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching, Git will by default print a list of all updated refs in a
nicely formatted table. In order to come up with this table, Git needs
to iterate refs twice: first to determine the maximum column width, and
a second time to actually format these changed refs.
While this table will not be printed in case the user passes `--quiet`,
we still go out of our way and do all these steps. In fact, we even do
more work compared to not passing `--quiet`: without the flag, we will
skip all references in the column width computation which have not been
updated, but if it is set we will now compute widths for all refs.
Fix this issue by completely skipping both preparation of the format and
formatting data for display in case the user passes `--quiet`, improving
performance especially with many refs. The following benchmark shows a
nice speedup for a quiet mirror-fetch in a repository with 2.3M refs:
Benchmark #1: HEAD~: git-fetch
Time (mean ± σ): 26.929 s ± 0.145 s [User: 24.194 s, System: 4.656 s]
Range (min … max): 26.692 s … 27.068 s 5 runs
Benchmark #2: HEAD: git-fetch
Time (mean ± σ): 25.189 s ± 0.094 s [User: 22.556 s, System: 4.606 s]
Range (min … max): 25.070 s … 25.314 s 5 runs
Summary
'HEAD: git-fetch' ran
1.07 ± 0.01 times faster than 'HEAD~: git-fetch'
While at it, this patch also fixes `adjust_refcol_width()` such that it
skips unchanged refs in case the user passed `--quiet`, where verbosity
will be negative. While this function won't be called anymore if so,
this brings the comment in line with actual code. Furthermore, needless
`verbosity >= 0` checks are now removed in `store_updated_refs()`: we
never print to the `note` buffer anymore in case `verbosity < 0`, so we
won't end up in that code block anyway.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the original code from 08cdfb1337 (pack-objects --keep-unreachable,
2007-09-16), we add each object to the packing list with type
`obj->type`, where `obj` comes from `lookup_unknown_object()`. Unless we
had already looked up and parsed the object, this will be `OBJ_NONE`.
That's fine, since oe_set_type() sets the type_valid bit to '0', and we
determine the real type later on.
So the only thing we need from the object lookup is access to the
`flags` field so that we can mark that we've added the object with
`OBJECT_ADDED` to avoid adding it again (we can just pass `OBJ_NONE`
directly instead of grabbing it from the object).
But add_object_entry() already rejects duplicates! This has been the
behavior since 7a979d99ba (Thin pack - create packfile with missing
delta base., 2006-02-19), but 08cdfb1337 didn't take advantage of it.
Moreover, to do the OBJECT_ADDED check, we have to do a hash lookup in
`obj_hash`.
So we can drop the lookup_unknown_object() call completely, *and* the
OBJECT_ADDED flag, too, since the spot we're touching here is the only
location that checks it.
In the end, we perform the same number of hash lookups, but with the
added bonus that we don't waste memory allocating an OBJ_NONE object (if
we were traversing, we'd need it eventually, but the whole point of this
code path is not to traverse).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is used to implement `pack-objects`'s `--keep-unreachable`
option, but can be simplified in a couple of ways:
- add_objects_in_unpacked_packs() iterates over all packs (and then
all packed objects) itself, but could use for_each_packed_object()
instead since the missing flags necessary were added in the previous
commit
- objects are added to an in_pack array which store (off_t, object)
tuples, and then sorted in offset order when we could iterate
objects in offset order.
There is a slight behavior change here: before we would have added
objects in sorted offset order among _all_ packs. Handing objects to
create_object_entry() in pack order for each pack (instead of
feeding objects from all packs simultaneously their offset relative
to different packs) is much more reasonable, if different than how
the code currently works.
- objects in a single pack are iterated in index order and searched
for in order to discover their offsets, which is much less efficient
than using the on-disk reverse index
Simplify the function by addressing each of the above and moving the
core of the loop into a callback function that we then pass to
for_each_packed_object() instead of open-coding the latter function
ourselves.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git branch only allows deleting branches that point to valid commits.
Skip that check if --force is given, as the caller is indicating with
it that they know what they are doing and accept the consequences.
This allows deleting dangling branches, which previously had to be
reset to a valid start-point using --force first.
Reported-by: Ulrich Windl <Ulrich.Windl@rz.uni-regensburg.de>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only one of the callers of rev_is_head() provides two hashes to compare.
Move that check there and convert it to struct object_id.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'Filtering contents...' progress report from delayed checkout is
displayed even when checkout and clone are invoked with --quiet or
--no-progress. Furthermore, it is displayed unconditionally, without
first checking whether stdout is a tty. Let's fix these issues and also
add some regression tests for the two code paths that currently use
delayed checkout: unpack_trees.c:check_updates() and
builtin/checkout.c:checkout_worktree().
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git column's '--nl' option can be used to specify a "string to be
printed at the end of each line" (quoting the man page), but this
option and its mandatory argument has been parsed as OPT_INTEGER since
the introduction of the command in 7e29b8254f (Add column layout
skeleton and git-column, 2012-04-21). Consequently, any non-number
argument is rejected by parse-options, and any number other than 0
leads to segfault:
$ printf "%s\n" one two |git column --mode=plain --nl=foo
error: option `nl' expects a numerical value
$ printf "%s\n" one two |git column --mode=plain --nl=42
Segmentation fault (core dumped)
$ printf "%s\n" one two |git column --mode=plain --nl=0
one
two
Parse this option as OPT_STRING.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add and apply a semantic patch for using xopen() instead of calling
open(2) and die() or die_errno() explicitly. This makes the error
messages more consistent and shortens the code.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since the the preceding commit the "oid" parameter to reflog_expire()
is always NULL, but it was not cleaned up to reduce the size of the
diff. Let's do that subsequent API and documentation cleanup now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During reflog expiry, the cmd_reflog_expire() function first iterates
over all reflogs in logs/*, and then one-by-one acquires the lock for
each one and expires it. This behavior has been with us since this
command was implemented in 4264dc15e1 ("git reflog expire",
2006-12-19).
Change this to stop calling lock_ref_oid_basic() with the OID we saw
when we looped over the logs, instead have it pass the OID it managed
to lock.
This mostly mitigates a race condition where e.g. "git gc" will fail
in a concurrently updated repository because the branch moved since
"git reflog expire --all" was started. I.e. with:
error: cannot lock ref '<refname>': ref '<refname>' is at <OID-A> but expected <OID-B>
This behavior of passing in an "oid" was needed for an edge-case that
I've untangled in this and preceding commits though, namely that we
needed this OID because we'd:
1. Lookup the reflog name/OID via dwim_log()
2. With that OID, lock the reflog
3. Later in builtin/reflog.c we use the OID we looked as input to
lookup_commit_reference_gently(), assured that it's equal to the
OID we got from dwim_log().
We can be sure that this change is safe to make because between
dwim_log (step #1) and lock_ref_oid_basic (step #2) there was no other
logic relevant to the OID or expiry run in the cmd_reflog_expire()
caller.
We can thus treat that code as a black box, before and after this
change it would get an OID that's been locked, the only difference is
that now we mostly won't be failing to get the lock due to the TOCTOU
race[0]. That failure was purely an implementation detail in how the
"current OID" was looked up, it was divorced from the locking
mechanism.
What do we mean with "mostly"? It mostly mitigates it because we'll
still run into cases where the ref is locked and being updated as we
want to expire it, and other git processes wanting to update the refs
will in turn race with us as we expire the reflog.
That remaining race can in turn be mitigated with the
core.filesRefLockTimeout setting, see 4ff0f01cb7 ("refs: retry
acquiring reference locks for 100ms", 2017-08-21). In practice if that
value is high enough we'll probably never have ref updates or reflog
expiry failing, since the clients involved will retry for far longer
than the time any of those operations could take.
See [1] for an initial report of how this impacted "git gc" and a
large discussion about this change in early 2019. In particular patch
looked good to Michael Haggerty, see his[2]. That message seems to not
have made it to the ML archive, its content is quoted in full in my
[3].
I'm leaving behind now-unused code the refs API etc. that takes the
now-NULL "unused_oid" argument, and other code that can be simplified now
that we never have on OID in that context, that'll be cleaned up in
subsequent commits, but for now let's narrowly focus on fixing the
"git gc" issue. As the modified assert() shows we always pass a NULL
oid to reflog_expire() now.
Unfortunately this sort of probabilistic contention is hard to turn
into a test. I've tested this by running the following three subshells
in concurrent terminals:
(
rm -rf /tmp/git &&
git init /tmp/git &&
while true
do
head -c 10 /dev/urandom | hexdump >/tmp/git/out &&
git -C /tmp/git add out &&
git -C /tmp/git commit -m"out"
done
)
(
rm -rf /tmp/git-clone &&
git clone file:///tmp/git /tmp/git-clone &&
while git -C /tmp/git-clone pull
do
date
done
)
(
while git -C /tmp/git-clone reflog expire --all
do
date
done
)
Before this change the "reflog expire" would fail really quickly with
the "but expected" error noted above.
After this change both the "pull" and "reflog expire" will run for a
while, but eventually fail because I get unlucky with
core.filesRefLockTimeout (the "reflog expire" is in a really tight
loop). As noted above that can in turn be mitigated with higher values
of core.filesRefLockTimeout than the 100ms default.
As noted in the commentary added in the preceding commit there's also
the case of branches being racily deleted, that can be tested by
adding this to the above:
(
while git -C /tmp/git-clone branch topic master &&
git -C /tmp/git-clone branch -D topic
do
date
done
)
With core.filesRefLockTimeout set to 10 seconds (it can probably be a
lot lower) I managed to run all four of these concurrently for about
an hour, and accumulated ~125k commits, auto-gc's and all, and didn't
have a single failure. The loops visibly stall while waiting for the
lock, but that's expected and desired behavior.
0. https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
1. https://lore.kernel.org/git/87tvg7brlm.fsf@evledraar.gmail.com/
2. http://lore.kernel.org/git/b870a17d-2103-41b8-3cbc-7389d5fff33a@alum.mit.edu
3. https://lore.kernel.org/git/87pnqkco8v.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the squashing of the advice.graftFileDeprecated advice over to an
external variable in commit.[ch], allowing advice() to purely use the
new-style API of invoking advice() with an enum.
See 8821e90a09 (advice: don't pointlessly suggest
--convert-graft-file, 2018-11-27) for why quieting this advice was
needed. It's more straightforward to move this code to commit.[ch] and
use it builtin/replace.c, than to go through the indirection of
advice.[ch].
Because this was the last advice_config variable we can remove that
old facility from advice.c.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The external use of this variable was added in 532139940c (add: warn
when adding an embedded repository, 2017-06-14). For the use-case it's
more straightforward to track whether we've shown advice in
check_embedded_repo() than setting the global variable.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c4a09cc9cc (Merge branch 'hw/advise-ng', 2020-03-25), a new API for
accessing advice variables was introduced and deprecated `advice_config`
in favor of a new array, `advice_setting`.
This patch ports all but two uses which read the status of the global
`advice_` variables over to the new `advice_enabled` API. We'll deal
with advice_add_embedded_repo and advice_graft_file_deprecated
separately.
Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bugfix for common ancestor negotiation recently introduced in "git
push" code path.
* jt/push-negotiation-fixes:
fetch: die on invalid --negotiation-tip hash
send-pack: fix push nego. when remote has refs
send-pack: fix push.negotiate with remote helper
Pathname expansion (like "~username/") learned a way to specify a
location relative to Git installation (e.g. its $sharedir which is
$(prefix)/share), with "%(prefix)".
* js/expand-runtime-prefix:
expand_user_path: allow in-flight topics to keep using the old name
interpolate_path(): allow specifying paths relative to the runtime prefix
Use a better name for the function interpolating paths
expand_user_path(): clarify the role of the `real_home` parameter
expand_user_path(): remove stale part of the comment
tests: exercise the RUNTIME_PREFIX feature
Prepare the "ref-filter" machinery that drives the "--format"
option of "git for-each-ref" and its friends to be used in "git
cat-file --batch".
* zh/ref-filter-raw-data:
ref-filter: add %(rest) atom
ref-filter: use non-const ref_format in *_atom_parser()
ref-filter: --format=%(raw) support --perl
ref-filter: add %(raw) atom
ref-filter: add obj-type check in grab contents
Input validation of "git pack-objects --stdin-packs" has been
corrected.
* ab/pack-stdin-packs-fix:
pack-objects: fix segfault in --stdin-packs option
pack-objects tests: cover blindspots in stdin handling
"git add" can work better with the sparse index.
* ds/add-with-sparse-index:
add: remove ensure_full_index() with --renormalize
add: ignore outside the sparse-checkout in refresh()
pathspec: stop calling ensure_full_index
add: allow operating on a sparse-only index
t1092: test merge conflicts outside cone
The die() routine adds a "fatal: " prefix, there is no reason to add
another one. Fixes code added in e65123a71d (builtin rebase: support
`git rebase <upstream> <switch-to>`, 2018-09-04).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Set packet_trace_identity() for ls-remote. This replaces the generic
"git" identity in GIT_TRACE_PACKET=<file> traces to "ls-remote", e.g.:
[...] packet: upload-pack> version 2
[...] packet: upload-pack> agent=git/2.32.0-dev
[...] packet: ls-remote< version 2
[...] packet: ls-remote< agent=git/2.32.0-dev
Where in an "git ls-remote file://<path>" dialog ">" is the sender (or
"to the server") and "<" is the recipient (or "received by the
client").
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On macOS, we use launchctl to manage the background maintenance
schedule. This uses a set of .plist files to describe the schedule, but
these files are also registered with 'launchctl bootstrap'. If multiple
'git maintenance start' commands run concurrently, then they can collide
replacing these schedule files and registering them with launchctl.
To avoid extra launchctl commands, do a check for the .plist files on
disk and check if they are registered using 'launchctl list <name>'.
This command will return with exit code 0 if it exists, or exit code 113
if it does not.
We can test this behavior using the GIT_TEST_MAINT_SCHEDULER environment
variable.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When two `git maintenance` processes try to write the `.plist` file, we
need to help them with serializing their efforts.
The 150ms time-out value was determined from thin air.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new submodule--helper subcommand `run-update-procedure` that runs
the update procedure if the SHA1 of the submodule does not match what
the superproject expects.
This is an intermediate change that works towards total conversion of
`submodule update` from shell to C.
Specific error codes are returned so that the shell script calling the
subcommand can take a decision on the control flow, and preserve the
error messages across subsequent recursive calls of `cmd_update`.
This change is more focused on doing a faithful conversion, so for now we
are not too concerned with trying to reduce subprocess spawns.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The set of objects covered by a bitmap must be closed under
reachability, since it must be the case that there is a valid bit
position assigned for every possible reachable object (otherwise the
bitmaps would be incomplete).
Pack bitmaps are never written from 'git repack' unless repacking
all-into-one, and so we never write non-closed bitmaps (except in the
case of partial clones where we aren't guaranteed to have all objects).
But multi-pack bitmaps change this, since it isn't known whether the
set of objects in the MIDX is closed under reachability until walking
them. Plumb through a bit that is set when a reachable object isn't
found.
As soon as a reachable object isn't found in the set of objects to
include in the bitmap, bitmap_writer_build() knows that the set is not
closed, and so it now fails gracefully.
A test is added in t0410 to trigger a bitmap write without full
reachability closure by removing local copies of some reachable objects
from a promisor remote.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent changes to --fixup, adding amend suboption, caused the
--edit flag to be ignored as use_editor was always set to zero.
Restore edit_flag having higher priority than fixup_message when
deciding the value of use_editor by moving the edit flag condition
later in the method.
Signed-off-by: Joel Klinghed <the_jk@spawned.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ds/add-with-sparse-index:
add: remove ensure_full_index() with --renormalize
add: ignore outside the sparse-checkout in refresh()
pathspec: stop calling ensure_full_index
add: allow operating on a sparse-only index
t1092: test merge conflicts outside cone
Let's rename 'compute_submodule_clone_url()' to 'resolve_relative_url()'
to make it clear that this internal helper need not be used exclusively
for computing submodule clone URLs.
Since the original 'resolve-relative-url' subcommand and its C entry
point has been removed in c461095ae3 (submodule--helper: remove
resolve-relative-url subcommand, 2021-07-02), this rename can be done
without causing any confusion about which function it actually binds to.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The shell subcommand `resolve-relative-url` is no longer required, as
its last caller has been removed when it was converted to C.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also no longer needed is this subcommand, as all of its functionality is
being called by the newly-introduced `module_add()` directly within C.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We no longer need this subcommand, as all of its functionality is being
called by the newly-introduced `module_add()` directly within C.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the 'add' subcommand to `submodule--helper.c` that does all
the work 'submodule add' past the parsing of flags.
We also remove the constness of the sm_path field of the `add_data`
struct. This is needed so that it can be modified by
normalize_path_copy().
As with the previous conversions, this is meant to be a faithful
conversion with no modification to the behaviour of `submodule add`.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions can be useful to other parts of Git. Let's move them to
dir.c, while renaming them to be make their functionality more explicit.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This part of `sync_submodule()` is doing the same thing that
`compute_submodule_clone_url()` is doing. Let's reuse that helper here.
Note that this change adds a small overhead where we allocate and free
the 'remote' twice, but that is a small price to pay for the higher
level of abstraction we get.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the helper function to resolve a relative url, by reusing the
existing `compute_submodule_clone_url()` function.
`compute_submodule_clone_url()` performs the same work that
`resolve_relative_url()` is doing, so we eliminate this code repetition
by moving the former function's definition up, and calling it inside
`resolve_relative_url()`.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's modify the interface to `compute_submodule_clone_url()` function
by adding two more arguments, so that we can reuse this in various parts
of `submodule--helper.c` that follow a common pattern, which is--read
the remote url configuration of the superproject and then call
`relative_url()`.
This function is nearly identical to `resolve_relative_url()`, the only
difference being the extra warning message. We can add a quiet flag to
it, to suppress that warning when not needed, and then refactor
`resolve_relative_url()` by using this function, something we will do in
the next patch.
We also rename the local variable 'relurl' to avoid potential confusion
with the 'rel_url' parameter while we are at it.
Having this functionality factored out will be useful for converting the
rest of `submodule add` in subsequent patches.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new "add-config" subcommand to `git submodule--helper` with the
goal of converting part of the shell code in git-submodule.sh related to
`git submodule add` into C code. This new subcommand sets the
configuration variables of a newly added submodule, by registering the
url in local git config, as well as the submodule name and path in the
.gitmodules file. It also sets 'submodule.<name>.active' to "true" if
the submodule path has not already been covered by any pathspec
specified in 'submodule.active'.
This is meant to be a faithful conversion from shell to C, although we
add comments to areas that could be improved in future patches, after
the conversion has settled.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
d540b70c85 (merge: cleanup messages like commit, 2019-04-17) adds
a way to change part of the helper text using a single call to
strbuf_add_commented_addf but with two formats with varying number
of parameters.
this trigger a warning in old versions of Xcode (ex 8.0), so use
instead two independent calls with a matching number of parameters
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few reasons to switch the default:
* Correctness
* Extensibility
* Performance
I'll provide some summaries about each.
=== Correctness ===
The original impetus for a new merge backend was to fix issues that were
difficult to fix within recursive's design. The success with this goal
is perhaps most easily demonstrated by running the following:
$ git grep -2 KNOWN_FAILURE t/ | grep -A 4 GIT_TEST_MERGE_ALGORITHM
$ git grep test_expect_merge_algorithm.failure.success t/
$ git grep test_expect_merge_algorithm.success.failure t/
In order, these greps show:
* Seven sets of submodule tests (10 total tests) that fail with
recursive but succeed with ort
* 22 other tests that fail with recursive, but succeed with ort
* 0 tests that pass with recursive, but fail with ort
=== Extensibility ===
Being able to perform merges without touching the working tree or index
makes it possible to create new features that were difficult with the
old backend:
* Merging, cherry-picking, rebasing, reverting in bare repositories...
or just on branches that aren't checked out.
* `git diff AUTO_MERGE` -- ability to see what changes the user has
made to resolve conflicts so far (see commit 5291828df8 ("merge-ort:
write $GIT_DIR/AUTO_MERGE whenever we hit a conflict", 2021-03-20)
* A --remerge-diff option for log/show, used to show diffs for merges
that display the difference between what an automatic merge would
have created and what was recorded in the merge. (This option will
often result in an empty diff because many merges are clean, but for
the non-clean ones it will show how conflicts were fixed including
the removal of conflict markers, and also show additional changes
made outside of conflict regions to e.g. fix semantic conflicts.)
* A --remerge-diff-only option for log/show, similar to --remerge-diff
but also showing how cherry-picks or reverts differed from what an
automatic cherry-pick or revert would provide.
The last three have been implemented already (though only one has been
submitted upstream so far; the others were waiting for performance work
to complete), and I still plan to implement the first one.
=== Performance ===
I'll quote from the summary of my final optimization for merge-ort
(while fixing the testcase name from 'no-renames' to 'few-renames'):
Timings
Infinite
merge- merge- Parallelism
recursive recursive of rename merge-ort
v2.30.0 current detection current
---------- --------- ----------- ---------
few-renames: 18.912 s 18.030 s 11.699 s 198.3 ms
mega-renames: 5964.031 s 361.281 s 203.886 s 661.8 ms
just-one-mega: 149.583 s 11.009 s 7.553 s 264.6 ms
Speedup factors
Infinite
merge- merge- Parallelism
recursive recursive of rename
v2.30.0 current detection merge-ort
---------- --------- ----------- ---------
few-renames: 1 1.05 1.6 95
mega-renames: 1 16.5 29 9012
just-one-mega: 1 13.6 20 565
And, for partial clone users:
Factor reduction in number of objects needed
Infinite
merge- merge- Parallelism
recursive recursive of rename
v2.30.0 current detection merge-ort
---------- --------- ----------- ---------
mega-renames: 1 1 1 181.3
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--no-walk` flag supports two modes: either it sorts the revisions
given as input input or it doesn't. This is reflected in a single
`no_walk` flag, which reflects one of the three states "walk", "don't
walk but without sorting" and "don't walk but with sorting".
Split up the flag into two separate bits, one indicating whether we
should walk or not and one indicating whether the input should be sorted
or not. This will allow us to more easily introduce a new flag
`--unsorted-input`, which only impacts the sorting bit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --advertise-refs documentation in git-upload-pack added in
9812f2136b (upload-pack.c: use parse-options API, 2016-05-31) hasn't
been entirely true ever since v2 support was implemented in
e52449b672 (connect: request remote refs using v2, 2018-03-15). Under
v2 we don't advertise the refs at all, but rather dump the
capabilities header.
This option has always been an obscure internal implementation detail,
it wasn't even documented for git-receive-pack. Since it has exactly
one user let's rename it to --http-backend-info-refs, which is more
accurate and points the reader in the right direction. Let's also
cross-link this from the protocol v1 and v2 documentation.
I'm retaining a hidden --advertise-refs alias in case there's any
external users of this, and making both options hidden to the bash
completion (as with most other internal-only options).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "advertise capabilities" mode of serve.c added in
ed10cb952d (serve: introduce git-serve, 2018-03-15) is only used by
the http-backend.c to call {upload,receive}-pack with the
--advertise-refs parameter. See 42526b478e (Add stateless RPC options
to upload-pack, receive-pack, 2009-10-30).
Let's just make cmd_upload_pack() take the two (v2) or three (v2)
parameters the the v2/v1 servicing functions need directly, and pass
those in via the function signature. The logic of whether daemon mode
is implied by the timeout belongs in the v1 function (only used
there).
Once we split up the "advertise v2 refs" from "serve v2 request" it
becomes clear that v2 never cared about those in combination. The only
time it mattered was for v1 to emit its ref advertisement, in that
case we wanted to emit the smart-http-only "no-done" capability.
Since we only do that in the --advertise-refs codepath let's just have
it set "do_done" itself in v1's upload_pack() just before send_ref(),
at that point --advertise-refs and --stateless-rpc in combination are
redundant (the only user is get_info_refs() in http-backend.c), so we
can just pass in --advertise-refs only.
Since we need to touch all the serve() and advertise_capabilities()
codepaths let's rename them to less clever and obvious names, it's
been suggested numerous times, the latest of which is [1]'s suggestion
for protocol_v2_serve_loop(). Let's go with that.
1. https://lore.kernel.org/git/CAFQ2z_NyGb8rju5CKzmo6KhZXD0Dp21u-BbyCb2aNxLEoSPRJw@mail.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were two locations in the code that referred to 'merge-recursive'
but which were also applicable to 'merge-ort'. Update them to more
general wording.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The local changes stashed by "git merge --autostash" were lost when
the merge failed in certain ways, which has been corrected.
* pb/merge-autostash-more:
merge: apply autostash if merge strategy fails
merge: apply autostash if fast-forward fails
Documentation: define 'MERGE_AUTOSTASH'
merge: add missing word "strategy" to a message
Leak plugging.
* ah/plugleaks:
reset: clear_unpack_trees_porcelain to plug leak
builtin/rebase: fix options.strategy memory lifecycle
builtin/merge: free found_ref when done
builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
convert: release strbuf to avoid leak
read-cache: call diff_setup_done to avoid leak
ref-filter: also free head for ATOM_HEAD to avoid leak
diffcore-rename: move old_dir/new_dir definition to plug leak
builtin/for-each-repo: remove unnecessary argv copy to plug leak
builtin/submodule--helper: release unused strbuf to avoid leak
environment: move strbuf into block to plug leak
fmt-merge-msg: free newly allocated temporary strings when done
Rewrite of "git submodule" in C continues.
* ar/submodule-add:
submodule: drop unused sm_name parameter from show_fetch_remotes()
submodule--helper: introduce add-clone subcommand
submodule--helper: refactor module_clone()
submodule: prefix die messages with 'fatal'
t7400: test failure to add submodule in tracked path
cf2dc1c238 (speed up alt_odb_usable() with many alternates, 2021-07-07)
introduced the function fspathhash() for calculating path hashes while
respecting the configuration option core.ignorecase. Call it instead of
open-coding it; the resulting code is shorter and less repetitive.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --renormalize option updates the EOL conversions for the tracked
files. However, the loop already ignores files marked with the
SKIP_WORKTREE bit, so it will continue to do so with a sparse index
because the sparse directory entries also have this bit set.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since b243012 (refresh_index(): add flag to ignore SKIP_WORKTREE
entries, 2021-04-08), 'git add --refresh <path>' will output a warning
message when the path is outside the sparse-checkout definition. The
implementation of this warning happened in parallel with the
sparse-index work to add ensure_full_index() calls throughout the
codebase.
Update this loop to have the proper logic that checks to see if the
pathspec is outside the sparse-checkout definition. This avoids the need
to expand the sparse directory entry and determine if the path is
tracked, untracked, or ignored. We simply avoid updating the stat()
information because there isn't even an entry that matches the path!
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Disable command_requires_full_index for 'git add'. This does not require
any additional removals of ensure_full_index(). The main reason is that
'git add' discovers changes based on the pathspec and the worktree
itself. These are then inserted into the index directly, and calls to
index_name_pos() or index_file_exists() already call expand_to_path() at
the appropriate time to support a sparse-index.
Add a test to check that 'git add -A' and 'git add <file>' does not
expand the index at all, as long as <file> is not within a sparse
directory. This does not help the global 'git add .' case.
We can measure the improvement using p2000-sparse-operations.sh with
these results:
Test HEAD~1 HEAD
------------------------------------------------------------------------------
2000.6: git add -A (full-index-v3) 0.35(0.30+0.05) 0.37(0.29+0.06) +5.7%
2000.7: git add -A (full-index-v4) 0.31(0.26+0.06) 0.33(0.27+0.06) +6.5%
2000.8: git add -A (sparse-index-v3) 0.57(0.53+0.07) 0.05(0.04+0.08) -91.2%
2000.9: git add -A (sparse-index-v4) 0.58(0.55+0.06) 0.05(0.05+0.06) -91.4%
While the 91% improvement seems impressive, it's important to recognize
that previously we had significant overhead for expanding the
sparse-index. Comparing to the full index case, 'git add -A' goes from
0.37s to 0.05s, which is "only" an 86% improvement.
This modification to 'git add' creates some behavior change depending on
the use of a sparse index. We modify a test in t1092 to demonstrate
these changes which will be remedied in future changes.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code that gives an error message in "git multi-pack-index" when
no subcommand is given tried to print a NULL pointer as a strong,
which has been corrected.
* tb/reverse-midx:
multi-pack-index: fix potential segfault without sub-command
"git status" codepath learned to work with sparsely populated index
without hydrating it fully.
* ds/status-with-sparse-index:
t1092: document bad sparse-checkout behavior
fsmonitor: integrate with sparse index
wt-status: expand added sparse directory entries
status: use sparse-index throughout
status: skip sparse-checkout percentage with sparse-index
diff-lib: handle index diffs with sparse dirs
dir.c: accept a directory as part of cone-mode patterns
unpack-trees: unpack sparse directory entries
unpack-trees: rename unpack_nondirectories()
unpack-trees: compare sparse directories correctly
unpack-trees: preserve cache_bottom
t1092: add tests for status/add and sparse files
t1092: expand repository data shape
t1092: replace incorrect 'echo' with 'cat'
sparse-index: include EXTENDED flag when expanding
sparse-index: skip indexes with unmerged entries
Many "printf"-like helper functions we have have been annotated
with __attribute__() to catch placeholder/parameter mismatches.
* ab/attribute-format:
advice.h: add missing __attribute__((format)) & fix usage
*.h: add a few missing __attribute__((format))
*.c static functions: add missing __attribute__((format))
sequencer.c: move static function to avoid forward decl
*.c static functions: don't forward-declare __attribute__
Optimize "git log" for cases where we wasted cycles to load ref
decoration data that may not be needed.
* jk/log-decorate-optim:
load_ref_decorations(): fix decoration with tags
add_ref_decoration(): rename s/type/deco_type/
load_ref_decorations(): avoid parsing non-tag objects
object.h: add lookup_object_by_type() function
object.h: expand docstring for lookup_unknown_object()
log: avoid loading decorations for userformats that don't need it
pretty.h: update and expand docstring for userformat_find_requirements()
"git worktree add --lock" learned to record why the worktree is
locked with a custom message.
* sm/worktree-add-lock:
worktree: teach `add` to accept --reason <string> with --lock
worktree: mark lock strings with `_()` for translation
t2400: clean up '"add" worktree with lock' test
"git commit --allow-empty-message" won't abort the operation upon
an empty message, but the hint shown in the editor said otherwise.
* hj/commit-allow-empty-message:
commit: remove irrelavent prompt on `--allow-empty-message`
commit: reorganise commit hint strings
- cmd_rebase populates rebase_options.strategy with newly allocated
strings, hence we need to free those strings at the end of cmd_rebase
to avoid a leak.
- In some cases: get_replay_opts() is called, which prepares replay_opts
using data from rebase_options. We used to simply copy the pointer
from rebase_options.strategy, however that would now result in a
double-free because sequencer_remove_state() is eventually used to
free replay_opts.strategy. To avoid this we xstrdup() strategy when
adding it to replay_opts.
The original leak happens because we always populate
rebase_options.strategy, but we don't always enter the path that calls
get_replay_opts() and later sequencer_remove_state() - in other words
we'd always allocate a new string into rebase_options.strategy but
only sometimes did we free it. We now make sure that rebase_options
and replay_opts both own their own copies of strategy, and each copy
is free'd independently.
This was first seen when running t0021 with LSAN, but t2012 helped catch
the fact that we can't just free(options.strategy) at the end of
cmd_rebase (as that can cause a double-free). LSAN output from t0021:
LSAN output from t0021:
Direct leak of 4 byte(s) in 1 object(s) allocated from:
#0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0xa71eb8 in xstrdup wrapper.c:29:14
#2 0x61b1cc in cmd_rebase builtin/rebase.c:1779:22
#3 0x4ce83e in run_builtin git.c:475:11
#4 0x4ccafe in handle_builtin git.c:729:3
#5 0x4cb01c in run_argv git.c:818:4
#6 0x4cb01c in cmd_main git.c:949:19
#7 0x6b3fad in main common-main.c:52:11
#8 0x7f267b512349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_name() calls dwim_ref(), which allocates a new string into
found_ref. Therefore add a free() to avoid leaking found_ref.
LSAN output from t0021:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0xa8beb8 in xstrdup wrapper.c:29:14
#2 0x954054 in expand_ref refs.c:671:12
#3 0x953cb6 in repo_dwim_ref refs.c:644:22
#4 0x5d3759 in dwim_ref refs.h:162:9
#5 0x5d3759 in merge_name builtin/merge.c:517:6
#6 0x5d3759 in collect_parents builtin/merge.c:1214:5
#7 0x5cf60d in cmd_merge builtin/merge.c:1458:16
#8 0x4ce83e in run_builtin git.c:475:11
#9 0x4ccafe in handle_builtin git.c:729:3
#10 0x4cb01c in run_argv git.c:818:4
#11 0x4cb01c in cmd_main git.c:949:19
#12 0x6bdbfd in main common-main.c:52:11
#13 0x7f0430502349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These leaks all happen at the end of cmd_mv, hence don't matter in any
way. But we still fix the easy ones and squash the rest to get us closer
to being able to run tests without leaks.
LSAN output from t0050:
Direct leak of 384 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa8c015 in xrealloc wrapper.c:126:8
#2 0xa0a7e1 in add_entry string-list.c:44:2
#3 0xa0a7e1 in string_list_insert string-list.c:58:14
#4 0x5dac03 in cmd_mv builtin/mv.c:248:4
#5 0x4ce83e in run_builtin git.c:475:11
#6 0x4ccafe in handle_builtin git.c:729:3
#7 0x4cb01c in run_argv git.c:818:4
#8 0x4cb01c in cmd_main git.c:949:19
#9 0x6bd9ad in main common-main.c:52:11
#10 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0xa8bd09 in do_xmalloc wrapper.c:41:8
#2 0x5dbc34 in internal_prefix_pathspec builtin/mv.c:32:2
#3 0x5da575 in cmd_mv builtin/mv.c:158:14
#4 0x4ce83e in run_builtin git.c:475:11
#5 0x4ccafe in handle_builtin git.c:729:3
#6 0x4cb01c in run_argv git.c:818:4
#7 0x4cb01c in cmd_main git.c:949:19
#8 0x6bd9ad in main common-main.c:52:11
#9 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0xa8bd09 in do_xmalloc wrapper.c:41:8
#2 0x5dbc34 in internal_prefix_pathspec builtin/mv.c:32:2
#3 0x5da4e4 in cmd_mv builtin/mv.c:148:11
#4 0x4ce83e in run_builtin git.c:475:11
#5 0x4ccafe in handle_builtin git.c:729:3
#6 0x4cb01c in run_argv git.c:818:4
#7 0x4cb01c in cmd_main git.c:949:19
#8 0x6bd9ad in main common-main.c:52:11
#9 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x49a9a2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0xa8c119 in xcalloc wrapper.c:140:8
#2 0x5da585 in cmd_mv builtin/mv.c:159:22
#3 0x4ce83e in run_builtin git.c:475:11
#4 0x4ccafe in handle_builtin git.c:729:3
#5 0x4cb01c in run_argv git.c:818:4
#6 0x4cb01c in cmd_main git.c:949:19
#7 0x6bd9ad in main common-main.c:52:11
#8 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 4 byte(s) in 1 object(s) allocated from:
#0 0x49a9a2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0xa8c119 in xcalloc wrapper.c:140:8
#2 0x5da4f8 in cmd_mv builtin/mv.c:149:10
#3 0x4ce83e in run_builtin git.c:475:11
#4 0x4ccafe in handle_builtin git.c:729:3
#5 0x4cb01c in run_argv git.c:818:4
#6 0x4cb01c in cmd_main git.c:949:19
#7 0x6bd9ad in main common-main.c:52:11
#8 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa8c015 in xrealloc wrapper.c:126:8
#2 0xa00226 in strbuf_grow strbuf.c:98:2
#3 0xa00226 in strbuf_vaddf strbuf.c:394:3
#4 0xa065c7 in xstrvfmt strbuf.c:981:2
#5 0xa065c7 in xstrfmt strbuf.c:991:8
#6 0x9e7ce7 in prefix_path_gently setup.c:115:15
#7 0x9e7fa6 in prefix_path setup.c:128:12
#8 0x5dbdbf in internal_prefix_pathspec builtin/mv.c:55:23
#9 0x5da575 in cmd_mv builtin/mv.c:158:14
#10 0x4ce83e in run_builtin git.c:475:11
#11 0x4ccafe in handle_builtin git.c:729:3
#12 0x4cb01c in run_argv git.c:818:4
#13 0x4cb01c in cmd_main git.c:949:19
#14 0x6bd9ad in main common-main.c:52:11
#15 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0xa8c015 in xrealloc wrapper.c:126:8
#2 0xa00226 in strbuf_grow strbuf.c:98:2
#3 0xa00226 in strbuf_vaddf strbuf.c:394:3
#4 0xa065c7 in xstrvfmt strbuf.c:981:2
#5 0xa065c7 in xstrfmt strbuf.c:991:8
#6 0x9e7ce7 in prefix_path_gently setup.c:115:15
#7 0x9e7fa6 in prefix_path setup.c:128:12
#8 0x5dbdbf in internal_prefix_pathspec builtin/mv.c:55:23
#9 0x5da4e4 in cmd_mv builtin/mv.c:148:11
#10 0x4ce83e in run_builtin git.c:475:11
#11 0x4ccafe in handle_builtin git.c:729:3
#12 0x4cb01c in run_argv git.c:818:4
#13 0x4cb01c in cmd_main git.c:949:19
#14 0x6bd9ad in main common-main.c:52:11
#15 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 558 byte(s) leaked in 7 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_for_each_repo() copies argv into args (a strvec), which is later
passed into run_command_on_repo(), which in turn copies that strvec onto
the end of child.args. The initial copy is unnecessary (we never modify
args). We therefore choose to just pass argv directly into
run_command_on_repo(), which lets us avoid the copy and fixes the leak.
LSAN output from t0068:
Direct leak of 192 byte(s) in 1 object(s) allocated from:
#0 0x7f63bd4ab8b0 in realloc (/usr/lib64/libasan.so.4+0xdc8b0)
#1 0x98d7e6 in xrealloc wrapper.c:126
#2 0x916914 in strvec_push_nodup strvec.c:19
#3 0x916a6e in strvec_push strvec.c:26
#4 0x4be4eb in cmd_for_each_repo builtin/for-each-repo.c:49
#5 0x410dcd in run_builtin git.c:475
#6 0x410dcd in handle_builtin git.c:729
#7 0x414087 in run_argv git.c:818
#8 0x414087 in cmd_main git.c:949
#9 0x40e9ec in main common-main.c:52
#10 0x7f63bc9fa349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 22 byte(s) in 2 object(s) allocated from:
#0 0x7f63bd445e30 in __interceptor_strdup (/usr/lib64/libasan.so.4+0x76e30)
#1 0x98d698 in xstrdup wrapper.c:29
#2 0x916a63 in strvec_push strvec.c:26
#3 0x4be4eb in cmd_for_each_repo builtin/for-each-repo.c:49
#4 0x410dcd in run_builtin git.c:475
#5 0x410dcd in handle_builtin git.c:729
#6 0x414087 in run_argv git.c:818
#7 0x414087 in cmd_main git.c:949
#8 0x40e9ec in main common-main.c:52
#9 0x7f63bc9fa349 in __libc_start_main (/lib64/libc.so.6+0x24349)
See also discussion about the original implementation below - this code
appears to have evolved from a callback explaining the double-strvec-copy
pattern, but there's no strong reason to keep that now:
https://lore.kernel.org/git/68bbeca5-314b-08ee-ef36-040e3f3814e9@gmail.com/
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
relative_url() populates sb. In the normal return path, its buffer is
detached using strbuf_detach(). However the early return path does
nothing with sb, which means that sb's memory is leaked - therefore
we add a release to avoid this leak.
The reset is also only necessary for the normal return path, hence we
move it down to after the early-return to avoid unnecessary work.
LSAN output from t0060:
Direct leak of 121 byte(s) in 1 object(s) allocated from:
#0 0x7f31246f28b0 in realloc (/usr/lib64/libasan.so.4+0xdc8b0)
#1 0x98d7d6 in xrealloc wrapper.c:126
#2 0x909a60 in strbuf_grow strbuf.c:98
#3 0x90bf00 in strbuf_vaddf strbuf.c:401
#4 0x90c321 in strbuf_addf strbuf.c:335
#5 0x5cb78d in relative_url builtin/submodule--helper.c:182
#6 0x5cbe46 in resolve_relative_url_test builtin/submodule--helper.c:248
#7 0x410dcd in run_builtin git.c:475
#8 0x410dcd in handle_builtin git.c:729
#9 0x414087 in run_argv git.c:818
#10 0x414087 in cmd_main git.c:949
#11 0x40e9ec in main common-main.c:52
#12 0x7f3123c41349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 121 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not immediately clear what `expand_user_path()` means, so let's
rename it to `interpolate_path()`. This also opens the path for
interpolating more than just a home directory.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This parameter has not been used since the function was introduced in
8c8195e9c3 (submodule--helper: introduce add-clone subcommand,
2021-07-10).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use non-const ref_format in *_atom_parser(), which can help us
modify the members of ref_format in *_atom_parser().
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 'git merge' learned '--autostash' in a03b55530a (merge: teach
--autostash option, 2020-04-07), 'cmd_merge', once it is determined that
we have to create a merge commit, calls 'create_autostash' if
'--autostash' is given.
As explained in a03b55530a, and made more abvious by the tests added in
that commit, the autostash is then applied if the merge succeeds, either
directly or by committing (after conflict resolution or if '--no-commit'
was given), or if the merge is aborted with 'git merge --abort'. In some
other cases, like the user calling 'git reset --merge' or 'git merge
--quit', the autostash is not applied, but saved in the stash list.
However, there exists a scenario that creates an autostash but does not
apply nor save it to the stash list: if the chosen merge strategy
completely fails to handle the merge, i.e. 'try_merge_strategy' returns
2.
Apply the autostash in that case also. An easy way to test that is to
try to merge more than two commits but explicitely ask for the 'recursive'
merge strategy.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 'git merge' learned '--autostash' in a03b55530a (merge: teach
--autostash option, 2020-04-07), 'cmd_merge', in the fast-forward case,
calls 'create_autostash' before calling 'checkout_fast_forward' if
'--autostash' is given.
However, if 'checkout_fast_forward' fails, the autostash is not applied
to the working tree, nor saved in the stash list, since the code simply
calls 'goto done'.
Be more helpful to the user by applying the autostash in that case.
An easy way to test a failing fast-forward is when we are merging a
branch that has a tracked file that conflicts with an untracked file in
the working tree.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The variable 'best_strategy' holds the name of the merge strategy that
resulted in fewer conflicts, if several strategies were tried. When
that's the case but the best strategy was not the first one tried, we
inform the user which strategy was the "best" one before recreating the
merge and leaving the conflicted files in the tree.
This informational message is missing the word "strategy", so it shows
something like:
Using the recursive to prepare resolving by hand.
Fix that.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rev-list" learns to omit the "commit <object-name>" header
lines from the output with the `--no-commit-header` option.
* bc/rev-list-without-commit-line:
rev-list: add option for --pretty=format without header
With multiple heads, we should not allow rebasing or fast-forwarding.
Make sure any fast-forward request calls out specifically the fact that
multiple branches are in play. Also, since we cannot fast-forward to
multiple branches, fix our computation of can_ff.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-pull.txt includes merge-options.txt, which is written assuming
merges will happen. git-pull has allowed rebases for many years; update
the documentation to reflect that.
While at it, pass any `--signoff` flag through to the rebase backend too
so that we don't have to document it as merge-specific. Rebase has
supported the --signoff flag for years now as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have for some time shown a long warning when the user does not
specify how to reconcile divergent branches with git pull. Make it an
error now.
Initial-patch-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix the last few precedence tests failing in t7601 by now implementing
the logic to have --[no-]rebase override a pull.ff=only config setting.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are both merge and rebase branches in the logic, and previously
both had to handle fast-forwarding. Merge handled that implicitly
(because git merge handles it directly), while in rebase it was
explicit. Given that the --ff-only flag is meant to override any
--rebase or --no-rebase, make the code reflect that by handling
--ff-only before the merge-vs-rebase logic.
It turns out that this also fixes a bug for submodules. Previously,
when --ff-only was given, the code would run `merge --ff-only` on the
main module, and then run `submodule update --recursive --rebase` on the
submodules. With this change, we still run `merge --ff-only` on the
main module, but now run `submodule update --recursive --checkout` on
the submodules. I believe this better reflects the intent of --ff-only
to have it apply to both the main module and the submodules.
(Sidenote: It is somewhat interesting that all merges pass `--checkout`
to submodule update, even when `--no-ff` is specified, meaning that it
will only do fast-forward merges for submodules. This was discussed in
commit a6d7eb2c7a ("pull: optionally rebase submodules (remote submodule
changes only)", 2017-06-23). The same limitations apply now as then, so
we are not trying to fix this at this time.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The warning about pulling without specifying how to reconcile divergent
branches says that after setting pull.rebase to true, --ff-only can
still be passed on the command line to require a fast-forward. Make that
actually work.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
[en: updated tests; note 3 fixes and 1 new failure]
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since cd57bc41bb (builtin/multi-pack-index.c: display usage on
unrecognized command, 2021-03-30) we have used a "usage" label to avoid
having two separate callers of usage_with_options (one when no arguments
are given, and another for unrecognized sub-commands).
But the first caller has been broken since cd57bc41bb, since it will
happily jump to usage without arguments, and then pass argv[0] to the
"unrecognized subcommand" error.
Many compilers will save us from a segfault here, but the end result is
ugly, since it mentions an unrecognized subcommand when we didn't even
pass one, and (on GCC) includes "(null)" in its output.
Move the "usage" label down past the error about unrecognized
subcommands so that it is only triggered when it should be. While we're
at it, bulk up our test coverage in this area, too.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanup around struct_type_init() functions.
* ab/struct-init:
string-list.h users: change to use *_{nodup,dup}()
string-list.[ch]: add a string_list_init_{nodup,dup}()
dir.[ch]: replace dir_init() with DIR_INIT
*.c *_init(): define in terms of corresponding *_INIT macro
*.h: move some *_INIT to designated initializers
Code clean-up and leak plugging in "git bundle".
* ab/bundle-updates:
bundle: remove "ref_list" in favor of string-list.c API
bundle.c: use a temporary variable for OIDs and names
bundle cmd: stop leaking memory from parse_options_cmd_bundle()
Fill test gaps.
* ab/show-branch-tests:
show-branch tests: add missing tests
show-branch: don't <COLOR></RESET> for space characters
show-branch tests: modernize test code
show-branch tests: rename the one "show-branch" test file
Code recently added to support common ancestry negotiation during
"git push" did not sanity check its arguments carefully enough.
* ab/fetch-negotiate-segv-fix:
fetch: fix segfault in --negotiate-only without --negotiation-tip=*
fetch: document the --negotiate-only option
send-pack.c: move "no refs in common" abort earlier
The default reason stored in the lock file, "added with --lock",
is unlikely to be what the user would have given in a separate
`git worktree lock` command. Allowing `--reason` to be specified
along with `--lock` when adding a working tree gives the user control
over the reason for locking without needing a second command.
Signed-off-by: Stephen Manz <smanz@alum.mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a full hexadecimal hash is given as a --negotiation-tip to "git
fetch", and that hash does not correspond to an object, "git fetch" will
segfault if --negotiate-only is given and will silently ignore that hash
otherwise. Make these cases fatal errors, just like the case when an
invalid ref name or abbreviated hash is given.
While at it, mark the error messages as translatable.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 477673d6f3 ("send-pack: support push negotiation", 2021-05-05)
introduced the push.negotiate config variable and included a test. The
test only covered pushing without a remote helper, so the fact that
pushing with a remote helper doesn't work went unnoticed.
This is ultimately caused by the "url" field not being set in the args
struct. This field being unset probably went unnoticed because besides
push negotiation, this field is only used to generate a "pushee" line in
a push cert (and if not given, no such line is generated). Therefore,
set this field.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previous changes did the necessary improvements to unpack-trees.c and
diff-lib.c in order to modify a sparse index based on its comparision
with a tree. The only remaining work is to remove some
ensure_full_index() calls and add tests that verify that the index is
not expanded in our interesting cases. Include 'switch' and 'restore' in
these tests, as they share a base implementation with 'checkout'.
Here are the relevant performance results from
p2000-sparse-operations.sh:
Test HEAD~1 HEAD
--------------------------------------------------------------------------------
2000.18: git checkout -f - (full-v3) 0.49(0.43+0.03) 0.47(0.39+0.05) -4.1%
2000.19: git checkout -f - (full-v4) 0.45(0.37+0.06) 0.42(0.37+0.05) -6.7%
2000.20: git checkout -f - (sparse-v3) 0.76(0.71+0.07) 0.04(0.03+0.04) -94.7%
2000.21: git checkout -f - (sparse-v4) 0.75(0.72+0.04) 0.05(0.06+0.04) -93.3%
It is important to compare the full index case to the sparse index case,
as the previous results for the sparse index were inflated by the index
expansion. For index v4, this is an 88% improvement.
On an internal repository with over two million paths at HEAD and a
sparse-checkout definition containing ~60,000 of those paths, 'git
checkout' went from 3.5s to 297ms with this change. The theoretical
optimum where only those ~60,000 paths exist was 275ms, so the extra
sparse directory entries contribute a 22ms overhead.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'git commit' to allow using the sparse-index in memory without
expanding to a full one. The only place that had an ensure_full_index()
call was in cache_tree_update(). The recursive algorithm for
update_one() was already updated in 2de37c536 (cache-tree: integrate
with sparse directory entries, 2021-03-03) to handle sparse directory
entries in the index.
Most of this change involves testing different command-line options that
allow specifying which on-disk changes should be included in the commit.
This includes no options (only take currently-staged changes), -a (take
all tracked changes), and --include (take a list of specific changes).
To simplify testing that these options do not expand the index, update
the test that previously verified that 'git status' does not expand the
index with a helper method, ensure_not_expanded().
This allows 'git commit' to operate much faster when the sparse-checkout
cone is much smaller than the full list of files at HEAD.
Here are the relevant lines from p2000-sparse-operations.sh:
Test HEAD~1 HEAD
----------------------------------------------------------------------------------
2000.14: git commit -a -m A (full-v3) 0.35(0.26+0.06) 0.36(0.28+0.07) +2.9%
2000.15: git commit -a -m A (full-v4) 0.32(0.26+0.05) 0.34(0.28+0.06) +6.3%
2000.16: git commit -a -m A (sparse-v3) 0.63(0.59+0.06) 0.04(0.05+0.05) -93.7%
2000.17: git commit -a -m A (sparse-v4) 0.64(0.59+0.08) 0.04(0.04+0.04) -93.8%
It is important to compare the full-index case to the sparse-index case,
so the improvement for index version v4 is actually an 88% improvement in
this synthetic example.
In a real repository with over two million files at HEAD and 60,000
files in the sparse-checkout definition, the time for 'git commit -a'
went from 2.61 seconds to 134ms. I compared this to the result if the
index only contained the paths in the sparse-checkout definition and
found the theoretical optimum to be 120ms, so the out-of-cone paths only
add a 12% overhead.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By testing 'git -c core.fsmonitor= status -uno', we can check for the
simplest index operations that can be made sparse-aware. The necessary
implementation details are already integrated with sparse-checkout, so
modify command_requires_full_index to be zero for cmd_status().
In refresh_index(), we loop through the index entries to refresh their
stat() information. However, sparse directories have no stat()
information to populate. Ignore these entries.
This allows 'git status' to no longer expand a sparse index to a full
one. This is further tested by dropping the "-uno" option and adding an
untracked file into the worktree.
The performance test p2000-sparse-checkout-operations.sh demonstrates
these improvements:
Test HEAD~1 HEAD
-----------------------------------------------------------------------------
2000.2: git status (full-index-v3) 0.31(0.30+0.05) 0.31(0.29+0.06) +0.0%
2000.3: git status (full-index-v4) 0.31(0.29+0.07) 0.34(0.30+0.08) +9.7%
2000.4: git status (sparse-index-v3) 2.35(2.28+0.10) 0.04(0.04+0.05) -98.3%
2000.5: git status (sparse-index-v4) 2.35(2.24+0.15) 0.05(0.04+0.06) -97.9%
Note that since HEAD~1 was expanding the sparse index by parsing trees,
it was artificially slower than the full index case. Thus, the 98%
improvement is misleading, and instead we should celebrate the 0.34s to
0.05s improvement of 85%. This is more indicative of the peformance
gains we are expecting by using a sparse index.
Note: we are dropping the assignment of core.fsmonitor here. This is not
necessary for the test script as we are not altering the config any
other way. Correct integration with FS Monitor will be validated in
later changes.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- default lock string, "added with --lock"
- temporary lock string, "initializing"
Signed-off-by: Stephen Manz <smanz@alum.mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite the backend for "diff -G/-S" to use pcre2 engine when
available.
* ab/pickaxe-pcre2: (22 commits)
xdiff-interface: replace discard_hunk_line() with a flag
xdiff users: use designated initializers for out_line
pickaxe -G: don't special-case create/delete
pickaxe -G: terminate early on matching lines
xdiff-interface: allow early return from xdiff_emit_line_fn
xdiff-interface: prepare for allowing early return
pickaxe -S: slightly optimize contains()
pickaxe: rename variables in has_changes() for brevity
pickaxe -S: support content with NULs under --pickaxe-regex
pickaxe: assert that we must have a needle under -G or -S
pickaxe: refactor function selection in diffcore-pickaxe()
perf: add performance test for pickaxe
pickaxe/style: consolidate declarations and assignments
diff.h: move pickaxe fields together again
pickaxe: die when --find-object and --pickaxe-all are combined
pickaxe: die when -G and --pickaxe-regex are combined
pickaxe tests: add missing test for --no-pickaxe-regex being an error
pickaxe tests: test for -G, -S and --find-object incompatibility
pickaxe tests: add test for "log -S" not being a regex
pickaxe tests: add test for diffgrep_consume() internals
...
Some more code and doc clarification around "git push".
* fc/push-simple-updates-cleanup:
push: don't get a full remote object
push: only check same_remote when needed
push: remove trivial function
push: remove redundant check
push: factor out the typical case
push: get rid of all the setup_push_* functions
push: trivial simplifications
push: make setup_push_* return the dst
push: only get the branch when needed
push: factor out null branch check
push: split switch cases
push: return immediately in trivial switch case
push: create new get_upstream_ref() helper
"git cat-file --batch-all-objects"" misbehaved when "--batch" is in
use and did not ask for certain object traits.
* zh/cat-file-batch-fix:
cat-file: merge two block into one
cat-file: handle trivial --batch format with --batch-all-objects
Add missing __attribute__((format)) function attributes to various
"static" functions that take printf arguments.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git diff --merge-base` was introduced at around Git 2.30, the
documentation included a few errors.
In the example given for `git diff --cached --merge-base`, the
`--cached` flag was omitted for the `--merge-base` example. Add the
missing flag.
In the `git diff <commit>` case, we failed to mention that
`--merge-base` is an available option. Give the usage of `--merge-base`
as an option there.
Finally, there are two errors in the usage of `git diff`. Firstly, we do
not mention `--merge-base` in the `git diff --cached` case. Mention it
so that it's consistent with the documentation. Secondly, we put the
`[--merge-base]` in between `<commit>` and `[<commit>...]`. Move the
`[--merge-base]` so that it's beside `[<options>]` which is a more
logical grouping.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9cf6d3357a (Add git-index-pack utility, 2005-10-12) and
466dbc42f5 (receive-pack: Send internal errors over side-band #2,
2010-02-10) we added these static functions and forward-declared their
__attribute__((printf)).
I think this may have been to work around some compiler limitation at
the time, but in any case we have a lot of code that uses the briefer
way of declaring these that I'm using here, so if we had any such
issues with compilers we'd have seen them already.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's add a new "add-clone" subcommand to `git submodule--helper` with
the goal of converting part of the shell code in git-submodule.sh
related to `git submodule add` into C code. This new subcommand clones
the repository that is to be added, and checks out to the appropriate
branch.
This is meant to be a faithful conversion that leaves the behaviour of
'cmd_add()' script unchanged.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Separate out the core logic of module_clone() from the flag
parsing---this way we can call the equivalent of the `submodule--helper
clone` subcommand directly within C, without needing to push arguments
in a strvec.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In general, we encourage users to use plumbing commands, like git
rev-list, over porcelain commands, like git log, when scripting.
However, git rev-list has one glaring problem that prevents it from
being used in certain cases: when --pretty is used with a custom format,
it always prints out a line containing "commit" and the object ID. This
makes it unsuitable for many scripting needs, and forces users to use
git log instead.
While we can't change this behavior for backwards compatibility, we can
add an option to suppress this behavior, so let's do so, and call it
"--no-commit-header". Additionally, add the corresponding positive
option to switch it back on.
Note that this option doesn't affect the built-in formats, only custom
formats. This is exactly the same behavior as users already have from
git log and is what most users will be used to.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Even when the `--allow-empty-message` option is given, "git commit"
offers an interactive editor session with prefilled message that says
the commit will be aborted if the buffer is emptied, which is wrong.
Remove the "an empty message aborts" part from the message when the
option is given to fix it.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Hu Jialun <hujialun@comp.nus.edu.sg>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Strings of hint messages inserted into editor on interactive commit was
scattered in-line, rendering the code harder to understand at first
glance.
Extract those messages out into separate variables to make the code
outline easier to follow.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Hu Jialun <hujialun@comp.nus.edu.sg>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a segfault in the --stdin-packs option added in
339bce27f4 (builtin/pack-objects.c: add '--stdin-packs' option,
2021-02-22).
The read_packs_list_from_stdin() function didn't check that the lines
it was reading were valid packs, and thus when doing the QSORT() with
pack_mtime_cmp() we'd have a NULL "util" field. The "util" field is
used to associate the names of included/excluded packs with the
packed_git structs they correspond to.
The logic error was in assuming that we could iterate all packs and
annotate the excluded and included packs we got, as opposed to
checking the lines we got on stdin. There was a check for excluded
packs, but included packs were simply assumed to be valid.
As noted in the test we'll not report the first bad line, but whatever
line sorted first according to the string-list.c API. In this case I
think that's fine. There was further discussion of alternate
approaches in [1].
Even though we're being lazy let's assert the line we do expect to get
in the test, since whoever changes this code in the future might miss
this case, and would want to update the test and comments.
1. http://lore.kernel.org/git/YND3h2l10PlnSNGJ@nand.local
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the codebase MSAN clean.
* ah/uninitialized-reads-fix:
builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints
split-index: use oideq instead of memcmp to compare object_id's
bulk-checkin: make buffer reuse more obvious and safer
The recent --negotiate-only option would segfault in the call to
oid_array_for_each() in negotiate_using_fetch() unless one or more
--negotiation-tip=* options were provided.
All of the other tests for the feature combine both, but nothing was
checking this assumption, let's do that and add a test for it. Fixes a
bug in 9c1e657a8f (fetch: teach independent negotiation (no
packfile), 2021-05-04).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"dir.h" should have been included only once.
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Depending on the chosen format of help pages, git-help uses function
show_man_page, show_info_page, or show_html_page. The first thing all
three functions do is to convert given `git_cmd` to a `page` using
function cmd_to_page.
Move the common part of these three functions to function cmd_help to
avoid code duplication.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Felipe Contreras <felipe.contreras@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move away from the "struct ref_list" in bundle.c in favor of the
almost identical string-list.c API.
That API fits this use-case perfectly, but did not exist in its
current form when this code was added in 2e0afafebd (Add git-bundle:
move objects and references by archive, 2007-02-22), with hindsight we
could have used the path-list API, which later got renamed to
string-list. See 8fd2cb4069 (Extract helper bits from
c-merge-recursive work, 2006-07-25)
We need to change "name" to "string" and "oid" to "util" to make this
conversion, but other than that the APIs are pretty much identical for
what bundle.c made use of.
Let's also replace the memset(..,0,...) pattern with a more idiomatic
"INIT" macro, and finally add a *_release() function so to free the
allocated memory.
Before this the add_to_ref_list() would leak memory, now e.g. "bundle
list-heads" reports no memory leaks at all under valgrind.
In the bundle_header_init() function we're using a clever trick to
memcpy() what we'd get from the corresponding
BUNDLE_HEADER_INIT. There is a concurrent series to make use of that
pattern more generally, see [1].
1. https://lore.kernel.org/git/cover-0.5-00000000000-20210701T104855Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak from the prefix_filename() function introduced with
its use in 3b754eedd5 (bundle: use prefix_filename with bundle path,
2017-03-20).
As noted in that commit the leak was intentional as a part of being
sloppy about freeing resources just before we exit, I'm changing this
because I'll be fixing other memory leaks in the bundle API (including
the library version) in subsequent commits. It's easier to reason
about those fixes if valgrind runs cleanly at the end without any
leaks whatsoever.
An earlier version of this change[1] went out of its way to not leak
memory on the die() codepaths here, but doing so will only avoid
reports of potential leaks under heap-only leak trackers such as
valgrind, not the SANITIZE=leak mode.
Avoiding those leaks as well might be useful to enable us to run
cleanly under the likes of valgrind in the future. But for now the
relative verbosity of the resulting code, and the fact that we don't
have some valgrind or SANITIZE=leak mode as part of our CI (it's only
run ad-hoc, see [2]), means we're not worrying about that for now.
1. https://lore.kernel.org/git/87v95vdxrc.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the dir_init() function and replace it with a DIR_INIT
macro. In many cases in the codebase we need to initialize things with
a function for good reasons, e.g. needing to call another function on
initialization. The "dir_init()" function was not one such case, and
could trivially be replaced with a more idiomatic macro initialization
pattern.
The only place where we made use of its use of memset() was in
dir_clear() itself, which resets the contents of an an existing struct
pointer. Let's use the new "memcpy() a 'blank' struct on the stack"
idiom to do that reset.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If no --decorate option is given, we default to auto-decoration. And
when that kicks in, cmd_log_init_finish() will unconditionally load the
decoration refs.
However, if we are using a user-format that does not include "%d" or
"%D", we won't show the decorations at all, so we don't need to load
them. We can detect this case and auto-disable them by adding a new
field to our userformat_want helper. We can do this even when the user
explicitly asked for --decorate, because it can't affect the output at
all.
This patch consistently reduces the time to run "git log -1 --format=%H"
on my git.git clone (with ~2k refs) from 34ms to 7ms. On a much more
extreme real-world repository (with ~220k refs), it goes from 2.5s to
4ms.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the colored output introduced in ab07ba2a24 (show-branch: color
the commit status signs, 2009-04-22) to not color and reset each
individual space character we use for padding. The intent is to color
just the "!", "+" etc. characters.
This makes the output easier to test, so let's do that now. The test
would be much more verbose without a color/reset for each space
character. Since the coloring cycles through colors we previously had
a "rainbow of space characters".
In theory this breaks things for anyone who's relying on the exact
colored output of show-branch, in practice I'd think anyone parsing it
isn't actively turning on the colored output.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two spaces unaligned to anything is not part of the coding-style. A
single tab is.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need to store ran_ff. Now it's obvious from the conditionals.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently "git pull --rebase" takes a shortcut in the case a
fast-forward merge is possible; run_merge() is called with --ff-only.
However, "git merge" didn't have an --autostash option, so, when "git
pull --rebase --autostash" was called *and* the fast-forward merge
shortcut was taken, then the pull failed.
This was fixed in commit f15e7cf5cc (pull: ff --rebase --autostash
works in dirty repo, 2017-06-01) by simply skipping the fast-forward
merge shortcut.
Later on "git merge" learned the --autostash option [a03b55530a
(merge: teach --autostash option, 2020-04-07)], and so did "git pull"
[d9f15d37f1 (pull: pass --autostash to merge, 2020-04-07)].
Therefore it's not necessary to skip the fast-forward merge shortcut
anymore when called with --rebase --autostash.
Let's always take the fast-forward merge shortcut by essentially
reverting f15e7cf5cc.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
report_result() sends a struct to the parent process, but that struct
would contain uninitialised padding bytes. Running this code under MSAN
rightly triggers a warning - but we don't particularly care about this
warning because we control the receiving code, and we therefore know
that those padding bytes won't be read on the receiving end.
We could simply suppress this warning under MSAN with the approporiate
ifdef'd attributes, but a less intrusive solution is to 0-initialise the
struct, which guarantees that the padding will also be initialised.
Interestingly, in the error-case branch, we only try to copy the first
two members of pc_item_result, by copying only PC_ITEM_RESULT_BASE_SIZE
bytes. However PC_ITEM_RESULT_BASE_SIZE is defined as
'offsetof(the_last_member)', which means that we're copying padding bytes
after the end of the second last member. We could avoid doing this by
redefining PC_ITEM_RESULT_BASE_SIZE as
'offsetof(second_last_member) + sizeof(second_last_member)', but there's
no huge benefit to doing so (and this patch silences the MSAN warning in
this scenario either way).
MSAN output from t2080 (partially interleaved due to the
parallel work :) ):
Uninitialized bytes in __interceptor_write at offset 12 inside [0x7fff37d83408, 160)
==23279==WARNING: MemorySanitizer: use-of-uninitialized-value
Uninitialized bytes in __interceptor_write at offset 12 inside [0x7ffdb8a07ec8, 160)
==23280==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0xd5ac28 in xwrite /home/ahunt/git/git/wrapper.c:256:8
#1 0xd5b327 in write_in_full /home/ahunt/git/git/wrapper.c:311:21
#2 0xb0a8c4 in do_packet_write /home/ahunt/git/git/pkt-line.c:221:6
#3 0xb0a5fd in packet_write /home/ahunt/git/git/pkt-line.c:242:6
#4 0x4f7441 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:69:2
#5 0x4f6be6 in worker_loop /home/ahunt/git/git/builtin/checkout--worker.c💯3
#6 0x4f68d3 in cmd_checkout__worker /home/ahunt/git/git/builtin/checkout--worker.c:143:2
#7 0x4a1e76 in run_builtin /home/ahunt/git/git/git.c:461:11
#8 0x49e1e7 in handle_builtin /home/ahunt/git/git/git.c:714:3
#9 0x4a0c08 in run_argv /home/ahunt/git/git/git.c:781:4
#10 0x49d5a8 in cmd_main /home/ahunt/git/git/git.c:912:19
#11 0x7974da in main /home/ahunt/git/git/common-main.c:52:11
#12 0x7f8778114349 in __libc_start_main (/lib64/libc.so.6+0x24349)
#13 0x421bd9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120
Uninitialized value was created by an allocation of 'res' in the stack frame of function 'report_result'
#0 0x4f72c0 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:55
SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/ahunt/git/git/wrapper.c:256:8 in xwrite
Exiting
#0 0xd5ac28 in xwrite /home/ahunt/git/git/wrapper.c:256:8
#1 0xd5b327 in write_in_full /home/ahunt/git/git/wrapper.c:311:21
#2 0xb0a8c4 in do_packet_write /home/ahunt/git/git/pkt-line.c:221:6
#3 0xb0a5fd in packet_write /home/ahunt/git/git/pkt-line.c:242:6
#4 0x4f7441 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:69:2
#5 0x4f6be6 in worker_loop /home/ahunt/git/git/builtin/checkout--worker.c💯3
#6 0x4f68d3 in cmd_checkout__worker /home/ahunt/git/git/builtin/checkout--worker.c:143:2
#7 0x4a1e76 in run_builtin /home/ahunt/git/git/git.c:461:11
#8 0x49e1e7 in handle_builtin /home/ahunt/git/git/git.c:714:3
#9 0x4a0c08 in run_argv /home/ahunt/git/git/git.c:781:4
#10 0x49d5a8 in cmd_main /home/ahunt/git/git/git.c:912:19
#11 0x7974da in main /home/ahunt/git/git/common-main.c:52:11
#12 0x7f2749a0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)
#13 0x421bd9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120
Uninitialized value was created by an allocation of 'res' in the stack frame of function 'report_result'
#0 0x4f72c0 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:55
SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/ahunt/git/git/wrapper.c:256:8 in xwrite
Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "-m" option in "git log -m" that does not specify which format,
if any, of diff is desired did not have any visible effect; it now
implies some form of diff (by default "--patch") is produced.
* so/log-m-implies-p:
diff-merges: let "-m" imply "-p"
diff-merges: rename "combined_imply_patch" to "merges_imply_patch"
stash list: stop passing "-m" to "git log"
git-svn: stop passing "-m" to "git rev-list"
diff-merges: move specific diff-index "-m" handling to diff-index
t4013: test "git diff-index -m"
t4013: test "git diff-tree -m"
t4013: test "git log -m --stat"
t4013: test "git log -m --raw"
t4013: test that "-m" alone has no effect in "git log"
Recent "git clone" left a temporary directory behind when the
transport layer returned an failure.
* jk/clone-clean-upon-transport-error:
clone: clean up directory after transport_fetch_refs() failure
Fix typos in documentation, code comments, and RelNotes which repeat
various words. In trivial cases, just delete the duplicated word and
rewrap text, if needed. Reword the affected sentence in
Documentation/RelNotes/1.8.4.txt for it to make sense.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change various cmd_* functions that claim to return an "int" to use
"return" instead of exit() to indicate an exit code. These were not
marked with NORETURN, and by directly exit()-ing we'll skip the
cleanup git.c would otherwise do (e.g. closing fd's, erroring if we
can't). See run_builtin() in git.c.
In the case of shell.c and sh-i18n--envsubst.c this was the result of
an incomplete migration to using a cmd_main() in 3f2e2297b9 (add an
extra level of indirection to main(), 2016-07-01).
This was spotted by SunCC 12.5 on Solaris 10 (gcc210 on the gccfarm).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two "if (opt->all_objects)" blocks next
to each other, merge them into one to provide better
readability.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --batch code to print an object assumes we found out the type of
the object from calling oid_object_info_extended(). This is true for
the default format, but even in a custom format, we manually modify
the object_info struct to ask for the type.
This assumption was broken by 845de33a5b (cat-file: avoid noop calls
to sha1_object_info_extended, 2016-05-18). That commit skips the call
to oid_object_info_extended() entirely when --batch-all-objects is in
use, and the custom format does not include any placeholders that
require calling it.
Or when the custom format only include placeholders like %(objectname) or
%(rest), oid_object_info_extended() will not get the type of the object.
This results in an error when we try to confirm that the type didn't
change:
$ git cat-file --batch=batman --batch-all-objects
batman
fatal: object 000023961a changed type!?
and also has other subtle effects (e.g., we'd fail to stream a blob,
since we don't realize it's a blob in the first place).
We can fix this by flipping the order of the setup. The check for "do
we need to get the object info" must come _after_ we've decided
whether we need to look up the type.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All we need to know is that their names are the same.
Additionally this might be easier to parse for some since
remote_for_branch is more descriptive than remote_get(NULL).
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's a single line that is used in a single place, and the variable has
the same name as the function.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If fetch_remote is NULL (i.e. the branch remote is invalid), then it
can't possibly be same as remote, which can't be NULL.
The check is redundant, and so is the extra variable.
Also, fix the Yoda condition: we want to check if remote is the same as
the branch remote, not the other way around.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only override dst on the odd case.
This allows a preemptive break on the `simple` case.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Their code is much simpler now and can move into the parent function.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the setup_push_* functions are appending a refspec. Do this only
once on the parent function.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No need to do it in every single function.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want all the cases that don't do anything with a branch first, and
then the rest. That way we will be able to get the branch and die if
there's a problem in the parent function, instead of inside the function
of each mode.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's no need to break when nothing else will be executed.
Will help next patches.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This code is duplicated among multiple functions.
No functional changes.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's a safety check to make sure branch->refname isn't different
from branch->merge[0]->src, otherwise we die().
Therefore we always push to branch->refname.
Suggestions-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simply move the code around and remove dead code. In particular the
'!same_remote' conditional is a no-op since that part of the code is the
same_remote leg of the conditional beforehand.
No functional changes.
Suggestions-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to avoid doing unnecessary things and simplify it in further
patches. In particular moving the additional name safety out of
setup_push_upstream() and into setup_push_simple() and thus making both
more straightforward.
The code is copied exactly as-is; no functional changes.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`simple` is the most important mode so move the relevant code to its own
function to make it easier to see what it's doing.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The typical case is what git was designed for: distributed remotes.
It's only the atypical case--fetching and pushing to the same
remote--that we need to keep an eye on.
No functional changes.
Liked-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a warning on AIX's xlc compiler that's been emitted since my
a1aad71601 (fsck.h: use "enum object_type" instead of "int",
2021-03-28):
"builtin/fsck.c", line 805.32: 1506-068 (W) Operation between
types "int(*)(struct object*,enum object_type,void*,struct
fsck_options*)" and "int(*)(struct object*,int,void*,struct
fsck_options*)" is not allowed.
I.e. it complains about us assigning a function with a prototype "int"
where we're expecting "enum object_type".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"ld" on Solaris fails to link some test helpers, which has been
worked around by reshuffling the inline function definitions from a
header file to a source file that is the only user of them.
* ab/pack-linkage-fix:
pack-objects: move static inline from a header to the sole consumer
Move the code that is only used in builtin/pack-objects.c out of
pack-objects.h.
This fixes an issue where Solaris's SunCC hasn't been able to compile
git since 483fa7f42d (t/helper/test-bitmap.c: initial commit,
2021-03-31).
The real origin of that issue is that in 898eba5e63 (pack-objects:
refer to delta objects by index instead of pointer, 2018-04-14)
utility functions only needed by builtin/pack-objects.c were added to
pack-objects.h. Since then the header has been used in a few other
places, but 483fa7f42d was the first time it was used by test helper.
Since Solaris is stricter about linking and the oe_get_size_slow()
function lives in builtin/pack-objects.c the build started failing
with:
Undefined first referenced
symbol in file
oe_get_size_slow t/helper/test-bitmap.o
ld: fatal: symbol referencing errors. No output written to t/helper/test-tool
On other platforms this is presumably OK because the compiler and/or
linker detects that the "static inline" functions that reference
oe_get_size_slow() aren't used.
Let's solve this by moving the relevant code from pack-objects.h to
builtin/pack-objects.c. This is almost entirely a code-only move, but
because of the early macro definitions in that file referencing some
of these inline functions we need to move the definition of "static
struct packing_data to_pack" earlier, and declare these inline
functions above the macros.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to read the init.templateDir setting at builtin/init-db.c using
a git_config() callback that, in turn, called git_config_pathname(). To
simplify the config reading logic at this file and plug a memory leak,
this was replaced by a direct call to git_config_get_value() at
e4de4502e6 ("init: remove git_init_db_config() while fixing leaks",
2021-03-14). However, this function doesn't provide path expanding
semantics, like git_config_pathname() does, so paths with '~/' and
'~user/' are treated literally. This makes 'git init' fail to handle
init.templateDir paths using these constructs:
$ git config init.templateDir '~/templates_dir'
$ git init
'warning: templates not found in ~/templates_dir'
Replace the git_config_get_value() call by git_config_get_pathname(),
which does the '~/' and '~user/' expansions. Also add a regression test.
Note that unlike git_config_get_value(), the config cache does not own
the memory for the path returned by git_config_get_pathname(), so we
must free() it.
Reported on IRC by rkta.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Another brown paper bag inconsistency fix for a new feature
introduced during this cycle.
* dl/stash-show-untracked-fixup:
stash show: use stash.showIncludeUntracked even when diff options given
The "rev-parse" command did not diagnose the lack of argument to
"--path-format" option, which was introduced in v2.31 era, which
has been corrected.
* wm/rev-parse-path-format-wo-arg:
rev-parse: fix segfault with missing --path-format argument
If options pertaining to how the diff is displayed is provided to
`git stash show`, the command will ignore the stash.showIncludeUntracked
configuration variable, defaulting to not showing any untracked files.
This is unintuitive behaviour since the format of the diff output and
whether or not to display untracked files are orthogonal.
Use stash.showIncludeUntracked even when diff options are given. Of
course, this is still overridable via the command-line options.
Update the documentation to explicitly say which configuration variables
will be overridden when a diff options are given.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Passing "-m" in "git log --first-parent -m" is not needed as
--first-parent implies --diff-merges=first-parent anyway. OTOH, it
will stop being harmless once we let "-m" imply "-p".
While we are at it, fix corresponding test description in t3903-stash
to match what it actually tests.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move specific handling of "-m" for diff-index to diff-index.c, so
diff-merges is left to handle only diff for merges options.
Being a better design by itself, this is especially essential in
preparation for letting -m imply -p, as "diff-index -m" obviously
should not imply -p, as it's entirely unrelated.
To handle this, in addition to moving specific diff-index "-m" code
out of diff-merges, we introduce new
diff_merges_suppress_options_parsing()
and call it before generic options processing in cmd_diff_index().
This new diff_merges_suppress_options_parsing() could then be reused
and called before invocations of setup_revisions() for other commands
that don't need --diff-merges options, but that's outside of the scope
of these patch series.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clean" and "git ls-files -i" had confusion around working on
or showing ignored paths inside an ignored directory, which has
been corrected.
* en/dir-traversal:
dir: introduce readdir_skip_dot_and_dotdot() helper
dir: update stale description of treat_directory()
dir: traverse into untracked directories if they may have ignored subfiles
dir: avoid unnecessary traversal into ignored directory
t3001, t7300: add testcase showcasing missed directory traversal
t7300: add testcase showing unnecessary traversal into ignored directory
ls-files: error out on -i unless -o or -c are specified
dir: report number of visited directories and paths with trace2
dir: convert trace calls to trace2 equivalents
git-clone started respecting errors from the transport subsystem in
aab179d937 (builtin/clone.c: don't ignore transport_fetch_refs() errors,
2020-12-03). However, that commit didn't handle the cleanup of the
filesystem quite right.
The cleanup of the directory that cmd_clone() creates is done by an
atexit() handler, which we control with a flag. It starts as
JUNK_LEAVE_NONE ("clean up everything"), then progresses to
JUNK_LEAVE_REPO when we know we have a valid repo but not working tree,
and then finally JUNK_LEAVE_ALL when we have a successful checkout.
Most errors cause us to die(), which then triggers the handler to do the
right thing based on how far into cmd_clone() we got. But the checks
added by aab179d937 instead set the "err" variable and then jump to a
new "cleanup" label, which then returns our non-zero status. However,
the code after the cleanup label includes setting the flag to
JUNK_LEAVE_ALL, and so we accidentally leave the repository and working
tree in place.
One obvious option to fix this is to reorder the end of the function to
set the flag first, before cleanup code, and put the label between them.
But we can observe another small bug: the error return from
transport_fetch_refs() is generally "-1", and we propagate that to the
return value of cmd_clone(), which ultimately becomes the exit code of
the process. And we try to avoid transmitting negative values via exit
codes (only the low 8 bits are passed along as an unsigned value, though
in practice for "-1" this at least retains the property that it's
non-zero).
Instead, let's just die(). That makes us consistent with rest of the
code in the function. It does add a new "fatal:" line to the output, but
I'd argue that's a good thing:
- in the rare case that the transport code didn't say anything, now
the user gets _some_ error message
- even if the transport code said something like "error: ssh died of
signal 9", it's nice to also say "fatal" to indicate that we
considered that to be a show-stopper.
Triggering this in the test suite turns out to be surprisingly
difficult. Almost every error we'd encounter, including ones deep inside
the transport code, cause us to just die() right there! However, one way
is to put a fake wrapper around git-upload-pack that sends the complete
packfile but exits with a failure code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These error messages are intended for the user. Let's touch them up
since we're here from the previous commit.
Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Calling "git rev-parse --path-format" without an argument segfaults
instead of giving an error message. Commit fac60b8925 (rev-parse: add
option for absolute or relative path formatting, 2020-12-13) added the
argument parsing code but forgot to handle NULL.
Returning an error makes sense here because there is no default value we
could use. Add a test case to verify.
Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code to handle options recently added to "git stash show"
around untracked part of the stash segfaulted when these options
were used on a stash entry that does not record untracked part.
* dl/stash-show-untracked-fixup:
stash show: fix segfault with --{include,only}-untracked
t3905: correct test title
"git mailinfo" (hence "git am") learned the "--quoted-cr" option to
control how lines ending with CRLF wrapped in base64 or qp are
handled.
* dd/mailinfo-quoted-cr:
am: learn to process quoted lines that ends with CRLF
mailinfo: allow stripping quoted CR without warning
mailinfo: allow squelching quoted CRLF warning
mailinfo: warn if CRLF found in decoded base64/QP email
mailinfo: stop parsing options manually
mailinfo: load default metainfo_charset lazily
The final part of "parallel checkout".
* mt/parallel-checkout-part-3:
ci: run test round with parallel-checkout enabled
parallel-checkout: add tests related to .gitattributes
t0028: extract encoding helpers to lib-encoding.sh
parallel-checkout: add tests related to path collisions
parallel-checkout: add tests for basic operations
checkout-index: add parallel checkout support
builtin/checkout.c: complete parallel checkout support
make_transient_cache_entry(): optionally alloc from mem_pool
"git push" learns to discover common ancestor with the receiving
end over protocol v2.
* jt/push-negotiation:
send-pack: support push negotiation
fetch: teach independent negotiation (no packfile)
fetch-pack: refactor command and capability write
fetch-pack: refactor add_haves()
fetch-pack: refactor process_acks()
These strings have not been modified in any translation, nor should they
be.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git add -i --dry-run" does not dry-run, which was surprising. The
combination of options has taught to error out.
* ow/no-dryrun-in-add-i:
add: die if both --dry-run and --interactive are given
When `git stash show --include-untracked` or
`git stash show --only-untracked` is run on a stash that doesn't include
an untracked entry, a segfault occurs. This happens because we do not
check whether the untracked entry is actually present and just attempt
to blindly dereference it.
Ensure that the untracked entry is present before actually attempting to
dereference it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many places in the code were doing
while ((d = readdir(dir)) != NULL) {
if (is_dot_or_dotdot(d->d_name))
continue;
...process d...
}
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
...process d...
}
This helper particularly simplifies checks for empty directories.
Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms. (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-files --ignored can be used together with either --others or
--cached. After being perplexed for a bit and digging in to the code, I
assumed that ls-files -i was just broken and not printing anything and
I had a nice patch ready to submit when I finally realized that -i can be
used with --cached to find tracked ignores.
While that was a mistake on my part, and a careful reading of the
documentation could have made this more clear, I suspect this is an
error others are likely to make as well. In fact, of two uses in our
testsuite, I believe one of the two did make this error. In t1306.13,
there are NO tracked files, and all the excludes built up and used in
that test and in previous tests thus have to be about untracked files.
However, since they were looking for an empty result, the mistake went
unnoticed as their erroneous command also just happened to give an empty
answer.
-i will most the time be used with -o, which would suggest we could just
make -i imply -o in the absence of either a -o or -c, but that would be
a backward incompatible break. Instead, let's just flag -i without
either a -o or -c as an error, and update the two relevant testcases to
specify their intent.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fixes two memory leaks when running `git maintenance start` or `git
maintenance stop` in `update_background_schedule`:
$ valgrind --leak-check=full ~/git/bin/git maintenance start
==76584== Memcheck, a memory error detector
==76584== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==76584== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==76584== Command: /home/lenaic/git/bin/git maintenance start
==76584==
==76584==
==76584== HEAP SUMMARY:
==76584== in use at exit: 34,880 bytes in 252 blocks
==76584== total heap usage: 820 allocs, 568 frees, 146,414 bytes allocated
==76584==
==76584== 65 bytes in 1 blocks are definitely lost in loss record 17 of 39
==76584== at 0x483E6AF: malloc (vg_replace_malloc.c:306)
==76584== by 0x3DC39C: xrealloc (wrapper.c:126)
==76584== by 0x3992CC: strbuf_grow (strbuf.c:98)
==76584== by 0x39A473: strbuf_vaddf (strbuf.c:392)
==76584== by 0x39BC54: xstrvfmt (strbuf.c:979)
==76584== by 0x39BD2C: xstrfmt (strbuf.c:989)
==76584== by 0x18451B: update_background_schedule (gc.c:1977)
==76584== by 0x1846F6: maintenance_start (gc.c:2011)
==76584== by 0x1847B4: cmd_maintenance (gc.c:2030)
==76584== by 0x127A2E: run_builtin (git.c:453)
==76584== by 0x127E81: handle_builtin (git.c:704)
==76584== by 0x128142: run_argv (git.c:771)
==76584==
==76584== 240 bytes in 1 blocks are definitely lost in loss record 29 of 39
==76584== at 0x4840D7B: realloc (vg_replace_malloc.c:834)
==76584== by 0x491CE5D: getdelim (in /usr/lib/libc-2.33.so)
==76584== by 0x39ADD7: strbuf_getwholeline (strbuf.c:635)
==76584== by 0x39AF31: strbuf_getdelim (strbuf.c:706)
==76584== by 0x39B064: strbuf_getline_lf (strbuf.c:727)
==76584== by 0x184273: crontab_update_schedule (gc.c:1919)
==76584== by 0x184678: update_background_schedule (gc.c:1997)
==76584== by 0x1846F6: maintenance_start (gc.c:2011)
==76584== by 0x1847B4: cmd_maintenance (gc.c:2030)
==76584== by 0x127A2E: run_builtin (git.c:453)
==76584== by 0x127E81: handle_builtin (git.c:704)
==76584== by 0x128142: run_argv (git.c:771)
==76584==
==76584== LEAK SUMMARY:
==76584== definitely lost: 305 bytes in 2 blocks
==76584== indirectly lost: 0 bytes in 0 blocks
==76584== possibly lost: 0 bytes in 0 blocks
==76584== still reachable: 34,575 bytes in 250 blocks
==76584== suppressed: 0 bytes in 0 blocks
==76584== Reachable blocks (those to which a pointer was found) are not shown.
==76584== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==76584==
==76584== For lists of detected and suppressed errors, rerun with: -s
==76584== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Options to "git pack-objects" that take numeric values like
--window and --depth should not accept negative values; the input
validation has been tightened.
* jk/pack-objects-negative-options-fix:
pack-objects: clamp negative depth to 0
t5316: check behavior of pack-objects --depth=0
pack-objects: clamp negative window size to 0
t5300: check that we produced expected number of deltas
t5300: modernize basic tests
A few variants of informational message "Already up-to-date" has
been rephrased.
* js/merge-already-up-to-date-message-reword:
merge: fix swapped "up to date" message components
merge(s): apply consistent punctuation to "up to date" messages
"git bisect skip" when custom words are used for new/old did not
work, which has been corrected.
* rj/bisect-skip-honor-terms:
bisect--helper: use BISECT_TERMS in 'bisect skip' command
Amend the code added in 611e42a598 (xdiff: provide a separate emit
callback for hunks, 2018-11-02) to be more readable by using
designated initializers.
This changes "priv" in rerere.c to be initialized to NULL as we did in
merge-tree.c. That's not needed as we'll only use it if the callback
is defined, but being consistent here is better and less verbose.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git repack -A -d" in a partial clone unnecessarily loosened
objects in promisor pack.
* rs/repack-without-loosening-promised-objects:
repack: avoid loosening promisor objects in partial clones
SHA-256 transition.
* bc/hash-transition-interop-part-1:
hex: print objects using the hash algorithm member
hex: default to the_hash_algo on zero algorithm value
builtin/pack-objects: avoid using struct object_id for pack hash
commit-graph: don't store file hashes as struct object_id
builtin/show-index: set the algorithm for object IDs
hash: provide per-algorithm null OIDs
hash: set, copy, and use algo field in struct object_id
builtin/pack-redundant: avoid casting buffers to struct object_id
Use the final_oid_fn to finalize hashing of object IDs
hash: add a function to finalize object IDs
http-push: set algorithm when reading object ID
Always use oidread to read into struct object_id
hash: add an algo member to struct object_id
In previous changes, mailinfo has learnt to process lines that decoded
from base64 or quoted-printable, and ends with CRLF.
Let's teach "am" that new trick, too.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In previous change, Git starts to warn for quoted CRLF in decoded
base64/QP email. Despite those warnings are usually helpful,
quoted CRLF could be part of some users' workflow.
Let's give them an option to turn off the warning completely.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug various leans reported by LSAN.
* ah/plugleaks:
builtin/rm: avoid leaking pathspec and seen
builtin/rebase: release git_format_patch_opt too
builtin/for-each-ref: free filter and UNLEAK sorting.
mailinfo: also free strbuf lists when clearing mailinfo
builtin/checkout: clear pending objects after diffing
builtin/check-ignore: clear_pathspec before returning
builtin/bugreport: don't leak prefixed filename
branch: FREE_AND_NULL instead of NULL'ing real_ref
bloom: clear each bloom_key after use
ls-files: free max_prefix when done
wt-status: fix multiple small leaks
revision: free remainder of old commit list in limit_list
"git rev-list" learns the "--filter=object:type=<type>" option,
which can be used to exclude objects of the given kind from the
packfile generated by pack-objects.
* ps/rev-list-object-type-filter:
rev-list: allow filtering of provided items
pack-bitmap: implement combined filter
pack-bitmap: implement object type filter
list-objects: implement object type filter
list-objects: support filtering by tag and commit
list-objects: move tag processing into its own function
revision: mark commit parents as NOT_USER_GIVEN
uploadpack.txt: document implication of `uploadpackfilter.allow`
"git add" and "git rm" learned not to touch those paths that are
outside of sparse checkout.
* mt/add-rm-in-sparse-checkout:
rm: honor sparse checkout patterns
add: warn when asked to update SKIP_WORKTREE entries
refresh_index(): add flag to ignore SKIP_WORKTREE entries
pathspec: allow to ignore SKIP_WORKTREE entries on index matching
add: make --chmod and --renormalize honor sparse checkouts
t3705: add tests for `git add` in sparse checkouts
add: include magic part of pathspec on --refresh error
Replace GIT_CONFIG_NOSYSTEM mechanism to decline from reading the
system-wide configuration file with GIT_CONFIG_SYSTEM that lets
users specify from which file to read the system-wide configuration
(setting it to an empty file would essentially be the same as
setting NOSYSTEM), and introduce GIT_CONFIG_GLOBAL to override the
per-user configuration in $HOME/.gitconfig.
* ps/config-global-override:
t1300: fix unset of GIT_CONFIG_NOSYSTEM leaking into subsequent tests
config: allow overriding of global and system configuration
config: unify code paths to get global config paths
config: rename `git_etc_config()`
"git (branch|tag) --format=..." has been micro-optimized.
* zh/format-ref-array-optim:
ref-filter: reuse output buffer
ref-filter: get rid of show_ref_array_item
The variable `matches` used to hold the return of a `dir_path_match()`
call that was removed in 95c11ecc73 ("Fix error-prone fill_directory()
API; make it only return matches", 2020-04-01). Now `matches` will
always hold 0, which is the value it's initialized with; and the
condition `matches != MATCHED_EXACTLY` will always evaluate to true. So
let's remove this unnecessary variable.
Interestingly, it seems that `matches != MATCHED_EXACTLY` was already
unnecessary before 95c11ecc73. That's because `remove_directories` is
always set to 1 when we have pathspecs; So, in the condition
`!remove_directories && matches != MATCHED_EXACTLY`, we would either:
- have pathspecs (or have been given `-d`) and ignore `matches` because
`remove_directories` is 1; or
- not have pathspecs (nor `-d`) and end up just checking that
`0 != MATCHED_EXACTLY`, as `matches` would never get reassigned
after its zero initialization (because there is no pathspec to match).
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later change, mailinfo will learn more options, let's switch to our
robust parse_options framework before that step.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a later change, we will use parse_option to parse mailinfo's options.
In mailinfo, both "-u", "-n", and "--encoding" try to set the same
field, with "-u" reset that field to some default value from
configuration variable "i18n.commitEncoding".
Let's delay the setting of that field until we finish processing all
options. By doing that, "i18n.commitEncoding" can be parsed on demand.
More importantly, it cleans the way for using parse_option.
This change introduces some inconsistent brackets "{}" in "if/else if"
construct, however, we will rewrite them in the next few changes.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The interactive machinery does not obey --dry-run. Die appropriately
if both flags are passed.
Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow checkout-index to use the parallel checkout framework, honoring
the checkout.workers configuration.
There are two code paths in checkout-index which call
`checkout_entry()`, and thus, can make use of parallel checkout:
`checkout_file()`, which is used to write paths explicitly given at the
command line; and `checkout_all()`, which is used to write all paths in
the index, when the `--all` option is given.
In both operation modes, checkout-index doesn't abort immediately on a
`checkout_entry()` failure. Instead, it tries to check out all remaining
paths before exiting with a non-zero exit code. To keep this behavior
when parallel checkout is being used, we must allow
`run_parallel_checkout()` to try writing the queued entries before we
exit, even if we already got an error code from a previous
`checkout_entry()` call.
However, `checkout_all()` doesn't return on errors, it calls `exit()`
with code 128. We could make it call `run_parallel_checkout()` before
exiting, but it makes the code easier to follow if we unify the exit
path for both checkout-index modes at `cmd_checkout_index()`, and let
this function take care of the interactions with the parallel checkout
API. So let's do that.
With this change, we also have to consider whether we want to keep using
128 as the error code for `git checkout-index --all`, while we use 1 for
`git checkout-index <path>` (even when the actual error is the same).
Since there is not much value in having code 128 only for `--all`, and
there is no mention about it in the docs (so it's unlikely that changing
it will break any existing script), let's make both modes exit with code
1 on `checkout_entry()` errors.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pathspec-limited checkouts (like `git checkout *.txt`) are performed by
a code path that doesn't yet support parallel checkout because it calls
checkout_entry() directly, instead of unpack_trees(). Let's add parallel
checkout support for this code path too.
The transient cache entries allocated in checkout_merged() are now
allocated in a mem_pool which is only discarded after parallel checkout
finishes. This is done because the entries need to be valid when
run_parallel_checkout() is called.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow make_transient_cache_entry() to optionally receive a mem_pool
struct in which it should allocate the entry. This will be used in the
following patch, to store some transient entries which should persist
until parallel checkout finishes.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, the packfile negotiation step within a Git fetch cannot be
done independent of sending the packfile, even though there is at least
one application wherein this is useful. Therefore, make it possible for
this negotiation step to be done independently. A subsequent commit will
use this for one such application - push negotiation.
This feature is for protocol v2 only. (An implementation for protocol v0
would require a separate implementation in the fetch, transport, and
transport helper code.)
In the protocol, the main hindrance towards independent negotiation is
that the server can unilaterally decide to send the packfile. This is
solved by a "wait-for-done" argument: the server will then wait for the
client to say "done". In practice, the client will never say it; instead
it will cease requests once it is satisfied.
In the client, the main change lies in the transport and transport
helper code. fetch_refs_via_pack() performs everything needed - protocol
version and capability checks, and the negotiation itself.
There are 2 code paths that do not go through fetch_refs_via_pack() that
needed to be individually excluded: the bundle transport (excluded
through requiring smart_options, which the bundle transport doesn't
support) and transport helpers that do not support takeover. If or when
we support independent negotiation for protocol v0, we will need to
modify these 2 code paths to support it. But for now, report failure if
independent negotiation is requested in these cases.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A negative delta depth makes no sense, and the code is not prepared to
handle it. If passed "--depth=-1" on the command line, then this line
from break_delta_chains():
cur->depth = (total_depth--) % (depth + 1);
triggers a divide-by-zero. This is undefined behavior according to the C
standard, but on POSIX systems results in SIGFPE killing the process.
This is certainly one way to inform the use that the command was
invalid, but it's a bit friendlier to just treat it as "don't allow any
deltas", which we already do for --depth=0.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A negative window size makes no sense, and the code in find_deltas() is
not prepared to handle it. If you pass "-1", for example, we end up
generate a 0-length array of "struct unpacked", but our loop assumes it
has at least one entry in it (and we end up reading garbage memory).
We could complain to the user about this, but it's more forgiving to
just clamp it to 0, which means "do not find any deltas at all". The
0-case is already tested earlier in the script, so we'll make sure this
does the same thing.
Reported-by: Yiyuan guo <yguoaz@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The rewrite of git-merge from shell to C in 1c7b76be7d (Build in merge,
2008-07-07) accidentally transformed the message:
Already up-to-date. (nothing to squash)
to:
(nothing to squash)Already up-to-date.
due to reversed printf() arguments. This problem has gone unnoticed
despite being touched over the years by 7f87aff22c (Teach/Fix pull/fetch
-q/-v options, 2008-11-15) and bacec47845 (i18n: git-merge basic
messages, 2011-02-22), and tangentially by bef4830e88 (i18n: merge: mark
messages for translation, 2016-06-17) and 7560f547e6 (treewide: correct
several "up-to-date" to "up to date", 2017-08-23).
Fix it by restoring the message to its intended order. While at it, help
translators out by avoiding "sentence Lego".
[es: rewrote commit message]
Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although the various "Already up to date" messages resulting from merge
attempts share identical phrasing, they use a mix of punctuation ranging
from "." to "!" and even "Yeeah!", which leads to extra work for
translators. Ease the job of translators by settling upon "." as
punctuation for all such messages.
While at it, take advantage of printf_ln() to further ease the
translation task so translators need not worry about line termination,
and fix a case of missing line termination in the (unused)
merge_ort_nonrecursive() function.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The checkout machinery has been taught to perform the actual
write-out of the files in parallel when able.
* mt/parallel-checkout-part-2:
parallel-checkout: add design documentation
parallel-checkout: support progress displaying
parallel-checkout: add configuration options
parallel-checkout: make it truly parallel
unpack-trees: add basic support for parallel checkout
Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.
* ds/sparse-index-protections: (47 commits)
name-hash: use expand_to_path()
sparse-index: expand_to_path()
name-hash: don't add directories to name_hash
revision: ensure full index
resolve-undo: ensure full index
read-cache: ensure full index
pathspec: ensure full index
merge-recursive: ensure full index
entry: ensure full index
dir: ensure full index
update-index: ensure full index
stash: ensure full index
rm: ensure full index
merge-index: ensure full index
ls-files: ensure full index
grep: ensure full index
fsck: ensure full index
difftool: ensure full index
commit: ensure full index
checkout: ensure full index
...
The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.
* ds/maintenance-prefetch-fix:
maintenance: respect remote.*.skipFetchAll
maintenance: use 'git fetch --prefetch'
fetch: add --prefetch option
maintenance: simplify prefetch logic
Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).
* jk/promisor-optim:
revision: avoid parsing with --exclude-promisor-objects
lookup_unknown_object(): take a repository argument
is_promisor_object(): free tree buffer after parsing
Commit e4c7b33747 ("bisect--helper: reimplement `bisect_skip` shell
function in C", 2021-02-03), as part of the shell-to-C conversion,
forgot to read the 'terms' file (.git/BISECT_TERMS) during the new
'bisect skip' command implementation. As a result, the 'bisect skip'
command will use the default 'bad'/'good' terms. If the bisection
terms have been set to non-default values (for example by the
'bisect start' command), then the 'bisect skip' command will fail.
In order to correct this problem, we insert a call to the get_terms()
function, which reads the non-default terms from that file (if set),
in the '--bisect-skip' command implementation of 'bisect--helper'.
Also, add a test[1] to protect against potential future regression.
[1] https://lore.kernel.org/git/xmqqim45h585.fsf@gitster.g/T/#m207791568054b0f8cf1a3942878ea36293273c7d
Reported-by: Trygve Aaberge <trygveaa@gmail.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git repack -A -d` is run in a partial clone, `pack-objects`
is invoked twice: once to repack all promisor objects, and once to
repack all non-promisor objects. The latter `pack-objects` invocation
is with --exclude-promisor-objects and --unpack-unreachable, which
loosens all objects unused during this invocation. Unfortunately,
this includes promisor objects.
Because the -d argument to `git repack` subsequently deletes all loose
objects also in packs, these just-loosened promisor objects will be
immediately deleted. However, this extra disk churn is unnecessary in
the first place. For example, in a newly-cloned partial repo that
filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up
unpacking all trees and commits into the filesystem because every
object, in this particular case, is a promisor object. Depending on
the repo size, this increases the disk usage considerably: In my copy
of the linux.git, the object directory peaked 26GB of more disk usage.
In order to avoid this extra disk churn, pass the names of the promisor
packfiles as --keep-pack arguments to the second invocation of
`pack-objects`. This informs `pack-objects` that the promisor objects
are already in a safe packfile and, therefore, do not need to be
loosened.
For testing, we need to validate whether any object was loosened.
However, the "evidence" (loosened objects) is deleted during the
process which prevents us from inspecting the object directory.
Instead, let's teach `pack-objects` to count loosened objects and
emit via trace2 thus allowing inspecting the debug events after the
process is finished. This new event is used on the added regression
test.
Lastly, add a new perf test to evaluate the performance impact
made by this changes (tested on git.git):
Test HEAD^ HEAD
----------------------------------------------------------
5600.3: gc 134.38(41.93+90.95) 7.80(6.72+1.35) -94.2%
For a bigger repository, such as linux.git, the improvement is
even bigger:
Test HEAD^ HEAD
-------------------------------------------------------------------
5600.3: gc 6833.00(918.07+3162.74) 268.79(227.02+39.18) -96.1%
These improvements are particular big because every object in the
newly-cloned partial repository is a promisor object.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_pathspec() populates pathspec, hence we need to clear it once it's
no longer needed. seen is xcalloc'd within the same function and
likewise needs to be freed once its no longer needed.
cmd_rm() has multiple early returns, therefore we need to clear or free
as soon as this data is no longer needed, as opposed to doing a cleanup
at the end.
LSAN output from t0020:
Direct leak of 112 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ac0a4 in do_xmalloc wrapper.c:41:8
#2 0x9ac07a in xmalloc wrapper.c:62:9
#3 0x873277 in parse_pathspec pathspec.c:582:2
#4 0x646ffa in cmd_rm builtin/rm.c:266:2
#5 0x4cd91d in run_builtin git.c:467:11
#6 0x4cb5f3 in handle_builtin git.c:719:3
#7 0x4ccf47 in run_argv git.c:808:4
#8 0x4caf49 in cmd_main git.c:939:19
#9 0x69dc0e in main common-main.c:52:11
#10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9ac2a6 in xrealloc wrapper.c:126:8
#2 0x93b14d in strbuf_grow strbuf.c:98:2
#3 0x93ccf6 in strbuf_vaddf strbuf.c:392:3
#4 0x93f726 in xstrvfmt strbuf.c:979:2
#5 0x93f8b3 in xstrfmt strbuf.c:989:8
#6 0x92ad8a in prefix_path_gently setup.c:115:15
#7 0x873a8d in init_pathspec_item pathspec.c:439:11
#8 0x87334f in parse_pathspec pathspec.c:589:3
#9 0x646ffa in cmd_rm builtin/rm.c:266:2
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dc0e in main common-main.c:52:11
#15 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 15 byte(s) in 1 object(s) allocated from:
#0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9ac048 in xstrdup wrapper.c:29:14
#2 0x873ba2 in init_pathspec_item pathspec.c:468:20
#3 0x87334f in parse_pathspec pathspec.c:589:3
#4 0x646ffa in cmd_rm builtin/rm.c:266:2
#5 0x4cd91d in run_builtin git.c:467:11
#6 0x4cb5f3 in handle_builtin git.c:719:3
#7 0x4ccf47 in run_argv git.c:808:4
#8 0x4caf49 in cmd_main git.c:939:19
#9 0x69dc0e in main common-main.c:52:11
#10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 1 byte(s) in 1 object(s) allocated from:
#0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9ac392 in xcalloc wrapper.c:140:8
#2 0x647108 in cmd_rm builtin/rm.c:294:9
#3 0x4cd91d in run_builtin git.c:467:11
#4 0x4cb5f3 in handle_builtin git.c:719:3
#5 0x4ccf47 in run_argv git.c:808:4
#6 0x4caf49 in cmd_main git.c:939:19
#7 0x69dbfe in main common-main.c:52:11
#8 0x7f4fac1b0349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
options.git_format_patch_opt can be populated during cmd_rebase's setup,
and will therefore leak on return. Although we could just UNLEAK all of
options, we choose to strbuf_release() the individual member, which matches
the existing pattern (where we're freeing invidual members of options).
Leak found when running t0021:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9ac296 in xrealloc wrapper.c:126:8
#2 0x93b13d in strbuf_grow strbuf.c:98:2
#3 0x93bd3a in strbuf_add strbuf.c:295:2
#4 0x60ae92 in strbuf_addstr strbuf.h:304:2
#5 0x605f17 in cmd_rebase builtin/rebase.c:1759:3
#6 0x4cd91d in run_builtin git.c:467:11
#7 0x4cb5f3 in handle_builtin git.c:719:3
#8 0x4ccf47 in run_argv git.c:808:4
#9 0x4caf49 in cmd_main git.c:939:19
#10 0x69dbfe in main common-main.c:52:11
#11 0x7f66dae91349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sorting might be a list allocated in ref_default_sorting() (in this case
it's a fixed single item list, which has nevertheless been xcalloc'd),
or it might be a list allocated in parse_opt_ref_sorting(). In either
case we could free these lists - but instead we UNLEAK as we're at the
end of cmd_for_each_ref. (There's no existing implementation of
clear_ref_sorting(), and writing a loop to free the list seems more
trouble than it's worth.)
filter.with_commit/no_commit are populated via
OPT_CONTAINS/OPT_NO_CONTAINS, both of which create new entries via
parse_opt_commits(), and also need to be free'd or UNLEAK'd. Because
free_commit_list() already exists, we choose to use that over an UNLEAK.
LSAN output from t0041:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
#1 0x9ac252 in xcalloc wrapper.c:140:8
#2 0x8a4a55 in ref_default_sorting ref-filter.c:2486:32
#3 0x56c6b1 in cmd_for_each_ref builtin/for-each-ref.c:72:13
#4 0x4cd91d in run_builtin git.c:467:11
#5 0x4cb5f3 in handle_builtin git.c:719:3
#6 0x4ccf47 in run_argv git.c:808:4
#7 0x4caf49 in cmd_main git.c:939:19
#8 0x69dabe in main common-main.c:52:11
#9 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9abf54 in do_xmalloc wrapper.c:41:8
#2 0x9abf2a in xmalloc wrapper.c:62:9
#3 0x717486 in commit_list_insert commit.c:540:33
#4 0x8644cf in parse_opt_commits parse-options-cb.c:98:2
#5 0x869bb5 in get_value parse-options.c:181:11
#6 0x8677dc in parse_long_opt parse-options.c:378:10
#7 0x8659bd in parse_options_step parse-options.c:817:11
#8 0x867fcd in parse_options parse-options.c:870:10
#9 0x56c62b in cmd_for_each_ref builtin/for-each-ref.c:59:2
#10 0x4cd91d in run_builtin git.c:467:11
#11 0x4cb5f3 in handle_builtin git.c:719:3
#12 0x4ccf47 in run_argv git.c:808:4
#13 0x4caf49 in cmd_main git.c:939:19
#14 0x69dabe in main common-main.c:52:11
#15 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
add_pending_object() populates rev.pending, we need to take care of
clearing it once we're done.
This code is run close to the end of a checkout, therefore this leak
seems like it would have very little impact. See also LSAN output
from t0020 below:
Direct leak of 2048 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9acc46 in xrealloc wrapper.c:126:8
#2 0x83e3a3 in add_object_array_with_path object.c:337:3
#3 0x8f672a in add_pending_object_with_path revision.c:329:2
#4 0x8eaeab in add_pending_object_with_mode revision.c:336:2
#5 0x8eae9d in add_pending_object revision.c:342:2
#6 0x5154a0 in show_local_changes builtin/checkout.c:602:2
#7 0x513b00 in merge_working_tree builtin/checkout.c:979:3
#8 0x512cb3 in switch_branches builtin/checkout.c:1242:9
#9 0x50f8de in checkout_branch builtin/checkout.c:1646:9
#10 0x50ba12 in checkout_main builtin/checkout.c:2003:9
#11 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
#12 0x4cd91d in run_builtin git.c:467:11
#13 0x4cb5f3 in handle_builtin git.c:719:3
#14 0x4ccf47 in run_argv git.c:808:4
#15 0x4caf49 in cmd_main git.c:939:19
#16 0x69e43e in main common-main.c:52:11
#17 0x7f5dd1d50349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 2048 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parse_pathspec() allocates new memory into pathspec, therefore we need
to free it when we're done.
An UNLEAK would probably be just as good here - but clear_pathspec() is
not much more work so we might as well use it. check_ignore() is either
called once directly from cmd_check_ignore() (in which case the leak
really doesnt matter), or it can be called multiple times in a loop from
check_ignore_stdin_paths(), in which case we're potentially leaking
multiple times - but even in this scenario the leak is so small as to
have no real consequence.
Found while running t0008:
Direct leak of 112 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9aca44 in do_xmalloc wrapper.c:41:8
#2 0x9aca1a in xmalloc wrapper.c:62:9
#3 0x873c17 in parse_pathspec pathspec.c:582:2
#4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
#5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
#6 0x4cd91d in run_builtin git.c:467:11
#7 0x4cb5f3 in handle_builtin git.c:719:3
#8 0x4ccf47 in run_argv git.c:808:4
#9 0x4caf49 in cmd_main git.c:939:19
#10 0x69e43e in main common-main.c:52:11
#11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9acc46 in xrealloc wrapper.c:126:8
#2 0x93baed in strbuf_grow strbuf.c:98:2
#3 0x93d696 in strbuf_vaddf strbuf.c:392:3
#4 0x9400c6 in xstrvfmt strbuf.c:979:2
#5 0x940253 in xstrfmt strbuf.c:989:8
#6 0x92b72a in prefix_path_gently setup.c:115:15
#7 0x87442d in init_pathspec_item pathspec.c:439:11
#8 0x873cef in parse_pathspec pathspec.c:589:3
#9 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
#10 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
#11 0x4cd91d in run_builtin git.c:467:11
#12 0x4cb5f3 in handle_builtin git.c:719:3
#13 0x4ccf47 in run_argv git.c:808:4
#14 0x4caf49 in cmd_main git.c:939:19
#15 0x69e43e in main common-main.c:52:11
#16 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Indirect leak of 2 byte(s) in 1 object(s) allocated from:
#0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
#1 0x9ac9e8 in xstrdup wrapper.c:29:14
#2 0x874542 in init_pathspec_item pathspec.c:468:20
#3 0x873cef in parse_pathspec pathspec.c:589:3
#4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
#5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
#6 0x4cd91d in run_builtin git.c:467:11
#7 0x4cb5f3 in handle_builtin git.c:719:3
#8 0x4ccf47 in run_argv git.c:808:4
#9 0x4caf49 in cmd_main git.c:939:19
#10 0x69e43e in main common-main.c:52:11
#11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)
SUMMARY: AddressSanitizer: 179 byte(s) leaked in 3 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
prefix_filename() returns newly allocated memory, and strbuf_addstr()
doesn't take ownership of its inputs. Therefore we have to make sure to
store and free prefix_filename()'s result.
As this leak is in cmd_bugreport(), we could just as well UNLEAK the
prefix - but there's no good reason not to just free it properly. This
leak was found while running t0091, see output below:
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x49ab79 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
#1 0x9acc66 in xrealloc wrapper.c:126:8
#2 0x93baed in strbuf_grow strbuf.c:98:2
#3 0x93c6ea in strbuf_add strbuf.c:295:2
#4 0x69f162 in strbuf_addstr ./strbuf.h:304:2
#5 0x69f083 in prefix_filename abspath.c:277:2
#6 0x4fb275 in cmd_bugreport builtin/bugreport.c:146:9
#7 0x4cd91d in run_builtin git.c:467:11
#8 0x4cb5f3 in handle_builtin git.c:719:3
#9 0x4ccf47 in run_argv git.c:808:4
#10 0x4caf49 in cmd_main git.c:939:19
#11 0x69df9e in main common-main.c:52:11
#12 0x7f523a987349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
common_prefix() returns a new string, which we store in max_prefix -
this string needs to be freed to avoid a leak. This leak is happening
in cmd_ls_files, hence is of no real consequence - an UNLEAK would be
just as good, but we might as well free the string properly.
Leak found while running t0002, see output below:
Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x49a85d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x9ab1b4 in do_xmalloc wrapper.c:41:8
#2 0x9ab248 in do_xmallocz wrapper.c:75:8
#3 0x9ab22a in xmallocz wrapper.c:83:9
#4 0x9ab2d7 in xmemdupz wrapper.c:99:16
#5 0x78d6a4 in common_prefix dir.c:191:15
#6 0x5aca48 in cmd_ls_files builtin/ls-files.c:669:16
#7 0x4cd92d in run_builtin git.c:453:11
#8 0x4cb5fa in handle_builtin git.c:704:3
#9 0x4ccf57 in run_argv git.c:771:4
#10 0x4caf49 in cmd_main git.c:902:19
#11 0x69ce2e in main common-main.c:52:11
#12 0x7f64d4d94349 in __libc_start_main (/lib64/libc.so.6+0x24349)
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use struct object_id for the names of objects. It isn't intended to
be used for other hash values that don't name objects such as the pack
hash.
Because struct object_id will soon need to have its algorithm member
set, using it in this code path would mean that we didn't set that
member, only the hash member, which would result in a crash. For both
of these reasons, switch to using an unsigned char array of size
GIT_MAX_RAWSZ.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most cases, when we load the hash of an object into a struct
object_id, we load it using one of the oid* or *_oid_hex functions.
However, for git show-index, we read it in directly using fread. As a
consequence, set the algorithm correctly so the objects can be used
correctly both now and in the future.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Up until recently, object IDs did not have an algorithm member, only a
hash. Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms. Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.
Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we need our instances of struct object_id to be zero padded, we
can no longer cast unsigned char buffers to be pointers to struct
object_id. This file reads data out of the pack objects and then
inserts it directly into a linked list item which is a pointer to struct
object_id. Instead, let's have the linked list item hold its own struct
object_id and copy the data into it.
In addition, since these are not really pointers to struct object_id,
stop passing them around as such, and call them what they really are:
pointers to unsigned char.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we're hashing a value which is going to be an object ID, we want to
zero-pad that value if necessary. To do so, use the final_oid_fn
instead of the final_fn anytime we're going to create an object ID to
ensure we perform this operation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, we'll want oidread to automatically set the hash
algorithm member for an object ID we read into it, so ensure we use
oidread instead of hashcpy everywhere we're copying a hash value into a
struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.
* jk/pack-objects-bitmap-progress-fix:
pack-objects: update "nr_seen" progress based on pack-reused count
When we use `git for-each-ref`, every ref will allocate
its own output strbuf and error strbuf. But we can reuse
the final strbuf for each step ref's output. The error
buffer will also be reused, despite the fact that the git
will exit when `format_ref_array_item()` return a non-zero
value and output the contents of the error buffer.
The performance for `git for-each-ref` on the Git repository
itself with performance testing tool `hyperfine` changes from
23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms. Optimization is relatively
minor.
At the same time, we apply this optimization to `git tag -l`
and `git branch -l`.
This approach is similar to the one used by 79ed0a5
(cat-file: use a single strbuf for all output, 2018-08-14)
to speed up the cat-file builtin.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Inlining the exported function `show_ref_array_item()`,
which is not providing the right level of abstraction,
simplifies the API and can unlock improvements at the
former call sites.
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There's two callsites which assemble global config paths, once in the
config loading code and once in the git-config(1) builtin. We're about
to implement a way to override global config paths via an environment
variable which would require us to adjust both sites.
Unify both code paths into a single `git_global_config()` function which
returns both paths for `~/.gitconfig` and the XDG config file. This will
make the subsequent patch which introduces the new envvar easier to
implement.
No functional changes are expected from this patch.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `git_etc_gitconfig()` function retrieves the system-level path of
the configuration file. We're about to introduce a way to override it
via an environment variable, at which point the name of this function
would start to become misleading.
Rename the function to `git_system_config()` as a preparatory step.
While at it, the function is also refactored to pass memory ownership to
the caller. This is done to better match semantics of
`git_global_config()`, which is going to be introduced in the next
commit.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When providing an object filter, it is currently impossible to also
filter provided items. E.g. when executing `git rev-list HEAD` , the
commit this reference points to will be treated as user-provided and is
thus excluded from the filtering mechanism. This makes it harder than
necessary to properly use the new `--filter=object:type` filter given
that even if the user wants to only see blobs, he'll still see commits
of provided references.
Improve this by introducing a new `--filter-provided-objects` option
to the git-rev-parse(1) command. If given, then all user-provided
references will be subject to filtering.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use multiple worker processes to distribute the queued entries and call
write_pc_item() in parallel for them. The items are distributed
uniformly in contiguous chunks. This minimizes the chances of two
workers writing to the same directory simultaneously, which could affect
performance due to lock contention in the kernel. Work stealing (or any
other format of re-distribution) is not implemented yet.
The protocol between the main process and the workers is quite simple.
They exchange binary messages packed in pkt-line format, and use
PKT-FLUSH to mark the end of input (from both sides). The main process
starts the communication by sending N pkt-lines, each corresponding to
an item that needs to be written. These packets contain all the
necessary information to load, smudge, and write the blob associated
with each item. Then it waits for the worker to send back N pkt-lines
containing the results for each item. The resulting packet must contain:
the identification number of the item that it refers to, the status of
the operation, and the lstat() data gathered after writing the file (iff
the operation was successful).
For now, checkout always uses a hardcoded value of 2 workers, only to
demonstrate that the parallel checkout framework correctly divides and
writes the queued entries. The next patch will add user configurations
and define a more reasonable default, based on tests with the said
settings.
Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
New log.diffMerges configuration variable sets the format that
--diff-merges=on will be using. The default is "separate".
t4013: add the following tests for log.diffMerges config:
* Test that wrong values are denied.
* Test that the value of log.diffMerges properly affects both
--diff-merges=on and -m.
t9902: fix completion tests for log.d* to match log.diffMerges.
Added documentation for log.diffMerges.
Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug the ort merge backend throughout the rest of the system, and
start testing it as a replacement for the recursive backend.
* en/ort-readiness:
Add testing with merge-ort merge strategy
t6423: mark remaining expected failure under merge-ort as such
Revert "merge-ort: ignore the directory rename split conflict for now"
merge-recursive: add a bunch of FIXME comments documenting known bugs
merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
t: mark several submodule merging tests as fixed under merge-ort
merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
t6428: new test for SKIP_WORKTREE handling and conflicts
merge-ort: support subtree shifting
merge-ort: let renormalization change modify/delete into clean delete
merge-ort: have ll_merge() use a special attr_index for renormalization
merge-ort: add a special minimal index just for renormalization
merge-ort: use STABLE_QSORT instead of QSORT where required
If a remote has the skipFetchAll setting enabled, then that remote is
not intended for frequent fetching. It makes sense to not fetch that
data during the 'prefetch' maintenance task. Skip that remote in the
iteration without error. The skip_default_update member is initialized
in remote.c:handle_config() as part of initializing the 'struct remote'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'prefetch' maintenance task previously forced the following refspec
for each remote:
+refs/heads/*:refs/prefetch/<remote>/*
If a user has specified a more strict refspec for the remote, then this
prefetch task downloads more objects than necessary.
The previous change introduced the '--prefetch' option to 'git fetch'
which manipulates the remote's refspec to place all resulting refs into
refs/prefetch/, with further partitioning based on the destinations of
those refspecs.
Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch/ remains.
Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.
Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:
* Negative refspecs are preserved.
* Refspecs without a destination are removed.
* Refspecs whose source starts with "refs/tags/" are removed.
* Other refspecs are placed within "refs/prefetch/".
Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.
There are some interesting cases that are worth testing.
An earlier version of this change dropped the "i--" from the loop that
deletes a refspec item and shifts the remaining entries down. This
allowed some refspecs to not be modified. The subtle part about the
first --prefetch test is that the "refs/tags/*" refspec appears directly
before the "refs/heads/bogus/*" refspec. Without that "i--", this
ordering would remove the "refs/tags/*" refspec and leave the last one
unmodified, placing the result in "refs/heads/*".
It is possible to have an empty refspec. This is typically the case for
remotes other than the origin, where users want to fetch a specific tag
or branch. To correctly test this case, we need to further remove the
upstream remote for the local branch. Thus, we are testing a refspec
that will be deleted, leaving nothing to fetch.
Helped-by: Tom Saeger <tom.saeger@oracle.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid missing files.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one so we do not miss blobs to scan. Later, this can
integrate more carefully with sparse indexes with proper testing.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When verifying all blobs reachable from the index, ensure that a sparse
index has been expanded to a full one to avoid missing some blobs.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index has
been expanded to a full one to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These two loops iterate over all cache entries, so ensure that a sparse
index is expanded to a full index before we do so.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries in the checkout builtin, ensure
that we have a full index to avoid any unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before we iterate over all cache entries, ensure that the index is not
sparse. This loop in checkout_all() might be safe to iterate over a
sparse index, but let's put this protection here until it can be
carefully tested.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several methods specify that they take a 'struct index_state' pointer
with the 'const' qualifier because they intend to only query the data,
not change it. However, we will be introducing a step very low in the
method stack that might modify a sparse-index to become a full index in
the case that our queries venture inside a sparse-directory entry.
This change only removes the 'const' qualifiers that are necessary for
the following change which will actually modify the implementation of
index_name_stage_pos().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.
* tb/pack-preferred-tips-to-give-bitmap:
builtin/pack-objects.c: respect 'pack.preferBitmapTips'
t/helper/test-bitmap.c: initial commit
pack-bitmap: add 'test_bitmap_commits()' helper
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.
We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When serving a clone or fetch with bitmaps, after deciding which objects
need to be sent our "pack reuse" mechanism kicks in: we try to send
more-or-less verbatim a bunch of objects from the beginning of the
bitmapped packfile without even adding them to the to_pack.objects
array.
After deciding which objects will be in the "reused" portion, we update
nr_result to account for those, and then trigger display_progress() to
show the user (who is undoubtedly dazzled that we managed to enumerate
so many objects so quickly).
But then something confusing happens: the "Enumerating objects" progress
meter jumps _backwards_, counting up from zero the number of objects we
actually add into to_pack.objects.
This worked correctly once upon a time, but was broken in 5af050437a
(pack-objects: show some progress when counting kept objects,
2018-04-15), when the latter half of that progress meter switched to
using a separate nr_seen counter, rather than nr_result. Nobody noticed
for two reasons:
- prior to the pack-reuse fixes from a14aebeac3 (Merge branch
'jk/packfile-reuse-cleanup', 2020-02-14), the reuse code almost
never kicked in anyway
- the output looks _kind of_ correct. The "backwards" moment is hard
to catch, because we overwrite the old progress number with the new
one, and the larger number is displayed only for a second. So unless
you look at that exact second, you just see the much smaller value,
counting up to the number of non-reused objects (though of course if
you catch it in stderr, or look at GIT_TRACE_PACKET from a server
with bitmaps, you can see both values).
This smaller output isn't wrong per se, but isn't counting what we ever
intended to. We should give the user the whole number of objects we
considered (which, as per 5af050437a's original purpose, is already
_not_ a count of what goes into to_pack.objects). The follow-on
"Counting objects" meter shows the actual number of objects we feed into
that array.
We can easily fix this by bumping (and showing) nr_seen for the
pack-reused objects. When the included test is run without this patch,
the second pack-objects invocation produces "Enumerating objects: 1" to
show the one loose object, even though the resulting pack has hundreds
of objects in it. With it, we jump to "Enumerating objects: 674" after
deciding on reuse, and then "675" when we add in the loose object.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git add` refrains from adding or updating index entries that are
outside the current sparse checkout, but `git rm` doesn't follow the
same restriction. This is somewhat counter-intuitive and inconsistent.
So make `rm` honor the sparsity rules and advise on how to remove
SKIP_WORKTREE entries just like `add` does. Also add some tests for the
new behavior.
Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git add` already refrains from updating SKIP_WORKTREE entries, but it
silently exits with zero code when it is asked to do so. Instead, let's
warn the user and display a hint on how to update these entries.
Note that we only warn the user whey they give a pathspec item that
matches no eligible path for updating, but it does match one or more
SKIP_WORKTREE entries. A warning was chosen over erroring out right away
to reproduce the same behavior `add` already exhibits with ignored
files. This also allow users to continue their workflow without having
to invoke `add` again with only the eligible paths (as those will have
already been added).
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new enum parameter to `add_pathspec_matches_against_index()` and
`find_pathspecs_matching_against_index()`, allowing callers to specify
whether these function should attempt to match SKIP_WORKTREE entries or
not. This will be used in a future patch to make `git add` display a
warning when it is asked to update SKIP_WORKTREE entries.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `git add --refresh <pathspec>` doesn't find any matches for the
given pathspec, it prints an error message using the `match` field of
the `struct pathspec_item`. However, this field doesn't contain the
magic part of the pathspec. Instead, let's use the `original` field.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git cherry-pick/revert" with or without "--[no-]edit" did not spawn
the editor as expected (e.g. "revert --no-edit" after a conflict
still asked to edit the message), which has been corrected.
* en/sequencer-edit-upon-conflict-fix:
sequencer: fix edit handling for cherry-pick and revert messages
"git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.
* ll/clone-reject-shallow:
builtin/clone.c: add --reject-shallow option
An on-disk reverse-index to map the in-pack location of an object
back to its object name across multiple packfiles is introduced.
* tb/reverse-midx:
midx.c: improve cache locality in midx_pack_order_cmp()
pack-revindex: write multi-pack reverse indexes
pack-write.c: extract 'write_rev_file_order'
pack-revindex: read multi-pack reverse indexes
Documentation/technical: describe multi-pack reverse indexes
midx: make some functions non-static
midx: keep track of the checksum
midx: don't free midx_name early
midx: allow marking a pack as preferred
t/helper/test-read-midx.c: add '--show-objects'
builtin/multi-pack-index.c: display usage on unrecognized command
builtin/multi-pack-index.c: don't enter bogus cmd_mode
builtin/multi-pack-index.c: split sub-commands
builtin/multi-pack-index.c: define common usage with a macro
builtin/multi-pack-index.c: don't handle 'progress' separately
builtin/multi-pack-index.c: inline 'flags' with options
Fsck API clean-up.
* ab/fsck-api-cleanup:
fetch-pack: use new fsck API to printing dangling submodules
fetch-pack: use file-scope static struct for fsck_options
fetch-pack: don't needlessly copy fsck_options
fsck.c: move gitmodules_{found,done} into fsck_options
fsck.c: add an fsck_set_msg_type() API that takes enums
fsck.c: pass along the fsck_msg_id in the fsck_error callback
fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h
fsck.c: give "FOREACH_MSG_ID" a more specific name
fsck.c: undefine temporary STR macro after use
fsck.c: call parse_msg_type() early in fsck_set_msg_type()
fsck.h: re-order and re-assign "enum fsck_msg_type"
fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum
fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type"
fsck.c: rename remaining fsck_msg_id "id" to "msg_id"
fsck.c: remove (mostly) redundant append_msg_id() function
fsck.c: rename variables in fsck_set_msg_type() for less confusion
fsck.h: use "enum object_type" instead of "int"
fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT}
fsck.c: refactor and rename common config callback
A few option description strings started with capital letters,
which were corrected.
* cc/downcase-opt-help:
column, range-diff: downcase option description
"git commit" learned "--trailer <key>[=<value>]" option; together
with the interpret-trailers command, this will make it easier to
support custom trailers.
* zh/commit-trailer:
commit: add --trailer option
Plug or annotate remaining leaks that trigger while running the
very basic set of tests.
* ah/plugleaks:
transport: also free remote_refs in transport_disconnect()
parse-options: don't leak alias help messages
parse-options: convert bitfield values to use binary shift
init-db: silence template_dir leak when converting to absolute path
init: remove git_init_db_config() while fixing leaks
worktree: fix leak in dwim_branch()
clone: free or UNLEAK further pointers when finished
reset: free instead of leaking unneeded ref
symbolic-ref: don't leak shortened refname in check_symref()
The previous logic filled a string list with the names of each remote,
but instead we could simply run the appropriate 'git fetch' data
directly in the remote iterator. Do this for reduced code size, but also
because it sets up an upcoming change to use the remote's refspec. This
data is accessible from the 'struct remote' data that is now accessible
in fetch_remote().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch -v<n>" learned to allow a reroll count that is
not an integer.
* zh/format-patch-fractional-reroll-count:
format-patch: allow a non-integral version numbers
A simple IPC interface gets introduced to build services like
fsmonitor on top.
* jh/simple-ipc:
t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
simple-ipc: add Unix domain socket implementation
unix-stream-server: create unix domain socket under lock
unix-socket: disallow chdir() when creating unix domain sockets
unix-socket: add backlog size option to unix_stream_listen()
unix-socket: eliminate static unix_stream_socket() helper function
simple-ipc: add win32 implementation
simple-ipc: design documentation for new IPC mechanism
pkt-line: add options argument to read_packetized_to_strbuf()
pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
pkt-line: do not issue flush packets in write_packetized_*()
pkt-line: eliminate the need for static buffer in packet_write_gently()
Preparatory API changes for parallel checkout.
* mt/parallel-checkout-part-1:
entry: add checkout_entry_ca() taking preloaded conv_attrs
entry: move conv_attrs lookup up to checkout_entry()
entry: extract update_ce_after_write() from write_entry()
entry: make fstat_output() and read_blob_entry() public
entry: extract a header file for entry.c functions
convert: add classification for conv_attrs struct
convert: add get_stream_filter_ca() variant
convert: add [async_]convert_to_working_tree_ca() variants
convert: make convert_attrs() and convert structs public
When multiple packs in the multi-pack index contain the same object, the
MIDX machinery must make a choice about which pack it associates with
that object. Prior to this patch, the lowest-ordered[1] pack was always
selected.
Pack selection for duplicate objects is relatively unimportant today,
but it will become important for multi-pack bitmaps. This is because we
can only invoke the pack-reuse mechanism when all of the bits for reused
objects come from the reuse pack (in order to ensure that all reused
deltas can find their base objects in the same pack).
To encourage the pack selection process to prefer one pack over another
(the pack to be preferred is the one a caller would like to later use as
a reuse pack), introduce the concept of a "preferred pack". When
provided, the MIDX code will always prefer an object found in a
preferred pack over any other.
No format changes are required to store the preferred pack, since it
will be able to be inferred with a corresponding MIDX bitmap, by looking
up the pack associated with the object in the first bit position (this
ordering is described in detail in a subsequent commit).
[1]: the ordering is specified by MIDX internals; for our purposes we
can consider the "lowest ordered" pack to be "the one with the
most-recent mtime.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some scenarios, users may want more history than the repository
offered for cloning, which happens to be a shallow repository, can
give them. But because users don't know it is a shallow repository
until they download it to local, we may want to refuse to clone
this kind of repository, without creating any unnecessary files.
The '--depth=x' option cannot be used as a solution; the source may
be deep enough to give us 'x' commits when cloned, but the user may
later need to deepen the history to arbitrary depth.
Teach '--reject-shallow' option to "git clone" to abort as soon as
we find out that we are cloning from a shallow repository.
Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>