Since 4a4b4cda (builtin-remote: Make "remote -v" display push urls,
2009-06-13), the string_list that was initialized with 0 in its
strdup_string member is immediately made to strdup its key strings
by flipping the strdup_string member to true. When 183113a5
(string_list: Add STRING_LIST_INIT macro and make use of it.,
2010-07-04) has introduced STRING_LIST_INIT macros, it mechanically
replaced the initialization to STRING_LIST_INIT_NODUP.
Instead, just use the other initialization macro to make it strdup
the key from the beginning.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An earlier attempt to plug leaks placed a clean-up label to jump to
at a bogus place, which as been corrected.
* jk/diff-files-cleanup-fix:
diff-files: move misplaced cleanup label
"git clone" from a repository with some ref whose HEAD is unborn
did not set the HEAD in the resulting repository correctly, which
has been corrected.
* jk/clone-unborn-confusion:
clone: move unborn head creation to update_head()
clone: use remote branch if it matches default HEAD
clone: propagate empty remote HEAD even with other branches
clone: drop extra newline from warning message
The resolve-undo information in the index was not protected against
GC, which has been corrected.
* jc/resolve-undo:
fsck: do not dereference NULL while checking resolve-undo data
revision: mark blobs needed for resolve-undo as reachable
The previous change added the --update-refs command-line option. For
users who always want this mode, create the rebase.updateRefs config
option which behaves the same way as rebase.autoSquash does with the
--autosquash option.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When working on a large feature, it can be helpful to break that feature
into multiple smaller parts that become reviewed in sequence. During
development or during review, a change to one part of the feature could
affect multiple of these parts. An interactive rebase can help adjust
the multi-part "story" of the branch.
However, if there are branches tracking the different parts of the
feature, then rebasing the entire list of commits can create commits not
reachable from those "sub branches". It can take a manual step to update
those branches.
Add a new --update-refs option to 'git rebase -i' that adds 'update-ref
<ref>' steps to the todo file whenever a commit that is being rebased is
decorated with that <ref>. At the very end, the rebase process updates
all of the listed refs to the values stored during the rebase operation.
Be sure to iterate after any squashing or fixups are placed. Update the
branch only after those squashes and fixups are complete. This allows a
--fixup commit at the tip of the feature to apply correctly to the sub
branch, even if it is fixing up the most-recent commit in that part.
This change update the documentation and builtin to accept the
--update-refs option as well as updating the todo file with the
'update-ref' commands. Tests are added to ensure that these todo
commands are added in the correct locations.
This change does _not_ include the actual behavior of tracking the
updated refs and writing the new ref values at the end of the rebase
process. That is deferred to a later change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The way "git multi-pack" uses parse-options API has been improved.
* sg/multi-pack-index-parse-options-fix:
multi-pack-index: simplify handling of unknown --options
Add Coccinelle rules to detect the pattern of initializing and then
finalizing a structure without using it in between at all, which
happens after code restructuring and the compilers fail to
recognize as an unused variable.
* ab/cocci-unused:
cocci: generalize "unused" rule to cover more than "strbuf"
cocci: add and apply a rule to find "unused" strbufs
cocci: have "coccicheck{,-pending}" depend on "coccicheck-test"
cocci: add a "coccicheck-test" target and test *.cocci rules
Makefile & .gitignore: ignore & clean "git.res", not "*.res"
Makefile: remove mandatory "spatch" arguments from SPATCH_FLAGS
Apply Coccinelle rule to turn raw memmove() into MOVE_ARRAY() cpp
macro, which would improve maintainability and readability.
* jc/builtin-mv-move-array:
builtin/mv.c: use the MOVE_ARRAY() macro instead of memmove()
git-cat-file is used by tools like GitLab to get commit tag contents
that are then displayed to users. This content which has author,
committer or tagger information, could benefit from passing through the
mailmap mechanism before being sent or displayed.
This patch adds --[no-]use-mailmap command line option to the git
cat-file command. It also adds --[no-]mailmap option as an alias to
--[no-]use-mailmap.
This patch also introduces new test cases to test the mailmap mechanism in
git cat-file command.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: John Cai <johncai86@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further preparation to turn git-submodule.sh into a builtin.
* ab/submodule-cleanup:
git-sh-setup.sh: remove "say" function, change last users
git-submodule.sh: use "$quiet", not "$GIT_QUIET"
submodule--helper: eliminate internal "--update" option
submodule--helper: understand --checkout, --merge and --rebase synonyms
submodule--helper: report "submodule" as our name in some "-h" output
submodule--helper: rename "absorb-git-dirs" to "absorbgitdirs"
submodule update: remove "-v" option
submodule--helper: have --require-init imply --init
git-submodule.sh: remove unused top-level "--branch" argument
git-submodule.sh: make the "$cached" variable a boolean
git-submodule.sh: remove unused $prefix variable
git-submodule.sh: remove unused sanitize_submodule_env()
"git mv A B" in a sparsely populated working tree can be asked to
move a path between directories that are "in cone" (i.e. expected
to be materialized in the working tree) and "out of cone"
(i.e. expected to be hidden). The handling of such cases has been
improved.
* sy/mv-out-of-cone:
mv: add check_dir_in_index() and solve general dir check issue
mv: use flags mode for update_mode
mv: check if <destination> exists in index to handle overwriting
mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
mv: decouple if/else-if checks using goto
mv: update sparsity after moving from out-of-cone to in-cone
t1092: mv directory from out-of-cone to in-cone
t7002: add tests for moving out-of-cone file/directory
Allow large objects read from a packstream to be streamed into a
loose object file straight, without having to keep it in-core as a
whole.
* hx/unpack-streaming:
unpack-objects: use stream_loose_object() to unpack large objects
core doc: modernize core.bigFileThreshold documentation
object-file.c: add "stream_loose_object()" to handle large object
object-file.c: factor out deflate part of write_loose_object()
object-file.c: refactor write_loose_object() to several steps
unpack-objects: low memory footprint for get_data() in dry_run mode
"git merge-tree" learned a new mode where it takes two commits and
computes a tree that would result in the merge commit, if the
histories leading to these two commits were to be merged.
* en/merge-tree:
git-merge-tree.txt: add a section on potentional usage mistakes
merge-tree: add a --allow-unrelated-histories flag
merge-tree: allow `ls-files -u` style info to be NUL terminated
merge-ort: optionally produce machine-readable output
merge-ort: store more specific conflict information
merge-ort: make `path_messages` a strmap to a string_list
merge-ort: store messages in a list, not in a single strbuf
merge-tree: provide easy access to `ls-files -u` style info
merge-tree: provide a list of which files have conflicts
merge-ort: remove command-line-centric submodule message from merge-ort
merge-ort: provide a merge_get_conflicted_files() helper function
merge-tree: support including merge messages in output
merge-ort: split out a separate display_update_messages() function
merge-tree: implement real merges
merge-tree: add option parsing and initial shell for real merge function
merge-tree: move logic for existing merge into new function
merge-tree: rename merge_trees() to trivial_merge_trees()
When sorting the output of `git shortlog` by count, a list of authors in
alphabetical order is then sorted by contribution count. Obviously, the
idea is to maintain the alphabetical order for items with identical
contribution count.
At the moment, this job is performed by `qsort()`. As that function is
not guaranteed to implement a stable sort algorithm, this can lead to
inconsistent and/or surprising behavior: items with identical
contribution count could lose their alphabetical sub-order.
The `qsort()` in MS Visual C's runtime does _not_ implement a stable
sort algorithm, and under certain circumstances this even causes a test
failure in t4201.21 "shortlog can match multiple groups", where two
authors both are listed with 2 contributions, and are listed in inverse
alphabetical order.
Let's instead use the stable sort provided by `git_stable_qsort()` to
avoid this inconsistency.
This is a companion to 2049b8dc65 (diffcore_rename(): use a stable sort,
2019-09-30).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At the end of `git checkout <pathspec>`, we get a message informing how
many entries were updated in the working tree. However, this number can
be inaccurate for two reasons:
1) Delayed entries currently get counted twice.
2) Failed entries are included in the count.
The first problem happens because the counter is first incremented
before inserting the entry in the delayed checkout queue, and once again
when finish_delayed_checkout() calls checkout_entry(). And the second
happens because the counter is incremented too early in
checkout_entry(), before the entry was in fact checked out. Fix that by
moving the count increment further down in the call stack and removing
the duplicate increment on delayed entries. Note that we have to keep
a per-entry reference for the counter (both on parallel checkout and
delayed checkout) because not all entries are always accumulated at the
same counter. See checkout_worktree(), at builtin/checkout.c for an
example.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote show [-n] frotz" now pays attention to negative
pathspec.
* jk/remote-show-with-negative-refspecs:
remote: handle negative refspecs in git remote show
"git mktree --missing" lazily fetched objects that are missing from
the local object store, which was totally unnecessary for the purpose
of creating the tree object(s) from its input.
* ro/mktree-allow-missing-fix:
mktree: do not check type of remote objects
* ds/branch-checked-out:
branch: drop unused worktrees variable
fetch: stop passing around unused worktrees variable
branch: fix branch_checked_out() leaks
branch: use branch_checked_out() when deleting refs
fetch: use new branch_checked_out() and add tests
branch: check for bisects and rebases
branch: add branch_checked_out() helper
Commit 0139c58ab9 (revisions API users: add "goto cleanup" for
release_revisions(), 2022-04-13) converted an early return in
cmd_diff_files() into a goto. But it put the cleanup label too early: if
read_cache_preload() returns an error, we'll set result to "-1", but
then jump to calling run_diff_files(), overwriting our result.
We should jump past the call to run_diff_files(). Likewise, we should go
past diff_result_code(), which is expecting to see a code from an actual
diff, not a negative error code.
In practice, I suspect this bug cannot actually be triggered, because
read_cache_preload() does not seem to ever return an error. Its return
value (eventually) comes from do_read_index(), which gives the number of
cache entries found, and calls die() on error. Still, it makes sense to
fix the inadvertent change from 0139c58ab9 first, and we can look into
the overall error handling of read_cache() separately (which is present
in many other callsites).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we found an invalid object recorded in the resolve-undo data,
we would have ended up dereferencing NULL while fsck. Reporting the
problem and going on to the next object is the right thing to do
here.
Noticed by SZEDER Gábor.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a helper to see if a branch is already being worked on
(hence should not be newly checked out in a working tree), which
performs much better than the existing find_shared_symref() to
replace many uses of the latter.
* ds/branch-checked-out:
branch: drop unused worktrees variable
fetch: stop passing around unused worktrees variable
branch: fix branch_checked_out() leaks
branch: use branch_checked_out() when deleting refs
fetch: use new branch_checked_out() and add tests
branch: check for bisects and rebases
branch: add branch_checked_out() helper
Prior to 4f37d45706 (clone: respect remote unborn HEAD, 2021-02-05),
creation of the local HEAD was always done in update_head(). That commit
added code to handle an unborn head in an empty repository, and just did
all symref creation and config setup there.
This makes the code flow a little bit confusing, especially as new
corner cases have been covered (like the previous commit to match our
default branch name to a non-HEAD remote branch).
Let's move the creation of the unborn symref into update_head(). This
matches the other HEAD-creation cases, and now the logic is consistently
separated: the main cmd_clone() function only examines the situation and
sets variables based on what it finds, and update_head() actually
performs the update.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Although parse_options() can handle unknown --options just fine, none
of 'git multi-pack-index's subcommands rely on it, but do it on their
own: they invoke parse_options() with the PARSE_OPT_KEEP_UNKNOWN flag,
then check whether there are any unparsed arguments left, and print
usage and quit if necessary.
Drop that PARSE_OPT_KEEP_UNKNOWN flag to let parse_options() handle
unknown options instead, which has the additional benefit that it
prints not only the usage but an "error: unknown option `foo'" message
as well.
Do leave the unparsed arguments check to catch any unexpected
non-option arguments, though, e.g. 'git multi-pack-index write foo'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The variables 'source', 'destination', and 'submodule_gitfile' are
all of type "const char **", and an element of such an array is of
"type const char *", but these memmove() calls were written as if
these variables are of type "char **".
Once these memmove() calls are fixed to use the correct type to
compute the number of bytes to be moved, e.g.
- memmove(source + i, source + i + 1, n * sizeof(char *));
+ memmove(source + i, source + i + 1, n * sizeof(const char *));
existing contrib/coccinelle/array.cocci rules can recognize them as
candidates for turning into MOVE_ARRAY().
While at it, use CALLOC_ARRAY() instead of xcalloc() to allocate the
modes[] array that is involved in the change.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Usually clone tries to use the same local HEAD as the remote (unless the
user has given --branch explicitly). Even if the remote HEAD is detached
or unborn, we can detect those situations with modern versions of Git.
If the remote is too old to support the "unborn" extension (or it has
been disabled via config), then we can't know the name of the remote's
unborn HEAD, and we fall back whatever the local default branch name is
configured to be.
But that leads to one weird corner case. It's rare because it needs a
number of factors:
- the remote has an unborn HEAD
- the remote is too old to support "unborn", or has disabled it
- the remote has another branch "foo"
- the local default branch name is "foo"
In that case you end up with a local clone on an unborn "foo" branch,
disconnected completely from the remote's "foo". This is rare in
practice, but the result is quite confusing.
When choosing "foo", we can double check whether the remote has such a
name, and if so, start our local "foo" at the same spot, rather than
making it unborn.
Note that this causes a test failure in t5605, which is cloning from a
bundle that doesn't contain HEAD (so it behaves like a remote that
doesn't support "unborn"), but has a single "main" branch. That test
expects that we end up in the weird "unborn main" case, where we don't
actually check out the remote branch of the same name. Even though we
have to update the test, this seems like an argument in favor of this
patch: checking out main is what I'd expect from such a bundle.
So this patch updates the test for the new behavior and adds an adjacent
one that checks what the original was going for: if there's no HEAD and
the bundle _doesn't_ have a branch that matches our local default name,
then we end up with nothing checked out.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless "--branch" was given, clone generally tries to match the local
HEAD to the remote one. For most repositories, this is easy: the remote
tells us which branch HEAD was pointing to, and we call our local
checkout() function on that branch.
When cloning an empty repository, it's a little more tricky: we have
special code that checks the transport's "unborn" extension, or falls back
to our local idea of what the default branch should be. In either case,
we point the new HEAD to that, and set up the branch.* config.
But that leaves one case unhandled: when the remote repository _isn't_
empty, but its HEAD is unborn. The checkout() function is smart enough
to realize we didn't fetch the remote HEAD and it bails with a warning.
But we'll have ignored any information the remote gave us via the unborn
extension. This leads to nonsense outcomes:
- If the remote has its HEAD pointing to an unborn "foo" and contains
another branch "bar", cloning will get branch "bar" but leave the
local HEAD pointing at "master" (or whatever our local default is),
which is useless. The project does not use "master" as a branch.
- Worse, if the other branch "bar" is instead called "master" (but
again, the remote HEAD is not pointing to it), then we end up with a
local unborn branch "master", which is not connected to the remote
"master" (it shares no history, and there's no branch.* config).
Instead, we should try to use the remote's HEAD, even if its unborn, to
be consistent with the other cases.
The reason this case was missed is that cmd_clone() handles empty and
non-empty repositories on two different sides of a conditional:
if (we have any refs) {
fetch refs;
check for --branch;
otherwise, try to point our head at remote head;
otherwise, our head is NULL;
} else {
check for --branch;
otherwise, try to use "unborn" extension;
otherwise, fall back to our default name name;
}
So the smallest change would be to repeat the "unborn" logic at the end
of the first block. But we can note some other overlaps and
inconsistencies:
- both sides have to handle --branch (though note that it's always an
error for the empty repo case, since an empty repo by definition
does not have a matching branch)
- the fall back to the default name is much more explicit in the
empty-repo case. The non-empty case eventually ends up bailing
from checkout() with a warning, which produces a similar result, but
fails to set up the branch config we do in the empty case.
So let's pull the HEAD setup out of this conditional entirely. This
de-duplicates some of the code and the result is easy to follow, because
helper functions like find_ref_by_name() do the right thing even in the
empty-repo case (i.e., by returning NULL).
There are two subtleties:
- for a remote with a detached HEAD, it will advertise an oid for HEAD
(which we store in our "remote_head" variable), but we won't find a
matching refname (so our "remote_head_points_at" is NULL). In this
case we make a local detached HEAD to match. Right now this happens
implicitly by reaching update_head() with a non-NULL remote_head
(since we skip all of the unborn-fallback). We'll now need to
account for it explicitly before doing the fallback.
- for an empty repo, we issue a warning to the user that they've
cloned an empty repo. The text of that warning doesn't make sense
for a non-empty repo with an unborn HEAD, so we'll have to
differentiate the two cases there. We could just use different text,
but instead let's allow the code to continue down to checkout(),
which will issue an appropriate warning, like:
remote HEAD refers to nonexistent ref, unable to checkout
Continuing down to checkout() will make it easier to do more fixes
on top (see below).
Note that this patch fixes the case where the other side reports an
unborn head to us using the protocol extension. It _doesn't_ fix the
case where the other side doesn't tell us, we locally guess "master",
and the other side happens to have a "master" which its HEAD doesn't
point. But it doesn't make anything worse there, and it should actually
make it easier to fix that problem on top.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't need to put a "\n" in calls to warning(), since it adds one
itself (and the user sees an extra blank line). Drop it, and while we're
here, drop the full-stop from the message, which goes against our
guidelines.
This bug dates all the way back to 8434c2f1af (Build in clone,
2008-04-27), but presumably nobody noticed because it's hard to trigger:
you have to clone a repository whose HEAD is unborn, but which is not
otherwise empty.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Generalize the newly added "unused.cocci" rule to find more than just
"struct strbuf", let's have it find the same unused patterns for
"struct string_list", as well as other code that uses
similar-looking *_{release,clear,free}() and {release,clear,free}_*()
functions.
We're intentionally loose in accepting e.g. a "strbuf_init(&sb)"
followed by a "string_list_clear(&sb, 0)". It's assumed that the
compiler will catch any such invalid code, i.e. that our
constructors/destructors don't take a "void *".
See [1] for example of code that would be covered by the
"get_worktrees()" part of this rule. We'd still need work that the
series is based on (we were passing "worktrees" to a function), but
could now do the change in [1] automatically.
1. https://lore.kernel.org/git/Yq6eJFUPPTv%2Fzc0o@coredump.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a coccinelle rule to remove "struct strbuf" initialization
followed by calling "strbuf_release()" function, without any uses of
the strbuf in the same function.
See the tests in contrib/coccinelle/tests/unused.{c,res} for what it's
intended to find and replace.
The inclusion of "contrib/scalar/scalar.c" is because "spatch" was
manually run on it (we don't usually run spatch on contrib).
Per the "buggy code" comment we also match a strbuf_init() before the
xmalloc(), but we're not seeking to be so strict as to make checks
that the compiler will catch for us redundant. Saying we'll match
either "init" or "xmalloc" lines makes the rule simpler.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, moving a <source> directory which is not on-disk due
to its existence outside of sparse-checkout cone, "giv mv" command
errors out with "bad source".
Add a helper check_dir_in_index() function to see if a directory
name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
such directories.
Change the checking logic, so that such <source> directory makes
"giv mv" command warns with "advise_on_updating_sparse_paths()"
instead of "bad source"; also user now can supply a "--sparse" flag so
this operation can be carried out successfully.
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As suggested by Derrick [1], move the in-line definition of
"enum update_mode" to the top of the file and make it use "flags"
mode (each state is a different bit in the word).
Change the flag assignments from '=' (single assignment) to '|='
(additive). Also change flag evaluation from '==' to '&', etc.
[1] https://lore.kernel.org/git/22aadea2-9330-aa9e-7b6a-834585189144@github.com/
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, moving a sparse file into cone can result in unwarned
overwrite of existing entry. The expected behavior is that if the
<destination> exists in the entry, user should be prompted to supply
a [-f|--force] to carry out the operation, or the operation should
fail.
Add a check mechanism to do that.
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, moving a <source> file which is not on-disk but exists in
index as a SKIP_WORKTREE enabled cache entry, "giv mv" command errors
out with "bad source".
Change the checking logic, so that such <source>
file makes "giv mv" command warns with "advise_on_updating_sparse_paths()"
instead of "bad source"; also user now can supply a "--sparse" flag so
this operation can be carried out successfully.
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previous if/else-if chain are highly nested and hard to develop/extend.
Refactor to decouple this if/else-if chain by using goto to jump ahead.
Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally, "git mv" a sparse file from out-of-cone to
in-cone does not update the moved file's sparsity (remove its
SKIP_WORKTREE bit). And the corresponding cache entry is, unexpectedly,
not checked out in the working tree.
Update the behavior so that:
1. Moving from out-of-cone to in-cone removes the SKIP_WORKTREE bit from
corresponding cache entry.
2. The moved cache entry is checked out in the working tree to reflect
the updated sparsity.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak introduced in 44c175c7a4 (pull: error on no merge
candidates, 2015-06-18). As a result we can mark several tests as
passing with SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true".
Removing the "int ret = 0" assignment added here in a6d7eb2c7a (pull:
optionally rebase submodules (remote submodule changes only),
2017-06-23) is not a logic error, it could always have been left
uninitialized (as "int ret"), now that we'll use the "ret" from the
upper scope we can drop the assignment in the "opt_rebase" branch.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak where "cat-file" will leak the "path" member. See
e5fba602e5 (textconv: support for cat_file, 2010-06-15) for the code
that introduced the offending get_oid_with_context() call (called
get_sha1_with_context() at the time).
As a result we can mark several tests as passing with SANITIZE=leak
using "TEST_PASSES_SANITIZE_LEAK=true".
As noted in dc944b65f1 (get_sha1_with_context: dynamically allocate
oc->path, 2017-05-19) callers must free the "path" member. That same
commit added the relevant free() to this function, but we weren't
catching cases where we'd return early.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in code added in 41abfe15d9 (maintenance: add
pack-refs task, 2021-02-09), we need to call strvec_clear() on the
"struct strvec" that we initialized.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 1c41d2805e (unpack_trees_options: free messages when done,
2018-05-21) we started calling clear_unpack_trees_porcelain() on this
codepath, but missed this error path.
We could call clear_unpack_trees_porcelain() just before we error()
and return when unmerged_cache() fails, but the more correct fix is to
not have the unmerged_cache() check happen in the middle of our
"topts" setup.
Before 23cbf11b5c (merge-recursive: porcelain messages for checkout,
2010-08-11) we would not malloc() to setup our "topts", which is when
this started to leak on the error path.
Before that this code wasn't conflating the setup of "topts" and the
unmerged_cache() call in any meaningful way. The initial version in
782c2d65c2 (Build in checkout, 2008-02-07) just does a "memset" of
it, and initializes a single struct member.
Then in 8ccba008ee (unpack-trees: allow Porcelain to give different
error messages, 2008-05-17) we added the initialization of the error
message, which as noted above finally started leaking in 23cbf11b5c.
Let's fix the memory leak, and avoid future issues by initializing the
"topts" with a helper function. There are no functional changes here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in "merge-file", we need to loop over the "mmfs"
array and free() what we've got so far when we error out. As a result
we can mark a test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the code in builtin/merge-file.c to:
* Use the initializer to zero out "mmfs", and use modern C syntax for
the rest.
* Refactor the the inner loop to use a variable and "if/else if"
pattern followed by "return". This will make a change to change it to
a "goto cleanup" pattern smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak introduced in 440c705ea6 (cat-file: add
--batch-command mode, 2022-02-18). The free_cmds() function was only
called on "queued_nr" if we had a "flush" command. As the "without
flush for blob info" test added in the same commit shows we can't rely
on that, so let's call free_cmds() again at the end.
Since "nr" follows the usual pattern of being set to 0 if we've
free()'d the memory already it's OK to call it twice, even in cases
where we are doing a "flush".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call the release_revisions() function added in
1878b5edc0 (revision.[ch]: provide and start using a
release_revisions(), 2022-04-13) in cmd_revert(), as well as freeing
the xmalloc()'d "revs" member itself.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak added in 0ec4b1650c (clone: fix ref selection in
--single-branch --branch=xxx, 2012-06-22).
Whether we get our "remote_head" from copy_ref() directly, or with a
call to guess_remote_head() it'll be the result of a copy_ref() in
either case, as guess_remote_head() is a wrapper for copy_ref() (or it
returns NULL).
We can't mark any tests passing passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true" as a result of this change, but
e.g. "t/t1500-rev-parse.sh" now gets closer to passing.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in "git check-ref-format" that's been present in the
code in one form or another since 38eedc634b (git check-ref-format
--print, 2009-10-12), the code got substantially refactored in
cfbe22f03f (check-ref-format: handle subcommands in separate
functions, 2010-08-05).
As a result we can mark a test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All invocations of do_get_submodule_displaypath() pass
get_super_prefix() as the super_prefix arg, which is exactly the same
as get_submodule_displaypath().
Replace all calls to do_get_submodule_displaypath() with
get_submodule_displaypath(), and since it has no more callers, remove
it.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unlike the other subcommands, "git submodule--helper update" uses the
"--recursive-prefix" flag instead of "--super-prefix". The two flags are
otherwise identical (they only serve to compute the 'display path' of a
submodule), except that there is a dedicated helper function to get the
value of "--super-prefix".
This inconsistency exists because "git submodule update" used to pass
"--recursive-prefix" between shell and C (introduced in [1]) before
"--super-prefix" was introduced (in [2]), and for simplicity, we kept
this name when "git submodule--helper update" was created.
Remove "--recursive-prefix" and its associated code from "git
submodule--helper update", replacing it with "--super-prefix".
To use "--super-prefix", module_update is marked with
SUPPORT_SUPER_PREFIX. Note that module_clone must also be marked with
SUPPORT_SUPER_PREFIX, otherwise the "git submodule--helper clone"
subprocess will fail check because "--super-prefix" is propagated via
the environment.
[1] 48308681b0 (git submodule update: have a dedicated helper for
cloning, 2016-02-29)
[2] 74866d7579 (git: make super-prefix option, 2016-10-07)
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the SUPPORT_SUPER_PREFIX flag from "add", "init" and
"summary". For the "add" command it hasn't been used since [1],
likewise for "init" and "summary" since [2] and [3], respectively.
As implemented in 74866d7579 (git: make super-prefix option,
2016-10-07) the SUPPORT_SUPER_PREFIX flag in git.c applies for the
entire command, but as implemented in 89c8626557 (submodule helper:
support super prefix, 2016-12-08) we assert here in
cmd_submodule__helper() that we're not getting the flag unexpectedly.
1. 8c8195e9c3 (submodule--helper: introduce add-clone subcommand,
2021-07-10)
2. 6e7c14e65c (submodule update --init: display correct path from
submodule, 2017-01-06)
3. 1cf823d8f0 (submodule: remove unnecessary `prefix` based option
logic, 2021-06-22)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace a chunk of code in update_submodule() with an equivalent
do_get_submodule_displaypath() invocation. This is already tested by
t/t7406-submodule-update.sh:'submodule update --init --recursive from
subdirectory', so no tests are added.
The two are equivalent because:
- Exactly one of recursive_prefix|prefix is non-NULL at a time; prefix
is set at the superproject level, and recursive_prefix is set when
recursing into submodules. There is also a BUG() statement in
get_submodule_displaypath() that asserts that both cannot be non-NULL.
- In get_submodule_displaypath(), get_super_prefix() always returns NULL
because "--super-prefix" is never passed. Thus calling it is
equivalent to calling do_get_submodule_displaypath() with super_prefix
= NULL.
Therefore:
- When recursive_prefix is non-NULL, prefix is NULL, and thus
get_submodule_displaypath() just returns prefixed_path. This is
identical to calling do_get_submodule_displaypath() with super_prefix
= recursive_prefix because the return value is still the concatenation
of recursive_prefix + update_data->sm_path.
- When prefix is non-NULL, prefixed_path = update_data->sm_path. Thus
calling get_submodule_displaypath() with prefixed_path is equivalent
to calling do_get_submodule_displaypath() with update_data->sm_path
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
update_submodule() uses duplicated code to compute
update_data->displaypath and next.recursive_prefix. The latter is just
the former with "/" appended to it, and since update_data->displaypath
not changed outside of this statement, we can just reuse the already
computed result.
We can go one step further and remove the reference to
next.recursive_prefix altogether. Since it is only used in
update_data_to_args() (to compute the "--recursive-prefix" flag for the
recursive update child process) we can just use the already computed
.displaypath value of there.
Delete the duplicated code, and remove the unnecessary reference to
next.recursive_prefix. As a bonus, this fixes a memory leak where
prefixed_path was never freed (this leak was first reported in [1]).
[1] https://lore.kernel.org/git/877a45867ae368bf9e053caedcb6cf421e02344d.1655336146.git.gitgitgadget@gmail.com
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two locations in prepare_to_clone_next_submodule() that
manually calculate the submodule display path, but should just use
do_get_submodule_displaypath() for consistency.
Do this replacement and reorder the code slightly to avoid computing
the display path twice.
Until the preceding commit this code had never been tested, with our
newly added tests we can see that both these sites have been computing
the display path incorrectly ever since they were introduced in
48308681b0 (git submodule update: have a dedicated helper for cloning,
2016-02-29) [1]:
- The first hunk puts a "/" between recursive_prefix and ce->name, but
recursive_prefix already ends with "/".
- The second hunk calls relative_path() on recursive_prefix and
ce->name, but relative_path() only makes sense when both paths share
the same base directory. This is never the case here:
- recursive_prefix is the path from the topmost superproject to the
current submodule
- ce->name is the path from the root of the current submodule to its
submodule.
so, e.g. recursive_prefix="super" and ce->name="submodule" produces
displayname="../super" instead of "super/submodule".
[1] I verified this by applying the tests to 48308681b0.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow-up on the preceding commit which taught "git submodule--helper
update" to understand "--merge", "--checkout" and "--rebase" and use
those options instead of "--update=(rebase|merge|checkout|none)" when
the command invokes itself.
Unlike the preceding change this isn't strictly necessary to
eventually change "git-submodule.sh" so that it invokes "git
submodule--helper update" directly, but let's remove this
inconsistency in the command-line interface. We shouldn't need to
carry special synonyms for existing options in "git submodule--helper"
when that command can use the primary documented names instead.
But, as seen in the post-image this makes the control flow within
"builtin/submodule--helper.c" simpler, we can now write directly to
the "update_default" member of "struct update_data" when parsing the
options in "module_update()".
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Understand --checkout, --merge and --rebase synonyms for
--update={checkout,merge,rebase}, as well as the short options that
'git submodule' itself understands.
This removes a difference between the CLI API of "git submodule" and
"git submodule--helper", making it easier to make the latter an alias
for the former. See 48308681b0 (git submodule update: have a
dedicated helper for cloning, 2016-02-29) for the initial addition of
--update.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the user-facing "git submodule--helper" commands so that
they'll report their name as being "git submodule". To a user these
commands are internal implementation details, and it doesn't make
sense to emit usage about an internal helper when "git submodule" is
invoked with invalid options.
Before this we'd emit e.g.:
$ git submodule absorbgitdirs --blah
error: unknown option `blah'
usage: git submodule--helper absorbgitdirs [<options>] [<path>...]
[...]
And:
$ git submodule set-url -- --
usage: git submodule--helper set-url [--quiet] <path> <newurl>
[...]
Now we'll start with "usage: git submodule [...]" in both of those
cases. This change does not alter the "list", "name", "clone",
"config" and "create-branch" commands, those are internal-only (as an
aside; their usage info should probably invoke BUG(...)). This only
changes the user-facing commands.
The "status", "deinit" and "update" commands are not included in this
change, because their usage information already used "submodule"
rather than "submodule--helper".
I don't think it's currently possible to emit some of this usage
information in practice, as git-submodule.sh will catch unknown
options, and e.g. it doesn't seem to be possible to get "add" to emit
its usage information from "submodule--helper".
Though that change may be superfluous now, it's also harmless, and
will allow us to eventually dispatch further into "git
submodule--helper" from git-submodule.sh, while emitting the correct
usage output.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the "absorb-git-dirs" subcommand to "absorbgitdirs", which is
what the "git submodule" command itself has called it since the
subcommand was implemented in f6f8586140 (submodule: add
absorb-git-dir function, 2016-12-12).
Having these two be different will make it more tedious to dispatch to
eventually dispatch "git submodule--helper" directly, as we'd need to
retain this name mapping. So let's get rid of this needless
inconsistency.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust code added in 0060fd1511 (clone --recurse-submodules: prevent
name squatting on Windows, 2019-09-12) to have the internal
--require-init option imply --init, rather than having
"git-submodule.sh" add it implicitly.
This change doesn't make any difference now, but eliminates another
special-case where "git submodule--helper update"'s behavior was
different from "git submodule update". This will make it easier to
eventually replace the cmd_update() function in git-submodule.sh.
We'll still need to keep the distinction between "--init" and
"--require-init" in git-submodule.sh. Once cmd_update() gets
re-implemented in C we'll be able to change variables and other code
related to that, but not yet.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Folks may want to merge histories that have no common ancestry; provide
a flag with the same name as used by `git merge` to allow this.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much as `git ls-files` has a -z option, let's add one to merge-tree so
that the conflict-info section can be NUL terminated (and avoid quoting
of unusual filenames).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the new `detailed` parameter, a new mode can be triggered when
displaying the merge messages: The `detailed` mode prints NUL-delimited
fields of the following form:
<path-count> NUL <path>... NUL <conflict-type> NUL <message>
The `<path-count>` field determines how many `<path>` fields there are.
The intention of this mode is to support server-side operations, where
worktree-less merges can lead to conflicts and depending on the type
and/or path count, the caller might know how to handle said conflict.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Much like `git merge` updates the index with information of the form
(mode, oid, stage, name)
provide this output for conflicted files for merge-tree as well.
Provide a --name-only option for users to exclude the mode, oid, and
stage and only get the list of conflicted filenames.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers of `git merge-tree --write-tree` will often want to know which
files had conflicts. While they could potentially attempt to parse the
CONFLICT notices printed, those messages are not meant to be machine
readable. Provide a simpler mechanism of just printing the files (in
the same format as `git ls-files` with quoting, but restricted to
unmerged files) in the output before the free-form messages.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running `git merge-tree --write-tree`, we previously would only
return an exit status reflecting the cleanness of a merge, and print out
the toplevel tree of the resulting merge. Merges also have
informational messages, such as:
* "Auto-merging <PATH>"
* "CONFLICT (content): ..."
* "CONFLICT (file/directory)"
* etc.
In fact, when non-content conflicts occur (such as file/directory,
modify/delete, add/add with differing modes, rename/rename (1to2),
etc.), these informational messages may be the only notification the
user gets since these conflicts are not representable in the contents
of the file.
Add a --[no-]messages option so that callers can request these messages
be included at the end of the output. Include such messages by default
when there are conflicts, and omit them by default when the merge is
clean.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds the ability to perform real merges rather than just trivial
merges (meaning handling three way content merges, recursive ancestor
consolidation, renames, proper directory/file conflict handling, and so
forth). However, unlike `git merge`, the working tree and index are
left alone and no branch is updated.
The only output is:
- the toplevel resulting tree printed on stdout
- exit status of 0 (clean), 1 (conflicts present), anything else
(merge could not be performed; unknown if clean or conflicted)
This output is meant to be used by some higher level script, perhaps in
a sequence of steps like this:
NEWTREE=$(git merge-tree --write-tree $BRANCH1 $BRANCH2)
test $? -eq 0 || die "There were conflicts..."
NEWCOMMIT=$(git commit-tree $NEWTREE -p $BRANCH1 -p $BRANCH2)
git update-ref $BRANCH1 $NEWCOMMIT
Note that higher level scripts may also want to access the
conflict/warning messages normally output during a merge, or have quick
access to a list of files with conflicts. That is not available in this
preliminary implementation, but subsequent commits will add that
ability (meaning that NEWTREE would be a lot more than a tree in the
case of conflicts).
This also marks the traditional trivial merge of merge-tree as
deprecated. The trivial merge not only had limited applicability, the
output format was also difficult to work with (and its format
undocumented), and will generally be less performant than real merges.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let merge-tree accept a `--write-tree` parameter for choosing real
merges instead of trivial merges, and accept an optional
`--trivial-merge` option to get the traditional behavior. Note that
these accept different numbers of arguments, though, so these names
need not actually be used.
Note that real merges differ from trivial merges in that they handle:
- three way content merges
- recursive ancestor consolidation
- renames
- proper directory/file conflict handling
- etc.
Basically all the stuff you'd expect from `git merge`, just without
updating the index and working tree. The initial shell added here does
nothing more than die with "real merges are not yet implemented", but
that will be fixed in subsequent commits.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for adding a non-trivial merge capability to merge-tree,
move the existing merge logic for trivial merges into a new function.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-recursive.h defined its own merge_trees() function, different than
the one found in builtin/merge-tree.c. That was okay in the past, but
we want merge-tree to be able to use the merge-ort functions, which will
end up including merge-recursive.h. Rename the function found in
builtin/merge-tree.c to avoid the conflict.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This patch adds a command line option analogous to that of GNU
grep(1)'s -m / --max-count, which users might already be used to.
This makes it possible to limit the amount of matches shown in the
output while keeping the functionality of other options such as -C
(show code context) or -p (show containing function), which would be
difficult to do with a shell pipeline (e.g. head(1)).
Signed-off-by: Carlos López 00xc@protonmail.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With 31c8221a (mktree: validate entry type in input, 2009-05-14), we
called the sha1_object_info() API to obtain the type information, but
allowed the call to silently fail when the object was missing locally,
so that we can sanity-check the types opportunistically when the
object did exist.
The implementation is understandable because back then there was no
lazy/on-demand downloading of individual objects from the promisor
remotes that causes a long delay and materializes the object, hence
defeating the point of using "--missing". The design is hurting us
now.
We could bypass the opportunistic type/mode consistency check
altogether when "--missing" is given, but instead, use the
oid_object_info_extended() API and tell it that we are only interested
in objects that locally exist and are immediately available by passing
OBJECT_INFO_SKIP_FETCH_OBJECT bit to it. That way, we will still
retain the cheap and opportunistic sanity check for local objects.
Signed-off-by: Richard Oliver <roliver@roku.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After b489b9d9aa (branch: use branch_checked_out() when deleting refs,
2022-06-14), we no longer look at our local "worktrees" variable, since
branch_checked_out() handles it under the hood. The compiler didn't
notice the unused variable because we call functions to initialize and
free it (so it's not totally unused, it just doesn't do anything
useful).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 12d47e3b1f (fetch: use new branch_checked_out() and add tests,
2022-06-14), fetch's update_local_ref() function stopped using its
"worktrees" parameter. It doesn't need it, since the
branch_checked_out() function examines the global worktrees under the
hood.
So we can not only drop the unused parameter from that function, but
also from its entire call chain. And as we do so all the way up to
do_fetch(), we can see that nobody uses it at all, and we can drop the
local variable there entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is an option rather than command. Make the message convey this
similar to the other messages in the file.
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some config variables are combinations of multiple words, and we
typically write them in camelCase forms in manpage and translatable
strings. It's not easy to find mismatches for these camelCase config
variables during code reviews, but occasionally they are identified
during localization translations.
To check for mismatched config variables, I introduced a new feature
in the helper program for localization[^1]. The following mismatched
config variables have been identified by running the helper program,
such as "git-po-helper check-pot".
Lowercase in manpage should use camelCase:
* Documentation/config/http.txt: http.pinnedpubkey
Lowercase in translable strings should use camelCase:
* builtin/fast-import.c: pack.indexversion
* builtin/gc.c: gc.logexpiry
* builtin/index-pack.c: pack.indexversion
* builtin/pack-objects.c: pack.indexversion
* builtin/repack.c: pack.writebitmaps
* commit.c: i18n.commitencoding
* gpg-interface.c: user.signingkey
* http.c: http.postbuffer
* submodule-config.c: submodule.fetchjobs
Mismatched camelCases, choose the former:
* Documentation/config/transfer.txt: transfer.credentialsInUrl
remote.c: transfer.credentialsInURL
[^1]: https://github.com/git-l10n/git-po-helper
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
By default, the git remote show command will query data from remotes to
show data about what might be done on a future git fetch. This process
currently does not handle negative refspecs. This can be confusing,
because the show command will list refs as if they would be fetched. For
example if the fetch refspec "^refs/heads/pr/*", it still displays the
following:
* remote jdk19
Fetch URL: git@github.com:openjdk/jdk19.git
Push URL: git@github.com:openjdk/jdk19.git
HEAD branch: master
Remote branches:
master tracked
pr/1 new (next fetch will store in remotes/jdk19)
pr/2 new (next fetch will store in remotes/jdk19)
pr/3 new (next fetch will store in remotes/jdk19)
Local ref configured for 'git push':
master pushes to master (fast-forwardable)
Fix this by adding an additional check inside of get_ref_states. If a
ref matches one of the negative refspecs, mark it as skipped instead of
marking it as new or tracked.
With this change, we now report remote branches that are skipped due to
negative refspecs properly:
* remote jdk19
Fetch URL: git@github.com:openjdk/jdk19.git
Push URL: git@github.com:openjdk/jdk19.git
HEAD branch: master
Remote branches:
master tracked
pr/1 skipped
pr/2 skipped
pr/3 skipped
Local ref configured for 'git push':
master pushes to master (fast-forwardable)
By showing the refs as skipped, it helps clarify that these references
won't actually be fetched.
This does not properly handle refs going stale due to a newly added
negative refspec. In addition, git remote prune doesn't handle that
negative refspec case either. Fixing that requires digging into
get_stale_heads and handling the case of a ref which exists on the
remote but is omitted due to a negative refspec locally.
Add a new test case which covers the functionality above, as well as a
new expected failure indicating the poor overlap with stale refs.
Reported-by: Pavel Rappo <pavel.rappo@gmail.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c51f8f94e5 (submodule--helper: run update procedures from C,
2021-08-24), we added code that first obtains the default remote, and
then adds that to a `strvec`.
However, we never released the default remote's memory.
Reported by Coverity.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various error messages that talk about the removal of
"--preserve-merges" in "rebase" have been strengthened, and "rebase
--abort" learned to get out of a state that was left by an earlier
use of the option.
* po/rebase-preserve-merges:
rebase: translate a die(preserve-merges) message
rebase: note `preserve` merges may be a pull config option
rebase: help users when dying with `preserve-merges`
rebase.c: state preserve-merges has been removed
"git revert" learns "--reference" option to use more human-readable
reference to the commit it reverts in the message template it
prepares for the user.
* jc/revert-show-parent-info:
revert: --reference should apply only to 'revert', not 'cherry-pick'
revert: optionally refer to commit in the "reference" format
This was found during l10n process by Jiang Xin.
Reported-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Fangyi Zhou <me@fangyi.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the last current use of find_shared_symref() that can easily be
replaced by branch_checked_out(). The benefit of this switch is that the
code is a bit simpler, but also it is faster on repeated calls.
The remaining uses of find_shared_symref() are non-trivial to remove, so
we probably should not continue in that direction:
* builtin/notes.c uses find_shared_symref() with "NOTES_MERGE_REF"
instead of "HEAD", so it doesn't have an immediate analogue with
branch_checked_out(). Perhaps we should consider extending it to
include that symref in addition to HEAD, BISECT_HEAD, and
REBASE_HEAD.
* receive-pack.c checks to see if a worktree has a checkout for the ref
that is being updated. The tricky part is that it can actually decide
to update the worktree directly instead of just skipping the update.
This all depends on the receive.denyCurrentBranch config option. The
implementation currenty cares about receiving the worktree in the
result, so the current branch_checked_out() prototype is insufficient
currently. This is something to investigate later, though, since a
large number of refs could be updated at the same time and using the
strmap implementation of branch_checked_out() could be beneficial.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching refs from a remote, it is possible that the refspec will
cause use to overwrite a ref that is checked out in a worktree. The
existing logic in builtin/fetch.c uses a possibly-slow mechanism. Update
those sections to use the new, more efficient branch_checked_out()
helper.
These uses were not previously tested, so add a test case that can be
used for these kinds of collisions. There is only one test now, but more
tests will be added as other consumers of branch_checked_out() are
added.
Note that there are two uses in builtin/fetch.c, but only one of the
messages is tested. This is because the tested check is run before
completing the fetch, and the untested check is not reachable without
concurrent updates to the filesystem. Thus, it is beneficial to keep
that extra check for the sake of defense-in-depth. However, we should
not attempt to test the check, as the effort required is too
complicated to be worth the effort. This use in update_local_ref()
also requires a change in the error message because we no longer have
access to the worktree struct, only the path of the worktree. This error
is so rare that making a distinction between the two is not critical.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git show-ref --heads" (and "--tags") still iterated over all the
refs only to discard refs outside the specified area, which has
been corrected.
* tb/show-ref-optim:
builtin/show-ref.c: avoid over-iterating with --heads, --tags
Make use of the stream_loose_object() function introduced in the
preceding commit to unpack large objects. Before this we'd need to
malloc() the size of the blob before unpacking it, which could cause
OOM with very large blobs.
We could use the new streaming interface to unpack all blobs, but
doing so would be much slower, as demonstrated e.g. with this
benchmark using git-hyperfine[0]:
rm -rf /tmp/scalar.git &&
git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git &&
mv /tmp/scalar.git/objects/pack/*.pack /tmp/scalar.git/my.pack &&
git hyperfine \
-r 2 --warmup 1 \
-L rev origin/master,HEAD -L v "10,512,1k,1m" \
-s 'make' \
-p 'git init --bare dest.git' \
-c 'rm -rf dest.git' \
'./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/scalar.git/my.pack'
Here we'll perform worse with lower core.bigFileThreshold settings
with this change in terms of speed, but we're getting lower memory use
in return:
Summary
'./git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' ran
1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
1.01 ± 0.02 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
1.02 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master'
1.09 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
1.10 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
1.11 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD'
A better benchmark to demonstrate the benefits of that this one, which
creates an artificial repo with a 1, 25, 50, 75 and 100MB blob:
rm -rf /tmp/repo &&
git init /tmp/repo &&
(
cd /tmp/repo &&
for i in 1 25 50 75 100
do
dd if=/dev/urandom of=blob.$i count=$(($i*1024)) bs=1024
done &&
git add blob.* &&
git commit -mblobs &&
git gc &&
PACK=$(echo .git/objects/pack/pack-*.pack) &&
cp "$PACK" my.pack
) &&
git hyperfine \
--show-output \
-L rev origin/master,HEAD -L v "512,50m,100m" \
-s 'make' \
-p 'git init --bare dest.git' \
-c 'rm -rf dest.git' \
'/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum'
Using this test we'll always use >100MB of memory on
origin/master (around ~105MB), but max out at e.g. ~55MB if we set
core.bigFileThreshold=50m.
The relevant "Maximum resident set size" lines were manually added
below the relevant benchmark:
'/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master' ran
Maximum resident set size (kbytes): 107080
1.02 ± 0.78 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
Maximum resident set size (kbytes): 106968
1.09 ± 0.79 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'origin/master'
Maximum resident set size (kbytes): 107032
1.42 ± 1.07 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
Maximum resident set size (kbytes): 107072
1.83 ± 1.02 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
Maximum resident set size (kbytes): 55704
2.16 ± 1.19 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 | grep Maximum' in 'HEAD'
Maximum resident set size (kbytes): 4564
This shows that if you have enough memory this new streaming method is
slower the lower you set the streaming threshold, but the benefit is
more bounded memory use.
An earlier version of this patch introduced a new
"core.bigFileStreamingThreshold" instead of re-using the existing
"core.bigFileThreshold" variable[1]. As noted in a detailed overview
of its users in [2] using it has several different meanings.
Still, we consider it good enough to simply re-use it. While it's
possible that someone might want to e.g. consider objects "small" for
the purposes of diffing but "big" for the purposes of writing them
such use-cases are probably too obscure to worry about. We can always
split up "core.bigFileThreshold" in the future if there's a need for
that.
0. https://github.com/avar/git-hyperfine/
1. https://lore.kernel.org/git/20211210103435.83656-1-chiyutianyi@gmail.com/
2. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the name implies, "get_data(size)" will allocate and return a given
amount of memory. Allocating memory for a large blob object may cause the
system to run out of memory. Before preparing to replace calling of
"get_data()" to unpack large blob objects in latter commits, refactor
"get_data()" to reduce memory footprint for dry_run mode.
Because in dry_run mode, "get_data()" is only used to check the
integrity of data, and the returned buffer is not used at all, we can
allocate a smaller buffer and use it as zstream output. Make the function
return NULL in the dry-run mode, as no callers use the returned buffer.
The "find [...]objects/?? -type f | wc -l" test idiom being used here
is adapted from the same "find" use added to another test in
d9545c7f46 (fast-import: implement unpack limit, 2016-04-25).
Suggested-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Han Xin <chiyutianyi@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new bug() and BUG_if_bug() API is introduced to make it easier to
uniformly log "detect multiple bugs and abort in the end" pattern.
* ab/bug-if-bug:
cache-tree.c: use bug() and BUG_if_bug()
receive-pack: use bug() and BUG_if_bug()
parse-options.c: use optbug() instead of BUG() "opts" check
parse-options.c: use new bug() API for optbug()
usage.c: add a non-fatal bug() function to go with BUG()
common-main.c: move non-trace2 exit() behavior out of trace2.c
More fsmonitor--daemon.
* jh/builtin-fsmonitor-part3: (30 commits)
t7527: improve implicit shutdown testing in fsmonitor--daemon
fsmonitor--daemon: allow --super-prefix argument
t7527: test Unicode NFC/NFD handling on MacOS
t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd
t/helper/hexdump: add helper to print hexdump of stdin
fsmonitor: on macOS also emit NFC spelling for NFD pathname
t7527: test FSMonitor on case insensitive+preserving file system
fsmonitor: never set CE_FSMONITOR_VALID on submodules
t/perf/p7527: add perf test for builtin FSMonitor
t7527: FSMonitor tests for directory moves
fsmonitor: optimize processing of directory events
fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed
fsm-health-win32: force shutdown daemon if worktree root moves
fsm-health-win32: add polling framework to monitor daemon health
fsmonitor--daemon: stub in health thread
fsmonitor--daemon: rename listener thread related variables
fsmonitor--daemon: prepare for adding health thread
fsmonitor--daemon: cd out of worktree root
fsm-listen-darwin: ignore FSEvents caused by xattr changes on macOS
unpack-trees: initialize fsmonitor_has_run_once in o->result
...
Rename .env_array member to .env in the child_process structure.
* ab/env-array:
run-command API users: use "env" not "env_array" in comments & names
run-command API: rename "env_array" to "env"
The resolve-undo extension was added to the index in cfc5789a
(resolve-undo: record resolved conflicts in a new index extension
section, 2009-12-25). This extension records the blob object names
and their modes of conflicted paths when the path gets resolved
(e.g. with "git add"), to allow "undoing" the resolution with
"checkout -m path". These blob objects should be guarded from
garbage-collection while we have the resolve-undo information in the
index (otherwise unresolve operation may try to use a blob object
that has already been pruned away).
But the code called from mark_reachable_objects() for the index
forgets to do so. Teach add_index_objects_to_pending() helper to
also add objects referred to by the resolve-undo extension.
Also make matching changes to "fsck", which has code that is fairly
similar to the reachability stuff, but have parallel implementations
for all these stuff, which may (or may not) someday want to be unified.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clone --origin X" leaked piece of memory that held value read
from the clone.defaultRemoteName configuration variable, which has
been plugged.
source: <xmqqlevl4ysk.fsf@gitster.g>
* jc/clone-remote-name-leak-fix:
clone: plug a miniscule leak
The path taken by "git multi-pack-index" command from the end user
was compared with path internally prepared by the tool withut first
normalizing, which lead to duplicated paths not being noticed,
which has been corrected.
source: <pull.1221.v2.git.1650911234.gitgitgadget@gmail.com>
* ds/midx-normalize-pathname-before-comparison:
cache: use const char * for get_object_directory()
multi-pack-index: use --object-dir real path
midx: use real paths in lookup_multi_pack_index()
"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
source: <20220421044233.894255-1-alexhenrie24@gmail.com>
* ah/rebase-keep-base-fix:
rebase: use correct base for --keep-base when a branch is given
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
source: <xmqqh76mf7s4.fsf_-_@gitster.g>
* jc/show-branch-g-current:
show-branch: -g and --current are incompatible
Plug the memory leaks from the trickiest API of all, the revision
walker.
* ab/plug-leak-in-revisions: (27 commits)
revisions API: add a TODO for diff_free(&revs->diffopt)
revisions API: have release_revisions() release "topo_walk_info"
revisions API: have release_revisions() release "date_mode"
revisions API: call diff_free(&revs->pruning) in revisions_release()
revisions API: release "reflog_info" in release revisions()
revisions API: clear "boundary_commits" in release_revisions()
revisions API: have release_revisions() release "prune_data"
revisions API: have release_revisions() release "grep_filter"
revisions API: have release_revisions() release "filter"
revisions API: have release_revisions() release "cmdline"
revisions API: have release_revisions() release "mailmap"
revisions API: have release_revisions() release "commits"
revisions API users: use release_revisions() for "prune_data" users
revisions API users: use release_revisions() with UNLEAK()
revisions API users: use release_revisions() in builtin/log.c
revisions API users: use release_revisions() in http-push.c
revisions API users: add "goto cleanup" for release_revisions()
stash: always have the owner of "stash_info" free it
revisions API users: use release_revisions() needing REV_INFO_INIT
revision.[ch]: document and move code declared around "init"
...
This is a user facing message for a situation seen in the wild.
Translate it.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `--preserve-merges` option was removed by v2.34.0. However
users may not be aware that it is also a Pull configuration option,
which is still offered by major IDE vendors such as Visual Studio.
Extend the `--preserve-merges` die message to also direct users to
the possible use of the `preserve` option in the `pull.rebase` config.
This is an additional 'belt and braces' information statement.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git would die if a "rebase --preserve-merges" was in progress.
Users could neither --quit, --abort, nor --continue the rebase.
Make the `rebase --abort` option available to allow users to remove
traces of any preserve-merges rebase, even if they had upgraded
during a rebase.
One trigger case was an unexpectedly difficult to resolve conflict, as
reported on the `git-users` group.
(https://groups.google.com/g/git-for-windows/c/3jMWbBlXXHM)
Other potential use-cases include git-experts using the portable
'Git on a stick' to help users with an older git version.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since feebd2d256 (rebase: hide --preserve-merges option, 2019-10-18)
this option is now removed as stated in the subsequent release notes.
Fix and reflow the option tip.
Signed-off-by: Philip Oakley <philipoakley@iee.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `show-ref` is combined with the `--heads` or `--tags` options, it
can avoid iterating parts of a repository's references that it doesn't
care about.
But it doesn't take advantage of this potential optimization. When this
command was introduced back in 358ddb62cf (Add "git show-ref" builtin
command, 2006-09-15), `for_each_ref_in()` did exist. But since most
repositories don't have many (any?) references that aren't branches or
tags already, this makes little difference in practice.
Though for repositories with a large imbalance of branches and tags (or,
more likely in the case of server operators, many hidden references),
this can make quite a difference. Take, for example, a repository with
500,000 "hidden" references (all of the form "refs/__hidden__/N"), and
a single branch:
git commit --allow-empty -m "base" &&
seq 1 500000 | sed 's,\(.*\),create refs/__hidden__/\1 HEAD,' |
git update-ref --stdin &&
git pack-refs --all
Outputting the existence of that single branch currently takes on the
order of ~50ms on my machine. The vast majority of this time is wasted
iterating through references that we know we're going to discard.
Instead, teach `show-ref` that it can iterate just "refs/heads" and/or
"refs/tags" when given `--heads` and/or `--tags`, respectively. A few
small interesting things to note:
- When given either option, we can avoid the general-purpose
for_each_ref() call altogether, since we know that it won't give us
any references that we wouldn't filter out already.
- We can make two separate calls to `for_each_fullref_in()` (and
avoid, say, the more specialized `for_each_fullref_in_prefixes()`,
since we know that the set of references enumerated by each is
disjoint, so we'll never see the same reference appear in both
calls.
- We have to use the "fullref" variant (instead of just
`for_each_branch_ref()` and `for_each_tag_ref()`), since we expect
fully-qualified reference names to appear in `show-ref`'s output.
When either of `heads_only` or `tags_only` is set, we can eliminate the
strcmp() calls in `builtin/show-ref.c::show_ref()` altogether, since we
know that `show_ref()` will never see a non-branch or tag reference.
Unfortunately, we can't use `for_each_fullref_in_prefixes()` to enhance
`show-ref`'s pattern matching, since `show-ref` patterns match on the
_suffix_ (e.g., the pattern "foo" shows "refs/heads/foo",
"refs/tags/foo", and etc, not "foo/*").
Nonetheless, in our synthetic example above, this provides a significant
speed-up ("git" is roughly v2.36, "git.compile" is this patch):
$ hyperfine -N 'git show-ref --heads' 'git.compile show-ref --heads'
Benchmark 1: git show-ref --heads
Time (mean ± σ): 49.9 ms ± 6.2 ms [User: 45.6 ms, System: 4.1 ms]
Range (min … max): 46.1 ms … 73.6 ms 43 runs
Benchmark 2: git.compile show-ref --heads
Time (mean ± σ): 2.8 ms ± 0.4 ms [User: 1.4 ms, System: 1.2 ms]
Range (min … max): 1.3 ms … 5.6 ms 957 runs
Summary
'git.compile show-ref --heads' ran
18.03 ± 3.38 times faster than 'git show-ref --heads'
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A mechanism to pack unreachable objects into a "cruft pack",
instead of ejecting them into loose form to be reclaimed later, has
been introduced.
* tb/cruft-packs:
sha1-file.c: don't freshen cruft packs
builtin/gc.c: conditionally avoid pruning objects via loose
builtin/repack.c: add cruft packs to MIDX during geometric repack
builtin/repack.c: use named flags for existing_packs
builtin/repack.c: allow configuring cruft pack generation
builtin/repack.c: support generating a cruft pack
builtin/pack-objects.c: --cruft with expiration
reachable: report precise timestamps from objects in cruft packs
reachable: add options to add_unseen_recent_objects_to_traversal
builtin/pack-objects.c: --cruft without expiration
builtin/pack-objects.c: return from create_object_entry()
t/helper: add 'pack-mtimes' test-tool
pack-mtimes: support writing pack .mtimes files
chunk-format.h: extract oid_version()
pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles'
pack-mtimes: support reading .mtimes files
Documentation/technical: add cruft-packs.txt
A workflow change for translators are being proposed.
* jx/l10n-workflow-change:
l10n: Document the new l10n workflow
Makefile: add "po-init" rule to initialize po/XX.po
Makefile: add "po-update" rule to update po/XX.po
po/git.pot: don't check in result of "make pot"
po/git.pot: this is now a generated file
Makefile: remove duplicate and unwanted files in FOUND_SOURCE_FILES
i18n CI: stop allowing non-ASCII source messages in po/git.pot
Makefile: have "make pot" not "reset --hard"
Makefile: generate "po/git.pot" from stable LOCALIZED_C
Makefile: sort source files before feeding to xgettext
Teach "git repack --geometric" work better with "--keep-pack" and
avoid corrupting the repository when packsize limit is used.
* tb/geom-repack-with-keep-and-max:
builtin/repack.c: ensure that `names` is sorted
t7703: demonstrate object corruption with pack.packSizeLimit
repack: respect --keep-pack with geometric repack
"sparse-checkout" learns to work well with the sparse-index
feature.
* ds/sparse-sparse-checkout:
sparse-checkout: integrate with sparse index
p2000: add test for 'git sparse-checkout [add|set]'
sparse-index: complete partial expansion
sparse-index: partially expand directories
sparse-checkout: --no-sparse-index needs a full index
cache-tree: implement cache_tree_find_path()
sparse-index: introduce partially-sparse indexes
sparse-index: create expand_index()
t1092: stress test 'git sparse-checkout set'
t1092: refactor 'sparse-index contents' test
The multi-pack-index code did not protect the packfile it is going
to depend on from getting removed while in use, which has been
corrected.
* tb/midx-race-in-pack-objects:
builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects
builtin/pack-objects.c: ensure included `--stdin-packs` exist
builtin/pack-objects.c: avoid redundant NULL check
pack-bitmap.c: check preferred pack validity when opening MIDX bitmap
Preliminary code refactoring around transport and bundle code.
* ds/bundle-uri:
bundle.h: make "fd" version of read_bundle_header() public
remote: allow relative_url() to return an absolute url
remote: move relative_url()
http: make http_get_file() external
fetch-pack: move --keep=* option filling to a function
fetch-pack: add a deref_without_lazy_fetch_extended()
dir API: add a generalized path_match_flags() function
connect.c: refactor sending of agent & object-format
Introduce a filesystem-dependent mechanism to optimize the way the
bits for many loose object files are ensured to hit the disk
platter.
* ns/batch-fsync:
core.fsyncmethod: performance tests for batch mode
t/perf: add iteration setup mechanism to perf-lib
core.fsyncmethod: tests for batch mode
test-lib-functions: add parsing helpers for ls-files and ls-tree
core.fsync: use batch mode and sync loose objects by default on Windows
unpack-objects: use the bulk-checkin infrastructure
update-index: use the bulk-checkin infrastructure
builtin/add: add ODB transaction around add_files_to_cache
cache-tree: use ODB transaction around writing a tree
core.fsyncmethod: batched disk flushes for loose-objects
bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
bulk-checkin: rename 'state' variable and separate 'plugged' boolean
Deprecate non-cone mode of the sparse-checkout feature.
* en/sparse-cone-becomes-default:
Documentation: some sparsity wording clarifications
git-sparse-checkout.txt: mark non-cone mode as deprecated
git-sparse-checkout.txt: flesh out pattern set sections a bit
git-sparse-checkout.txt: add a new EXAMPLES section
git-sparse-checkout.txt: shuffle some sections and mark as internal
git-sparse-checkout.txt: update docs for deprecation of 'init'
git-sparse-checkout.txt: wording updates for the cone mode default
sparse-checkout: make --cone the default
tests: stop assuming --no-cone is the default mode for sparse-checkout
Follow-up on a preceding commit which changed all references to the
"env_array" when referring to the "struct child_process" member. These
changes are all unnecessary for the compiler, but help the code's
human readers.
All the comments that referred to "env_array" have now been updated,
as well as function names and variables that had "env_array" in their
name, they now refer to "env".
In addition the "out" name for the submodule.h prototype was
inconsistent with the function definition's use of "env_array" in
submodule.c. Both of them use "env" now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start following-up on the rename mentioned in c7c4bdeccf (run-command
API: remove "env" member, always use "env_array", 2021-11-25) of
"env_array" to "env".
The "env_array" name was picked in 19a583dc39 (run-command: add
env_array, an optional argv_array for env, 2014-10-19) because "env"
was taken. Let's not forever keep the oddity of "*_array" for this
"struct strvec", but not for its "args" sibling.
This commit is almost entirely made with a coccinelle rule[1]. The
only manual change here is in run-command.h to rename the struct
member itself and to change "env_array" to "env" in the
CHILD_PROCESS_INIT initializer.
The rest of this is all a result of applying [1]:
* make contrib/coccinelle/run_command.cocci.patch
* patch -p1 <contrib/coccinelle/run_command.cocci.patch
* git add -u
1. cat contrib/coccinelle/run_command.pending.cocci
@@
struct child_process E;
@@
- E.env_array
+ E.env
@@
struct child_process *E;
@@
- E->env_array
+ E->env
I've avoided changing any comments and derived variable names here,
that will all be done in the next commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend code added in a6a8431968 (receive-pack.c: shorten the
execute_commands loop over all commands, 2015-01-07) and amended to
hard die in b6a4788586 (receive-pack.c: die instead of error in case
of possible future bug, 2015-01-07) to use the new bug() function
instead.
Let's also rename the warn_if_*() function that code is in to
BUG_if_*(), its name became outdated in b6a4788586.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As 'revert' and 'cherry-pick' share a lot of code, it is easy to
modify the behaviour of one command and inadvertently affect the
other. An earlier change to teach the '--reference' option and the
'revert.reference' configuration variable to the former was not
careful enough and 'cherry-pick --reference' wasn't rejected as an
error.
It is possible to think 'cherry-pick -x' might benefit from the
'--reference' option, but it is fundamentally different from
'revert' in at least two ways to make it questionable:
- 'revert' names a commit that is ancestor of the resulting commit,
so an abbreviated object name with human readable title is
sufficient to identify the named commit uniquely without using
the full object name. On the other hand, 'cherry-pick'
usually [*] picks a commit that is not an ancestor. It might be
even picking a private commit that never becomes part of the
public history.
- The whole commit message of 'cherry-pick' is a copy of the
original commit, and there is nothing gained to repeat only the
title part on 'cherry-picked from' message.
[*] well, you could revert and then you can pick the original that
was reverted to get back to where you were, but then you can
revert the revert to do the same thing.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git add -i" was rewritten in C some time ago and has been in
testing; the reimplementation is now exposed to general public by
default.
* js/use-builtin-add-i:
add -i: default to the built-in implementation
t2016: require the PERL prereq only when necessary
A typical "git revert" commit uses the full title of the original
commit in its title, and starts its body of the message with:
This reverts commit 8fa7f667cf61386257c00d6e954855cc3215ae91.
This does not encourage the best practice of describing not just
"what" (i.e. "Revert X" on the title says what we did) but "why"
(i.e. and it does not say why X was undesirable).
We can instead phrase this first line of the body to be more like
This reverts commit 8fa7f667 (do this and that, 2022-04-25)
so that the title does not have to be
Revert "do this and that"
We can instead use the title to describe "why" we are reverting the
original commit.
Introduce the "--reference" option to "git revert", and also the
revert.reference configuration variable, which defaults to false, to
tweak the title and the first line of the draft commit message for
when creating a "revert" commit.
When this option is in use, the first line of the pre-filled editor
buffer becomes a comment line that tells the user to say _why_. If
the user exits the editor without touching this line by mistake,
what we prepare to become the first line of the body, i.e. "This
reverts commit 8fa7f667 (do this and that, 2022-04-25)", ends up to
be the title of the resulting commit. This behaviour is designed to
help such a user to identify such a revert in "git log --oneline"
easily so that it can be further reworded with "git rebase -i" later.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create another thread to watch over the daemon process and
automatically shut it down if necessary.
This commit creates the basic framework for a "health" thread
to monitor the daemon and/or the file system. Later commits
will add platform-specific code to do the actual work.
The "health" thread is intended to monitor conditions that
would be difficult to track inside the IPC thread pool and/or
the file system listener threads. For example, when there are
file system events outside of the watched worktree root or if
we want to have an idle-timeout auto-shutdown feature.
This commit creates the health thread itself, defines the thread-proc
and sets up the thread's event loop. It integrates this new thread
into the existing IPC and Listener thread models.
This commit defines the API to the platform-specific code where all of
the monitoring will actually happen.
The platform-specific code for MacOS is just stubs. Meaning that the
health thread will immediately exit on MacOS, but that is OK and
expected. Future work can define MacOS-specific monitoring.
The platform-specific code for Windows sets up enough of the
WaitForMultipleObjects() machinery to watch for system and/or custom
events. Currently, the set of wait handles only includes our custom
shutdown event (sent from our other theads). Later commits in this
series will extend the set of wait handles to monitor other
conditions.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename platform-specific listener thread related variables
and data types as we prepare to add another backend thread
type.
[] `struct fsmonitor_daemon_backend_data` becomes `struct fsm_listen_data`
[] `state->backend_data` becomes `state->listen_data`
[] `state->error_code` becomes `state->listen_error_code`
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor daemon thread startup to make it easier to start
a third thread class to monitor the health of the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach the fsmonitor--daemon to CD outside of the worktree
before starting up.
The common Git startup mechanism causes the CWD of the daemon process
to be in the root of the worktree. On Windows, this causes the daemon
process to hold a locked handle on the CWD and prevents other
processes from moving or deleting the worktree while the daemon is
running.
CD to HOME before entering main event loops.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bare repos do not have a worktree, so there is nothing for the
daemon watch.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose the new `git repack --cruft` mode from `git gc` via a new opt-in
flag. When invoked like `git gc --cruft`, `git gc` will avoid exploding
unreachable objects as loose ones, and instead create a cruft pack and
`.mtimes` file.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using cruft packs, the following race can occur when a geometric
repack that writes a MIDX bitmap takes place afterwords:
- First, create an unreachable object and do an all-into-one cruft
repack which stores that object in the repository's cruft pack.
- Then make that object reachable.
- Finally, do a geometric repack and write a MIDX bitmap.
Assuming that we are sufficiently unlucky as to select a commit from the
MIDX which reaches that object for bitmapping, then the `git
multi-pack-index` process will complain that that object is missing.
The reason is because we don't include cruft packs in the MIDX when
doing a geometric repack. Since the "make that object reachable" doesn't
necessarily mean that we'll create a new copy of that object in one of
the packs that will get rolled up as part of a geometric repack, it's
possible that the MIDX won't see any copies of that now-reachable
object.
Of course, it's desirable to avoid including cruft packs in the MIDX
because it causes the MIDX to store a bunch of objects which are likely
to get thrown away. But excluding that pack does open us up to the above
race.
This patch demonstrates the bug, and resolves it by including cruft
packs in the MIDX even when doing a geometric repack.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use the `util` pointer for items in the `existing_packs` string list
to indicate which packs are going to be deleted. Since that has so far
been the only use of that `util` pointer, we just set it to 0 or 1.
But we're going to add an additional state to this field in the next
patch, so prepare for that by adding a #define for the first bit so we
can more expressively inspect the flags state.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In servers which set the pack.window configuration to a large value, we
can wind up spending quite a lot of time finding new bases when breaking
delta chains between reachable and unreachable objects while generating
a cruft pack.
Introduce a handful of `repack.cruft*` configuration variables to
control the parameters used by pack-objects when generating a cruft
pack.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose a way to split the contents of a repository into a main and cruft
pack when doing an all-into-one repack with `git repack --cruft -d`, and
a complementary configuration variable.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a previous patch, pack-objects learned how to generate a cruft pack
so long as no objects are dropped.
This patch teaches pack-objects to handle the case where a non-never
`--cruft-expiration` value is passed. This case is slightly more
complicated than before, because we want pack-objects to save
unreachable objects which would have been pruned when there is another
recent (i.e., non-prunable) unreachable object which reaches the other.
We'll call these objects "unreachable but reachable-from-recent".
Here is how pack-objects handles `--cruft-expiration`:
- Instead of adding all objects outside of the kept pack(s) into the
packing list, only handle the ones whose mtime is within the grace
period.
- Construct a reachability traversal whose tips are the
unreachable-but-recent objects.
- Then, walk along that traversal, stopping if we reach an object in
the kept pack. At each step along the traversal, we add the object
we are visiting to the packing list.
In the majority of these cases, any object we visit in this traversal
will already be in our packing list. But we will sometimes encounter
reachable-from-recent cruft objects, which we want to retain even if
they aged out of the grace period.
The most subtle point of this process is that we actually don't need to
bother to update the rescued object's mtime. Even though we will write
an .mtimes file with a value that is older than the expiration window,
it will continue to survive cruft repacks so long as any objects which
reach it haven't aged out.
That is, a future repack will also exclude that object from the initial
packing list, only to discover it later on when doing the reachability
traversal.
Finally, stopping early once an object is found in a kept pack is safe
to do because the kept packs ordinarily represent which packs will
survive after repacking. Assuming that it _isn't_ safe to halt a
traversal early would mean that there is some ancestor object which is
missing, which implies repository corruption (i.e., the complete set of
reachable objects isn't present).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function behaves very similarly to what we will need in
pack-objects in order to implement cruft packs with expiration. But it
is lacking a couple of things. Namely, it needs:
- a mechanism to communicate the timestamps of individual recent
objects to some external caller
- and, in the case of packed objects, our future caller will also want
to know the originating pack, as well as the offset within that pack
at which the object can be found
- finally, it needs a way to skip over packs which are marked as kept
in-core.
To address the first two, add a callback interface in this patch which
reports the time of each recent object, as well as a (packed_git,
off_t) pair for packed objects.
Likewise, add a new option to the packed object iterators to skip over
packs which are marked as kept in core. This option will become
implicitly tested in a future patch.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach `pack-objects` how to generate a cruft pack when no objects are
dropped (i.e., `--cruft-expiration=never`). Later patches will teach
`pack-objects` how to generate a cruft pack that prunes objects.
When generating a cruft pack which does not prune objects, we want to
collect all unreachable objects into a single pack (noting and updating
their mtimes as we accumulate them). Ordinary use will pass the result
of a `git repack -A` as a kept pack, so when this patch says "kept
pack", readers should think "reachable objects".
Generating a non-expiring cruft packs works as follows:
- Callers provide a list of every pack they know about, and indicate
which packs are about to be removed.
- All packs which are going to be removed (we'll call these the
redundant ones) are marked as kept in-core.
Any packs the caller did not mention (but are known to the
`pack-objects` process) are also marked as kept in-core. Packs not
mentioned by the caller are assumed to be unknown to them, i.e.,
they entered the repository after the caller decided which packs
should be kept and which should be discarded.
Since we do not want to include objects in these "unknown" packs
(because we don't know which of their objects are or aren't
reachable), these are also marked as kept in-core.
- Then, we enumerate all objects in the repository, and add them to
our packing list if they do not appear in an in-core kept pack.
This results in a new cruft pack which contains all known objects that
aren't included in the kept packs. When the kept pack is the result of
`git repack -A`, the resulting pack contains all unreachable objects.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new caller in the next commit will want to immediately modify the
object_entry structure created by create_object_entry(). Instead of
forcing that caller to wastefully look-up the entry we just created,
return it from create_object_entry() instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This structure will be used to communicate the per-object mtimes when
writing a cruft pack. Here, we need the full packing_data structure
because the mtime information is stored in an array there, not on the
individual object_entry's themselves (to avoid paying the overhead in
structure width for operations which do not generate a cruft pack).
We haven't passed this information down before because one of the two
callers (in bulk-checkin.c) does not have a packing_data structure at
all. In that case (where no cruft pack will be generated), NULL is
passed instead.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To store the individual mtimes of objects in a cruft pack, introduce a
new `.mtimes` format that can optionally accompany a single pack in the
repository.
The format is defined in Documentation/technical/pack-format.txt, and
stores a 4-byte network order timestamp for each object in name (index)
order.
This patch prepares for cruft packs by defining the `.mtimes` format,
and introducing a basic API that callers can use to read out individual
mtimes.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote -v" now shows the list-objects-filter used during
fetching from the remote, if available.
* ac/remote-v-with-object-list-filters:
builtin/remote.c: teach `-v` to list filters for promisor remotes
"git -c branch.autosetupmerge=simple branch $A $B" will set the $B
as $A's upstream only when $A and $B shares the same name, and "git
-c push.default=simple" on branch $A would push to update the
branch $A at the remote $B came from. Also more places use the
sole remote, if exists, before defaulting to 'origin'.
* tk/simple-autosetupmerge:
push: new config option "push.autoSetupRemote" supports "simple" push
push: default to single remote even when not named origin
branch: new autosetupmerge option 'simple' for matching branches
In the preceding commit we moved away from using xgettext(1) to both
generate the po/git.pot, and to merge the incrementally generated
po/git.pot+ file as we sourced translations from C, shell and Perl.
Doing it this way, which dates back to my initial
implementation[1][2][3] was conflating two things: With xgettext(1)
the --from-code both controls what encoding is specified in the
po/git.pot's header, and what encoding we allow in source messages.
We don't ever want to allow non-ASCII in *source messages*, and doing
so has hid e.g. a buggy message introduced in
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10) from us, we'd warn about it before, but only when running
"make pot", but the operation would still succeed. Now we'll error out
on it when running "make pot".
Since the preceding Makefile changes made this easy: let's add a "make
check-pot" target with the same prerequisites as the "po/git.pot"
target, but without changing the file "po/git.pot". Running it as part
of the "static-analysis" CI target will ensure that we catch any such
issues in the future. E.g.:
$ make check-pot
XGETTEXT .build/pot/po/builtin/submodule--helper.c.po
xgettext: Non-ASCII string at builtin/submodule--helper.c:3381.
Please specify the source encoding through --from-code.
make: *** [.build/pot/po/builtin/submodule--helper.c.po] Error 1
1. cd5513a716 (i18n: Makefile: "pot" target to extract messages
marked for translation, 2011-02-22)
2. adc3b2b276 (Makefile: add xgettext target for *.sh files,
2011-05-14)
3. 5e9637c629 (i18n: add infrastructure for translating Git with
gettext, 2011-11-18)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch --recurse-submodules" from multiple remotes (either from
a remote group, or "--all") used to make one extra "git fetch" in
the submodules, which has been corrected.
* jc/avoid-redundant-submodule-fetch:
fetch: do not run a redundant fetch from submodule
The way "git fetch" without "--update-head-ok" ensures that HEAD in
no worktree points at any ref being updated was too wasteful, which
has been optimized a bit.
* os/fetch-check-not-current-branch:
fetch: limit shared symref check only for local branches
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
* jc/show-branch-g-current:
show-branch: -g and --current are incompatible
When using a multi-pack bitmap, pack-objects will try to perform its
traversal using a call to `traverse_bitmap_commit_list()`, which calls
`add_object_entry_from_bitmap()` to add each object it finds to its
packing list.
This path can cause pack-objects to add objects from packs that don't
have open pack_fds on them, by avoiding a call to `is_pack_valid()`.
This is because we only call `is_pack_valid()` on the preferred pack (in
order to do verbatim reuse via `reuse_partial_packfile_from_bitmap()`)
and not others when loading a MIDX bitmap.
In this case, `add_object_entry_from_bitmap()` will check whether it
wants each object entry by calling `want_object_in_pack()`, which will
call `want_found_object` (since its caller already supplied a
`found_pack`). In most cases (particularly without `--local`, and when
`ignored_packed_keep_on_disk` and `ignored_packed_keep_in_core` are
both "0"), we'll take the entry from the pack contained in the MIDX
bitmap, all without an open pack_fd.
When we then try to use that entry later to assemble the actual pack,
we'll be susceptible to any simultaneous writers moving that pack out of
the way (e.g., due to a concurrent repack) without having an open file
descriptor, causing races that result in errors like:
remote: Enumerating objects: 1498802, done.
remote: fatal: packfile ./objects/pack/pack-e57d433b5a588daa37fbe946e2b28dfaec03a93e.pack cannot be accessed
remote: aborting due to possible repository corruption on the remote side.
This race can happen even with multi-pack bitmaps, since we may open a
MIDX bitmap that is being rewritten long before its packs are actually
unlinked.
Work around this by calling `is_pack_valid()` from within
`want_found_object()`, matching the behavior in
`want_object_in_pack_one()` (which has an analogous call). Most calls to
`is_pack_valid()` should be basically no-ops, since only the first call
requires us to open a file (subsequent calls realize the file is already
open, and return immediately).
Importantly, when `want_object_in_pack()` is given a non-NULL
`*found_pack`, but `want_found_object()` rejects the copy of the object
in that pack, we must reset `*found_pack` and `*found_offset` to NULL
and 0, respectively. Failing to do so could lead to other checks in
`want_object_in_pack()` (such as `want_object_in_pack_one()`) using the
same (invalid) pack as `*found_pack`, meaning that we don't call
`is_pack_valid()` because `p == *found_pack`. This can lead the caller
to believe it can use a copy of an object from an invalid pack.
An alternative approach to closing this race would have been to call
`is_pack_valid()` on _all_ packs in a multi-pack bitmap on load. This
has a couple of problems:
- it is unnecessarily expensive in the cases where we don't actually
need to open any packs (e.g., in `git rev-list --use-bitmap-index
--count`)
- more importantly, it means any time we would have hit this race,
we'll avoid using bitmaps altogether, leading to significant
slowdowns by forcing a full object traversal
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A subsequent patch will teach `want_object_in_pack()` to set its
`*found_pack` and `*found_offset` poitners to NULL when the provided
pack does not pass the `is_pack_valid()` check.
The `--stdin-packs` mode of `pack-objects` is not quite prepared to
handle this. To prepare it for this change, do the following two things:
- Ensure provided packs pass the `is_pack_valid()` check when
collecting the caller-provided packs into the "included" and
"excluded" lists.
- Gracefully handle any _invalid_ packs being passed to
`want_object_in_pack()`.
Calling `is_pack_valid()` early on makes it substantially less likely
that we will have to deal with a pack going away, since we'll have an
open file descriptor on its contents much earlier.
But even packs with open descriptors can become invalid in the future if
we (a) hit our open descriptor limit, forcing us to close some open
packs, and (b) one of those just-closed packs has gone away in the
meantime.
`add_object_entry_from_pack()` depends on having a non-NULL
`*found_pack`, since it passes that pointer to `packed_object_info()`,
meaning that we would SEGV if the pointer became NULL (like we propose
to do in `want_object_in_pack()` in the following patch).
But avoiding calling `packed_object_info()` entirely is OK, too, since
its only purpose is to identify which objects in the included packs are
commits, so that they can form the tips of the advisory traversal used
to discover the object namehashes.
Failing to do this means that at worst we will produce lower-quality
deltas, but it does not prevent us from generating the pack as long as
we can find a copy of each object from the disappearing pack in some
other part of the repository.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before calling `for_each_object_in_pack()`, the caller
`read_packs_list_from_stdin()` loops through each of the `include_packs`
and checks that its `->util` pointer (which is used to store the `struct
packed_git *` itself) is non-NULL.
This check is redundant, because `read_packs_list_from_stdin()` already
checks that the included packs are non-NULL earlier on in the same
function (and it does not add any new entries in between).
Remove this check, since it is not doing anything in the meantime.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When modifying the sparse-checkout definition, the sparse-checkout
builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all
cache entries in the index. Before, we needed the index to be fully
expanded in order to ensure we had the full list of files necessary that
match the new patterns.
Insert a call to reset_sparse_directories() that expands sparse
directories that are within the new pattern list, but only far enough
that every necessary file path now exists as a cache entry. The
remaining logic within update_sparsity() will modify the SKIP_WORKTREE
bits appropriately.
This allows us to disable command_requires_full_index within the
sparse-checkout builtin. Add tests that demonstrate that we are not
expanding to a full index unnecessarily.
We can see the improved performance in the p2000 test script:
Test HEAD~1 HEAD
------------------------------------------------------------------------
2000.24: git ... (sparse-v3) 2.14(1.55+0.58) 1.57(1.03+0.53) -26.6%
2000.25: git ... (sparse-v4) 2.20(1.62+0.57) 1.58(0.98+0.59) -28.2%
These reductions of 26-28% are small compared to most examples, but the
time is dominated by writing a new copy of the base repository to the
worktree and then deleting it again. The fact that the previous index
expansion was such a large portion of the time is telling how important
it is to complete this sparse index integration.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the --no-sparse-index option is supplied, the sparse-checkout
builtin should explicitly ask to expand a sparse index to a full one.
This is currently done implicitly due to the command_requires_full_index
protection, but that will be removed in an upcoming change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future change will present a temporary, in-memory mode where the index
can both contain sparse directory entries but also not be completely
collapsed to the smallest possible sparse directories. This will be
necessary for modifying the sparse-checkout definition while using a
sparse index.
For now, convert the single-bit member 'sparse_index' in 'struct
index_state' to be a an 'enum sparse_index_mode' with three modes:
* INDEX_EXPANDED (0): No sparse directories exist. This is always the
case for repositories that do not use cone-mode sparse-checkout.
* INDEX_COLLAPSED: Sparse directories may exist. Files outside the
sparse-checkout cone are reduced to sparse directory entries whenever
possible.
* INDEX_PARTIALLY_SPARSE: Sparse directories may exist. Some file
entries outside the sparse-checkout cone may exist. Running
convert_to_sparse() may further reduce those files to sparse directory
entries.
The main reason to store this extra information is to allow
convert_to_sparse() to short-circuit when the index is already in
INDEX_EXPANDED mode but to actually do the necessary work when in
INDEX_PARTIALLY_SPARSE mode.
The INDEX_PARTIALLY_SPARSE mode will be used in an upcoming change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce and apply coccinelle rule to discourage an explicit
comparison between a pointer and NULL, and applies the clean-up to
the maintenance track.
* ep/maint-equals-null-cocci:
tree-wide: apply equals-null.cocci
tree-wide: apply equals-null.cocci
contrib/coccinnelle: add equals-null.cocci
"git show :<path>" learned to work better with the sparse-index
feature.
* ds/sparse-colon-path:
rev-parse: integrate with sparse index
object-name: diagnose trees in index properly
object-name: reject trees found in the index
show: integrate with the sparse index
t1092: add compatibility tests for 'git show'
Teach "git stash" to work better with sparse index entries.
* vd/sparse-stash:
unpack-trees: preserve index sparsity
stash: apply stash using 'merge_ort_nonrecursive()'
read-cache: set sparsity when index is new
sparse-index: expose 'is_sparse_index_allowed()'
stash: integrate with sparse index
stash: expand sparse-checkout compatibility testing
"git bisect" was too silent before it is ready to start computing
the actual bisection, which has been corrected.
* cd/bisect-messages-from-pre-flight-states:
bisect: output bisect setup status in bisect log
bisect: output state before we are ready to compute bisection
"git pull" without "--recurse-submodules=<arg>" made
submodule.recurse take precedence over fetch.recurseSubmodules by
mistake, which has been corrected.
* gc/pull-recurse-submodules:
pull: do not let submodule.recurse override fetch.recurseSubmodules
The previous patch demonstrates a scenario where the list of packs
written by `pack-objects` (and stored in the `names` string_list) is
out-of-order, and can thus cause us to delete packs we shouldn't.
This patch resolves that bug by ensuring that `names` is sorted in all
cases, not just when
delete_redundant && pack_everything & ALL_INTO_ONE
is true.
Because we did sort `names` in that case (which, prior to `--geometric`
repacks, was the only time we would actually delete packs, this is only
a bug for `--geometric` repacks.
It would be sufficient to only sort `names` when `delete_redundant` is
set to a non-zero value. But sorting a small list of strings is cheap,
and it is defensive against future calls to `string_list_has_string()`
on this list.
Co-discovered-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'repack' to ignore packs named on the command line with the
'--keep-pack' option. Specifically, modify 'init_pack_geometry()' to treat
command line-kept packs the same way it treats packs with an on-disk '.keep'
file (that is, skip the pack and do not include it in the 'geometry'
structure).
Without this handling, a '--keep-pack' pack would be included in the
'geometry' structure. If the pack is *before* the geometry split line (with
at least one other pack and/or loose objects present), 'repack' assumes the
pack's contents are "rolled up" into another pack via 'pack-objects'.
However, because the internally-invoked 'pack-objects' properly excludes
'--keep-pack' objects, any new pack it creates will not contain the kept
objects. Finally, 'repack' deletes the '--keep-pack' as "redundant" (since
it assumes 'pack-objects' created a new pack with its contents), resulting
in possible object loss and repository corruption.
Add a test ensuring that '--keep-pack' packs are now appropriately handled.
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In c7c4bdeccf (run-command API: remove "env" member, always use
"env_array", 2021-11-25), there was a push to replace
cld.env = env->v;
with
strvec_pushv(&cld.env_array, env->v);
The conversion in c7c4bdeccf was mostly plug-and-play, with the snag
that some instances of strvec_pushv() became guarded with a NULL check
to ensure that the second argument was non-NULL.
This conversion was slightly over-eager to add a conditional in
builtin/receive-pack.c::unpack(), since we know at the point that we
add the result of `tmp_objdir_env()` into the child process's
environment, that `tmp_objdir` is non-NULL.
This follows from the conditional just before our strvec_pushv() call
(which returns from the function if `tmp_objdir` was NULL), as well as
the call to tmp_objdir_add_as_alternate() just below, which relies on
its argument (`tmp_objdir`) being non-NULL.
In the meantime, this extra conditional isn't hurting anything. But it
is redundant and thus unnecessarily confusing. So let's remove it.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 7dce19d3 (fetch/pull: Add the --recurse-submodules option,
2010-11-12) introduced the "--recurse-submodule" option, the
approach taken was to perform fetches in submodules only once, after
all the main fetching (it may usually be a fetch from a single
remote, but it could be fetching from a group of remotes using
fetch_multiple()) succeeded. Later we added "--all" to fetch from
all defined remotes, which complicated things even more.
If your project has a submodule, and you try to run "git fetch
--recurse-submodule --all", you'd see a fetch for the top-level,
which invokes another fetch for the submodule, followed by another
fetch for the same submodule. All but the last fetch for the
submodule come from a "git fetch --recurse-submodules" subprocess
that is spawned via the fetch_multiple() interface for the remotes,
and the last fetch comes from the code at the end.
Because recursive fetching from submodules is done in each fetch for
the top-level in fetch_multiple(), the last fetch in the submodule
is redundant. It only matters when fetch_one() interacts with a
single remote at the top-level.
While we are at it, there is one optimization that exists in dealing
with a group of remote, but is missing when "--all" is used. In the
former, when the group turns out to be a group of one, instead of
spawning "git fetch" as a subprocess via the fetch_multiple()
interface, we use the normal fetch_one() code path. Do the same
when handing "--all", if it turns out that we have only one remote
defined.
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This method was initially written in 63e95beb0 (submodule: port
resolve_relative_url from shell to C, 2016-05-15). As we will need
similar functionality in the bundle URI feature, extract this to be
available in remote.h.
The code is almost exactly the same, except for the following trivial
differences:
* Fix whitespace and wrapping issues with the prototype and argument
lists.
* Let's call starts_with_dot_{,dot_}slash_native() instead of the
functionally identical "starts_with_dot_{,dot_}slash()" wrappers
"builtin/submodule--helper.c".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a path_match_flags() function and have the two sets of
starts_with_dot_{,dot_}slash() functions added in
63e95beb08 (submodule: port resolve_relative_url from shell to C,
2016-04-15) and a2b26ffb1a (fsck: convert gitmodules url to URL
passed to curl, 2020-04-18) be thin wrappers for it.
As the latter of those notes the fsck version was copied from the
initial builtin/submodule--helper.c version.
Since the code added in a2b26ffb1a was doing really doing the same as
win32_is_dir_sep() added in 1cadad6f65 (git clone <url>
C:\cygwin\home\USER\repo' is working (again), 2018-12-15) let's move
the latter to git-compat-util.h is a is_xplatform_dir_sep(). We can
then call either it or the platform-specific is_dir_sep() from this
new function.
Let's likewise change code in various other places that was hardcoding
checks for "'/' || '\\'" with the new is_xplatform_dir_sep(). As can
be seen in those callers some of them still concern themselves with
':' (Mac OS classic?), but let's leave the question of whether that
should be consolidated for some other time.
As we expect to make wider use of the "native" case in the future,
define and use two starts_with_dot_{,dot_}slash_native() convenience
wrappers. This makes the diff in builtin/submodule--helper.c much
smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This check was introduced in 8ee5d73137 (Fix fetch/pull when run without
--update-head-ok, 2008-10-13) in order to protect against replacing the ref
of the active branch by mistake, for example by running git fetch origin
master:master.
It was later extended in 8bc1f39f41 (fetch: protect branches checked out
in all worktrees, 2021-12-01) to scan all worktrees.
This operation is very expensive (takes about 30s in my repository) when
there are many tags or branches, and it is executed on every fetch, even if
no local heads are updated at all.
Limit it to protect only refs/heads/* to improve fetch performance.
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 4c28e4ada0 (commit: die before asking to edit the log
message, 2010-12-20), we have been "leaking" the "author_ident" when
prepare_to_commit() fails. Instead of returning from right there,
introduce an exit status variable and jump to the clean-up label
at the end.
Instead of explicitly releasing the resource with strbuf_release(),
mark the variable with UNLEAK() at the end, together with two other
variables that are already marked as such. If this were in a
utility function that is called number of times, but these are
different, we should explicitly release resources that grow
proportionally to the size of the problem being solved, but
cmd_commit() is like main() and there is no point in spending extra
cycles to release individual pieces of resource at the end, just
before process exit will clean everything for us for free anyway.
This fixes a leak demonstrated by e.g. "t3505-cherry-pick-empty.sh",
but unfortunately we cannot mark it or other affected tests as passing
now with "TEST_PASSES_SANITIZE_LEAK=true" as we'll need to fix many
other memory leaks before doing so.
Incidentally there are two tests that always passes the leak checker
with or without this change. Mark them as such.
This is based on an earlier patch by Ævar, but takes a different
approach that is more maintainable.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in "git pull" where `submodule.recurse` is preferred over
`fetch.recurseSubmodules` when performing a fetch
(Documentation/config/fetch.txt says that `fetch.recurseSubmodules`
should be preferred.). Do this by passing the value of the
"--recurse-submodules" CLI option to the underlying fetch, instead of
passing a value that combines the CLI option and config variables.
In other words, this bug occurred because builtin/pull.c is conflating
two similar-sounding, but different concepts:
- Whether "git pull" itself should care about submodules e.g. whether it
should update the submodule worktrees after performing a merge.
- The value of "--recurse-submodules" to pass to the underlying "git
fetch".
Thus, when `submodule.recurse` is set, the underlying "git fetch" gets
invoked with "--recurse-submodules[=value]", overriding the value of
`fetch.recurseSubmodules`.
An alternative (and more obvious) approach to fix the bug would be to
teach "git pull" to understand `fetch.recurseSubmodules`, but the
proposed solution works better because:
- We don't maintain two identical config-parsing implementions in "git
pull" and "git fetch".
- It works better with other commands invoked by "git pull" e.g. "git
merge" won't accidentally respect `fetch.recurseSubmodules`.
Reported-by: Huang Zou <huang.zou@schrodinger.com>
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
* ah/rebase-keep-base-fix:
rebase: use correct base for --keep-base when a branch is given
This allows seeing the current intermediate status without adding a new
good or bad commit:
$ git bisect log | tail -1
# status: waiting for bad commit, 1 good commit known
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 73c6de06af ("bisect: don't use invalid oid as rev when
starting") changes the behaviour of `git bisect` to consider invalid
oids as pathspecs again, as in the old shell implementation.
While that behaviour may be desirable, it can also cause confusion. For
example, while bisecting in a particular repo I encountered this:
$ git bisect start d93ff48803f0 v6.3
$
...which led to me sitting for a few moments, wondering why there's no
printout stating the first rev to check.
It turns out that the tag was actually "6.3", not "v6.3", and thus the
bisect was still silently started with only a bad rev, because
d93ff48803f0 was a valid oid and "v6.3" was silently considered to be a
pathspec.
While this behaviour may be desirable, it can be confusing, especially
with different repo conventions either using or not using "v" before
release names, or when a branch name or tag is simply misspelled on the
command line.
In order to avoid situations like this, make it more clear what we're
waiting for:
$ git bisect start d93ff48803f0 v6.3
status: waiting for good commit(s), bad commit known
We already have good output once the bisect process has begun in
earnest, so we don't need to do anything more there.
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The progress meter of "git blame" was showing incorrect numbers
when processing only parts of the file.
* ea/progress-partial-blame:
blame: report correct number of lines in progress when using ranges
Update 'stash' to use 'merge_ort_nonrecursive()' to apply a stash to the
current working tree. When 'git stash apply' was converted from its shell
script implementation to a builtin in 8a0fc8d19d (stash: convert apply to
builtin, 2019-02-25), 'merge_recursive_generic()' was used to merge a stash
into the working tree as part of 'git stash (apply|pop)'. However, with the
single merge base used in 'do_apply_stash()', the commit wrapping done by
'merge_recursive_generic()' is not only unnecessary, but misleading (the
*real* merge base is labeled "constructed merge base"). Therefore, a
non-recursive merge of the working tree, stashed tree, and stash base tree
is more appropriate.
There are two options for a non-recursive merge-then-update-worktree
function: 'merge_trees()' and 'merge_ort_nonrecursive()'. Use
'merge_ort_nonrecursive()' to align with the default merge strategy used by
'git merge' (6a5fb96672 (Change default merge backend from recursive to ort,
2021-08-04)) and, because merge-ort does not operate in-place on the index,
avoid unnecessary index expansion. Update tests in 't1092' verifying index
expansion for 'git stash' accordingly.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.
With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.
Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git remote -v` (`--verbose`) lists down the names of remotes along with
their URLs. It would be beneficial for users to also specify the filter
types for promisor remotes. Something like this -
origin remote-url (fetch) [blob:none]
origin remote-url (push)
Teach `git remote -v` to also specify the filters for promisor remotes.
Closes: https://github.com/gitgitgadget/git/issues/1211
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch <args> -- <pathspec>" lost the pathspec when
showing the second and subsequent commits, which has been
corrected.
source: <c36896a1-6247-123b-4fa3-b7eb24af1897@web.de>
* rs/format-patch-pathspec-fix:
2.36 format-patch regression fix
"git fast-export -- <pathspec>" lost the pathspec when showing the
second and subsequent commits, which has been corrected.
source: <2c988c7b-0efe-4222-4a43-8124fe1a9da6@web.de>
* rs/fast-export-pathspec-fix:
2.36 fast-export regression fix
"git show <commit1> <commit2>... -- <pathspec>" lost the pathspec
when showing the second and subsequent commits, which has been
corrected.
source: <xmqqo80j87g0.fsf_-_@gitster.g>
* jc/show-pathspec-fix:
2.36 show regression fix
Regression fix for 2.36 where "git name-rev" started to sometimes
reference strings after they are freed.
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <340c8810-d912-7b18-d46e-a9d43f20216a@web.de>
* rs/name-rev-fix-free-after-use:
Revert "name-rev: release unused name strings"
"diff-tree --stdin" has been broken for about a year, but 2.36
release broke it even worse by breaking running the command with
<pathspec>, which in turn broke "gitk" and got noticed. This has
been corrected by aligning its behaviour to that of "log".
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <xmqq7d7bsu2n.fsf@gitster.g>
* jc/diff-tree-stdin-fix:
2.36 gitk/diff-tree --stdin regression fix
"git submodule update" without pathspec should silently skip an
uninitialized submodule, but it started to become noisy by mistake.
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <pull.1258.v2.git.git.1650890741430.gitgitgadget@gmail.com>
* gc/submodule-update-part2:
submodule--helper: fix initialization of warn_if_uninitialized
The path taken by "git multi-pack-index" command from the end user
was compared with path internally prepared by the tool withut first
normalizing, which lead to duplicated paths not being noticed,
which has been corrected.
* ds/midx-normalize-pathname-before-comparison:
cache: use const char * for get_object_directory()
multi-pack-index: use --object-dir real path
midx: use real paths in lookup_multi_pack_index()
"git clone --origin X" leaked piece of memory that held value read
from the clone.defaultRemoteName configuration variable, which has
been plugged.
* jc/clone-remote-name-leak-fix:
clone: plug a miniscule leak
"git format-patch <args> -- <pathspec>" lost the pathspec when
showing the second and subsequent commits, which has been
corrected.
* rs/format-patch-pathspec-fix:
2.36 format-patch regression fix
"git fast-export -- <pathspec>" lost the pathspec when showing the
second and subsequent commits, which has been corrected.
* rs/fast-export-pathspec-fix:
2.36 fast-export regression fix
"git show <commit1> <commit2>... -- <pathspec>" lost the pathspec
when showing the second and subsequent commits, which has been
corrected.
* jc/show-pathspec-fix:
2.36 show regression fix
The remote_name variable is first assigned a copy of the value of
the "clone.defaultremotename" configuration variable and then by the
value of the "--origin" command line option. The former is prepared
to see multiple instances of the configuration variable by freeing
the current value of the variable before a copy of the newly
discovered value gets assigned to it. The latter however blindly
assigned a copy of the new value to the variable, thereby leaking
the value read from the configuration variable.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e900d494dc (diff: add an API for deferred freeing, 2021-02-11) added a
way to allow reusing diffopts: the no_free bit. 244c27242f (diff.[ch]:
have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made
that mechanism mandatory.
git fast-export doesn't set no_free, so path limiting stopped working
after the first commit. Set the flag and add a basic test to make sure
only changes to the specified files are exported.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e900d494dc (diff: add an API for deferred freeing, 2021-02-11) added a
way to allow reusing diffopts: the no_free bit. 244c27242f (diff.[ch]:
have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made
that mechanism mandatory.
git format-patch only sets no_free when --output is given, causing it to
forget pathspecs after the first commit. Set no_free unconditionally
instead.
The existing test was unable to detect this breakage because it checks
stderr for the absence of a certain string, but format-patch writes to
stdout. Also the test was not checking the case of one commit modifying
multiple files and a pathspec limiting the diff. Replace it with a more
thorough one.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This only surfaced as a regression after 2.36 release, but the
breakage was already there with us for at least a year.
e900d494 (diff: add an API for deferred freeing, 2021-02-11)
introduced a mechanism to delay freeing resources held in
diff_options struct that need to be kept as long as the struct will
be reused to compute diff. "git log -p" was taught to utilize the
mechanism but it was done with an incorrect assumption that the
underlying helper function, cmd_log_walk(), is called only once,
and it is OK to do the freeing at the end of it.
Alas, for "git show A B", the function is called once for each
commit given, so it is not OK to free the resources until we finish
calling it for all the commits given from the command line.
During 2.36 release cycle, we started clearing the <pathspec> as
part of this freeing, which made the bug a lot more visible.
Fix this breakage by tweaking how cmd_log_walk() frees the resources
at the end and using a variant of it that does not immediately free
the resources to show each commit object from the command line in
"git show".
Protect the fix with a few new tests.
Reported-by: Daniel Li <dan@danielyli.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some "simple" centralized workflows, users expect remote tracking
branch names to match local branch names. "git push" pushes to the
remote version/instance of the branch, and "git pull" pulls any changes
to the remote branch (changes made by the same user in another place, or
by other users).
This expectation is supported by the push.default default option "simple"
which refuses a default push for a mismatching tracking branch name, and
by the new branch.autosetupmerge option, "simple", which only sets up
remote tracking for same-name remote branches.
When a new branch has been created by the user and has not yet been
pushed (and push.default is not set to "current"), the user is prompted
with a "The current branch %s has no upstream branch" error, and
instructions on how to push and add tracking.
This error is helpful in that following the advice once per branch
"resolves" the issue for that branch forever, but inconvenient in that
for the "simple" centralized workflow, this is always the right thing to
do, so it would be better to just do it.
Support this workflow with a new config setting, push.autoSetupRemote,
which will cause a default push, when there is no remote tracking branch
configured, to push to the same-name on the remote and --set-upstream.
Also add a hint offering this new option when the "The current branch %s
has no upstream branch" error is encountered, and add corresponding tests.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the default push.default option, "simple", beginners are
protected from accidentally pushing to the "wrong" branch in
centralized workflows: if the remote tracking branch they would push
to does not have the same name as the local branch, and they try to do
a "default push", they get an error and explanation with options.
There is a particular centralized workflow where this often happens:
a user branches to a new local topic branch from an existing
remote branch, eg with "checkout -b feature1 origin/master". With
the default branch.autosetupmerge configuration (value "true"), git
will automatically add origin/master as the upstream tracking branch.
When the user pushes with a default "git push", with the intention of
pushing their (new) topic branch to the remote, they get an error, and
(amongst other things) a suggestion to run "git push origin HEAD".
If they follow this suggestion the push succeeds, but on subsequent
default pushes they continue to get an error - so eventually they
figure out to add "-u" to change the tracking branch, or they spelunk
the push.default config doc as proposed and set it to "current", or
some GUI tooling does one or the other of these things for them.
When one of their coworkers later works on the same topic branch,
they don't get any of that "weirdness". They just "git checkout
feature1" and everything works exactly as they expect, with the shared
remote branch set up as remote tracking branch, and push and pull
working out of the box.
The "stable state" for this way of working is that local branches have
the same-name remote tracking branch (origin/feature1 in this
example), and multiple people can work on that remote feature branch
at the same time, trusting "git pull" to merge or rebase as required
for them to be able to push their interim changes to that same feature
branch on that same remote.
(merging from the upstream "master" branch, and merging back to it,
are separate more involved processes in this flow).
There is a problem in this flow/way of working, however, which is that
the first user, when they first branched from origin/master, ended up
with the "wrong" remote tracking branch (different from the stable
state). For a while, before they pushed (and maybe longer, if they
don't use -u/--set-upstream), their "git pull" wasn't getting other
users' changes to the feature branch - it was getting any changes from
the remote "master" branch instead (a completely different class of
changes!)
An experienced git user might say "well yeah, that's what it means to
have the remote tracking branch set to origin/master!" - but the
original user above didn't *ask* to have the remote master branch
added as remote tracking branch - that just happened automatically
when they branched their feature branch. They didn't necessarily even
notice or understand the meaning of the "set up to track 'origin/master'"
message when they created the branch - especially if they are using a
GUI.
Looking at how to fix this, you might think "OK, so disable auto setup
of remote tracking - set branch.autosetupmerge to false" - but that
will inconvenience the *second* user in this story - the one who just
wanted to start working on the topic branch. The first and second
users swap roles at different points in time of course - they should
both have a sane configuration that does the right thing in both
situations.
Make this "branches have the same name locally as on the remote"
workflow less painful / more obvious by introducing a new
branch.autosetupmerge option called "simple", to match the same-name
"push.default" option that makes similar assumptions.
This new option automatically sets up tracking in a *subset* of the
current default situations: when the original ref is a remote tracking
branch *and* has the same branch name on the remote (as the new local
branch name).
Update the error displayed when the 'push.default=simple' configuration
rejects a mismatching-upstream-name default push, to offer this new
branch.autosetupmerge option that will prevent this class of error.
With this new configuration, in the example situation above, the first
user does *not* get origin/master set up as the tracking branch for
the new local branch. If they "git pull" in their new local-only
branch, they get an error explaining there is no upstream branch -
which makes sense and is helpful. If they "git push", they get an
error explaining how to push *and* suggesting they specify
--set-upstream - which is exactly the right thing to do for them.
This new option is likely not appropriate for users intentionally
implementing a "triangular workflow" with a shared upstream tracking
branch, that they "git pull" in and a "private" feature branch that
they push/force-push to just for remote safe-keeping until they are
ready to push up to the shared branch explicitly/separately. Such
users are likely to prefer keeping the current default
merge.autosetupmerge=true behavior, and change their push.default to
"current".
Also extend the existing branch tests with three new cases testing
this option - the obvious matching-name and non-matching-name cases,
and also a non-matching-ref-type case. The matching-name case needs to
temporarily create an independent repo to fetch from, as the general
strategy of using the local repo as the remote in these tests
precludes locally branching with the same name as in the "remote".
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Regression fix for 2.36 where "git name-rev" started to sometimes
reference strings after they are freed.
* rs/name-rev-fix-free-after-use:
Revert "name-rev: release unused name strings"
"diff-tree --stdin" has been broken for about a year, but 2.36
release broke it even worse by breaking running the command with
<pathspec>, which in turn broke "gitk" and got noticed. This has
been corrected by aligning its behaviour to that of "log".
* jc/diff-tree-stdin-fix:
2.36 gitk/diff-tree --stdin regression fix
"git submodule update" without pathspec should silently skip an
uninitialized submodule, but it started to become noisy by mistake.
* gc/submodule-update-part2:
submodule--helper: fix initialization of warn_if_uninitialized
It is not obvious that the 'git rev-parse' builtin would use the sparse
index, but it is possible to parse paths out of the index using the
":<path>" syntax. The 'git rev-parse' output is only the OID of the
object found at that location, but otherwise behaves similarly to 'git
show :<path>'. This includes the failure conditions on directories and
the error messages depending on whether a path is in the worktree or
not.
The only code change required is to change the
command_requires_full_index setting in builtin/rev-parse.c, and we can
re-use many existing 'git show' tests for the rev-parse case.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git show' command can take an input to request the state of an
object in the index. This can lead to parsing the index in order to load
a specific file entry. Without the change presented here, a sparse index
would expand to a full one, taking much longer than usual to access a
simple file.
There is one behavioral change that happens here, though: we now can
find a sparse directory entry within the index! Commands that previously
failed because we could not find an entry in the worktree or index now
succeed because we _do_ find an entry in the index.
There might be more work to do to make other situations succeed when
looking for an indexed tree, perhaps by looking at or updating the
cache-tree extension as needed. These situations include having a full
index or asking for a directory that is within the sparse-checkout cone
(and hence is not a sparse directory entry in the index).
For now, we demonstrate how the sparse index integration is extremely
simple for files outside of the cone as well as directories within the
cone. A later change will resolve this behavior around sparse
directories.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The .warn_if_uninitialized member was introduced by 48308681
(git submodule update: have a dedicated helper for cloning,
2016-02-29) to submodule_update_clone struct and initialized to
false. When c9911c93 (submodule--helper: teach update_data more
options, 2022-03-15) moved it to update_data struct, it started
to initialize it to true but this change was not explained in
its log message.
The member is set to true only when pathspec was given, and is
used when a submodule that matched the pathspec is found
uninitialized to give diagnostic message. "submodule update"
without pathspec is supposed to iterate over all submodules
(i.e. without pathspec limitation) and update only the
initialized submodules, and finding uninitialized submodules
during the iteration is a totally expected and normal thing that
should not be warned.
[jc: added tests]
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This only surfaced as a regression after 2.36 release, but the
breakage was already there with us for at least a year.
The diff_free() call is to be used after we completely finished with
a diffopt structure. After "git diff A B" finishes producing
output, calling it before process exit is fine. But there are
commands that prepares diff_options struct once, compares two sets
of paths, releases resources that were used to do the comparison,
then reuses the same diff_option struct to go on to compare the next
two sets of paths, like "git log -p".
After "git log -p" finishes showing a single commit, calling it
before it goes on to the next commit is NOT fine. There is a
mechanism, the .no_free member in diff_options struct, to help "git
log" to avoid calling diff_free() after showing each commit and
instead call it just one. When the mechanism was introduced in
e900d494 (diff: add an API for deferred freeing, 2021-02-11),
however, we forgot to do the same to "diff-tree --stdin", which *is*
a moral equivalent to "git log".
During 2.36 release cycle, we started clearing the pathspec in
diff_free(), so programs like gitk that runs
git diff-tree --stdin -- <pathspec>
downstream of a pipe, processing one commit after another, started
showing irrelevant comparison outside the given <pathspec> from the
second commit. The same commit, by forgetting to teach the .no_free
mechanism, broke "diff-tree --stdin -I<regexp>" and nobody noticed
it for over a year, presumably because it is so seldom used an
option.
But <pathspec> is a different story. The breakage was very
prominently visible and was reported immediately after 2.36 was
released.
Fix this breakage by mimicking how "git log" utilizes the .no_free
member so that "diff-tree --stdin" behaves more similarly to "log".
Protect the fix with a few new tests.
Reported-by: Matthias Aßhauer <mha1993@live.de>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --object-dir argument to 'git multi-pack-index' allows a user to
specify an alternate to use instead of the local $GITDIR. This is used
by third-party tools like VFS for Git to maintain the pack-files in a
"shared object cache" used by multiple clones.
On Windows, the user can specify a path using a Windows-style file path
with backslashes such as "C:\Path\To\ObjectDir". This same path style is
used in the .git/objects/info/alternates file, so it already matches the
path of that alternate. However, find_odb() converts these paths to
real-paths for the comparison, which use forward slashes. As of the
previous change, lookup_multi_pack_index() uses real-paths, so it
correctly finds the target multi-pack-index when given these paths.
Some commands such as 'git multi-pack-index repack' call child processes
using the object_dir value, so it can be helpful to convert the path to
the real-path before sending it to those locations.
Add a callback to convert the real path immediately upon parsing the
argument. We need to be careful that we don't store the exact value out
of get_object_directory() and free it, or we could corrupt a later use
of the_repository->objects->odb->path.
We don't use get_object_directory() for the initial instantiation in
cmd_multi_pack_index() because we need 'git multi-pack-index -h' to work
without a Git repository.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 2d53975488.
3656f84278 (name-rev: prefer shorter names over following merges,
2021-12-04) broke the assumption of 2d53975488 (name-rev: release unused
name strings, 2020-02-04) that a better name for a child is a better
name for all of its ancestors as well, because it added a penalty for
generation > 0. This leads to strings being free(3)'d that are still
needed.
079f970971 (name-rev: sort tip names before applying, 2020-02-05)
already reduced the number of free(3) calls for the use case that
motivated the original patch (name-rev --all in the Chromium repository)
from ca. 44000 to 5, and 3656f84278 eliminated even those few. So this
revert won't affect name-rev's performance on that particular repo.
Reported-by: Thomas Hurst <tom@hur.st>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make cone mode the default, and update the documentation accordingly.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "--current" is given to "git show-branch" running in the
"--reflog" mode, the code tries to reference a "reflog" message
that does not even exist. This is because the --current is not
prepared to work in that mode.
The reason "--current" exists is to support this request:
I list branches on the command line. These are the branchesI
care about and I use as anchoring points. I may or may not be on
one of these main branches. Please make sure I can view the
commits on the current branch with respect to what is in these
other branches.
And to serve that request, the code checks if the current branch is
among the ones listed on the command line, and adds it only if it is
not to the end of one array, which essentially lists the objects.
The reflog mode additionally uses another array to list reflog
messages, which the "--current" code does not add to. This leaves
one uninitialized slot at the end of the array of reflog messages,
and causes the program to show garbage or segfault.
Catch the unsupported (and meaningless) combination and exit with a
usage error.
There are other combinations of options that are incompatible but
have not been tested. Add test to cover them while adding coverage
for this new combination.
Reported-by: Gregory David <gregory.david@p1sec.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--keep-base rebases onto the merge base of the given upstream and the
current HEAD regardless of whether a branch is given. This is contrary
to the documentation and to the option's intended purpose. Instead,
rebase onto the merge base of the given upstream and the given branch.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is an if statement where both if and else have the same
assignment of options.type to REBASE_MERGE. Simplify
it by getting that assigmnent out of the if.
Signed-off-by: Edmundo Carmona Antoranz <eantoranz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call diff_free() on the "pruning" member of "struct rev_info". Doing
so makes several tests pass under SANITIZE=leak.
This was also the last missing piece that allows us to remove the
UNLEAK() in "cmd_diff" and "cmd_diff_index", which allows us to use
those commands as a canary for general leaks in the revisions API. See
[1] for further rationale, and 886e1084d7 (builtin/: add UNLEAKs,
2017-10-01) for the commit that added the UNLEAK() there.
1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"prune_data" in the "struct rev_info". This means that any code that
calls "release_revisions()" already can get rid of adjacent calls to
clear_pathspec().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"mailmap" in the "struct rev_info".
The log family of functions now calls the clear_mailmap() function
added in fa8afd18e5a (revisions API: provide and use a
release_revisions(), 2021-09-19), allowing us to whitelist some tests
with "TEST_PASSES_SANITIZE_LEAK=true".
Unfortunately having a pointer to a mailmap in "struct rev_info"
instead of an embedded member that we "own" get a bit messy, as can be
seen in the change to builtin/commit.c.
When we free() this data we won't be able to tell apart a pointer to a
"mailmap" on the heap from one on the stack. As seen in
ea57bc0d41 (log: add --use-mailmap option, 2013-01-05) the "log"
family allocates it on the heap, but in the find_author_by_nickname()
code added in ea16794e43 (commit: search author pattern against
mailmap, 2013-08-23) we allocated it on the stack instead.
Ideally we'd simply change that member to a "struct string_list
mailmap" and never free() the "mailmap" itself, but that would be a
much larger change to the revisions API.
We have code that needs to hand an existing "mailmap" to a "struct
rev_info", while we could change all of that, let's not go there
now.
The complexity isn't in the ownership of the "mailmap" per-se, but
that various things assume a "rev_info.mailmap == NULL" means "doesn't
want mailmap", if we changed that to an init'd "struct string_list
we'd need to carefully refactor things to change those assumptions.
Let's instead always free() it, and simply declare that if you add
such a "mailmap" it must be allocated on the heap. Any modern libc
will correctly panic if we free() a stack variable, so this should be
safe going forward.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use release_revisions() for users of "struct rev_list" that reach into
the "struct rev_info" and clear the "prune_data" already.
In a subsequent commit we'll teach release_revisions() to clear this
itself, but in the meantime let's invoke release_revisions() here to
clear anything else we may have missed, and for reasons of having
consistent boilerplate.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use a release_revisions() with those "struct rev_list" users which
already "UNLEAK" the struct. It may seem odd to simultaneously attempt
to free() memory, but also to explicitly ignore whether we have memory
leaks in the same.
As explained in preceding commits this is being done to use the
built-in commands as a guinea pig for whether the release_revisions()
function works as expected, we'd like to test e.g. whether we segfault
as we change it. In subsequent commits we'll then remove these
UNLEAK() as the function is made to free the memory that caused us to
add them in the first place.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for having the "log" family of functions make wider use
of release_revisions() let's have them call it just before
exiting. This changes the "log", "whatchanged", "show",
"format-patch", etc. commands, all of which live in this file.
The release_revisions() API still only frees the "pending" member, but
will learn to release more members of "struct rev_info" in subsequent
commits.
In the case of "format-patch" revert the addition of UNLEAK() in
dee839a263 (format-patch: mark rev_info with UNLEAK, 2021-12-16),
which will cause several tests that previously passed under
"TEST_PASSES_SANITIZE_LEAK=true" to start failing.
In subsequent commits we'll now be able to use those tests to check
whether that part of the API is really leaking memory, and will fix
all of those memory leaks. Removing the UNLEAK() allows us to make
incremental progress in that direction. See [1] for further details
about this approach.
Note that the release_revisions() will not be sufficient to deal with
the code in cmd_show() added in 5d7eeee2ac (git-show: grok blobs,
trees and tags, too, 2006-12-14) which clobbers the "pending" array in
the case of "OBJ_COMMIT". That will need to be dealt with by some
future follow-up work.
1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a release_revisions() to various users of "struct rev_info" which
requires a minor refactoring to a "goto cleanup" pattern to use that
function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the initialization of the "revision" member of "struct
stash_info" to be initialized vi a macro, and more importantly that
that initializing function be tasked to free it, usually via a "goto
cleanup" pattern.
Despite the "revision" name (and the topic of the series containing
this commit) the "stash info" has nothing to do with the "struct
rev_info". I'm making this change because in the subsequent commit
when we do want to free the "struct rev_info" via a "goto cleanup"
pattern we'd otherwise free() uninitialized memory in some cases, as
we only strbuf_init() the string in get_stash_info().
So while it's not the smallest possible change, let's convert all
users of this pattern in the file while we're at it.
A good follow-up to this change would be to change all the "ret = -1;
goto done;" in this file to instead use a "goto cleanup", and
initialize "int ret = -1" at the start of the relevant functions. That
would allow us to drop a lot of needless brace verbosity on two-line
"if" statements, but let's leave that alone for now.
To ensure that there's a 1=1 mapping between owners of the "struct
stash_info" and free_stash_info() change the assert_stash_ref()
function to be a trivial get_stash_info_assert() wrapper. The caller
will call free_stash_info(), and by returning -1 we'll eventually (via
!!ret) exit with status 1 anyway.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use release_revisions() to various users of "struct rev_list" which
need to have their "struct rev_info" zero-initialized before we can
start using it.
For the bundle.c code see the early exit case added in
3bbbe467f2 (bundle verify: error out if called without an object
database, 2019-05-27).
For the relevant bisect.c code see 45b6370812 (bisect: libify
`check_good_are_ancestors_of_bad` and its dependents, 2020-02-17).
For the submodule.c code see the "goto" on "(!left || !right || !sub)"
added in 8e6df65015 (submodule: refactor show_submodule_summary with
helper function, 2016-08-31).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a release_revisions() to various users of "struct rev_list" in
those straightforward cases where we only need to add the
release_revisions() call to the end of a block, and don't need to
e.g. refactor anything to use a "goto cleanup" pattern.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The users of the revision.[ch] API's "struct rev_info" are a major
source of memory leaks in the test suite under SANITIZE=leak, which in
turn adds a lot of noise when trying to mark up tests with
"TEST_PASSES_SANITIZE_LEAK=true".
The users of that API are largely one-shot, e.g. "git rev-list" or
"git log", or the "git checkout" and "git stash" being modified here
For these callers freeing the memory is arguably a waste of time, but
in many cases they've actually been trying to free the memory, and
just doing that in a buggy manner.
Let's provide a release_revisions() function for these users, and
start migrating them over per the plan outlined in [1]. Right now this
only handles the "pending" member of the struct, but more will be
added in subsequent commits.
Even though we only clear the "pending" member now, let's not leave a
trap in code like the pre-image of index_differs_from(), where we'd
start doing the wrong thing as soon as the release_revisions() learned
to clear its "diffopt". I.e. we need to call release_revisions() after
we've inspected any state in "struct rev_info".
This leaves in place e.g. clear_pathspec(&rev.prune_data) in
stash_working_tree() in builtin/stash.c, subsequent commits will teach
release_revisions() to free "prune_data" and other members that in
some cases are individually cleared by users of "struct rev_info" by
reaching into its members. Those subsequent commits will remove the
relevant calls to e.g. clear_pathspec().
We avoid amending code in index_differs_from() in diff-lib.c as well
as wt_status_collect_changes_index(), has_unstaged_changes() and
has_uncommitted_changes() in wt-status.c in a way that assumes that we
are already clearing the "diffopt" member. That will be handled in a
subsequent commit.
1. https://lore.kernel.org/git/87a6k8daeu.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add and apply coccinelle rules to remove "if (E)" before
"free_commit_list(E)", the function can accept NULL, and further
change cases where "E = NULL" followed to also be unconditionally.
The code changes in this commit were entirely made by the coccinelle
rule being added here, and applied with:
make contrib/coccinelle/free.cocci.patch
patch -p1 <contrib/coccinelle/free.cocci.patch
The only manual intervention here is that the the relevant code in
commit.c has been manually re-indented.
Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix two memory leaks in "struct rev_info" by freeing that memory in
cmd_format_patch(). These two are unusual special-cases in being in
the "struct rev_info", but not being "owned" by the code in
revision.c. I.e. they're members of the struct so that this code in
"builtin/log.c" can conveniently pass information code in
"log-tree.c".
See e.g. the make_cover_letter() caller of log_write_email_headers()
here in "builtin/log.c", and [1] for a demonstration of where the
"extra_headers" and "ref_message_ids" struct members are used.
See 20ff06805c (format-patch: resurrect extra headers from config,
2006-06-02) and d1566f7883 (git-format-patch: Make the second and
subsequent mails replies to the first, 2006-07-14) for the initial
introduction of "extra_headers" and "ref_message_ids".
We can count on repo_init_revisions() memset()-ing this data to 0
however, so we can count on it being either NULL or something we
allocated. In the case of "extra_headers" let's add a local "char *"
variable to hold it, to avoid the eventual cast from "const char *"
when we free() it.
1. https://lore.kernel.org/git/220401.868rsoogxf.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Follow-up on the introduction of string_list_init_nodup() and
string_list_init_dup() in the series merged in bd4232fac3 (Merge
branch 'ab/struct-init', 2021-07-16) and convert code that implicitly
relied on xcalloc() being equivalent to the initializer to use
xmalloc() and string_list_init_{no,}dup() instead.
In the case of get_unmerged() in merge-recursive.c we used the
combination of xcalloc() and assigning "1" to "strdup_strings" to get
what we'd get via string_list_init_dup(), let's use that instead.
Adjacent code in cmd_format_patch() will be changed in a subsequent
commit, since we're changing that let's change the other in-tree
patterns that do the same. Let's also convert a "x == NULL" to "!x"
per our CodingGuidelines, as we need to change the "if" line anyway.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend a freeing pattern added in 0906ac2b54 (blame: use changed-path
Bloom filters, 2020-04-16) to use a "goto cleanup", so that we can be
sure that we call cleanup_scoreboard().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
String in submodule--helper is not correctly formatting
placeholders. The string in git-send-email is partial.
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The missing space at the end of the line makes the closing square
bracket sticking to the dash in the next line
Found during localisation v2.36.0 round 1
Signed-off-by: Fangyi Zhou <me@fangyi.io>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git ls-tree" learns "--oid-only" option, similar to "--name-only",
and more generalized "--format" option.
source: <cover.1648026472.git.dyroneteng@gmail.com>
* tl/ls-tree-oid-only:
ls-tree: `-l` should not imply recursive listing
The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transfered data into object database entries when there are
fewer objects than the 'unpacklimit' setting.
By enabling an odb-transaction when unpacking objects, we can take advantage
of batched fsyncs.
Here are some performance numbers to justify batch mode for
unpack-objects, collected on a WSL2 Ubuntu VM.
Fsync Mode | Time for 90 objects (ms)
-------------------------------------
Off | 170
On,fsync | 760
On,batch | 230
Note that the default unpackLimit is 100 objects, so there's a 3x
benefit in the worst case. The non-batch mode fsync scales linearly
with the number of objects, so there are significant benefits even with
smaller numbers of objects.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The update-index functionality is used internally by 'git stash push' to
setup the internal stashed commit.
This change enables odb-transactions for update-index infrastructure to
speed up adding new objects to the object database by leveraging the
batch fsync functionality.
There is some risk with this change, since under batch fsync, the object
files will be in a tmp-objdir until update-index is complete, so callers
using the --stdin option will not see them until update-index is done.
This risk is mitigated by flushing the ODB transaction prior to
reporting any verbose output so that objects will be visible to callers
that are synchronizing with update-index by snooping its output.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The add_files_to_cache function is invoked internally by
builtin/commit.c and builtin/checkout.c for their flags that stage
modified files before doing the larger operation. These commands
can benefit from batched fsyncing.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it clearer in the naming and documentation of the plug_bulk_checkin
and unplug_bulk_checkin APIs that they can be thought of as
a "transaction" to optimize operations on the object database. These
transactions may be nested so that subsystems like the cache-tree
writing code can optimize their operations without caring whether the
top-level code has a transaction active.
Add a flush_odb_transaction API that will be used in update-index to
make objects visible even if a transaction is active. The flush call may
also be useful in future cases if we hold a transaction active around
calling hooks.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove include "lockfile.h" from builtin/apply.c, which is orphaned
since 6d058c8826 (apply: move lockfile into `apply_state`, 2017-10-05)
Signed-off-by: Garrit Franke <garrit@slashdev.space>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 9c4d58ff2c (ls-tree: split up "fast path" callbacks, 2022-03-23), a
refactoring of the various read_tree_at() callbacks caused us to
unconditionally recurse into directories if `-l` (long format) was
passed on the command line, regardless of whether or not we also pass
the `-r` (recursive) flag.
Fix this by making show_tree_long() return the value of `recurse`,
rather than always returning 1. This value is interpreted by
read_tree_at() to be a signal on whether or not to recurse.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree list --porcelain" did not c-quote pathnames and lock
reasons with unsafe bytes correctly, which is worked around by
introducing NUL terminated output format with "-z".
* pw/worktree-list-with-z:
worktree: add -z option for list subcommand
Built-in fsmonitor (part 2).
* jh/builtin-fsmonitor-part2: (30 commits)
t7527: test status with untracked-cache and fsmonitor--daemon
fsmonitor: force update index after large responses
fsmonitor--daemon: use a cookie file to sync with file system
fsmonitor--daemon: periodically truncate list of modified files
t/perf/p7519: add fsmonitor--daemon test cases
t/perf/p7519: speed up test on Windows
t/perf/p7519: fix coding style
t/helper/test-chmtime: skip directories on Windows
t/perf: avoid copying builtin fsmonitor files into test repo
t7527: create test for fsmonitor--daemon
t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
help: include fsmonitor--daemon feature flag in version info
fsmonitor--daemon: implement handle_client callback
compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS
compat/fsmonitor/fsm-listen-darwin: add MacOS header files for FSEvent
compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows
fsmonitor--daemon: create token-based changed path cache
fsmonitor--daemon: define token-ids
fsmonitor--daemon: add pathname classification
fsmonitor--daemon: implement 'start' command
...
"git fetch --refetch" learned to fetch everything without telling
the other side what we already have, which is useful when you
cannot trust what you have in the local object store.
* rc/fetch-refetch:
docs: mention --refetch fetch option
fetch: after refetch, encourage auto gc repacking
t5615-partial-clone: add test for fetch --refetch
fetch: add --refetch option
builtin/fetch-pack: add --refetch option
fetch-pack: add refetch
fetch-negotiator: add specific noop initializer
"git am" can read from the standard input when no mailbox is given
on the command line, but the end-user gets no indication when it
happens, making Git appear stuck.
* jc/mailsplit-warn-on-tty:
am/apply: warn if we end up reading patches from terminal
A handful of obvious clean-ups around a topic that is already in
'master'.
* gc/branch-recurse-submodules-fix:
branch.c: simplify advice-and-die sequence
branch: rework comments for future developers
branch: remove negative exit code
branch --set-upstream-to: be consistent when advising
branch: give submodule updating advice before exit
branch: support more tracking modes when recursing
Move more "git submodule update" to C.
* gc/submodule-update-part2:
submodule--helper: remove forward declaration
submodule: move core cmd_update() logic to C
submodule--helper: reduce logic in run_update_procedure()
submodule--helper: teach update_data more options
builtin/submodule--helper.c: rename option struct to "opt"
submodule update: use die_message()
submodule--helper: run update using child process struct
Code clean-up.
* ds/partial-bundle-more:
pack-objects: lazily set up "struct rev_info", don't leak
bundle: output hash information in 'verify'
bundle: move capabilities to end of 'verify'
pack-objects: parse --filter directly into revs.filter
pack-objects: move revs out of get_object_list()
list-objects-filter: remove CL_ARG__FILTER
"git ls-tree" learns "--oid-only" option, similar to "--name-only",
and more generalized "--format" option.
* tl/ls-tree-oid-only:
ls-tree: split up "fast path" callbacks
ls-tree: detect and error on --name-only --name-status
ls-tree: support --object-only option for "git-ls-tree"
ls-tree: introduce "--format" option
cocci: allow padding with `strbuf_addf()`
ls-tree: introduce struct "show_tree_data"
ls-tree: slightly refactor `show_tree()`
ls-tree: fix "--name-only" and "--long" combined use bug
ls-tree: simplify nesting if/else logic in "show_tree()"
ls-tree: rename "retval" to "recurse" in "show_tree()"
ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
ls-tree: use "enum object_type", not {blob,tree,commit}_type
ls-tree: add missing braces to "else" arms
ls-tree: remove commented-out code
ls-tree tests: add tests for --name-status
"git reflog" command now uses parse-options API to parse its
command line options.
* ab/reflog-parse-options:
reflog: fix 'show' subcommand's argv
reflog [show]: display sensible -h output
reflog: convert to parse_options() API
reflog exists: use parse_options() API
git reflog [expire|delete]: make -h output consistent with SYNOPSIS
reflog: move "usage" variables and use macros
reflog tests: add missing "git reflog exists" tests
reflog: refactor cmd_reflog() to "if" branches
reflog.c: indent argument lists
* ds/partial-bundle-more:
pack-objects: lazily set up "struct rev_info", don't leak
bundle: output hash information in 'verify'
bundle: move capabilities to end of 'verify'
pack-objects: parse --filter directly into revs.filter
pack-objects: move revs out of get_object_list()
list-objects-filter: remove CL_ARG__FILTER
* ab/usage-die-message:
config API: use get_error_routine(), not vreportf()
usage.c + gc: add and use a die_message_errno()
gc: return from cmd_gc(), don't call exit()
usage.c API users: use die_message() for error() + exit 128
usage.c API users: use die_message() for "fatal :" + exit 128
usage.c: add a die_message() routine
Add a -z option to be used in conjunction with --porcelain that gives
NUL-terminated output. As 'worktree list --porcelain' does not quote
worktree paths this enables it to handle worktree paths that contain
newlines.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git stash" does not allow subcommands it internally runs as its
implementation detail, except for "git reset", to emit messages;
now "git reset" part has also been squelched.
* vd/stash-silence-reset:
reset: show --no-refresh in the short-help
reset: remove 'reset.refresh' config option
reset: remove 'reset.quiet' config option
reset: do not make '--quiet' disable index refresh
stash: make internal resets quiet and refresh index
reset: suppress '--no-refresh' advice if logging is silenced
reset: replace '--quiet' with '--no-refresh' in performance advice
reset: introduce --[no-]refresh option to --mixed
reset: revise index refresh advice
"git branch --recurse-submodules" does not propagate "--track=inherit"
or "--no-track" to submodules, which causes submodule branches to use
the wrong tracking mode [1]. To fix this, pass the correct options to
the "submodule--helper create-branch" child process and test for it.
While we are refactoring the same code, replace "--track" with the
synonymous, but more consistent-looking "--track=direct" option
(introduced at the same time as "--track=inherit", d3115660b4 (branch:
add flags and config to inherit tracking, 2021-12-20)).
[1] This bug is partially a timing issue: "branch --recurse-submodules"
was introduced around the same time as "--track=inherit", and even
though I rebased "branch --recurse-submodules" on top of that, I had
neglected to support the new tracking mode. Omitting "--no-track"
was just a plain old mistake, though.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git rebase $base $non_branch_commit", when $base is an ancestor or
the $non_branch_commit, modified the current branch, which has been
corrected.
* jc/rebase-detach-fix:
rebase: set REF_HEAD_DETACH in checkout_up_to_date()
rebase: use test_commit helper in setup
The worktree repair command was not added to the usage menu for the
worktree command. This commit adds the usage of 'worktree repair'
according to the existing docs.
Signed-off-by: Des Preston <despreston@gmail.com>
Acked-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_reflog() invokes parse_options() with PARSE_OPT_KEEP_ARGV0, but it
doesn't account for the retained argv[0] before invoking
cmd_reflog_show() to handle the 'git reflog show' subcommand.
Consequently, cmd_reflog_show() always gets an 'argv' array starting
with elements argv[0]="reflog" and argv[1]="show".
Strip the name of the git command from the 'argv' array before passing
it to the function handling the 'show' subcommand.
There is no user-visible bug here, because cmd_reflog_show() doesn't
have any options or parameters of its own.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After invoking `fetch --refetch`, the object db will likely contain many
duplicate objects. If auto-maintenance is enabled, invoke it with
appropriate settings to encourage repacking/consolidation.
* gc.autoPackLimit: unless this is set to 0 (disabled), override the
value to 1 to force pack consolidation.
* maintenance.incremental-repack.auto: unless this is set to 0, override
the value to -1 to force incremental repacking.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach fetch and transports the --refetch option to force a full fetch
without negotiating common commits with the remote. Use when applying a
new partial clone filter to refetch all matching objects.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a refetch option to fetch-pack to force a full fetch. Use when
applying a new partial clone filter to refetch all matching objects.
Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>