Plugging leaks in submodule--helper.
* ab/submodule-helper-leakfix:
submodule--helper: fix a configure_added_submodule() leak
submodule--helper: free rest of "displaypath" in "struct update_data"
submodule--helper: free some "displaypath" in "struct update_data"
submodule--helper: fix a memory leak in print_status()
submodule--helper: fix a leak in module_add()
submodule--helper: fix obscure leak in module_add()
submodule--helper: fix "reference" leak
submodule--helper: fix a memory leak in get_default_remote_submodule()
submodule--helper: fix a leak with repo_clear()
submodule--helper: fix "sm_path" and other "module_cb_list" leaks
submodule--helper: fix "errmsg_str" memory leak
submodule--helper: add and use *_release() functions
submodule--helper: don't leak {run,capture}_command() cp.dir argument
submodule--helper: "struct pathspec" memory leak in module_update()
submodule--helper: fix most "struct pathspec" memory leaks
submodule--helper: fix trivial get_default_remote_submodule() leak
submodule--helper: fix a leak in "clone_submodule"
Share the text used to explain configuration variables used by "git
<subcmd>" in "git help <subcmd>" with the text from "git help config".
* ab/dedup-config-and-command-docs:
docs: add CONFIGURATION sections that fuzzy map to built-ins
docs: add CONFIGURATION sections that map to a built-in
log docs: de-duplicate configuration sections
difftool docs: de-duplicate configuration sections
notes docs: de-duplicate and combine configuration sections
apply docs: de-duplicate configuration sections
send-email docs: de-duplicate configuration sections
grep docs: de-duplicate configuration sections
docs: add and use include template for config/* includes
Undoes 'jk/unused-annotation' topic and redoes it to work around
Coccinelle rules misfiring false positives in unrelated codepaths.
* ab/unused-annotation:
git-compat-util.h: use "deprecated" for UNUSED variables
git-compat-util.h: use "UNUSED", not "UNUSED(var)"
Annotate function parameters that are not used (but cannot be
removed for structural reasons), to prepare us to later compile
with -Wunused warning turned on.
* jk/unused-annotation:
is_path_owned_by_current_uid(): mark "report" parameter as unused
run-command: mark unused async callback parameters
mark unused read_tree_recursive() callback parameters
hashmap: mark unused callback parameters
config: mark unused callback parameters
streaming: mark unused virtual method parameters
transport: mark bundle transport_options as unused
refs: mark unused virtual method parameters
refs: mark unused reflog callback parameters
refs: mark unused each_ref_fn parameters
git-compat-util: add UNUSED macro
As the 'master' front will soon tag a preview and then release
candidates for 2.38, it is unknown if we are going to issue another
maintenance release on the 2.37.x track, but as we have accumulated
enough material there, let's prepare a draft for it.
Even if we end up not tagging 2.37.4, it would help motivated distro
packagers to maintain their slightly older and "more stable" versions.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The auto-stashed local changes created by "git merge --autostash"
was mixed into a conflicted state left in the working tree, which
has been corrected.
* en/merge-unstash-only-on-clean-merge:
merge: only apply autostash when appropriate
Update the version of Ubuntu used for GitHub Actions CI from 18.04
to 22.04.
* ds/github-actions-use-newer-ubuntu:
ci: update 'static-analysis' to Ubuntu 22.04
The preload-index codepath made copies of pathspec to give to
multiple threads, which were left leaked.
* ad/preload-plug-memleak:
preload-index: fix memleak
xcalloc(), imitating calloc(), takes "number of elements of the
array", and "size of a single element", in this order. A call that
does not follow this ordering has been corrected.
* sg/xcalloc-cocci-fix:
promisor-remote: fix xcalloc() argument order
Fix deadlocks between main Git process and subprocess spawned via
the pipe_command() API, that can kill "git add -p" that was
reimplemented in C recently.
* jk/pipe-command-nonblock:
pipe_command(): mark stdin descriptor as non-blocking
pipe_command(): handle ENOSPC when writing to a pipe
pipe_command(): avoid xwrite() for writing to pipe
git-compat-util: make MAX_IO_SIZE define globally available
nonblock: support Windows
compat: add function to enable nonblocking pipes
An earlier optimization discarded a tree-object buffer that is
still in use, which has been corrected.
* jk/is-promisor-object-keep-tree-in-use:
is_promisor_object(): fix use-after-free of tree buffer
The parser in the script interface to parse-options in "git
rev-parse" has been updated to diagnose a bogus input correctly.
* ow/rev-parse-parseopt-fix:
rev-parse --parseopt: detect missing opt-spec
More fixes to "add -p"
* js/builtin-add-p-portability-fix:
t6132(NO_PERL): do not run the scripted `add -p`
t3701: test the built-in `add -i` regardless of NO_PERL
add -p: avoid ambiguous signed/unsigned comparison
The codepath for the OPT_SUBCOMMAND facility has been cleaned up.
* sg/parse-options-subcommand:
notes, remote: show unknown subcommands between `'
notes: simplify default operation mode arguments check
test-parse-options.c: fix style of comparison with zero
test-parse-options.c: don't use for loop initial declaration
t0040-parse-options: remove leftover debugging
"git rev-list --verify-objects" ought to inspect the contents of
objects and notice corrupted ones, but it didn't when the commit
graph is in use, which has been corrected.
* jk/rev-list-verify-objects-fix:
rev-list: disable commit graph with --verify-objects
lookup_commit_in_graph(): use prepare_commit_graph() to check for graph
The server side that responds to "git fetch" and "git clone"
request has been optimized by allowing it to send objects in its
object store without recomputing and validating the object names.
* jk/upload-pack-skip-hash-check:
t1060: check partial clone of misnamed blob
parse_object(): check commit-graph when skip_hash set
upload-pack: skip parse-object re-hashing of "want" objects
parse_object(): allow skipping hash check
"git diff --no-index A B" managed its the pathnames of its two
input files rather haphazardly, sometimes leaking them. The
command line argument processing has been straightened out to clean
it up.
* rs/diff-no-index-cleanup:
diff-no-index: simplify argv index calculation
diff-no-index: release prefixed filenames
diff-no-index: release strbuf on queue error
The built-in fsmonitor refuses to work on a network mounted
repositories; a configuration knob for users to override this has
been introduced.
* ed/fsmonitor-on-network-disk:
fsmonitor: option to allow fsmonitor to run against network-mounted repos
Segfault fix-up to an earlier fix to the topic to teach "git reset"
and "git checkout" work better in a sparse checkout.
* vd/sparse-reset-checkout-fixes:
unpack-trees: fix sparse directory recursion check
Remove the assembly version of SHA-1 implementation for PPC.
* ab/retire-ppc-sha1:
Makefile: use $(OBJECTS) instead of $(C_OBJ)
Makefile + hash.h: remove PPC_SHA1 implementation
"git format-patch --from=<ident>" can be told to add an in-body
"From:" line even for commits that are authored by the given
<ident> with "--force-in-body-from"option.
* jc/format-patch-force-in-body-from:
format-patch: learn format.forceInBodyFrom configuration variable
format-patch: allow forcing the use of in-body From: header
pretty: separate out the logic to decide the use of in-body from
Those who use diff-so-fancy as the diff-filter noticed a regression
or two in the code that parses the diff output in the built-in
version of "add -p", which has been corrected.
* js/add-p-diff-parsing-fix:
add -p: ignore dirty submodules
add -p: gracefully handle unparseable hunk headers in colored diffs
add -p: detect more mismatches between plain vs colored diffs
After 2d893dff4c (rev-parse --parseopt: allow [*=?!] in argument hints,
2015-07-14) updated the parser, a line in parseopts's input can start
with one of the flag characters and be erroneously parsed as a opt-spec
where the short name of the option is the flag character itself and the
long name is after the end of the string. This makes Git want to
allocate SIZE_MAX bytes of memory at this line:
o->long_name = xmemdupz(sb.buf + 2, s - sb.buf - 2);
Since s and sb.buf are equal the second argument is -2 (except unsigned)
and xmemdupz allocates len + 1 bytes, ie. -1 meaning SIZE_MAX.
Avoid this by checking whether a flag character was found in the zeroth
position.
Reported-by: Ingy dot Net <ingy@ingy.net>
Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent commit (upload-pack: skip parse-object re-hashing of "want"
objects, 2022-09-06) loosened the behavior of upload-pack so that it
does not verify the sha1 of objects it receives directly via "want"
requests. The existing corruption tests in t1060 aren't affected by
this: the corruptions are blobs reachable from commits, and the client
requests the commits.
The more interesting case here is a partial clone, where the client will
directly ask for the corrupted blob when it does an on-demand fetch of
the filtered object. And that is not covered at all, so let's add a
test.
It's important here that we use the "misnamed" corruption and not
"bit-error". The latter is sufficiently corrupted that upload-pack
cannot even figure out the type of the object, so it bails identically
both before and after the recent change. But with "misnamed", with the
hash-checks enabled it sees the problem (though the error messages are a
bit confusing because of the inability to create a "struct object" to
store the flags):
error: hash mismatch d95f3ad14dee633a758d2e331151e950dd13e4ed
fatal: git upload-pack: not our ref d95f3ad14dee633a758d2e331151e950dd13e4ed
fatal: remote error: upload-pack: not our ref d95f3ad14dee633a758d2e331151e950dd13e4ed
After the change to skip the hash check, the server side happily sends
the bogus object, but the client correctly realizes that it did not get
the necessary data:
remote: Enumerating objects: 1, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 1 (delta 0), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (1/1), 49 bytes | 49.00 KiB/s, done.
fatal: bad revision 'd95f3ad14dee633a758d2e331151e950dd13e4ed'
error: [...]/misnamed did not send all necessary objects
which is exactly what we expect to happen.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 16bb3d714d (diff --no-index: use parse_options() instead of
diff_opt_parse(), 2019-03-24) argc must be 2 if we reach the loop, i.e.
argc - 2 == 0. Remove that inconsequential term.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers of prefix_filename() are responsible for freeing its result.
Remember the returned strings and release them to appease leak checkers.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf is small and we are about to exit, so we could leave its
cleanup to the OS. If we release it explicitly at all, however, then we
should do it on early exit as well. Move the strbuf_release call to a
new cleanup section at the end and make sure all execution paths go
through it.
Suggested-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the caller told us that they don't care about us checking the object
hash, then we're free to implement any optimizations that get us the
parsed value more quickly. An obvious one is to check the commit graph
before loading an object from disk. And in fact, both of the callers who
pass in this flag are already doing so before they call parse_object()!
So we can simplify those callers, as well as any possible future ones,
by moving the logic into parse_object().
There are two subtle things to note in the diff, but neither has any
impact in practice:
- it seems least-surprising here to do the graph lookup on the
git-replace'd oid, rather than the original. This is in theory a
change of behavior from the earlier code, as neither caller did a
replace lookup itself. But in practice it doesn't matter, as we
disable the commit graph entirely if there are any replace refs.
- the caller in get_reference() passes the skip_hash flag only if
revs->verify_objects isn't set, whereas it would look in the commit
graph unconditionally. In practice this should not matter as we
should disable the commit graph entirely when using verify_objects
(and that was done recently in another patch).
So this should be a pure cleanup with no behavior change.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Imagine we have a history with commit C pointing to a large blob B.
If a client asks us for C, we can generally serve both objects to them
without accessing the uncompressed contents of B. In upload-pack, we
figure out which commits we have and what the client has, and feed those
tips to pack-objects. In pack-objects, we traverse the commits and trees
(or use bitmaps!) to find the set of objects needed, but we never open
up B. When we serve it to the client, we can often pass the compressed
bytes directly from the on-disk packfile over the wire.
But if a client asks us directly for B, perhaps because they are doing
an on-demand fetch to fill in the missing blob of a partial clone, we
end up much slower. Upload-pack calls parse_object() on the oid we
receive, which opens up the object and re-checks its hash (even though
if it were a commit, we might skip this parse entirely in favor of the
commit graph!). And then we feed the oid directly to pack-objects, which
again calls parse_object() and opens the object. And then finally, when
we write out the result, we may send bytes straight from disk, but only
after having unnecessarily uncompressed and computed the sha1 of the
object twice!
This patch teaches both code paths to use the new SKIP_HASH_CHECK flag
for parse_object(). You can see the speed-up in p5600, which does a
blob:none clone followed by a checkout. The savings for git.git are
modest:
Test HEAD^ HEAD
----------------------------------------------------------------------
5600.3: checkout of result 2.23(4.19+0.24) 1.72(3.79+0.18) -22.9%
But the savings scale with the number of bytes. So on a repository like
linux.git with more files, we see more improvement (in both absolute and
relative numbers):
Test HEAD^ HEAD
----------------------------------------------------------------------------
5600.3: checkout of result 51.62(77.26+2.76) 34.86(61.41+2.63) -32.5%
And here's an even more extreme case. This is the android gradle-plugin
repository, whose tip checkout has ~3.7GB of files:
Test HEAD^ HEAD
--------------------------------------------------------------------------
5600.3: checkout of result 79.51(90.84+5.55) 40.28(51.88+5.67) -49.3%
Keep in mind that these timings are of the whole checkout operation. So
they count the client indexing the pack and actually writing out the
files. If we want to see just the server's view, we can hack up the
GIT_TRACE_PACKET output from those operations and replay it via
upload-pack. For the gradle example, that gives me:
Benchmark 1: GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input
Time (mean ± σ): 50.884 s ± 0.239 s [User: 51.450 s, System: 1.726 s]
Range (min … max): 50.608 s … 51.025 s 3 runs
Benchmark 2: GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input
Time (mean ± σ): 9.728 s ± 0.112 s [User: 10.466 s, System: 1.535 s]
Range (min … max): 9.618 s … 9.842 s 3 runs
Summary
'GIT_PROTOCOL=version=2 git.new upload-pack ../gradle-plugin <input' ran
5.23 ± 0.07 times faster than 'GIT_PROTOCOL=version=2 git.old upload-pack ../gradle-plugin <input'
So a server would see an 80% reduction in CPU serving the initial
checkout of a partial clone for this repository. Or possibly even more
depending on the packing; most of the time spent in the faster one were
objects we had to open during the write phase.
In both cases skipping the extra hashing on the server should be pretty
safe. The client doesn't trust the server anyway, so it will re-hash all
of the objects via index-pack. There is one thing to note, though: the
change in get_reference() affects not just pack-objects, but rev-list,
git-log, etc. We could use a flag to limit to index-pack here, but we
may already skip hash checks in this instance. For commits, we'd skip
anything we load via the commit-graph. And while before this commit we
would check a blob fed directly to rev-list on the command-line, we'd
skip checking that same blob if we found it by traversing a tree.
The exception for both is if --verify-objects is used. In that case,
we'll skip this optimization, and the new test makes sure we do this
correctly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The parse_object() function checks the object hash of any object it
parses. This is a nice feature, as it means we may catch bit corruption
during normal use, rather than waiting for specific fsck operations.
But it also can be slow. It's particularly noticeable for blobs, where
except for the hash check, we could return without loading the object
contents at all. Now one may wonder what is the point of calling
parse_object() on a blob in the first place then, but usually it's not
intentional: we were fed an oid from somewhere, don't know the type, and
want an object struct. For commits and trees, the parsing is usually
helpful; we're about to look at the contents anyway. But this is less
true for blobs, where we may be collecting them as part of a
reachability traversal, etc, and don't actually care what's in them. And
blobs, of course, tend to be larger.
We don't want to just throw out the hash-checks for blobs, though. We do
depend on them in some circumstances (e.g., rev-list --verify-objects
uses parse_object() to check them). It's only the callers that know
how they're going to use the result. And so we can help them by
providing a special flag to skip the hash check.
We could just apply this to blobs, as they're going to be the main
source of performance improvement. But if a caller doesn't care about
checking the hash, we might as well skip it for other object types, too.
Even though we can't avoid reading the object contents, we can still
skip the actual hash computation.
If this seems like it is making Git a little bit less safe against
corruption, it may be. But it's part of a series of tradeoffs we're
already making. For instance, "rev-list --objects" does not open the
contents of blobs it prints. And when a commit graph is present, we skip
opening most commits entirely. The important thing will be to use this
flag in cases where it's safe to skip the check. For instance, when
serving a pack for a fetch, we know the client will fully index the
objects and do a connectivity check itself. There's little to be gained
from the server side re-hashing a blob itself. And indeed, most of the
time we don't! The revision machinery won't open up a blob reached by
traversal, but only one requested directly with a "want" line. So
applied properly, this new feature shouldn't make anything less safe in
practice.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the "unknown subcommand" error message in 'git notes' and 'git
remote' to wrap the offending argument between `', to make it
consistent with the "unknown switch/option/subcommand" error messages
in parse-options.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git notes' has a default operation mode, but when invoked without a
subcommand it doesn't accept any arguments (although the 'list'
subcommand implementing the default operation mode does accept
arguments). The condition checking this ended up a bit awkward, so
let's make it clearer.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The preferred style is '!argc' instead of 'argc == 0'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We would like to eventually use for loop initial declarations in our
codebase, but we are not there yet.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a CONFIGURATION section to the documentation of various built-ins,
for those cases where the relevant config/NAME.txt doesn't map only to
one git-NAME.txt. In particular:
* config/blame.txt: used by git-{blame,annotate}.txt. Since the
git-annotate(1) documentation refers to git-blame(1) don't add a
"CONFIGURATION" section to git-annotate(1), only to git-blame(1).
* config/checkout.txt: maps to both git-checkout.txt and
git-switch.txt (but nothing else).
* config/init.txt: should be included in git-init(1) and
git-clone(1).
* config/column.txt: We should ideally mention the relevant subset of
this in git-{branch,clean,status,tag}.txt, but let's punt on it for
now. We will when we eventually split these sort of files into
e.g. config/column.txt and
config/column/{branch,clean,status,tag}.txt, with the former
including the latter set.
Things that are being left out, and why:
* config/{remote,remotes,credential}.txt: Configuration that affects
how we talk to remote repositories is harder to untangle. We'll need
to include some of this in git-{fetch,remote,push,ls-remote}.txt
etc., but some of those only use a small subset of these
options. Let's leave this for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a CONFIGURATION section to the documentation of various built-ins,
for those cases where the relevant config/NAME.txt describes
configuration that is only used by the relevant built-in documented in
git-NAME.txt. Subsequent commits will handle more complex cases.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Include the "config/difftool.txt" file in "git-difftool.txt", and move
the relevant part of git-difftool(1) configuration from
"config/diff.txt" to config/difftool.txt".
Doing this is slightly odd, as we usually discuss configuration in
alphabetical order, but by doing it we're able to include the full set
of configuration used by git-difftool(1) (and only that configuration)
in its own documentation.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Combine the various "notes" configuration sections spread across
Documentation/config/notes.txt and Documentation/git-notes.txt to live
in the former, and to be included in the latter.
We'll now forward link from "git notes" to the "CONFIGURATION" section
below, rather than to "git-config(1)" when discussing configuration
variables that are (also) discussed in that section.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The wording is not identical to Documentation/config/apply.txt, but
that version is better.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
De-duplicate the discussion of "send-email" configuration, such that
the "git-config(1)" manual page becomes the source of truth, and
"git-send-email(1)" includes the relevant part.
Most commands that suffered from such duplication had diverging text
discussing the same variables, but in this case some config was also
only discussed in one or the other.
This is mostly a move-only change, the exception is a minor rewording
of changing wording like "see above" to "see linkgit:git-config[1]",
as well as a clarification about the big section of command-line
option tweaking config being discussed in git-send-email(1)'s main
docs.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Include the "config/grep.txt" file in "git-grep.txt", instead of
repeating an almost identical description of the "grep" configuration
variables in two places.
There is no loss of information here that isn't shown in the addition
to "grep.txt". This change was made by copying the contents of
"git-grep.txt"'s version over the "grep.txt" version. Aside from the
change "grep.txt" being made here the two were identical.
This documentation started being copy/pasted around in
b22520a37c (grep: allow -E and -n to be turned on by default via
configuration, 2011-03-30). After that in e.g. 6453f7b348 (grep: add
grep.fullName config variable, 2014-03-17) they started drifting
apart, with only grep.fullName being described in the command
documentation.
In 434e6e753f (config.txt: move grep.* to a separate file,
2018-10-27) we gained the include, but didn't do this next step, let's
do it now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>