Bundle providers may want to distribute that data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.
Allow a bundle list to specify a relative URI for the bundles. This URI
is based on where the client received the bundle list. For a list
provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
the base URI. Otherwise, the bundle list was provided from an HTTP URI
not using the Git protocol, and that URI is the base URI. This allows
easier distribution of bundle data.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
for issuing protocol v2 "bundle-uri" commands to a server, and to the
parsing routines in bundle-uri.c.
In the "git clone" case we'll have already done the handshake(),
but not here. Add an extra case to check for this handshake in
get_bundle_uri() for ease of use for future callers.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various leak fixes.
* ab/various-leak-fixes:
built-ins: use free() not UNLEAK() if trivial, rm dead code
revert: fix parse_options_concat() leak
cherry-pick: free "struct replay_opts" members
rebase: don't leak on "--abort"
connected.c: free the "struct packed_git"
sequencer.c: fix "opts->strategy" leak in read_strategy_opts()
ls-files: fix a --with-tree memory leak
revision API: call graph_clear() in release_revisions()
unpack-file: fix ancient leak in create_temp_file()
built-ins & libs & helpers: add/move destructors, fix leaks
dir.c: free "ident" and "exclude_per_dir" in "struct untracked_cache"
read-cache.c: clear and free "sparse_checkout_patterns"
commit: discard partial cache before (re-)reading it
{reset,merge}: call discard_index() before returning
tests: mark tests as passing with SANITIZE=leak
Preparation to remove git-submodule.sh and replace it with a builtin.
* ab/submodule-helper-prep-only:
submodule--helper: use OPT_SUBCOMMAND() API
submodule--helper: drop "update --prefix <pfx>" for "-C <pfx> update"
submodule--helper: remove --prefix from "absorbgitdirs"
submodule API & "absorbgitdirs": remove "----recursive" option
submodule.c: refactor recursive block out of absorb function
submodule tests: test for a "foreach" blind-spot
submodule--helper: fix a memory leak in "status"
submodule tests: add tests for top-level flag output
submodule--helper: move "config" to a test-tool
Fix various leaks in built-ins, libraries and a test helper here we
were missing a call to strbuf_release(), string_list_clear() etc, or
were calling them after a potential "return".
Comments on individual changes:
- builtin/checkout.c: Fix a memory leak that was introduced in [1]. A
sibling leak introduced in [2] was recently fixed in [3]. As with [3]
we should be using the wt_status_state_free_buffers() API introduced
in [4].
- builtin/repack.c: Fix a leak that's been here since this use of
"strbuf_release()" was added in a1bbc6c017 (repack: rewrite the shell
script in C, 2013-09-15). We don't use the variable for anything
except this loop, so we can instead free it right afterwards.
- builtin/rev-parse: Fix a leak that's been here since this code was
added in 21d4783538 (Add a parseopt mode to git-rev-parse to bring
parse-options to shell scripts., 2007-11-04).
- builtin/stash.c: Fix a couple of leaks that have been here since
this code was added in d4788af875 (stash: convert create to builtin,
2019-02-25), we strbuf_release()'d only some of the "struct strbuf" we
allocated earlier in the function, let's release all of them.
- ref-filter.c: Fix a leak in 482c119186 (gpg-interface: improve
interface for parsing tags, 2021-02-11), we don't use the "payload"
variable that we ask parse_signature() to populate for us, so let's
free it.
- t/helper/test-fake-ssh.c: Fix a leak that's been here since this
code was added in 3064d5a38c (mingw: fix t5601-clone.sh,
2016-01-27). Let's free the "struct strbuf" as soon as we don't need
it anymore.
1. c45f0f525d (switch: reject if some operation is in progress,
2019-03-29)
2. 2708ce62d2 (branch: sort detached HEAD based on a flag,
2021-01-07)
3. abcac2e19f (ref-filter.c: fix a leak in get_head_description,
2022-09-25)
4. 962dd7ebc3 (wt-status: introduce wt_status_state_free_buffers(),
2020-09-27).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In a preceding commit we fully applied the
"index-compatibility.pending.cocci" rule to "t/helper/*".
Let's now stop defining "USE_THE_INDEX_COMPATIBILITY_MACROS" in
test-tool.h itself, and instead instead define
"USE_THE_INDEX_VARIABLE" in the individual test helpers that need
it. This mirrors how we do the same thing in the "builtin/" directory.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the "index-compatibility.pending.cocci" rule to the "t/helper/*"
directory, a subsequent commit will extend cache.h to further narrow
down the use of "USE_THE_INDEX_COMPATIBILITY_MACROS" in this area.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mostly apply the part of "index-compatibility.pending.cocci" that
renames the global variables like "active_nr", which are a shorthand
to referencing (in that case) a struct member as "the_index.cache_nr".
In doing so move more of "index-compatibility.pending.cocci" to
"index-compatibility.cocci".
In the case of "active_nr" we'd have a textual conflict with
"ab/various-leak-fixes" in "next"[1]. Let's exclude that specific case
while moving the rule over from "pending".
1. 407b94280f8 (commit: discard partial cache before (re-)reading it,
2022-11-08)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid calling 'cache_tree_update()' when doing so would be redundant.
* vd/skip-cache-tree-update:
rebase: use 'skip_cache_tree_update' option
read-tree: use 'skip_cache_tree_update' option
reset: use 'skip_cache_tree_update' option
unpack-trees: add 'skip_cache_tree_update' option
cache-tree: add perf test comparing update and prime
Makefile comments updates and reordering to clarify knobs used to
choose SHA implementations.
* ab/sha-makefile-doc:
Makefile: discuss SHAttered in *_SHA{1,256} discussion
Makefile: document default SHA-1 backend on OSX
Makefile & test-tool: replace "DC_SHA1" variable with a "define"
Makefile: document SHA-1 and SHA-256 default and selection order
Makefile: document default SHA-256 backend
Makefile: rephrase the discussion of *_SHA1 knobs
Makefile: create and use sections for "define" flag listing
Makefile: correct DC_SHA1 documentation
INSTALL: remove discussion of SHA-1 backends
Makefile: always (re)set DC_SHA1 on fallback
Add a performance test comparing the execution times of 'prime_cache_tree()'
and 'cache_tree_update(_, WRITE_TREE_SILENT | WRITE_TREE_REPAIR)'. The goal
of comparing these two is to identify which is the faster method for
rebuilding an invalid cache tree, ultimately to remove one when both are
(reundantly) called in immediate succession.
Both methods are fast, so the new tests in 'p0090-cache-tree.sh' must call
each tested function multiple times to ensure the reported times (to 0.01s
resolution) convey the differences between them.
The tests compare the timing of a 'test-tool cache-tree' run as a no-op (to
capture a baseline for the overhead associated with running the tool),
'cache_tree_update()', and 'prime_cache_tree()' on four scenarios:
- A completely valid cache tree
- A cache tree with 2 invalid paths
- A cache tree with 50 invalid paths
- A completely empty cache tree
Example results:
Test this tree
-----------------------------------------------------------
0090.2: no-op, clean 1.27(0.48+0.52)
0090.3: prime_cache_tree, clean 2.02(0.83+0.85)
0090.4: cache_tree_update, clean 1.30(0.49+0.54)
0090.5: no-op, invalidate 2 1.29(0.48+0.54)
0090.6: prime_cache_tree, invalidate 2 1.98(0.81+0.83)
0090.7: cache_tree_update, invalidate 2 2.12(0.94+0.86)
0090.8: no-op, invalidate 50 1.32(0.50+0.55)
0090.9: prime_cache_tree, invalidate 50 2.10(0.86+0.89)
0090.10: cache_tree_update, invalidate 50 2.35(1.14+0.90)
0090.11: no-op, empty 1.33(0.50+0.54)
0090.12: prime_cache_tree, empty 2.04(0.84+0.87)
0090.13: cache_tree_update, empty 2.51(1.27+0.92)
These timings show that, while 'cache_tree_update()' is faster when the
cache tree is completely valid, it is equal to or slower than
'prime_cache_tree()' when there are any invalid paths. Since the redundant
calls are mostly in scenarios where the cache tree will be at least
partially invalid (e.g., 'git reset --hard'), 'prime_cache_tree()' will
likely perform better than 'cache_tree_update()' in typical cases.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Simplify the run-command API.
* rs/no-more-run-command-v:
replace and remove run_command_v_opt()
replace and remove run_command_v_opt_cd_env_tr2()
replace and remove run_command_v_opt_tr2()
replace and remove run_command_v_opt_cd_env()
use child_process members "args" and "env" directly
use child_process member "args" instead of string array variable
sequencer: simplify building argument list in do_exec()
bisect--helper: factor out do_bisect_run()
bisect: simplify building "checkout" argument list
am: simplify building "show" argument list
run-command: fix return value comment
merge: remove always-the-same "verbose" arguments
As with other moves to "test-tool" in f322e9f51b (Merge branch
'ab/submodule-helper-prep', 2022-09-13) the "config" sub-command was
only used by our own tests.
It was last used by "git submodule" itself in code that went away with
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10).
Let's move it over, and while doing so make it easier to reason about
by splitting up the various uses for it into separate sub-commands, so
that we don't need to count arguments to see what it does.
This also has the advantage that we stop wasting future translator
time on this command, currently the usage information for this
internal-only tool has been translated into several languages. The use
of the "_" function has also been removed from the "please make
sure..." message.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Address the root cause of technical debt we've been carrying since
sha1collisiondetection was made the default in [1]. In a preceding
commit we narrowly fixed a bug where the "DC_SHA1" variable would be
unset (in combination with "NO_APPLE_COMMON_CRYPTO=" on OSX), even
though we had the sha1collisiondetection library enabled.
But the only reason we needed to have such a user-exposed knob went
away with [1], and it's been doing nothing useful since then. We don't
care if you define DC_SHA1=*, we only care that you don't ask for any
other SHA-1 implementation. If it turns out that you didn't, we'll use
sha1collisiondetection, whether you had "DC_SHA1" set or not.
As a result of this being confusing we had e.g. [2] for cmake and the
recent [3] for ci/lib.sh setting "DC_SHA1" explicitly, even though
this was always a NOOP.
A much simpler way to do this is to stop having the Makefile and
CMakeLists.txt set "DC_SHA1" to be picked up by the test-lib.sh, let's
instead add a trivial "test-tool sha1-is-sha1dc". It returns zero if
we're using sha1collisiondetection, non-zero otherwise.
1. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
2. c4b2f41b5f (cmake: support for testing git with ctest, 2020-06-26)
3. 1ad5c3df35 (ci: use DC_SHA1=YesPlease on osx-clang job for CI,
2022-10-20)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Define the logical elements of a "bundle list", data structure to
store them in-core, format to transfer them, and code to parse
them.
* ds/bundle-uri-3:
bundle-uri: suppress stderr from remote-https
bundle-uri: quiet failed unbundlings
bundle: add flags to verify_bundle()
bundle-uri: fetch a list of bundles
bundle: properly clear all revision flags
bundle-uri: limit recursion depth for bundle lists
bundle-uri: parse bundle list in config format
bundle-uri: unit test "key=value" parsing
bundle-uri: create "key=value" line parsing
bundle-uri: create base key-value pair parsing
bundle-uri: create bundle_list struct and helpers
bundle-uri: use plain string in find_temp_filename()
Two new facilities, "timer" and "counter", are introduced to the
trace2 API.
* jh/trace2-timers-and-counters:
trace2: add global counter mechanism
trace2: add stopwatch timers
trace2: convert ctx.thread_name from strbuf to pointer
trace2: improve thread-name documentation in the thread-context
trace2: rename the thread_name argument to trace2_thread_start
api-trace2.txt: elminate section describing the public trace2 API
tr2tls: clarify TLS terminology
trace2: use size_t alloc,nr_open_regions in tr2tls_thread_ctx
Replace the remaining calls of run_command_v_opt() with run_command()
calls and explict struct child_process variables. This is more verbose,
but not by much overall. The code becomes more flexible, e.g. it's easy
to extend to conditionally add a new argument.
Then remove the now unused function and its own flag names, simplifying
the run-command API.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Use run_command() with a struct child_process variable and populate its
"args" member directly instead of building a string array and passing it
to run_command_v_opt(). This avoids the use of magic index numbers and
makes simplifies the possible addition of more arguments in the future.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The short-help text shown by "git cmd -h" and the synopsis text
shown at the beginning of "git help cmd" have been made more
consistent.
* ab/doc-synopsis-and-cmd-usage: (34 commits)
tests: assert consistent whitespace in -h output
tests: start asserting that *.txt SYNOPSIS matches -h output
doc txt & -h consistency: make "worktree" consistent
worktree: define subcommand -h in terms of command -h
reflog doc: list real subcommands up-front
doc txt & -h consistency: make "commit" consistent
doc txt & -h consistency: make "diff-tree" consistent
doc txt & -h consistency: use "[<label>...]" for "zero or more"
doc txt & -h consistency: make "annotate" consistent
doc txt & -h consistency: make "stash" consistent
doc txt & -h consistency: add missing options
doc txt & -h consistency: use "git foo" form, not "git-foo"
doc txt & -h consistency: make "bundle" consistent
doc txt & -h consistency: make "read-tree" consistent
doc txt & -h consistency: make "rerere" consistent
doc txt & -h consistency: add missing options and labels
doc txt & -h consistency: make output order consistent
doc txt & -h consistency: add or fix optional "--" syntax
doc txt & -h consistency: fix mismatching labels
doc SYNOPSIS & -h: use "-" to separate words in labels, not "_"
...
Move a global variable added as a hack during regression fixes to
its proper place in the API.
* ab/run-hook-api-cleanup:
run-command.c: remove "max_processes", add "const" to signal() handler
run-command.c: pass "opts" further down, and use "opts->processes"
run-command.c: use "opts->processes", not "pp->max_processes"
run-command.c: don't copy "data" to "struct parallel_processes"
run-command.c: don't copy "ungroup" to "struct parallel_processes"
run-command.c: don't copy *_fn to "struct parallel_processes"
run-command.c: make "struct parallel_processes" const if possible
run-command API: move *_tr2() users to "run_processes_parallel()"
run-command API: have run_process_parallel() take an "opts" struct
run-command.c: use designated init for pp_init(), add "const"
run-command API: don't fall back on online_cpus()
run-command API: make "n" parameter a "size_t"
run-command tests: use "return", not "exit"
run-command API: have "run_processes_parallel{,_tr2}()" return void
run-command test helper: use "else if" pattern
More UNUSED annotation to help using -Wunused option with the
compiler.
* jk/unused-anno-more:
ll-merge: mark unused parameters in callbacks
diffcore-pickaxe: mark unused parameters in pickaxe functions
convert: mark unused parameter in null stream filter
apply: mark unused parameters in noop error/warning routine
apply: mark unused parameters in handlers
date: mark unused parameters in handler functions
string-list: mark unused callback parameters
object-file: mark unused parameters in hash_unknown functions
mark unused parameters in trivial compat functions
update-index: drop unused argc from do_reupdate()
submodule--helper: drop unused argc from module_list_compute()
diffstat_consume(): assert non-zero length
Add global counters mechanism to Trace2.
The Trace2 counters mechanism adds the ability to create a set of
global counter variables and an API to increment them efficiently.
Counters can optionally report per-thread usage in addition to the sum
across all threads.
Counter events are emitted to the Trace2 logs when a thread exits and
at process exit.
Counters are an alternative to `data` and `data_json` events.
Counters are useful when you want to measure something across the life
of the process, when you don't want per-measurement events for
performance reasons, when the data does not fit conveniently within a
region, or when your control flow does not easily let you write the
final total. For example, you might use this to report the number of
calls to unzip() or the number of de-delta steps during a checkout.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add stopwatch timer mechanism to Trace2.
Timers are an alternative to Trace2 Regions. Regions are useful for
measuring the time spent in various computation phases, such as the
time to read the index, time to scan for unstaged files, time to scan
for untracked files, and etc.
However, regions are not appropriate in all places. For example,
during a checkout, it would be very inefficient to use regions to
measure the total time spent inflating objects from the ODB from
across the entire lifetime of the process; a per-unzip() region would
flood the output and significantly slow the command; and some form of
post-processing would be requried to compute the time spent in unzip().
Timers can be used to measure a series of timer intervals and emit
a single summary event (at thread and/or process exit).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
String-lists may be used with callbacks for clearing or iteration. These
callbacks need to conform to a particular interface, even though not
every callback needs all of its parameters. Mark the unused ones to make
-Wunused-parameter happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's arguably more correct to say "[<option>...]" than either of these
forms, but the vast majority of our documentation uses the
"[<options>]" form to indicate an arbitrary number of options, let's
do the same in these cases, which were the odd ones out.
In the case of "mv" and "sparse-checkout" let's add the missing "[]"
to indicate that these are optional.
In the case of "t/helper/test-proc-receive.c" there is no *.txt
version, making it the only hunk in this commit that's not a "doc txt
& -h consistency" change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in fd3aaf53f7 (run-command: add an "ungroup" option to
run_process_parallel(), 2022-06-07) which added the "ungroup" passing
it to "run_process_parallel()" via the global
"run_processes_parallel_ungroup" variable was a compromise to get the
smallest possible regression fix for "maint" at the time.
This follow-up to that is a start at passing that parameter and others
via a new "struct run_process_parallel_opts", as the earlier
version[1] of what became fd3aaf53f7 did.
Since we need to change all of the occurrences of "n" to
"opt->SOMETHING" let's take the opportunity and rename the terse "n"
to "processes". We could also have picked "max_processes", "jobs",
"threads" etc., but as the API is named "run_processes_parallel()"
let's go with "processes".
Since the new "run_processes_parallel()" function is able to take an
optional "tr2_category" and "tr2_label" via the struct we can at this
point migrate all of the users of "run_processes_parallel_tr2()" over
to it.
But let's not migrate all the API users yet, only the two users that
passed the "ungroup" parameter via the
"run_processes_parallel_ungroup" global
1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "run-command" test helper to "return" instead of calling
"exit", see 338abb0f04 (builtins + test helpers: use return instead
of exit() in cmd_*, 2021-06-08)
Because we'd previously gotten past the SANITIZE=leak check by using
exit() here we need to move to "goto cleanup" pattern.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the "run_processes_parallel{,_tr2}()" functions to return void,
instead of int. Ever since c553c72eed (run-command: add an
asynchronous parallel child processor, 2015-12-15) they have
unconditionally returned 0.
To get a "real" return value out of this function the caller needs to
get it via the "task_finished_fn" callback, see the example in hook.c
added in 96e7225b31 (hook: add 'run' subcommand, 2021-12-22).
So the "result = " and "if (!result)" code added to "builtin/fetch.c"
d54dea77db (fetch: let --jobs=<n> parallelize --multiple, too,
2019-10-05) has always been redundant, we always took that "if"
path. Likewise the "ret =" in "t/helper/test-run-command.c" added in
be5d88e112 (test-tool run-command: learn to run (parts of) the
testsuite, 2019-10-04) wasn't used, instead we got the return value
from the "if (suite.failed.nr > 0)" block seen in the context.
Subsequent commits will alter this API interface, getting rid of this
always-zero return value makes it easier to understand those changes.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust the cmd__run_command() to use an "if/else if" chain rather than
mutually exclusive "if" statements. This non-functional change makes a
subsequent commit smaller.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a bundle provider wants to operate independently from a Git remote,
they want to provide a single, consistent URI that users can use in
their 'git clone --bundle-uri' commands. At this point, the Git client
expects that URI to be a single bundle that can be unbundled and used to
bootstrap the rest of the clone from the Git server. This single bundle
cannot be re-used to assist with future incremental fetches.
To allow for the incremental fetch case, teach Git to understand a
bundle list that could be advertised at an independent bundle URI. Such
a bundle list is likely to be inspected by human readers, even if only
by the bundle provider creating the list. For this reason, we can take
our expected "key=value" pairs and instead format them using Git config
format.
Create bundle_uri_parse_config_format() to parse a file in config format
and convert that into a 'struct bundle_list' filled with its
understanding of the contents.
Be careful to use error_action CONFIG_ERROR_ERROR when calling
git_config_from_file_with_options() because the default action for
git_config_from_file() is to die() on a parsing error. The current
warning isn't particularly helpful if it arises to a user, but it will
be made more verbose at a higher layer later.
Update 'test-tool bundle-uri' to take this config file format as input.
It uses a filename instead of stdin because there is no existing way to
parse a FILE pointer in the config machinery. Using
git_config_from_mem() is overly complicated and more likely to introduce
bugs than this simpler version.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new 'test-tool bundle-uri' test helper. This helper will assist
in testing logic deep in the bundle URI feature.
This change introduces the 'parse-key-values' subcommand, which parses
an input file as a list of lines. These are fed into
bundle_uri_parse_line() to test how we construct a 'struct bundle_list'
from that data. The list is then output to stdout as if the key-value
pairs were a Git config file.
We use an input file instead of stdin because of a future change to
parse in config-file format that works better as an input file.
Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The resolve_relative_url() function takes argc and argv parameters; it
then reads up to 3 elements of argv without looking at argc at all. At
first glance, this seems like a bug. But it has only one caller,
cmd__submodule_resolve_relative_url(), which does confirm that argc is
3.
The main reason this is a separate function is that it was moved from
library code in 96a28a9bc6 (submodule--helper: move
"resolve-relative-url-test" to a test-tool, 2022-09-01).
We can make this code simpler and more obviously safe by just inlining
the function in its caller. As a bonus, this silences a
-Wunused-parameter warning.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Undoes 'jk/unused-annotation' topic and redoes it to work around
Coccinelle rules misfiring false positives in unrelated codepaths.
* ab/unused-annotation:
git-compat-util.h: use "deprecated" for UNUSED variables
git-compat-util.h: use "UNUSED", not "UNUSED(var)"
Annotate function parameters that are not used (but cannot be
removed for structural reasons), to prepare us to later compile
with -Wunused warning turned on.
* jk/unused-annotation:
is_path_owned_by_current_uid(): mark "report" parameter as unused
run-command: mark unused async callback parameters
mark unused read_tree_recursive() callback parameters
hashmap: mark unused callback parameters
config: mark unused callback parameters
streaming: mark unused virtual method parameters
transport: mark bundle transport_options as unused
refs: mark unused virtual method parameters
refs: mark unused reflog callback parameters
refs: mark unused each_ref_fn parameters
git-compat-util: add UNUSED macro
The codepath for the OPT_SUBCOMMAND facility has been cleaned up.
* sg/parse-options-subcommand:
notes, remote: show unknown subcommands between `'
notes: simplify default operation mode arguments check
test-parse-options.c: fix style of comparison with zero
test-parse-options.c: don't use for loop initial declaration
t0040-parse-options: remove leftover debugging
The preferred style is '!argc' instead of 'argc == 0'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We would like to eventually use for loop initial declarations in our
codebase, but we are not there yet.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As its name suggests the "resolve-relative-url-test" has never been
used outside of the test suite, see 63e95beb08 (submodule: port
resolve_relative_url from shell to C, 2016-04-15) for its original
addition.
Perhaps it would make sense to drop this code entirely, as we feel
that we've got enough indirect test coverage, but let's leave that
question to a possible follow-up change. For now let's keep the test
coverage this gives us.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the "check-name" helper to a test-tool, since
a6226fd772 (submodule--helper: convert the bulk of cmd_add() to C,
2021-08-10) it has only been used by this test, not git-submodule.sh.
As noted with its introduction in 0383bbb901 (submodule-config:
verify submodule names as paths, 2018-04-30) the intent of
t7450-bad-git-dotfiles.sh has always been to unit test the
check_submodule_name() function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new "test-tool submodule" and move the "is-active" subcommand
over to it. It was added in 5c2bd8b77a (submodule--helper: add
is-active subcommand, 2017-03-16), since
a452128a36 (submodule--helper: introduce add-config subcommand,
2021-08-06) it hasn't been used by git-submodule.sh.
Since we're creating a command dispatch similar to test-tool.c itself
let's split out the "struct test_cmd" into a new test-tool-utils.h,
which both this new code and test-tool.c itself can use.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No test has used this "--url" parameter since the test code that made
use of it was removed in 32bc548329 (submodule-config: remove support
for overlaying repository config, 2017-08-03).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the "subcommand" mode to parse-options API and update the
command line parser of Git commands with subcommands.
* sg/parse-options-subcommand: (23 commits)
remote: run "remote rm" argv through parse_options()
maintenance: add parse-options boilerplate for subcommands
pass subcommand "prefix" arguments to parse_options()
builtin/worktree.c: let parse-options parse subcommands
builtin/stash.c: let parse-options parse subcommands
builtin/sparse-checkout.c: let parse-options parse subcommands
builtin/remote.c: let parse-options parse subcommands
builtin/reflog.c: let parse-options parse subcommands
builtin/notes.c: let parse-options parse subcommands
builtin/multi-pack-index.c: let parse-options parse subcommands
builtin/hook.c: let parse-options parse subcommands
builtin/gc.c: let parse-options parse 'git maintenance's subcommands
builtin/commit-graph.c: let parse-options parse subcommands
builtin/bundle.c: let parse-options parse subcommands
parse-options: add support for parsing subcommands
parse-options: drop leading space from '--git-completion-helper' output
parse-options: clarify the limitations of PARSE_OPT_NODASH
parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options
api-parse-options.txt: fix description of OPT_CMDMODE
t0040-parse-options: test parse_options() with various 'parse_opt_flags'
...
As reported in [1] the "UNUSED(var)" macro introduced in
2174b8c75de (Merge branch 'jk/unused-annotation' into next,
2022-08-24) breaks coccinelle's parsing of our sources in files where
it occurs.
Let's instead partially go with the approach suggested in [2] of
making this not take an argument. As noted in [1] "coccinelle" will
ignore such tokens in argument lists that it doesn't know about, and
it's less of a surprise to syntax highlighters.
This undoes the "help us notice when a parameter marked as unused is
actually use" part of 9b24034754 (git-compat-util: add UNUSED macro,
2022-08-19), a subsequent commit will further tweak the macro to
implement a replacement for that functionality.
1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since ee69e7884e (gc: use temporary file for editing crontab,
2022-08-28), we now insist that "argc == 3" (and otherwise return an
error). Coverity notes that this causes some dead code:
if (argc == 3)
fclose(from);
else
fclose(to);
as we will never trigger the else. This also causes a memory leak, since
we'll never close "to".
Now that all paths require 2 arguments, we can just reorganize the
function to check argc up front, and tweak the cleanup to do the right
thing for all cases.
While we're here, we can also notice some minor problems:
- we return a negative int via error() from what is essentially a
main() function; we should return a positive non-zero value for
error. Or better yet, we can just use usage(), which gives a better
message.
- while writing the usage message, we can note the one in the comment
was made out of date by ee69e7884e. But it also had a typo already,
calling the subcommand "cron" and not "crontab"
- we didn't check for an error from fopen(), meaning we would segfault
if the to-be-read file was missing. We can use xfopen() to catch
this.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test portability improvements.
* mt/rot13-in-c:
tests: use the new C rot13-filter helper to avoid PERL prereq
t0021: implementation the rot13-filter.pl script in C
t0021: avoid grepping for a Perl-specific string at filter output
While cron is specified by POSIX, there are a wide variety of
implementations in use. "git maintenance" assumes that the
"crontab" command can be fed from its standard input the new
contents and the syntax to do so is not to have any filename
argument, as POSIX describes. However, on FreeBSD, the cron
implementation requires a file name argument: if the user wants to
edit standard input, they must specify "-".
Unfortunately, POSIX systems do not have to interpret "-" on the
command line of crontab as a request to read from the standard
input. Blindly adding "-" on the command line would not work as a
general solution.
Since POSIX tells us that cron must accept a file name argument, let's
solve this problem by specifying a temporary file instead. This will
ensure that we work with the vast majority of implementations.
Note that because delete_tempfile closes the file for us, we should not
call fclose here on the handle, since doing so will introduce a double
free.
Reported-by: Renato Botelho <garga@FreeBSD.org>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous patch almost halved the number of heap allocations for the
sort subcommand. Reduce it further by using a mem_pool for the line
objects.
Note that t/perf/run can't be used directly to compare two versions of
test-mergesort because it always runs the helpers from the checked-out
version. So I hand-merged the results of separate runs before and with
this patch:
macOS 12.5.1 on M1:
0071.12: DEFINE_LIST_SORT unsorted 0.22(0.20+0.01) 0.21(0.19+0.01)
0071.14: DEFINE_LIST_SORT sorted 0.10(0.08+0.01) 0.10(0.08+0.01)
0071.16: DEFINE_LIST_SORT reversed 0.10(0.08+0.01) 0.10(0.08+0.01)
Git SDK 64-bit on Windows 11 21H2 on Ryzen 7 5800H:
0071.12: DEFINE_LIST_SORT unsorted 0.54(0.00+0.06) 0.44(0.01+0.06)
0071.14: DEFINE_LIST_SORT sorted 0.21(0.03+0.03) 0.19(0.04+0.01)
0071.16: DEFINE_LIST_SORT reversed 0.21(0.01+0.04) 0.19(0.04+0.04)
Debian bullseye on WSL2 on the same system:
0071.12: DEFINE_LIST_SORT unsorted 0.29(0.27+0.01) 0.22(0.19+0.02)
0071.14: DEFINE_LIST_SORT sorted 0.07(0.06+0.01) 0.06(0.04+0.02)
0071.16: DEFINE_LIST_SORT reversed 0.07(0.04+0.03) 0.06(0.04+0.02)
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sort subcommand of test-mergesort is used to test the performance of
sorting linked lists. It reads lines from stdin, sorts them and prints
the result to stdout. Two heap allocations are done per line: One for
the linked list item and one for the actual line string. That imposes a
significant amount of allocation overhead.
Reduce it by doing the same as the sort subcommand of test-string-list,
namely to read the whole input file into a single buffer and then split
it in-place.
Note that t/perf/run can't be used directly to compare two versions of
test-mergesort because it always runs the helpers from the checked-out
version. So I hand-merged the results of separate runs before and with
this patch:
macOS 12.5.1 on M1:
0071.12: DEFINE_LIST_SORT unsorted 0.23(0.20+0.01) 0.22(0.20+0.01)
0071.14: DEFINE_LIST_SORT sorted 0.12(0.10+0.01) 0.10(0.08+0.01)
0071.16: DEFINE_LIST_SORT reversed 0.12(0.10+0.01) 0.10(0.08+0.01)
Git SDK 64-bit on Windows 11 21H2 on Ryzen 7 5800H:
0071.12: DEFINE_LIST_SORT unsorted 0.71(0.00+0.03) 0.54(0.00+0.06)
0071.14: DEFINE_LIST_SORT sorted 0.42(0.00+0.04) 0.21(0.03+0.03)
0071.16: DEFINE_LIST_SORT reversed 0.42(0.06+0.01) 0.21(0.01+0.04)
Debian bullseye on WSL2 on the same system:
0071.12: DEFINE_LIST_SORT unsorted 0.41(0.39+0.02) 0.29(0.27+0.01)
0071.14: DEFINE_LIST_SORT sorted 0.11(0.08+0.02) 0.07(0.06+0.01)
0071.16: DEFINE_LIST_SORT reversed 0.11(0.08+0.02) 0.07(0.04+0.03)
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callback passed to git_config() must conform to a particular
interface. But most callbacks don't actually look at the extra "void
*data" parameter. Let's mark the unused parameters to make
-Wunused-parameter happy.
Note there's one unusual case here in get_remote_default() where we
actually ignore the "value" parameter. That's because it's only checking
whether the option is found at all, and not parsing its value.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Functions used with for_each_reflog_ent() need to conform to a
particular interface, but not every function needs all of the
parameters. Mark the unused ones to make -Wunused-parameter happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Functions used with for_each_ref(), etc, need to conform to the
each_ref_fn interface. But most of them don't need every parameter;
let's annotate the unused ones to quiet -Wunused-parameter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several Git commands have subcommands to implement mutually exclusive
"operation modes", and they usually parse their subcommand argument
with a bunch of if-else if statements.
Teach parse-options to handle subcommands as well, which will result
in shorter and simpler code with consistent error handling and error
messages on unknown or missing subcommand, and it will also make
possible for our Bash completion script to handle subcommands
programmatically.
The approach is guided by the following observations:
- Most subcommands [1] are implemented in dedicated functions, and
most of those functions [2] either have a signature matching the
'int cmd_foo(int argc, const char **argc, const char *prefix)'
signature of builtin commands or can be trivially converted to
that signature, because they miss only that last prefix parameter
or have no parameters at all.
- Subcommand arguments only have long form, and they have no double
dash prefix, no negated form, and no description, and they don't
take any arguments, and can't be abbreviated.
- There must be exactly one subcommand among the arguments, or zero
if the command has a default operation mode.
- All arguments following the subcommand are considered to be
arguments of the subcommand, and, conversely, arguments meant for
the subcommand may not preceed the subcommand.
So in the end subcommand declaration and parsing would look something
like this:
parse_opt_subcommand_fn *fn = NULL;
struct option builtin_commit_graph_options[] = {
OPT_STRING(0, "object-dir", &opts.obj_dir, N_("dir"),
N_("the object directory to store the graph")),
OPT_SUBCOMMAND("verify", &fn, graph_verify),
OPT_SUBCOMMAND("write", &fn, graph_write),
OPT_END(),
};
argc = parse_options(argc, argv, prefix, options,
builtin_commit_graph_usage, 0);
return fn(argc, argv, prefix);
Here each OPT_SUBCOMMAND specifies the name of the subcommand and the
function implementing it, and the address of the same 'fn' subcommand
function pointer. parse_options() then processes the arguments until
it finds the first argument matching one of the subcommands, sets 'fn'
to the function associated with that subcommand, and returns, leaving
the rest of the arguments unprocessed. If none of the listed
subcommands is found among the arguments, parse_options() will show
usage and abort.
If a command has a default operation mode, 'fn' should be initialized
to the function implementing that mode, and parse_options() should be
invoked with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag. In this case
parse_options() won't error out when not finding any subcommands, but
will return leaving 'fn' unchanged. Note that if that default
operation mode has any --options, then the PARSE_OPT_KEEP_UNKNOWN_OPT
flag is necessary as well (otherwise parse_options() would error out
upon seeing the unknown option meant to the default operation mode).
Some thoughts about the implementation:
- The same pointer to 'fn' must be specified as 'value' for each
OPT_SUBCOMMAND, because there can be only one set of mutually
exclusive subcommands; parse_options() will BUG() otherwise.
There are other ways to tell parse_options() where to put the
function associated with the subcommand given on the command line,
but I didn't like them:
- Change parse_options()'s signature by adding a pointer to
subcommand function to be set to the function associated with
the given subcommand, affecting all callsites, even those that
don't have subcommands.
- Introduce a specific parse_options_and_subcommand() variant
with that extra funcion parameter.
- I decided against automatically calling the subcommand function
from within parse_options(), because:
- There are commands that have to perform additional actions
after option parsing but before calling the function
implementing the specified subcommand.
- The return code of the subcommand is usually the return code
of the git command, but preserving the return code of the
automatically called subcommand function would have made the
API awkward.
- Also add a OPT_SUBCOMMAND_F() variant to allow specifying an
option flag: we have two subcommands that are purposefully
excluded from completion ('git remote rm' and 'git stash save'),
so they'll have to be specified with the PARSE_OPT_NOCOMPLETE
flag.
- Some of the 'parse_opt_flags' don't make sense with subcommands,
and using them is probably just an oversight or misunderstanding.
Therefore parse_options() will BUG() when invoked with any of the
following flags while the options array contains at least one
OPT_SUBCOMMAND:
- PARSE_OPT_KEEP_DASHDASH: parse_options() stops parsing
arguments when encountering a "--" argument, so it doesn't
make sense to expect and keep one before a subcommand, because
it would prevent the parsing of the subcommand.
However, this flag is allowed in combination with the
PARSE_OPT_SUBCOMMAND_OPTIONAL flag, because the double dash
might be meaningful for the command's default operation mode,
e.g. to disambiguate refs and pathspecs.
- PARSE_OPT_STOP_AT_NON_OPTION: As its name suggests, this flag
tells parse_options() to stop as soon as it encouners a
non-option argument, but subcommands are by definition not
options... so how could they be parsed, then?!
- PARSE_OPT_KEEP_UNKNOWN: This flag can be used to collect any
unknown --options and then pass them to a different command or
subsystem. Surely if a command has subcommands, then this
functionality should rather be delegated to one of those
subcommands, and not performed by the command itself.
However, this flag is allowed in combination with the
PARSE_OPT_SUBCOMMAND_OPTIONAL flag, making possible to pass
--options to the default operation mode.
- If the command with subcommands has a default operation mode, then
all arguments to the command must preceed the arguments of the
subcommand.
AFAICT we don't have any commands where this makes a difference,
because in those commands either only the command accepts any
arguments ('notes' and 'remote'), or only the default subcommand
('reflog' and 'stash'), but never both.
- The 'argv' array passed to subcommand functions currently starts
with the name of the subcommand. Keep this behavior. AFAICT no
subcommand functions depend on the actual content of 'argv[0]',
but the parse_options() call handling their options expects that
the options start at argv[1].
- To support handling subcommands programmatically in our Bash
completion script, 'git cmd --git-completion-helper' will now list
both subcommands and regular --options, if any. This means that
the completion script will have to separate subcommands (i.e.
words without a double dash prefix) from --options on its own, but
that's rather easy to do, and it's not much work either, because
the number of subcommands a command might have is rather low, and
those commands accept only a single --option or none at all. An
alternative would be to introduce a separate option that lists
only subcommands, but then the completion script would need not
one but two git invocations and command substitutions for commands
with subcommands.
Note that this change doesn't affect the behavior of our Bash
completion script, because when completing the --option of a
command with subcommands, e.g. for 'git notes --<TAB>', then all
subcommands will be filtered out anyway, as none of them will
match the word to be completed starting with that double dash
prefix.
[1] Except 'git rerere', because many of its subcommands are
implemented in the bodies of the if-else if statements parsing the
command's subcommand argument.
[2] Except 'credential', 'credential-store' and 'fsmonitor--daemon',
because some of the functions implementing their subcommands take
special parameters.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The description of 'PARSE_OPT_KEEP_UNKNOWN' starts with "Keep unknown
arguments instead of erroring out". This is a bit misleading, as this
flag only applies to unknown --options, while non-option arguments are
kept even without this flag.
Update the description to clarify this, and rename the flag to
PARSE_OPTIONS_KEEP_UNKNOWN_OPT to make this obvious just by looking at
the flag name.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 't0040-parse-options.sh' we thoroughly test the parsing of all
types and forms of options, but in all those tests parse_options() is
always invoked with a 0 flags parameter.
Add a few tests to demonstrate how various 'enum parse_opt_flags'
values are supposed to influence option parsing.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This script is currently used by three test files: t0021-conversion.sh,
t2080-parallel-checkout-basics.sh, and
t2082-parallel-checkout-attributes.sh. To avoid the need for the PERL
dependency at these tests, let's convert the script to a C test-tool
command. The following commit will take care of actually modifying the
said tests to use the new C helper and removing the Perl script.
The Perl script flushes the log file handler after each write. As
commented in [1], this seems to be an early design decision that was
later reconsidered, but possibly ended up being left in the code by
accident:
>> +$debug->flush();
>
> Isn't $debug flushed automatically?
Maybe, but autoflush is not explicitly enabled. I will
enable it again (I disabled it because of Eric's comment
but I re-read the comment and he is only talking about
pipes).
Anyways, this behavior is not really needed for the tests and the
flush() calls make the code slightly larger, so let's avoid them
altogether in the new C version.
[1]: https://lore.kernel.org/git/7F1F1A0E-8FC3-4FBD-81AA-37786DE0EF50@gmail.com/
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Plug a bit more leaks in the revisions API.
* ab/plug-revisions-leak:
revisions API: don't leak memory on argv elements that need free()-ing
bisect.c: partially fix bisect_rev_setup() memory leak
log: refactor "rev.pending" code in cmd_show()
log: fix a memory leak in "git show <revision>..."
test-fast-rebase helper: use release_revisions() (again)
bisect.c: add missing "goto" for release_revisions()
Make our mergesort implementation type-safe.
* rs/mergesort:
mergesort: remove llist_mergesort()
packfile: use DEFINE_LIST_SORT
fetch-pack: use DEFINE_LIST_SORT
commit: use DEFINE_LIST_SORT
blame: use DEFINE_LIST_SORT
test-mergesort: use DEFINE_LIST_SORT
test-mergesort: use DEFINE_LIST_SORT_DEBUG
mergesort: add macros for typed sort of linked lists
mergesort: tighten merge loop
mergesort: unify ranks loops
Fix a bug in 0139c58ab9 (revisions API users: add "goto cleanup" for
release_revisions(), 2022-04-13), in that commit a release_revisions()
call was added to this function, but it never did anything due to this
TODO memset() added in fe1a21d526 (fast-rebase: demonstrate
merge-ort's API via new test-tool command, 2020-10-29).
Simply removing the memset() will fix the "cmdline" which can be seen
when running t5520-pull.sh.
This sort of thing could be detected automatically with a rule similar
to the unused.cocci merged in 7fa60d2a5b6 (Merge branch
'ab/cocci-unused' into next, 2022-07-11). The following rule on top
would catch the case being fixed here:
@@
type T;
identifier I;
identifier REL1 =~ "^[a-z_]*_(release|reset|clear|free)$";
identifier REL2 =~ "^(release|clear|free)_[a-z_]*$";
@@
- memset(\( I \| &I \), 0, ...);
... when != \( I \| &I \)
(
\( REL1 \| REL2 \)( \( I \| &I \), ...);
|
\( REL1 \| REL2 \)( \( &I \| I \) );
)
... when != \( I \| &I \)
That rule should arguably use only &I, not I (as we might be passed a
pointer). The distinction would matter if anyone cared about the
side-effects of a memset() followed by release() of a pointer to a
variable passed into the function.
As such a pattern would be at best very confusing, and most likely
point to buggy code as in this case, the above rule is probably fine
as-is.
But as this rule only found one such bug in the entire codebase let's
not add it to contrib/coccinelle/unused.cocci for now, we can always
dig it up in the future if it's deemed useful.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Build a typed sort function for the mergesort performance test tool
using DEFINE_LIST_SORT instead of calling llist_mergesort(). This gets
rid of the next pointer accessor functions and improves the performance
at the cost of a slightly higher object text size.
Before:
0071.12: llist_mergesort() unsorted 0.24(0.22+0.01)
0071.14: llist_mergesort() sorted 0.12(0.10+0.01)
0071.16: llist_mergesort() reversed 0.12(0.10+0.01)
__TEXT __DATA __OBJC others dec hex
6407 276 0 24701 31384 7a98 t/helper/test-mergesort.o
With this patch:
0071.12: DEFINE_LIST_SORT unsorted 0.22(0.21+0.01)
0071.14: DEFINE_LIST_SORT sorted 0.11(0.10+0.01)
0071.16: DEFINE_LIST_SORT reversed 0.11(0.10+0.01)
__TEXT __DATA __OBJC others dec hex
6615 276 0 25832 32723 7fd3 t/helper/test-mergesort.o
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Define a typed sort function using DEFINE_LIST_SORT_DEBUG for the
mergesort sanity check instead of using llist_mergesort(). This gets
rid of the next pointer accessor functions and improves the performance
at the cost of slightly bigger object text.
Before:
Benchmark 1: t/helper/test-tool mergesort test
Time (mean ± σ): 108.4 ms ± 0.2 ms [User: 106.7 ms, System: 1.2 ms]
Range (min … max): 108.0 ms … 108.8 ms 27 runs
__TEXT __DATA __OBJC others dec hex
6251 276 0 23172 29699 7403 t/helper/test-mergesort.o
With this patch:
Benchmark 1: t/helper/test-tool mergesort test
Time (mean ± σ): 94.0 ms ± 0.2 ms [User: 92.4 ms, System: 1.1 ms]
Range (min … max): 93.7 ms … 94.5 ms 31 runs
__TEXT __DATA __OBJC others dec hex
6407 276 0 24701 31384 7a98 t/helper/test-mergesort.o
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak introduced in a310d43494 ([PATCH] Deltification
library work by Nicolas Pitre., 2005-05-19), as a result we can mark
another test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak introduced in fa099d2322 (worktree.c: kill
parse_ref() in favor of refs_resolve_ref_unsafe(), 2017-04-24), as a
result we can mark another test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix memory leaks introduced with these tests in f1294eaf7f (bloom.c:
introduce core Bloom filter constructs, 2020-03-30), as a result we
can mark almost the entirety of t0095-bloom.sh as passing with
SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true", there's still an
unrelated memory leak in "git commit" in one of the tests, let's skip
that one under SANITIZE_LEAK for now.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix memory leaks introduced with these tests in
75459410ed (json_writer: new routines to create JSON data,
2018-07-13), as a result we can mark a test as passing with
SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix memory leaks in "test-tool regex" which have been there since
c91841594c (test-regex: Add a test to check for a bug in the regex
routines, 2012-09-01), as a result we can mark a test as passing with
SANITIZE=leak using "TEST_PASSES_SANITIZE_LEAK=true".
We could regfree() on the die() paths here, which would make some
invocations of valgrind(1) happy, but let's just target SANITIZE=leak
for now. Variables that are still reachable when we die() are not
reported as leaks.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in "test-tool urlmatch-normalization", as a result
we can mark the corresponding test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix memory leaks in two test-tools used by t0090-cache-tree.sh. As a
result we can mark the test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in "test-tool path-utils", as a result we can mark
the corresponding test as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak in "test-tool test-hash" which has been there since
b57cbbf8a8 (test-sha1: test hashing large buffer, 2006-06-24), as a
result we can mark more tests as passing with SANITIZE=leak using
"TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code added 0cc05b044f (usage.c: add a non-fatal bug() function to go
with BUG(), 2022-06-02) sets up two va_list variables: one to output to
stderr, and one to trace2. But the order of initialization is wrong:
va_list ap, cp;
va_copy(cp, ap);
va_start(ap, fmt);
We copy the contents of "ap" into "cp" before it is initialized, meaning
it is full of garbage. The two should be swapped.
However, there's another bug, noticed by Johannes Schindelin: we forget
to call va_end() for the copy. So instead of just fixing the copy's
initialization, let's do two separate start/end pairs. This is allowed
by the standard, and we don't need to use copy here since we have access
to the original varargs. Matching the pairs with the calls makes it more
obvious that everything is being done correctly.
Note that we do call bug_fl() in the tests, but it didn't trigger this
problem because our format string doesn't have any placeholders. So even
though we were passing a garbage va_list through the stack, nobody ever
needed to look at it. We can easily adjust one of the trace2 tests to
trigger this, both for bug() and for BUG(). The latter isn't broken, but
it's nice to exercise both a bit more. Without the fix in this patch
(but with the test change), the bug() case causes a segfault.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Git 2.36 we revamped the way how hooks are invoked. One change
that is end-user visible is that the output of a hook is no longer
directly connected to the standard output of "git" that spawns the
hook, which was noticed post release. This is getting corrected.
* ab/hooks-regression-fix:
hook API: fix v2.36.0 regression: hooks should be connected to a TTY
run-command: add an "ungroup" option to run_process_parallel()
A new bug() and BUG_if_bug() API is introduced to make it easier to
uniformly log "detect multiple bugs and abort in the end" pattern.
* ab/bug-if-bug:
cache-tree.c: use bug() and BUG_if_bug()
receive-pack: use bug() and BUG_if_bug()
parse-options.c: use optbug() instead of BUG() "opts" check
parse-options.c: use new bug() API for optbug()
usage.c: add a non-fatal bug() function to go with BUG()
common-main.c: move non-trace2 exit() behavior out of trace2.c
More fsmonitor--daemon.
* jh/builtin-fsmonitor-part3: (30 commits)
t7527: improve implicit shutdown testing in fsmonitor--daemon
fsmonitor--daemon: allow --super-prefix argument
t7527: test Unicode NFC/NFD handling on MacOS
t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd
t/helper/hexdump: add helper to print hexdump of stdin
fsmonitor: on macOS also emit NFC spelling for NFD pathname
t7527: test FSMonitor on case insensitive+preserving file system
fsmonitor: never set CE_FSMONITOR_VALID on submodules
t/perf/p7527: add perf test for builtin FSMonitor
t7527: FSMonitor tests for directory moves
fsmonitor: optimize processing of directory events
fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed
fsm-health-win32: force shutdown daemon if worktree root moves
fsm-health-win32: add polling framework to monitor daemon health
fsmonitor--daemon: stub in health thread
fsmonitor--daemon: rename listener thread related variables
fsmonitor--daemon: prepare for adding health thread
fsmonitor--daemon: cd out of worktree root
fsm-listen-darwin: ignore FSEvents caused by xattr changes on macOS
unpack-trees: initialize fsmonitor_has_run_once in o->result
...
Rename .env_array member to .env in the child_process structure.
* ab/env-array:
run-command API users: use "env" not "env_array" in comments & names
run-command API: rename "env_array" to "env"
Plug the memory leaks from the trickiest API of all, the revision
walker.
* ab/plug-leak-in-revisions: (27 commits)
revisions API: add a TODO for diff_free(&revs->diffopt)
revisions API: have release_revisions() release "topo_walk_info"
revisions API: have release_revisions() release "date_mode"
revisions API: call diff_free(&revs->pruning) in revisions_release()
revisions API: release "reflog_info" in release revisions()
revisions API: clear "boundary_commits" in release_revisions()
revisions API: have release_revisions() release "prune_data"
revisions API: have release_revisions() release "grep_filter"
revisions API: have release_revisions() release "filter"
revisions API: have release_revisions() release "cmdline"
revisions API: have release_revisions() release "mailmap"
revisions API: have release_revisions() release "commits"
revisions API users: use release_revisions() for "prune_data" users
revisions API users: use release_revisions() with UNLEAK()
revisions API users: use release_revisions() in builtin/log.c
revisions API users: use release_revisions() in http-push.c
revisions API users: add "goto cleanup" for release_revisions()
stash: always have the owner of "stash_info" free it
revisions API users: use release_revisions() needing REV_INFO_INIT
revision.[ch]: document and move code declared around "init"
...
Extend the parallel execution API added in c553c72eed (run-command:
add an asynchronous parallel child processor, 2015-12-15) to support a
mode where the stdout and stderr of the processes isn't captured and
output in a deterministic order, instead we'll leave it to the kernel
and stdio to sort it out.
This gives the API same functionality as GNU parallel's --ungroup
option. As we'll see in a subsequent commit the main reason to want
this is to support stdout and stderr being connected to the TTY in the
case of jobs=1, demonstrated here with GNU parallel:
$ parallel --ungroup 'test -t {} && echo TTY || echo NTTY' ::: 1 2
TTY
TTY
$ parallel 'test -t {} && echo TTY || echo NTTY' ::: 1 2
NTTY
NTTY
Another is as GNU parallel's documentation notes a potential for
optimization. As demonstrated in next commit our results with "git
hook run" will be similar, but generally speaking this shows that if
you want to run processes in parallel where the exact order isn't
important this can be a lot faster:
$ hyperfine -r 3 -L o ,--ungroup 'parallel {o} seq ::: 10000000 >/dev/null '
Benchmark 1: parallel seq ::: 10000000 >/dev/null
Time (mean ± σ): 220.2 ms ± 9.3 ms [User: 124.9 ms, System: 96.1 ms]
Range (min … max): 212.3 ms … 230.5 ms 3 runs
Benchmark 2: parallel --ungroup seq ::: 10000000 >/dev/null
Time (mean ± σ): 154.7 ms ± 0.9 ms [User: 136.2 ms, System: 25.1 ms]
Range (min … max): 153.9 ms … 155.7 ms 3 runs
Summary
'parallel --ungroup seq ::: 10000000 >/dev/null ' ran
1.42 ± 0.06 times faster than 'parallel seq ::: 10000000 >/dev/null '
A large part of the juggling in the API is to make the API safer for
its maintenance and consumers alike.
For the maintenance of the API we e.g. avoid malloc()-ing the
"pp->pfd", ensuring that SANITIZE=address and other similar tools will
catch any unexpected misuse.
For API consumers we take pains to never pass the non-NULL "out"
buffer to an API user that provided the "ungroup" option. The
resulting code in t/helper/test-run-command.c isn't typical of such a
user, i.e. they'd typically use one mode or the other, and would know
whether they'd provided "ungroup" or not.
We could also avoid the strbuf_init() for "buffered_output" by having
"struct parallel_processes" use a static PARALLEL_PROCESSES_INIT
initializer, but let's leave that cleanup for later.
Using a global "run_processes_parallel_ungroup" variable to enable
this option is rather nasty, but is being done here to produce as
minimal of a change as possible for a subsequent regression fix. This
change is extracted from a larger initial version[1] which ends up
with a better end-state for the API, but in doing so needed to modify
all existing callers of the API. Let's defer that for now, and
narrowly focus on what we need for fixing the regression in the
subsequent commit.
It's safe to do this with a global variable because:
A) hook.c is the only user of it that sets it to non-zero, and before
we'll get any other API users we'll refactor away this method of
passing in the option, i.e. re-roll [1].
B) Even if hook.c wasn't the only user we don't have callers of this
API that concurrently invoke this parallel process starting API
itself in parallel.
As noted above "A" && "B" are rather nasty, and we don't want to live
with those caveats long-term, but for now they should be an acceptable
compromise.
1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start following-up on the rename mentioned in c7c4bdeccf (run-command
API: remove "env" member, always use "env_array", 2021-11-25) of
"env_array" to "env".
The "env_array" name was picked in 19a583dc39 (run-command: add
env_array, an optional argv_array for env, 2014-10-19) because "env"
was taken. Let's not forever keep the oddity of "*_array" for this
"struct strvec", but not for its "args" sibling.
This commit is almost entirely made with a coccinelle rule[1]. The
only manual change here is in run-command.h to rename the struct
member itself and to change "env_array" to "env" in the
CHILD_PROCESS_INIT initializer.
The rest of this is all a result of applying [1]:
* make contrib/coccinelle/run_command.cocci.patch
* patch -p1 <contrib/coccinelle/run_command.cocci.patch
* git add -u
1. cat contrib/coccinelle/run_command.pending.cocci
@@
struct child_process E;
@@
- E.env_array
+ E.env
@@
struct child_process *E;
@@
- E->env_array
+ E->env
I've avoided changing any comments and derived variable names here,
that will all be done in the next commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a bug() function to use in cases where we'd like to indicate a
runtime BUG(), but would like to defer the BUG() call because we're
possibly accumulating more bug() callers to exhaustively indicate what
went wrong.
We already have this sort of facility in various parts of the
codebase, just in the form of ad-hoc re-inventions of the
functionality that this new API provides. E.g. this will be used to
replace optbug() in parse-options.c, and the 'error("BUG:[...]' we do
in a loop in builtin/receive-pack.c.
Unlike the code this replaces we'll log to trace2 with this new bug()
function (as with other usage.c functions, including BUG()), we'll
also be able to avoid calls to xstrfmt() in some cases, as the bug()
function itself accepts variadic sprintf()-like arguments.
Any caller to bug() can follow up such calls with BUG_if_bug(),
which will BUG() out (i.e. abort()) if there were any preceding calls
to bug(), callers can also decide not to call BUG_if_bug() and leave
the resulting BUG() invocation until exit() time. There are currently
no bug() API users that don't call BUG_if_bug() themselves after a
for-loop, but allowing for not calling BUG_if_bug() keeps the API
flexible. As the tests and documentation here show we'll catch missing
BUG_if_bug() invocations in our exit() wrapper.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a stress test to hammer on the fsmonitor daemon.
Create a client-side thread pool of n threads and have
each of them make m requests as fast as they can.
We do not currently inspect the contents of the response.
We're only interested in placing a heavy request load on
the daemon.
This test is useful for interactive testing and various
experimentation. For example, to place additional load
on the daemon while another test is running. We currently
do not have a test script that actually uses this helper.
We might add such a test in the future.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, we will implement and test support for writing a
cruft pack via a special mode of `git pack-objects`. To make sure that
objects are written with the correct timestamps, and a new test-tool
that can dump the object names and corresponding timestamps from a given
`.mtimes` file.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a release_revisions() to various users of "struct rev_info" which
requires a minor refactoring to a "goto cleanup" pattern to use that
function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a release_revisions() to various users of "struct rev_list" in
those straightforward cases where we only need to add the
release_revisions() call to the end of a block, and don't need to
e.g. refactor anything to use a "goto cleanup" pattern.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that's been with us since f9500261e0 (fast-rebase:
write conflict state to working tree, index, and HEAD, 2021-05-20)
changed this code to move these strbuf_release() into an if/else
block.
We'll also add to "reflog_msg" in the "else" arm of the "if" block
being modified here, and we'll append to "branch_msg" in both
cases. But after f9500261e0 only the "if" block would free these two
"struct strbuf".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Built-in fsmonitor (part 2).
* jh/builtin-fsmonitor-part2: (30 commits)
t7527: test status with untracked-cache and fsmonitor--daemon
fsmonitor: force update index after large responses
fsmonitor--daemon: use a cookie file to sync with file system
fsmonitor--daemon: periodically truncate list of modified files
t/perf/p7519: add fsmonitor--daemon test cases
t/perf/p7519: speed up test on Windows
t/perf/p7519: fix coding style
t/helper/test-chmtime: skip directories on Windows
t/perf: avoid copying builtin fsmonitor files into test repo
t7527: create test for fsmonitor--daemon
t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
help: include fsmonitor--daemon feature flag in version info
fsmonitor--daemon: implement handle_client callback
compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS
compat/fsmonitor/fsm-listen-darwin: add MacOS header files for FSEvent
compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows
fsmonitor--daemon: create token-based changed path cache
fsmonitor--daemon: define token-ids
fsmonitor--daemon: add pathname classification
fsmonitor--daemon: implement 'start' command
...
Teach `test-tool.exe chmtime` to ignore errors when setting the mtime
on a directory on Windows.
NEEDSWORK: The Windows version of `utime()` (aka `mingw_utime()`) does
not properly handle directories because it uses `_wopen()`. It should
be converted to using `CreateFileW()` and backup semantics at a minimum.
Since I'm already in the middle of a large patch series, I did not want
to destabilize other callers of `utime()` right now. The problem has
only been observed in the t/perf/p7519 test when the test repo contains
an empty directory on disk.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create an IPC client to send query and flush commands to the daemon.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Count string_list items in size_t, not "unsigned int".
* ab/string-list-count-in-size-t:
string-list API: change "nr" and "alloc" to "size_t"
gettext API users: don't explicitly cast ngettext()'s "n"
Fixes to the way generation number v2 in the commit-graph files are
(not) handled.
* ds/commit-graph-gen-v2-fixes:
commit-graph: declare bankruptcy on GDAT chunks
commit-graph: fix generation number v2 overflow values
commit-graph: start parsing generation v2 (again)
commit-graph: fix ordering bug in generation numbers
t5318: extract helpers to lib-commit-graph.sh
test-read-graph: include extra post-parse info
Change the "nr" and "alloc" members of "struct string_list" to use
"size_t" instead of "nr". On some platforms the size of an "unsigned
int" will be smaller than a "size_t", e.g. a 32 bit unsigned v.s. 64
bit unsigned. As "struct string_list" is a generic API we use in a lot
of places this might cause overflows.
As one example: code in "refs.c" keeps track of the number of refs
with a "size_t", and auxiliary code in builtin/remote.c in
get_ref_states() appends those to a "struct string_list".
While we're at it split the "nr" and "alloc" in string-list.h across
two lines, which is the case for most such struct member
declarations (e.g. in "strbuf.h" and "strvec.h").
Changing e.g. "int i" to "size_t i" in run_and_feed_hook() isn't
strictly necessary, and there are a lot more cases where we'll use a
local "int", "unsigned int" etc. variable derived from the "nr" in the
"struct string_list". But in that case as well as
add_wrapped_shortlog_msg() in builtin/shortlog.c we need to adjust the
printf format referring to "nr" anyway, so let's also change the other
variables referring to it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be helpful to verify that the 'struct commit_graph' that results
from parsing a commit-graph is correctly structured. The existence of
different chunks is not enough to verify that all of the optional
features are correctly enabled.
Update 'test-tool read-graph' to output an "options:" line that includes
information for different parts of the struct commit_graph.
In particular, this change demonstrates that the read_generation_data
option is never being enabled, which will be fixed in a later change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>