"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
source: <20220421044233.894255-1-alexhenrie24@gmail.com>
* ah/rebase-keep-base-fix:
rebase: use correct base for --keep-base when a branch is given
Avoid problems from interaction between malloc_check and address
sanitizer.
source: <pull.1210.git.1649507317350.gitgitgadget@gmail.com>
* pw/test-malloc-with-sanitize-address:
tests: make SANITIZE=address imply TEST_NO_MALLOC_CHECK
The commit summary shown after making a commit is matched to what
is given in "git status" not to use the break-rewrite heuristics.
source: <c35bd0aa-2e46-e710-2b39-89f18bad0097@web.de>
* rs/commit-summary-wo-break-rewrite:
commit, sequencer: turn off break_opt for commit summary
A bit of test framework fixes with a few fixes to issues found by
valgrind.
source: <20220512223218.237544-1-gitster@pobox.com>
* ab/valgrind-fixes:
commit-graph.c: don't assume that stat() succeeds
object-file: fix a unpack_loose_header() regression in 3b6a8db3b0
log test: skip a failing mkstemp() test under valgrind
tests: using custom GIT_EXEC_PATH breaks --valgrind tests
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
source: <xmqqh76mf7s4.fsf_-_@gitster.g>
* jc/show-branch-g-current:
show-branch: -g and --current are incompatible
Test that "git config --show-scope" shows the "worktree" scope, and add
it to the list of scopes in Documentation/git-config.txt.
"git config --help" does not need to be updated because it already
mentions "worktree".
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A git subcommand like "git add -p" spawns a separate git process
while relaying its command line arguments. A pathspec with only
negative elements was mistakenly passed with an empty string, which
has been corrected.
* jc/all-negative-pathspec:
pathspec: correct an empty string used as a pathspec element
Implementation of "scalar diagnose" subcommand.
* js/scalar-diagnose:
scalar: teach `diagnose` to gather loose objects information
scalar: teach `diagnose` to gather packfile info
scalar diagnose: include disk space information
scalar: implement `scalar diagnose`
scalar: validate the optional enlistment argument
archive --add-virtual-file: allow paths containing colons
archive: optionally add "virtual" files
Update the GitHub workflow support to make it quicker to get to the
failing test.
* js/ci-github-workflow-markup:
ci: call `finalize_test_case_output` a little later
ci(github): mention where the full logs can be found
ci: use `--github-workflow-markup` in the GitHub workflow
ci(github): avoid printing test case preamble twice
ci(github): skip the logs of the successful test cases
ci: optionally mark up output in the GitHub workflow
ci/run-build-and-tests: add some structure to the GitHub workflow output
ci: make it easier to find failed tests' logs in the GitHub workflow
ci/run-build-and-tests: take a more high-level view
test(junit): avoid line feeds in XML attributes
tests: refactor --write-junit-xml code
ci: fix code style
Plug the memory leaks from the trickiest API of all, the revision
walker.
* ab/plug-leak-in-revisions: (27 commits)
revisions API: add a TODO for diff_free(&revs->diffopt)
revisions API: have release_revisions() release "topo_walk_info"
revisions API: have release_revisions() release "date_mode"
revisions API: call diff_free(&revs->pruning) in revisions_release()
revisions API: release "reflog_info" in release revisions()
revisions API: clear "boundary_commits" in release_revisions()
revisions API: have release_revisions() release "prune_data"
revisions API: have release_revisions() release "grep_filter"
revisions API: have release_revisions() release "filter"
revisions API: have release_revisions() release "cmdline"
revisions API: have release_revisions() release "mailmap"
revisions API: have release_revisions() release "commits"
revisions API users: use release_revisions() for "prune_data" users
revisions API users: use release_revisions() with UNLEAK()
revisions API users: use release_revisions() in builtin/log.c
revisions API users: use release_revisions() in http-push.c
revisions API users: add "goto cleanup" for release_revisions()
stash: always have the owner of "stash_info" free it
revisions API users: use release_revisions() needing REV_INFO_INIT
revision.[ch]: document and move code declared around "init"
...
Fix a regression reported[1] against f443246b9f (commit: convert
{pre-commit,prepare-commit-msg} hook to hook.h, 2021-12-22): Due to
using the run_process_parallel() API in the earlier 96e7225b31 (hook:
add 'run' subcommand, 2021-12-22) we'd capture the hook's stderr and
stdout, and thus lose the connection to the TTY in the case of
e.g. the "pre-commit" hook.
As a preceding commit notes GNU parallel's similar --ungroup option
also has it emit output faster. While we're unlikely to have hooks
that emit truly massive amounts of output (or where the performance
thereof matters) it's still informative to measure the overhead. In a
similar "seq" test we're now ~30% faster:
$ cat .git/hooks/seq-hook; git hyperfine -L rev origin/master,HEAD~0 -s 'make CFLAGS=-O3' './git hook run seq-hook'
#!/bin/sh
seq 100000000
Benchmark 1: ./git hook run seq-hook' in 'origin/master
Time (mean ± σ): 787.1 ms ± 13.6 ms [User: 701.6 ms, System: 534.4 ms]
Range (min … max): 773.2 ms … 806.3 ms 10 runs
Benchmark 2: ./git hook run seq-hook' in 'HEAD~0
Time (mean ± σ): 603.4 ms ± 1.6 ms [User: 573.1 ms, System: 30.3 ms]
Range (min … max): 601.0 ms … 606.2 ms 10 runs
Summary
'./git hook run seq-hook' in 'HEAD~0' ran
1.30 ± 0.02 times faster than './git hook run seq-hook' in 'origin/master'
1. https://lore.kernel.org/git/CA+dzEBn108QoMA28f0nC8K21XT+Afua0V2Qv8XkR8rAeqUCCZw@mail.gmail.com/
Reported-by: Anthony Sottile <asottile@umich.edu>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
[jc: minor fix-up to tests for consistency]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the parallel execution API added in c553c72eed (run-command:
add an asynchronous parallel child processor, 2015-12-15) to support a
mode where the stdout and stderr of the processes isn't captured and
output in a deterministic order, instead we'll leave it to the kernel
and stdio to sort it out.
This gives the API same functionality as GNU parallel's --ungroup
option. As we'll see in a subsequent commit the main reason to want
this is to support stdout and stderr being connected to the TTY in the
case of jobs=1, demonstrated here with GNU parallel:
$ parallel --ungroup 'test -t {} && echo TTY || echo NTTY' ::: 1 2
TTY
TTY
$ parallel 'test -t {} && echo TTY || echo NTTY' ::: 1 2
NTTY
NTTY
Another is as GNU parallel's documentation notes a potential for
optimization. As demonstrated in next commit our results with "git
hook run" will be similar, but generally speaking this shows that if
you want to run processes in parallel where the exact order isn't
important this can be a lot faster:
$ hyperfine -r 3 -L o ,--ungroup 'parallel {o} seq ::: 10000000 >/dev/null '
Benchmark 1: parallel seq ::: 10000000 >/dev/null
Time (mean ± σ): 220.2 ms ± 9.3 ms [User: 124.9 ms, System: 96.1 ms]
Range (min … max): 212.3 ms … 230.5 ms 3 runs
Benchmark 2: parallel --ungroup seq ::: 10000000 >/dev/null
Time (mean ± σ): 154.7 ms ± 0.9 ms [User: 136.2 ms, System: 25.1 ms]
Range (min … max): 153.9 ms … 155.7 ms 3 runs
Summary
'parallel --ungroup seq ::: 10000000 >/dev/null ' ran
1.42 ± 0.06 times faster than 'parallel seq ::: 10000000 >/dev/null '
A large part of the juggling in the API is to make the API safer for
its maintenance and consumers alike.
For the maintenance of the API we e.g. avoid malloc()-ing the
"pp->pfd", ensuring that SANITIZE=address and other similar tools will
catch any unexpected misuse.
For API consumers we take pains to never pass the non-NULL "out"
buffer to an API user that provided the "ungroup" option. The
resulting code in t/helper/test-run-command.c isn't typical of such a
user, i.e. they'd typically use one mode or the other, and would know
whether they'd provided "ungroup" or not.
We could also avoid the strbuf_init() for "buffered_output" by having
"struct parallel_processes" use a static PARALLEL_PROCESSES_INIT
initializer, but let's leave that cleanup for later.
Using a global "run_processes_parallel_ungroup" variable to enable
this option is rather nasty, but is being done here to produce as
minimal of a change as possible for a subsequent regression fix. This
change is extracted from a larger initial version[1] which ends up
with a better end-state for the API, but in doing so needed to modify
all existing callers of the API. Let's defer that for now, and
narrowly focus on what we need for fixing the regression in the
subsequent commit.
It's safe to do this with a global variable because:
A) hook.c is the only user of it that sets it to non-zero, and before
we'll get any other API users we'll refactor away this method of
passing in the option, i.e. re-roll [1].
B) Even if hook.c wasn't the only user we don't have callers of this
API that concurrently invoke this parallel process starting API
itself in parallel.
As noted above "A" && "B" are rather nasty, and we don't want to live
with those caveats long-term, but for now they should be an acceptable
compromise.
1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The short upstream state indicator inherits the colour of the last short
state indicator before it (if there is one), and the sparsity state
indicator inherits this colour as well. This behaviour was introduced by
0ec7c23cdc (git-prompt: make upstream state indicator location
consistent, 2022-02-27), while before this change the aforementioned
indicators were white/the default text colour. Some examples to
illustrate this behaviour (assuming all indicators are enabled and
colourization is on):
* If there is something in the stash, both the '$' and the short
upstream state indicator following it will be blue.
* If the local tree has new, untracked files and there is nothing in
the stash, both the '%' and the short upstream state indicator
will be red.
* If all local changes are added to the index and the stash is empty,
both the '+' and the short upstream state indicator following it will
be green.
* If the local tree is clean and there is nothing in the stash, the
short upstream state indicator will be white/${default text colour}.
This appears to be an unintended side-effect of the change, and makes
little sense semantically (e.g. why is it bad to be in sync with
upstream when you have uncommitted local changes?). The cause of the
change in colourization is that previously, the short upstream state
indicator appeared immediately after the rebase/revert/bisect/merge
state indicator (note the position of $p in $gitstring):
local f="$h$w$i$s$u"
local gitstring="$c$b${f:+$z$f}${sparse}$r$p"
Said indicator is prepended with the clear colour code, and the short
upstream state indicator is thus also uncoloured. Now, the short
upstream state indicator follows the sequence of colourized indicators,
without any clearing of colour (again note the position of $p, now in
$f):
local f="$h$w$i$s$u$p"
local gitstring="$c$b${f:+$z$f}${sparse}$r${upstream}"
If the user is in a sparse checkout, the sparsity state indicator
follows a similar pattern to the short upstream state indicator.
However, clearing colour of the colourized indicators changes how the
sparsity state indicator is colourized, as it currently inherits (and
before the change referenced also inherited) the colour of the last
short state indicator before it. Reading the commit message of the
change that introduced the sparsity state indicator, afda36dbf3
(git-prompt: include sparsity state as well, 2020-06-21), it appears
this colourization also was unintended, so clearing the colour for said
indicator further increases consistency.
Make the colourization of these state indicators consistent by making
all colourized indicators clear their own colour. Make colouring of $c
dependent on it not being empty, as it is no longer being used to colour
the branch name. Move clearing of $b's prefix to before colourization so
it gets cleared properly when colour codes are inserted into it. These
changes make changing the layout of the prompt less prone to unintended
colour changes in the future.
Change coloured Bash prompt tests to reflect the colourization changes:
* Move the colour codes to wrap the expected content of the expanded
$__git_ps1_branch_name in all tests.
* Insert a clear-colour code after the symbol for the first indicator
in "prompt - bash color pc mode - dirty status indicator - dirty
index and worktree", to reflect that all indicators should clear
their own colour.
Signed-off-by: Joakim Petersen <joak-pet@online.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After generating diffs for each range to be compared using a 'git log'
invocation, range-diff.c::read_patches looks for the "diff --git" header
in those diffs to recognize the beginning of a new change.
In a project with submodules, and with 'diff.submodule=log' set in the
config, this header is missing for the diff of a changed submodule, so
any submodule changes are quietly ignored in the range-diff.
When 'diff.submodule=diff' is set in the config, the "diff --git" header
is also missing for the submodule itself, but is shown for submodule
content changes, which can easily confuse 'git range-diff' and lead to
errors such as:
error: git apply: bad git-diff - inconsistent old filename on line 1
error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
'
error: could not parse log for '@{u}..@{1}'
Force the submodule diff format to its default ("short") when invoking
'git log' to generate the patches for each range, such that submodule
changes are always detected.
Add a test, including an invocation with '--creation-factor=100' to
force the second commit in the range not to be considered a complete
rewrite, in order to verify we do indeed get the "short" format.
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/sparse-checkout to explicitly
create the directory by setting "TEST_CREATE_REPO_NO_TEMPLATE=1"
before sourcing test-lib.sh, and using the "--template=" argument to
"git clone".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/exclude to explicitly create
the directory by setting "TEST_CREATE_REPO_NO_TEMPLATE=1" before
sourcing test-lib.sh, and using the "--template=" argument to "git
clone" and "git init".
In the case of ".git/modules/sub1/info" we deviate from the
established pattern in this and preceding commits of passing a
"--template=" and doing a "mkdir .git/info".
In that case "git checkout" will run the "submodule--helper clone",
and both e.g. "git submodule update --init" and "git checkout" do not
have a way to pass down options to the eventual "git init" or "git
clone". Let's instead assume that the submodule was populated with our
default templates, remove them, and then run the "mkdir".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/refs to explicitly create the
directory by using the "--template=" argument to "git init".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/attributes to explicitly
create the directory by setting "TEST_CREATE_REPO_NO_TEMPLATE=1"
before sourcing test-lib.sh, and using the "--template=" argument to
"git clone".
The change here in here in t7814-grep-recurse-submodules.sh would
continue "succeeding" with only the "TEST_CREATE_REPO_NO_TEMPLATE=1"
part of this change. That's because those tests use
"test_expect_failure", so they'd "pass" without this change, as
"test_expect_failure" by design isn't discerning about what failure
conditions it'll accept.
But as we're fixing these sorts of issues across the test suite let's
fix this one too. This issue was spotted with a local merge with
another topic of mine[1], which introduces a stricter alternative to
"test_expect_failure".
1. https://lore.kernel.org/git/cover-0.7-00000000000-20220318T002951Z-avarab@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change those tests that assumed that a .git/info directory would be
created for them when writing .git/info/grafts to explicitly create
the directory.
Do this using the new "TEST_CREATE_REPO_NO_TEMPLATE" facility, and use
"mkdir" instead of "mkdir -p" to assert that we don't have the
.git/info already. An exception to this is the "with grafts" test in
"t6001-rev-list-graft.sh". There we're modifying our ".git" state in a
for-loop, in lieu of refactoring that more extensively let's use
"mkdir -p" there.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As noted in c8a58ac5a5 (Revert "Don't create the $GIT_DIR/branches
directory on init", 2009-10-31) there was an attempt long ago in
0cc5691a8b (Don't create the $GIT_DIR/branches directory on init,
2009-10-30) to get rid of the legacy "branches" directory.
We should probably get rid of its creation by removing the
"templates/branches--" file. But whatever our default behavior, our
tests should be tightened up to explicitly create the .git/branches
directory if they rely on our default templates, to make the
dependency on those templates clear.
So let's amend the two tests that would fail if .git/branches wasn't
created. To do this introduce a new "TEST_CREATE_REPO_NO_TEMPLATE"
variable, which we'll set before sourcing test-lib.sh, and change the
"git clone" and "git init" commands in the tests themselves to
explicitly pass "--template=".
This way they won't get a .git/branches in either their top-level
.git, or in the ones they create. We can then amend the tests that
rely on the ".git/branches" directory existing to create it
explicitly, and to remove it after its creation.
This new "TEST_CREATE_REPO_NO_TEMPLATE" variable is a less
heavy-handed version of the "NO_SET_GIT_TEMPLATE_DIR" variable. See
a94d305bf8 (t/t0001-init.sh: add test for 'init with init.templatedir
set', 2010-02-26) for its implementation.
Unlike "TEST_CREATE_REPO_NO_TEMPLATE", this new
"TEST_CREATE_REPO_NO_TEMPLATE" variable is narrowly scoped to what the
"git init" in test-lib.sh does, as opposed to the global effect of
"NO_SET_GIT_TEMPLATE_DIR" and the setting of "GIT_TEMPLATE_DIR" in
wrap-for-bin.sh.
I experimented with adding a new "GIT_WRAP_FOR_BIN_VIA_TEST_LIB"
variable set in test-lib.sh, which would cause wrap-for-bin.sh to not
set GIT_TEMPLATE_DIR, GITPERLLIB etc, as we set those in
test-lib.sh. I think that's a viable approach, but it would interact
e.g. with the appending feature of GITPERLLIB added in
8bade1e12e (wrap-for-bin: make bin-wrappers chainable, 2013-07-04).
Doing so would allow us to convert the tests in t0001-init.sh that now
use "NO_SET_GIT_TEMPLATE_DIR" to simply unset "GIT_TEMPLATE_DIR" in a
sub-shell before invoking "git init" or "git clone". I think that
approach is worth pursuing, but let's table it for now. Some future
wrap-for-bin.sh refactoring can try to address it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change a test added in 368aa52952 (add git-check-ignore sub-command,
2013-01-06) to clobber .git/info/exclude rather than append to
it.
These tests would break if the "templates/info--exclude" file added in
d3af621b14 (Redo the templates generation and installation.,
2005-08-06) wasn't exactly 6 lines (of only comments).
Let's instead clobber the default .git/info/excludes file, and test
only our own expected content. This is not strictly needed for
anything in this series, but is a good cleanup while we're at it.
As discussed in the preceding commit a lot of things depend on the
"info" directory being created, but this was the only test that relied
on the specific content in the "templates/info--exclude" file.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a commit is parsed, it pretends to have a different (possibly
empty) list of parents if there is graft information for that commit.
But there is a bug that could occur when a commit is parsed, the graft
information is updated (for example, when a shallow file is rewritten),
and the same commit is subsequently used: the parents of the commit do
not conform to the updated graft information, but the information at the
time of parsing.
This is usually not an issue, as a commit is usually introduced into the
repository at the same time as its graft information. That means that
when we try to parse that commit, we already have its graft information.
But it is an issue when fetching a shallow point directly into a
repository with submodules. The function
assign_shallow_commits_to_refs() parses all sought objects (including
the shallow point, which we are directly fetching). In update_shallow()
in fetch-pack.c, assign_shallow_commits_to_refs() is called before
commit_shallow_file(), which means that the shallow point would have
been parsed before graft information is updated. Once a commit is
parsed, it is no longer sensitive to any graft information updates. This
parsed commit is subsequently used when we do a revision walk to search
for submodules to fetch, meaning that the commit is considered to have
parents even though it is a shallow point (and therefore should be
treated as having no parents).
Therefore, whenever graft information is updated, mark the commits that
were previously grafts and the commits that are newly grafts as
unparsed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Users sometimes provide a "username:password" combination in their
plaintext URLs. Since Git stores these URLs in plaintext in the
.git/config file, this is a very insecure way of storing these
credentials. Credential managers are a more secure way of storing this
information.
System administrators might want to prevent this kind of use by users on
their machines.
Create a new "fetch.credentialsInUrl" config option and teach Git to
warn or die when seeing a URL with this kind of information. The warning
anonymizes the sensitive information of the URL to be clear about the
issue.
This change currently defaults the behavior to "allow" which does
nothing with these URLs. We can consider changing this behavior to
"warn" by default if we wish. At that time, we may want to add some
advice about setting fetch.credentialsInUrl=ignore for users who still
want to follow this pattern (and not receive the warning).
An earlier version of this change injected the logic into
url_normalize() in urlmatch.c. While most code paths that parse URLs
eventually normalize the URL, that normalization does not happen early
enough in the stack to avoid attempting connections to the URL first. By
inserting a check into the remote validation, we identify the issue
before making a connection. In the old code path, this was revealed by
testing the new t5601-clone.sh test under --stress, resulting in an
instance where the return code was 13 (SIGPIPE) instead of 128 from the
die().
However, we can reuse the parsing information from url_normalize() in
order to benefit from its well-worn parsing logic. We can use the struct
url_info that is created in that method to replace the password with
"<redacted>" in our error messages. This comes with a slight downside
that the normalized URL might look slightly different from the input URL
(for instance, the normalized version adds a closing slash). This should
not hinder users figuring out what the problem is and being able to fix
the issue.
As an attempt to ensure the parsing logic did not catch any
unintentional cases, I modified this change locally to to use the "die"
option by default. Running the test suite succeeds except for the
explicit username:password URLs used in t5550-http-fetch-dumb.sh and
t5541-http-push-smart.sh. This means that all other tested URLs did not
trigger this logic.
The tests show that the proper error messages appear (or do not
appear), but also count the number of error messages. When only warning,
each process validates the remote URL and outputs a warning. This
happens twice for clone, three times for fetch, and once for push.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A mechanism to pack unreachable objects into a "cruft pack",
instead of ejecting them into loose form to be reclaimed later, has
been introduced.
* tb/cruft-packs:
sha1-file.c: don't freshen cruft packs
builtin/gc.c: conditionally avoid pruning objects via loose
builtin/repack.c: add cruft packs to MIDX during geometric repack
builtin/repack.c: use named flags for existing_packs
builtin/repack.c: allow configuring cruft pack generation
builtin/repack.c: support generating a cruft pack
builtin/pack-objects.c: --cruft with expiration
reachable: report precise timestamps from objects in cruft packs
reachable: add options to add_unseen_recent_objects_to_traversal
builtin/pack-objects.c: --cruft without expiration
builtin/pack-objects.c: return from create_object_entry()
t/helper: add 'pack-mtimes' test-tool
pack-mtimes: support writing pack .mtimes files
chunk-format.h: extract oid_version()
pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles'
pack-mtimes: support reading .mtimes files
Documentation/technical: add cruft-packs.txt
Teach "git repack --geometric" work better with "--keep-pack" and
avoid corrupting the repository when packsize limit is used.
* tb/geom-repack-with-keep-and-max:
builtin/repack.c: ensure that `names` is sorted
t7703: demonstrate object corruption with pack.packSizeLimit
repack: respect --keep-pack with geometric repack
"sparse-checkout" learns to work well with the sparse-index
feature.
* ds/sparse-sparse-checkout:
sparse-checkout: integrate with sparse index
p2000: add test for 'git sparse-checkout [add|set]'
sparse-index: complete partial expansion
sparse-index: partially expand directories
sparse-checkout: --no-sparse-index needs a full index
cache-tree: implement cache_tree_find_path()
sparse-index: introduce partially-sparse indexes
sparse-index: create expand_index()
t1092: stress test 'git sparse-checkout set'
t1092: refactor 'sparse-index contents' test
Introduce a filesystem-dependent mechanism to optimize the way the
bits for many loose object files are ensured to hit the disk
platter.
* ns/batch-fsync:
core.fsyncmethod: performance tests for batch mode
t/perf: add iteration setup mechanism to perf-lib
core.fsyncmethod: tests for batch mode
test-lib-functions: add parsing helpers for ls-files and ls-tree
core.fsync: use batch mode and sync loose objects by default on Windows
unpack-objects: use the bulk-checkin infrastructure
update-index: use the bulk-checkin infrastructure
builtin/add: add ODB transaction around add_files_to_cache
cache-tree: use ODB transaction around writing a tree
core.fsyncmethod: batched disk flushes for loose-objects
bulk-checkin: rebrand plug/unplug APIs as 'odb transactions'
bulk-checkin: rename 'state' variable and separate 'plugged' boolean
Deprecate non-cone mode of the sparse-checkout feature.
* en/sparse-cone-becomes-default:
Documentation: some sparsity wording clarifications
git-sparse-checkout.txt: mark non-cone mode as deprecated
git-sparse-checkout.txt: flesh out pattern set sections a bit
git-sparse-checkout.txt: add a new EXAMPLES section
git-sparse-checkout.txt: shuffle some sections and mark as internal
git-sparse-checkout.txt: update docs for deprecation of 'init'
git-sparse-checkout.txt: wording updates for the cone mode default
sparse-checkout: make --cone the default
tests: stop assuming --no-cone is the default mode for sparse-checkout
Add a test for the regression introduced in my 9c4d58ff2c (ls-tree:
split up "fast path" callbacks, 2022-03-23) and fixed in
350296cc78 (ls-tree: `-l` should not imply recursive listing,
2022-04-04), and test for the test of ls-tree option/mode combinations
to make sure we don't have other blind spots.
The setup for these tests can be shared with those added in the
1041d58b4d (Merge branch 'tl/ls-tree-oid-only', 2022-04-04) topic, so
let's create a new t/lib-t3100.sh to help them share data.
The existing tests in "t3104-ls-tree-format.sh" didn't deal with a
submodule, which they'll now encounter with as the
setup_basic_ls_tree_data() sets one up.
This extensive testing should give us confidence that there were no
further regressions in this area. The lack of testing was noted back
in [1], but unfortunately we didn't cover that blind-spot before
9c4d58ff2c.
1. https://lore.kernel.org/git/211115.86o86lqe3c.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start following-up on the rename mentioned in c7c4bdeccf (run-command
API: remove "env" member, always use "env_array", 2021-11-25) of
"env_array" to "env".
The "env_array" name was picked in 19a583dc39 (run-command: add
env_array, an optional argv_array for env, 2014-10-19) because "env"
was taken. Let's not forever keep the oddity of "*_array" for this
"struct strvec", but not for its "args" sibling.
This commit is almost entirely made with a coccinelle rule[1]. The
only manual change here is in run-command.h to rename the struct
member itself and to change "env_array" to "env" in the
CHILD_PROCESS_INIT initializer.
The rest of this is all a result of applying [1]:
* make contrib/coccinelle/run_command.cocci.patch
* patch -p1 <contrib/coccinelle/run_command.cocci.patch
* git add -u
1. cat contrib/coccinelle/run_command.pending.cocci
@@
struct child_process E;
@@
- E.env_array
+ E.env
@@
struct child_process *E;
@@
- E->env_array
+ E->env
I've avoided changing any comments and derived variable names here,
that will all be done in the next commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a bug() function to use in cases where we'd like to indicate a
runtime BUG(), but would like to defer the BUG() call because we're
possibly accumulating more bug() callers to exhaustively indicate what
went wrong.
We already have this sort of facility in various parts of the
codebase, just in the form of ad-hoc re-inventions of the
functionality that this new API provides. E.g. this will be used to
replace optbug() in parse-options.c, and the 'error("BUG:[...]' we do
in a loop in builtin/receive-pack.c.
Unlike the code this replaces we'll log to trace2 with this new bug()
function (as with other usage.c functions, including BUG()), we'll
also be able to avoid calls to xstrfmt() in some cases, as the bug()
function itself accepts variadic sprintf()-like arguments.
Any caller to bug() can follow up such calls with BUG_if_bug(),
which will BUG() out (i.e. abort()) if there were any preceding calls
to bug(), callers can also decide not to call BUG_if_bug() and leave
the resulting BUG() invocation until exit() time. There are currently
no bug() API users that don't call BUG_if_bug() themselves after a
for-loop, but allowing for not calling BUG_if_bug() keeps the API
flexible. As the tests and documentation here show we'll catch missing
BUG_if_bug() invocations in our exit() wrapper.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the HTML Standard FAQ:
“What is the DOCTYPE for modern HTML documents?
In text/html documents:
<!DOCTYPE html>
In documents delivered with an XML media type: no DOCTYPE is required
and its use is generally unnecessary. However, you may use one if you
want (see the following question). Note that the above is well-formed
XML.”
Source: [1]
Gitweb uses an XHTML 1.0 DOCTYPE:
<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
While that DOCTYPE is still valid [2], it has several disadvantages:
1. It’s misleading. If an XML parser uses the DTD at the given link,
then the entities and ⋅ won’t get declared. Instead, the
parser has to use a DTD from the HTML Standard that has nothing to do
with XHTML 1.0 [2].
2. It’s obsolete. XHTML 1.0 was last revised in 2002 and was superseded in
2018 [3].
3. It’s unreliable. Gitweb uses and ⋅ but lets an external file
define them. “[…U]using entity references for characters in XML documents
is unsafe if they are defined in an external file (except for <, >,
&, ", and ').” [4]
[1]: <https://github.com/whatwg/html/blob/main/FAQ.md#what-is-the-doctype-for-modern-html-documents>
[2]: <https://html.spec.whatwg.org/multipage/xhtml.html#parsing-xhtml-documents>
[3]: <https://www.w3.org/TR/xhtml1/#xhtml>
[4]: <https://html.spec.whatwg.org/multipage/xhtml.html#writing-xhtml-documents>
Signed-off-by: Jason Yundt <jason@jasonyundt.email>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Branch names can't be empty, so config keys with an empty branch name,
e.g. "branch..remote", are silently ignored.
Since these config keys will never be useful, make it a fatal error when
remote.c finds a key that starts with "branch." and has an empty
subsection.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
4a2dcb1a08 (remote: die if branch is not found in repository,
2021-11-17) introduced a regression where multiple config entries with
an empty branch name, e.g.
[branch ""]
remote = foo
merge = bar
could cause Git to fail when it tries to look up branch tracking
information.
We parse the config key to get (branch name, branch name length), but
when the branch name subsection is empty, we get a bogus branch name,
e.g. "branch..remote" gives (".remote", 0). We continue to use the bogus
branch name as if it were valid, and prior to 4a2dcb1a08, this wasn't an
issue because length = 0 caused the branch name to effectively be ""
everywhere.
However, that commit handles length = 0 inconsistently when we create
the branch:
- When find_branch() is called to check if the branch exists in the
branch hash map, it interprets a length of 0 to mean that it should
call strlen on the char pointer.
- But the code path that inserts into the branch hash map interprets a
length of 0 to mean that the string is 0-length.
This results in the bug described above:
- "branch..remote" looks for ".remote" in the branch hash map. Since we
do not find it, we insert the "" entry into the hash map.
- "branch..merge" looks for ".merge" in the branch hash map. Since we
do not find it, we again try to insert the "" entry into the hash map.
However, the entries in the branch hash map are supposed to be
appended to, not overwritten.
- Since overwriting an entry is a BUG(), Git fails instead of silently
ignoring the empty branch name.
Fix the bug by removing the convenience strlen functionality, so that
0 means that the string is 0-length. We still insert a bogus branch name
into the hash map, but this will be fixed in a later commit.
Reported-by: "Ing. Martin Prantl Ph.D." <perry@ntis.zcu.cz>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As 'revert' and 'cherry-pick' share a lot of code, it is easy to
modify the behaviour of one command and inadvertently affect the
other. An earlier change to teach the '--reference' option and the
'revert.reference' configuration variable to the former was not
careful enough and 'cherry-pick --reference' wasn't rejected as an
error.
It is possible to think 'cherry-pick -x' might benefit from the
'--reference' option, but it is fundamentally different from
'revert' in at least two ways to make it questionable:
- 'revert' names a commit that is ancestor of the resulting commit,
so an abbreviated object name with human readable title is
sufficient to identify the named commit uniquely without using
the full object name. On the other hand, 'cherry-pick'
usually [*] picks a commit that is not an ancestor. It might be
even picking a private commit that never becomes part of the
public history.
- The whole commit message of 'cherry-pick' is a copy of the
original commit, and there is nothing gained to repeat only the
title part on 'cherry-picked from' message.
[*] well, you could revert and then you can pick the original that
was reverted to get back to where you were, but then you can
revert the revert to do the same thing.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git add -i" was rewritten in C some time ago and has been in
testing; the reimplementation is now exposed to general public by
default.
* js/use-builtin-add-i:
add -i: default to the built-in implementation
t2016: require the PERL prereq only when necessary
The tests that ensured merges stop when interfering local changes
are present did not make sure that local changes are preserved; now
they do.
* jc/t6424-failing-merge-preserve-local-changes:
t6424: make sure a failed merge preserves local changes
With the new http.curloptResolve configuration, the CURLOPT_RESOLVE
mechanism that allows cURL based applications to use pre-resolved
IP addresses for the requests is exposed to the scripts.
* cc/http-curlopt-resolve:
http: add custom hostname to IP address resolutions
By allowing the path to be enclosed in double-quotes, we can avoid
the limitation that paths cannot contain colons.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the `--add-virtual-file=<path>:<content>` option, `git archive` now
supports use cases where relatively trivial files need to be added that
do not exist on disk.
This will allow us to generate `.zip` files with generated content,
without having to add said content to the object database and without
having to write it out to disk.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc: tweaked <path> handling]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Pathspecs with only negative elements did not work with some
commands that pass the pathspec along to a subprocess. For
instance,
$ git add -p -- ':!*.txt'
should add everything except for paths ending in ".txt", but it gets
complaint from underlying "diff-index" and aborts.
We used to error out when a pathspec with only negative elements in
it, like the one in the above example. Later, 859b7f1d (pathspec:
don't error out on all-exclusionary pathspec patterns, 2017-02-07)
updated the logic to add an empty string as an extra element. The
intention was to let the extra element to match everything and let
the negative ones given by the user to subtract from it.
At around the same time, we were migrating from "an empty string is
a valid pathspec element that matches everything" to "either a dot
or ":/" is used to match all, and an empty string is rejected",
between d426430e (pathspec: warn on empty strings as pathspec,
2016-06-22) and 9e4e8a64 (pathspec: die on empty strings as
pathspec, 2017-06-06). I think 9e4e8a64, which happened long after
859b7f1d happened, was not careful enough to turn the empty string
859b7f1d added to either a dot or ":/".
A care should be taken as the definition of "everything" depends on
subcommand. For the purpose of "add -p", adding a "." to add
everything in the current directory is the right thing to do. But
for some other commands, ":/" (i.e. really really everything, even
things outside the current subdirectory) is the right choice.
We would break commands in a big way if we get this wrong, so add a
handful of test pieces to make sure the resulting code still
excludes the paths that are expected and includes "everything" else.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A typical "git revert" commit uses the full title of the original
commit in its title, and starts its body of the message with:
This reverts commit 8fa7f667cf61386257c00d6e954855cc3215ae91.
This does not encourage the best practice of describing not just
"what" (i.e. "Revert X" on the title says what we did) but "why"
(i.e. and it does not say why X was undesirable).
We can instead phrase this first line of the body to be more like
This reverts commit 8fa7f667 (do this and that, 2022-04-25)
so that the title does not have to be
Revert "do this and that"
We can instead use the title to describe "why" we are reverting the
original commit.
Introduce the "--reference" option to "git revert", and also the
revert.reference configuration variable, which defaults to false, to
tweak the title and the first line of the draft commit message for
when creating a "revert" commit.
When this option is in use, the first line of the pre-filled editor
buffer becomes a comment line that tells the user to say _why_. If
the user exits the editor without touching this line by mistake,
what we prepare to become the first line of the body, i.e. "This
reverts commit 8fa7f667 (do this and that, 2022-04-25)", ends up to
be the title of the resulting commit. This behaviour is designed to
help such a user to identify such a revert in "git log --oneline"
easily so that it can be further reworded with "git rebase -i" later.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the tests that exercise implicit shutdown cases
to make them more robust and less racy.
The fsmonitor--daemon will implicitly shutdown in a variety
of situations, such as when the ".git" directory is deleted
or renamed.
The existing tests would delete or rename the directory, sleep
for one second, and then check the status of the daemon. This
is racy, since the client/status command has no way to sync
with the daemon. This was noticed occasionally on very slow
CI build machines where it would cause a random test to fail.
Replace the simple sleep with a sleep-and-retry loop.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a test in t7527 to verify that we get a stray warning from
`git fsmonitor--daemon start` when indirectly called from
`git submodule absorbgitdirs`.
Update `git fsmonitor--daemon` to take (and ignore) the `--super-prefix`
argument to suppress the warning.
When we have:
1. a submodule with a `sub/.git/` directory (rather than a `sub/.git`
file).
2. `core.fsmonitor` is turned on in the submodule, but the daemon is
not yet started in the submodule.
3. and someone does a `git submodule absorbgitdirs` in the super.
Git will recursively invoke `git submodule--helper absorb-git-dirs`
in the submodule. This will read the index and may attempt to start
the fsmonitor--daemon with the `--super-prefix` argument.
`git fsmonitor--daemon start` does not accept the `--super-prefix`
argument and causes a warning to be issued.
This does not cause a problem because the `refresh_index()` code
assumes a trivial response if the daemon does not start.
The net-net is a harmelss, but stray warning. Lets eliminate the
warning.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Confirm that the daemon reports events using the on-disk
spelling for Unicode NFC/NFD characters. On APFS we still
have Unicode aliasing, so we cannot create two files that
only differ by NFC/NFD, but the on-disk format preserves
the spelling used to create the file. On HFS+ we also
have aliasing, but the path is always stored on disk in
NFD.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a set of prereqs to help understand how file names
are handled by the filesystem when they contain NFC and NFD
Unicode characters.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test that FS events from the OS are received using the preserved,
on-disk spelling of files/directories rather than spelling used
to make the change.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Never set CE_FSMONITOR_VALID on the cache-entry of submodule
directories.
During a client command like 'git status', we may need to recurse
into each submodule to compute a status summary for the submodule.
Since the purpose of the ce_flag is to let Git avoid scanning a
cache-entry, setting the flag causes the recursive call to be
avoided and we report incorrect (no status) for the submodule.
We created an OS watch on the root directory of our working
directory and we receive events for everything in the cone
under it. When submodules are present inside our working
directory, we receive events for both our repo (the super) and
any subs within it. Since our index doesn't have any information
for items within the submodules, we can't use those events.
We could try to truncate the paths of those events back to the
submodule boundary and mark the GITLINK as dirty, but that
feels expensive since we would have to prefix compare every FS
event that we receive against a list of submodule roots. And
it still wouldn't be sufficient to correctly report status on
the submodule, since we don't have any space in the cache-entry
to cache the submodule's status (the 'SCMU' bits in porcelain
V2 speak). That is, the CE_FSMONITOR_VALID bit just says that
we don't need to scan/inspect it because we already know the
answer -- it doesn't say that the item is clean -- and we
don't have space in the cache-entry to store those answers.
So we should always do the recursive scan.
Therefore, we should never set the flag on GITLINK cache-entries.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create unit tests to move a directory. Verify that `git status`
gives the same result with and without FSMonitor enabled.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
VFS for Git virtual repositories are incompatible with FSMonitor.
VFS for Git is a downstream fork of Git. It contains its own custom
file system watcher that is aware of the virtualization. If a working
directory is being managed by VFS for Git, we should not try to watch
it because we may get incomplete results.
We do not know anything about how VFS for Git works, but we do
know that VFS for Git working directories contain a well-defined
config setting. If it is set, mark the working directory as
incompatible.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bare repos do not have a worktree, so there is nothing for the
daemon watch.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a stress test to hammer on the fsmonitor daemon.
Create a client-side thread pool of n threads and have
each of them make m requests as fast as they can.
We do not currently inspect the contents of the response.
We're only interested in placing a heavy request load on
the daemon.
This test is useful for interactive testing and various
experimentation. For example, to place additional load
on the daemon while another test is running. We currently
do not have a test script that actually uses this helper.
We might add such a test in the future.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create some test repos with UTF8 characters in the pathname of the
root directory and verify that the builtin FSMonitor can watch them.
This test is mainly for Windows where we need to avoid `*A()`
routines.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach FSMonitor daemon on Windows to recognize shortname paths as
aliases of normal longname paths. FSMonitor clients, such as `git
status`, should receive the longname spelling of changed files (when
possible).
Sometimes we receive FS events using the shortname, such as when a CMD
shell runs "RENAME GIT~1 FOO" or "RMDIR GIT~1". The FS notification
arrives using whatever combination of long and shortnames were used by
the other process. (Shortnames do seem to be case normalized,
however.)
Use Windows GetLongPathNameW() to try to map the pathname spelling in
the notification event into the normalized longname spelling. (This
can fail if the file/directory is deleted, moved, or renamed, because
we are asking the FS for the mapping in response to the event and
after it has already happened, but we try.)
Special case the shortname spelling of ".git" to avoid under-reporting
these events.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't bother to freshen objects stored in a cruft pack individually
by updating the `.mtimes` file. This is because we can't portably `mmap`
and write into the middle of a file (i.e., to update the mtime of just
one object). Instead, we would have to rewrite the entire `.mtimes` file
which may incur some wasted effort especially if there a lot of cruft
objects and they are freshened infrequently.
Instead, force the freshening code to avoid an optimizing write by
writing out the object loose and letting it pick up a current mtime.
This works because we prefer the mtime of the loose copy of an object
when both a loose and packed one exist (whether or not the packed copy
comes from a cruft pack or not).
This could certainly do with a test and/or be included earlier in this
series/PR, but I want to wait until after I have a chance to clean up
the overly-repetitive nature of the cruft pack tests in general.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose the new `git repack --cruft` mode from `git gc` via a new opt-in
flag. When invoked like `git gc --cruft`, `git gc` will avoid exploding
unreachable objects as loose ones, and instead create a cruft pack and
`.mtimes` file.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using cruft packs, the following race can occur when a geometric
repack that writes a MIDX bitmap takes place afterwords:
- First, create an unreachable object and do an all-into-one cruft
repack which stores that object in the repository's cruft pack.
- Then make that object reachable.
- Finally, do a geometric repack and write a MIDX bitmap.
Assuming that we are sufficiently unlucky as to select a commit from the
MIDX which reaches that object for bitmapping, then the `git
multi-pack-index` process will complain that that object is missing.
The reason is because we don't include cruft packs in the MIDX when
doing a geometric repack. Since the "make that object reachable" doesn't
necessarily mean that we'll create a new copy of that object in one of
the packs that will get rolled up as part of a geometric repack, it's
possible that the MIDX won't see any copies of that now-reachable
object.
Of course, it's desirable to avoid including cruft packs in the MIDX
because it causes the MIDX to store a bunch of objects which are likely
to get thrown away. But excluding that pack does open us up to the above
race.
This patch demonstrates the bug, and resolves it by including cruft
packs in the MIDX even when doing a geometric repack.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In servers which set the pack.window configuration to a large value, we
can wind up spending quite a lot of time finding new bases when breaking
delta chains between reachable and unreachable objects while generating
a cruft pack.
Introduce a handful of `repack.cruft*` configuration variables to
control the parameters used by pack-objects when generating a cruft
pack.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose a way to split the contents of a repository into a main and cruft
pack when doing an all-into-one repack with `git repack --cruft -d`, and
a complementary configuration variable.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a previous patch, pack-objects learned how to generate a cruft pack
so long as no objects are dropped.
This patch teaches pack-objects to handle the case where a non-never
`--cruft-expiration` value is passed. This case is slightly more
complicated than before, because we want pack-objects to save
unreachable objects which would have been pruned when there is another
recent (i.e., non-prunable) unreachable object which reaches the other.
We'll call these objects "unreachable but reachable-from-recent".
Here is how pack-objects handles `--cruft-expiration`:
- Instead of adding all objects outside of the kept pack(s) into the
packing list, only handle the ones whose mtime is within the grace
period.
- Construct a reachability traversal whose tips are the
unreachable-but-recent objects.
- Then, walk along that traversal, stopping if we reach an object in
the kept pack. At each step along the traversal, we add the object
we are visiting to the packing list.
In the majority of these cases, any object we visit in this traversal
will already be in our packing list. But we will sometimes encounter
reachable-from-recent cruft objects, which we want to retain even if
they aged out of the grace period.
The most subtle point of this process is that we actually don't need to
bother to update the rescued object's mtime. Even though we will write
an .mtimes file with a value that is older than the expiration window,
it will continue to survive cruft repacks so long as any objects which
reach it haven't aged out.
That is, a future repack will also exclude that object from the initial
packing list, only to discover it later on when doing the reachability
traversal.
Finally, stopping early once an object is found in a kept pack is safe
to do because the kept packs ordinarily represent which packs will
survive after repacking. Assuming that it _isn't_ safe to halt a
traversal early would mean that there is some ancestor object which is
missing, which implies repository corruption (i.e., the complete set of
reachable objects isn't present).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach `pack-objects` how to generate a cruft pack when no objects are
dropped (i.e., `--cruft-expiration=never`). Later patches will teach
`pack-objects` how to generate a cruft pack that prunes objects.
When generating a cruft pack which does not prune objects, we want to
collect all unreachable objects into a single pack (noting and updating
their mtimes as we accumulate them). Ordinary use will pass the result
of a `git repack -A` as a kept pack, so when this patch says "kept
pack", readers should think "reachable objects".
Generating a non-expiring cruft packs works as follows:
- Callers provide a list of every pack they know about, and indicate
which packs are about to be removed.
- All packs which are going to be removed (we'll call these the
redundant ones) are marked as kept in-core.
Any packs the caller did not mention (but are known to the
`pack-objects` process) are also marked as kept in-core. Packs not
mentioned by the caller are assumed to be unknown to them, i.e.,
they entered the repository after the caller decided which packs
should be kept and which should be discarded.
Since we do not want to include objects in these "unknown" packs
(because we don't know which of their objects are or aren't
reachable), these are also marked as kept in-core.
- Then, we enumerate all objects in the repository, and add them to
our packing list if they do not appear in an in-core kept pack.
This results in a new cruft pack which contains all known objects that
aren't included in the kept packs. When the kept pack is the result of
`git repack -A`, the resulting pack contains all unreachable objects.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, we will implement and test support for writing a
cruft pack via a special mode of `git pack-objects`. To make sure that
objects are written with the correct timestamps, and a new test-tool
that can dump the object names and corresponding timestamps from a given
`.mtimes` file.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote -v" now shows the list-objects-filter used during
fetching from the remote, if available.
* ac/remote-v-with-object-list-filters:
builtin/remote.c: teach `-v` to list filters for promisor remotes
With a recent update to refuse access to repositories of other
people by default, "sudo make install" and "sudo git describe"
stopped working. This series intends to loosen it while keeping
the safety.
* cb/path-owner-check-with-sudo:
t0034: add negative tests and allow git init to mostly work under sudo
git-compat-util: avoid failing dir ownership checks if running privileged
t: regression git needs safe.directory when using sudo
"git -c branch.autosetupmerge=simple branch $A $B" will set the $B
as $A's upstream only when $A and $B shares the same name, and "git
-c push.default=simple" on branch $A would push to update the
branch $A at the remote $B came from. Also more places use the
sole remote, if exists, before defaulting to 'origin'.
* tk/simple-autosetupmerge:
push: new config option "push.autoSetupRemote" supports "simple" push
push: default to single remote even when not named origin
branch: new autosetupmerge option 'simple' for matching branches
"git fetch --recurse-submodules" from multiple remotes (either from
a remote group, or "--all") used to make one extra "git fetch" in
the submodules, which has been corrected.
* jc/avoid-redundant-submodule-fetch:
fetch: do not run a redundant fetch from submodule
The "--current" option of "git show-branch" should have been made
incompatible with the "--reflog" mode, but this was not enforced,
which has been corrected.
* jc/show-branch-g-current:
show-branch: -g and --current are incompatible
A bit of test framework fixes with a few fixes to issues found by
valgrind.
* ab/valgrind-fixes:
commit-graph.c: don't assume that stat() succeeds
object-file: fix a unpack_loose_header() regression in 3b6a8db3b0
log test: skip a failing mkstemp() test under valgrind
tests: using custom GIT_EXEC_PATH breaks --valgrind tests
When modifying the sparse-checkout definition, the sparse-checkout
builtin calls update_sparsity() to modify the SKIP_WORKTREE bits of all
cache entries in the index. Before, we needed the index to be fully
expanded in order to ensure we had the full list of files necessary that
match the new patterns.
Insert a call to reset_sparse_directories() that expands sparse
directories that are within the new pattern list, but only far enough
that every necessary file path now exists as a cache entry. The
remaining logic within update_sparsity() will modify the SKIP_WORKTREE
bits appropriately.
This allows us to disable command_requires_full_index within the
sparse-checkout builtin. Add tests that demonstrate that we are not
expanding to a full index unnecessarily.
We can see the improved performance in the p2000 test script:
Test HEAD~1 HEAD
------------------------------------------------------------------------
2000.24: git ... (sparse-v3) 2.14(1.55+0.58) 1.57(1.03+0.53) -26.6%
2000.25: git ... (sparse-v4) 2.20(1.62+0.57) 1.58(0.98+0.59) -28.2%
These reductions of 26-28% are small compared to most examples, but the
time is dominated by writing a new copy of the base repository to the
worktree and then deleting it again. The fact that the previous index
expansion was such a large portion of the time is telling how important
it is to complete this sparse index integration.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sparse-checkout builtin is almost completely integrated with the
sparse index, allowing the sparse-checkout boundary to be modified
without expanding a sparse index to a full one. Add a test to
p2000-sparse-operations.sh that adds a directory to the sparse-checkout
definition, then removes it. Using both operations is important to
ensure that the operation is doing the same work in each repetition as
well as leaving the test repo in a good state for later tests.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'sparse-index contents' test checks that the sparse index has the
correct set of sparse directories in the index after modifying the cone
mode patterns using 'git sparse-checkout set'. Add to the coverage here
by adding more complicated scenarios that were not previously tested.
In order to check paths that do not exist at HEAD, we need to modify the
test_sparse_checkout_set helper slightly:
1. Add the --skip-checks argument to the 'set' command to avoid failures
when passing paths that do not exist at HEAD.
2. When looking for the non-existence of sparse directories for the
paths in $CONE_DIRS, allow the rev-list command to fail because the
path does not exist at HEAD.
This allows us to add some interesting test cases.
Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before expanding this test with more involved cases, first extract the
repeated logic into a new test_sparse_checkout_set helper. This helper
checks that 'git sparse-checkout set ...' succeeds and then verifies
that certain directories have sparse directory entries in the sparse
index. It also verifies that the in-cone directories are _not_ sparse
directory entries in the sparse index.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to call that function already before printing the final verdict.
However, now that we added grouping to the GitHub workflow output, we
will want to include even that part in the collapsible group for that
test case.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to mark up the test case preamble when presenting test output in
Git's GitHub workflow. Let's suppress the non-marked-up version in that
case. Any information it would contain is included in the marked-up
variant already.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In most instances, looking at the log of failed test cases is enough to
identify the problem.
In some (rare?) instances, a previous test case that was marked as
successful actually has information pertaining to a later test case that
fails.
To allow the page to load relatively quickly, let's only show the logs
of the failed test cases to be shown. The full logs are available for
download as artifacts, should a deeper investigation become necessary.
Co-authored-by: Victoria Dye <vdye@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A couple of commands exist to spruce up the output in GitHub workflows:
https://docs.github.com/en/actions/learn-github-actions/workflow-commands-for-github-actions
In addition to the `::group::<label>`/`::endgroup::` commands (which we
already use to structure the output of the build step better), we also
use `::error::`/`::notice::` to draw the attention to test failures and
to test cases that were expected to fail but didn't.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the test case's output, we do want newline characters, but in the XML
attributes we do not want them.
However, the `xml_attr_encode` function always adds a Line Feed at the
end (which are then encoded as `
`, even for XML attributes.
This seems not to faze Azure Pipelines' XML parser, but it still is
incorrect, so let's fix it.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code writing JUnit XML is interspersed directly with all the code in
`t/test-lib.sh`, and it is therefore not only ill-separated, but
introducing yet another output format would make the situation even
worse.
Let's introduce an abstraction layer by hiding the JUnit XML code behind
four new functions that are supposed to be called before and after each
test and test case.
This is not just an academic exercise, refactoring for refactoring's
sake. We _actually_ want to introduce such a new output format, to
make it substantially easier to diagnose test failures in our GitHub
workflow, therefore we do need this refactoring.
This commit is best viewed with `git show --color-moved
--color-moved-ws=allow-indentation-change <commit>`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git show :<path>" learned to work better with the sparse-index
feature.
* ds/sparse-colon-path:
rev-parse: integrate with sparse index
object-name: diagnose trees in index properly
object-name: reject trees found in the index
show: integrate with the sparse index
t1092: add compatibility tests for 'git show'
Teach "git stash" to work better with sparse index entries.
* vd/sparse-stash:
unpack-trees: preserve index sparsity
stash: apply stash using 'merge_ort_nonrecursive()'
read-cache: set sparsity when index is new
sparse-index: expose 'is_sparse_index_allowed()'
stash: integrate with sparse index
stash: expand sparse-checkout compatibility testing
"git bisect" was too silent before it is ready to start computing
the actual bisection, which has been corrected.
* cd/bisect-messages-from-pre-flight-states:
bisect: output bisect setup status in bisect log
bisect: output state before we are ready to compute bisection
"git pull" without "--recurse-submodules=<arg>" made
submodule.recurse take precedence over fetch.recurseSubmodules by
mistake, which has been corrected.
* gc/pull-recurse-submodules:
pull: do not let submodule.recurse override fetch.recurseSubmodules
"git log --since=X" will stop traversal upon seeing a commit that
is older than X, but there may be commits behind it that is younger
than X when the commit was created with a faulty clock. A new
option is added to keep digging without stopping, and instead
filter out commits with timestamp older than X.
* mv/log-since-as-filter:
log: "--since-as-filter" option is a non-terminating "--since" variant
The temporary files fed to external diff command are now generated
inside a new temporary directory under the same basename.
* rs/external-diff-tempfile:
diff: use mks_tempfile_dt()
tempfile: add mks_tempfile_dt()
New tests for the safe.directory mechanism.
* sg/safe-directory-tests-and-docs:
safe.directory: document and check that it's ignored in the environment
t0033-safe-directory: check when 'safe.directory' is ignored
t0033-safe-directory: check the error message without matching the trash dir
The previous patch demonstrates a scenario where the list of packs
written by `pack-objects` (and stored in the `names` string_list) is
out-of-order, and can thus cause us to delete packs we shouldn't.
This patch resolves that bug by ensuring that `names` is sorted in all
cases, not just when
delete_redundant && pack_everything & ALL_INTO_ONE
is true.
Because we did sort `names` in that case (which, prior to `--geometric`
repacks, was the only time we would actually delete packs, this is only
a bug for `--geometric` repacks.
It would be sufficient to only sort `names` when `delete_redundant` is
set to a non-zero value. But sorting a small list of strings is cheap,
and it is defensive against future calls to `string_list_has_string()`
on this list.
Co-discovered-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When doing a `--geometric=<d>` repack, `git repack` determines a
splitting point among packs ordered by their object count such that:
- each pack above the split has at least `<d>` times as many objects
as the next-largest pack by object count, and
- the first pack above the split has at least `<d>` times as many
object as the sum of all packs below the split line combined
`git repack` then creates a pack containing all of the objects contained
in packs below the split line by running `git pack-objects
--stdin-packs` underneath. Once packs are moved into place, then any
packs below the split line are removed, since their objects were just
combined into a new pack.
But `git repack` tries to be careful to avoid removing a pack that it
just wrote, by checking:
struct packed_git *p = geometry->pack[i];
if (string_list_has_string(&names, hash_to_hex(p->hash)))
continue;
in the `delete_redundant` and `geometric` conditional towards the end of
`cmd_repack`.
But it's possible to trick `git repack` into not recognizing a pack that
it just wrote when `names` is out-of-order (which violates
`string_list_has_string()`'s assumption that the list is sorted and thus
binary search-able).
When this happens in just the right circumstances, it is possible to
remove a pack that we just wrote, leading to object corruption.
Luckily, this is quite difficult to provoke in practice (for a couple of
reasons):
- we ordinarily write just one pack, so `names` usually contains just
one entry, and is thus sorted
- when we do write more than one pack (e.g., due to `--max-pack-size`)
we have to: (a) write a pack identical to one that already
exists, (b) have that pack be below the split line, and (c) have
the set of packs written by `pack-objects` occur in an order which
tricks `string_list_has_string()`.
Demonstrate the above scenario in a failing test, which causes `git
repack --geometric` to write a pack which occurs below the split line,
_and_ fail to recognize that it wrote that pack.
The following patch will fix this bug.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'repack' to ignore packs named on the command line with the
'--keep-pack' option. Specifically, modify 'init_pack_geometry()' to treat
command line-kept packs the same way it treats packs with an on-disk '.keep'
file (that is, skip the pack and do not include it in the 'geometry'
structure).
Without this handling, a '--keep-pack' pack would be included in the
'geometry' structure. If the pack is *before* the geometry split line (with
at least one other pack and/or loose objects present), 'repack' assumes the
pack's contents are "rolled up" into another pack via 'pack-objects'.
However, because the internally-invoked 'pack-objects' properly excludes
'--keep-pack' objects, any new pack it creates will not contain the kept
objects. Finally, 'repack' deletes the '--keep-pack' as "redundant" (since
it assumes 'pack-objects' created a new pack with its contents), resulting
in possible object loss and repository corruption.
Add a test ensuring that '--keep-pack' packs are now appropriately handled.
Co-authored-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We do make sure that an attempt to merge with various forms of local
changes will "fail", but the point of stopping the merge is so that
we refrain from discarding uncommitted local changes that could be
precious. Add a few more checks for each case to make sure the
local changes are left intact.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 7dce19d3 (fetch/pull: Add the --recurse-submodules option,
2010-11-12) introduced the "--recurse-submodule" option, the
approach taken was to perform fetches in submodules only once, after
all the main fetching (it may usually be a fetch from a single
remote, but it could be fetching from a group of remotes using
fetch_multiple()) succeeded. Later we added "--all" to fetch from
all defined remotes, which complicated things even more.
If your project has a submodule, and you try to run "git fetch
--recurse-submodule --all", you'd see a fetch for the top-level,
which invokes another fetch for the submodule, followed by another
fetch for the same submodule. All but the last fetch for the
submodule come from a "git fetch --recurse-submodules" subprocess
that is spawned via the fetch_multiple() interface for the remotes,
and the last fetch comes from the code at the end.
Because recursive fetching from submodules is done in each fetch for
the top-level in fetch_multiple(), the last fetch in the submodule
is redundant. It only matters when fetch_one() interacts with a
single remote at the top-level.
While we are at it, there is one optimization that exists in dealing
with a group of remote, but is missing when "--all" is used. In the
former, when the group turns out to be a group of one, instead of
spawning "git fetch" as a subprocess via the fetch_multiple()
interface, we use the normal fetch_one() code path. Do the same
when handing "--all", if it turns out that we have only one remote
defined.
Reviewed-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Libcurl has a CURLOPT_RESOLVE easy option that allows
the result of hostname resolution in the following
format to be passed:
[+]HOST:PORT:ADDRESS[,ADDRESS]
This way, redirects and everything operating against the
HOST+PORT will use the provided ADDRESS(s).
The following format is also allowed to stop using
hostname resolutions that have already been passed:
-HOST:PORT
See https://curl.se/libcurl/c/CURLOPT_RESOLVE.html for
more details.
Let's add a corresponding "http.curloptResolve" config
option that takes advantage of CURLOPT_RESOLVE.
Each value configured for the "http.curloptResolve" key
is passed "as is" to libcurl through CURLOPT_RESOLVE, so
it should be in one of the above 2 formats. This keeps
the implementation simple and makes us consistent with
libcurl's CURLOPT_RESOLVE, and with curl's corresponding
`--resolve` command line option.
The implementation uses CURLOPT_RESOLVE only in
get_active_slot() which is called by all the HTTP
request sending functions.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a support library that provides one function that can be used
to run a "scriplet" of commands through sudo and that helps invoking
sudo in the slightly awkward way that is required to ensure it doesn't
block the call (if shell was allowed as tested in the prerequisite)
and it doesn't run the command through a different shell than the one
we intended.
Add additional negative tests as suggested by Junio and that use a
new workspace that is owned by root.
Document a regression that was introduced by previous commits where
root won't be able anymore to access directories they own unless
SUDO_UID is removed from their environment.
The tests document additional ways that this new restriction could
be worked around and the documentation explains why it might be instead
considered a feature, but a "fix" is planned for a future change.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
bdc77d1d68 (Add a function to determine whether a path is owned by the
current user, 2022-03-02) checks for the effective uid of the running
process using geteuid() but didn't account for cases where that user was
root (because git was invoked through sudo or a compatible tool) and the
original uid that repository trusted for its config was no longer known,
therefore failing the following otherwise safe call:
guy@renard ~/Software/uncrustify $ sudo git describe --always --dirty
[sudo] password for guy:
fatal: unsafe repository ('/home/guy/Software/uncrustify' is owned by someone else)
Attempt to detect those cases by using the environment variables that
those tools create to keep track of the original user id, and do the
ownership check using that instead.
This assumes the environment the user is running on after going
privileged can't be tampered with, and also adds code to restrict that
the new behavior only applies if running as root, therefore keeping the
most common case, which runs unprivileged, from changing, but because of
that, it will miss cases where sudo (or an equivalent) was used to change
to another unprivileged user or where the equivalent tool used to raise
privileges didn't track the original id in a sudo compatible way.
Because of compatibility with sudo, the code assumes that uid_t is an
unsigned integer type (which is not required by the standard) but is used
that way in their codebase to generate SUDO_UID. In systems where uid_t
is signed, sudo might be also patched to NOT be unsigned and that might
be able to trigger an edge case and a bug (as described in the code), but
it is considered unlikely to happen and even if it does, the code would
just mostly fail safely, so there was no attempt either to detect it or
prevent it by the code, which is something that might change in the future,
based on expected user feedback.
Reported-by: Guy Maurel <guy.j@maurel.de>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Randall Becker <rsbecker@nexbridge.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Originally reported after release of v2.35.2 (and other maint branches)
for CVE-2022-24765 and blocking otherwise harmless commands that were
done using sudo in a repository that was owned by the user.
Add a new test script with very basic support to allow running git
commands through sudo, so a reproduction could be implemented and that
uses only `git status` as a proxy of the issue reported.
Note that because of the way sudo interacts with the system, a much
more complete integration with the test framework will require a lot
more work and that was therefore intentionally punted for now.
The current implementation requires the execution of a special cleanup
function which should always be kept as the last "test" or otherwise
the standard cleanup functions will fail because they can't remove
the root owned directories that are used. This also means that if
failures are found while running, the specifics of the failure might
not be kept for further debugging and if the test was interrupted, it
will be necessary to clean the working directory manually before
restarting by running:
$ sudo rm -rf trash\ directory.t0034-root-safe-directory/
The test file also uses at least one initial "setup" test that creates
a parallel execution directory under the "root" sub directory, which
should be used as top level directory for all repositories that are
used in this test file. Unlike all other tests the repository provided
by the test framework should go unused.
Special care should be taken when invoking commands through sudo, since
the environment is otherwise independent from what the test framework
setup and might have changed the values for HOME, SHELL and dropped
several relevant environment variables for your test. Indeed `git status`
was used as a proxy because it doesn't even require commits in the
repository to work and usually doesn't require much from the environment
to run, but a future patch will add calls to `git init` and that will
fail to honor the default branch name, unless that setting is NOT
provided through an environment variable (which means even a CI run
could fail that test if enabled incorrectly).
A new SUDO prerequisite is provided that does some sanity checking
to make sure the sudo command that will be used allows for passwordless
execution as root without restrictions and doesn't mess with git's
execution path. This matches what is provided by the macOS agents that
are used as part of GitHub actions and probably nowhere else.
Most of those characteristics make this test mostly only suitable for
CI, but it might be executed locally if special care is taken to provide
for all of them in the local configuration and maybe making use of the
sudo credential cache by first invoking sudo, entering your password if
needed, and then invoking the test with:
$ GIT_TEST_ALLOW_SUDO=YES ./t0034-root-safe-directory.sh
If it fails to run, then it means your local setup wouldn't work for the
test because of the configuration sudo has or other system settings, and
things that might help are to comment out sudo's secure_path config, and
make sure that the account you are using has no restrictions on the
commands it can run through sudo, just like is provided for the user in
the CI.
For example (assuming a username of marta for you) something probably
similar to the following entry in your /etc/sudoers (or equivalent) file:
marta ALL=(ALL:ALL) NOPASSWD: ALL
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 4c28e4ada0 (commit: die before asking to edit the log
message, 2010-12-20), we have been "leaking" the "author_ident" when
prepare_to_commit() fails. Instead of returning from right there,
introduce an exit status variable and jump to the clean-up label
at the end.
Instead of explicitly releasing the resource with strbuf_release(),
mark the variable with UNLEAK() at the end, together with two other
variables that are already marked as such. If this were in a
utility function that is called number of times, but these are
different, we should explicitly release resources that grow
proportionally to the size of the problem being solved, but
cmd_commit() is like main() and there is no point in spending extra
cycles to release individual pieces of resource at the end, just
before process exit will clean everything for us for free anyway.
This fixes a leak demonstrated by e.g. "t3505-cherry-pick-empty.sh",
but unfortunately we cannot mark it or other affected tests as passing
now with "TEST_PASSES_SANITIZE_LEAK=true" as we'll need to fix many
other memory leaks before doing so.
Incidentally there are two tests that always passes the leak checker
with or without this change. Mark them as such.
This is based on an earlier patch by Ævar, but takes a different
approach that is more maintainable.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in my 3b6a8db3b0 (object-file.c: use "enum" return
type for unpack_loose_header(), 2021-10-01) revealed both by running
the test suite with --valgrind, and with the amended "git fsck" test.
In practice this regression in v2.34.0 caused us to claim that we
couldn't parse the header, as opposed to not being able to unpack
it. Before the change in the C code the test_cmp added here would emit:
-error: unable to unpack header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
+error: unable to parse header of ./objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
I.e. we'd proceed to call parse_loose_header() on the uninitialized
"hdr" value, and it would have been very unlikely for that
uninitialized memory to be a valid git object.
The other callers of unpack_loose_header() were already checking the
enum values exhaustively. See 3b6a8db3b0 and
5848fb11ac (object-file.c: return ULHR_TOO_LONG on "header too long",
2021-10-01).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Skip a test added in f1e3df3169 (t: increase test coverage of
signature verification output, 2020-03-04) when running under
valgrind. Due to valgrind's interception of mkstemp() this test will
fail with:
+ pwd
+ TMPDIR=[...]/t/trash directory.t4202-log/bogus git log --show-signature -n1 plain-fail
==7696== VG_(mkstemp): failed to create temp file: [...]/t/trash directory.t4202-log/bogus/valgrind_proc_7696_cmdline_d545ddcf
[... 10 more similar lines omitted ..]
valgrind: Startup or configuration error:
valgrind: Can't create client cmdline file in [...]/t/trash directory.t4202-log/bogus/valgrind_proc_7696_cmdline_6e542d1d
valgrind: Unable to start up properly. Giving up.
error: last command exited with $?=1
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression in b7d11a0f5d (tests: exercise the RUNTIME_PREFIX
feature, 2021-07-24) where tests that want to set up and test a "git"
wrapper in $PATH conflicted with the t/bin/valgrind wrapper(s) doing
the same.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a bug in "git pull" where `submodule.recurse` is preferred over
`fetch.recurseSubmodules` when performing a fetch
(Documentation/config/fetch.txt says that `fetch.recurseSubmodules`
should be preferred.). Do this by passing the value of the
"--recurse-submodules" CLI option to the underlying fetch, instead of
passing a value that combines the CLI option and config variables.
In other words, this bug occurred because builtin/pull.c is conflating
two similar-sounding, but different concepts:
- Whether "git pull" itself should care about submodules e.g. whether it
should update the submodule worktrees after performing a merge.
- The value of "--recurse-submodules" to pass to the underlying "git
fetch".
Thus, when `submodule.recurse` is set, the underlying "git fetch" gets
invoked with "--recurse-submodules[=value]", overriding the value of
`fetch.recurseSubmodules`.
An alternative (and more obvious) approach to fix the bug would be to
teach "git pull" to understand `fetch.recurseSubmodules`, but the
proposed solution works better because:
- We don't maintain two identical config-parsing implementions in "git
pull" and "git fetch".
- It works better with other commands invoked by "git pull" e.g. "git
merge" won't accidentally respect `fetch.recurseSubmodules`.
Reported-by: Huang Zou <huang.zou@schrodinger.com>
Helped-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit summary shown after making a commit is matched to what
is given in "git status" not to use the break-rewrite heuristics.
* rs/commit-summary-wo-break-rewrite:
commit, sequencer: turn off break_opt for commit summary
Avoid problems from interaction between malloc_check and address
sanitizer.
* pw/test-malloc-with-sanitize-address:
tests: make SANITIZE=address imply TEST_NO_MALLOC_CHECK
"git rebase --keep-base <upstream> <branch-to-rebase>" computed the
commit to rebase onto incorrectly, which has been corrected.
* ah/rebase-keep-base-fix:
rebase: use correct base for --keep-base when a branch is given
This allows seeing the current intermediate status without adding a new
good or bad commit:
$ git bisect log | tail -1
# status: waiting for bad commit, 1 good commit known
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 73c6de06af ("bisect: don't use invalid oid as rev when
starting") changes the behaviour of `git bisect` to consider invalid
oids as pathspecs again, as in the old shell implementation.
While that behaviour may be desirable, it can also cause confusion. For
example, while bisecting in a particular repo I encountered this:
$ git bisect start d93ff48803f0 v6.3
$
...which led to me sitting for a few moments, wondering why there's no
printout stating the first rev to check.
It turns out that the tag was actually "6.3", not "v6.3", and thus the
bisect was still silently started with only a bad rev, because
d93ff48803f0 was a valid oid and "v6.3" was silently considered to be a
pathspec.
While this behaviour may be desirable, it can be confusing, especially
with different repo conventions either using or not using "v" before
release names, or when a branch name or tag is simply misspelled on the
command line.
In order to avoid situations like this, make it more clear what we're
waiting for:
$ git bisect start d93ff48803f0 v6.3
status: waiting for good commit(s), bad commit known
We already have good output once the bisect process has begun in
earnest, so we don't need to do anything more there.
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The progress meter of "git blame" was showing incorrect numbers
when processing only parts of the file.
* ea/progress-partial-blame:
blame: report correct number of lines in progress when using ranges
Reimplement "vimdiff[123]" mergetool drivers with a more generic
layout mechanism.
* fr/vimdiff-layout:
mergetools: add description to all diff/merge tools
vimdiff: add tool documentation
vimdiff: integrate layout tests in the unit tests framework ('t' folder)
vimdiff: new implementation with layout support
The performance of the "untracked cache" feature has been improved
when "--untracked-files=<mode>" and "status.showUntrackedFiles"
are combined.
* tk/untracked-cache-with-uall:
untracked-cache: support '--untracked-files=all' if configured
untracked-cache: test untracked-cache-bypassing behavior with -uall
When unpacking trees, set the default sparsity of the resultant index based
on repo settings and 'is_sparse_index_allowed()'.
Normally, when executing 'unpack_trees', the output index is marked sparse
when (and only when) it unpacks a sparse directory. However, an index may be
"sparse" even if it contains no sparse directories - when all files fall
inside the sparse-checkout definition or otherwise have SKIP_WORKTREE
disabled. Therefore, the output index may be marked "full" even when it is
"sparse", resulting in unnecessary 'ensure_full_index' calls when writing to
disk. Avoid this by setting the "default" index sparsity to match what is
expected for the repository.
As a consequence of this fix, the (non-merge) 'read-tree' performed when
applying a stash with untracked files no longer expands the index. Update
the corresponding test in 't1092'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'stash' to use 'merge_ort_nonrecursive()' to apply a stash to the
current working tree. When 'git stash apply' was converted from its shell
script implementation to a builtin in 8a0fc8d19d (stash: convert apply to
builtin, 2019-02-25), 'merge_recursive_generic()' was used to merge a stash
into the working tree as part of 'git stash (apply|pop)'. However, with the
single merge base used in 'do_apply_stash()', the commit wrapping done by
'merge_recursive_generic()' is not only unnecessary, but misleading (the
*real* merge base is labeled "constructed merge base"). Therefore, a
non-recursive merge of the working tree, stashed tree, and stash base tree
is more appropriate.
There are two options for a non-recursive merge-then-update-worktree
function: 'merge_trees()' and 'merge_ort_nonrecursive()'. Use
'merge_ort_nonrecursive()' to align with the default merge strategy used by
'git merge' (6a5fb96672 (Change default merge backend from recursive to ort,
2021-08-04)) and, because merge-ort does not operate in-place on the index,
avoid unnecessary index expansion. Update tests in 't1092' verifying index
expansion for 'git stash' accordingly.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the index read in 'do_read_index()' does not exist on-disk, mark the
index "sparse" if the executing command does not require a full index and
sparse index is otherwise enabled.
Some commands (such as 'git stash -u') implicitly create a new index (when
the 'GIT_INDEX_FILE' variable points to a non-existent file) and perform
some operation on it. However, when this index is created, it isn't created
with the same sparsity settings as the repo index. As a result, while these
indexes may be sparse during the operation, they are always expanded before
being written to disk. We can avoid that expansion by defaulting the index
to "sparse", in which case it will only be expanded if the full index is
needed.
Note that the function 'set_new_index_sparsity()' is created despite having
only a single caller because additional callers will be added in a
subsequent patch.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable sparse index in 'git stash' by disabling
'command_requires_full_index'.
With sparse index enabled, some subcommands of 'stash' work without
expanding the index, e.g., 'git stash', 'git stash list', 'git stash drop',
etc. Others ensure the index is expanded either directly (as in the case of
'git stash [pop|apply]', where the call to 'merge_recursive_generic()' in
'do_apply_stash()' triggers the expansion), or in a command called
internally by stash (e.g., 'git update-index' in 'git stash -u'). So, in
addition to enabling sparse index, add tests to 't1092' demonstrating which
variants of 'git stash' expand the index, and which do not.
Finally, add the option to skip writing 'untracked.txt' in
'ensure_not_expanded', and use that option to successfully apply stashed
untracked files without a conflict in 'untracked.txt'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests verifying expected 'git stash' behavior in
't1092-sparse-checkout-compatibility'. These cases establish the expected
behavior of 'git stash' in a sparse-checkout and verify consistency both
with and without a sparse index. Although no sparse index compatibility has
been integrated into 'git stash' yet, the tests are all 'expect_success' -
we don't want the cone-mode sparse-checkout behavior to change depending on
whether it is using a sparse index or not. Therefore, we expect these tests
to continue passing once sparse index is integrated with 'git stash'.
Additionally, add performance test cases for 'git stash' both with and
without untracked files. Note that, unlike the other tests in
'p2000-sparse-operations.sh', the tests added for 'stash' are combination
operations. This is done to ensure the stash/unstash is not blocked by the
modification of '$SPARSE_CONE/a' performed as part of 'test_perf_on_all'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git remote -v` (`--verbose`) lists down the names of remotes along with
their URLs. It would be beneficial for users to also specify the filter
types for promisor remotes. Something like this -
origin remote-url (fetch) [blob:none]
origin remote-url (push)
Teach `git remote -v` to also specify the filters for promisor remotes.
Closes: https://github.com/gitgitgadget/git/issues/1211
Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch <args> -- <pathspec>" lost the pathspec when
showing the second and subsequent commits, which has been
corrected.
source: <c36896a1-6247-123b-4fa3-b7eb24af1897@web.de>
* rs/format-patch-pathspec-fix:
2.36 format-patch regression fix
"git fast-export -- <pathspec>" lost the pathspec when showing the
second and subsequent commits, which has been corrected.
source: <2c988c7b-0efe-4222-4a43-8124fe1a9da6@web.de>
* rs/fast-export-pathspec-fix:
2.36 fast-export regression fix
"git show <commit1> <commit2>... -- <pathspec>" lost the pathspec
when showing the second and subsequent commits, which has been
corrected.
source: <xmqqo80j87g0.fsf_-_@gitster.g>
* jc/show-pathspec-fix:
2.36 show regression fix
"diff-tree --stdin" has been broken for about a year, but 2.36
release broke it even worse by breaking running the command with
<pathspec>, which in turn broke "gitk" and got noticed. This has
been corrected by aligning its behaviour to that of "log".
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <xmqq7d7bsu2n.fsf@gitster.g>
* jc/diff-tree-stdin-fix:
2.36 gitk/diff-tree --stdin regression fix
"git submodule update" without pathspec should silently skip an
uninitialized submodule, but it started to become noisy by mistake.
This fixes a regression in 2.36 and is slate to go to 2.36.1
source: <pull.1258.v2.git.git.1650890741430.gitgitgadget@gmail.com>
* gc/submodule-update-part2:
submodule--helper: fix initialization of warn_if_uninitialized
git-p4 is designed to run correctly under python2.7 and python3, but
its functional behavior wrt importing user-entered text differs across
these environments:
Under python2, git-p4 "naively" writes the Perforce bytestream into git
metadata (and does not set an "encoding" header on the commits); this
means that any non-utf-8 byte sequences end up creating invalidly-encoded
commit metadata in git.
Under python3, git-p4 attempts to decode the Perforce bytestream as utf-8
data, and fails badly (with an unhelpful error) when non-utf-8 data is
encountered.
Perforce clients (especially p4v) encourage user entry of changelist
descriptions (and user full names) in OS-local encoding, and store the
resulting bytestream to the server unmodified - such that different
clients can end up creating mutually-unintelligible messages. The most
common inconsistency, in many Perforce environments, is likely to be utf-8
(typical in linux) vs cp-1252 (typical in windows).
Make the changelist-description- and user-fullname-handling code
python-runtime-agnostic, introducing three "strategies" selectable via
config:
- 'passthrough', behaving as previously under python2,
- 'strict', behaving as previously under python3, and
- 'fallback', favoring utf-8 but supporting a secondary encoding when
utf-8 decoding fails, and finally escaping high-range bytes if the
decoding with the secondary encoding also fails.
Keep the python2 default behavior as-is ('legacy' strategy), but switch
the python3 default strategy to 'fallback' with default fallback encoding
'cp1252'.
Also include tests exercising these encoding strategies, documentation for
the new config, and improve the user-facing error messages when decoding
does fail.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git format-patch <args> -- <pathspec>" lost the pathspec when
showing the second and subsequent commits, which has been
corrected.
* rs/format-patch-pathspec-fix:
2.36 format-patch regression fix
"git fast-export -- <pathspec>" lost the pathspec when showing the
second and subsequent commits, which has been corrected.
* rs/fast-export-pathspec-fix:
2.36 fast-export regression fix
"git show <commit1> <commit2>... -- <pathspec>" lost the pathspec
when showing the second and subsequent commits, which has been
corrected.
* jc/show-pathspec-fix:
2.36 show regression fix
e900d494dc (diff: add an API for deferred freeing, 2021-02-11) added a
way to allow reusing diffopts: the no_free bit. 244c27242f (diff.[ch]:
have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made
that mechanism mandatory.
git fast-export doesn't set no_free, so path limiting stopped working
after the first commit. Set the flag and add a basic test to make sure
only changes to the specified files are exported.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
e900d494dc (diff: add an API for deferred freeing, 2021-02-11) added a
way to allow reusing diffopts: the no_free bit. 244c27242f (diff.[ch]:
have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made
that mechanism mandatory.
git format-patch only sets no_free when --output is given, causing it to
forget pathspecs after the first commit. Set no_free unconditionally
instead.
The existing test was unable to detect this breakage because it checks
stderr for the absence of a certain string, but format-patch writes to
stdout. Also the test was not checking the case of one commit modifying
multiple files and a pathspec limiting the diff. Replace it with a more
thorough one.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This only surfaced as a regression after 2.36 release, but the
breakage was already there with us for at least a year.
e900d494 (diff: add an API for deferred freeing, 2021-02-11)
introduced a mechanism to delay freeing resources held in
diff_options struct that need to be kept as long as the struct will
be reused to compute diff. "git log -p" was taught to utilize the
mechanism but it was done with an incorrect assumption that the
underlying helper function, cmd_log_walk(), is called only once,
and it is OK to do the freeing at the end of it.
Alas, for "git show A B", the function is called once for each
commit given, so it is not OK to free the resources until we finish
calling it for all the commits given from the command line.
During 2.36 release cycle, we started clearing the <pathspec> as
part of this freeing, which made the bug a lot more visible.
Fix this breakage by tweaking how cmd_log_walk() frees the resources
at the end and using a variant of it that does not immediately free
the resources to show each commit object from the command line in
"git show".
Protect the fix with a few new tests.
Reported-by: Daniel Li <dan@danielyli.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In some "simple" centralized workflows, users expect remote tracking
branch names to match local branch names. "git push" pushes to the
remote version/instance of the branch, and "git pull" pulls any changes
to the remote branch (changes made by the same user in another place, or
by other users).
This expectation is supported by the push.default default option "simple"
which refuses a default push for a mismatching tracking branch name, and
by the new branch.autosetupmerge option, "simple", which only sets up
remote tracking for same-name remote branches.
When a new branch has been created by the user and has not yet been
pushed (and push.default is not set to "current"), the user is prompted
with a "The current branch %s has no upstream branch" error, and
instructions on how to push and add tracking.
This error is helpful in that following the advice once per branch
"resolves" the issue for that branch forever, but inconvenient in that
for the "simple" centralized workflow, this is always the right thing to
do, so it would be better to just do it.
Support this workflow with a new config setting, push.autoSetupRemote,
which will cause a default push, when there is no remote tracking branch
configured, to push to the same-name on the remote and --set-upstream.
Also add a hint offering this new option when the "The current branch %s
has no upstream branch" error is encountered, and add corresponding tests.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With "push.default=current" configured, a simple "git push" will push to
the same-name branch on the current branch's branch.<name>.pushRemote, or
remote.pushDefault, or origin. If none of these are defined, the push will
fail with error "fatal: No configured push destination".
The same "default to origin if no config" behavior applies with
"push.default=matching".
Other commands use "origin" as a default when there are multiple options,
but default to the single remote when there is only one - for example,
"git checkout <something>". This "assume the single remote if there is
only one" behavior is more friendly/useful than a defaulting behavior
that only uses the name "origin" no matter what.
Update "git push" to also default to the single remote (and finally fall
back to "origin" as default if there are several), for
"push.default=current" and for other current and future remote-defaulting
push behaviors.
This change also modifies the behavior of ls-remote in a consistent way,
so defaulting not only supplies 'origin', but any single configured remote
also.
Document the change in behavior, correct incorrect assumptions in related
tests, and add test cases reflecting this new single-remote-defaulting
behavior.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the default push.default option, "simple", beginners are
protected from accidentally pushing to the "wrong" branch in
centralized workflows: if the remote tracking branch they would push
to does not have the same name as the local branch, and they try to do
a "default push", they get an error and explanation with options.
There is a particular centralized workflow where this often happens:
a user branches to a new local topic branch from an existing
remote branch, eg with "checkout -b feature1 origin/master". With
the default branch.autosetupmerge configuration (value "true"), git
will automatically add origin/master as the upstream tracking branch.
When the user pushes with a default "git push", with the intention of
pushing their (new) topic branch to the remote, they get an error, and
(amongst other things) a suggestion to run "git push origin HEAD".
If they follow this suggestion the push succeeds, but on subsequent
default pushes they continue to get an error - so eventually they
figure out to add "-u" to change the tracking branch, or they spelunk
the push.default config doc as proposed and set it to "current", or
some GUI tooling does one or the other of these things for them.
When one of their coworkers later works on the same topic branch,
they don't get any of that "weirdness". They just "git checkout
feature1" and everything works exactly as they expect, with the shared
remote branch set up as remote tracking branch, and push and pull
working out of the box.
The "stable state" for this way of working is that local branches have
the same-name remote tracking branch (origin/feature1 in this
example), and multiple people can work on that remote feature branch
at the same time, trusting "git pull" to merge or rebase as required
for them to be able to push their interim changes to that same feature
branch on that same remote.
(merging from the upstream "master" branch, and merging back to it,
are separate more involved processes in this flow).
There is a problem in this flow/way of working, however, which is that
the first user, when they first branched from origin/master, ended up
with the "wrong" remote tracking branch (different from the stable
state). For a while, before they pushed (and maybe longer, if they
don't use -u/--set-upstream), their "git pull" wasn't getting other
users' changes to the feature branch - it was getting any changes from
the remote "master" branch instead (a completely different class of
changes!)
An experienced git user might say "well yeah, that's what it means to
have the remote tracking branch set to origin/master!" - but the
original user above didn't *ask* to have the remote master branch
added as remote tracking branch - that just happened automatically
when they branched their feature branch. They didn't necessarily even
notice or understand the meaning of the "set up to track 'origin/master'"
message when they created the branch - especially if they are using a
GUI.
Looking at how to fix this, you might think "OK, so disable auto setup
of remote tracking - set branch.autosetupmerge to false" - but that
will inconvenience the *second* user in this story - the one who just
wanted to start working on the topic branch. The first and second
users swap roles at different points in time of course - they should
both have a sane configuration that does the right thing in both
situations.
Make this "branches have the same name locally as on the remote"
workflow less painful / more obvious by introducing a new
branch.autosetupmerge option called "simple", to match the same-name
"push.default" option that makes similar assumptions.
This new option automatically sets up tracking in a *subset* of the
current default situations: when the original ref is a remote tracking
branch *and* has the same branch name on the remote (as the new local
branch name).
Update the error displayed when the 'push.default=simple' configuration
rejects a mismatching-upstream-name default push, to offer this new
branch.autosetupmerge option that will prevent this class of error.
With this new configuration, in the example situation above, the first
user does *not* get origin/master set up as the tracking branch for
the new local branch. If they "git pull" in their new local-only
branch, they get an error explaining there is no upstream branch -
which makes sense and is helpful. If they "git push", they get an
error explaining how to push *and* suggesting they specify
--set-upstream - which is exactly the right thing to do for them.
This new option is likely not appropriate for users intentionally
implementing a "triangular workflow" with a shared upstream tracking
branch, that they "git pull" in and a "private" feature branch that
they push/force-push to just for remote safe-keeping until they are
ready to push up to the shared branch explicitly/separately. Such
users are likely to prefer keeping the current default
merge.autosetupmerge=true behavior, and change their push.default to
"current".
Also extend the existing branch tests with three new cases testing
this option - the obvious matching-name and non-matching-name cases,
and also a non-matching-ref-type case. The matching-name case needs to
temporarily create an independent repo to fetch from, as the general
strategy of using the local repo as the remote in these tests
precludes locally branching with the same name as in the "remote".
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"diff-tree --stdin" has been broken for about a year, but 2.36
release broke it even worse by breaking running the command with
<pathspec>, which in turn broke "gitk" and got noticed. This has
been corrected by aligning its behaviour to that of "log".
* jc/diff-tree-stdin-fix:
2.36 gitk/diff-tree --stdin regression fix
"git submodule update" without pathspec should silently skip an
uninitialized submodule, but it started to become noisy by mistake.
* gc/submodule-update-part2:
submodule--helper: fix initialization of warn_if_uninitialized
The description of 'safe.directory' mentions that it's respected in
the system and global configs, and ignored in the repository config
and on the command line, but it doesn't mention whether it's respected
or ignored when specified via environment variables (nor does the
commit message adding 'safe.directory' [1]).
Clarify that 'safe.directory' is ignored when specified in the
environment, and add tests to make sure that it remains so.
[1] 8959555cee (setup_git_directory(): add an owner check for the
top-level directory, 2022-03-02)
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to the documentation 'safe.directory' "is only respected
when specified in a system or global config, not when it is specified
in a repository config or via the command line option -c
safe.directory=<path>".
Add tests to check that 'safe.directory' in the repository config or
on the command line is indeed ignored.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 8959555cee (setup_git_directory(): add an owner check for the
top-level directory, 2022-03-02) when git finds itself in a repository
owned by someone else, it aborts with a "fatal: unsafe repository
(<repo path>)" error message and an advice about how to set the
'safe.directory' config variable to mark that repository as safe.
't0033-safe-directory.sh' contains tests that check that this feature
and handling said config work as intended. To ensure that git dies
for the right reason, several of those tests check that its standard
error contains the name of that config variable, but:
- it only appears in the advice part, not in the actual error
message.
- it is interpreted as a regexp by 'grep', so, because of the dot,
it matches the name of the test script and the path of the trash
directory as well. Consequently, these tests could be fooled by
any error message that would happen to include the path of the
test repository.
Tighten these checks to look for "unsafe repository" instead.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not obvious that the 'git rev-parse' builtin would use the sparse
index, but it is possible to parse paths out of the index using the
":<path>" syntax. The 'git rev-parse' output is only the OID of the
object found at that location, but otherwise behaves similarly to 'git
show :<path>'. This includes the failure conditions on directories and
the error messages depending on whether a path is in the worktree or
not.
The only code change required is to change the
command_requires_full_index setting in builtin/rev-parse.c, and we can
re-use many existing 'git show' tests for the rev-parse case.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'git show :<path>' where '<path>' is a directory, then
there is a subtle difference between a full checkout and a sparse
checkout. The error message from diagnose_invalid_index_path() reports
whether the path is on disk or not. The full checkout will have the
directory on disk, but the path will not be in the index. The sparse
checkout could have the directory not exist, specifically when that
directory is outside of the sparse-checkout cone.
In the case of a sparse index, we have yet another state: the path can
be a sparse directory in the index. In this case, the error message from
diagnose_invalid_index_path() would erroneously say "path '<path>' is in
the index, but not at stage 0", which is false.
Add special casing around sparse directory entries so we get to the
correct error message. This requires two checks in order to get parity
with the normal sparse-checkout case.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The get_oid_with_context_1() method is used when parsing revision
arguments. One particular case is to take a ":<path>" string and search
the index for the given path.
In the case of a sparse index, this might find a sparse directory entry,
in which case the contained object is a tree. In the case of a full
index, this search within the index would fail.
In order to maintain identical return state as in a full index, inspect
the discovered cache entry to see if it is a sparse directory and reject
it. This requires being careful around the only_to_die option to be sure
we die only at the correct time.
This changes the behavior of 'git show :<sparse-dir>', but does not
bring it entirely into alignment with a full index case. It specifically
hits the wrong error message within diagnose_invalid_index_path(). That
error message will be corrected in a future change.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'git show' command can take an input to request the state of an
object in the index. This can lead to parsing the index in order to load
a specific file entry. Without the change presented here, a sparse index
would expand to a full one, taking much longer than usual to access a
simple file.
There is one behavioral change that happens here, though: we now can
find a sparse directory entry within the index! Commands that previously
failed because we could not find an entry in the worktree or index now
succeed because we _do_ find an entry in the index.
There might be more work to do to make other situations succeed when
looking for an indexed tree, perhaps by looking at or updating the
cache-tree extension as needed. These situations include having a full
index or asking for a directory that is within the sparse-checkout cone
(and hence is not a sparse directory entry in the index).
For now, we demonstrate how the sparse index integration is extremely
simple for files outside of the cone as well as directories within the
cone. A later change will resolve this behavior around sparse
directories.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The .warn_if_uninitialized member was introduced by 48308681
(git submodule update: have a dedicated helper for cloning,
2016-02-29) to submodule_update_clone struct and initialized to
false. When c9911c93 (submodule--helper: teach update_data more
options, 2022-03-15) moved it to update_data struct, it started
to initialize it to true but this change was not explained in
its log message.
The member is set to true only when pathspec was given, and is
used when a submodule that matched the pathspec is found
uninitialized to give diagnostic message. "submodule update"
without pathspec is supposed to iterate over all submodules
(i.e. without pathspec limitation) and update only the
initialized submodules, and finding uninitialized submodules
during the iteration is a totally expected and normal thing that
should not be warned.
[jc: added tests]
Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This only surfaced as a regression after 2.36 release, but the
breakage was already there with us for at least a year.
The diff_free() call is to be used after we completely finished with
a diffopt structure. After "git diff A B" finishes producing
output, calling it before process exit is fine. But there are
commands that prepares diff_options struct once, compares two sets
of paths, releases resources that were used to do the comparison,
then reuses the same diff_option struct to go on to compare the next
two sets of paths, like "git log -p".
After "git log -p" finishes showing a single commit, calling it
before it goes on to the next commit is NOT fine. There is a
mechanism, the .no_free member in diff_options struct, to help "git
log" to avoid calling diff_free() after showing each commit and
instead call it just one. When the mechanism was introduced in
e900d494 (diff: add an API for deferred freeing, 2021-02-11),
however, we forgot to do the same to "diff-tree --stdin", which *is*
a moral equivalent to "git log".
During 2.36 release cycle, we started clearing the pathspec in
diff_free(), so programs like gitk that runs
git diff-tree --stdin -- <pathspec>
downstream of a pipe, processing one commit after another, started
showing irrelevant comparison outside the given <pathspec> from the
second commit. The same commit, by forgetting to teach the .no_free
mechanism, broke "diff-tree --stdin -I<regexp>" and nobody noticed
it for over a year, presumably because it is so seldom used an
option.
But <pathspec> is a different story. The breakage was very
prominently visible and was reported immediately after 2.36 was
released.
Fix this breakage by mimicking how "git log" utilizes the .no_free
member so that "diff-tree --stdin" behaves more similarly to "log".
Protect the fix with a few new tests.
Reported-by: Matthias Aßhauer <mha1993@live.de>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cloning directly from a local repository, we load a list of refs
based on scanning the $GIT_DIR/refs/ directory of the "server"
repository. If files exist in that directory that do not parse as
hexadecimal hashes, then the ref array used by write_remote_refs()
ends up with some entries with null OIDs. This causes us to hit a BUG()
statement in ref_transaction_create():
BUG: create called without valid new_oid
This BUG() call used to be a die() until 033abf97f (Replace all
die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
was added by f04c5b552 (ref_transaction_create(): check that new_sha1 is
valid, 2015-02-17).
The original report for this bug [1] mentioned that this problem did not
exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12fda
(refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
able to successfully compile and test the Git codebase.
[1] https://github.com/git-for-windows/git/issues/3781
There are two approaches to consider here. One would be to remove this
BUG() statement in favor of returning with an error. There are only two
callers to ref_transaction_create(), so this would have a limited
impact.
The other approach would be to add special casing in 'git clone' to
avoid this faulty input to the method.
While I originally started with changing 'git clone', I decided that
modifying ref_transaction_create() was a more complete solution. This
prevents failing with a BUG() statement when we already have a good way
to report an error (including a reason for that error) within the
method. Both callers properly check the return value and die() with the
error message, so this is an appropriate direction.
The added test helps check against a regression, but does check that our
intended error message is handled correctly.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "--since=<time>" option of "git log" limits the commits displayed by
the command by stopping the traversal once it sees a commit whose
timestamp is older than the given time and not digging further into its
parents.
This is OK in a history where a commit always has a newer timestamp than
any of its parents'. Once you see a commit older than the given <time>,
all ancestor commits of it are even older than the time anyway. It
poses, however, a problem when there is a commit with a wrong timestamp
that makes it appear older than its parents. Stopping traversal at the
"incorrectly old" commit will hide its ancestors that are newer than
that wrong commit and are newer than the cut-off time given with the
--since option. --max-age and --after being the synonyms to --since,
they share the same issue.
Add a new "--since-as-filter" option that is a variant of
"--since=<time>". Instead of stopping the traversal to hide an old
enough commit and its all ancestors, exclude commits with an old
timestamp from the output but still keep digging the history.
Without other traversal stopping options, this will force the command in
"git log" family to dig down the history to the root. It may be an
acceptable cost for a small project with short history and many commits
with screwy timestamps.
It is quite unlikely for us to add traversal stopper other than since,
so have this as a --since-as-filter option, rather than a separate
--as-filter, that would be probably more confusing.
Signed-off-by: Miklos Vajna <vmiklos@vmiklos.hu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add an explicit --no-cone to several sparse-checkout invocations in
preparation for changing the default to cone mode.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "--current" is given to "git show-branch" running in the
"--reflog" mode, the code tries to reference a "reflog" message
that does not even exist. This is because the --current is not
prepared to work in that mode.
The reason "--current" exists is to support this request:
I list branches on the command line. These are the branchesI
care about and I use as anchoring points. I may or may not be on
one of these main branches. Please make sure I can view the
commits on the current branch with respect to what is in these
other branches.
And to serve that request, the code checks if the current branch is
among the ones listed on the command line, and adds it only if it is
not to the end of one array, which essentially lists the objects.
The reflog mode additionally uses another array to list reflog
messages, which the "--current" code does not add to. This leaves
one uninitialized slot at the end of the array of reflog messages,
and causes the program to show garbage or segfault.
Catch the unsupported (and meaningless) combination and exit with a
usage error.
There are other combinations of options that are incompatible but
have not been tested. Add test to cover them while adding coverage
for this new combination.
Reported-by: Gregory David <gregory.david@p1sec.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
--keep-base rebases onto the merge base of the given upstream and the
current HEAD regardless of whether a branch is given. This is contrary
to the documentation and to the option's intended purpose. Instead,
rebase onto the merge base of the given upstream and the given branch.
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git uses temporary files to pass the contents of blobs to external diff
programs and textconv filters. It calls mks_tempfile_ts() to create
them, which puts them all in the same directory. This requires adding
a random name prefix.
Use mks_tempfile_dt() instead, which allows the files to have arbitrary
names, each in their own separate temporary directory. This way they
can have the same basename as the original blob, which looks nicer in
graphical diff programs.
The test in t4020 to check the prettiness of the temporary paths was
neutered by 5476bdf0e8 (diff tests: don't ignore "git diff" exit code in
"read" loop, 2022-03-07), which removed its grep check without replacing
it with an equivalent test_cmp check. Add one that only checks the
basename of the temporary file and nothing else.
And make the test more robust while at it, by using test_when_finished
to get rid of the added file even if the test fails.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call diff_free() on the "pruning" member of "struct rev_info". Doing
so makes several tests pass under SANITIZE=leak.
This was also the last missing piece that allows us to remove the
UNLEAK() in "cmd_diff" and "cmd_diff_index", which allows us to use
those commands as a canary for general leaks in the revisions API. See
[1] for further rationale, and 886e1084d7 (builtin/: add UNLEAKs,
2017-10-01) for the commit that added the UNLEAK() there.
1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a missing reflog_walk_info_release() to "reflog-walk.c" and use it
in release_revisions().
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clear the "boundary_commits" object_array in release_revisions(). This
makes a few more tests pass under SANITIZE=leak, including
"t/t4126-apply-empty.sh" which started failed as an UNLEAK() in
cmd_format_patch() was removed in a preceding commit.
This also re-marks the various tests relying on "git format-patch" as
passing under "SANITIZE=leak", in the preceding "revisions API users:
use release_revisions() in builtin/log.c" commit those were marked as
failing as we removed the UNLEAK(rev) from cmd_format_patch() in
"builtin/log.c".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"grep_filter" in the "struct rev_info".This allows us to mark a test
as passing under "TEST_PASSES_SANITIZE_LEAK=true".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"filter" in the "struct rev_info". This in combination with a
preceding change to free "cmdline" means that we can mark another set
of tests as passing under "TEST_PASSES_SANITIZE_LEAK=true".
The "filter" member was added recently in ffaa137f64 (revision: put
object filter into struct rev_info, 2022-03-09), and this fixes leaks
intruded in the subsequent leak 7940941de1 (pack-objects: use
rev.filter when possible, 2022-03-09) and 105c6f14ad (bundle: parse
filter capability, 2022-03-09).
The "builtin/pack-objects.c" leak in 7940941de1 was effectively with
us already, but the variable was referred to by a "static" file-scoped
variable. The "bundle.c " leak in 105c6f14ad was newly introduced
with the new "filter" feature for bundles.
The "t5600-clone-fail-cleanup.sh" change here to add
"TEST_PASSES_SANITIZE_LEAK=true" is one of the cases where
run-command.c in not carrying the abort() exit code upwards would have
had that test passing before, but now it *actually* passes[1]. We
should fix the lack of 1=1 mapping of SANITIZE=leak testing to actual
leaks some other time, but it's an existing edge case, let's just mark
the really-passing test as passing for now.
1. https://lore.kernel.org/git/220303.86fsnz5o9w.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"cmdline" in the "struct rev_info". This in combination with a
preceding change to free "commits" and "mailmap" means that we can
whitelist another test under "TEST_PASSES_SANITIZE_LEAK=true".
There was a proposal in [1] to do away with xstrdup()-ing this
add_rev_cmdline(), perhaps that would be worthwhile, but for now let's
just free() it.
We could also make that a "char *" in "struct rev_cmdline_entry"
itself, but since we own it let's expose it as a constant to outside
callers. I proposed that in [2] but have since changed my mind. See
14d30cdfc0 (ref-filter: fix memory leak in `free_array_item()`,
2019-07-10), c514c62a4f (checkout: fix leak of non-existent branch
names, 2020-08-14) and other log history hits for "free((char *)" for
prior art.
This includes the tests we had false-positive passes on before my
6798b08e84 (perl Git.pm: don't ignore signalled failure in
_cmd_close(), 2022-02-01), now they pass for real.
Since there are 66 tests matching t/t[0-9]*git-svn*.sh it's easier to
list those that don't pass than to touch most of those 66. So let's
introduce a "TEST_FAILS_SANITIZE_LEAK=true", which if set in the tests
won't cause lib-git-svn.sh to set "TEST_PASSES_SANITIZE_LEAK=true.
This change also marks all the tests that we removed
"TEST_FAILS_SANITIZE_LEAK=true" from in an earlier commit due to
removing the UNLEAK() from cmd_format_patch(), we can now assert that
its API use doesn't leak any "struct rev_info" memory.
This change also made commit "t5503-tagfollow.sh" pass on current
master, but that would regress when combined with
ps/fetch-atomic-fixup's de004e848a (t5503: simplify setup of test
which exercises failure of backfill, 2022-03-03) (through no fault of
that topic, that change started using "git clone" in the test, which
has an outstanding leak). Let's leave that test out for now to avoid
in-flight semantic conflicts.
1. https://lore.kernel.org/git/YUj%2FgFRh6pwrZalY@carlos-mbp.lan/
2. https://lore.kernel.org/git/87o88obkb1.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extend the the release_revisions() function so that it frees the
"mailmap" in the "struct rev_info".
The log family of functions now calls the clear_mailmap() function
added in fa8afd18e5a (revisions API: provide and use a
release_revisions(), 2021-09-19), allowing us to whitelist some tests
with "TEST_PASSES_SANITIZE_LEAK=true".
Unfortunately having a pointer to a mailmap in "struct rev_info"
instead of an embedded member that we "own" get a bit messy, as can be
seen in the change to builtin/commit.c.
When we free() this data we won't be able to tell apart a pointer to a
"mailmap" on the heap from one on the stack. As seen in
ea57bc0d41 (log: add --use-mailmap option, 2013-01-05) the "log"
family allocates it on the heap, but in the find_author_by_nickname()
code added in ea16794e43 (commit: search author pattern against
mailmap, 2013-08-23) we allocated it on the stack instead.
Ideally we'd simply change that member to a "struct string_list
mailmap" and never free() the "mailmap" itself, but that would be a
much larger change to the revisions API.
We have code that needs to hand an existing "mailmap" to a "struct
rev_info", while we could change all of that, let's not go there
now.
The complexity isn't in the ownership of the "mailmap" per-se, but
that various things assume a "rev_info.mailmap == NULL" means "doesn't
want mailmap", if we changed that to an init'd "struct string_list
we'd need to carefully refactor things to change those assumptions.
Let's instead always free() it, and simply declare that if you add
such a "mailmap" it must be allocated on the heap. Any modern libc
will correctly panic if we free() a stack variable, so this should be
safe going forward.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for having the "log" family of functions make wider use
of release_revisions() let's have them call it just before
exiting. This changes the "log", "whatchanged", "show",
"format-patch", etc. commands, all of which live in this file.
The release_revisions() API still only frees the "pending" member, but
will learn to release more members of "struct rev_info" in subsequent
commits.
In the case of "format-patch" revert the addition of UNLEAK() in
dee839a263 (format-patch: mark rev_info with UNLEAK, 2021-12-16),
which will cause several tests that previously passed under
"TEST_PASSES_SANITIZE_LEAK=true" to start failing.
In subsequent commits we'll now be able to use those tests to check
whether that part of the API is really leaking memory, and will fix
all of those memory leaks. Removing the UNLEAK() allows us to make
incremental progress in that direction. See [1] for further details
about this approach.
Note that the release_revisions() will not be sufficient to deal with
the code in cmd_show() added in 5d7eeee2ac (git-show: grok blobs,
trees and tags, too, 2006-12-14) which clobbers the "pending" array in
the case of "OBJ_COMMIT". That will need to be dealt with by some
future follow-up work.
1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a release_revisions() to various users of "struct rev_info" which
requires a minor refactoring to a "goto cleanup" pattern to use that
function.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a release_revisions() to various users of "struct rev_list" in
those straightforward cases where we only need to add the
release_revisions() call to the end of a block, and don't need to
e.g. refactor anything to use a "goto cleanup" pattern.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a memory leak that's been with us since f9500261e0 (fast-rebase:
write conflict state to working tree, index, and HEAD, 2021-05-20)
changed this code to move these strbuf_release() into an if/else
block.
We'll also add to "reflog_msg" in the "else" arm of the "if" block
being modified here, and we'll append to "branch_msg" in both
cases. But after f9500261e0 only the "if" block would free these two
"struct strbuf".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 2a0cafd464,
as it expects a working "a ref deletion must produce a single
transaction, not one for loose and another for packed" topic,
which we do not have.
With the addition of the safe.directory in 8959555ce
(setup_git_directory(): add an owner check for the top-level directory,
2022-03-02) released in v2.35.2, we are receiving feedback from a
variety of users about the feature.
Some users have a very large list of shared repositories and find it
cumbersome to add this config for every one of them.
In a more difficult case, certain workflows involve running Git commands
within containers. The container boundary prevents any global or system
config from communicating `safe.directory` values from the host into the
container. Further, the container almost always runs as a different user
than the owner of the directory in the host.
To simplify the reactions necessary for these users, extend the
definition of the safe.directory config value to include a possible '*'
value. This value implies that all directories are safe, providing a
single setting to opt-out of this protection.
Note that an empty assignment of safe.directory clears all previous
values, and this is already the case with the "if (!value || !*value)"
condition.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It seems that nothing is ever checking to make sure the safe directories
in the configs actually have the key safe.directory, so some unrelated
config that has a value with a certain directory would also make it a
safe directory.
Signed-off-by: Matheus Valadares <me@m28.io>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is difficult to change the ownership on a directory in our test
suite, so insert a new GIT_TEST_ASSUME_DIFFERENT_OWNER environment
variable to trick Git into thinking we are in a differently-owned
directory. This allows us to test that the config is parsed correctly.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use test_path_is_file() instead of 'test -f' for better debugging
information.
Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As the address sanitizer checks for a superset of the issues detected
by setting MALLOC_CHECK_ (which tries to detect things like double
frees and off-by-one errors) there is no need to set the latter when
compiling with -fsanitize=address.
This fixes a regression introduced by 131b94a10a ("test-lib.sh: Use
GLIBC_TUNABLES instead of MALLOC_CHECK_ on glibc >= 2.34", 2022-03-04)
which causes all the tests to fail with the message
ASan runtime does not come first in initial library list;
you should either link runtime to your application or
manually preload it with LD_PRELOAD.
when git is compiled with SANITIZE=address on systems with glibc >=
2.34. I have tested SANITIZE=leak and SANITIZE=undefined and they do
not suffer from this regression so the fix in this patch should be
sufficient.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Check if git grep works around the PCRE2 big fixed by their e0c6029
(Fix inifinite loop when a single byte newline is searched in JIT.,
2020-05-29), which affects version 10.35 and earlier.
Searching for leading whitespace also triggers the endless loop.
Set a one-second alarm to abort in case we do get hit by the bug, to
avoid having to wait forever for the test result.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "FUNNYNAMES" test prerequisite passes on Cygwin, as the Cygwin
file system interface has a workaround for the underlying operating
system's lack of support for tabs, newlines or quotes. However, it does
not add support for backslash, which is treated as a directory
separator, meaning one of the tests added by 48803821b1 ("completion:
handle unusual characters for sparse-checkout", 2022-02-07) will fail on
Cygwin.
To avoid this failure while still getting maximal test coverage, split
that test into two: test handling of paths that include tabs on anything
that has the FUNNYNAMES prerequisite, but skip testing handling of paths
that include backslashes unless both FUNNYNAMES is set and the system is
not Cygwin.
It might be nice to have more granularity than "FUNNYNAMES" and its
sibling "FUNNIERNAMES" provide, so that tests could be run based on
specific individual characters supported by the file system being
tested, but that seems like it would make the prerequisite checks in
this area much more verbose for very little gain.
Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The warning about converting line endings is extremely confusing. Its
two sentences each use the word "will" without specifying a timeframe,
which makes it sound like both sentences are referring to the same
timeframe. On top of that, it uses the term "original line endings"
without saying whether "original" means LF or CRLF.
Rephrase the warning to be clear about when the line endings will be
changed and what they will be changed to.
On a platform whose native line endings are not CRLF (e.g. Linux), the
"git add" step in the following sequence triggers the warning in
question:
$ git config core.autocrlf true
$ echo 'Hello world!' >hello.txt
$ git add hello.txt
warning: LF will be replaced by CRLF in hello.txt
The file will have its original line endings in your working directory
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests that affect the repo in stateful ways are easier to write if we
can run setup steps outside of the measured portion of perf iteration.
This change adds a "--setup 'setup-script'" parameter to test_perf. To
make invocations easier to understand, I also moved the prerequisites to
a new --prereq parameter.
The setup facility will be used in the upcoming perf tests for batch
mode, but it already helps in some existing tests, like t5302 and t7820.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add basic performance tests for git commands that can add data to the
object database. We cover:
* git add
* git stash
* git update-index (via git stash)
* git unpack-objects
* git commit --all
We cover all currently available fsync methods as well.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add test cases to exercise batch mode for:
* 'git add'
* 'git stash'
* 'git update-index'
* 'git unpack-objects'
These tests ensure that the added data winds up in the object database.
In this change we introduce a new test helper lib-unique-files.sh. The
goal of this library is to create a tree of files that have different
oids from any other files that may have been created in the current test
repo. This helps us avoid missing validation of an object being added
due to it already being in the repo.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Several tests use awk to parse OIDs from the output of 'git ls-files
--stage' and 'git ls-tree'. Introduce helpers to centralize these uses
of awk.
Update t5317-pack-objects-filter-objects.sh to use the new ls-files
helper so that it has some usages to review. Other updates are left for
the future.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Perforce has a file type "utf8" which represents a text file with
explicit BOM. utf8-encoded files *without* BOM are stored as
regular file type "text". The "utf8" file type behaves like text
in all but one important way: it is stored, internally, without
the leading 3 BOM bytes.
git-p4 has historically imported utf8-with-BOM files (files stored,
in Perforce, as type "utf8") the same way as regular text files -
losing the BOM in the process.
Under most circumstances this issue has little functional impact,
as most systems consider the BOM to be optional and redundant, but
this *is* a correctness failure, and can have lead to practical
issues for example when BOMs are explicitly included in test files,
for example in a file encoding test suite.
Fix the handling of utf8-with-BOM files when importing changes from
p4 to git, and introduce a test that checks it is working correctly.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the --branch argument of the "sync" subcommand, git-p4 enables
you to import a perforce branch/path to an arbitrary git ref, using
a full ref path, or to refs/remotes/p4/* or refs/heads/p4/*,
depending on --import-local, using a short ref name.
However, when you later want to explicitly sync such a given ref to
pick up subsequent p4 changes, it only works if the ref was placed
in the p4 path *and* has only one path component (no "/").
This limitation results from a bad assumption in the
existing-branch sync logic, and also means you cannot individually
sync branches detected by --detect-branches, as these also get a
"/" in their names.
Fix "git p4 sync --branch", when called with an existing ref, so
that it works correctly regardless of whether the ref is in the p4
path or not, and (in the case of refs in the p4 path) regardless of
whether it has a "/" in its short name or not.
Also add tests to validate that these branch-specific syncs work
as expected.
Signed-off-by: Tao Klerks <tao@klerks.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the test 'cherry-pick after renaming branch', stop checking for
the presence of a file (opos) because we are going to "grep" in it in
the same test and the lack of it will be noticed as a failure anyway.
In the test 'revert after renaming branch', instead of allowing any
random contents as long as a known phrase is not there in it, we can
expect the exact outcome---after the successful revert of "added", the
contents of file "spoo" should become identical to what was in file
"oops" in the "initial" commit. This test also contains 'test -f' that
verifies presence of a file, but we have a helper function to do the same
thing. Replace it with appropriate helper function 'test_path_is_file'
for better readability and better error messages.
In both tests, we will not notice when "git rev-parse" starts segfaulting
without emitting any output. The 'test' command will end up being just
"test =", which yields success. Use the 'test_cmp_rev' helper to make
sure we will notice such a breakage.
Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dc6b1d92ca (wt-status: use settings from git_diff_ui_config, 2018-05-04)
disabled diffopt.break_opt for diffstats shown by git status and in
commit templates. For git status there isn't even a way to enable it.
Make the commit summary (shown after the commit) consistent by disabling
it there as well.
Reported-by: Laurent Lyaudet <laurent.lyaudet@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree list --porcelain" did not c-quote pathnames and lock
reasons with unsafe bytes correctly, which is worked around by
introducing NUL terminated output format with "-z".
* pw/worktree-list-with-z:
worktree: add -z option for list subcommand
Built-in fsmonitor (part 2).
* jh/builtin-fsmonitor-part2: (30 commits)
t7527: test status with untracked-cache and fsmonitor--daemon
fsmonitor: force update index after large responses
fsmonitor--daemon: use a cookie file to sync with file system
fsmonitor--daemon: periodically truncate list of modified files
t/perf/p7519: add fsmonitor--daemon test cases
t/perf/p7519: speed up test on Windows
t/perf/p7519: fix coding style
t/helper/test-chmtime: skip directories on Windows
t/perf: avoid copying builtin fsmonitor files into test repo
t7527: create test for fsmonitor--daemon
t/helper/fsmonitor-client: create IPC client to talk to FSMonitor Daemon
help: include fsmonitor--daemon feature flag in version info
fsmonitor--daemon: implement handle_client callback
compat/fsmonitor/fsm-listen-darwin: implement FSEvent listener on MacOS
compat/fsmonitor/fsm-listen-darwin: add MacOS header files for FSEvent
compat/fsmonitor/fsm-listen-win32: implement FSMonitor backend on Windows
fsmonitor--daemon: create token-based changed path cache
fsmonitor--daemon: define token-ids
fsmonitor--daemon: add pathname classification
fsmonitor--daemon: implement 'start' command
...
Give hint when branch tracking cannot be established because fetch
refspecs from multiple remote repositories overlap.
* tk/ambiguous-fetch-refspec:
tracking branches: add advice to ambiguous refspec error