Introduce an extension to the commit-graph to make it efficient to
check for the paths that were modified at each commit using Bloom
filters.
* gs/commit-graph-path-filter:
bloom: ignore renames when computing changed paths
commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag
t4216: add end to end tests for git log with Bloom filters
revision.c: add trace2 stats around Bloom filter usage
revision.c: use Bloom filters to speed up path based revision walks
commit-graph: add --changed-paths option to write subcommand
commit-graph: reuse existing Bloom filters during write
commit-graph: write Bloom filters to commit graph file
commit-graph: examine commits by generation number
commit-graph: examine changed-path objects in pack order
commit-graph: compute Bloom filters for changed paths
diff: halt tree-diff early after max_changes
bloom.c: core Bloom filter implementation for changed paths.
bloom.c: introduce core Bloom filter constructs
bloom.c: add the murmur3 hash implementation
commit-graph: define and use MAX_NUM_CHUNKS
The commit-graph code exhausted file descriptors easily when it
does not have to.
* tb/commit-graph-fd-exhaustion-fix:
commit-graph: close descriptors after mmap
commit-graph.c: gracefully handle file descriptor exhaustion
t/test-lib.sh: make ULIMIT_FILE_DESCRIPTORS available to tests
commit-graph.c: don't use discarded graph_name in error
Fix in-core inconsistency after fetching into a shallow repository
that broke the code to write out commit-graph.
* tb/reset-shallow:
shallow.c: use '{commit,rollback}_shallow_file'
t5537: use test_write_lines and indented heredocs for readability
Tighten "git mailinfo" to notice and error out when decoded result
contains NUL in it.
* dd/mailinfo-with-nul:
mailinfo: disallow NUL character in mail's header
mailinfo.c: avoid strlen on strings that can contains NUL
t4254: merge 2 steps of a single test
Test clean-up.
* dl/test-must-fail-fixes-4:
t9819: don't use test_must_fail with p4
t9164: use test_must_fail only on git commands
t9160: use test_path_is_missing()
t9141: use test_path_is_missing()
t7508: don't use `test_must_fail test_cmp`
t7408: replace incorrect uses of test_must_fail
t6030: use test_path_is_missing()
"git update-ref --stdin" learned a handful of new verbs to let the
user control ref update transactions more explicitly, which helps
as an ingredient to implement two-phase commit-style atomic
ref-updates across multiple repositories.
* ps/transactional-update-ref-stdin:
update-ref: implement interactive transaction handling
update-ref: read commands in a line-wise fashion
update-ref: move transaction handling into `update_refs_stdin()`
update-ref: pass end pointer instead of strbuf
update-ref: drop unused argument for `parse_refname`
update-ref: organize commands in an array
strbuf: provide function to append whole lines
git-update-ref.txt: add missing word
refs: fix segfault when aborting empty transaction
The directory traversal code had redundant recursive calls which
made its performance characteristics exponential with respect to
the depth of the tree, which was corrected.
* en/fill-directory-exponential:
completion: fix 'git add' on paths under an untracked directory
Fix error-prone fill_directory() API; make it only return matches
dir: replace double pathspec matching with single in treat_directory()
dir: include DIR_KEEP_UNTRACKED_CONTENTS handling in treat_directory()
dir: replace exponential algorithm with a linear one
dir: refactor treat_directory to clarify control flow
dir: fix confusion based on variable tense
dir: fix broken comment
dir: consolidate treat_path() and treat_one_path()
dir: fix simple typo in comment
t3000: add more testcases testing a variety of ls-files issues
t7063: more thorough status checking
"sparse-checkout" UI improvements.
* en/sparse-checkout:
sparse-checkout: provide a new reapply subcommand
unpack-trees: failure to set SKIP_WORKTREE bits always just a warning
unpack-trees: provide warnings on sparse updates for unmerged paths too
unpack-trees: make sparse path messages sound like warnings
unpack-trees: split display_error_msgs() into two
unpack-trees: rename ERROR_* fields meant for warnings to WARNING_*
unpack-trees: move ERROR_WOULD_LOSE_SUBMODULE earlier
sparse-checkout: use improved unpack_trees porcelain messages
sparse-checkout: use new update_sparsity() function
unpack-trees: add a new update_sparsity() function
unpack-trees: pull sparse-checkout pattern reading into a new function
unpack-trees: do not mark a dirty path with SKIP_WORKTREE
unpack-trees: allow check_updates() to work on a different index
t1091: make some tests a little more defensive against failures
unpack-trees: simplify pattern_list freeing
unpack-trees: simplify verify_absent_sparse()
unpack-trees: remove unused error type
unpack-trees: fix minor typo in comment
Update the CI configuration to use GitHub Actions, retiring the one
based on Azure Pipelines.
* dd/ci-swap-azure-pipelines-with-github-actions:
ci: let GitHub Actions upload failed tests' directories
ci: add a problem matcher for GitHub Actions
tests: when run in Bash, annotate test failures with file name/line number
ci: retire the Azure Pipelines definition
README: add a build badge for the GitHub Actions runs
ci: configure GitHub Actions for CI/PR
ci: run gem with sudo to install asciidoctor
ci: explicit install all required packages
ci: fix the `jobname` of the `GETTEXT_POISON` job
ci/lib: set TERM environment variable if not exist
ci/lib: allow running in GitHub Actions
ci/lib: if CI type is unknown, show the environment variables
The stash entry created by "git rebase --autosquash" to keep the
initial dirty state were discarded by mistake upon "git rebase
--quit", which has been corrected.
* dl/merge-autostash-rebase-quit-fix:
rebase: save autostash entry into stash reflog on --quit
In a previous commit, we made incremental graph layers read-only by
using 'git_mkstemp_mode' with permissions '0444'.
There is no reason that 'commit-graph-chain's should be modifiable by
the user, since they are generated at a temporary location and then
atomically renamed into place.
To ensure that these files are read-only, too, use
'hold_lock_file_for_update_mode' with the same read-only permission
bits, and let the umask and 'adjust_shared_perm' take care of the rest.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Non-layered commit-graphs use 'adjust_shared_perm' to make the
commit-graph file readable (or not) to a combination of the user, group,
and others.
Call 'adjust_shared_perm' for split-graph layers to make sure that these
also respect 'core.sharedRepository'. The 'commit-graph-chain' file
already respects this configuration since it uses
'hold_lock_file_for_update' (which calls 'adjust_shared_perm' eventually
in 'create_tempfile_mode').
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit, Git learned 'hold_lock_file_for_update_mode' to
allow the caller to specify the permission bits (prior to further
adjustment by the umask and shared repository permissions) used when
acquiring a temporary file.
Use this in the commit-graph machinery for writing a non-split graph to
acquire an opened temporary file with permissions read-only permissions
to match the split behavior. (In the split case, Git uses
git_mkstemp_mode' for each of the commit-graph layers with permission
bits '0444').
One can notice this discrepancy when moving a non-split graph to be part
of a new chain. This causes a commit-graph chain where all layers have
read-only permission bits, except for the base layer, which is writable
for the current user.
Resolve this discrepancy by using the new
'hold_lock_file_for_update_mode' and passing the desired permission
bits.
Doing so causes some test fallout in t5318 and t6600. In t5318, this
occurs in tests that corrupt a commit-graph file by writing into it. For
these, 'chmod u+w'-ing the file beforehand resolves the issue. The
additional spot in 'corrupt_graph_verify' is necessary because of the
extra 'git commit-graph write' beforehand (which *does* rewrite the
commit-graph file). In t6600, this is caused by copying a read-only
commit-graph file into place and then trying to replace it. For these,
make these files writable.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the patches for CVE-2020-11008, the ability to specify credential
settings in the config for partial URLs got lost. For example, it used
to be possible to specify a credential helper for a specific protocol:
[credential "https://"]
helper = my-https-helper
Likewise, it used to be possible to configure settings for a specific
host, e.g.:
[credential "dev.azure.com"]
useHTTPPath = true
Let's reinstate this behavior.
While at it, increase the test coverage to document and verify the
behavior with a couple other categories of partial URLs.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git grep" did not quote a path with unusual character like other
commands (like "git diff", "git status") do, but did quote when run
from a subdirectory, both of which has been corrected.
* mt/grep-cquote-path:
grep: follow conventions for printing paths w/ unusual chars
The "--decorate-refs" and "--decorate-refs-exclude" options "git
log" takes have learned a companion configuration variable
log.excludeDecoration that sits at the lowest priority in the
family.
* ds/log-exclude-decoration-config:
log: add log.excludeDecoration config option
log-tree: make ref_filter_match() a helper method
"git diff-tree --pretty --notes" used to hit an assertion failure,
as it forgot to initialize the notes subsystem.
* tb/diff-tree-with-notes:
diff-tree.c: load notes machinery when required
Allowing the user to split a patch hunk while "git stash -p" does
not work well; a band-aid has been added to make this (partially)
work better.
* js/stash-p-fix:
stash -p: (partially) fix bug concerning split hunks
t3904: fix incorrect demonstration of a bug
"git push --atomic" used to show failures for refs that weren't
even pushed, which has been corrected.
* jx/atomic-push:
transport-helper: new method reject_atomic_push()
transport-helper: mark failure for atomic push
send-pack: mark failure of atomic push properly
t5543: never report what we do not push
send-pack: fix inconsistent porcelain output
"git diff" in a partial clone learned to avoid lazy loading blob
objects in more casese when they are not needed.
* jt/avoid-prefetch-when-able-in-diff:
diff: restrict when prefetching occurs
diff: refactor object read
diff: make diff_populate_filespec_options struct
promisor-remote: accept 0 as oid_nr in function
"git commit-graph write --expire-time=<timestamp>" did not use the
given timestamp correctly, which has been corrected.
* ds/commit-graph-expiry-fix:
commit-graph: fix buggy --expire-time option
"git log" learns "--[no-]mailmap" as a synonym to "--[no-]use-mailmap"
* jc/log-no-mailmap:
log: give --[no-]use-mailmap a more sensible synonym --[no-]mailmap
clone: reorder --recursive/--recurse-submodules
parse-options: teach "git cmd -h" to show alias as alias
The custom hash function used by "git fast-import" has been
replaced with the one from hashmap.c, which gave us a nice
performance boost.
* jk/fast-import-use-hashmap:
fast-import: replace custom hash with hashmap.c
Document the recommended way to abort a failing test early (e.g. by
exiting a loop), which is to say "return 1".
* jc/doc-test-leaving-early:
t/README: suggest how to leave test early with failure
Various tests have been updated to work around issues found with
shell utilities that come with busybox etc.
* dd/test-with-busybox:
t5703: feed raw data into test-tool unpack-sideband
t4124: tweak test so that non-compliant diff(1) can also be used
t7063: drop non-POSIX argument "-ls" from find(1)
t5616: use rev-parse instead to get HEAD's object_id
t5003: skip conversion test if unzip -a is unavailable
t5003: drop the subshell in test_lazy_prereq
test-lib-functions: test_cmp: eval $GIT_TEST_CMP
t4061: use POSIX compliant regex(7)
In a03b55530a (merge: teach --autostash option, 2020-04-07), the
--autostash option was introduced for `git merge`. Notably, when
`git merge --quit` is run with an autostash entry present, it is saved
into the stash reflog. This is contrasted with the current behaviour of
`git rebase --quit` where the autostash entry is simply just dropped out
of existence.
Adopt the behaviour of `git merge --quit` in `git rebase --quit` and
save the autostash entry into the stash reflog instead of just deleting
it.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test added by 477dcaddb6 (tests: do not let lazy prereqs inside
`test_expect_*` turn off tracing, 2020-03-26) runs a sub-test script
that traces a test with a lazy prereq, like:
test_have_prereq LAZY && echo trace
That won't work if GIT_TEST_FAIL_PREREQS is set in the environment,
because our have_prereq will report failure, and we won't run the echo
at all.
We could work around this by avoiding the &&-chain, but we can
fix this and any future tests at once by unsetting that variable for our
sub-tests. These are meant to be controlled environments where we test
the test-suite itself; the outer test snippet should be in charge of the
sub-test environment, not whatever mode the user happens to be running
in.
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the function process_acks() in fetch-pack.c, the variable
received_ack is meant to track that an ACK was received, but it was
never set. This results in negotiation terminating prematurely through
the in_vain counter, when the counter should have been reset upon every
ACK.
Therefore, reset the in_vain counter upon every ACK.
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching, Git stops negotiation when it has sent at least
MAX_IN_VAIN (which is 256) "have" lines without having any of them
ACK-ed. But this is supposed to trigger only after the first ACK, as
pack-protocol.txt says:
However, the 256 limit *only* turns on in the canonical client
implementation if we have received at least one "ACK %s continue"
during a prior round. This helps to ensure that at least one common
ancestor is found before we give up entirely.
The code path for protocol v0 observes this, but not protocol v2,
resulting in shorter negotiation rounds but significantly larger
packfiles. Teach the code path for protocol v2 to check this criterion
only after at least one ACK was received.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
46fd7b3900 ("credential: allow wildcard patterns when matching config",
2020-02-20) introduced support for matching credential helpers using
urlmatch. In doing so, it introduced code to percent-encode the paths
we get from the credential helper so that they could be effectively
matched by the urlmatch code.
Unfortunately, that code had a bug: it percent-encoded the slashes in
the path, resulting in any URL path that contained multiple levels
(i.e., a directory component) not matching.
We are currently the only caller of the percent-encoding code and could
simply change it not to encode slashes. However, we still want to
encode slashes in the username component, so we need to have both
behaviors available.
So instead, let's add a flag to control encoding slashes, which is the
behavior we want here, and use it when calling the code in this case.
Add a test for credential helper URLs using multiple slashes in the
path, which our test suite previously lacked, as well as one ensuring
that we handle usernames with slashes gracefully. Since we're testing
other percent-encoding handling, let's add one for non-ASCII UTF-8
characters as well.
Reported-by: Ilya Tretyakov <it@it3xl.ru>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Reviewed-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Reviewed-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-commit(1) says ISO-8601 is one of our supported date format.
ISO-8601 allows timestamps to have a fractional number of seconds.
We represent time only in terms of whole seconds, so we never bothered
parsing fractional seconds. However, it's better for us to parse and
throw away the fractional part than to refuse to parse the timestamp
at all.
And refusing parsing fractional second part may confuse the parse to
think fractional and timezone as day and month in this example:
2008-02-14 20:30:45.019-04:00
While doing this, make sure that we only interpret the number after the
second and the dot as fractional when and only when the date is known,
since only ISO-8601 allows the fractional part, and we've taught our
users to interpret "12:34:56.7.days.ago" as a way to specify a time
relative to current time.
Reported-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In bd0b42aed3 (fetch-pack: do not take shallow lock unnecessarily,
2019-01-10), the author noted that 'is_repository_shallow' produces
visible side-effect(s) by setting 'is_shallow' and 'shallow_stat'.
This is a problem for e.g., fetching with '--update-shallow' in a
shallow repository with 'fetch.writeCommitGraph' enabled, since the
update to '.git/shallow' will cause Git to think that the repository
isn't shallow when it is, thereby circumventing the commit-graph
compatibility check.
This causes problems in shallow repositories with at least shallow refs
that have at least one ancestor (since the client won't have those
objects, and therefore can't take the reachability closure over commits
when writing a commit-graph).
Address this by introducing thin wrappers over 'commit_lock_file' and
'rollback_lock_file' for use specifically when the lock is held over
'.git/shallow'. These wrappers (appropriately called
'commit_shallow_file' and 'rollback_shallow_file') call into their
respective functions in 'lockfile.h', but additionally reset validity
checks used by the shallow machinery.
Replace each instance of 'commit_lock_file' and 'rollback_lock_file'
with 'commit_shallow_file' and 'rollback_shallow_file' when the lock
being held is over the '.git/shallow' file.
As a result, 'prune_shallow' can now only be called once (since
'check_shallow_file_for_update' will die after calling
'reset_repository_shallow'). But, this is OK since we only call
'prune_shallow' at most once per process.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A number of spots in t5537 use the non-indented heredoc '<<EOF' when
they would benefit from instead using '<<-EOF' or simply
test_write_lines.
In preparation for adding new tests in a good style and being consistent
with the surrounding code, update the existing tests to improve their
readability.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If "test-tool bloom" is not fed a command, or if arguments are missing
for some commands, it will just segfault. Let's check argc and write a
friendlier usage message.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a layered commit-graph, the commit-graph machinery uses
'commit_graph_filenames_after' and 'commit_graph_hash_after' to keep
track of the layers in the chain that we are in the process of writing.
When the number of commit-graph layers shrinks, we initialize all
entries in the aforementioned arrays, because we know the structure of
the new commit-graph chain immediately (since there are no new layers,
there are no unknown hash values).
But when the number of commit-graph layers grows (i.e., that
'num_commit_graphs_after > num_commit_graphs_before'), then we leave
some entries in the filenames and hashes arrays as uninitialized,
because we will fill them in later as those values become available.
For instance, we rely on 'write_commit_graph_file's to store the
filename and hash of the last layer in the new chain, which is the one
that it is responsible for writing. But, it's possible that
'write_commit_graph_file' may fail, e.g., from file descriptor
exhaustion. In this case it is possible that 'git_mkstemp_mode' will
fail, and that function will return early *before* setting the values
for the last commit-graph layer's filename and hash.
This causes a number of upleasant side-effects. For instance, trying to
'free()' each entry in 'ctx->commit_graph_filenames_after' (and
similarly for the hashes array) causes us to 'free()' uninitialized
memory, since the area is allocated with 'malloc()' and is therefore
subject to contain garbage (which is left alone when
'write_commit_graph_file' returns early).
This can manifest in other issues, like a general protection fault,
and/or leaving a stray 'commit-graph-chain.lock' around after the
process dies. (The reasoning for this is still a mystery to me, since
we'd otherwise usually expect the kernel to run tempfile.c's 'atexit()'
handlers in the case of a normal death...)
To resolve this, initialize the memory with 'CALLOC_ARRAY' so that
uninitialized entries are filled with zeros, and can thus be 'free()'d
as a noop instead of causing a fault.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t1400 the prerequisite 'ULIMIT_FILE_DESCRIPTORS' is defined and used
to effectively guard the helper function 'run_with_limited_open_files'
from being used on systems that do not satisfy this prerequisite.
In the subsequent patch, we will introduce another test outside of t1400
that would benefit from using this prerequisite. So, move it to
'test-lib.sh' instead so that it can be used by multiple tests.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're passing buffer from strbuf to reencode_string,
which will call strlen(3) on that buffer,
and discard the length of newly created buffer.
Then, we compute the length of the return buffer to attach to strbuf.
During this process, we introduce a discrimination between mail
originally written in utf-8 and other encoding.
* if the email was written in utf-8, we leave it as is. If there is
a NUL character in that line, we complains loudly:
error: a NUL byte in commit log message not allowed.
* if the email was written in other encoding, we truncate the data as
the NUL character in that line, then we used the truncated line for
the metadata.
We can do better by reusing all the available information,
and call the underlying lower level function that will be called
indirectly by reencode_string. By doing this, we will also postpone
the NUL character processing to the commit step, which will
complains about the faulty metadata.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While we are at it, make sure we run a clean up after testing.
In a later patch, we will test for more corrupted patch.
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Parsing of URL for the credential helper has been corrected.
* jk/credential-parsing-end-of-host-in-URL:
credential: treat "?" and "#" in URLs as end of host
Allow "git rebase" to reapply all local commits, even if the may be
already in the upstream, without checking first.
* jt/rebase-allow-duplicate:
rebase --merge: optionally skip upstreamed commits
"git rebase" (again) learns to honor "--no-keep-empty", which lets
the user to discard commits that are empty from the beginning (as
opposed to the ones that become empty because of rebasing). The
interactive rebase also marks commits that are empty in the todo.
* en/rebase-no-keep-empty:
rebase: fix an incompatible-options error message
rebase: reinstate --no-keep-empty
rebase -i: mark commits that begin empty in todo editor
A Windows-specific test element has been made more robust against
misuse from both user's environment and programmer's errors.
* js/mingw-is-hidden-test-fix:
t: restrict `is_hidden` to be called only on Windows
mingw: make test_path_is_hidden more robust
t: consolidate the `is_hidden` functions
"git log" learned "--show-pulls" that helps pathspec limited
history views; a merge commit that takes the whole change from a
side branch, which is normally omitted from the output, is shown
in addition to the commits that introduce real changes.
* ds/revision-show-pulls:
revision: --show-pulls adds helpful merges
Misc fixes for Windows.
* js/mingw-fixes:
mingw: help debugging by optionally executing bash with strace
mingw: do not treat `COM0` as a reserved file name
mingw: use modern strftime implementation if possible
We've left the command line parsing of "git log :/a/b/" broken for
about a full year without anybody noticing, which has been
corrected.
* jc/missing-ref-store-fix:
repository: mark the "refs" pointer as private
sha1-name: do not assume that the ref store is initialized
The output from "git format-patch" uses RFC 2047 encoding for
non-ASCII letters on From: and Subject: headers, so that it can
directly be fed to e-mail programs. A new option has been added
to produce these headers in raw.
* eb/format-patch-no-encode-headers:
format-patch: teach --no-encode-email-headers
"git rebase" learned the "--no-gpg-sign" option to countermand
commit.gpgSign the user may have.
* dd/no-gpg-sign:
Documentation: document merge option --no-gpg-sign
Documentation: merge commit-tree --[no-]gpg-sign
Documentation: reword commit --no-gpg-sign
Documentation: document am --no-gpg-sign
cherry-pick/revert: honour --no-gpg-sign in all case
rebase.c: honour --no-gpg-sign
The logic to auto-follow tags by "git clone --single-branch" was
not careful to avoid lazy-fetching unnecessary tags, which has been
corrected.
* jk/use-quick-lookup-in-clone-for-tag-following:
clone: use "quick" lookup while following tags
"git rebase" with the merge backend did not work well when the
rebase.abbreviateCommands configuration was set.
* ag/rebase-merge-allow-ff-under-abbrev-command:
t3432: test `--merge' with `rebase.abbreviateCommands = true', too
sequencer: don't abbreviate a command if it doesn't have a short form
Code cleanup.
* jk/oid-array-cleanups:
oidset: stop referring to sha1-array
ref-filter: stop referring to "sha1 array"
bisect: stop referring to sha1_array
test-tool: rename sha1-array to oid-array
oid_array: rename source file from sha1-array
oid_array: use size_t for iteration
oid_array: use size_t for count and allocation
The server-end of the v2 protocol to serve "git clone" and "git
fetch" was not prepared to see a delim packets at unexpected
places, which led to a crash.
* jk/harden-protocol-v2-delim-handling:
test-lib-functions: simplify packetize() stdin code
upload-pack: handle unexpected delim packets
test-lib-functions: make packetize() more efficient
When fed a midx that records no objects, some codepaths tried to
loop from 0 through (num_objects-1), which, due to integer
arithmetic wrapping around, made it nonsense operation with out of
bounds array accesses. The code has been corrected to reject such
an midx file.
* dr/midx-avoid-int-underflow:
midx.c: fix an integer underflow
Enable tests that require GnuPG on Windows.
* js/tests-gpg-integration-on-windows:
tests: increase the verbosity of the GPG-related prereqs
tests: turn GPG, GPGSM and RFC1991 into lazy prereqs
tests: do not let lazy prereqs inside `test_expect_*` turn off tracing
t/lib-gpg.sh: stop pretending to be a stand-alone script
tests(gpg): allow the gpg-agent to start on Windows
Since its introduction in 7249e91 (revision.c: support --notes
command-line option, 2011-03-29), combining '--notes' with any option
that causes us to format notes (e.g., '--pretty', '--format="%N"', etc)
results in a failed assertion at runtime.
$ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes
commit 8f3d9f354286745c751374f5f1fcafee6b3f3136
git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed.
Aborted
This failure is due to diff-tree not calling 'load_display_notes' to
initialize the notes machinery.
Ordinarily, this failure isn't triggered, because it requires passing
both '--notes' and another of the above mentioned options. In the case
of '--pretty', for example, we set 'opt->verbose_header', causing
'show_log()' to eventually call 'format_display_notes()', which expects
a non-NULL 'display_note_trees'.
Without initializing the notes machinery, 'display_note_trees' remains
NULL, and thus triggers an assertion failure.
Fix this by initializing the notes machinery after parsing our options,
and harden this behavior against regression with a test in t4013. (Note
that the added ref in this test requires updating two unrelated tests
which use 'log --all', and thus need to learn about the new refs).
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We were using `test_must_fail p4` to test that the p4 command failed as
expected. However, test_must_fail() is used to ensure that commands fail
in an expected way, not due to something like a segv. Since we are not
in the business of verifying the sanity of the external world, replace
`test_must_fail p4` with `! p4` and assume that the `p4` command does
not die unexpectedly.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `test_must_fail` function should only be used for git commands;
we are not in the business of catching segmentation fault by external
commands. Shell helper functions test_cmp and svn_cmd used in this
script are wrappers around external commands, so just use `! cmd`
instead of `test_must_fail cmd`
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we assume that external commands work sanely. Since, not only should
this file not exist, but there shouldn't exit _any_ filesystem entity in
these paths, replace `test_must_fail test -f` with
`test_path_is_missing`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we assume that external commands work sanely. Since, not only should
these directories not exist, but there shouldn't exist _any_ filesystem
entity in these paths, replace `test_must_fail test -d` with
`test_path_is_missing`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail function should only be used for git commands since
we assume that external commands work sanely. Since test_cmp() just
wraps an external command, replace `test_must_fail test_cmp` with
`! test_cmp`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
According to t/README, test_must_fail() should only be used to test for
failure in Git commands.
Replace the invocation of `test_must_fail test_path_is_file` with
`test_path_is_missing` since, in this test case, the path should not
exist at all.
In all the cases where `test_must_fail test_alternate_is_used` appears,
test_alternate_is_used() fails because test_line_count() cannot open the
non-existent $alternates_file. Replace
`test_must_fail test_alternate_is_used` with `test_path_is_missing` to
test for this directly.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test_must_fail() function should only be used for git commands since
we should assume that external commands work sanely. Replace
`test_must_fail test -e` with `test_path_is_missing`.
Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
grep does not follow the conventions used by other Git commands when
printing paths that contain unusual characters (as double-quotes or
newlines). Commands such as ls-files, commit, status and diff will:
- Quote and escape unusual pathnames, by default.
- Print names verbatim and unquoted when "-z" is used.
But grep *never* quotes/escapes absolute paths with unusual chars and
*always* quotes/escapes relative ones, even with "-z". Besides being
inconsistent in its own output, the deviation from other Git commands
can be confusing. So let's make it follow the two rules above and add
some tests for this new behavior. Note that, making grep quote/escape
all unusual paths by default, also make it fully compliant with the
core.quotePath configuration, which is currently ignored for absolute
paths.
Reported-by: Greg Hurrell <greg@hurrell.net>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's URL parser interprets
https:///example.com/repo.git
to have no host and a path of "example.com/repo.git". Curl, on the
other hand, internally redirects it to https://example.com/repo.git. As
a result, until "credential: parse URL without host as empty host, not
unset", tricking a user into fetching from such a URL would cause Git to
send credentials for another host to example.com.
Teach fsck to block and detect .gitmodules files using such a URL to
prevent sharing them with Git versions that are not yet protected.
A relative URL in a .gitmodules file could also be used to trigger this.
The relative URL resolver used for .gitmodules does not normalize
sequences of slashes and can follow ".." components out of the path part
and to the host part of a URL, meaning that such a relative URL can be
used to traverse from a https://foo.example.com/innocent superproject to
a https:///attacker.example.com/exploit submodule. Fortunately,
redundant extra slashes in .gitmodules are rare, so we can catch this by
detecting one after a leading sequence of "./" and "../" components.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Until "credential: refuse to operate when missing host or protocol",
Git's credential handling code interpreted URLs with empty scheme to
mean "give me credentials matching this host for any protocol".
Luckily libcurl does not recognize such URLs (it tries to look for a
protocol named "" and fails). Just in case that changes, let's reject
them within Git as well. This way, credential_from_url is guaranteed to
always produce a "struct credential" with protocol and host set.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
libcurl permits making requests without a URL scheme specified. In
this case, it guesses the URL from the hostname, so I can run
git ls-remote http::ftp.example.com/path/to/repo
and it would make an FTP request.
Any user intentionally using such a URL is likely to have made a typo.
Unfortunately, credential_from_url is not able to determine the host and
protocol in order to determine appropriate credentials to send, and
until "credential: refuse to operate when missing host or protocol",
this resulted in another host's credentials being leaked to the named
host.
Teach credential_from_url_gently to consider such a URL to be invalid
so that fsck can detect and block gitmodules files with such URLs,
allowing server operators to avoid serving them to downstream users
running older versions of Git.
This also means that when such URLs are passed on the command line, Git
will print a clearer error so affected users can switch to the simpler
URL that explicitly specifies the host and protocol they intend.
One subtlety: .gitmodules files can contain relative URLs, representing
a URL relative to the URL they were cloned from. The relative URL
resolver used for .gitmodules can follow ".." components out of the path
part and past the host part of a URL, meaning that such a relative URL
can be used to traverse from a https://foo.example.com/innocent
superproject to a https::attacker.example.com/exploit submodule.
Fortunately a leading ':' in the first path component after a series of
leading './' and '../' components is unlikely to show up in other
contexts, so we can catch this by detecting that pattern.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
When we try to initialize credential loading by URL and find that the
URL is invalid, we set all fields to NULL in order to avoid acting on
malicious input. Later when we request credentials, we diagonse the
erroneous input:
fatal: refusing to work with credential missing host field
This is problematic in two ways:
- The message doesn't tell the user *why* we are missing the host
field, so they can't tell from this message alone how to recover.
There can be intervening messages after the original warning of
bad input, so the user may not have the context to put two and two
together.
- The error only occurs when we actually need to get a credential. If
the URL permits anonymous access, the only encouragement the user gets
to correct their bogus URL is a quiet warning.
This is inconsistent with the check we perform in fsck, where any use
of such a URL as a submodule is an error.
When we see such a bogus URL, let's not try to be nice and continue
without helpers. Instead, die() immediately. This is simpler and
obviously safe. And there's very little chance of disrupting a normal
workflow.
It's _possible_ that somebody has a legitimate URL with a raw newline in
it. It already wouldn't work with credential helpers, so this patch
steps that up from an inconvenience to "we will refuse to work with it
at all". If such a case does exist, we should figure out a way to work
with it (especially if the newline is only in the path component, which
we normally don't even pass to helpers). But until we see a real report,
we're better off being defensive.
Reported-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
In 07259e74ec (fsck: detect gitmodules URLs with embedded newlines,
2020-03-11), git fsck learned to check whether URLs in .gitmodules could
be understood by the credential machinery when they are handled by
git-remote-curl.
However, the check is overbroad: it checks all URLs instead of only
URLs that would be passed to git-remote-curl. In principle a git:// or
file:/// URL does not need to follow the same conventions as an http://
URL; in particular, git:// and file:// protocols are not succeptible to
issues in the credential API because they do not support attaching
credentials.
In the HTTP case, the URL in .gitmodules does not always match the URL
that would be passed to git-remote-curl and the credential machinery:
Git's URL syntax allows specifying a remote helper followed by a "::"
delimiter and a URL to be passed to it, so that
git ls-remote http::https://example.com/repo.git
invokes git-remote-http with https://example.com/repo.git as its URL
argument. With today's checks, that distinction does not make a
difference, but for a check we are about to introduce (for empty URL
schemes) it will matter.
.gitmodules files also support relative URLs. To ensure coverage for the
https based embedded-newline attack, urldecode and check them directly
for embedded newlines.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
The credential helper protocol was designed to be very flexible: the
fields it takes as input are treated as a pattern, and any missing
fields are taken as wildcards. This allows unusual things like:
echo protocol=https | git credential reject
to delete all stored https credentials (assuming the helpers themselves
treat the input that way). But when helpers are invoked automatically by
Git, this flexibility works against us. If for whatever reason we don't
have a "host" field, then we'd match _any_ host. When you're filling a
credential to send to a remote server, this is almost certainly not what
you want.
Prevent this at the layer that writes to the credential helper. Add a
check to the credential API that the host and protocol are always passed
in, and add an assertion to the credential_write function that speaks
credential helper protocol to be doubly sure.
There are a few ways this can be triggered in practice:
- the "git credential" command passes along arbitrary credential
parameters it reads from stdin.
- until the previous patch, when the host field of a URL is empty, we
would leave it unset (rather than setting it to the empty string)
- a URL like "example.com/foo.git" is treated by curl as if "http://"
was present, but our parser sees it as a non-URL and leaves all
fields unset
- the recent fix for URLs with embedded newlines blanks the URL but
otherwise continues. Rather than having the desired effect of
looking up no credential at all, many helpers will return _any_
credential
Our earlier test for an embedded newline didn't catch this because it
only checked that the credential was cleared, but didn't configure an
actual helper. Configuring the "verbatim" helper in the test would show
that it is invoked (it's obviously a silly helper which doesn't look at
its input, but the point is that it shouldn't be run at all). Since
we're switching this case to die(), we don't need to bother with a
helper. We can see the new behavior just by checking that the operation
fails.
We'll add new tests covering partial input as well (these can be
triggered through various means with url-parsing, but it's simpler to
just check them directly, as we know we are covered even if the url
parser changes behavior in the future).
[jn: changed to die() instead of logging and showing a manual
username/password prompt]
Reported-by: Carlo Arenas <carenas@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
We may feed a URL like "cert:///path/to/cert.pem" into the credential
machinery to get the key for a client-side certificate. That
credential has no hostname field, which is about to be disallowed (to
avoid confusion with protocols where a helper _would_ expect a
hostname).
This means as of the next patch, credential helpers won't work for
unlocking certs. Let's fix that by doing two things:
- when we parse a url with an empty host, set the host field to the
empty string (asking only to match stored entries with an empty
host) rather than NULL (asking to match _any_ host).
- when we build a cert:// credential by hand, similarly assign an
empty string
It's the latter that is more likely to impact real users in practice,
since it's what's used for http connections. But we don't have good
infrastructure to test it.
The url-parsing version will help anybody using git-credential in a
script, and is easy to test.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>