As the comment above the function indicates, we do not bother actually
storing commit messages in our anonymization map. But we still take the
message as a parameter, and just ignore it. Let's stop doing that, which
will make -Wunused-parameter happier.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The anonymization code has a specific generator callback for each type
of data (e.g., one for paths, one for oids, and so on). These all take a
"data" parameter, but none of them use it for anything. Which is not
surprising, as the point is to generate a new name independent of any
input, and each function keeps its own static counter.
We added the extra pointer in d5bf91fde4 (fast-export: add a "data"
callback parameter to anonymize_str(), 2020-06-23) to handle
--anonymize-map parsing, but that turned out to be awkward itself, and
was recently dropped.
So let's get rid of this "data" parameter that nobody is using, both
from the generators and from anonymize_str() which plumbed it through.
This simplifies the code, and makes -Wunused-parameter happier.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we handle an --anonymize-map option, we parse the orig/anon pair,
and then feed the "orig" string to anonymize_str(), along with a
generator function that duplicates the "anon" string to be cached in the
map.
This works, because anonymize_str() says "ah, there is no mapping yet
for orig; I'll add one from the generator". But there are some
downsides:
1. It's a bit too clever, as it's not obvious what the code is trying
to do or why it works.
2. It requires allowing generator functions to take an extra void
pointer, which is not something any of the normal callers of
anonymize_str() want.
3. It does the wrong thing if the same token is provided twice.
When there are conflicting options, like:
git fast-export --anonymize \
--anonymize-map=foo:one \
--anonymize-map=foo:two
we usually let the second one override the first. But by using
anonymize_str(), which has first-one-wins logic, we do the
opposite.
So instead of relying on anonymize_str(), let's directly add the entry
ourselves. We can tweak the tests to show that we handle overridden
options correctly now.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When anonymizing output, there's only one spot where we generate new
entries to add to our hashmap: when anonymize_str() doesn't find an
entry, we use the generate() callback to make one and add it. Let's pull
that into its own function in preparation for another caller.
Note that we'll add one extra feature. In anonymize_str(), we know that
we won't find an existing entry in the hashmap (since it will only try
to add after failing to find one). But other callers won't have the same
behavior, so we should catch this case and free the now-dangling entry.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We take pains to avoid doing a lookup on a hashmap which has not been
initialized with hashmap_init(). That was necessary back when this code
was written. But hashmap_get() became safer in b7879b0ba6 (hashmap:
allow re-use after hashmap_free(), 2020-11-02). Since then it's OK to
call functions on a zero-initialized table; it will just correctly
return NULL, since there is no match.
This simplifies the code a little, and also lets us keep the
initialization line closer to when we add an entry (which is when the
hashmap really does need to be totally initialized). That will help
later refactoring.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We store anonymized values as pointers to "const char *", since they are
conceptually const to callers who use them. But they are actually
allocated strings whose memory is owned by the struct.
The ownership mismatch hasn't been a big deal since we never free() them
(they are held until the program ends), but let's switch them to "char *"
in preparation for changing that.
Since most code only accesses them via anonymize_str(), it can continue
to narrow them to "const char *" in its return value.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git receive-pack" that responds to "git push" requests failed to
clean a stale lockfile when killed in the middle, which has been
corrected.
* ps/receive-pack-unlock-before-die:
receive-pack: fix stale packfile locks when dying
Fix for a "ls-files --format="%(path)" that produced nonsense
output, which was a bug in 2.38.
* aj/ls-files-format-fix:
ls-files: fix "--format" output of relative paths
"git format-patch" honors the src/dst prefixes set to nonstandard
values with configuration variables like "diff.noprefix", causing
receiving end of the patch that expects the standard -p1 format to
break. Teach "format-patch" to ignore end-user configuration and
always use the standard prefixes.
This is a backward compatibility breaking change.
* jk/format-patch-ignore-noprefix:
rebase: prefer --default-prefix to --{src,dst}-prefix for format-patch
format-patch: add format.noprefix option
format-patch: do not respect diff.noprefix
diff: add --default-prefix option
t4013: add tests for diff prefix options
diff: factor out src/dst prefix setup
There were two reasons we didn't do this. As "git am" is designed
to grok e-mailed patches, not necessarily taken out of a Git
repostiory or even if it came from a Git repository not necessarily
produced with format-patch, we didn't want to single it out as the
"blessed" input producer to the command. Also, in the original
workflow that "git am" was invented for, the user of "am" was
expected to be a different person than the users of "format-patch".
But this is a very safe change to make in 2023. Thanks to the
effort by many contributors, Git ended up becoming a bit more
popular than we initially thought it would be, and "format-patch",
which took me a few weeks to pursuade Linus to take in 2005, seems
to have become the de-facto standard tool to produce patch e-mails.
Interestingly, the documentation for "git apply", which is listed in
SEE ALSO section of "git am" documentation, does mention "am" and
"format-patch" as two things that are related but different from
"apply" in an early part.
Suggested-by: Kai Grossjohann <kai.grossjohann@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2007 the docbook project made the mistake of converting ' to \' for
man pages [1]. It's a problem because groff interprets \' as acute
accent which is rendered as ' in ASCII, but as ´ in utf-8.
This started a cascade of bug reports in git [2], debian [3], Arch Linux
[4], docbook itself [5], and probably many others.
A solution was to use the correct groff character: \(aq, which is always
rendered as ', but the problem is that such character doesn't work in
other troff programs.
A portable solution required the use of a conditional character that is
\(aq in groff, but ' in all others:
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
The proper solution took time to be implemented in docbook, but in 2010
they did it [6]. So the docbook man page stylesheets were broken from
1.73 to 1.76.
Unfortunately by that point many workarounds already existed. In the
case of git, GNU_ROFF was introduced, and in the case of Arch Linux
a mapping from \' to ' was added to groff's man.local. Other
distributions might have done the same, or similar workarounds.
Since 2010 there is no need for this workaround, which is fixed
elsewhere, not just in docbook, but other layers as well.
Let's remove it.
[1] ea2a0bac56
[2] https://lore.kernel.org/git/20091012102926.GA3937@debian.b2j/
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=507673#65
[4] https://bugs.archlinux.org/task/9643
[5] https://sourceforge.net/p/docbook/bugs/1022/
[6] fb55343426
Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use gettimeofday instead of time(NULL) to get current time.
This avoids clock skew on glibc 2.31+ on Linux, where in the
first 1 to 2.5 ms of every second, time(NULL) returns a
value that is one less than the tv_sec part of
higher-resolution timestamps such as those returned by
gettimeofday or timespec_get, or those in the file system.
There are similar clock skew problems on AIX and MS-Windows,
which have problems in the first 5 ms of every second.
Without this patch, users can observe Git issuing a
timestamp T+1 before it issues timestamp T, because Git
sometimes uses time(NULL) or time(&t) and sometimes uses
higher-res methods like gettimeofday. Although strictly
speaking users should tolerate this behavior because a
superuser can always change the clock back, this is a
quality of implementation issue and users naturally expect
Git to issue timestamps in increasing order unless the
superuser has fiddled with the system clock.
This patch always uses gettimeofday(...) instead of time(...),
and I have verified that the resulting .o files never refer
to the name 'time'. A trickier patch would change only
those calls for which timestamp monotonicity is user-visible.
Such a patch would require more expertise about Git internals,
though, and would be harder to maintain later.
Another possibility would be to change Git's documentation
to warn users that Git does not always issue timestamps in
increasing order. However, Git users would likely be either
dismayed by this possibility, or confused by the level of
detail that any such documentation would require.
Yet another possibility would be to fix the Linux kernel so
that the time syscall is consistent with the other timestamp
syscalls. I suppose this has not been done due to
performance implications. (Git's use of timestamps is rare
enough that performance is not a significant consideration
for git.) However, this wouldn't fix Git's problem on older
Linux kernels, or on AIX or MS-Windows.
Signed-off-by: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use designated initializers in the expansions of the OPT_* macros to
make it more readable which one-letter macro parameter initializes
which field in the resulting 'struct option'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the 'help' parameter as it matches one of the fields in 'struct
option', and, while at it, rename all other parameters to the usual
one-letter name used in similar macro definitions.
Furthermore, put all parameters in the replacement list between
parentheses, like all other OPT_* macros do.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the various OPT_* macros the 'f' parameter is usually used to
specify flags, while the 'cb' parameter is used to specify a callback
function. OPT_CALLBACK and OPT_NUMBER_CALLBACKS, however, are
inconsistent with the rest, as they use 'f' to specify their callback
function.
Rename their callback macro parameters to 'cb' to avoid the
inconsistency.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The headers 'diagnose.h', 'list-objects-filter-options.h',
'ref-filter.h' and 'remote.h' declare option parsing callback
functions with a 'struct option*' parameter, and 'revision.h' declares
an option parsing helper function taking 'struct parse_opt_ctx_t*' and
'struct option*' parameters. These headers all include
'parse-options.h', although they don't need any of the type
definitions from that header file. Furthermore,
'list-objects-filter-options.h' and 'ref-filter.h' also define some
OPT_* macros to initialize a 'struct option', but these don't
necessitate the inclusion of parse-options.h in these headers either,
because these macros are only expanded in source files.
Remove these unnecessary inclusions of parse-options.h and use forward
declarations to declare the necessary types.
After this patch none of the header files include parse-options.h
anymore.
With these changes, the build time after modifying only
parse-options.h is reduced by about 30%, and the number of targets
built is almost 20% less:
Before:
$ touch parse-options.h && time make -j4 |wc -l
353
real 1m1.527s
user 3m32.205s
sys 0m15.903s
After:
289
real 0m39.285s
user 2m12.540s
sys 0m11.164s
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The builtins 'ls-remote', 'pack-objects', 'receive-pack', 'reflog' and
'send-pack' use parse_options(), but their source files don't directly
include 'parse-options.h'. Furthermore, the source files
'diagnose.c', 'list-objects-filter-options.c', 'remote.c' and
'send-pack.c' define option parsing callback functions, while
'revision.c' defines an option parsing helper function, and thus need
access to various fields in 'struct option' and 'struct
parse_opt_ctx_t', but they don't directly include 'parse-options.h'
either. They all can still be built, of course, because they include
one of the header files that does include 'parse-options.h' (though
unnecessarily, see the next commit).
Add those missing includes to these files, as our general rule is that
"a C file must directly include the header files that declare the
functions and the types it uses".
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
WM_ABORT_ALL and WM_ABORT_TO_STARSTAR are used internally to limit
backtracking when a match fails, they are not of interest to the caller
and so should not be public.
Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code changed in this commit is designed to check if the pattern
starts with "**/" or contains "/**/" (see 3a078dec33 (wildmatch: fix
"**" special case, 2013-01-01)). Unfortunately when the pattern begins
with "**/" `prev_p = p - 2` is evaluated when `p` points to the second
"*" and so the subtraction is undefined according to section 6.5.6 of
the C standard because the result does not point within the same object
as `p`. Fix this by avoiding the subtraction unless it is well defined.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When dowild() cannot match a '*' or '/**/' wildcard then it must return
WM_ABORT_TO_STARSTAR or WM_ABORT_ALL respectively. Failure to observe
this results in unnecessary backtracking and the time taken for a failed
match increases exponentially with the number of wildcards in the
pattern [1]. Unfortunately in some instances dowild() returns WM_NOMATCH
for a failed match resulting in long match times for patterns containing
multiple wildcards as can be seen in the following benchmark.
(Note that the timings in the Benchmark 1 are really measuring the time
to execute test-tool rather than the time to match the pattern)
Benchmark 1: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"
Time (mean ± σ): 22.8 ms ± 1.7 ms [User: 12.1 ms, System: 10.6 ms]
Range (min … max): 19.4 ms … 26.9 ms 113 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"
Time (mean ± σ): 5.244 s ± 0.228 s [User: 5.229 s, System: 0.010 s]
Range (min … max): 4.969 s … 5.707 s 10 runs
Warning: Ignoring non-zero exit code.
Summary
't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"' ran
230.37 ± 20.04 times faster than 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"'
The security implications are limited as it only affects operations that
are potentially DoS vectors. For example by creating a blob containing
such a pattern a malicious user can exploit this behavior to use large
amounts of CPU time on a remote server by pushing the blob and then
creating a new clone with --filter=sparse:oid. However this filter type
is usually disabled as it is known to consume large amounts of CPU time
even without this bug.
The WM_MATCH changed in the first hunk of this patch comes from the
original implementation imported from rsync in 5230f605e1 (Import
wildmatch from rsync, 2012-10-15). Compared to the others converted here
it is fairly harmless as it only triggers at the end of the pattern and
so will only cause a single unnecessary backtrack. The others introduced
by 6f1a31f0aa (wildmatch: advance faster in <asterisk> + <literal>
patterns, 2013-01-01) and 46983441ae (wildmatch: make a special case for
"*/" with FNM_PATHNAME, 2013-01-01) are more pernicious and will cause
exponential behavior.
A new test is added to protect against future regressions.
[1] https://research.swtch.com/glob
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tests in t1507-rev-parse-upstream.sh compare files "expect" and "actual"
to assert the output of "git rev-parse", "git show", and "git log".
However, two of the tests '@{reflog}-parsing does not look beyond colon'
and '@{upstream}-parsing does not look beyond colon' don't inspect the
contents of the created files.
Assert output of "git rev-parse" in tests in t1507-rev-parse-upstream.sh
to improve test coverage.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some tests in file t1404-update-ref-errors.sh create file "unchanged" as
the expected side for a test_cmp assertion at the end of the test for
output of "git for-each-ref". Test 'no bogus intermediate values during
delete' also creates a file named "unchanged" using "git for-each-ref".
However, the file isn't used for any assertions in the test. Instead,
"git rev-parse" is used to compare the reference with variable $D.
Don't create unused file "unchanged" in test 'no bogus intermediate
values during delete' of t1404-update-ref-errors.sh.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t1400-update-ref.sh test 'transaction can create and delete' creates
files "expect" and "actual", but doesn't compare them. Similarly, test
'transaction cannot restart ongoing transaction' redirects output of
"git update-ref" to file "actual", but doesn't check its contents with
any assertions.
Assert output of "git update-ref" in tests to improve test coverage in
t1400-update-ref.sh.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test 'gitdir selection on unsupported repo' in t1302-repo-version.sh
writes output of a "git config" invocation to file "actual". However,
the test doesn't have any assertions for the file. The file was used by
this test until commit b9605bc4f2 (config: only read .git/config from
configured repos, 2016-09-12), before which "git config" was expected to
print the bogus value of "core.repositoryformatversion" to standard
output.
Don't redirect output of "git config" to file "actual" in test 'gitdir
selection on unsupported repo'.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Builtin "git mktree" writes the the object name of the tree object built
to the standard output. Tests 'mktree refuses to read ls-tree -r output
(1)' and 'mktree refuses to read ls-tree -r output (2)' in
"t1010-mktree.sh" redirect output of "git mktree" to a file, but don't
use its contents in assertions.
Don't redirect output of "git mktree" to file "actual" in tests that
assert that an invocation of "git mktree" must fail.
Output of "git mktree" is empty when it refuses to build a tree object.
So, alternatively, the test could assert that the output is empty.
However, there isn't a good reason for the user to expect the command to
be silent in such cases, so we shouldn't enforce it. The user shouldn't
use the output of a failing command anyway.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test "cat-file $arg1 $arg2 error on missing full OID" in
t1006-cat-file.sh compares files "expect.err" and "err.actual" to assert
the expected error output of "git cat-file". A similar test in the same
file named "cat-file $arg1 $arg2 error on missing short OID" also
creates these two files, but doesn't use them in assertions.
Assert error output of "git cat-file" in test "cat-file $arg1 $arg2
error on missing short OID" of t1006-cat-file.sh to improve test
coverage.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Test 'reset should work' in t1005-read-tree-reset.sh compares two files
"expect" and "actual" to assert the expected output of "git ls-files".
Several other tests in the same file also create files "expect" and
"actual", but don't use them in assertions.
Assert output of "git ls-files" in t1005-read-tree-reset.sh to improve
test coverage.
Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git add -p" while the index is unmerged sometimes failed to parse
the diff output it internally produces and died, which has been
corrected.
* jk/add-p-unmerged-fix:
add-patch: handle "* Unmerged path" lines
After "git pull" that is configured with pull.rebase=false
merge.ff=only fails due to our end having our own development, give
advice messages to get out of the "Not possible to fast-forward"
state.
* fc/advice-diverged-history:
advice: add diverging advice for novices
The code to parse "git rebase -X<opt>" was not prepared to see an
unparsable option string, which has been corrected.
* ab/fix-strategy-opts-parsing:
sequencer.c: fix overflow & segfault in parse_strategy_opts()
Once we start running, we assumed that the list of alternate object
databases would never change. Hook into the machinery used to
update the list of packfiles during runtime to update this list as
well.
* ds/reprepare-alternates-when-repreparing-packfiles:
object-file: reprepare alternates when necessary
"git format-patch" learned to write a log-message only output file
for empty commits.
* jk/format-patch-change-format-for-empty-commits:
format-patch: output header for empty commits
"git bundle" learned that "-" is a common way to say that the input
comes from the standard input and/or the output goes to the
standard output. It used to work only for output and only from the
root level of the working tree.
* jk/bundle-use-dash-for-stdfiles:
parse-options: use prefix_filename_except_for_dash() helper
parse-options: consistently allocate memory in fix_filename()
bundle: don't blindly apply prefix_filename() to "-"
bundle: document handling of "-" as stdin
bundle: let "-" mean stdin for reading operations
A few subcommands have been taught to stop users from working on a
branch that is being used in another worktree linked to the same
repository.
* rj/avoid-switching-to-already-used-branch:
switch: reject if the branch is already checked out elsewhere (test)
rebase: refuse to switch to a branch already checked out elsewhere (test)
branch: fix die_if_checked_out() when ignore_current_worktree
worktree: introduce is_shared_symref()
Allow "git bisect reset" to check out the original branch when the
branch is already checked out in a different worktree linked to the
same repository.
* rj/bisect-already-used-branch:
bisect: fix "reset" when branch is checked out elsewhere
"git push" has been taught to allow deletion of refs with one-level
names to help repairing a repository who acquired such a ref by
mistake. In general, we don't encourage use of such a ref, and
creation or update to such a ref is rejected as before.
* zh/push-to-delete-onelevel-ref:
push: allow delete single-level ref
receive-pack: fix funny ref error messsage
"git restore" supports options like "--ours" that are only
meaningful during a conflicted merge, but these options are only
meaningful when updating the working tree files. These options are
marked to be incompatible when both "--staged" and "--worktree" are
in effect.
* ak/restore-both-incompatible-with-conflicts:
restore: fault --staged --worktree with merge opts
Fix a segfaulting loop. The function and its caller may need
further clean-up.
* ew/commit-reach-clean-up-flags-fix:
commit-reach: avoid NULL dereference
There's code in git_connect() that checks whether we are doing a push
with protocol_v2, and if so, drops us to protocol_v0 (since we know
how to do v2 only for fetches). But it misses some corner cases:
1. it checks the "prog" variable, which is actually the path to
receive-pack on the remote side. By default this is just
"git-receive-pack", but it could be an arbitrary string (like
"/path/to/git receive-pack", etc). We'd accidentally stay in v2
mode in this case.
2. besides "receive-pack" and "upload-pack", there's one other value
we'd expect: "upload-archive" for handling "git archive --remote".
Like receive-pack, this doesn't understand v2, and should use the
v0 protocol.
In practice, neither of these causes bugs in the real world so far. We
do send a "we understand v2" probe to the server, but since no server
implements v2 for anything but upload-pack, it's simply ignored. But
this would eventually become a problem if we do implement v2 for those
endpoints, as older clients would falsely claim to understand it,
leading to a server response they can't parse.
We can fix (1) by passing in both the program path and the "name" of the
operation. I treat the name as a string here, because that's the pattern
set in transport_connect(), which is one of our callers (we were simply
throwing away the "name" value there before).
We can fix (2) by allowing only known-v2 protocols ("upload-pack"),
rather than blocking unknown ones ("receive-pack" and "upload-archive").
That will mean whoever eventually implements v2 push will have to adjust
this list, but that's reasonable. We'll do the safe, conservative thing
(sticking to v0) by default, and anybody working on v2 will quickly
realize this spot needs to be updated.
The new tests cover the receive-pack and upload-archive cases above, and
re-confirm that we allow v2 with an arbitrary "--upload-pack" path (that
already worked before this patch, of course, but it would be an easy
thing to break if we flipped the allow/block logic without also handling
"name" separately).
Here are a few miscellaneous implementation notes, since I had to do a
little head-scratching to understand who calls what:
- transport_connect() is called only for git-upload-archive. For
non-http git remotes, that resolves to the virtual connect_git()
function (which then calls git_connect(); confused yet?). So
plumbing through "name" in connect_git() covers that.
- for regular fetches and pushes, callers use higher-level functions
like transport_fetch_refs(). For non-http git remotes, that means
calling git_connect() under the hood via connect_setup(). And that
uses the "for_push" flag to decide which name to use.
- likewise, plumbing like fetch-pack and send-pack may call
git_connect() directly; they each know which name to use.
- for remote helpers (including http), we already have separate
parameters for "name" and "exec" (another name for "prog"). In
process_connect_service(), we feed the "name" to the helper via
"connect" or "stateless-connect" directives.
There's also a "servpath" option, which can be used to tell the
helper about the "exec" path. But no helpers we implement support
it! For http it would be useless anyway (no reasonable server
implementation will allow you to send a shell command to run the
server). In theory it would be useful for more obscure helpers like
remote-ext, but even there it is not implemented.
It's tempting to get rid of it simply to reduce confusion, but we
have publicly documented it since it was added in fa8c097cc9
(Support remote helpers implementing smart transports, 2009-12-09),
so it's possible some helper in the wild is using it.
- So for v2, helpers (again, including http) are mainly used via
stateless-connect, driven by the main program. But they do still
need to decide whether to do a v2 probe. And so there's similar
logic in remote-curl.c's discover_refs() that looks for
"git-receive-pack". But it's not buggy in the same way. Since it
doesn't support servpath, it is always dealing with a "service"
string like "git-receive-pack". And since it doesn't support
straight "connect", it can't be used for "upload-archive".
So we could leave that spot alone. But I've updated it here to match
the logic we're changing in connect_git(). That seems like the least
confusing thing for somebody who has to touch both of these spots
later (say, to add v2 push support). I didn't add a new test to make
sure this doesn't break anything; we already have several tests (in
t5551 and elsewhere) that make sure we are using v2 over http.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new "fetch.hideRefs" option can be used to exclude specified refs
from "rev-list --objects --stdin --not --all" traversal for
checking object connectivity, most useful when there are many
unrelated histories in a single repository.
* ew/fetch-hiderefs:
fetch: support hideRefs to speed up connectivity checks
Allow information carried on the WWW-AUthenticate header to be
passed to the credential helpers.
* mc/credential-helper-www-authenticate:
credential: add WWW-Authenticate header to cred requests
http: read HTTP WWW-Authenticate response headers
t5563: add tests for basic and anoymous HTTP access