The wildmatch library code unlearns exponential behaviour it
acquired some time ago since it was borrowed from rsync.
* pw/wildmatch-fixes:
t3070: make chain lint tester happy
wildmatch: hide internal return values
wildmatch: avoid undefined behavior
wildmatch: fix exponential behavior
When I ran this test using `TEST_SHELL_PATH=/bin/bash` in my Ubuntu
setup (where Bash is at version 5.0.17(1)-release), I was greeted with
this error message:
./test-lib.sh: line 1072: $CHALLENGE: ambiguous redirect
This commit fixes that error by quoting the `CHALLENGE` variable (which
has as value a path containing spaces), and by avoiding to cuddle the
empty string parameter in the `printf` call with the redirect character
(in fact, the `printf ''>$CHALLENGE` is removed because the next line
overwrites the file anyway because it _also_ uses a single `>` to
redirect the output).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent-ish change to allow unicode character classes to be used
with "grep -P" triggered a JIT bug in older pcre2 libraries.
The problematic change in Git built with these older libraries has
been disabled to work around the bug.
* mk/workaround-pcre-jit-ucp-bug:
grep: work around UTF-8 related JIT bug in PCRE2 <= 10.34
Code clean-up to use designated initializers in parse-options API.
* sg/parse-options-h-initializers:
parse-options.h: use designated initializers in OPT_* macros
parse-options.h: rename _OPT_CONTAINS_OR_WITH()'s parameters
parse-options.h: use consistent name for the callback parameters
Code clean-up to include and/or uninclude parse-options.h file as
needed.
* sg/parse-options-h-users:
treewide: remove unnecessary inclusions of parse-options.h from headers
treewide: include parse-options.h in source files
This argument was added in 7cae7627c4 (builtin/grep.c: integrate with
sparse index, 2022-09-22), but it was a carry-over from an earlier
version where the --sparse flag was added to the 'git grep' builtin.
This argument does not exist, so currently the
p2000-sparse-operations.sh performance test script fails when reaching
this step.
With this fix, the script works with these numbers for my copy of the
Git source code repository:
Test HEAD
------------------------------------------------------------
2000.30: git grep --cached ... (full-v3) 0.34(1.20+0.14)
2000.31: git grep --cached ... (full-v4) 0.31(1.15+0.13)
2000.32: git grep --cached ... (sparse-v3) 0.26(1.13+0.12)
2000.33: git grep --cached ... (sparse-v4) 0.27(1.13+0.12)
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove workaround for ancient versions of DocBook to make it work
correctly with groff, which has not been necessary since docbook
1.76 from 2010.
* fc/docbook-remove-groff-workaround:
doc: remove GNU troff workaround
time(2) on glib 2.31+, especially on Linux, goes out of sync with
higher resolution timers used for gettimeofday(2) and by the
filesystem. Replace all calls to it with a git_time() wrapper and
use gettimeofday(2) in its implementation.
* pe/time-use-gettimeofday:
git-compat-util: use gettimeofday(2) for time(2)
Transports that do not support protocol v2 did not correctly fall
back to protocol v0 under certain conditions, which has been
corrected.
* jk/fix-proto-downgrade-to-v0:
git_connect(): fix corner cases in downgrading v2 to v0
"git rev-parse --quiet foo@{u}", or anything that asks @{u} to be
parsed with GET_OID_QUIETLY option, did not quietly fail, which has
been corrected.
* fc/oid-quietly-parse-upstream:
object-name: fix quiet @{u} parsing
Lift the limitation that colored prompts can only be used with
PROMPT_COMMAND mode.
* fc/completion-colors-do-not-need-prompt-command:
completion: prompt: use generic colors
1f2e05f0b7 ("wildmatch: fix exponential behavior", 2023-03-20)
introduced a new test with a background process. Backgrounding
necessarily gives a result of 0, so that a seemingly broken && chain is
not really broken.
Adjust t3070 slightly so that our chain lint test recognizes the
construct for what it is and does not raise a false positive.
Signed-off-by: Michael J Gruber <git@grubix.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 50b54fd72a (config: be strict on core.commentChar, 2014-05-17)
notes that “multi-byte character encoding could also be misinterpreted”,
and indeed a multi-byte codepoint (non-ASCII) is not accepted as a valid
`core.commentChar`.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This should have been removed in `ab/retire-scripted-add-p` but wasn't.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stephane is reporting[1] a regression introduced in git v2.40.0 that leads
to 'git grep' segfaulting in his CI pipeline. It turns out, he's using an
older version of libpcre2 that triggers a wild pointer dereference in
the generated JIT code that was fixed in PCRE2 10.35.
Instead of completely disabling the JIT compiler for the buggy version,
just mask out the Unicode property handling as we used to do prior to
commit acabd2048e ("grep: correctly identify utf-8 characters with
\{b,w} in -P").
[1] https://lore.kernel.org/git/7E83DAA1-F9A9-4151-8D07-D80EA6D59EEA@clumio.com/
Reported-by: Stephane Odul <stephane@clumio.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git receive-pack" that responds to "git push" requests failed to
clean a stale lockfile when killed in the middle, which has been
corrected.
* ps/receive-pack-unlock-before-die:
receive-pack: fix stale packfile locks when dying
Fix for a "ls-files --format="%(path)" that produced nonsense
output, which was a bug in 2.38.
* aj/ls-files-format-fix:
ls-files: fix "--format" output of relative paths
"git format-patch" honors the src/dst prefixes set to nonstandard
values with configuration variables like "diff.noprefix", causing
receiving end of the patch that expects the standard -p1 format to
break. Teach "format-patch" to ignore end-user configuration and
always use the standard prefixes.
This is a backward compatibility breaking change.
* jk/format-patch-ignore-noprefix:
rebase: prefer --default-prefix to --{src,dst}-prefix for format-patch
format-patch: add format.noprefix option
format-patch: do not respect diff.noprefix
diff: add --default-prefix option
t4013: add tests for diff prefix options
diff: factor out src/dst prefix setup
There were two reasons we didn't do this. As "git am" is designed
to grok e-mailed patches, not necessarily taken out of a Git
repostiory or even if it came from a Git repository not necessarily
produced with format-patch, we didn't want to single it out as the
"blessed" input producer to the command. Also, in the original
workflow that "git am" was invented for, the user of "am" was
expected to be a different person than the users of "format-patch".
But this is a very safe change to make in 2023. Thanks to the
effort by many contributors, Git ended up becoming a bit more
popular than we initially thought it would be, and "format-patch",
which took me a few weeks to pursuade Linus to take in 2005, seems
to have become the de-facto standard tool to produce patch e-mails.
Interestingly, the documentation for "git apply", which is listed in
SEE ALSO section of "git am" documentation, does mention "am" and
"format-patch" as two things that are related but different from
"apply" in an early part.
Suggested-by: Kai Grossjohann <kai.grossjohann@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2007 the docbook project made the mistake of converting ' to \' for
man pages [1]. It's a problem because groff interprets \' as acute
accent which is rendered as ' in ASCII, but as ´ in utf-8.
This started a cascade of bug reports in git [2], debian [3], Arch Linux
[4], docbook itself [5], and probably many others.
A solution was to use the correct groff character: \(aq, which is always
rendered as ', but the problem is that such character doesn't work in
other troff programs.
A portable solution required the use of a conditional character that is
\(aq in groff, but ' in all others:
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
The proper solution took time to be implemented in docbook, but in 2010
they did it [6]. So the docbook man page stylesheets were broken from
1.73 to 1.76.
Unfortunately by that point many workarounds already existed. In the
case of git, GNU_ROFF was introduced, and in the case of Arch Linux
a mapping from \' to ' was added to groff's man.local. Other
distributions might have done the same, or similar workarounds.
Since 2010 there is no need for this workaround, which is fixed
elsewhere, not just in docbook, but other layers as well.
Let's remove it.
[1] ea2a0bac56
[2] https://lore.kernel.org/git/20091012102926.GA3937@debian.b2j/
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=507673#65
[4] https://bugs.archlinux.org/task/9643
[5] https://sourceforge.net/p/docbook/bugs/1022/
[6] fb55343426
Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use gettimeofday instead of time(NULL) to get current time.
This avoids clock skew on glibc 2.31+ on Linux, where in the
first 1 to 2.5 ms of every second, time(NULL) returns a
value that is one less than the tv_sec part of
higher-resolution timestamps such as those returned by
gettimeofday or timespec_get, or those in the file system.
There are similar clock skew problems on AIX and MS-Windows,
which have problems in the first 5 ms of every second.
Without this patch, users can observe Git issuing a
timestamp T+1 before it issues timestamp T, because Git
sometimes uses time(NULL) or time(&t) and sometimes uses
higher-res methods like gettimeofday. Although strictly
speaking users should tolerate this behavior because a
superuser can always change the clock back, this is a
quality of implementation issue and users naturally expect
Git to issue timestamps in increasing order unless the
superuser has fiddled with the system clock.
This patch always uses gettimeofday(...) instead of time(...),
and I have verified that the resulting .o files never refer
to the name 'time'. A trickier patch would change only
those calls for which timestamp monotonicity is user-visible.
Such a patch would require more expertise about Git internals,
though, and would be harder to maintain later.
Another possibility would be to change Git's documentation
to warn users that Git does not always issue timestamps in
increasing order. However, Git users would likely be either
dismayed by this possibility, or confused by the level of
detail that any such documentation would require.
Yet another possibility would be to fix the Linux kernel so
that the time syscall is consistent with the other timestamp
syscalls. I suppose this has not been done due to
performance implications. (Git's use of timestamps is rare
enough that performance is not a significant consideration
for git.) However, this wouldn't fix Git's problem on older
Linux kernels, or on AIX or MS-Windows.
Signed-off-by: Paul Eggert <eggert@cs.ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use designated initializers in the expansions of the OPT_* macros to
make it more readable which one-letter macro parameter initializes
which field in the resulting 'struct option'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the 'help' parameter as it matches one of the fields in 'struct
option', and, while at it, rename all other parameters to the usual
one-letter name used in similar macro definitions.
Furthermore, put all parameters in the replacement list between
parentheses, like all other OPT_* macros do.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the various OPT_* macros the 'f' parameter is usually used to
specify flags, while the 'cb' parameter is used to specify a callback
function. OPT_CALLBACK and OPT_NUMBER_CALLBACKS, however, are
inconsistent with the rest, as they use 'f' to specify their callback
function.
Rename their callback macro parameters to 'cb' to avoid the
inconsistency.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The headers 'diagnose.h', 'list-objects-filter-options.h',
'ref-filter.h' and 'remote.h' declare option parsing callback
functions with a 'struct option*' parameter, and 'revision.h' declares
an option parsing helper function taking 'struct parse_opt_ctx_t*' and
'struct option*' parameters. These headers all include
'parse-options.h', although they don't need any of the type
definitions from that header file. Furthermore,
'list-objects-filter-options.h' and 'ref-filter.h' also define some
OPT_* macros to initialize a 'struct option', but these don't
necessitate the inclusion of parse-options.h in these headers either,
because these macros are only expanded in source files.
Remove these unnecessary inclusions of parse-options.h and use forward
declarations to declare the necessary types.
After this patch none of the header files include parse-options.h
anymore.
With these changes, the build time after modifying only
parse-options.h is reduced by about 30%, and the number of targets
built is almost 20% less:
Before:
$ touch parse-options.h && time make -j4 |wc -l
353
real 1m1.527s
user 3m32.205s
sys 0m15.903s
After:
289
real 0m39.285s
user 2m12.540s
sys 0m11.164s
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The builtins 'ls-remote', 'pack-objects', 'receive-pack', 'reflog' and
'send-pack' use parse_options(), but their source files don't directly
include 'parse-options.h'. Furthermore, the source files
'diagnose.c', 'list-objects-filter-options.c', 'remote.c' and
'send-pack.c' define option parsing callback functions, while
'revision.c' defines an option parsing helper function, and thus need
access to various fields in 'struct option' and 'struct
parse_opt_ctx_t', but they don't directly include 'parse-options.h'
either. They all can still be built, of course, because they include
one of the header files that does include 'parse-options.h' (though
unnecessarily, see the next commit).
Add those missing includes to these files, as our general rule is that
"a C file must directly include the header files that declare the
functions and the types it uses".
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
WM_ABORT_ALL and WM_ABORT_TO_STARSTAR are used internally to limit
backtracking when a match fails, they are not of interest to the caller
and so should not be public.
Suggested-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code changed in this commit is designed to check if the pattern
starts with "**/" or contains "/**/" (see 3a078dec33 (wildmatch: fix
"**" special case, 2013-01-01)). Unfortunately when the pattern begins
with "**/" `prev_p = p - 2` is evaluated when `p` points to the second
"*" and so the subtraction is undefined according to section 6.5.6 of
the C standard because the result does not point within the same object
as `p`. Fix this by avoiding the subtraction unless it is well defined.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When dowild() cannot match a '*' or '/**/' wildcard then it must return
WM_ABORT_TO_STARSTAR or WM_ABORT_ALL respectively. Failure to observe
this results in unnecessary backtracking and the time taken for a failed
match increases exponentially with the number of wildcards in the
pattern [1]. Unfortunately in some instances dowild() returns WM_NOMATCH
for a failed match resulting in long match times for patterns containing
multiple wildcards as can be seen in the following benchmark.
(Note that the timings in the Benchmark 1 are really measuring the time
to execute test-tool rather than the time to match the pattern)
Benchmark 1: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"
Time (mean ± σ): 22.8 ms ± 1.7 ms [User: 12.1 ms, System: 10.6 ms]
Range (min … max): 19.4 ms … 26.9 ms 113 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"
Time (mean ± σ): 5.244 s ± 0.228 s [User: 5.229 s, System: 0.010 s]
Range (min … max): 4.969 s … 5.707 s 10 runs
Warning: Ignoring non-zero exit code.
Summary
't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"' ran
230.37 ± 20.04 times faster than 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"'
The security implications are limited as it only affects operations that
are potentially DoS vectors. For example by creating a blob containing
such a pattern a malicious user can exploit this behavior to use large
amounts of CPU time on a remote server by pushing the blob and then
creating a new clone with --filter=sparse:oid. However this filter type
is usually disabled as it is known to consume large amounts of CPU time
even without this bug.
The WM_MATCH changed in the first hunk of this patch comes from the
original implementation imported from rsync in 5230f605e1 (Import
wildmatch from rsync, 2012-10-15). Compared to the others converted here
it is fairly harmless as it only triggers at the end of the pattern and
so will only cause a single unnecessary backtrack. The others introduced
by 6f1a31f0aa (wildmatch: advance faster in <asterisk> + <literal>
patterns, 2013-01-01) and 46983441ae (wildmatch: make a special case for
"*/" with FNM_PATHNAME, 2013-01-01) are more pernicious and will cause
exponential behavior.
A new test is added to protect against future regressions.
[1] https://research.swtch.com/glob
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git add -p" while the index is unmerged sometimes failed to parse
the diff output it internally produces and died, which has been
corrected.
* jk/add-p-unmerged-fix:
add-patch: handle "* Unmerged path" lines
After "git pull" that is configured with pull.rebase=false
merge.ff=only fails due to our end having our own development, give
advice messages to get out of the "Not possible to fast-forward"
state.
* fc/advice-diverged-history:
advice: add diverging advice for novices
The code to parse "git rebase -X<opt>" was not prepared to see an
unparsable option string, which has been corrected.
* ab/fix-strategy-opts-parsing:
sequencer.c: fix overflow & segfault in parse_strategy_opts()
Once we start running, we assumed that the list of alternate object
databases would never change. Hook into the machinery used to
update the list of packfiles during runtime to update this list as
well.
* ds/reprepare-alternates-when-repreparing-packfiles:
object-file: reprepare alternates when necessary
"git format-patch" learned to write a log-message only output file
for empty commits.
* jk/format-patch-change-format-for-empty-commits:
format-patch: output header for empty commits