git-commit-vandalism

Author	SHA1	Message	Date
Michael J Gruber	3dc0b7f0dc	t3070: make chain lint tester happy `1f2e05f0b7` ("wildmatch: fix exponential behavior", 2023-03-20) introduced a new test with a background process. Backgrounding necessarily gives a result of 0, so that a seemingly broken && chain is not really broken. Adjust t3070 slightly so that our chain lint test recognizes the construct for what it is and does not raise a false positive. Signed-off-by: Michael J Gruber <git@grubix.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-27 17:02:38 -07:00
Phillip Wood	1f2e05f0b7	wildmatch: fix exponential behavior When dowild() cannot match a '' or '//' wildcard then it must return WM_ABORT_TO_STARSTAR or WM_ABORT_ALL respectively. Failure to observe this results in unnecessary backtracking and the time taken for a failed match increases exponentially with the number of wildcards in the pattern [1]. Unfortunately in some instances dowild() returns WM_NOMATCH for a failed match resulting in long match times for patterns containing multiple wildcards as can be seen in the following benchmark. (Note that the timings in the Benchmark 1 are really measuring the time to execute test-tool rather than the time to match the pattern) Benchmark 1: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "a" Time (mean ± σ): 22.8 ms ± 1.7 ms [User: 12.1 ms, System: 10.6 ms] Range (min … max): 19.4 ms … 26.9 ms 113 runs Warning: Ignoring non-zero exit code. Benchmark 2: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "aaaaaaaaa" Time (mean ± σ): 5.244 s ± 0.228 s [User: 5.229 s, System: 0.010 s] Range (min … max): 4.969 s … 5.707 s 10 runs Warning: Ignoring non-zero exit code. Summary 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "a"' ran 230.37 ± 20.04 times faster than 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "aaaaaaaaa"' The security implications are limited as it only affects operations that are potentially DoS vectors. For example by creating a blob containing such a pattern a malicious user can exploit this behavior to use large amounts of CPU time on a remote server by pushing the blob and then creating a new clone with --filter=sparse:oid. However this filter type is usually disabled as it is known to consume large amounts of CPU time even without this bug. The WM_MATCH changed in the first hunk of this patch comes from the original implementation imported from rsync in `5230f605e1` (Import wildmatch from rsync, 2012-10-15). Compared to the others converted here it is fairly harmless as it only triggers at the end of the pattern and so will only cause a single unnecessary backtrack. The others introduced by `6f1a31f0aa` (wildmatch: advance faster in <asterisk> + <literal> patterns, 2013-01-01) and `46983441ae` (wildmatch: make a special case for "/" with FNM_PATHNAME, 2013-01-01) are more pernicious and will cause exponential behavior. A new test is added to protect against future regressions. [1] https://research.swtch.com/glob Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 10:58:53 -07:00
Eric Sunshine	9fd911237f	test-lib: retire "lint harder" optimization hack `test_run_` in test-lib.sh "lints" the body of a test by sending it down a `sed chainlint.sed \| grep` pipeline; this happens once for each test run by a test script. Although this pipeline may seem relatively cheap in isolation, it can become expensive when invoked 26800+ times by `make test`, once for each test run, despite the existence of only 16500+ test definitions across all tests scripts. This difference in the number of tests defined in the scripts (16500+) and the number of tests actually run by `make test` (26800+) is explained by the fact that some test scripts run a very large number of small tests, all driven by a series of functions/loops which fill in the test bodies. This means that certain test definitions are being linted repeatedly (tens or hundreds of times) unnecessarily. To avoid such unnecessary work, `2d86a96220` (t: avoid sed-based chain-linting in some expensive cases, 2021-05-13) added an optimization hack which allows individual scripts to manually suppress the unnecessary repeated linting of the same test definition. However, unlike chainlint.sed which checks a test body as the test is run, chainlint.pl checks each test definition just once, no matter how many times the test is run, thus the sort of optimization hack introduced by `2d86a96220` is no longer needed and can be retired. Therefore, revert `2d86a96220`. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:07:41 -07:00
Eric Sunshine	74d2f5695d	tests: fix broken &&-chains in compound statements The top-level &&-chain checker built into t/test-lib.sh causes tests to magically exit with code 117 if the &&-chain is broken. However, it has the shortcoming that the magic does not work within `{...}` groups, `(...)` subshells, `$(...)` substitutions, or within bodies of compound statements, such as `if`, `for`, `while`, `case`, etc. `chainlint.sed` partly fills in the gap by catching broken &&-chains in `(...)` subshells, but bugs can still lurk behind broken &&-chains in the other cases. Fix broken &&-chains in compound statements in order to reduce the number of possible lurking bugs. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 10:29:48 -08:00
Ævar Arnfjörð Bjarmason	288a480621	leak tests: mark various "generic" tests as passing with SANITIZE=leak Mark various "generic" tests as passing when git is compiled with SANITIZE=leak. These tests were subjectively picked from the lists of passing tests since they're all small, and test some generic feature such as wildmatch(), commonly used environment variables, ident parsing etc. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-12 18:23:24 -07:00
Jeff King	2d86a96220	t: avoid sed-based chain-linting in some expensive cases Commit `878f988350` (t/test-lib: teach --chain-lint to detect broken &&-chains in subshells, 2018-07-11) introduced additional chain-lint tests which add an extra "sed" pipeline to each test we run. This has a measurable impact on runtime. Here are timings with and without a new environment variable (added by this patch) that lets you disable just the additional sed-based chain-lint tests: Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 make test Time (mean ± σ): 64.202 s ± 1.030 s [User: 622.469 s, System: 301.402 s] Range (min … max): 61.571 s … 65.662 s 10 runs Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 make test Time (mean ± σ): 57.591 s ± 0.333 s [User: 529.368 s, System: 270.618 s] Range (min … max): 57.143 s … 58.309 s 10 runs Summary 'GIT_TEST_CHAIN_LINT_HARDER=0 make test' ran 1.11 ± 0.02 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 make test' Of course those extra lint checks are doing something useful, so paying a few extra seconds (at least on Linux) isn't so bad (though note the CPU time; we're bounded in our parallel run here by the slowest test, so it really is ~120s of CPU improvement). But we can observe that there are some test scripts where they produce a much stronger effect, and provide less value. In t0027 and t3070 we run a very large number of small tests, all driven by a series of functions/loops which are filling in the test bodies. There we get much less bang for our buck in terms of bug-finding versus CPU cost. This patch introduces a mechanism for controlling when those extra lint checks are run, at two levels: - a user can ask to disable or to force-enable the checks by setting GIT_TEST_CHAIN_LINT_HARDER - if the user hasn't specified a preference, individual scripts can disable the checks by setting GIT_TEST_CHAIN_LINT_HARDER_DEFAULT; scripts which don't set that get the current behavior of enabling them. In addition, this patch flips the default for t0027 and t3070's mass-generated sections to disable the extra checks. Here are the timing results for t0027: Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 ./t0027-auto-crlf.sh Time (mean ± σ): 17.078 s ± 0.848 s [User: 14.878 s, System: 7.075 s] Range (min … max): 15.952 s … 18.421 s 10 runs Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 ./t0027-auto-crlf.sh Time (mean ± σ): 9.063 s ± 0.759 s [User: 7.890 s, System: 3.362 s] Range (min … max): 7.747 s … 10.619 s 10 runs Benchmark #3: ./t0027-auto-crlf.sh Time (mean ± σ): 9.186 s ± 0.881 s [User: 7.957 s, System: 3.427 s] Range (min … max): 7.796 s … 10.498 s 10 runs Summary 'GIT_TEST_CHAIN_LINT_HARDER=0 ./t0027-auto-crlf.sh' ran 1.01 ± 0.13 times faster than './t0027-auto-crlf.sh' 1.88 ± 0.18 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 ./t0027-auto-crlf.sh' We can see that disabling the checks for the whole script buys us an almost 2x speedup. But the new default behavior, disabling them only for the mass-generated part, gets us most of that speedup (but still leaves the checks on for further manual tests people might write). As a side note, I'd caution about comparing runtimes and CPU seconds between this timing and the earlier "make test" one. In "make test", we're running a lot of scripts in parallel, so the CPU is throttling down (and thus a CPU second saved here would count for more during a parallel run; the same work takes more CPU seconds there). We get similar results for t3070: Benchmark #1: GIT_TEST_CHAIN_LINT_HARDER=1 ./t3070-wildmatch.sh Time (mean ± σ): 20.054 s ± 3.967 s [User: 16.003 s, System: 8.286 s] Range (min … max): 11.891 s … 23.671 s 10 runs Benchmark #2: GIT_TEST_CHAIN_LINT_HARDER=0 ./t3070-wildmatch.sh Time (mean ± σ): 12.399 s ± 2.256 s [User: 7.542 s, System: 5.342 s] Range (min … max): 9.606 s … 15.727 s 10 runs Benchmark #3: ./t3070-wildmatch.sh Time (mean ± σ): 10.726 s ± 3.476 s [User: 6.790 s, System: 4.365 s] Range (min … max): 5.444 s … 15.376 s 10 runs Summary './t3070-wildmatch.sh' ran 1.16 ± 0.43 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=0 ./t3070-wildmatch.sh' 1.87 ± 0.71 times faster than 'GIT_TEST_CHAIN_LINT_HARDER=1 ./t3070-wildmatch.sh' Again, we get almost a 2x speedup disabling these. In this case, there are no tests not covered by the script's "default to disable" behavior, so the second two benchmarks should be the same (and while they do differ, you can see the variance is quite high but they're within one standard deviation). So it seems like for these two scripts, at least, disabling the extra checks is a reasonable tradeoff. Sadly, the overall runtime of "make test" on my system doesn't get much faster. But that's because we're mostly limited by the cost of the single biggest test. Here are the top-5 tests by wall-clock time from a parallel run, before my patch: 57.9192368984222 t9001-send-email.sh 45.6329638957977 t0027-auto-crlf.sh 32.5278220176697 t3070-wildmatch.sh 22.2701289653778 t7610-mergetool.sh 20.8635759353638 t1701-racy-split-index.sh And after: 57.1476998329163 t9001-send-email.sh 33.776211977005 t0027-auto-crlf.sh 21.3116669654846 t7610-mergetool.sh 20.7748689651489 t1701-racy-split-index.sh 19.6957249641418 t7112-reset-submodule.sh We dropped 12s from t0027, and t3070 dropped off our list entirely at around 16s. In both cases we're bound by t9001, but its slowness is due to the actual tests, so we'll have to deal with it in a different way. But this reduces overall CPU, and means that dealing with t9001 (by improving the speed of send-email or splitting it apart) will let us reduce our overall runtime even on multi-core machines. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-13 15:50:44 +09:00
Junio C Hamano	25e4da89ed	Merge branch 'nd/wildmatch-double-asterisk' A pattern with '*' that does not have a slash on either side used to be an invalid one, but the code now treats such double-asterisks the same way as two normal asterisks that happen to be adjacent to each other. nd/wildmatch-double-asterisk: wildmatch: change behavior of "foo**bar" in WM_PATHNAME mode	2018-11-13 22:37:19 +09:00
Nguyễn Thái Ngọc Duy	e5bbe09e88	wildmatch: change behavior of "foo*bar" in WM_PATHNAME mode In WM_PATHNAME mode (or FNM_PATHNAME), '' does not match '/' and '' can but only in three patterns: - '/' matches zero or more leading directories - '//' matches zero or more directories in between - '/' matches zero or more trailing directories/files When '' is present but not in one of these patterns, the current behavior is consider the pattern invalid and stop matching. In other words, 'foobar' never matches anything, whatever you throw at it. This behavior is arguably a bit confusing partly because we can't really tell the user their pattern is invalid so that they can fix it. So instead, tolerate it and make '*' act like two regular ''s (which is essentially the same as a single asterisk). This behavior seems more predictable. Noticed-by: dana <dana@dana.is> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-29 13:19:22 +09:00
Ævar Arnfjörð Bjarmason	d3c6751b18	tests: make use of the test_must_be_empty function Change various tests that use an idiom of the form: >expect && test_cmp expect actual To instead use: test_must_be_empty actual The test_must_be_empty() wrapper was introduced in `ca8d148daf` ("test: test_must_be_empty helper", 2013-06-09). Many of these tests have been added after that time. This was mostly found with, and manually pruned from: git grep '^\s+>.expect. &&$' t Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-30 11:18:41 -07:00
Nguyễn Thái Ngọc Duy	0489289de2	t/helper: merge test-wildmatch into test-tool Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-27 08:45:47 -07:00
Ævar Arnfjörð Bjarmason	8725923b85	wildmatch test: mark test as EXPENSIVE_ON_WINDOWS Mark the newly added test which creates test files on-disk as EXPENSIVE_ON_WINDOWS. According to [1] it takes almost ten minutes to run this test file on Windows after this recent change, but just a few seconds on Linux as noted in my [2]. This could be done faster by exiting earlier, however by using this pattern we'll emit "skip" lines for each skipped test, making it clear we're not running a lot of them in the TAP output, at the cost of some overhead. 1. nycvar.QRO.7.76.6.1801061337020.1337@wbunaarf-fpuvaqryva.tvgsbejvaqbjf.bet (https://public-inbox.org/git/nycvar.QRO.7.76.6.1801061337020.1337@wbunaarf-fpuvaqryva.tvgsbejvaqbjf.bet/) 2. 87mv1raz9p.fsf@evledraar.gmail.com (https://public-inbox.org/git/87mv1raz9p.fsf@evledraar.gmail.com/) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 14:04:01 -08:00
Ævar Arnfjörð Bjarmason	de8bada2bf	wildmatch test: create & test files on disk in addition to in-memory There has never been any full roundtrip testing of what git-ls-files and other commands that use wildmatch() actually do, rather we've been satisfied with just testing the underlying C function. Due to git-ls-files and friends having their own codepaths before they call wildmatch() there's sometimes differences in the behavior between the two. Even when we test for those (as with [1]), there was no one place where you can review how these two modes differ. Now there is. We now attempt to create a file called $haystack and match $needle against it for each pair of $needle and $haystack that we were passing to test-wildmatch. If we can't create the file we skip the test. This ensures that we can run this on all platforms and not maintain some infinitely growing whitelist of e.g. platforms that don't support certain characters in filenames. A notable exception to this is Windows, where due to the reasons explained in [2] the shellscript emulation layer might fake the creation of a file such as "", and "test -e" for it will succeed since it just got created with some character that maps to "", but git ls-files won't be fooled by this. Thus we need to skip creating certain filenames entirely on Windows, the list here might be overly aggressive. I don't have access to a Windows system to test this. As a result of doing these tests we can now see the cases where these two ways of testing wildmatch differ: * Creating a file called 'a[]b' and running ls-files 'a[]b' will show that file, but wildmatch("a[]b", "a[]b") will not match * wildmatch() won't match a file called \ against \, but ls-files will. * `git --glob-pathspecs ls-files 'foo**'` will match a file 'foo/bba/arr', but wildmatch won't, however pathmatch will. This seems like a bug to me, the two are otherwise equivalent as these tests show. This also reveals the case discussed in [1], since 2.16.0 '' is now an error as far as ls-files is concerned, but wildmatch() itself happily accepts it. 1. `9e4e8a64c2` ("pathspec: die on empty strings as pathspec", 2017-06-06) 2. nycvar.QRO.7.76.6.1801052133380.1337@wbunaarf-fpuvaqryva.tvgsbejvaqbjf.bet (https://public-inbox.org/git/?q=nycvar.QRO.7.76.6.1801052133380.1337%40wbunaarf-fpuvaqryva.tvgsbejvaqbjf.bet) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 14:04:01 -08:00
Ævar Arnfjörð Bjarmason	91061c444a	wildmatch test: perform all tests under all wildmatch() modes Rewrite the wildmatch() test suite so that each test now tests all combinations of the wildmatch() WM_CASEFOLD and WM_PATHNAME flags. Before this change some test inputs were not tested on e.g. WM_PATHNAME. Now the function is stress tested on all possible inputs, and for each input we declare what the result should be if the mode is case-insensitive, or pathname matching, or case-sensitive or not matching pathnames. Also before this change, nothing was testing case-insensitive non-pathname matching, so I've added that to test-wildmatch.c and made use of it. This yields a rather scary patch, but there are no functional changes here, just more test coverage. Some now-redundant tests were deleted as a result of this change, since they were now duplicating an earlier test. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 14:04:01 -08:00
Ævar Arnfjörð Bjarmason	4bc280f250	wildmatch test: use test_must_fail, not ! for test-wildmatch Use of ! should be reserved for non-git programs that are assumed not to fail, see README. With this change only t/t0110-urlmatch-normalization.sh is still using this anti-pattern. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 14:04:01 -08:00
Ævar Arnfjörð Bjarmason	50eafb1a27	wildmatch test: remove dead fnmatch() test code Remove the unused fnmatch() test parameter from the wildmatch test. The code that used to test this was removed in `70a8fc999d` ("stop using fnmatch (either native or compat)", 2014-02-15). As a --word-diff shows the only change to the body of the tests is the removal of the second out of four parameters passed to match(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 14:04:01 -08:00
Ævar Arnfjörð Bjarmason	5684c2bc69	wildmatch test: use a paranoia pattern from nul_match() Use a pattern from the nul_match() function in t7008-grep-binary.sh to make sure that we don't just fall through to the "else" if there's an unknown parameter. This is something I added in commit `77f6f4406f` ("grep: add a test helper function for less verbose -f \0 tests", 2017-05-20) to grep tests, which were modeled on these wildmatch tests, and I'm now porting back to the original wildmatch tests. I am not using the "say '...'; exit 1" pattern from t0000-basic.sh because if I fail I want to run the rest of the tests (unless under -i), and doing this makes sure we do that and don't exit right away without fully reporting our errors. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 14:04:00 -08:00
Ævar Arnfjörð Bjarmason	f5ebe8f3f1	wildmatch test: don't try to vertically align our output Don't try to vertically align the test output, which is futile anyway under the TAP output where we're going to be emitting a number for each test without aligning the test count. This makes subsequent changes of mine where I'm not going to be aligning this output as I add new tests easier. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 14:04:00 -08:00
Ævar Arnfjörð Bjarmason	5008ba8c5e	wildmatch test: use more standard shell style Change the wildmatch test to use more standard shell style, usually we use "if test" not "if [". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 14:04:00 -08:00
Ævar Arnfjörð Bjarmason	a4a136f56e	wildmatch test: indent with tabs, not spaces Replace the 4-width mixed space & tab indentation in this file with indentation with tabs as we do in most of the rest of our tests. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 14:04:00 -08:00
Ville Skyttä	6412757514	Spelling fixes Signed-off-by: Ville Skyttä <ville.skytta@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-27 10:35:49 -07:00
Ævar Arnfjörð Bjarmason	d8604c747b	wildmatch test: cover a blind spot in "/" matching A negated character class that does not include '/', e.g. [^a-z]: - Should match '/' when doing "wildmatch" - Should not match '/' when doing "pathmatch" Add two tests to cover these cases. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 15:02:10 -07:00
Ævar Arnfjörð Bjarmason	44e2ff09ce	wildmatch test: remove redundant duplicate test Remove a test line that's exactly the same as the preceding line. This was brought in in commit `feabcc173b` ("Integrate wildmatch to git", 2012-10-15), these tests are originally copied from rsync.git, but the duplicate line was never present there, so must have just snuck in during integration with git by accident. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-29 09:54:15 +09:00
Nguyễn Thái Ngọc Duy	70a8fc999d	stop using fnmatch (either native or compat) Since v1.8.4 (about six months ago) wildmatch is used as default replacement for fnmatch. We have seen only one fix since so wildmatch probably has done a good job as fnmatch replacement. This concludes the fnmatch->wildmatch transition by no longer relying on fnmatch. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-20 14:16:11 -08:00
Anthony Ramine	b79c0c3755	wildmatch: properly fold case everywhere Case folding is not done correctly when matching against the [:upper:] character class and uppercased character ranges (e.g. A-Z). Specifically, an uppercase letter fails to match against any of them when case folding is requested because plain characters in the pattern and the whole string are preemptively lowercased to handle the base case fast. That optimization is kept and ISLOWER() is used in the [:upper:] case when case folding is requested, while matching against a character range is retried with toupper() if the character was lowercase, as the bounds of the range itself cannot be modified (in a case-insensitive context, [A-_] is not equivalent to [a-_]). Signed-off-by: Anthony Ramine <n.oxyde@gmail.com> Reviewed-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-02 14:13:05 -07:00
Nguyễn Thái Ngọc Duy	6f1a31f0aa	wildmatch: advance faster in <asterisk> + <literal> patterns Normally when we match "X" on "abcX", we call dowild("X", "abcX"), dowild("X", "bcX"), dowild("X", "cX") and dowild("X", "X"). Only the last call may have a chance of matching. By skipping the text before "X", we can eliminate the first three useless calls. compat, '//' on linux-2.6.git file list 2000 times, before: wildmatch 7s 985049us fnmatch 2s 735541us or 34.26% faster and after: wildmatch 4s 492549us fnmatch 0s 888263us or 19.77% slower Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-01 15:32:37 -08:00
Nguyễn Thái Ngọc Duy	46983441ae	wildmatch: make a special case for "/" with FNM_PATHNAME Normally we need recursion for "". In this case we know that it matches everything until "/" so we can skip the recursion. glibc, '//*' on linux-2.6.git file list 2000 times before: wildmatch 8s 74513us fnmatch 1s 97042us or 13.59% faster after: wildmatch 3s 521862us fnmatch 3s 488616us or 99.06% slower Same test with compat/fnmatch: wildmatch 8s 110763us fnmatch 2s 980845us or 36.75% faster wildmatch 3s 522156us fnmatch 1s 544487us or 43.85% slower Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-01 15:32:37 -08:00
Nguyễn Thái Ngọc Duy	c41244e702	wildmatch: support "no FNM_PATHNAME" mode So far, wildmatch() has always honoured directory boundary and there was no way to turn it off. Make it behave more like fnmatch() by requiring all callers that want the FNM_PATHNAME behaviour to pass that in the equivalent flag WM_PATHNAME. Callers that do not specify WM_PATHNAME will get wildcards like ? and * in their patterns matched against '/', just like not passing FNM_PATHNAME to fnmatch(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-01 15:32:37 -08:00
Nguyễn Thái Ngọc Duy	3a078dec33	wildmatch: fix "" special case "" is adjusted to only be effective when surrounded by slashes, in `40bbee0` (wildmatch: adjust "" behavior - 2012-10-15). Except that the commit did it wrong: 1. when it checks for "the preceding slash unless is at the beginning", it compares to wrong pointer. It should have compared to the beginning of the pattern, not the text. 2. prev_p points to the character before "*", not the first "". The correct comparison must be "prev_p < pattern" or "prev_p + 1 == pattern", not "prev_p == pattern". 3. The pattern must be surrounded by slashes unless it's at the beginning or the end of the pattern. We do two checks: one for the preceding slash and one the trailing slash. Both checks must be met. The use of "\|\|" is wrong. This patch fixes all above. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-01 15:31:18 -08:00
Ramsay Jones	d5d80e12bd	t3070: Disable some failing fnmatch tests The failing tests make use of a POSIX character class, '[:xdigit:]' in this case, which some versions of the fnmatch() library function do not support. In the spirit of commit `f1cf7b79` ("t3070: disable unreliable fnmatch tests", 15-10-2012), we disable the fnmatch() half of these tests. Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-12-15 22:26:58 -08:00
Nguyễn Thái Ngọc Duy	ef49841ddf	test-wildmatch: avoid Windows path mangling The MSYS bash mangles arguments that begin with a forward slash when they are passed to test-wildmatch. This causes tests to fail. Avoid mangling by prepending "XXX", which is removed by test-wildmatch before further processing. [J6t: reworded commit message] Reported-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org>	2012-11-20 12:09:13 -08:00
Nguyễn Thái Ngọc Duy	4c251e5cb5	wildmatch: make // match zero or more directories "foo//bar" matches "foo/x/bar", "foo/x/y/bar"... but not "foo/bar". We make a special case, when foo/*/ is detected (and "foo/" part is already matched), try matching "bar" with the rest of the string. "Match one or more directories" semantics can be easily achieved using "foo///bar". This also makes "/foo" match "foo" in addition to "x/foo", "x/y/foo".. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-15 14:58:18 -07:00
Nguyễn Thái Ngọc Duy	40bbee0ab0	wildmatch: adjust "*" behavior Standard wildmatch() sees consecutive asterisks as "" that can also match slashes. But that may be hard to explain to users as "abc//def" can match "abcdef", "abcxyzdef", "abc/def", "abc/x/def", "abc/x/y/def"... This patch changes wildmatch so that users can do - "/def" -> all paths ending with file/directory 'def' - "abc/" - equivalent to "/abc/" - "abc//def" -> "abc/x/def", "abc/x/y/def"... - otherwise consider the pattern malformed if "" is found Basically the magic of "" only remains if it's wrapped around by slashes. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-15 14:58:18 -07:00
Nguyễn Thái Ngọc Duy	f1cf7b7983	t3070: disable unreliable fnmatch tests These tests show different results on different fnmatch() versions. We don't want to test fnmatch here. We want to make sure wildmatch behavior matches fnmatch and that only makes sense in cases when fnmatch() behaves consistently. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-15 14:58:18 -07:00
Nguyễn Thái Ngọc Duy	feabcc173b	Integrate wildmatch to git Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-10-15 14:58:17 -07:00

34 Commits