Normally when we match "*X" on "abcX", we call dowild("X", "abcX"),
dowild("X", "bcX"), dowild("X", "cX") and dowild("X", "X"). Only the
last call may have a chance of matching. By skipping the text before
"X", we can eliminate the first three useless calls.
compat, '*/*/*' on linux-2.6.git file list 2000 times, before:
wildmatch 7s 985049us
fnmatch 2s 735541us or 34.26% faster
and after:
wildmatch 4s 492549us
fnmatch 0s 888263us or 19.77% slower
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Normally we need recursion for "*". In this case we know that it
matches everything until "/" so we can skip the recursion.
glibc, '*/*/*' on linux-2.6.git file list 2000 times
before:
wildmatch 8s 74513us
fnmatch 1s 97042us or 13.59% faster
after:
wildmatch 3s 521862us
fnmatch 3s 488616us or 99.06% slower
Same test with compat/fnmatch:
wildmatch 8s 110763us
fnmatch 2s 980845us or 36.75% faster
wildmatch 3s 522156us
fnmatch 1s 544487us or 43.85% slower
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
So far, wildmatch() has always honoured directory boundary and there
was no way to turn it off. Make it behave more like fnmatch() by
requiring all callers that want the FNM_PATHNAME behaviour to pass
that in the equivalent flag WM_PATHNAME. Callers that do not specify
WM_PATHNAME will get wildcards like ? and * in their patterns matched
against '/', just like not passing FNM_PATHNAME to fnmatch().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- All exported constants now have a prefix WM_
- Do not rely on FNM_* constants, use the WM_ counterparts
- Remove TRUE and FALSE to follow Git's coding style
- While at it, turn flags type from int to unsigned int
- Add an (unused yet) argument to carry extra information
so that we don't have to change the prototype again later
when we need to pass other stuff to wildmatch
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'special' is too generic and is used for two different purposes.
Replace it with 'match_slash' to indicate "**" pattern and 'negated'
for "[!...]" and "[^...]".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"**" is adjusted to only be effective when surrounded by slashes, in
40bbee0 (wildmatch: adjust "**" behavior - 2012-10-15). Except that
the commit did it wrong:
1. when it checks for "the preceding slash unless ** is at the
beginning", it compares to wrong pointer. It should have compared
to the beginning of the pattern, not the text.
2. prev_p points to the character before "**", not the first "*". The
correct comparison must be "prev_p < pattern" or
"prev_p + 1 == pattern", not "prev_p == pattern".
3. The pattern must be surrounded by slashes unless it's at the
beginning or the end of the pattern. We do two checks: one for the
preceding slash and one the trailing slash. Both checks must be
met. The use of "||" is wrong.
This patch fixes all above.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"foo/**/bar" matches "foo/x/bar", "foo/x/y/bar"... but not
"foo/bar". We make a special case, when foo/**/ is detected (and
"foo/" part is already matched), try matching "bar" with the rest of
the string.
"Match one or more directories" semantics can be easily achieved using
"foo/*/**/bar".
This also makes "**/foo" match "foo" in addition to "x/foo",
"x/y/foo"..
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Standard wildmatch() sees consecutive asterisks as "*" that can also
match slashes. But that may be hard to explain to users as
"abc/**/def" can match "abcdef", "abcxyzdef", "abc/def", "abc/x/def",
"abc/x/y/def"...
This patch changes wildmatch so that users can do
- "**/def" -> all paths ending with file/directory 'def'
- "abc/**" - equivalent to "/abc/"
- "abc/**/def" -> "abc/x/def", "abc/x/y/def"...
- otherwise consider the pattern malformed if "**" is found
Basically the magic of "**" only remains if it's wrapped around by
slashes.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dowild() does case insensitive matching by lower-casing the text. That
means lower case letters in patterns imply case-insensitive matching,
but upper case means exact matching.
We do not want that subtlety. Lower case pattern too so iwildmatch()
always does what we expect it to do.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One place less to worry about thread safety. Also combine wildmatch
and iwildmatch into one.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
wildmatch returns non-zero if matched, zero otherwise. This patch
makes it return zero if matches, non-zero otherwise, like fnmatch().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
wildmatch's coding style is pretty close to Git's except the use of 4
space indentation instead of 8. This patch should produce empty diff
with "git diff -b"
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These files are from rsync.git commit
f92f5b166e3019db42bc7fe1aa2f1a9178cd215d, which was the last commit
before rsync turned GPL-3. All files are imported as-is and
no-op. Adaptation is done in a separate patch.
rsync.git -> git.git
lib/wildmatch.[ch] wildmatch.[ch]
wildtest.txt t/t3070/wildtest.txt
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>