Currently we treat "*.c" and "path/to/*.c" the same way. Which means
we check all possible paths in repo against "path/to/*.c". One could
see that "path/elsewhere/foo.c" obviously cannot match "path/to/*.c"
and we only need to check all paths _inside_ "path/to/" against that
pattern.
This patch checks the leading fixed part of a pathspec against base
directory and exit early if possible. We could even optimize further
in "path/to/something*.c" case (i.e. check the fixed part against
name_entry as well) but that's more complicated and probably does not
gain us much.
-O2 build on linux-2.6, without and with this patch respectively:
$ time git rev-list --quiet HEAD -- 'drivers/*.c'
real 1m9.484s
user 1m9.128s
sys 0m0.181s
$ time ~/w/git/git rev-list --quiet HEAD -- 'drivers/*.c'
real 0m15.710s
user 0m15.564s
sys 0m0.107s
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a pattern contains only a single asterisk as wildcard,
e.g. "foo*bar", after literally comparing the leading part "foo" with
the string, we can compare the tail of the string and make sure it
matches "bar", instead of running fnmatch() on "*bar" against the
remainder of the string.
-O2 build on linux-2.6, without the patch:
$ time git rev-list --quiet HEAD -- '*.c'
real 0m40.770s
user 0m40.290s
sys 0m0.256s
With the patch
$ time ~/w/git/git rev-list --quiet HEAD -- '*.c'
real 0m34.288s
user 0m33.997s
sys 0m0.205s
The above command is not supposed to be widely popular. It's chosen
because it exercises pathspec matching a lot. The point is it cuts
down matching time for popular patterns like *.c, which could be used
as pathspec in other places.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We mark pathspec with wildcards with the field use_wildcard. We
could do better by saving the length of the non-wildcard part, which
can be used for optimizations such as f9f6e2c (exclude: do strcmp as
much as possible before fnmatch - 2012-06-07).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes curl_multi_timeout() function suggested a wrong timeout
value when there is no file descriptors to wait on and the http
transport ended up sleeping for minutes in select(2) system call. A
workaround has been added for this.
* sz/maint-curl-multi-timeout:
Fix potential hang in https handshake
"git pull --rebase" run while the HEAD is detached tried to find
the upstream branch of the detached HEAD (which by definition
does not exist) and emitted unnecessary error messages.
* ph/pull-rebase-detached:
git-pull: Avoid merge-base on detached head
A symbolic ref refs/heads/SYM was not correctly removed with "git
branch -d SYM"; the command removed the ref pointed by SYM instead.
* rs/branch-del-symref:
branch: show targets of deleted symrefs, not sha1s
branch: skip commit checks when deleting symref branches
branch: delete symref branch, not its target
branch: factor out delete_branch_config()
branch: factor out check_branch_commit()
"git grep -e pattern <tree>" asked the attribute system to read
"<tree>:.gitattributes" file in the working tree, which was
nonsense.
* nd/grep-true-path:
grep: stop looking at random places for .gitattributes
"git log -F -E --grep='<ere>'" failed to use the given <ere>
pattern as extended regular expression, and instead looked for the
string literally.
* 'jc/grep-pcre-loose-ends' (early part):
log --grep: use the same helper to set -E/-F options as "git grep"
revisions: initialize revs->grep_filter using grep_init()
grep: move pattern-type bits support to top-level grep.[ch]
grep: move the configuration parsing logic to grep.[ch]
builtin/grep.c: make configuration callback more reusable
"git mergetool" feeds /dev/null as a common ancestor when dealing
with an add/add conflict, but p4merge backend cannot handle it. Work
it around by passing a temporary empty file.
* da/mergetools-p4:
mergetools/p4merge: Handle "/dev/null"
The "say" function in the test scaffolding incorrectly allowed
"echo" to interpret "\a" as if it were a C-string asking for a BEL
output.
* jc/test-say-color-avoid-echo-escape:
test-lib: Fix say_color () not to interpret \a\b\c in the message
The configuration parser had an unnecessary hardcoded limit on
variable names that was not checked consistently.
* bw/config-lift-variable-name-length-limit:
Remove the hard coded length limit on variable names in config files
Prepare the release notes for the upcoming release, and describe
changes up to the 5th batch we just merged.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Emit the notes attached to the commit in "format-patch --notes"
output after three-dashes.
* jc/prettier-pretty-note:
format-patch: add a blank line between notes and diffstat
Doc User-Manual: Patch cover letter, three dashes, and --notes
Doc format-patch: clarify --notes use case
Doc notes: Include the format-patch --notes option
Doc SubmittingPatches: Mention --notes option after "cover letter"
Documentation: decribe format-patch --notes
format-patch --notes: show notes after three-dashes
format-patch: append --signature after notes
pretty_print_commit(): do not append notes message
pretty: prepare notes message at a centralized place
format_note(): simplify API
pretty: remove reencode_commit_message()
Follow-on to the new "--set-upstream-to" topic from v1.8.0 to avoid
suggesting the deprecated "--set-upstream".
* mg/maint-pull-suggest-upstream-to:
push/pull: adjust missing upstream help text to changed interface
Improve the asymptotic performance of the cat_sort_uniq notes merge
strategy.
* mh/notes-string-list:
string_list_add_refs_from_colon_sep(): use string_list_split()
notes: fix handling of colon-separated values
combine_notes_cat_sort_uniq(): sort and dedup lines all at once
Initialize sort_uniq_list using named constant
string_list: add a function string_list_remove_empty_items()
Allows "cvsimport" to read per-author timezone from the author info
file.
* cr/cvsimport-local-zone:
cvsimport: work around perl tzset issue
git-cvsimport: allow author-specific timezones
Various codepaths checked if two encoding names are the same using
ad-hoc code and some of them ended up asking iconv() to convert
between "utf8" and "UTF-8". The former is not a valid way to spell
the encoding name, but often people use it by mistake, and we
equated them in some but not all codepaths. Introduce a new helper
function to make these codepaths consistent.
* jc/same-encoding:
reencode_string(): introduce and use same_encoding()
Conflicts:
builtin/mailinfo.c
Add "symbolic-ref -d SYM" to delete a symbolic ref SYM.
It is already possible to remove a symbolic ref with "update-ref -d
--no-deref", but it may be a good addition for completeness.
* jh/symbolic-ref-d:
git symbolic-ref --delete $symref
For a fetch refspec (or the result of applying wildcard on one), we
always want the RHS to map to something inside "refs/" hierarchy.
This was split out from discarded jc/maint-push-refs-all topic.
* jc/maint-fetch-tighten-refname-check:
get_fetch_map(): tighten checks on dest refs
The last line of the note text comes immediately before the diffstat
block, making the latter unnecessarily harder to view.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cleans up some leftover bits from an earlier submodule change.
* ph/maint-submodule-status-fix:
submodule status: remove unused orig_* variables
t7407: Fix recursive submodule test
Code cleanups so that libgit.a does not depend on anything in the
builtin/ directory.
* nd/builtin-to-libgit:
fetch-pack: move core code to libgit.a
fetch-pack: remove global (static) configuration variable "args"
send-pack: move core code to libgit.a
Move setup_diff_pager to libgit.a
Move print_commit_list to libgit.a
Move estimate_bisect_steps to libgit.a
Move try_merge_command and checkout_fast_forward to libgit.a
Sometimes curl_multi_timeout() function suggested a wrong timeout
value when there is no file descriptors to wait on and the http
transport ended up sleeping for minutes in select(2) system call.
Detect this and reduce the wait timeout in such a case.
* sz/maint-curl-multi-timeout:
Fix potential hang in https handshake
Cleanups to prepare for mo/cvs-server-updates.
* mo/cvs-server-cleanup:
Use character class for sed expression instead of \s
cvsserver status: provide real sticky info
cvsserver: cvs add: do not expand directory arguments
cvsserver: use whole CVS rev number in-process; don't strip "1." prefix
cvsserver: split up long lines in req_{status,diff,log}
cvsserver: clean up client request handler map comments
cvsserver: remove unused functions _headrev and gethistory
cvsserver update: comment about how we shouldn't remove a user-modified file
cvsserver: add comments about database schema/usage
cvsserver: removed unused sha1Or-k mode from kopts_from_path
cvsserver t9400: add basic 'cvs log' test
"git send-email --compose" can let the user create a non-ascii
cover letter message, but there was not a way to mark it with
appropriate content type before sending it out.
Further updates fix subject quoting.
* km/send-email-compose-encoding:
git-send-email: add rfc2047 quoting for "=?"
git-send-email: introduce quote_subject()
git-send-email: skip RFC2047 quoting for ASCII subjects
git-send-email: use compose-encoding for Subject
git-send-email: introduce compose-encoding
Fixes many rfc2047 quoting issues in the output from format-patch.
* js/format-2047:
format-patch tests: check quoting/encoding in To: and Cc: headers
format-patch: fix rfc2047 address encoding with respect to rfc822 specials
format-patch: make rfc2047 encoding more strict
format-patch: introduce helper function last_line_length()
format-patch: do not wrap rfc2047 encoded headers too late
format-patch: do not wrap non-rfc2047 headers too early
utf8: fix off-by-one wrapping of text
When "update-ref -d --no-deref SYM" tried to delete a symbolic ref
SYM, it incorrectly locked the underlying reference pointed by SYM,
not the symbolic ref itself.
* rs/lock-correct-ref-during-delete:
refs: lock symref that is to be deleted, not its target
Start laying the foundation to build the "wildmatch" after we can
agree on its desired semantics.
* nd/attr-match-optim-more:
attr: more matching optimizations from .gitignore
gitignore: make pattern parsing code a separate function
exclude: split pathname matching code into a separate function
exclude: fix a bug in prefix compare optimization
exclude: split basename matching code into a separate function
exclude: stricten a length check in EXC_FLAG_ENDSWITH case
It makes for simpler code than strbuf_split().
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Jeff King <peff@peff.net>
The substrings output by strbuf_split() include the ':' delimiters.
When processing GIT_NOTES_DISPLAY_REF and GIT_NOTES_REWRITE_REF, strip
off the delimiter character *before* checking whether the substring is
empty rather than after, so that empty strings within the list are
also skipped.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Jeff King <peff@peff.net>
Instead of reading lines one by one and insertion-sorting them into a
string_list, read all of the lines, sort them, then remove duplicates.
Aside from being less code, this reduces the complexity from O(N^2) to
O(N lg N) in the total number of lines.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Acked-by: Johan Herland <johan@herland.net>
Signed-off-by: Jeff King <peff@peff.net>
In case of a missing upstream, the git-parse-remote script suggests:
If you wish to set tracking information for this branch you can do so
with:
git branch --set-upstream nsiv2 origin/<branch>
But --set-upstream is deprectated. Change the suggestion to:
git branch --set-upstream-to=origin/<branch> nsiv2
Reported-by: Jeroen van der Ham <vdham@uva.nl>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Jeff King <peff@peff.net>
Callers of reencode_string() that re-encodes a string from one
encoding to another all used ad-hoc way to bypass the case where the
input and the output encodings are the same. Some did strcmp(),
some did strcasecmp(), yet some others when converting to UTF-8 used
is_encoding_utf8().
Introduce same_encoding() helper function to make these callers use
the same logic. Notably, is_encoding_utf8() has a work-around for
common misconfiguration to use "utf8" to name UTF-8 encoding, which
does not match "UTF-8" hence strcasecmp() would not consider the
same. Make use of it in this helper function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On many platforms, the first invocation of localtime_r will
check $TZ in the environment, but subsequent invocations
will use a cached value. That means that setting $ENV{TZ} in
the middle of the program may or may not have an effect on
later calls to localtime. Perl 5.10.0 and later handles
this automatically for us, but we try to remain portable
back to 5.8. Work around it by calling tzset ourselves.