1184a95ea2
They were disabled at 53b8d93
("grep: disable threading in non-worktree
case", 12-12-2011), due to observable performance drops (to the point
that using a single thread would be faster than multiple threads). But
now that zlib inflation can be performed in parallel we can regain the
speedup, so let's re-enable threads in non-worktree grep.
Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:
Threads | Regex 1 | Regex 2
---------|------------|-----------
1 | 17.2920s | 20.9624s
2 | 9.6512s | 11.3184s
4 | 6.7723s | 7.6268s
8** | 6.2886s | 6.9843s
These are all means of 30 executions after 2 warmup runs. All tests were
executed on an i7-7700HQ (quad-core w/ hyper-threading), 16GB of RAM and
SSD, running Manjaro Linux. But to make sure the optimization also
performs well on HDD, the tests were repeated on another machine with an
i5-4210U (dual-core w/ hyper-threading), 8GB of RAM and HDD (SATA III,
5400 rpm), also running Manjaro Linux:
Threads | Regex 1 | Regex 2
---------|------------|-----------
1 | 18.4035s | 22.5368s
2 | 12.5063s | 14.6409s
4** | 10.9136s | 12.7106s
** Note that in these cases we relied on hyper-threading, and that's
probably why we don't see a big difference in time.
Unfortunately, multithreaded git-grep might be slow in the non-worktree
case when --textconv is used and there're too many text conversions.
Probably the reason for this is that the object read lock is used to
protect fill_textconv() and therefore there is a mutual exclusion
between textconv execution and object reading. Because both are
time-consuming operations, not being able to perform them in parallel
can cause performance drops. To inform the users about this (and other
threading details), let's also add a "NOTES ON THREADS" section to
Documentation/git-grep.txt.
[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
04-06-2019), after a 'git gc' execution.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
364 lines
11 KiB
Plaintext
364 lines
11 KiB
Plaintext
git-grep(1)
|
|
===========
|
|
|
|
NAME
|
|
----
|
|
git-grep - Print lines matching a pattern
|
|
|
|
|
|
SYNOPSIS
|
|
--------
|
|
[verse]
|
|
'git grep' [-a | --text] [-I] [--textconv] [-i | --ignore-case] [-w | --word-regexp]
|
|
[-v | --invert-match] [-h|-H] [--full-name]
|
|
[-E | --extended-regexp] [-G | --basic-regexp]
|
|
[-P | --perl-regexp]
|
|
[-F | --fixed-strings] [-n | --line-number] [--column]
|
|
[-l | --files-with-matches] [-L | --files-without-match]
|
|
[(-O | --open-files-in-pager) [<pager>]]
|
|
[-z | --null]
|
|
[ -o | --only-matching ] [-c | --count] [--all-match] [-q | --quiet]
|
|
[--max-depth <depth>] [--[no-]recursive]
|
|
[--color[=<when>] | --no-color]
|
|
[--break] [--heading] [-p | --show-function]
|
|
[-A <post-context>] [-B <pre-context>] [-C <context>]
|
|
[-W | --function-context]
|
|
[--threads <num>]
|
|
[-f <file>] [-e] <pattern>
|
|
[--and|--or|--not|(|)|-e <pattern>...]
|
|
[--recurse-submodules] [--parent-basename <basename>]
|
|
[ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | <tree>...]
|
|
[--] [<pathspec>...]
|
|
|
|
DESCRIPTION
|
|
-----------
|
|
Look for specified patterns in the tracked files in the work tree, blobs
|
|
registered in the index file, or blobs in given tree objects. Patterns
|
|
are lists of one or more search expressions separated by newline
|
|
characters. An empty string as search expression matches all lines.
|
|
|
|
|
|
CONFIGURATION
|
|
-------------
|
|
|
|
grep.lineNumber::
|
|
If set to true, enable `-n` option by default.
|
|
|
|
grep.column::
|
|
If set to true, enable the `--column` option by default.
|
|
|
|
grep.patternType::
|
|
Set the default matching behavior. Using a value of 'basic', 'extended',
|
|
'fixed', or 'perl' will enable the `--basic-regexp`, `--extended-regexp`,
|
|
`--fixed-strings`, or `--perl-regexp` option accordingly, while the
|
|
value 'default' will return to the default matching behavior.
|
|
|
|
grep.extendedRegexp::
|
|
If set to true, enable `--extended-regexp` option by default. This
|
|
option is ignored when the `grep.patternType` option is set to a value
|
|
other than 'default'.
|
|
|
|
grep.threads::
|
|
Number of grep worker threads to use. If unset (or set to 0),
|
|
8 threads are used by default (for now).
|
|
|
|
grep.fullName::
|
|
If set to true, enable `--full-name` option by default.
|
|
|
|
grep.fallbackToNoIndex::
|
|
If set to true, fall back to git grep --no-index if git grep
|
|
is executed outside of a git repository. Defaults to false.
|
|
|
|
|
|
OPTIONS
|
|
-------
|
|
--cached::
|
|
Instead of searching tracked files in the working tree, search
|
|
blobs registered in the index file.
|
|
|
|
--no-index::
|
|
Search files in the current directory that is not managed by Git.
|
|
|
|
--untracked::
|
|
In addition to searching in the tracked files in the working
|
|
tree, search also in untracked files.
|
|
|
|
--no-exclude-standard::
|
|
Also search in ignored files by not honoring the `.gitignore`
|
|
mechanism. Only useful with `--untracked`.
|
|
|
|
--exclude-standard::
|
|
Do not pay attention to ignored files specified via the `.gitignore`
|
|
mechanism. Only useful when searching files in the current directory
|
|
with `--no-index`.
|
|
|
|
--recurse-submodules::
|
|
Recursively search in each submodule that has been initialized and
|
|
checked out in the repository. When used in combination with the
|
|
<tree> option the prefix of all submodule output will be the name of
|
|
the parent project's <tree> object.
|
|
|
|
-a::
|
|
--text::
|
|
Process binary files as if they were text.
|
|
|
|
--textconv::
|
|
Honor textconv filter settings.
|
|
|
|
--no-textconv::
|
|
Do not honor textconv filter settings.
|
|
This is the default.
|
|
|
|
-i::
|
|
--ignore-case::
|
|
Ignore case differences between the patterns and the
|
|
files.
|
|
|
|
-I::
|
|
Don't match the pattern in binary files.
|
|
|
|
--max-depth <depth>::
|
|
For each <pathspec> given on command line, descend at most <depth>
|
|
levels of directories. A value of -1 means no limit.
|
|
This option is ignored if <pathspec> contains active wildcards.
|
|
In other words if "a*" matches a directory named "a*",
|
|
"*" is matched literally so --max-depth is still effective.
|
|
|
|
-r::
|
|
--recursive::
|
|
Same as `--max-depth=-1`; this is the default.
|
|
|
|
--no-recursive::
|
|
Same as `--max-depth=0`.
|
|
|
|
-w::
|
|
--word-regexp::
|
|
Match the pattern only at word boundary (either begin at the
|
|
beginning of a line, or preceded by a non-word character; end at
|
|
the end of a line or followed by a non-word character).
|
|
|
|
-v::
|
|
--invert-match::
|
|
Select non-matching lines.
|
|
|
|
-h::
|
|
-H::
|
|
By default, the command shows the filename for each
|
|
match. `-h` option is used to suppress this output.
|
|
`-H` is there for completeness and does not do anything
|
|
except it overrides `-h` given earlier on the command
|
|
line.
|
|
|
|
--full-name::
|
|
When run from a subdirectory, the command usually
|
|
outputs paths relative to the current directory. This
|
|
option forces paths to be output relative to the project
|
|
top directory.
|
|
|
|
-E::
|
|
--extended-regexp::
|
|
-G::
|
|
--basic-regexp::
|
|
Use POSIX extended/basic regexp for patterns. Default
|
|
is to use basic regexp.
|
|
|
|
-P::
|
|
--perl-regexp::
|
|
Use Perl-compatible regular expressions for patterns.
|
|
+
|
|
Support for these types of regular expressions is an optional
|
|
compile-time dependency. If Git wasn't compiled with support for them
|
|
providing this option will cause it to die.
|
|
|
|
-F::
|
|
--fixed-strings::
|
|
Use fixed strings for patterns (don't interpret pattern
|
|
as a regex).
|
|
|
|
-n::
|
|
--line-number::
|
|
Prefix the line number to matching lines.
|
|
|
|
--column::
|
|
Prefix the 1-indexed byte-offset of the first match from the start of the
|
|
matching line.
|
|
|
|
-l::
|
|
--files-with-matches::
|
|
--name-only::
|
|
-L::
|
|
--files-without-match::
|
|
Instead of showing every matched line, show only the
|
|
names of files that contain (or do not contain) matches.
|
|
For better compatibility with 'git diff', `--name-only` is a
|
|
synonym for `--files-with-matches`.
|
|
|
|
-O[<pager>]::
|
|
--open-files-in-pager[=<pager>]::
|
|
Open the matching files in the pager (not the output of 'grep').
|
|
If the pager happens to be "less" or "vi", and the user
|
|
specified only one pattern, the first file is positioned at
|
|
the first match automatically. The `pager` argument is
|
|
optional; if specified, it must be stuck to the option
|
|
without a space. If `pager` is unspecified, the default pager
|
|
will be used (see `core.pager` in linkgit:git-config[1]).
|
|
|
|
-z::
|
|
--null::
|
|
Output \0 instead of the character that normally follows a
|
|
file name.
|
|
|
|
-o::
|
|
--only-matching::
|
|
Print only the matched (non-empty) parts of a matching line, with each such
|
|
part on a separate output line.
|
|
|
|
-c::
|
|
--count::
|
|
Instead of showing every matched line, show the number of
|
|
lines that match.
|
|
|
|
--color[=<when>]::
|
|
Show colored matches.
|
|
The value must be always (the default), never, or auto.
|
|
|
|
--no-color::
|
|
Turn off match highlighting, even when the configuration file
|
|
gives the default to color output.
|
|
Same as `--color=never`.
|
|
|
|
--break::
|
|
Print an empty line between matches from different files.
|
|
|
|
--heading::
|
|
Show the filename above the matches in that file instead of
|
|
at the start of each shown line.
|
|
|
|
-p::
|
|
--show-function::
|
|
Show the preceding line that contains the function name of
|
|
the match, unless the matching line is a function name itself.
|
|
The name is determined in the same way as 'git diff' works out
|
|
patch hunk headers (see 'Defining a custom hunk-header' in
|
|
linkgit:gitattributes[5]).
|
|
|
|
-<num>::
|
|
-C <num>::
|
|
--context <num>::
|
|
Show <num> leading and trailing lines, and place a line
|
|
containing `--` between contiguous groups of matches.
|
|
|
|
-A <num>::
|
|
--after-context <num>::
|
|
Show <num> trailing lines, and place a line containing
|
|
`--` between contiguous groups of matches.
|
|
|
|
-B <num>::
|
|
--before-context <num>::
|
|
Show <num> leading lines, and place a line containing
|
|
`--` between contiguous groups of matches.
|
|
|
|
-W::
|
|
--function-context::
|
|
Show the surrounding text from the previous line containing a
|
|
function name up to the one before the next function name,
|
|
effectively showing the whole function in which the match was
|
|
found.
|
|
|
|
--threads <num>::
|
|
Number of grep worker threads to use.
|
|
See `grep.threads` in 'CONFIGURATION' for more information.
|
|
|
|
-f <file>::
|
|
Read patterns from <file>, one per line.
|
|
+
|
|
Passing the pattern via <file> allows for providing a search pattern
|
|
containing a \0.
|
|
+
|
|
Not all pattern types support patterns containing \0. Git will error
|
|
out if a given pattern type can't support such a pattern. The
|
|
`--perl-regexp` pattern type when compiled against the PCRE v2 backend
|
|
has the widest support for these types of patterns.
|
|
+
|
|
In versions of Git before 2.23.0 patterns containing \0 would be
|
|
silently considered fixed. This was never documented, there were also
|
|
odd and undocumented interactions between e.g. non-ASCII patterns
|
|
containing \0 and `--ignore-case`.
|
|
+
|
|
In future versions we may learn to support patterns containing \0 for
|
|
more search backends, until then we'll die when the pattern type in
|
|
question doesn't support them.
|
|
|
|
-e::
|
|
The next parameter is the pattern. This option has to be
|
|
used for patterns starting with `-` and should be used in
|
|
scripts passing user input to grep. Multiple patterns are
|
|
combined by 'or'.
|
|
|
|
--and::
|
|
--or::
|
|
--not::
|
|
( ... )::
|
|
Specify how multiple patterns are combined using Boolean
|
|
expressions. `--or` is the default operator. `--and` has
|
|
higher precedence than `--or`. `-e` has to be used for all
|
|
patterns.
|
|
|
|
--all-match::
|
|
When giving multiple pattern expressions combined with `--or`,
|
|
this flag is specified to limit the match to files that
|
|
have lines to match all of them.
|
|
|
|
-q::
|
|
--quiet::
|
|
Do not output matched lines; instead, exit with status 0 when
|
|
there is a match and with non-zero status when there isn't.
|
|
|
|
<tree>...::
|
|
Instead of searching tracked files in the working tree, search
|
|
blobs in the given trees.
|
|
|
|
\--::
|
|
Signals the end of options; the rest of the parameters
|
|
are <pathspec> limiters.
|
|
|
|
<pathspec>...::
|
|
If given, limit the search to paths matching at least one pattern.
|
|
Both leading paths match and glob(7) patterns are supported.
|
|
+
|
|
For more details about the <pathspec> syntax, see the 'pathspec' entry
|
|
in linkgit:gitglossary[7].
|
|
|
|
EXAMPLES
|
|
--------
|
|
|
|
`git grep 'time_t' -- '*.[ch]'`::
|
|
Looks for `time_t` in all tracked .c and .h files in the working
|
|
directory and its subdirectories.
|
|
|
|
`git grep -e '#define' --and \( -e MAX_PATH -e PATH_MAX \)`::
|
|
Looks for a line that has `#define` and either `MAX_PATH` or
|
|
`PATH_MAX`.
|
|
|
|
`git grep --all-match -e NODE -e Unexpected`::
|
|
Looks for a line that has `NODE` or `Unexpected` in
|
|
files that have lines that match both.
|
|
|
|
`git grep solution -- :^Documentation`::
|
|
Looks for `solution`, excluding files in `Documentation`.
|
|
|
|
NOTES ON THREADS
|
|
----------------
|
|
|
|
The `--threads` option (and the grep.threads configuration) will be ignored when
|
|
`--open-files-in-pager` is used, forcing a single-threaded execution.
|
|
|
|
When grepping the object store (with `--cached` or giving tree objects), running
|
|
with multiple threads might perform slower than single threaded if `--textconv`
|
|
is given and there're too many text conversions. So if you experience low
|
|
performance in this case, it might be desirable to use `--threads=1`.
|
|
|
|
GIT
|
|
---
|
|
Part of the linkgit:git[1] suite
|