Commit Graph

135 Commits

Author SHA1 Message Date
Junio C Hamano
6a67695268 Merge branch 'js/regexec-buf'
Some codepaths in "git diff" used regexec(3) on a buffer that was
mmap(2)ed, which may not have a terminating NUL, leading to a read
beyond the end of the mapped region.  This was fixed by introducing
a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND
extension.

* js/regexec-buf:
  regex: use regexec_buf()
  regex: add regexec_buf() that can work on a non NUL-terminated string
  regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails
2016-09-26 16:09:19 -07:00
Johannes Schindelin
b7d36ffca0 regex: use regexec_buf()
The new regexec_buf() function operates on buffers with an explicitly
specified length, rather than NUL-terminated strings.

We need to use this function whenever the buffer we want to pass to
regexec(3) may have been mmap(2)ed (and is hence not NUL-terminated).

Note: the original motivation for this patch was to fix a bug where
`git diff -G <regex>` would crash. This patch converts more callers,
though, some of which allocated to construct NUL-terminated strings,
or worse, modified buffers to temporarily insert NULs while calling
regexec(3).  By converting them to use regexec_buf(), the code has
become much cleaner.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 13:56:15 -07:00
Junio C Hamano
1a5f1a3f25 Merge branch 'js/am-3-merge-recursive-direct'
"git am -3" calls "git merge-recursive" when it needs to fall back
to a three-way merge; this call has been turned into an internal
subroutine call instead of spawning a separate subprocess.

* js/am-3-merge-recursive-direct:
  merge-recursive: flush output buffer even when erroring out
  merge_trees(): ensure that the callers release output buffer
  merge-recursive: offer an option to retain the output in 'obuf'
  merge-recursive: write the commit title in one go
  merge-recursive: flush output buffer before printing error messages
  am -3: use merge_recursive() directly again
  merge-recursive: switch to returning errors instead of dying
  merge-recursive: handle return values indicating errors
  merge-recursive: allow write_tree_from_memory() to error out
  merge-recursive: avoid returning a wholesale struct
  merge_recursive: abort properly upon errors
  prepare the builtins for a libified merge_recursive()
  merge-recursive: clarify code in was_tracked()
  die(_("BUG")): avoid translating bug messages
  die("bug"): report bugs consistently
  t5520: verify that `pull --rebase` shows the helpful advice when failing
2016-08-10 12:33:20 -07:00
Junio C Hamano
b422d99658 Merge branch 'jc/grep-commandline-vs-configuration'
"git -c grep.patternType=extended log --basic-regexp" misbehaved
because the internal API to access the grep machinery was not
designed well.

* jc/grep-commandline-vs-configuration:
  grep: further simplify setting the pattern type
2016-08-04 14:39:18 -07:00
Johannes Schindelin
ef1177d18e die("bug"): report bugs consistently
The vast majority of error messages in Git's source code which report a
bug use the convention to prefix the message with "BUG:".

As part of cleaning up merge-recursive to stop die()ing except in case of
detected bugs, let's just make the remainder of the bug reports consistent
with the de facto rule.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-26 11:13:44 -07:00
Junio C Hamano
8465541e8c grep: further simplify setting the pattern type
When c5c31d33 (grep: move pattern-type bits support to top-level
grep.[ch], 2012-10-03) introduced grep_commit_pattern_type() helper
function, the intention was to allow the users of grep API to having
to fiddle only with .pattern_type_option (which can be set to "fixed",
"basic", "extended", and "pcre"), and then immediately before compiling
the pattern strings for use, call grep_commit_pattern_type() to have
it prepare various bits in the grep_opt structure (like .fixed,
.regflags, etc.).

However, grep_set_pattern_type_option() helper function the grep API
internally uses were left as an external function by mistake.  This
function shouldn't have been made callable by the users of the API.

Later when the grep API was used in revision traversal machinery,
the caller then mistakenly started calling the function around
34a4ae55 (log --grep: use the same helper to set -E/-F options as
"git grep", 2012-10-03), instead of setting the .pattern_type_option
field and letting the grep_commit_pattern_type() to take care of the
details.

This caused an unnecessary bug that made a configured
grep.patternType take precedence over the command line options
(e.g. --basic-regexp, --fixed-strings) in "git log" family of
commands.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-25 09:16:18 -07:00
Junio C Hamano
a883c31af6 Merge branch 'nd/icase'
"git grep -i" has been taught to fold case in non-ascii locales
correctly.

* nd/icase:
  grep.c: reuse "icase" variable
  diffcore-pickaxe: support case insensitive match on non-ascii
  diffcore-pickaxe: Add regcomp_or_die()
  grep/pcre: support utf-8
  gettext: add is_utf8_locale()
  grep/pcre: prepare locale-dependent tables for icase matching
  grep: rewrite an if/else condition to avoid duplicate expression
  grep/icase: avoid kwsset when -F is specified
  grep/icase: avoid kwsset on literal non-ascii strings
  test-regex: expose full regcomp() to the command line
  test-regex: isolate the bug test code
  grep: break down an "if" stmt in preparation for next changes
2016-07-19 13:22:17 -07:00
Nguyễn Thái Ngọc Duy
695f95ba5d grep.c: reuse "icase" variable
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
18547aacf5 grep/pcre: support utf-8
In the previous change in this function, we add locale support for
single-byte encodings only. It looks like pcre only supports utf-* as
multibyte encodings, the others are left in the cold (which is
fine).

We need to enable PCRE_UTF8 so pcre can find character boundary
correctly. It's needed for case folding (when --ignore-case is used)
or '*', '+' or similar syntax is used.

The "has_non_ascii()" check is to be on the conservative side. If
there's non-ascii in the pattern, the searched content could still be
in utf-8, but we can treat it just like a byte stream and everything
should work. If we force utf-8 based on locale only and pcre validates
utf-8 and the file content is in non-utf8 encoding, things break.

Noticed-by: Plamen Totev <plamen.totev@abv.bg>
Helped-by: Plamen Totev <plamen.totev@abv.bg>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
9d9babb84d grep/pcre: prepare locale-dependent tables for icase matching
The default tables are usually built with C locale and only suitable
for LANG=C or similar.  This should make case insensitive search work
correctly for all single-byte charsets.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
e944d9d932 grep: rewrite an if/else condition to avoid duplicate expression
"!icase || ascii_only" is repeated twice in this if/else chain as this
series evolves. Rewrite it (and basically revert the first if
condition back to before the "grep: break down an "if" stmt..." commit).

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
793dc676e0 grep/icase: avoid kwsset when -F is specified
Similar to the previous commit, we can't use kws on icase search
outside ascii range. But we can't simply pass the pattern to
regcomp/pcre like the previous commit because it may contain regex
special characters, so we need to quote the regex first.

To avoid misquote traps that could lead to undefined behavior, we
always stick to basic regex engine in this case. We don't need fancy
features for grepping a literal string anyway.

basic_regex_quote_buf() assumes that if the pattern is in a multibyte
encoding, ascii chars must be unambiguously encoded as single
bytes. This is true at least for UTF-8. For others, let's wait until
people yell up. Chances are nobody uses multibyte, non utf-8 charsets
anymore.

Noticed-by: Plamen Totev <plamen.totev@abv.bg>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:30 -07:00
Nguyễn Thái Ngọc Duy
5c1ebcca4d grep/icase: avoid kwsset on literal non-ascii strings
When we detect the pattern is just a literal string, we avoid heavy
regex engine and use fast substring search implemented in kwsset.c.
But kws uses git-ctype which is locale-independent so it does not know
how to fold case properly outside ascii range. Let regcomp or pcre
take care of this case instead. Slower, but accurate.

Noticed-by: Plamen Totev <plamen.totev@abv.bg>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-27 07:31:35 -07:00
Nguyễn Thái Ngọc Duy
60452a30f5 grep: break down an "if" stmt in preparation for next changes
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-27 07:31:35 -07:00
Junio C Hamano
d15c05a5d0 Merge branch 'rs/xdiff-hunk-with-func-line'
"git show -W" (extend hunks to cover the entire function, delimited
by lines that match the "funcname" pattern) used to show the entire
file when a change added an entire function at the end of the file,
which has been fixed.

* rs/xdiff-hunk-with-func-line:
  xdiff: fix merging of appended hunk with -W
  grep: -W: don't extend context to trailing empty lines
  t7810: add test for grep -W and trailing empty context lines
  xdiff: don't trim common tail with -W
  xdiff: -W: don't include common trailing empty lines in context
  xdiff: ignore empty lines before added functions with -W
  xdiff: handle appended chunks better with -W
  xdiff: factor out match_func_rec()
  t4051: rewrite, add more tests
2016-06-20 11:01:04 -07:00
René Scharfe
4aa2c4753d grep: -W: don't extend context to trailing empty lines
Empty lines between functions are shown by grep -W, as it considers them
to be part of the function preceding them.  They are not interesting in
most languages.  The previous patches stopped showing them for diff -W.

Stop showing empty lines trailing a function with grep -W.  Grep scans
the lines of a buffer from top to bottom and prints matching lines
immediately.  Thus we need to peek ahead in order to determine if an
empty line is part of a function body and worth showing or not.

Remember how far ahead we peeked in order to avoid having to do so
repeatedly when handling multiple consecutive empty lines.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-31 13:08:56 -07:00
Nguyễn Thái Ngọc Duy
7645d8f119 grep.c: use error_errno()
While at there, improve the error message a bit (what operation failed?)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-09 12:29:08 -07:00
Jeff King
3733e69464 use xmallocz to avoid size arithmetic
We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.

There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.

In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King
7ce4fb948c color: add color_set helper for copying raw colors
To set up default colors, we sometimes strcpy() from the
default string literals into our color buffers. This isn't a
bug (assuming the destination is COLOR_MAXLEN bytes), but
makes it harder to audit the code for problematic strcpy
calls.

Let's introduce a color_set which copies under the
assumption that there are COLOR_MAXLEN bytes in the
destination (of course you can call it on a smaller buffer,
so this isn't providing a huge amount of safety, but it's
more convenient than calling xsnprintf yourself).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King
19bdd3e7e1 grep: use xsnprintf to format failure message
This looks at first glance like the sprintf can overflow our
buffer, but it's actually fine; the p->origin string is
something constant and small, like "command line" or "-e
option".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Junio C Hamano
092c4be7f5 Merge branch 'jk/blame-commit-label'
"git blame HEAD -- missing" failed to correctly say "HEAD" when it
tried to say "No such path 'missing' in HEAD".

* jk/blame-commit-label:
  blame.c: fix garbled error message
  use xstrdup_or_null to replace ternary conditionals
  builtin/commit.c: use xstrdup_or_null instead of envdup
  builtin/apply.c: use xstrdup_or_null instead of null_strdup
  git-compat-util: add xstrdup_or_null helper
2015-02-11 13:39:50 -08:00
Jeff King
8c53f0719b use xstrdup_or_null to replace ternary conditionals
This replaces "x ? xstrdup(x) : NULL" with xstrdup_or_null(x).
The change is fairly mechanical, with the exception of
resolve_refdup, which can eliminate a temporary variable.

There are still a few hits grepping for "?.*xstrdup", but
these are of slightly different forms and cannot be
converted (e.g., "x ? xstrdup(x->foo) : NULL").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-13 10:05:48 -08:00
Junio C Hamano
bf1f639ea2 Merge branch 'rs/grep-color-words'
Allow painting or not painting (partial) matches in context lines
when showing "grep -C<num>" output in color.

* rs/grep-color-words:
  grep: add color.grep.matchcontext and color.grep.matchselected
2014-10-31 11:49:47 -07:00
René Scharfe
79a77109d3 grep: add color.grep.matchcontext and color.grep.matchselected
The config option color.grep.match can be used to specify the highlighting
color for matching strings.  Add the options matchContext and matchSelected
to allow different colors to be specified for matching strings in the
context vs. in selected lines.  This is similar to the ms and mc specifiers
in GNU grep's environment variable GREP_COLORS.

Tests are from Zoltan Klinger's earlier attempt to solve the same
issue in a different way.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-28 10:33:50 -07:00
Jeff King
f6c5a2968c color_parse: do not mention variable name in error message
Originally the color-parsing function was used only for
config variables. It made sense to pass the variable name so
that the die() message could be something like:

  $ git -c color.branch.plain=bogus branch
  fatal: bad color value 'bogus' for variable 'color.branch.plain'

These days we call it in other contexts, and the resulting
error messages are a little confusing:

  $ git log --pretty='%C(bogus)'
  fatal: bad color value 'bogus' for variable '--pretty format'

  $ git config --get-color foo.bar bogus
  fatal: bad color value 'bogus' for variable 'command line'

This patch teaches color_parse to complain only about the
value, and then return an error code. Config callers can
then propagate that up to the config parser, which mentions
the variable name. Other callers can provide a custom
message. After this patch these three cases now look like:

  $ git -c color.branch.plain=bogus branch
  error: invalid color value: bogus
  fatal: unable to parse 'color.branch.plain' from command-line config

  $ git log --pretty='%C(bogus)'
  error: invalid color value: bogus
  fatal: unable to parse --pretty format

  $ git config --get-color foo.bar bogus
  error: invalid color value: bogus
  fatal: unable to parse default color value

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-14 11:01:21 -07:00
Junio C Hamano
52df9173fa Merge branch 'as/grep-fullname-config'
Add a configuration variable to force --full-name to be default for
"git grep".

This may cause regressions on scripted users that do not expect
this new behaviour.

* as/grep-fullname-config:
  grep: add grep.fullName config variable
2014-06-03 12:06:39 -07:00
Andreas Schwab
6453f7b348 grep: add grep.fullName config variable
This configuration variable sets the default for the --full-name option.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-20 12:38:00 -07:00
Junio C Hamano
6f6be80ef1 Merge branch 'rs/grep-h-c'
"git grep" learns to handle combination of "-h (no header)" and "-c
(counts)".

* rs/grep-h-c:
  grep: support -h (no header) with --count
  t7810: add missing variables to tests in loop
2014-03-18 13:51:20 -07:00
René Scharfe
f76d947ae1 grep: support -h (no header) with --count
Suppress printing the header (filename) with -h even if in -c/--count
mode.  GNU grep and OpenBSD's grep do the same.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-11 15:05:28 -07:00
Sun He
50546b15ed Use hashcpy() when copying object names
We invented hashcpy() to keep the abstraction of "object name"
behind it.  Use it instead of calling memcpy() with hard-coded
20-byte length when moving object names between pieces of memory.

Leave ppc/sha1.c as-is, because the function is about the SHA-1 hash
algorithm whose output is and will always be 20 bytes.

Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Sun He <sunheehnus@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-06 14:03:12 -08:00
Jeff King
335ec3bf41 grep: allow to use textconv filters
Recently and not so recently, we made sure that log/grep type operations
use textconv filters when a userfacing diff would do the same:

ef90ab6 (pickaxe: use textconv for -S counting, 2012-10-28)
b1c2f57 (diff_grep: use textconv buffers for add/deleted files, 2012-10-28)
0508fe5 (combine-diff: respect textconv attributes, 2011-05-23)

"git grep" currently does not use textconv filters at all, that is
neither for displaying the match and context nor for the actual grepping,
even when requested by --textconv.

Introduce an option "--textconv" which makes git grep use any configured
textconv filters for grepping and output purposes. It is off by default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-10 10:27:31 -07:00
Antoine Pelisse
3ce3ffb840 fix clang -Wtautological-compare with unsigned enum
Create a GREP_HEADER_FIELD_MIN so we can check that the field value is
sane and silence the clang warning.

Clang warning happens because the enum is unsigned (this is
implementation-defined, and there is no negative fields) and the check
is then tautological.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: John Keeping <john@keeping.me.uk>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-25 07:35:55 -08:00
Jeff King
e034d1bb92 Merge branch 'nd/grep-true-path'
"git grep -e pattern <tree>" asked the attribute system to read
"<tree>:.gitattributes" file in the working tree, which was
nonsense.

* nd/grep-true-path:
  grep: stop looking at random places for .gitattributes
2012-10-29 04:13:16 -04:00
Nguyễn Thái Ngọc Duy
55c61688ea grep: stop looking at random places for .gitattributes
grep searches for .gitattributes using "name" field in struct
grep_source but that field is not real on-disk path name. For example,
"grep pattern rev" fills the field with "rev:path", and Git looks for
.gitattributes in the (non-existent but exploitable) path "rev:path"
instead of "path".

This patch passes real paths down to grep_source_load_driver() when:

 - grep on work tree
 - grep on the index
 - grep a commit (or a tag if it points to a commit)

so that these cases look up .gitattributes at proper paths.
.gitattributes lookup is disabled in all other cases.

Initial-work-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12 08:24:44 -07:00
Junio C Hamano
918d4e1c90 revisions: initialize revs->grep_filter using grep_init()
Instead of using the hand-rolled initialization sequence,
use grep_init() to populate the necessary bits.  This opens
the door to allow the calling commands to optionally read
grep.* configuration variables via git_config() if they
want to.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 23:21:29 -07:00
Junio C Hamano
c5c31d3381 grep: move pattern-type bits support to top-level grep.[ch]
Switching between -E/-G/-P/-F correctly needs a lot more than just
flipping opt->regflags bit these days, and we have a nice helper
function buried in builtin/grep.c for the sole use of "git grep".

Extract it so that "log --grep" family can also use it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 23:21:29 -07:00
Junio C Hamano
7687a0541e grep: move the configuration parsing logic to grep.[ch]
The configuration handling is a library-ish part of this program,
that is not specific to "git grep" command.  It should be reusable
by "log" and others.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 16:17:50 -07:00
Junio C Hamano
baa6378ff2 log --grep-reflog: reject the option without -g
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-29 12:07:04 -07:00
Nguyễn Thái Ngọc Duy
72fd13f71c revision: add --grep-reflog to filter commits by reflog messages
Similar to --author/--committer which filters commits by author and
committer header fields. --grep-reflog adds a fake "reflog" header to
commit and a grep filter to search on that line.

All rules to --author/--committer apply except no timestamp stripping.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-29 11:41:14 -07:00
Nguyễn Thái Ngọc Duy
ad4813b3c2 grep: prepare for new header field filter
grep supports only author and committer headers, which have the same
special treatment that later headers may or may not have. Check for
field type and only strip_timestamp() when the field is either author
or committer.

GREP_HEADER_FIELD_MAX is put in the grep_header_field enum to be
calculated automatically, correctly, as long as it's at the end of the
enum.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-29 11:40:58 -07:00
Junio C Hamano
3083301ead grep.c: make two symbols really file-scope static this time
Adding a declaration at the beginning is not sufficient for obvious
reasons. The definition has to be made static.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-20 14:20:09 -07:00
Junio C Hamano
07a7d656dd grep.c: mark private file-scope symbols as static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-15 23:35:39 -07:00
Junio C Hamano
13e4fc7e01 log --grep/--author: honor --all-match honored for multiple --grep patterns
When we have both header expression (which has to be an OR node by
construction) and a pattern expression (which could be anything), we
create a new top-level OR node to bind them together, and the
resulting expression structure looks like this:

             OR
        /          \
       /            \
   pattern            OR
     / \           /     \
    .....    committer    OR
                         /   \
                     author   TRUE

The three elements on the top-level backbone that are inspected by
the "all-match" logic are "pattern", "committer" and "author".  When
there are more than one elements in the "pattern", the top-level
node of the "pattern" part of the subtree is an OR, and that node is
inspected by "all-match".

The result ends up ignoring the "--all-match" given from the command
line.  A match on either side of the pattern is considered a match,
hence:

        git log --grep=A --grep=B --author=C --all-match

shows the same "authored by C and has either A or B" that is correct
only when run without "--all-match".

Fix this by turning the resulting expression around when "--all-match"
is in effect, like this:

              OR
          /        \
         /          \
        /              OR
    committer        /    \
                 author    \
                           pattern

The set of nodes on the top-level backbone in the resulting
expression becomes "committer", "author", and the nodes that are on
the top-level backbone of the "pattern" subexpression.  This makes
the "all-match" logic inspect the same nodes in "pattern" as the
case without the author and/or the committer restriction, and makes
the earlier "log" example to show "authored by C and has A and has
B", which is what the command line expects.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-14 10:12:56 -07:00
Junio C Hamano
17bf35a3c7 grep: teach --debug option to dump the parse tree
Our "grep" allows complex boolean expressions to be formed to match
each individual line with operators like --and, '(', ')' and --not.
Introduce the "--debug" option to show the parse tree to help people
who want to debug and enhance it.

Also "log" learns "--grep-debug" option to do the same.  The command
line parser to the log family is a lot more limited than the general
"git grep" parser, but it has special handling for header matching
(e.g. "--author"), and a parse tree is valuable when working on it.

Note that "--all-match" is *not* any individual node in the parse
tree.  It is an instruction to the evaluator to check all the nodes
in the top-level backbone have matched and reject a document as
non-matching otherwise.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-14 10:10:35 -07:00
Junio C Hamano
2147cb2762 Merge branch 'rs/maint-grep-F' into maint
"git grep -e '$pattern'", unlike the case where the patterns are read from
a file, did not treat individual lines in the given pattern argument as
separate regular expressions as it should.

By René Scharfe
* rs/maint-grep-F:
  grep: stop leaking line strings with -f
  grep: support newline separated pattern list
  grep: factor out do_append_grep_pat()
  grep: factor out create_grep_pat()
2012-06-01 13:01:41 -07:00
René Scharfe
526a858a99 grep: support newline separated pattern list
Currently, patterns that contain newline characters don't match anything
when given to git grep.  Regular grep(1) interprets patterns as lists of
newline separated search strings instead.

Implement this functionality by creating and inserting extra grep_pat
structures for patterns consisting of multiple lines when appending to
the pattern lists.  For simplicity, all pattern strings are duplicated.
The original pattern is truncated in place to make it contain only the
first line.

Requested-by: Torne (Richard Coles) <torne@google.com>
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-20 15:25:46 -07:00
René Scharfe
2b3873ff34 grep: factor out do_append_grep_pat()
Add do_append_grep_pat() as a shared function for adding patterns to
the header pattern list and the general pattern list.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-20 15:12:25 -07:00
René Scharfe
fc45675110 grep: factor out create_grep_pat()
Add create_grep_pat(), a shared helper for all grep pattern allocation
and initialization needs.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-20 15:12:22 -07:00
Junio C Hamano
f5f37461a9 Merge branch 'ah/maint-grep-double-init' into maint
By Angus Hammond
* ah/maint-grep-double-init:
  grep.c: remove redundant line of code
2012-05-11 11:16:09 -07:00
Angus Hammond
2385f24625 grep.c: remove redundant line of code
Signed-off-by: Angus Hammond <angusgh@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-07 11:25:04 -07:00