Commit Graph

44455 Commits

Author SHA1 Message Date
Junio C Hamano
f838198357 Merge branch 'jc/deref-tag' into maint
Code clean-up.

* jc/deref-tag:
  blame, line-log: do not loop around deref_tag()
2016-07-06 13:06:46 -07:00
Junio C Hamano
f7927316cf Merge branch 'pb/strbuf-read-file-doc' into maint
Minor doc update.

* pb/strbuf-read-file-doc:
  strbuf: describe the return value of strbuf_read_file
2016-07-06 13:06:45 -07:00
Junio C Hamano
1c22105f2c Merge branch 'jk/fetch-prune-doc' into maint
Minor doc update.

* jk/fetch-prune-doc:
  fetch: document that pruning happens before fetching
2016-07-06 13:06:44 -07:00
Junio C Hamano
9d3d0dbb14 Merge branch 'pc/occurred' into maint
Typofix.

* pc/occurred:
  config.c: fix misspelt "occurred" in an error message
  refs.h: fix misspelt "occurred" in a comment
2016-07-06 13:06:43 -07:00
Junio C Hamano
25227f0bea Merge branch 'mg/cherry-pick-multi-on-unborn' into maint
"git cherry-pick A" worked on an unborn branch, but "git
cherry-pick A..B" didn't.

* mg/cherry-pick-multi-on-unborn:
  cherry-pick: allow to pick to unborn branches
2016-07-06 13:06:42 -07:00
Junio C Hamano
af3a43cb11 Merge branch 'em/newer-freebsd-shells-are-fine-with-returns' into maint
Comments about misbehaving FreeBSD shells have been clarified with
the version number (9.x and before are broken, newer ones are OK).

* em/newer-freebsd-shells-are-fine-with-returns:
  rebase: update comment about FreeBSD /bin/sh
2016-07-06 13:06:41 -07:00
Junio C Hamano
89aef71d0e Merge branch 'lv/status-say-working-tree-not-directory' into maint
"git status" used to say "working directory" when it meant "working
tree".

* lv/status-say-working-tree-not-directory:
  Use "working tree" instead of "working directory" for git status
2016-07-06 13:06:40 -07:00
Junio C Hamano
1729853432 Merge branch 'nb/gnome-keyring-build' into maint
Build improvements for gnome-keyring (in contrib/)

* nb/gnome-keyring-build:
  gnome-keyring: Don't hard-code pkg-config executable
2016-07-06 13:06:40 -07:00
Junio C Hamano
c8b080af71 Merge branch 'et/add-chmod-x' into maint
"git update-index --add --chmod=+x file" may be usable as an escape
hatch, but not a friendly thing to force for people who do need to
use it regularly.  "git add --chmod=+x file" can be used instead.

* et/add-chmod-x:
  add: add --chmod=+x / --chmod=-x options
2016-07-06 13:06:39 -07:00
Junio C Hamano
1f5d429e4a Merge branch 'jk/avoid-unbounded-alloca' into maint
A codepath that used alloca(3) to place an unbounded amount of data
on the stack has been updated to avoid doing so.

* jk/avoid-unbounded-alloca:
  tree-diff: avoid alloca for large allocations
2016-07-06 13:06:39 -07:00
Junio C Hamano
c0144452ef Merge branch 'rj/compat-regex-size-max-fix' into maint
A compilation fix.

* rj/compat-regex-size-max-fix:
  regex: fix a SIZE_MAX macro redefinition warning
2016-07-06 13:06:38 -07:00
Junio C Hamano
8162401fb0 Merge branch 'vs/prompt-avoid-unset-variable' into maint
The git-prompt scriptlet (in contrib/) was not friendly with those
who uses "set -u", which has been fixed.

* vs/prompt-avoid-unset-variable:
  git-prompt.sh: Don't error on null ${ZSH,BASH}_VERSION, $short_sha
2016-07-06 13:06:38 -07:00
Junio C Hamano
7949837520 Merge branch 'sg/reflog-past-root' into maint
"git reflog" stopped upon seeing an entry that denotes a branch
creation event (aka "unborn"), which made it appear as if the
reflog was truncated.

* sg/reflog-past-root:
  reflog: continue walking the reflog past root commits
2016-07-06 13:06:37 -07:00
Junio C Hamano
17eb7a7858 Merge branch 'dn/gpg-doc' into maint
The documentation tries to consistently spell "GPG"; when
referring to the specific program name, "gpg" is used.

* dn/gpg-doc:
  Documentation: GPG capitalization
2016-07-06 13:06:36 -07:00
Junio C Hamano
7f223b108d Merge branch 'ap/git-svn-propset-doc' into maint
"git svn propset" subcommand that was added in 2.3 days is
documented now.

* ap/git-svn-propset-doc:
  git-svn: document the 'git svn propset' command
2016-07-06 13:06:35 -07:00
Junio C Hamano
073d0b0914 Merge branch 'tr/doc-tt' into maint
The documentation set has been updated so that literal commands,
configuration variables and environment variables are consistently
typeset in fixed-width font and bold in manpages.

* tr/doc-tt:
  doc: change configuration variables format
  doc: more consistency in environment variables format
  doc: change environment variables format
  doc: clearer rule about formatting literals
2016-07-06 13:06:34 -07:00
Armin Kunaschik
c578a09bd6 t7610: test for mktemp before test execution
mktemp is not available on all platforms, so the test
'temporary filenames are used with mergetool.writeToTemp'
fails there.
This patch does not replace mktemp but just disables
the test that otherwise would fail.
mergetool checks itself before executing mktemp and
reports an error.

Signed-off-by: Armin Kunaschik <megabreit@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 12:18:09 -07:00
Torsten Bögershausen
6523728499 convert: unify the "auto" handling of CRLF
Before this change,
$ echo "* text=auto" >.gitattributes
$ echo "* eol=crlf" >>.gitattributes

would have the same effect as
$ echo "* text" >.gitattributes
$ git config core.eol crlf

Since the 'eol' attribute had higher priority than 'text=auto', this may
corrupt binary files and is not what most users expect to happen.

Make the 'eol' attribute to obey 'text=auto' and now
$ echo "* text=auto" >.gitattributes
$ echo "* eol=crlf" >>.gitattributes
behaves the same as
$ echo "* text=auto" >.gitattributes
$ git config core.eol crlf

In other words,
$ echo "* text=auto eol=crlf" >.gitattributes
has the same effect as
$ git config core.autocrlf true

and
$ echo "* text=auto eol=lf" >.gitattributes
has the same effect as
$ git config core.autocrlf input

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 11:53:51 -07:00
Jeff King
4df7c8a037 Makefile: use VCSSVN_LIB to refer to svn library
We have an abstracted variable; let's use it consistently.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 11:51:03 -07:00
Jeff King
1be36b60f1 Makefile: drop extra dependencies for test helpers
A few test-helpers have Makefile dependencies on specific
object files. But since these files are part of libgit.a
(which all of the helpers link against), the inclusion is
simply redundant.

These were once necessary, but became redundant due to
5c5ba73 (Makefile: Use generic rule to build test programs,
2007-05-31), which added the $(GITLIBS) dependency (but
didn't prune the extra dependency lines). Later commits then
cargo-culted the practice (e.g., b4285c7).

Note that we _do_ need to leave the dependencies on the svn
library, as that is not part of the usual link command.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 11:50:48 -07:00
Nguyễn Thái Ngọc Duy
bc437d1020 fetch: reduce duplicate in ref update status lines with placeholder
In the "remote -> local" line, if either ref is a substring of the
other, the common part in the other string is replaced with "*". For
example

    abc                -> origin/abc
    refs/pull/123/head -> pull/123

become

    abc         -> origin/*
    refs/*/head -> pull/123

Activated with fetch.output=compact.

For the record, this output is not perfect. A single giant ref can
push all refs very far to the right and likely be wrapped around. We
may have a few options:

 - exclude these long lines smarter

 - break the line after "->", exclude it from column width calculation

 - implement a new format, { -> origin/}foo, which makes the problem
   go away at the cost of a bit harder to read

 - reverse all the arrows so we have "* <- looong-ref", again still
   hard to read.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 11:48:25 -07:00
Nguyễn Thái Ngọc Duy
6bc91f23a6 fetch: align all "remote -> local" output
We do align "remote -> local" output by allocating 10 columns to
"remote". That produces aligned output only for short refs. An extra
pass is performed to find the longest remote ref name (that does not
produce a line longer than terminal width) to produce better aligned
output.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 11:48:25 -07:00
David Turner
a199a7c9d0 mailmap: use main email address for dturner
Signed-off-by: David Turner <novalis@novalis.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 10:57:01 -07:00
Jeff King
023ff39b29 parse_options: allocate a new array when concatenating
In exactly one callers (builtin/revert.c), we build up the
options list dynamically from multiple arrays. We do so by
manually inserting "filler" entries into one array, and then
copying the other array into the allocated space.

This is tedious and error-prone, as you have to adjust the
filler any time the second array is modified (although we do
at least check and die() when the counts do not match up).

Instead, let's just allocate a new array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 10:11:08 -07:00
Junio C Hamano
de61cebde7 Merge branch 'jk/common-main-2.8' into jk/common-main
* jk/common-main-2.8:
  mingw: declare main()'s argv as const
  common-main: call git_setup_gettext()
  common-main: call restore_sigpipe_to_default()
  common-main: call sanitize_stdfds()
  common-main: call git_extract_argv0_path()
  add an extra level of indirection to main()
2016-07-06 10:02:57 -07:00
Johannes Schindelin
08aade7080 mingw: declare main()'s argv as const
In 84d32bf (sparse: Fix mingw_main() argument number/type errors,
2013-04-27), we addressed problems identified by the 'sparse' tool where
argv was declared inconsistently. The way we addressed it was by casting
from the non-const version to the const-version.

This patch is long overdue, fixing compat/mingw.h's declaration to
make the "argv" parameter const.  This also allows us to lose the
"const" trickery introduced earlier to common-main.c:main().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 08:11:47 -07:00
Jeff King
03c39b3458 t/lib-git-daemon: use test_match_signal
When git-daemon exits, we expect it to be with the SIGTERM
we just sent it. If we see anything else, we'll complain.
But our check against exit code "143" is not portable. For
example:

  $ ksh93 t5570-git-daemon.sh
  [...]
  error: git daemon exited with status: 271

We can fix this by using test_match_signal.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 07:44:25 -07:00
Jeff King
2472448c88 test_must_fail: use test_match_signal
In 8bf4bec (add "ok=sigpipe" to test_must_fail and use it to
fix flaky tests, 2015-11-27), test_must_fail learned to
recognize "141" as a sigpipe failure. However, testing for
a signal is more complicated than that; we should use
test_match_signal to implement more portable checking.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 07:44:25 -07:00
Jeff King
6f5f9d7476 t0005: use test_match_signal as appropriate
The first test already uses this more portable construct
(that was where it was factored from initially), but the
later tests do a raw comparison against 141 to look for
SIGPIPE, which can fail on some shells and platforms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 07:44:25 -07:00
Jeff King
9b67c9942e tests: factor portable signal check out of t0005
In POSIX shells, a program which exits due to a signal
generally has an exit code of 128 plus the signal number.
However, ksh uses 256 plus the signal number.  We've
accounted for that in t0005, but not in other tests.  Let's
pull out the logic so we can use it elsewhere.

It would be nice for debugging if this additionally printed
errors to stderr, like our other test_* helpers. But we're
going to need to use it in other places besides the innards
of a test_expect block. So let's leave it as generic as
possible.

Note that we also leave the magic "3" for Windows out of the
generic helper. This is an artifact of the way we use
raise() to kill ourselves in test-sigchain.c, and will not
necessarily apply to all programs. So it's better to keep it
out of the helper, to reduce the chance of confusing it with
a real call to exit(3).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06 07:43:29 -07:00
Christopher Layne
19e9542fa2 git-svn: clone: Fail on missing url argument
cmd_clone should detect a missing $url arg before using it otherwise
an uninitialized value error is emitted in even the simplest case of
'git svn clone' without arguments.

Signed-off-by: Christopher Layne <clayne@anodized.com>
Signed-off-by: Eric Wong <e@80x24.org>
2016-07-03 06:04:47 +00:00
Jeff King
5ce5f5fa5a common-main: call git_setup_gettext()
This should be part of every program, as otherwise users do
not get translated error messages. However, some external
commands forgot to do so (e.g., git-credential-store). This
fixes them, and eliminates the repeated code in programs
that did remember to use it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 15:09:10 -07:00
Jeff King
12e0437f23 common-main: call restore_sigpipe_to_default()
This is another safety/sanity setup that should be in force
everywhere, but which we only applied in git.c. This did
catch most cases, since even external commands are typically
run via "git ..." (and the restoration applies to
sub-processes, too). But there were cases we missed, such as
somebody calling git-upload-pack directly via ssh, or
scripts which use dashed external commands directly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 15:09:10 -07:00
Jeff King
57f5d52a94 common-main: call sanitize_stdfds()
This is setup that should be done in every program for
safety, but we never got around to adding it everywhere (so
builtins benefited from the call in git.c, but any external
commands did not). Putting it in the common main() gives us
this safety everywhere.

Note that the case in daemon.c is a little funny. We wait
until we know whether we want to daemonize, and then either:

 - call daemonize(), which will close stdio and reopen it to
   /dev/null under the hood

 - sanitize_stdfds(), to fix up any odd cases

But that is way too late; the point of sanitizing is to give
us reliable descriptors on 0/1/2, and we will already have
executed code, possibly called die(), etc. The sanitizing
should be the very first thing that happens.

With this patch, git-daemon will sanitize first, and can
remove the call in the non-daemonize case. It does mean that
daemonize() may just end up closing the descriptors we
opened, but that's not a big deal (it's not wrong to do so,
nor is it really less optimal than the case where our parent
process redirected us from /dev/null ahead of time).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 15:09:10 -07:00
Jeff King
650c449250 common-main: call git_extract_argv0_path()
Every program which links against libgit.a must call this
function, or risk hitting an assert() in system_path() that
checks whether we have configured argv0_path (though only
when RUNTIME_PREFIX is defined, so essentially only on
Windows).

Looking at the diff, you can see that putting it into the
common main() saves us having to do it individually in each
of the external commands. But what you can't see are the
cases where we _should_ have been doing so, but weren't
(e.g., git-credential-store, and all of the t/helper test
programs).

This has been an accident-waiting-to-happen for a long time,
but wasn't triggered until recently because it involves one
of those programs actually calling system_path(). That
happened with git-credential-store in v2.8.0 with ae5f677
(lazily load core.sharedrepository, 2016-03-11). The
program:

  - takes a lock file, which...

  - opens a tempfile, which...

  - calls adjust_shared_perm to fix permissions, which...

  - lazy-loads the config (as of ae5f677), which...

  - calls system_path() to find the location of
    /etc/gitconfig

On systems with RUNTIME_PREFIX, this means credential-store
reliably hits that assert() and cannot be used.

We never noticed in the test suite, because we set
GIT_CONFIG_NOSYSTEM there, which skips the system_path()
lookup entirely.  But if we were to tweak git_config() to
find /etc/gitconfig even when we aren't going to open it,
then the test suite shows multiple failures (for
credential-store, and for some other test helpers). I didn't
include that tweak here because it's way too specific to
this particular call to be worth carrying around what is
essentially dead code.

The implementation is fairly straightforward, with one
exception: there is exactly one caller (git.c) that actually
cares about the result of the function, and not the
side-effect of setting up argv0_path. We can accommodate
that by simply replacing the value of argv[0] in the array
we hand down to cmd_main().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 15:09:10 -07:00
Jeff King
3f2e2297b9 add an extra level of indirection to main()
There are certain startup tasks that we expect every git
process to do. In some cases this is just to improve the
quality of the program (e.g., setting up gettext()). In
others it is a requirement for using certain functions in
libgit.a (e.g., system_path() expects that you have called
git_extract_argv0_path()).

Most commands are builtins and are covered by the git.c
version of main(). However, there are still a few external
commands that use their own main(). Each of these has to
remember to include the correct startup sequence, and we are
not always consistent.

Rather than just fix the inconsistencies, let's make this
harder to get wrong by providing a common main() that can
run this standard startup.

We basically have two options to do this:

 - the compat/mingw.h file already does something like this by
   adding a #define that replaces the definition of main with a
   wrapper that calls mingw_startup().

   The upside is that the code in each program doesn't need
   to be changed at all; it's rewritten on the fly by the
   preprocessor.

   The downside is that it may make debugging of the startup
   sequence a bit more confusing, as the preprocessor is
   quietly inserting new code.

 - the builtin functions are all of the form cmd_foo(),
   and git.c's main() calls them.

   This is much more explicit, which may make things more
   obvious to somebody reading the code. It's also more
   flexible (because of course we have to figure out _which_
   cmd_foo() to call).

   The downside is that each of the builtins must define
   cmd_foo(), instead of just main().

This patch chooses the latter option, preferring the more
explicit approach, even though it is more invasive. We
introduce a new file common-main.c, with the "real" main. It
expects to call cmd_main() from whatever other objects it is
linked against.

We link common-main.o against anything that links against
libgit.a, since we know that such programs will need to do
this setup. Note that common-main.o can't actually go inside
libgit.a, as the linker would not pick up its main()
function automatically (it has no callers).

The rest of the patch is just adjusting all of the various
external programs (mostly in t/helper) to use cmd_main().
I've provided a global declaration for cmd_main(), which
means that all of the programs also need to match its
signature. In particular, many functions need to switch to
"const char **" instead of "char **" for argv. This effect
ripples out to a few other variables and functions, as well.

This makes the patch even more invasive, but the end result
is much better. We should be treating argv strings as const
anyway, and now all programs conform to the same signature
(which also matches the way builtins are defined).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 15:09:10 -07:00
Charles Bailey
b8e47d1acf grep: fix grepping for "intent to add" files
This reverts commit 4d5520053 (grep: make it clear i-t-a entries are
ignored, 2015-12-27) and adds an alternative fix to maintain the -L
--cached behavior.

4d5520053 caused 'git grep' to no longer find matches in new files in
the working tree where the corresponding index entry had the "intent to
add" bit set, despite the fact that these files are tracked.

The content in the index of a file for which the "intent to add" bit is
set is considered indeterminate and not empty. For most grep queries we
want these to behave the same, however for -L --cached (files without a
match) we don't want to respond positively for "intent to add" files as
their contents are indeterminate. This is in contrast to files with
empty contents in the index (no lines implies no matches for any grep
query expression) which should be reported in the output of a grep -L
--cached invocation.

Add tests to cover this case and a few related cases which previously
lacked coverage.

Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 13:27:41 -07:00
Charles Bailey
89e64100f4 t7810-grep.sh: fix a whitespace inconsistency
Signed-off-by: Charles Bailey <charles@hashpling.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 13:27:38 -07:00
Charles Bailey
878452b966 t7810-grep.sh: fix duplicated test name
Signed-off-by: Charles Bailey <charles@hashpling.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 13:26:30 -07:00
Lukas Fleischer
5e5be9e257 sideband.c: refactor recv_sideband()
We used character buffer manipulations to split messages from the
sideband at line breaks and insert "remote: " at the beginning of
each line, using the packet size to determine the end of a message.

However, since it is safe to assume that diagnostic messages from
the sideband never contain NUL characters, we can also NUL-terminate
the buffer, use strpbrk() for splitting lines and use format strings
to insert the prefix, to make the code easier to read and maintain.

A strbuf is used for accumulating the output which is then printed
using a single write(2) call to ensure the atomicity of the output.
See 9ac13ec (atomic write for sideband remote messages, 2006-10-11)
for details.

Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 13:09:37 -07:00
Vasco Almeida
415c7dd026 t5541: become resilient to GETTEXT_POISON
Use test_i18n* functions for testing text already marked for
translation.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:47:30 -07:00
Nguyễn Thái Ngọc Duy
695f95ba5d grep.c: reuse "icase" variable
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
b51a9c1479 diffcore-pickaxe: support case insensitive match on non-ascii
Similar to the "grep -F -i" case, we can't use kws on icase search
outside ascii range, so we quote the string and pass it to regcomp as
a basic regexp and let regex engine deal with case sensitivity.

The new test is put in t7812 instead of t4209-log-pickaxe because
lib-gettext.sh might cause problems elsewhere, probably.

Noticed-by: Plamen Totev <plamen.totev@abv.bg>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
3d5b23a362 diffcore-pickaxe: Add regcomp_or_die()
There's another regcomp code block coming in this function that needs
the same error handling. This function can help avoid duplicating
error handling code.

Helped-by: Jeff King <peff@peff.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
18547aacf5 grep/pcre: support utf-8
In the previous change in this function, we add locale support for
single-byte encodings only. It looks like pcre only supports utf-* as
multibyte encodings, the others are left in the cold (which is
fine).

We need to enable PCRE_UTF8 so pcre can find character boundary
correctly. It's needed for case folding (when --ignore-case is used)
or '*', '+' or similar syntax is used.

The "has_non_ascii()" check is to be on the conservative side. If
there's non-ascii in the pattern, the searched content could still be
in utf-8, but we can treat it just like a byte stream and everything
should work. If we force utf-8 based on locale only and pcre validates
utf-8 and the file content is in non-utf8 encoding, things break.

Noticed-by: Plamen Totev <plamen.totev@abv.bg>
Helped-by: Plamen Totev <plamen.totev@abv.bg>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
e8c1672655 gettext: add is_utf8_locale()
This function returns true if git is running under an UTF-8
locale. pcre in the next patch will need this.

is_encoding_utf8() is used instead of strcmp() to catch both "utf-8"
and "utf8" suffixes.

When built with no gettext support, we peek in several env variables
to detect UTF-8. pcre library might support utf-8 even if libc is
built without locale support.. The peeking code is a copy from
compat/regex/regcomp.c

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
9d9babb84d grep/pcre: prepare locale-dependent tables for icase matching
The default tables are usually built with C locale and only suitable
for LANG=C or similar.  This should make case insensitive search work
correctly for all single-byte charsets.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
e944d9d932 grep: rewrite an if/else condition to avoid duplicate expression
"!icase || ascii_only" is repeated twice in this if/else chain as this
series evolves. Rewrite it (and basically revert the first if
condition back to before the "grep: break down an "if" stmt..." commit).

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:57 -07:00
Nguyễn Thái Ngọc Duy
793dc676e0 grep/icase: avoid kwsset when -F is specified
Similar to the previous commit, we can't use kws on icase search
outside ascii range. But we can't simply pass the pattern to
regcomp/pcre like the previous commit because it may contain regex
special characters, so we need to quote the regex first.

To avoid misquote traps that could lead to undefined behavior, we
always stick to basic regex engine in this case. We don't need fancy
features for grepping a literal string anyway.

basic_regex_quote_buf() assumes that if the pattern is in a multibyte
encoding, ascii chars must be unambiguously encoded as single
bytes. This is true at least for UTF-8. For others, let's wait until
people yell up. Chances are nobody uses multibyte, non utf-8 charsets
anymore.

Noticed-by: Plamen Totev <plamen.totev@abv.bg>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 12:44:30 -07:00
Jeff King
5caeeb83bc archive-tar: drop return value
We never do any error checks, and so never return anything
but "0". Let's just drop this to simplify the code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 10:26:28 -07:00