Trace output which contains arbitrary strings (e.g., the
arguments to commands which we are running) is always passed
through sq_quote_buf(). That function always adds
single-quotes, even if the output consists of vanilla
characters. This can make the output a bit hard to read.
Let's avoid the quoting if there are no characters which a
shell would interpret. Trace output doesn't necessarily need
to be shell-compatible, but:
- the shell language is a good ballpark for what humans
consider readable (well, humans versed in command line
tools)
- the run_command bits can be cut-and-pasted to a shell,
and we'll keep that property
- it covers any cases which would make the output
visually ambiguous (e.g., embedded whitespace or quotes)
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No caller passes anything but "0" for this parameter, which
requests that the function ignore it completely. In fact, in
all of history there was only one such caller, and it went
away in 7f51f8bc2b (alias: use run_command api to execute
aliases, 2011-01-07).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Gcc 7 adds -Wimplicit-fallthrough, which can warn when a
switch case falls through to the next case. The general idea
is that the compiler can't tell if this was intentional or
not, so you should annotate any intentional fall-throughs as
such, leaving it to complain about any unannotated ones.
There's a GNU __attribute__ which can be used for
annotation, but of course we'd have to #ifdef it away on
non-gcc compilers. Gcc will also recognize
specially-formatted comments, which matches our current
practice. Let's extend that practice to all of the
unannotated sites (which I did look over and verify that
they were behaving as intended).
Ideally in each case we'd actually give some reasons in the
comment about why we're falling through, or what we're
falling through to. And gcc does support that with
-Wimplicit-fallthrough=2, which relaxes the comment pattern
matching to anything that contains "fallthrough" (or a
variety of spelling variants). However, this isn't the
default for -Wimplicit-fallthrough, nor for -Wextra. In the
name of simplicity, it's probably better for us to support
the default level, which requires "fallthrough" to be the
only thing in the comment (modulo some window dressing like
"else" and some punctuation; see the gcc manual for the
complete set of patterns).
This patch suppresses all warnings due to
-Wimplicit-fallthrough. We might eventually want to add that
to the DEVELOPER Makefile knob, but we should probably wait
until gcc 7 is more widely adopted (since earlier versions
will complain about the unknown warning type).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git grep -i" has been taught to fold case in non-ascii locales
correctly.
* nd/icase:
grep.c: reuse "icase" variable
diffcore-pickaxe: support case insensitive match on non-ascii
diffcore-pickaxe: Add regcomp_or_die()
grep/pcre: support utf-8
gettext: add is_utf8_locale()
grep/pcre: prepare locale-dependent tables for icase matching
grep: rewrite an if/else condition to avoid duplicate expression
grep/icase: avoid kwsset when -F is specified
grep/icase: avoid kwsset on literal non-ascii strings
test-regex: expose full regcomp() to the command line
test-regex: isolate the bug test code
grep: break down an "if" stmt in preparation for next changes
Similar to the previous commit, we can't use kws on icase search
outside ascii range. But we can't simply pass the pattern to
regcomp/pcre like the previous commit because it may contain regex
special characters, so we need to quote the regex first.
To avoid misquote traps that could lead to undefined behavior, we
always stick to basic regex engine in this case. We don't need fancy
features for grepping a literal string anyway.
basic_regex_quote_buf() assumes that if the pattern is in a multibyte
encoding, ascii chars must be unambiguously encoded as single
bytes. This is true at least for UTF-8. For others, let's wait until
people yell up. Chances are nobody uses multibyte, non utf-8 charsets
anymore.
Noticed-by: Plamen Totev <plamen.totev@abv.bg>
Helped-by: René Scharfe <l.s.r@web.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A big comment at the beginning of quote.c is really
related to sq_quote_buf(), so let's move it in front
of this function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 77d604c (Enhanced sq_quote(), 10 Oct 2005), the
comment at the beginning of quote.c is broken.
Let's fix it.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
ls-tree uses read_tree_recursive() which already does path filtering
using pathspec. No need to filter one more time based on prefix
only. "ls-tree ../somewhere" does not work because of
this. write_name_quotedpfx() can now be retired because nobody else
uses it.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove sq_quote_print() since it has no callers.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The print_value() function in for-each-ref.c prints values to stdout
immediately using {sq|perl|python|tcl}_quote_print(). Change these
lower-level quote functions to instead leave their results in strbuf
so that we can later add post-processing to the results of them.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After substitute path_relative() in quote.c with relative_path()
from path.c, parameters (such as len and prefix_len) are redundant
in function write_name() and write_name_quoted_relative(). The
callers have already been audited that the strings they pass are
properly NUL terminated and the length they give are the length of
the string (or -1 that asks the length to be counted by the callee).
Remove these now-redundant parameters.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
quote_path_relative() used to take a counted string as its parameter
(the string to be quoted). With an earlier change, it now uses
relative_path() that does not take a counted string, and we have
been passing only the pointer to the string since then.
Remove the length parameter from quote_path_relative() to show that
this parameter was redundant. All the changed lines show that the
caller passed either -1 (to ask the function run strlen() on the
string), or the length of the string, so the earlier conversion was
safe.
All the callers of quote_path_relative() that used to take counted string
have been audited to make sure that they are passing length of the actual
string (or -1 to ask the callee run strlen())
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Substitute the function path_relative in quote.c with the function
relative_path. Function relative_path can be treated as an enhanced
and more robust version of path_relative.
Outputs of path_relative and it's replacement (relative_path) are the
same for the following cases:
path prefix output of path_relative output of relative_path
======== ========= ======================= =======================
/a/b/c/ /a/b/ c/ c/
/a/b/c /a/b/ c c
/a/ /a/b/ ../ ../
/ /a/b/ ../../ ../../
/a/c /a/b/ ../c ../c
/x/y /a/b/ ../../x/y ../../x/y
a/b/c/ a/b/ c/ c/
a/ a/b/ ../ ../
x/y a/b/ ../../x/y ../../x/y
/a/b (empty) /a/b /a/b
/a/b (null) /a/b /a/b
a/b (empty) a/b a/b
a/b (null) a/b a/b
But if both of the path and the prefix are the same, or the returned
relative path should be the current directory, the outputs of both
functions are different. Function relative_path returns "./", while
function path_relative returns empty string.
path prefix output of path_relative output of relative_path
======== ========= ======================= =======================
/a/b/ /a/b/ (empty) ./
a/b/ a/b/ (empty) ./
(empty) (null) (empty) ./
(empty) (empty) (empty) ./
But the callers of path_relative can handle such cases, or never
encounter this issue at all, because:
* In function quote_path_relative, if the output of path_relative is
empty, append "./" to it, like:
if (!out->len)
strbuf_addstr(out, "./");
* Another caller is write_name_quoted_relative, which is only used
by builtin/ls-files.c. git-ls-files only show files, so path of
files will never be identical with the prefix of a directory.
The following differences show that path_relative does not handle
extra slashes properly:
path prefix output of path_relative output of relative_path
======== ========= ======================= =======================
/a//b//c/ //a/b// ../../../../a//b//c/ c/
a/b//c a//b ../b//c c
And if prefix has no trailing slash, path_relative does not work
properly either. But since prefix always has a trailing slash, it's
not a problem.
path prefix output of path_relative output of relative_path
======== ========= ======================= =======================
/a/b/c/ /a/b b/c/ c/
/a/b /a/b b ./
/a/b/ /a/b b/ ./
/a /a/b/ ../../a ../
a/b/c/ a/b b/c/ c/
a/b/ a/b b/ ./
a a/b ../a ../
x/y a/b/ ../x/y ../../x/y
a/c a/b c ../c
/a/ /a/b (empty) ../
(empty) /a/b ../../ ./
One tricky part in this conversion is write_name() function in
ls-files.c. It takes a counted string, <name, len>, that is to be
made relative to <prefix, prefix_len> and then quoted. Because
write_name_quoted_relative() still takes these two parameters as
counted string, but ignores the count and treat these two as
NUL-terminated strings, this conversion needs to be audited for its
callers:
- For <name, len>, all three callers of write_name() passes a
NUL-terminated string and its true length, so this patch makes
"len" unused.
- For <prefix, prefix_len>, prefix could be a string that is longer
than empty while prefix_len could be 0 when "--full-name" option
is used. This is fixed by checking prefix_len in write_name()
and calling write_name_quoted_relative() with NULL when
prefix_len is set to 0. Again, this makes "prefix_len" given to
write_name_quoted_relative() unused, without introducing a bug.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/argv-array:
run_hook: use argv_array API
checkout: use argv_array API
bisect: use argv_array API
quote: provide sq_dequote_to_argv_array
refactor argv_array into generic code
quote.h: fix bogus comment
add sha1_array API docs
This is similar to sq_dequote_to_argv, but more convenient
if you have an argv_array. It's tempting to just feed the
components of the argv_array to sq_dequote_to_argv instead,
but:
1. It wouldn't maintain the NULL-termination invariant
of argv_array.
2. It doesn't match the memory ownership policy of
argv_array (in which each component is free-able, not a
pointer into a separate buffer).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following sequence of commands reveals an issue with error
reporting of relative paths:
$ mkdir sub
$ cd sub
$ git ls-files --error-unmatch ../bbbbb
error: pathspec 'b' did not match any file(s) known to git.
$ git commit --error-unmatch ../bbbbb
error: pathspec 'b' did not match any file(s) known to git.
This bug is visible only if the normalized path (i.e., the relative
path from the repository root) is longer than the prefix.
Otherwise, the code skips over the normalized path and reads from
an unused memory location which still contains a leftover of the
original command line argument.
So instead, use the existing facilities to deal with relative paths
correctly.
Also fix inconsistency between "checkout" and "commit", e.g.
$ cd Documentation
$ git checkout nosuch.txt
error: pathspec 'Documentation/nosuch.txt' did not match...
$ git commit nosuch.txt
error: pathspec 'nosuch.txt' did not match...
by propagating the prefix down the codepath that reports the error.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is in preparation of relative path support for ls-files, which
quotes a path only if the line terminator is not the NUL character.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function did not work on strings that were not NUL-terminated. It
reads through a length-bounded string, searching for characters in need of
quoting. After we find one, we output the quoted character, then advance
our pointer to find the next one. However, we never decremented the
length, meaning we ended up looking at whatever random junk was stored
after the string.
This bug was not found by the existing tests because most code paths feed
a NUL-terminated string. The notable exception is a directory name being
fed by ls-tree.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few remaining ones, but this fixes the trivial ones. It boils
down to two main issues that sparse complains about:
- warning: Using plain integer as NULL pointer
Sparse doesn't like you using '0' instead of 'NULL'. For various good
reasons, not the least of which is just the visual confusion. A NULL
pointer is not an integer, and that whole "0 works as NULL" is a
historical accident and not very pretty.
A few of these remain: zlib is a total mess, and Z_NULL is just a 0.
I didn't touch those.
- warning: symbol 'xyz' was not declared. Should it be static?
Sparse wants to see declarations for any functions you export. A lack
of a declaration tends to mean that you should either add one, or you
should mark the function 'static' to show that it's in file scope.
A few of these remain: I only did the ones that should obviously just
be made static.
That 'wt_status_submodule_summary' one is debatable. It has a few related
flags (like 'wt_status_use_color') which _are_ declared, and are used by
builtin-commit.c. So maybe we'd like to export it at some point, but it's
not declared now, and not used outside of that file, so 'static' it is in
this patch.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new function unwraps the space separated shell quoted elements in
its first argument and places them in the argv array passed as its second
argument.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sq_dequote() function does not allow parsing a string with more than
one single-quoted parameter easily; use its code to implement a new API
sq_dequote_step() to allow the caller iterate through such a string to
parse them one-by-one. The original sq_dequote() becomes a thin wrapper
around it.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A lot of modules that have nothing to do with git-shell functionality
were linked in, bloating git-shell more than 8 times.
This patch cuts off redundant dependencies by:
1. providing stubs for three functions that make no sense for git-shell;
2. moving quote_path_fully from environment.c to quote.c to make the
later self sufficient;
3. moving make_absolute_path into a new separate file.
The following numbers have been received with the default optimization
settings on master using GCC 4.1.2:
Before:
text data bss dec hex filename
143915 1348 93168 238431 3a35f git-shell
After:
text data bss dec hex filename
17670 788 8232 26690 6842 git-shell
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* dp/clean-fix:
git-clean: add tests for relative path
git-clean: correct printing relative path
Make private quote_path() in wt-status.c available as quote_path_relative()
Revert part of d089eba (setup: sanitize absolute and funny paths in get_pathspec())
Revert part of 1abf095 (git-add: adjust to the get_pathspec() changes)
Revert part of 744dacd (builtin-mv: minimum fix to avoid losing files)
get_pathspec(): die when an out-of-tree path is given
Move quote_path() from wt-status.c to quote.c and rename it as
quote_path_relative(), because it is a better name for a public function.
Also, instead of handcrafted quoting, quote_c_style_counted() is now used,
to make its quoting more consistent with the rest of the system, also
honoring core.quotepath specified in configuration.
Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The optional endp parameter to unquote_c_style() was supposed to point at
a location past the closing double quote, but it was going one beyond it.
git-fast-import used this function heavily and the bug caused it to
misparse the input stream, especially when parsing a rename command:
R "filename that needs quoting" rename-target-name
Because the function erroneously ate the whitespace after the closing dq,
this triggered "Missing space after source" error when it shouldn't.
Thanks to Adeodato Simò for having caught this.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This moves the logic to quote two paths (prefix + path) in
C-style introduced in the previous commit from the
dump_quoted_path() in combine-diff.c to quote.c, and uses it to
fix rewrite_diff() that never C-quoted the pathnames correctly.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that str_buf takes care of all the allocations, there is
no more gain to pass an argument count.
So this patch removes the "count" argument from:
- "sq_quote_argv"
- "trace_argv_printf"
and all the callers.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sq_quote_buf() treats single-quotes and exclamation marks specially, but
it incorrectly parsed the input for single-quotes and backslashes.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For that purpose, the ->buf is always initialized with a char * buf living
in the strbuf module. It is made a char * so that we can sloppily accept
things that perform: sb->buf[0] = '\0', and because you can't pass "" as an
initializer for ->buf without making gcc unhappy for very good reasons.
strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf
anymore.
as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying
->buf isn't an option anymore, if ->buf is going to escape from the scope,
and eventually be free'd.
API changes:
* strbuf_setlen now always works, so just make strbuf_reset a convenience
macro.
* strbuf_detatch takes a size_t* optional argument (meaning it can be
NULL) to copy the buffer's len, as it was needed for this refactor to
make the code more readable, and working like the callers.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sq_quote_buf is made public, and works on a strbuf.
* sq_quote_argv also works on a strbuf.
* make sq_quote_argv take a "maxlen" argument to check the buffer won't grow
too big.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* quote_c_style works on a strbuf instead of a wild buffer.
* quote_c_style is now clever enough to not add double quotes if not needed.
* write_name_quoted inherits those advantages, but also take a different
set of arguments. Now instead of asking for quotes or not, you pass a
"terminator". If it's \0 then we assume you don't want to escape, else C
escaping is performed. In any case, the terminator is also appended to the
stream. It also no longer takes the prefix/prefix_len arguments, as it's
seldomly used, and makes some optimizations harder.
* write_name_quotedpfx is created to work like write_name_quoted and take
the prefix/prefix_len arguments.
Thanks to those API changes, diff.c has somehow lost weight, thanks to the
removal of functions that were wrappers around the old write_name_quoted
trying to give it a semantics like the new one, but performing a lot of
allocations for this goal. Now we always write directly to the stream, no
intermediate allocation is performed.
As a side effect of the refactor in builtin-apply.c, the length of the bar
graphs in diffstats are not affected anymore by the fact that the path was
clipped.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
If the gain is not obvious in the diffstat, the resulting code is more
readable, _and_ in checkout-index/update-index we now reuse the same buffer
to unquote strings instead of always freeing/mallocing.
This also is more coherent with the next patch that reworks quoting
functions.
The quoting function is also made more efficient scanning for backslashes
and treating portions of strings without a backslash at once.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
We always quote "unusual" byte values in a pathname using
C-string style, to make it safer for parsing scripts that do not
handle NUL separated records well (or just too lazy to bother).
The absolute minimum bytes that need to be quoted for this
purpose are TAB, LF (and other control characters), double quote
and backslash.
However, we have also always quoted the bytes in high 8-bit
range; this was partly because we were lazy and partly because
we were being cautious.
This introduces an internal "quote_path_fully" variable, and
core.quotepath configuration variable to control it. When set
to false, it does not quote bytes in high 8-bit range anymore
but passes them intact.
The variable defaults to "true" to retain the traditional
behaviour for now.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Love it or hate it, some people actually still program in Tcl. Some
of those programs are meant for interfacing with Git. Programs such as
gitk and git-gui. It may be useful to have Tcl-safe output available
from for-each-ref, just like shell, Perl and Python already enjoy.
Thanks to Sergey Vlasov for pointing out the horrible flaws in the
first and second version of this patch, and steering me in the right
direction for Tcl value quoting.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
* maint:
revision traversal: --unpacked does not limit commit list anymore.
Continue traversal when rev-list --unpacked finds a packed commit.
Use memmove instead of memcpy for overlapping areas
quote.c: ensure the same quoting across platforms.
Surround "#define DEBUG 0" with "#ifndef DEBUG..#endif"
We read a byte from "char *" and compared it with ' ' to decide
if it needs quoting to protect textual output. With a platform
where char is unsigned char that would give different result.
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds a new command, git-for-each-ref. You can have it iterate
over refs and have it output various aspects of the objects they
refer to.
Signed-off-by: Junio C Hamano <junkio@cox.net>
So that this function may be used in places other than "rsh.c".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Now if GIT_TRACE is set to an integer value greater than 1
and lower than 10, we interpret this as an open fd value
and we trace into it. Note that this behavior is not
compatible with the previous one.
We also trace whole messages using one write(2) call to
make sure messages from processes do net get mixed up in
the middle.
It's now possible to run the tests like this:
GIT_TRACE=9 make test 9>/var/tmp/trace.log
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
With the environment variable GIT_TRACE set git will show
- alias expansion
- built-in command execution
- external command execution
on stderr.
Signed-off-by: Matthias Lederhofer <matled@gmx.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
Though very nice and readable, the "case 'a'...'z':" construct is not ANSI C99
compliant. This patch unfolds the range in `quote.c' and substitutes the
switch-statement with an if-statement in `http-fetch.c' and `http-push.c'.
Signed-off-by: Florian Forster <octo@verplant.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
The switch is inside an if statement which is false if
the character is ' '. Either the if should be <=' '
instead of <' ', or the case should be removed as it could
be misleading.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
quote_c_style_counted() in quote.c uses a hard-to-read construct.
Convert this to a more traditional form of the for loop.
Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
This adds a very git specific restricted shell, that can be
added to /etc/shells and set to the pw_shell in the /etc/passwd
file, to give users ability to push into repositories over ssh
without giving them full interactive shell acount.
[jc: I updated Linus' patch to match what the current sq_quote()
does.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>