strbuf_setlen(., 0) writes '\0' to sb.buf[0], where buf is either
allocated and unique to sb, or the global slopbuf. The slopbuf is meant
to provide a guarantee that buf is not NULL and that a freshly
initialized buffer contains the empty string, but it is not supposed to
be written to. That strbuf_setlen writes to slopbuf has at least two
implications:
First, it's wrong in principle. Second, it might be hiding misuses which
are just waiting to wreak havoc. Third, ThreadSanitizer detects a race
when multiple threads write to slopbuf at roughly the same time, thus
potentially making any more critical races harder to spot.
Avoid writing to strbuf_slopbuf in strbuf_setlen. Let's instead assert
on the first byte of slopbuf being '\0', since it helps ensure the
promised invariant of buf[len] == '\0'. (We know that "len" was already
0, or someone has messed with "alloc". If someone has fiddled with the
fields that much beyond the correct interface, they're on their own.)
This is a function which is used in many places, possibly also in hot
code paths. There are two branches in strbuf_setlen already, and we are
adding a third and possibly a fourth (in the assert). In hot code paths,
we hopefully reuse the buffer in order to avoid continous reallocations.
Thus, after a start-up phase, we should always take the same path,
which might help branch prediction, and we would never make the assert.
If a hot code path continuously reallocates, we probably have bigger
performance problems than this new safety-check.
Simple measurements do not contradict this reasoning. 100000000 times
resetting a buffer and adding the empty string takes 5.29/5.26 seconds
with/without this patch (best of three). Releasing at every iteration
yields 18.01/17.87. Adding a 30-character string instead of the empty
string yields 5.61/5.58 and 17.28/17.28(!).
This patch causes the git binary emitted by gcc 5.4.0 -O2 on my machine
to grow from 11389848 bytes to 11497184 bytes, an increase of 0.9%.
I also tried to piggy-back on the fact that we already check alloc,
which should already tell us whether we are using the slopbuf:
if (sb->alloc) {
if (len > sb->alloc - 1)
die("BUG: strbuf_setlen() beyond buffer");
sb->buf[len] = '\0';
} else {
if (len)
die("BUG: strbuf_setlen() beyond buffer");
assert(!strbuf_slopbuf[0]);
}
sb->len = len;
That didn't seem to be much slower (5.38, 18.02, 5.70, 17.32 seconds),
but it does introduce some minor code duplication. The resulting git
binary was 11510528 bytes large (another 0.1% increase).
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When checking the conditional of "while (me->remaining)", we did not
hold the lock. Calling find_deltas would still be safe, since it checks
"remaining" (after taking the lock) and is able to handle all values. In
fact, this could (currently) not trigger any bug: a bug could happen if
`remaining` transitioning from zero to non-zero races with the evaluation
of the while-condition, but these are always separated by the
data_ready-mechanism.
Make sure we have the lock when we read `remaining`. This does mean we
release it just so that find_deltas can take it immediately again. We
could tweak the contract so that the lock should be taken before calling
find_deltas, but let's defer that until someone can actually show that
"unlock+lock" has a measurable negative impact.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
convert_attrs contains an "if-else". In the "if", we set attr_action
twice, and the first assignment has no effect. In the "else", we do not
set it at all. Since git_check_attr always returns the same value, we'll
always end up in the "if", so there is no problem right now. But
convert_attrs is obviously trying not to rely on such an
implementation-detail of another component.
Make the initialization of attr_action after the if-else. Remove the
earlier assignments.
Suggested-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code refactoring.
* pw/unquote-path-in-git-pm:
t9700: add tests for Git::unquote_path()
Git::unquote_path(): throw an exception on bad path
Git::unquote_path(): handle '\a'
add -i: move unquote_path() to Git.pm
We run an early part of "git gc" that deals with refs before
daemonising (and not under lock) even when running a background
auto-gc, which caused multiple gc processes attempting to run the
early part at the same time. This is now prevented by running the
early part also under the GC lock.
* jk/gc-pre-detach-under-hook:
gc: run pre-detach operations under lock
Code clean-up, that makes us in sync with Debian by one patch.
* jn/hooks-pre-rebase-sample-fix:
pre-rebase hook: capture documentation in a <<here document
The progress meter did not give a useful output when we haven't had
0.5 seconds to measure the throughput during the interval. Instead
show the overall throughput rate at the end, which is a much more
useful number.
* rs/progress-overall-throughput-at-the-end:
progress: show overall rate in last update
On Cygwin, similar to Windows, "git push //server/share/repository"
ought to mean a repository on a network share that can be accessed
locally, but this did not work correctly due to stripping the double
slashes at the beginning.
This may need to be heavily tested before it gets unleashed to the
wild, as the change is at a fairly low-level code and would affect
not just the code to decide if the push destination is local. There
may be unexpected fallouts in the path normalization.
* tb/push-to-cygwin-unc-path:
cygwin: allow pushing to UNC paths
Code cleanup.
* rs/apply-avoid-over-reading:
apply: use strcmp(3) for comparing strings in gitdiff_verify_name()
apply: use starts_with() in gitdiff_verify_name()
If we get a repo path like "-repo.git", we may try to invoke
"git-upload-pack -repo.git". This is going to fail, since
upload-pack will interpret it as a set of bogus options. But
let's reject this before we even run the sub-program, since
we would not want to allow any mischief with repo names that
actually are real command-line options.
You can still ask for such a path via git-daemon, but there's no
security problem there, because git-daemon enters the repo itself
and then passes "." on the command line.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you have a GIT_PROXY_COMMAND configured, we will run it
with the host/port on the command-line. If a URL contains a
mischievous host like "--foo", we don't know how the proxy
command may handle it. It's likely to break, but it may also
do something dangerous and unwanted (technically it could
even do something useful, but that seems unlikely).
We should err on the side of caution and reject this before
we even run the command.
The hostname check matches the one we do in a similar
circumstance for ssh. The port check is not present for ssh,
but there it's not necessary because the syntax is "-p
<port>", and there's no ambiguity on the parsing side.
It's not clear whether you can actually get a negative port
to the proxy here or not. Doing:
git fetch git://remote:-1234/repo.git
keeps the "-1234" as part of the hostname, with the default
port of 9418. But it's a good idea to keep this check close
to the point of running the command to make it clear that
there's no way to circumvent it (and at worst it serves as a
belt-and-suspenders check).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We reject hostnames that start with a dash because they may
be confused for command-line options. Let's factor out that
notion into a helper function, as we'll use it in more
places. And while it's simple now, it's not clear if some
systems might need more complex logic to handle all cases.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Per the explanation in the previous patch, this should be
(and is) rejected.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When commands like "git fetch" talk with ssh://$rest_of_URL/, the
code splits $rest_of_URL into components like host, port, etc., and
then spawns the underlying "ssh" program by formulating argv[] array
that has:
- the path to ssh command taken from GIT_SSH_COMMAND, etc.
- dashed options like '-batch' (for Tortoise), '-p <port>' as
needed.
- ssh_host, which is supposed to be the hostname parsed out of
$rest_of_URL.
- then the command to be run on the other side, e.g. git
upload-pack.
If the ssh_host ends up getting '-<anything>', the argv[] that is
used to spawn the command becomes something like:
{ "ssh", "-p", "22", "-<anything>", "command", "to", "run", NULL }
which obviously is bogus, but depending on the actual value of
"<anything>", will make "ssh" parse and use it as an option.
Prevent this by forbidding ssh_host that begins with a "-".
Noticed-by: Joern Schneeweisz of Recurity Labs
Reported-by: Brian at GitLab
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The tests for protocol.allow actually set that variable in
the on-disk config, run a series of tests, and then never
clean up after themselves. This means that whatever tests we
run after have protocol.allow=never, which may influence
their results.
In most cases we either exit after running these tests, or
do another round of test_proto(). In the latter case, this happens to
work because:
1. Tests of the GIT_ALLOW_PROTOCOL environment variable
override the config.
2. Tests of the specific config "protocol.foo.allow"
override the protocol.allow config.
3. The next round of protocol.allow tests start off by
setting the config to a known value.
However, it's a land-mine waiting to trap somebody adding
new tests to one of the t581x test scripts. Let's make sure
we clean up after ourselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>