We sometimes sprintf into fixed-size buffers when we know
that the buffer is large enough to fit the input (either
because it's a constant, or because it's numeric input that
is bounded in size). Likewise with strcpy of constant
strings.
However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The connection initiation code for "ssh" transport tried to absorb
differences between the stock "ssh" and Putty-supplied "plink" and
its derivatives, but the logic to tell that we are using "plink"
variants were too loose and falsely triggered when "plink" appeared
anywhere in the path (e.g. "/home/me/bin/uplink/ssh").
* bc/connect-plink:
connect: improve check for plink to reduce false positives
t5601: fix quotation error leading to skipped tests
connect: simplify SSH connection code path
The git_connect function has code to handle plink and tortoiseplink
specially, as they require different command line arguments from
OpenSSH (-P instead of -p for ports; tortoiseplink additionally requires
-batch). However, the match was done by checking for "plink" anywhere
in the string, which led to a GIT_SSH value containing "uplink" being
treated as an invocation of putty's plink.
Improve the check by looking for "plink" or "tortoiseplink" (or those
names suffixed with ".exe") only in the final component of the path.
This has the downside that a program such as "plink-0.63" would no
longer be recognized, but the increased robustness is likely worth it.
Add tests to cover these cases to avoid regressions.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code path used in git_connect pushed the majority of the SSH
connection code into an else block, even though the if block returns.
Simplify the code by eliminating the else block, as it is unneeded.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An earlier update to the parser that disects an address broke an
address, followed by a colon, followed by an empty string (instead
of the port number).
* tb/connect-ipv6-parse-fix:
connect.c: ignore extra colon after hostname
Ignore an extra ':' at the end of the hostname in URL's like
"ssh://example.com:/path/to/repo"
The colon is meant to separate a port number from the hostname.
If the port is empty, the colon should be ignored, see RFC 3986.
It had been working for URLs with ssh:// scheme, but was unintentionally
broken in 86ceb3, "allow ssh://user@[2001:db8::1]/repo.git"
Reported-by: Reid Woodbury Jr. <reidw@rawsound.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We did not parse username followed by literal IPv6 address in SSH
transport URLs, e.g. ssh://user@[2001:db8::1]:22/repo.git
correctly.
* tb/connect-ipv6-parse-fix:
t5500: show user name and host in diag-url
t5601: add more test cases for IPV6
connect.c: allow ssh://user@[2001:db8::1]/repo.git
The "interpolated-path" option of "git daemon" inserted any string
client declared on the "host=" capability request without checking.
Sanitize and limit %H and %CH to a saner and a valid DNS name.
* jk/daemon-interpolate:
daemon: sanitize incoming virtual hostname
t5570: test git-daemon's --interpolated-path option
git_connect: let user override virtual-host we send to daemon
When git_connect() is called to see how the URL is parsed for
debugging purposes with CONNECT_DIAG_URL set, the variable conn is
leaked. At this point in the codeflow, it only has its memory and
no other resource is associated with it, so it is sufficient to
clean it up by just freeing it.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanups.
* rs/simple-cleanups:
sha1_name: use strlcpy() to copy strings
pretty: use starts_with() to check for a prefix
for-each-ref: use skip_prefix() to avoid duplicate string comparison
connect: use strcmp() for string comparison
We did not parse username followed by literal IPv6 address in SSH
transport URLs, e.g. ssh://user@[2001:db8::1]:22/repo.git
correctly.
* tb/connect-ipv6-parse-fix:
t5500: show user name and host in diag-url
t5601: add more test cases for IPV6
connect.c: allow ssh://user@[2001:db8::1]/repo.git
Code cleanups.
* rs/simple-cleanups:
sha1_name: use strlcpy() to copy strings
pretty: use starts_with() to check for a prefix
for-each-ref: use skip_prefix() to avoid duplicate string comparison
connect: use strcmp() for string comparison
The "interpolated-path" option of "git daemon" inserted any string
client declared on the "host=" capability request without checking.
Sanitize and limit %H and %CH to a saner and a valid DNS name.
* jk/daemon-interpolate:
daemon: sanitize incoming virtual hostname
t5570: test git-daemon's --interpolated-path option
git_connect: let user override virtual-host we send to daemon
The URL for ssh may have include a username before the hostname,
like ssh://user@host/repo.
When literal IPV6 addresses are used together with a username,
the substring "user@[::1]" must be converted into "user@::1".
Make that conversion visible for the user, and write userandhost
in the diagnostics
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ssh:// syntax was added in 2386d658 (Add first cut at "git
protocol" connect logic., 2005-07-13), it accepted
ssh://user@2001:db8::1/repo.git, which is now legacy.
Over the years the parser was improved to support [] and port numbers,
but the combination of ssh://user@[2001:db8::1]:222/repo.git did
never work.
The only only way to use a user name, a literall IPV6 address and a port
number was ssh://[user@2001:db8::1]:222/repo.git
(Thanks to Christian Taube <lists@hcf.yourweb.de> for reporting this long
standing issue)
New users would use ssh://user@[2001:db8::1]:222/repo.git,
so change the parser to handle it correctly.
Support the old legacy URLs as well, to be backwards compatible,
and avoid regressions for users which upgrade an existing installation
to a later Git version.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Get rid of magic string length constants and simply compare the strings
using strcmp(). This makes the intent of the code a bit clearer.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we connect to a git-daemon at a given host and port, we
actually send the string "localhost:9418" to the other side,
which allows it to do virtual-hosting lookups. For testing
and debugging, we'd like to be able to send arbitrary
strings, rather than the hostname we actually connected to.
Using "insteadOf" config does not work for this purpose, as
the hostname determination happens at a very low level,
right before we feed the hostname to our lookup routines.
You could use /etc/hosts or similar to get around this, but
we cannot do that portably from our test suite.
Instead, this patch provides an environment variable that
can be used to send an arbitrary string.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git remote update --prune" to drop many refs has been optimized.
* mh/simplify-repack-without-refs:
sort_string_list(): rename to string_list_sort()
prune_remote(): iterate using for_each_string_list_item()
prune_remote(): rename local variable
repack_without_refs(): make the refnames argument a string_list
prune_remote(): sort delete_refs_list references en masse
prune_remote(): initialize both delete_refs lists in a single loop
prune_remote(): exit early if there are no stale references
The new name is more consistent with the names of other
string_list-related functions.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It may be impractical to install a wrapper script for GIT_SSH
when additional parameters need to be passed. Provide an alternative
way of specifying a shell command to be run, including command line
arguments, by means of the GIT_SSH_COMMAND environment variable,
which behaves like GIT_SSH but is passed to the shell.
The special circuitry to modify parameters in the case of using
PuTTY's plink/tortoiseplink is activated only when using GIT_SSH;
in the case of using GIT_SSH_COMMAND, it is deliberately left up to
the user to make any required parameters adaptation before calling
the underlying ssh implementation.
Signed-off-by: Thomas Quinot <thomas@quinot.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* rs/more-uses-of-skip-prefix:
pack-write: simplify index_pack_lockfile using skip_prefix() and xstrfmt()
connect: simplify check_ref() using skip_prefix() and starts_with()
Both callers of check_ref() pass in NUL-terminated strings for name.
Remove the len parameter and then use skip_prefix() and starts_with()
instead of memcmp() to check if it starts with certain strings. This
gets rid of several magic string length constants and a strlen() call.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helper function for initializing those struct child_process
variables for which the macro CHILD_PROCESS_INIT can't be used.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most struct child_process variables are cleared using memset first after
declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead. That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use xmemdupz() to allocate the memory, copy the data and make sure to
NUL-terminate the result, all in one step. The resulting code is
shorter, doesn't contain the constants 1 and '\0', and avoids
duplicating function parameters.
For blame, the last copied byte (o->file.ptr[o->file.size]) is always
set to NUL by fake_working_tree_commit() or read_sha1_file(), so no
information is lost by the conversion to using xmemdupz().
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This avoids magic numbers when we allocate fixed-size argv
arrays, and makes it more obvious that we are not
overflowing.
It is also the first step to fixing a memory leak. When
git_connect returns a child_process struct, the argv array
in the struct is dynamically allocated, but the individual
strings are not (they are either owned elsewhere, or are
freed). Later, in finish_connect, we free the array but
leave the strings alone.
This works for the child_process created by git_connect, but
if we use transport_take_over, we may also end up with a
child_process created by transport-helper's get_helper.
In that case, the strings are freshly allocated, and we
would want to free them. However, we have no idea in
finish_connect which type we have.
By consistently using run-command's internal argv-array, we
do not have to worry about this issue at all; finish_command
takes care of it for us, and we can drop our manual free
entirely.
Note that this actually makes the get_helper leak slightly
worse; now we are leaking both the strings and the array.
But when we adjust it in a future patch, that leak will go
away entirely.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fetching from a shallow-cloned repository used to be forbidden,
primarily because the codepaths involved were not carefully vetted
and we did not bother supporting such usage. This attempts to allow
object transfer out of a shallow-cloned repository in a controlled
way (i.e. the receiver become a shallow repository with truncated
history).
* nd/shallow-clone: (31 commits)
t5537: fix incorrect expectation in test case 10
shallow: remove unused code
send-pack.c: mark a file-local function static
git-clone.txt: remove shallow clone limitations
prune: clean .git/shallow after pruning objects
clone: use git protocol for cloning shallow repo locally
send-pack: support pushing from a shallow clone via http
receive-pack: support pushing to a shallow clone via http
smart-http: support shallow fetch/clone
remote-curl: pass ref SHA-1 to fetch-pack as well
send-pack: support pushing to a shallow clone
receive-pack: allow pushes that update .git/shallow
connected.c: add new variant that runs with --shallow-file
add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses
receive/send-pack: support pushing from a shallow clone
receive-pack: reorder some code in unpack()
fetch: add --update-shallow to accept refs that update .git/shallow
upload-pack: make sure deepening preserves shallow roots
fetch: support fetching from a shallow repository
clone: support remote shallow repository
...
Be more careful when parsing remote repository URL given in the
scp-style host:path notation.
* tb/clone-ssh-with-colon-for-port:
git_connect(): use common return point
connect.c: refactor url parsing
git_connect(): refactor the port handling for ssh
git fetch: support host:/~repo
t5500: add test cases for diag-url
git fetch-pack: add --diag-url
git_connect: factor out discovery of the protocol and its parts
git_connect: remove artificial limit of a remote command
t5601: add tests for ssh
t5601: remove clear_ssh, refactor setup_ssh_wrapper
No callers pass a non-empty pointer as shallow_points at this
stage. As a result, all clients still refuse to talk to shallow
repository on the other end.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The latter can do everything the former can and is used in many more
places.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use only one return point from git_connect(), doing the
free();
return conn;
only at one place in the code.
There may be a little confusion what the variable "host" is for. At
some places it is only the host part, at other places it may include
the port number, so change host into hostandport here.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the function is_local() in transport.c public, rename it into
url_is_local_not_ssh() and use it in both transport.c and connect.c
Use a protocol "local" for URLs for the local file system.
One note about using file:// under Windows:
The (absolute) path on Unix like system typically starts with "/".
When the host is empty, it can be omitted, so that a shell scriptlet
url=file://$pwd
will give a URL like "file:///home/user/repo".
Windows does not have the same concept of a root directory located in "/".
When parsing the URL allow "file://C:/user/repo"
(even if RFC1738 indicates that "file:///C:/user/repo" should be used).
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use get_host_and_port() even for ssh.
Remove the variable port git_connect(), and simplify parse_connect_url()
Use only one return point in git_connect(), doing the free() and return conn.
t5601 had 2 corner test cases which now pass.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation (in urls.txt) says that
"ssh://host:/~repo",
"host:/~repo" or
"host:~repo"
specify the repository "repo" in the home directory at "host".
This has not been working for "host:/~repo".
Before commit 356bec "Support [address] in URLs", the comparison
"url != hostname" could be used to determine if the URL had a scheme
or not: "ssh://host/host" != "host".
However, after 356bec "[::1]" was converted into "::1", yielding
url != hostname as well. To fix this regression, don't use
"if (url != hostname)", but look at the separator instead.
Rename the variable "c" into "separator" to make it easier to read.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The main purpose is to trace the URL parser called by git_connect() in
connect.c
The main features of the parser can be listed as this:
- parse out host and path for URLs with a scheme (git:// file:// ssh://)
- parse host names embedded by [] correctly
- extract the port number, if present
- separate URLs like "file" (which are local)
from URLs like "host:repo" which should use ssh
Add the new parameter "--diag-url" to "git fetch-pack", which prints
the value for protocol, host and path to stderr and exits.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_connect has grown large due to the many different protocols syntaxes
that are supported. Move the part of the function that parses the URL to
connect to into a separate function for readability.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since day one, function git_connect() had a limit on the command line of
the command that is invoked to make a connection. 7a33bcbe converted the
code that constructs the command to strbuf. This would have been the
right time to remove the limit, but it did not happen. Remove it now.
git_connect() uses start_command() to invoke the command; consequently,
the limits of the system still apply, but are diagnosed only at execve()
time. But these limits are more lenient than the 1K that git_connect()
imposed.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
One long-standing flaw in the pack transfer protocol used by "git
clone" was that there was no way to tell the other end which branch
"HEAD" points at, and the receiving end needed to guess. A new
capability has been defined in the pack protocol to convey this
information so that cloning from a repository with more than one
branches pointing at the same commit where the HEAD is at now
reliably sets the initial branch in the resulting repository.
* jc/upload-pack-send-symref:
t5570: Update for clone-progress-to-stderr branch
t5570: Update for symref capability
clone: test the new HEAD detection logic
connect: annotate refs with their symref information in get_remote_head()
connect.c: make parse_feature_value() static
upload-pack: send non-HEAD symbolic refs
upload-pack: send symbolic ref information as capability
upload-pack.c: do not pass confusing cb_data to mark_our_ref()
t5505: fix "set-head --auto with ambiguous HEAD" test
commit 6000334 (clone: allow cloning local paths with colons in them -
2013-05-04) made it possible to specify a path that has colons in it
without file://, e.g. ../foo:bar/somewhere. But the check was a bit
sloppy.
Consider the url '[foo]:bar'. The '[]' unwrapping code will turn the
string to 'foo\0:bar'. In effect this new string is the same as
'foo/:bar' in the check "path < strchrnul(host, '/')", which mistakes
it for a local path (with '/' before the first ':') when it's actually
not.
So disable the check for '/' before ':' when the URL has been mangled
by '[]' unwrapping.
[jn: with tests from Jeff King]
Noticed-by: Morten Stenshorne <mstensho@opera.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
By doing this, clients of upload-pack can now reliably tell what ref
a symbolic ref points at; the updated test in t5505 used to expect
failure due to the ambiguity and made sure we give diagnostics, but
we no longer need to be so pessimistic. Make sure we correctly learn
which branch HEAD points at from the other side instead.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The definition of "struct ref" in "cache.h", a header file so
central to the system, always confused me. This structure is not
about the local ref used by sha1-name API to name local objects.
It is what refspecs are expanded into, after finding out what refs
the other side has, to define what refs are updated after object
transfer succeeds to what values. It belongs to "remote.h" together
with "struct refspec".
While we are at it, also move the types and functions related to the
Git transport connection to a new header file connect.h
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git clone foo/bar:baz" cannot be a request to clone from a remote
over git-over-ssh specified in the scp style. Detect this case and
clone from a local repository at "foo/bar:baz".
* nd/clone-local-with-colon:
clone: allow cloning local paths with colons in them
Usually "foo:bar" is interpreted as an ssh url. This patch allows to
clone from such paths by putting at least one slash before the colon
(i.e. /path/to/foo:bar or just ./foo:bar).
file://foo:bar should also work, but local optimizations are off in
that case, which may be unwanted. While at there, warn the users about
--local being ignored in this case.
Reported-by: William Giokas <1007380@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>