The only transport that does not allow fetch() to be called before
get_refs_list() is the bundle transport. Clean up the code by teaching
the bundle transport the ability to do this, and removing support for
transports that don't support this order of invocation.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit e70a3030e7 ("fetch: do not list refs if fetching only hashes",
2018-10-07) and its ancestors taught Git, as an optimization, to skip
the ls-refs step when it is not necessary during a protocol v2 fetch
(for example, when lazy fetching a missing object in a partial clone, or
when running "git fetch --no-tags <remote> <SHA-1>"). But that was only
done for natively supported protocols; in particular, HTTP was not
supported.
Teach Git to skip ls-refs when using remote helpers that support connect
or stateless-connect. To do this, fetch() is made an acceptable entry
point. Because fetch() can now be the first function in the vtable
called, "get_helper(transport);" has to be added to the beginning of
that function to set the transport up (if not yet set up) before
process_connect() is invoked.
When fetch() is called, the transport could be taken over (this happens
if "connect" or "stateless-connect" is successfully run without any
"fallback" response), or not. If the transport is taken over, execution
continues like execution for natively supported protocols
(fetch_refs_via_pack() is executed, which will fetch refs using ls-refs
if needed). If not, the remote helper interface will invoke
get_refs_list() if it hasn't been invoked yet, preserving existing
behavior.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git push --atomic" that goes over the transport-helper (namely,
the smart http transport) failed to prevent refs to be pushed when
it can locally tell that one of the ref update will fail without
having to consult the other end, which has been corrected.
* es/local-atomic-push-failure-with-http:
transport-helper: avoid var decl in for () loop control
transport-helper: enforce atomic in push_refs_with_push
The ownership rule for the file descriptor to fast-import remote
backend was mixed up, leading to unrelated file descriptor getting
closed, which has been fixed.
* mh/import-transport-fd-fix:
Use xmmap_gently instead of xmmap in use_pack
dup() the input fd for fast-import used for remote helpers
"git push --atomic" that goes over the transport-helper (namely,
the smart http transport) failed to prevent refs to be pushed when
it can locally tell that one of the ref update will fail without
having to consult the other end, which has been corrected.
* es/local-atomic-push-failure-with-http:
transport-helper: avoid var decl in for () loop control
transport-helper: enforce atomic in push_refs_with_push
Teach transport-helper how to notice if skipping a ref during push would
violate atomicity on the client side. We notice that a ref would be
rejected, and choose not to send it, but don't notice that if the client
has asked for --atomic we are violating atomicity if all the other
pushes we are sending would succeed. Asking the server end to uphold
atomicity wouldn't work here as the server doesn't have any idea that we
tried to update a ref that's broken.
The added test-case is a succinct way to reproduce this issue that fails
today. The same steps work fine when we aren't using a transport-helper
to get to the upstream, i.e. when we've added a local repository as a
remote:
git remote add ~/upstream upstream
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the filter_spec string a string_list rather than a raw C string.
The list of strings must be concatted together to make a complete
filter_spec. A future patch will use this capability to build "combine:"
filter specs gradually.
A strbuf would seem to be a more natural choice for this object, but it
unfortunately requires initialization besides just zero'ing out the
memory. This results in all container structs, and all containers of
those structs, etc., to also require initialization. Initializing them
all would be more cumbersome that simply using a string_list, which
behaves properly when its contents are zero'd.
For the purposes of code simplification, change behavior in how filter
specs are conveyed over the protocol: do not normalize the tree:<depth>
filter specs since there should be no server in existence that supports
tree:# but not tree:#k etc.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matthew DeVore <matvore@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The ownership rule for the file descriptor to fast-import remote
backend was mixed up, leading to unrelated file descriptor getting
closed, which has been fixed.
* mh/import-transport-fd-fix:
Use xmmap_gently instead of xmmap in use_pack
dup() the input fd for fast-import used for remote helpers
When a remote helper exposes the "import" capability, stdout of the
helper is sent to stdin of a new fast-import process. This is done by
setting the corresponding child_process's in field to the value of the
out field of the helper child_process.
The child_process API is defined to close the file descriptors it's
given when calling start_command. This means when start_command is
called for the fast-import process, its input fd (the output fd of the
helper), is closed.
But when the transport helper is later destroyed, in disconnect_helper,
its input and output are closed, which means close() is called with
an invalid fd (since it was already closed as per above). Or worse, with
a valid fd owned by something else (since fd numbers can be reused).
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add trace2 child classification for transport processes.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the protocol message specification to allow only the limited
use of scaled quantities. This is ensure potential compatibility
issues will not go out of hand.
* js/filter-options-should-use-plain-int:
filter-options: expand scaled numbers
tree:<depth>: skip some trees even when collecting omits
list-objects-filter: teach tree:# how to handle >0
Portability updates for the HPE NonStop platform.
* rb/hpe:
compat/regex/regcomp.c: define intptr_t and uintptr_t on NonStop
git-compat-util.h: add FLOSS headers for HPE NonStop
config.mak.uname: support for modern HPE NonStop config.
transport-helper: drop read/write errno checks
transport-helper: use xread instead of read
When communicating with a remote server or a subprocess, use
expanded numbers rather than numbers with scaling suffix in the
object filter spec (e.g. "limit:blob=1k" becomes
"limit:blob=1024").
Update the protocol docs to note that clients should always perform this
expansion, to allow for more compatibility between server
implementations.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we use xread() and xwrite() here, EINTR, EAGAIN, and
EWOULDBLOCK retries are already handled for us, and we will
never see these errno values ourselves. We can drop these
conditions entirely, making the code easier to follow.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fix was needed on HPE NonStop NSE and NSX where SSIZE_MAX is less than
BUFFERSIZE resulting in EINVAL. The call to read in transport-helper.c
was the only place outside of wrapper.c where it is used instead of xread.
Signed-off-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The built-in bundle transport and the transport helper interface do not
work when transport_fetch_refs() is called immediately after transport
creation. This will be needed in a subsequent patch, so fix this.
Evidence: fetch_refs_from_bundle() relies on data->header being
initialized in get_refs_from_bundle(), and fetch() in transport-helper.c
relies on either data->fetch or data->import being set by get_helper(),
but neither transport_helper_init() nor fetch() calls get_helper().
Up until the introduction of the partial clone feature, this has not
been a problem, because transport_fetch_refs() is always called after
transport_get_remote_refs(). With the introduction of the partial clone
feature, which involves calling transport_fetch_refs() (to fetch objects
by their OIDs) without transport_get_remote_refs(), this is still not a
problem, but only coincidentally - we do not support partially cloning a
bundle, and as for cloning using a transport-helper-using protocol, it
so happens that before transport_fetch_refs() is called, fetch_refs() in
fetch-object.c calls transport_set_option(), which means that the
aforementioned get_helper() is invoked through set_helper_option() in
transport-helper.c.
This could be fixed by fixing the transports themselves, but it doesn't
seem like a good idea to me to open up previously untested code paths;
also, there may be transport helpers in the wild that assume that "list"
is always called before "fetch". Instead, fix this by having
transport_fetch_refs() call transport_get_remote_refs() to ensure that
the latter is always called at least once, unless the transport
explicitly states that it supports fetching without listing refs.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch" sometimes failed to update the remote-tracking refs,
which has been corrected.
* jt/connectivity-check-after-unshallow:
fetch-pack: unify ref in and out param
Many more strings are prepared for l10n.
* nd/i18n: (23 commits)
transport-helper.c: mark more strings for translation
transport.c: mark more strings for translation
sha1-file.c: mark more strings for translation
sequencer.c: mark more strings for translation
replace-object.c: mark more strings for translation
refspec.c: mark more strings for translation
refs.c: mark more strings for translation
pkt-line.c: mark more strings for translation
object.c: mark more strings for translation
exec-cmd.c: mark more strings for translation
environment.c: mark more strings for translation
dir.c: mark more strings for translation
convert.c: mark more strings for translation
connect.c: mark more strings for translation
config.c: mark more strings for translation
commit-graph.c: mark more strings for translation
builtin/replace.c: mark more strings for translation
builtin/pack-objects.c: mark more strings for translation
builtin/grep.c: mark strings for translation
builtin/config.c: mark more strings for translation
...
"git fetch" learned a new option "--negotiation-tip" to limit the
set of commits it tells the other end as "have", to reduce wasted
bandwidth and cycles, which would be helpful when the receiving
repository has a lot of refs that have little to do with the
history at the remote it is fetching from.
* jt/fetch-nego-tip:
fetch-pack: support negotiation tip whitelist
When a user fetches:
- at least one up-to-date ref and at least one non-up-to-date ref,
- using HTTP with protocol v0 (or something else that uses the fetch
command of a remote helper)
some refs might not be updated after the fetch.
This bug was introduced in commit 989b8c4452 ("fetch-pack: put shallow
info in output parameter", 2018-06-28) which allowed transports to
report the refs that they have fetched in a new out-parameter
"fetched_refs". If they do so, transport_fetch_refs() makes this
information available to its caller.
Users of "fetched_refs" rely on the following 3 properties:
(1) it is the complete list of refs that was passed to
transport_fetch_refs(),
(2) it has shallow information (REF_STATUS_REJECT_SHALLOW set if
relevant), and
(3) it has updated OIDs if ref-in-want was used (introduced after
989b8c4452).
In an effort to satisfy (1), whenever transport_fetch_refs()
filters the refs sent to the transport, it re-adds the filtered refs to
whatever the transport supplies before returning it to the user.
However, the implementation in 989b8c4452 unconditionally re-adds the
filtered refs without checking if the transport refrained from reporting
anything in "fetched_refs" (which it is allowed to do), resulting in an
incomplete list, no longer satisfying (1).
An earlier effort to resolve this [1] solved the issue by readding the
filtered refs only if the transport did not refrain from reporting in
"fetched_refs", but after further discussion, it seems that the better
solution is to revert the API change that introduced "fetched_refs".
This API change was first suggested as part of a ref-in-want
implementation that allowed for ref patterns and, thus, there could be
drastic differences between the input refs and the refs actually fetched
[2]; we eventually decided to only allow exact ref names, but this API
change remained even though its necessity was decreased.
Therefore, revert this API change by reverting commit 989b8c4452, and
make receive_wanted_refs() update the OIDs in the sought array (like how
update_shallow() updates shallow information in the sought array)
instead. A test is also included to show that the user-visible bug
discussed at the beginning of this commit message no longer exists.
[1] https://public-inbox.org/git/20180801171806.GA122458@google.com/
[2] https://public-inbox.org/git/86a128c5fb710a41791e7183207c4d64889f9307.1485381677.git.jonathantanmy@google.com/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many messages will be marked for translation in the following
commits. This commit updates some of them to be more consistent and
reduce diff noise in those commits. Changes are
- keep the first letter of die(), error() and warning() in lowercase
- no full stop in die(), error() or warning() if it's single sentence
messages
- indentation
- some messages are turned to BUG(), or prefixed with "BUG:" and will
not be marked for i18n
- some messages are improved to give more information
- some messages are broken down by sentence to be i18n friendly
(on the same token, combine multiple warning() into one big string)
- the trailing \n is converted to printf_ln if possible, or deleted
if not redundant
- errno_errno() is used instead of explicit strerror()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During negotiation, fetch-pack eventually reports as "have" lines all
commits reachable from all refs. Allow the user to restrict the commits
sent in this way by providing a whitelist of tips; only the tips
themselves and their ancestors will be sent.
Both globs and single objects are supported.
This feature is only supported for protocols that support connect or
stateless-connect (such as HTTP with protocol v2).
This will speed up negotiation when the repository has multiple
relatively independent branches (for example, when a repository
interacts with multiple repositories, such as with linux-next [1] and
torvalds/linux [2]), and the user knows which local branch is likely to
have commits in common with the upstream branch they are fetching.
[1] https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next/
[2] https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expand the transport fetch method signature, by adding an output
parameter, to allow transports to return information about the refs they
have fetched. Then communicate shallow status information through this
mechanism instead of by modifying the input list of refs.
This does require clients to sometimes generate the ref map twice: once
from the list of refs provided by the remote (as is currently done) and
potentially once from the new list of refs that the fetch mechanism
provides.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert 'apply_refspecs()' to take a 'struct refspec' as a parameter instead
of a list of 'struct refspec_item'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the refspecs in transport-helper.c to be stored in a
'struct refspec'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for introducing an abstraction around a collection of
refspecs (much like how a 'struct pathspec' is a collection of 'struct
pathspec_item's) rename the existing 'struct refspec' to 'struct
refspec_item'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for performing a refactor on refspec related code, move
the refspec parsing logic into its own file.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The beginning of the next-gen transfer protocol.
* bw/protocol-v2: (35 commits)
remote-curl: don't request v2 when pushing
remote-curl: implement stateless-connect command
http: eliminate "# service" line when using protocol v2
http: don't always add Git-Protocol header
http: allow providing extra headers for http requests
remote-curl: store the protocol version the server responded with
remote-curl: create copy of the service name
pkt-line: add packet_buf_write_len function
transport-helper: introduce stateless-connect
transport-helper: refactor process_connect_service
transport-helper: remove name parameter
connect: don't request v2 when pushing
connect: refactor git_connect to only get the protocol version once
fetch-pack: support shallow requests
fetch-pack: perform a fetch using v2
upload-pack: introduce fetch server command
push: pass ref prefixes when pushing
fetch: pass ref prefixes when fetching
ls-remote: pass ref prefixes when requesting a remote's refs
transport: convert transport_get_remote_refs to take a list of ref prefixes
...
Introduce the transport-helper capability 'stateless-connect'. This
capability indicates that the transport-helper can be requested to run
the 'stateless-connect' command which should attempt to make a
stateless connection with a remote end. Once established, the
connection can be used by the git client to communicate with
the remote end natively in a stateless-rpc manner as supported by
protocol v2. This means that the client must send everything the server
needs in a single request as the client must not assume any
state-storing on the part of the server or transport.
If a stateless connection cannot be established then the remote-helper
will respond in the same manner as the 'connect' command indicating that
the client should fallback to using the dumb remote-helper commands.
A future patch will implement the 'stateless-connect' capability in our
http remote-helper (remote-curl) so that protocol v2 can be used using
the http transport.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A future patch will need to take advantage of the logic which runs and
processes the response of the connect command on a remote helper so
factor out this logic from 'process_connect_service()' and place it into
a helper function 'run_connect()'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 266f1fdfa (transport-helper: be quiet on read errors from
helpers, 2013-06-21) removed a call to 'die()' which printed the name of
the remote helper passed in to the 'recvline_fh()' function using the
'name' parameter. Once the call to 'die()' was removed the parameter
was no longer necessary but wasn't removed. Clean up 'recvline_fh()'
parameter list by removing the 'name' parameter.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the 'struct transport' virtual function 'get_refs_list()' to
optionally take an argv_array of ref prefixes. When communicating with
a server using protocol v2 these ref prefixes can be sent when
requesting a listing of their refs allowing the server to filter the
refs it sends based on the sent prefixes. This list will be ignored
when not using protocol v2.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The machinery to clone & fetch, which in turn involves packing and
unpacking objects, have been told how to omit certain objects using
the filtering mechanism introduced by the jh/object-filtering
topic, and also mark the resulting pack as a promisor pack to
tolerate missing objects, taking advantage of the mechanism
introduced by the jh/fsck-promisors topic.
* jh/partial-clone:
t5616: test bulk prefetch after partial fetch
fetch: inherit filter-spec from partial clone
t5616: end-to-end tests for partial clone
fetch-pack: restore save_commit_buffer after use
unpack-trees: batch fetching of missing blobs
clone: partial clone
partial-clone: define partial clone settings in config
fetch: support filters
fetch: refactor calculation of remote list
fetch-pack: test support excluding large blobs
fetch-pack: add --no-filter
fetch-pack, index-pack, transport: partial clone
upload-pack: add object filtering for partial clone
Move the definition of the transport-specific functions provided by
transports, whether declared in transport.c or transport-helper.c, into
an internal header. This means that transport-using code (as opposed to
transport-declaring code) can no longer access these functions (without
importing the internal header themselves), making it clear that they
should use the transport_*() functions instead, and also allowing the
interface between the transport mechanism and an individual transport to
independently evolve.
This is superficially a reversal of commit 824d5776c3 ("Refactor
struct transport_ops inlined into struct transport", 2007-09-19).
However, the scope of the involved variables was neither affected nor
discussed in that commit, and I think that the advantages in making
those functions more private outweigh the advantages described in that
commit's commit message. A minor additional point is that the code has
gotten more complicated since then, in that the function-pointer
variables are potentially mutated twice (once initially and once if
transport_take_over() is invoked), increasing the value of corralling
them into their own struct.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Transfer ownership of detached strbufs to string_lists of the
duplicating variety by calling string_list_append_nodup() instead of
string_list_append() to avoid duplicating and then leaking the buffer.
While at it make sure to release the string_list when done;
push_refs_with_export() already does that.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.
* jk/write-in-full-fix:
read_pack_header: handle signed/unsigned comparison in read result
config: flip return value of store_write_*()
notes-merge: use ssize_t for write_in_full() return value
pkt-line: check write_in_full() errors against "< 0"
convert less-trivial versions of "write_in_full() != len"
avoid "write_in_full(fd, buf, len) != len" pattern
get-tar-commit-id: check write_in_full() return against 0
config: avoid "write_in_full(fd, buf, len) < len" pattern
Convert resolve_ref_unsafe to take a pointer to struct object_id by
converting one remaining caller to use struct object_id, removing the
temporary NULL pointer check in expand_ref, converting the declaration
and definition, and applying the following semantic patch:
@@
expression E1, E2, E3, E4;
@@
- resolve_ref_unsafe(E1, E2, E3.hash, E4)
+ resolve_ref_unsafe(E1, E2, &E3, E4)
@@
expression E1, E2, E3, E4;
@@
- resolve_ref_unsafe(E1, E2, E3->hash, E4)
+ resolve_ref_unsafe(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All but two of the call sites already have parameters using the hash
parameter of struct object_id, so convert them to take a pointer to the
struct directly. Also convert refs_read_refs_full, the underlying
implementation.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert update_ref, refs_update_ref, and write_pseudoref to use struct
object_id. Update the existing callers as well. Remove update_ref_oid,
as it is no longer needed.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.
* jk/write-in-full-fix:
read_pack_header: handle signed/unsigned comparison in read result
config: flip return value of store_write_*()
notes-merge: use ssize_t for write_in_full() return value
pkt-line: check write_in_full() errors against "< 0"
convert less-trivial versions of "write_in_full() != len"
avoid "write_in_full(fd, buf, len) != len" pattern
get-tar-commit-id: check write_in_full() return against 0
config: avoid "write_in_full(fd, buf, len) < len" pattern
Many leaks of strbuf have been fixed.
* rs/strbuf-leakfix: (34 commits)
wt-status: release strbuf after use in wt_longstatus_print_tracking()
wt-status: release strbuf after use in read_rebase_todolist()
vcs-svn: release strbuf after use in end_revision()
utf8: release strbuf on error return in strbuf_utf8_replace()
userdiff: release strbuf after use in userdiff_get_textconv()
transport-helper: release strbuf after use in process_connect_service()
sequencer: release strbuf after use in save_head()
shortlog: release strbuf after use in insert_one_record()
sha1_file: release strbuf on error return in index_path()
send-pack: release strbuf on error return in send_pack()
remote: release strbuf after use in set_url()
remote: release strbuf after use in migrate_file()
remote: release strbuf after use in read_remote_branches()
refs: release strbuf on error return in write_pseudoref()
notes: release strbuf after use in notes_copy_from_stdin()
merge: release strbuf after use in write_merge_heads()
merge: release strbuf after use in save_state()
mailinfo: release strbuf on error return in handle_boundary()
mailinfo: release strbuf after use in handle_from()
help: release strbuf on error return in exec_woman_emacs()
...
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).
So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:
1. It can do a funny signed/unsigned comparison. If your
"len" is signed (e.g., a size_t) then the compiler will
promote the "-1" to its unsigned variant.
This works out for "!= len" (unless you really were
trying to write the maximum size_t bytes), but is a
bug if you check "< len" (an example of which was fixed
recently in config.c).
We should avoid promoting the mental model that you
need to check the length at all, so that new sites are
not tempted to copy us.
2. Checking for a negative value is shorter to type,
especially when the length is an expression.
3. Linus says so. In d34cf19b89 (Clean up write_in_full()
users, 2007-01-11), right after the write_in_full()
semantics were changed, he wrote:
I really wish every "write_in_full()" user would just
check against "<0" now, but this fixes the nasty and
stupid ones.
Appeals to authority aside, this makes it clear that
writing it this way does not have an intentional
benefit. It's a historical curiosity that we never
bothered to clean up (and which was undoubtedly
cargo-culted into new sites).
So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).
[1] A careful reader may notice there is one way that
write_in_full() can return a different value. If we ask
write() to write N bytes and get a return value that is
_larger_ than N, we could return a larger total. But
besides the fact that this would imply a totally broken
version of write(), it would already invoke undefined
behavior. Our internal remaining counter is an unsigned
size_t, which means that subtracting too many byte will
wrap it around to a very large number. So we'll instantly
begin reading off the end of the buffer, trying to write
gigabytes (or petabytes) of data.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a file .tsan-suppressions and list two functions in it: want_color()
and transfer_debug(). Both of these use the pattern
static int foo = -1;
if (foo < 0)
foo = bar();
where bar always returns the same non-negative value. This can cause
ThreadSanitizer to diagnose a race when foo is written from two threads.
That is indeed a race, although it arguably doesn't matter in practice
since it's always the same value that is written.
Add NEEDSWORK-comments to the functions so that this problem is not
forever swept way under the carpet.
The suppressions-file is used by setting the environment variable
TSAN_OPTIONS to, e.g., "suppressions=$(pwd)/.tsan-suppressions". Observe
that relative paths such as ".tsan-suppressions" might not work.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can use xsnprintf to do our truncation check with less
code. The error message isn't as specific, but the point is
that this isn't supposed to trigger in the first place
(because our buffer is big enough to handle any int).
Signed-off-by: Jeff King <peff@peff.net>
"git ls-remote" and "git archive --remote" are designed to work
without being in a directory under Git's control. However, recent
updates revealed that we randomly look into a directory called
.git/ without actually doing necessary set-up when working in a
repository. Stop doing so.
* jn/remote-helpers-with-git-dir:
remote helpers: avoid blind fall-back to ".git" when setting GIT_DIR
remote: avoid reading $GIT_DIR config in non-repo
To push from or fetch to the current repository, remote helpers need
to know what repository that is. Accordingly, Git sets the GIT_DIR
environment variable to the path to the current repository when
invoking remote helpers.
There is a special case it does not handle: "git ls-remote" and "git
archive --remote" can be run to inspect a remote repository without
being run from any local repository. GIT_DIR is not useful in this
scenario:
- if we are not in a repository, we don't need to set GIT_DIR to
override an existing GIT_DIR value from the environment. If GIT_DIR
is present then we would be in a repository if it were valid and
would have called die() if it weren't.
- not setting GIT_DIR may cause a helper to do the usual discovery
walk to find the repository. But we know we're not in one, or we
would have found it ourselves. So in the worst case it may expend
a little extra effort to try to find a repository and fail (for
example, remote-curl would do this to try to find repository-level
configuration).
So leave GIT_DIR unset in this case. This makes GIT_DIR easier to
understand for remote helper authors and makes transport code less of
a special case for repository discovery.
Noticed using b1ef400e (setup_git_env: avoid blind fall-back to
".git", 2016-10-20) from 'next':
$ cd /tmp
$ git ls-remote https://kernel.googlesource.com/pub/scm/git/git
fatal: BUG: setup_git_env called without repository
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using non-builtin protocols relying on a transport helper
(such as http), push options are not propagated to the helper.
The user could ask for push options and a push would seemingly succeed,
but the push options would never be transported to the server,
misleading the users expectation.
Fix this by propagating the push options to the transport helper.
This is only addressing the first issue of
(1) the helper protocol does not propagate push-option
(2) the http helper is not prepared to handle push-option
Once we fix (2), the http transport helper can make use of push options
as well, but that happens as a follow up. (1) is a bug fix, whereas (2)
is a feature, which is why we only do (1) here.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The existing "git fetch --depth=<n>" option was hard to use
correctly when making the history of an existing shallow clone
deeper. A new option, "--deepen=<n>", has been added to make this
easier to use. "git clone" also learned "--shallow-since=<date>"
and "--shallow-exclude=<tag>" options to make it easier to specify
"I am interested only in the recent N months worth of history" and
"Give me only the history since that version".
* nd/shallow-deepen: (27 commits)
fetch, upload-pack: --deepen=N extends shallow boundary by N commits
upload-pack: add get_reachable_list()
upload-pack: split check_unreachable() in two, prep for get_reachable_list()
t5500, t5539: tests for shallow depth excluding a ref
clone: define shallow clone boundary with --shallow-exclude
fetch: define shallow boundary with --shallow-exclude
upload-pack: support define shallow boundary by excluding revisions
refs: add expand_ref()
t5500, t5539: tests for shallow depth since a specific date
clone: define shallow clone boundary based on time with --shallow-since
fetch: define shallow boundary with --shallow-since
upload-pack: add deepen-since to cut shallow repos based on time
shallow.c: implement a generic shallow boundary finder based on rev-list
fetch-pack: use a separate flag for fetch in deepening mode
fetch-pack.c: mark strings for translating
fetch-pack: use a common function for verbose printing
fetch-pack: use skip_prefix() instead of starts_with()
upload-pack: move rev-list code out of check_non_tip()
upload-pack: make check_non_tip() clean things up on error
upload-pack: tighten number parsing at "deepen" lines
...
The N_() no-op call currently marks the string to be extracted by
xgettext but doesn't trigger the retrieval of the translation at run
time, whereas _() does both. Meaning that, in spite of having
translations available, they were never retrieved to make use of them.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.
This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.
Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..
The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.
(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For now we can handle two types, string and boolean, in
set_helper_option(). Later on we'll add string_list support, which does
not fit well. The new function strbuf_set_helper_option() can be reused
for a separate function that handles string-list.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code for warning_errno/die_errno has been refactored and a new
error_errno() reporting helper is introduced.
* nd/error-errno: (41 commits)
wrapper.c: use warning_errno()
vcs-svn: use error_errno()
upload-pack.c: use error_errno()
unpack-trees.c: use error_errno()
transport-helper.c: use error_errno()
sha1_file.c: use {error,die,warning}_errno()
server-info.c: use error_errno()
sequencer.c: use error_errno()
run-command.c: use error_errno()
rerere.c: use error_errno() and warning_errno()
reachable.c: use error_errno()
mailmap.c: use error_errno()
ident.c: use warning_errno()
http.c: use error_errno() and warning_errno()
grep.c: use error_errno()
gpg-interface.c: use error_errno()
fast-import.c: use error_errno()
entry.c: use error_errno()
editor.c: use error_errno()
diff-no-index.c: use error_errno()
...
Many instances of duplicate words (e.g. "the the path") and
a few typoes are fixed, originally in multiple patches.
wildmatch: fix duplicate words of "the"
t: fix duplicate words of "output"
transport-helper: fix duplicate words of "read"
Git.pm: fix duplicate words of "return"
path: fix duplicate words of "look"
pack-protocol.txt: fix duplicate words of "the"
precompose-utf8: fix typo of "sequences"
split-index: fix typo
worktree.c: fix typo
remote-ext: fix typo
utf8: fix duplicate words of "the"
git-cvsserver: fix duplicate words
Signed-off-by: Li Peng <lip@dtdream.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch" and friends that make network connections can now be
told to only use ipv4 (or ipv6).
* ew/force-ipv4:
connect & http: support -4 and -6 switches for remote operations
Sometimes it is necessary to force IPv4-only or IPv6-only operation
on networks where name lookups may return a non-routable address and
stall remote operations.
The ssh(1) command has an equivalent switches which we may pass when
we run them. There may be old ssh(1) implementations out there
which do not support these switches; they should report the
appropriate error in that case.
rsync support is untouched for now since it is deprecated and
scheduled to be removed.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our implementation of helpers never use CRLF line endings, and they
do not depend on the ability to place a CR as payload at the end of
the line, so this is essentially a no-op for in-tree users. However,
this allows third-party implementation of helpers to give us their
line with CRLF line ending (they cannot expect us to feed CRLF to
them, though).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now there is no direct caller to strbuf_getline(), we can demote it
to file-scope static that is private to strbuf.c and rename it to
strbuf_getdelim(). Rename strbuf_getline_crlf(), which is designed
to be the most "text friendly" variant, and allow it to take over
this simplest name, strbuf_getline(), so we can add more uses of it
without having to type _crlf over and over again in the coming
steps.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The strbuf_getline() interface allows a byte other than LF or NUL as
the line terminator, but this is only because I wrote these
codepaths anticipating that there might be a value other than NUL
and LF that could be useful when I introduced line_termination long
time ago. No useful caller that uses other value has emerged.
By now, it is clear that the interface is overly broad without a
good reason. Many codepaths have hardcoded preference to read
either LF terminated or NUL terminated records from their input, and
then call strbuf_getline() with LF or NUL as the third parameter.
This step introduces two thin wrappers around strbuf_getline(),
namely, strbuf_getline_lf() and strbuf_getline_nul(), and
mechanically rewrites these call sites to call either one of
them. The changes contained in this patch are:
* introduction of these two functions in strbuf.[ch]
* mechanical conversion of all callers to strbuf_getline() with
either '\n' or '\0' as the third parameter to instead call the
respective thin wrapper.
After this step, output from "git grep 'strbuf_getline('" would
become a lot smaller. An interim goal of this series is to make
this an empty set, so that we can have strbuf_getline_crlf() take
over the shorter name strbuf_getline().
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
If we are cloning an untrusted remote repository into a
sandbox, we may also want to fetch remote submodules in
order to get the complete view as intended by the other
side. However, that opens us up to attacks where a malicious
user gets us to clone something they would not otherwise
have access to (this is not necessarily a problem by itself,
but we may then act on the cloned contents in a way that
exposes them to the attacker).
Ideally such a setup would sandbox git entirely away from
high-value items, but this is not always practical or easy
to set up (e.g., OS network controls may block multiple
protocols, and we would want to enable some but not others).
We can help this case by providing a way to restrict
particular protocols. We use a whitelist in the environment.
This is more annoying to set up than a blacklist, but
defaults to safety if the set of protocols git supports
grows). If no whitelist is specified, we continue to default
to allowing all protocols (this is an "unsafe" default, but
since the minority of users will want this sandboxing
effect, it is the only sensible one).
A note on the tests: ideally these would all be in a single
test file, but the git-daemon and httpd test infrastructure
is an all-or-nothing proposition rather than a test-by-test
prerequisite. By putting them all together, we would be
unable to test the file-local code on machines without
apache.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The client side codepaths in "git push" have been cleaned up
and the user can request to perform an optional "signed push",
i.e. sign only when the other end accepts signed push.
* db/push-sign-if-asked:
push: add a config option push.gpgSign for default signed pushes
push: support signing pushes iff the server supports it
builtin/send-pack.c: use parse_options API
config.c: rename git_config_maybe_bool_text and export it as git_parse_maybe_bool
transport: remove git_transport_options.push_cert
gitremote-helpers.txt: document pushcert option
Documentation/git-send-pack.txt: document --signed
Documentation/git-send-pack.txt: wrap long synopsis line
Documentation/git-push.txt: document when --signed may fail
Add a new flag --sign=true (or --sign=false), which means the same
thing as the original --signed (or --no-signed). Give it a third
value --sign=if-asked to tell push and send-pack to send a push
certificate if and only if the server advertised a push cert nonce.
If not, warn the user that their push may not be as secure as they
thought.
Signed-off-by: Dave Borowitz <dborowitz@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check the return value of read_ref in 19 out of 21 cases.
This adds checks to the missing cases.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git push --signed" gave an incorrectly worded error message when
the other side did not support the capability.
* jc/push-cert:
transport-helper: fix typo in error message when --signed is not supported
"git fetch" over a remote-helper that cannot respond to "list"
command could not fetch from a symbolic reference e.g. HEAD.
* mh/deref-symref-over-helper-transport:
transport-helper: do not request symbolic refs to remote helpers
The transport-helper did not give transport options such as
verbosity, progress, cloning, etc. to import and export based
helpers, like it did for fetch and push based helpers, robbing them
the chance to honor the wish of the end-users better.
* mh/transport-capabilities:
transport-helper: ask the helper to set the same options for import as for fetch
transport-helper: ask the helper to set progress and verbosity options after asking for its capabilities
"git push --signed" gave an incorrectly worded error message when
the other side did not support the capability.
* jc/push-cert:
transport-helper: fix typo in error message when --signed is not supported
A few files include the same header file directly more than once.
As all these headers protect themselves against repeated inclusion
by the "#ifndef FOO_H / #define FOO_H / ... / #endif" idiom, leave
only the first inclusion and remove the later inclusion as a no-op
clean-up.
Signed-off-by: Дилян Палаузов <git-dpa@aegee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A remote helper is currently only told about the 'check-connectivity',
'cloning', and 'update-shallow' options when it supports the 'fetch'
command, but not when it supports 'import' instead.
This is especially important for the 'cloning' option, because it
means a remote helper that only supports 'import' can't distinguish
between a clone and a pull besides doing some assumptions from the
git directory state.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, a remote helper is only told about the progress and verbosity
options for the 'fetch' and 'push' commands. This means a remote helper
that implements 'import' and 'export' can never know the user requested
progress or verbosity (or lack thereof) through the command line.
Telling the remote helper about those options after asking for its
capabilities ensures it can act accordingly for all commands.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A typical remote helper will return a `list` of refs containing a symbolic
ref HEAD, pointing to, e.g. refs/heads/master. In the case of a clone, all
the refs are being requested through `fetch` or `import`, including the
symbolic ref.
While this works properly, in some cases of a fetch, like `git fetch url`
or `git fetch origin HEAD`, or any fetch command involving a symbolic ref
without also fetching the corresponding ref it points to, the fetch command
fails with:
fatal: bad object 0000000000000000000000000000000000000000
error: <remote> did not send all necessary objects
(in the case the remote helper returned '?' values to the `list` command).
This is because there is only one ref given to fetch(), and it's not
further resolved to something at the end of fetch_with_import().
While this can be somehow handled in the remote helper itself, by adding
a refspec for the symbolic ref, and storing an explicit ref in a private
namespace, and then handling the `import` for that symbolic ref
specifically, very few existing remote helpers are actually doing that.
So, instead of requesting the exact list of wanted refs to remote helpers,
treat symbolic refs differently and request the ref they point to instead.
Then, resolve the symbolic refs values based on the pointed ref.
This assumes there is no more than one level of indirection (a symbolic
ref doesn't point to another symbolic ref).
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call child_process_init() instead of zeroing the memory of variables of
type struct child_process by hand before use because the former is both
clearer and shorter.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add managed "env" array to child_process to clarify the lifetime
rules.
* rs/run-command-env-array:
use env_array member of struct child_process
run-command: add env_array, an optional argv_array for env
Convert users of struct child_process to using the managed env_array for
specifying environment variables instead of supplying an array on the
stack or bringing their own argv_array. This shortens and simplifies
the code and ensures automatically that the allocated memory is freed
after use.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref
resolves successfully for writing but not for reading). Change this to be
a flags field instead, and pass the new constant RESOLVE_REF_READING when
we want this behaviour.
While at it, swap two of the arguments in the function to put output
arguments at the end. As a nice side effect, this ensures that we can
catch callers that were unaware of the new API so they can be audited.
Give the wrapper functions resolve_refdup and read_ref_full the same
treatment for consistency.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow "git push" request to be signed, so that it can be verified and
audited, using the GPG signature of the person who pushed, that the
tips of branches at a public repository really point the commits
the pusher wanted to, without having to "trust" the server.
* jc/push-cert: (24 commits)
receive-pack::hmac_sha1(): copy the entire SHA-1 hash out
signed push: allow stale nonce in stateless mode
signed push: teach smart-HTTP to pass "git push --signed" around
signed push: fortify against replay attacks
signed push: add "pushee" header to push certificate
signed push: remove duplicated protocol info
send-pack: send feature request on push-cert packet
receive-pack: GPG-validate push certificates
push: the beginning of "git push --signed"
pack-protocol doc: typofix for PKT-LINE
gpg-interface: move parse_signature() to where it should be
gpg-interface: move parse_gpg_output() to where it should be
send-pack: clarify that cmds_sent is a boolean
send-pack: refactor inspecting and resetting status and sending commands
send-pack: rename "new_refs" to "need_pack_data"
receive-pack: factor out capability string generation
send-pack: factor out capability string generation
send-pack: always send capabilities
send-pack: refactor decision to send update per ref
send-pack: move REF_STATUS_REJECT_NODELETE logic a bit higher
...
The "--signed" option received by "git push" is first passed to the
transport layer, which the native transport directly uses to notice
that a push certificate needs to be sent. When the transport-helper
is involved, however, the option needs to be told to the helper with
set_helper_option(), and the helper needs to take necessary action.
For the smart-HTTP helper, the "necessary action" involves spawning
the "git send-pack" subprocess with the "--signed" option.
Once the above all gets wired in, the smart-HTTP transport now can
use the push certificate mechanism to authenticate its pushes.
Add a test that is modeled after tests for the native transport in
t5534-push-signed.sh to t5541-http-push-smart.sh. Update the test
Apache configuration to pass GNUPGHOME environment variable through.
As PassEnv would trigger warnings for an environment variable that
is not set, export it from test-lib.sh set to a harmless value when
GnuPG is not being used in the tests.
Note that the added test is deliberately loose and does not check
the nonce in this step. This is because the stateless RPC mode is
inevitably flaky and a nonce that comes back in the actual push
processing is one issued by a different process; if the two
interactions with the server crossed a second boundary, the nonces
will not match and such a check will fail. A later patch in the
series will work around this shortcoming.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a helper function for initializing those struct child_process
variables for which the macro CHILD_PROCESS_INIT can't be used.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We detect the "import-marks" capability by looking for that
string, but _without_ a trailing space. Then we skip past it
using strlen("import-marks "), with a space. So if a remote
helper gives us exactly "import-marks", we will read past
the end-of-string by one character.
This is unlikely to be a problem in practice, because such
input is malformed in the first place, and because there is
a good chance that the string has an extra NUL terminator
one character after the original (because it formerly had a
newline in it that we parsed off).
We can fix it by using skip_prefix with "import-marks ",
with the space. The other form appears to be a typo from
a515ebe (transport-helper: implement marks location as
capability, 2011-07-16); "import-marks" has never existed
without an argument, and it should match the "export-marks"
definition above.
Speaking of which, we can also use skip_prefix in a few
other places while we are in the function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow remote-helper/fast-import based transport to rename the refs
while transferring the history.
* fc/remote-helper-refmap:
transport-helper: remove unnecessary strbuf resets
transport-helper: add support to delete branches
fast-export: add support to delete refs
fast-import: add support to delete refs
transport-helper: add support to push symbolic refs
transport-helper: add support for old:new refspec
fast-export: add new --refspec option
fast-export: improve argument parsing