Commit Graph

371 Commits

Author SHA1 Message Date
Jonathan Tan
d1035cac09 upload-pack: clear flags before each v2 request
Suppose a server has the following commit graph:

 A   B
  \ /
   O

We create a client by cloning A from the server with depth 1, and add
many commits to it (so that future fetches span multiple requests due to
lengthy negotiation). If it then fetches B using protocol v2, the fetch
spanning multiple requests, the resulting packfile does not contain O
even though the client did report that A is shallow.

This is because upload_pack_v2() can be called multiple times while
processing the same session. During the 2nd and all subsequent
invocations, some object flags remain from the previous invocations. In
particular, CLIENT_SHALLOW remains, preventing process_shallow() from
adding client-reported shallows to the "shallows" array, and hence
pack-objects not knowing about these client-reported shallows.

Therefore, teach upload_pack_v2() to clear object flags at the start of
each invocation. This has some other results:

 - THEY_HAVE gates addition of objects to have_obj in process_haves().
   Previously in upload_pack_v2(), have_obj needed to be static because
   once an object is added to have_obj, it is never readded and thus we
   needed to retain the contents of have_obj between invocations. Now
   that flags are cleared, this is no longer necessary. This patch does
   not change the behavior of ok_to_give_up() (THEY_HAVE is still set on
   each "have") and got_oid() (used only in non-v2)); THEY_HAVE is not
   used in any other function.

 - WANTED gates addition of objects to want_obj in parse_want() and
   parse_want_ref(). It is also used in receive_needs(), but that is
   only used in non-v2. For the same reasons as THEY_HAVE, want_obj no
   longer needs to be static in upload_pack_v2().

 - CLIENT_SHALLOW is changed as discussed above.

Clearing of the other 5 flags does not affect functionality in v2. (Note
that in non-v2, upload_pack() is only called once per process, so each
invocation starts with blank flags anyway.)

 - OUR_REF is only used in non-v2.

 - COMMON_KNOWN is only used as a scratch flag in ok_to_give_up().

 - SHALLOW is passed to invocations in deepen() and
   deepen_by_rev_list(), but upload-pack doesn't use it.

 - NOT_SHALLOW is used by send_shallow() and send_unshallow(), but
   invocations of those functions are always preceded by code that sets
   NOT_SHALLOW on the appropriate objects.

 - HIDDEN_REF is only used in non-v2.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-19 12:04:53 +09:00
Jonathan Tan
1d1243fe63 upload-pack: make want_obj not global
Because upload_pack_v2() can be invoked multiple times in the same
process, the static variable want_obj may not be empty when it is
invoked. To make further analysis of this situation easier, make the
variable local; analysis will be done in a subsequent patch.

The new local variable in upload_pack_v2() is static to preserve
existing behavior; this is not necessary in upload_pack() because
upload_pack() is only invoked once per process.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-19 12:04:53 +09:00
Jonathan Tan
0b9333ff3e upload-pack: make have_obj not global
Because upload_pack_v2() can be invoked multiple times in the same
process, the static variable have_obj may not be empty when it is
invoked. To make further analysis of this situation easier, make the
variable local; analysis will be done in a subsequent patch.

The new local variable in upload_pack_v2() is static to preserve
existing behavior; this is not necessary in upload_pack() because
upload_pack() is only invoked once per process.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-19 12:04:53 +09:00
Junio C Hamano
6d8f8ebb74 Merge branch 'ds/commit-graph-with-grafts'
The recently introduced commit-graph auxiliary data is incompatible
with mechanisms such as replace & grafts that "breaks" immutable
nature of the object reference relationship.  Disable optimizations
based on its use (and updating existing commit-graph) when these
incompatible features are in use in the repository.

* ds/commit-graph-with-grafts:
  commit-graph: close_commit_graph before shallow walk
  commit-graph: not compatible with uninitialized repo
  commit-graph: not compatible with grafts
  commit-graph: not compatible with replace objects
  test-repository: properly init repo
  commit-graph: update design document
  refs.c: upgrade for_each_replace_ref to be a each_repo_ref_fn callback
  refs.c: migrate internal ref iteration to pass thru repository argument
2018-10-16 16:15:59 +09:00
brian m. carlson
f690b6b030 upload-pack: express constants in terms of the_hash_algo
Convert all uses of the GIT_SHA1_HEXSZ to use the_hash_algo so that they
are appropriate for any given hash length.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-15 12:53:15 +09:00
Junio C Hamano
1b7a91da71 Merge branch 'ds/reachable'
The code for computing history reachability has been shuffled,
obtained a bunch of new tests to cover them, and then being
improved.

* ds/reachable:
  commit-reach: correct accidental #include of C file
  commit-reach: use can_all_from_reach
  commit-reach: make can_all_from_reach... linear
  commit-reach: replace ref_newer logic
  test-reach: test commit_contains
  test-reach: test can_all_from_reach_with_flags
  test-reach: test reduce_heads
  test-reach: test get_merge_bases_many
  test-reach: test is_descendant_of
  test-reach: test in_merge_bases
  test-reach: create new test tool for ref_newer
  commit-reach: move can_all_from_reach_with_flags
  upload-pack: generalize commit date cutoff
  upload-pack: refactor ok_to_give_up()
  upload-pack: make reachable() more generic
  commit-reach: move commit_contains from ref-filter
  commit-reach: move ref_newer from remote.c
  commit.h: remove method declarations
  commit-reach: move walk methods from commit.c
2018-09-17 13:53:52 -07:00
Derrick Stolee
829a321569 commit-graph: close_commit_graph before shallow walk
Call close_commit_graph() when about to start a rev-list walk that
includes shallow commits. This is necessary in code paths that "fake"
shallow commits for the sake of fetch. Specifically, test 351 in
t5500-fetch-pack.sh runs

	git fetch --shallow-exclude one origin

with a file-based transfer. When the "remote" has a commit-graph, we do
not prevent the commit-graph from being loaded, but then the commits are
intended to be dynamically transferred into shallow commits during
get_shallow_commits_by_rev_list(). By closing the commit-graph before
this call, we prevent this interaction.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-21 10:22:51 -07:00
Junio C Hamano
3a2a1dc170 Merge branch 'sb/object-store-lookup'
lookup_commit_reference() and friends have been updated to find
in-core object for a specific in-core repository instance.

* sb/object-store-lookup: (32 commits)
  commit.c: allow lookup_commit_reference to handle arbitrary repositories
  commit.c: allow lookup_commit_reference_gently to handle arbitrary repositories
  tag.c: allow deref_tag to handle arbitrary repositories
  object.c: allow parse_object to handle arbitrary repositories
  object.c: allow parse_object_buffer to handle arbitrary repositories
  commit.c: allow get_cached_commit_buffer to handle arbitrary repositories
  commit.c: allow set_commit_buffer to handle arbitrary repositories
  commit.c: migrate the commit buffer to the parsed object store
  commit-slabs: remove realloc counter outside of slab struct
  commit.c: allow parse_commit_buffer to handle arbitrary repositories
  tag: allow parse_tag_buffer to handle arbitrary repositories
  tag: allow lookup_tag to handle arbitrary repositories
  commit: allow lookup_commit to handle arbitrary repositories
  tree: allow lookup_tree to handle arbitrary repositories
  blob: allow lookup_blob to handle arbitrary repositories
  object: allow lookup_object to handle arbitrary repositories
  object: allow object_as_type to handle arbitrary repositories
  tag: add repository argument to deref_tag
  tag: add repository argument to parse_tag_buffer
  tag: add repository argument to lookup_tag
  ...
2018-08-02 15:30:42 -07:00
Junio C Hamano
88df0fa659 Merge branch 'jt/connectivity-check-after-unshallow'
"git fetch" failed to correctly validate the set of objects it
received when making a shallow history deeper, which has been
corrected.

* jt/connectivity-check-after-unshallow:
  fetch-pack: write shallow, then check connectivity
  fetch-pack: implement ref-in-want
  fetch-pack: put shallow info in output parameter
  fetch: refactor to make function args narrower
  fetch: refactor fetch_refs into two functions
  fetch: refactor the population of peer ref OIDs
  upload-pack: test negotiation with changing repository
  upload-pack: implement ref-in-want
  test-pkt-line: add unpack-sideband subcommand
2018-07-24 14:50:44 -07:00
Derrick Stolee
4fbcca4eff commit-reach: make can_all_from_reach... linear
The can_all_from_reach_with_flags() algorithm is currently quadratic in
the worst case, because it calls the reachable() method for every 'from'
without tracking which commits have already been walked or which can
already reach a commit in 'to'.

Rewrite the algorithm to walk each commit a constant number of times.

We also add some optimizations that should work for the main consumer of
this method: fetch negotitation (haves/wants).

The first step includes using a depth-first-search (DFS) from each
'from' commit, sorted by ascending generation number. We do not walk
beyond the minimum generation number or the minimum commit date. This
DFS is likely to be faster than the existing reachable() method because
we expect previous ref values to be along the first-parent history.

If we find a target commit, then we mark everything in the DFS stack as
a RESULT. This expands the set of targets for the other 'from' commits.
We also mark the visited commits using 'assign_flag' to prevent re-
walking the same commits.

We still need to clear our flags at the end, which is why we will have a
total of three visits to each commit.

Performance was measured on the Linux repository using
'test-tool reach can_all_from_reach'. The input included rows seeded by
tag values. The "small" case included X-rows as v4.[0-9]* and Y-rows as
v3.[0-9]*. This mimics a (very large) fetch that says "I have all major
v3 releases and want all major v4 releases." The "large" case included
X-rows as "v4.*" and Y-rows as "v3.*". This adds all release-candidate
tags to the set, which does not greatly increase the number of objects
that are considered, but does increase the number of 'from' commits,
demonstrating the quadratic nature of the previous code.

Small Case:

Before: 1.52 s
 After: 0.26 s

Large Case:

Before: 3.50 s
 After: 0.27 s

Note how the time increases between the two cases in the two versions.
The new code increases relative to the number of commits that need to be
walked, but not directly relative to the number of 'from' commits.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 15:38:56 -07:00
Derrick Stolee
ba3ca1edce commit-reach: move can_all_from_reach_with_flags
There are several commit walks in the codebase. Group them together into
a new commit-reach.c file and corresponding header. After we group these
walks into one place, we can reduce duplicate logic by calling
equivalent methods.

The can_all_from_reach_with_flags method is used in a stateful way by
upload-pack.c. The parameters are very flexible, so we will be able to
use its commit walking logic for many other callers.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 15:38:55 -07:00
Derrick Stolee
118be5785e upload-pack: generalize commit date cutoff
The ok_to_give_up() method uses the commit date as a cutoff to avoid
walking the entire reachble set of commits. Before moving the
reachable() method to commit-reach.c, pull out the dependence on the
global constant 'oldest_have' with a 'min_commit_date' parameter.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 15:38:55 -07:00
Derrick Stolee
921bf7734f upload-pack: refactor ok_to_give_up()
In anticipation of consolidating all commit reachability algorithms,
refactor ok_to_give_up() in order to allow splitting its logic into
an external method.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 15:38:54 -07:00
Derrick Stolee
f044bb49ad upload-pack: make reachable() more generic
In anticipation of moving the reachable() method to commit-reach.c,
modify the prototype to be more generic to flags known outside of
upload-pack.c. Also rename 'want' to 'from' to make the statement
more clear outside of the context of haves/wants negotiation.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20 15:38:54 -07:00
Junio C Hamano
00624d608c Merge branch 'sb/object-store-grafts'
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.

* sb/object-store-grafts:
  commit: allow lookup_commit_graft to handle arbitrary repositories
  commit: allow prepare_commit_graft to handle arbitrary repositories
  shallow: migrate shallow information into the object parser
  path.c: migrate global git_path_* to take a repository argument
  cache: convert get_graft_file to handle arbitrary repositories
  commit: convert read_graft_file to handle arbitrary repositories
  commit: convert register_commit_graft to handle arbitrary repositories
  commit: convert commit_graft_pos() to handle arbitrary repositories
  shallow: add repository argument to is_repository_shallow
  shallow: add repository argument to check_shallow_file_for_update
  shallow: add repository argument to register_shallow
  shallow: add repository argument to set_alternate_shallow_file
  commit: add repository argument to lookup_commit_graft
  commit: add repository argument to prepare_commit_graft
  commit: add repository argument to read_graft_file
  commit: add repository argument to register_commit_graft
  commit: add repository argument to commit_graft_pos
  object: move grafts to object parser
  object-store: move object access functions to object-store.h
2018-07-18 12:20:28 -07:00
Stefan Beller
a74093da5e tag: add repository argument to deref_tag
Add a repository argument to allow the callers of deref_tag
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29 10:43:39 -07:00
Stefan Beller
5abddd1eb7 object: add repository argument to lookup_object
Add a repository argument to allow callers of lookup_object to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29 10:43:38 -07:00
Stefan Beller
109cd76dd3 object: add repository argument to parse_object
Add a repository argument to allow the callers of parse_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29 10:43:38 -07:00
Junio C Hamano
b16b60f71b Merge branch 'sb/object-store-grafts' into sb/object-store-lookup
* sb/object-store-grafts:
  commit: allow lookup_commit_graft to handle arbitrary repositories
  commit: allow prepare_commit_graft to handle arbitrary repositories
  shallow: migrate shallow information into the object parser
  path.c: migrate global git_path_* to take a repository argument
  cache: convert get_graft_file to handle arbitrary repositories
  commit: convert read_graft_file to handle arbitrary repositories
  commit: convert register_commit_graft to handle arbitrary repositories
  commit: convert commit_graft_pos() to handle arbitrary repositories
  shallow: add repository argument to is_repository_shallow
  shallow: add repository argument to check_shallow_file_for_update
  shallow: add repository argument to register_shallow
  shallow: add repository argument to set_alternate_shallow_file
  commit: add repository argument to lookup_commit_graft
  commit: add repository argument to prepare_commit_graft
  commit: add repository argument to read_graft_file
  commit: add repository argument to register_commit_graft
  commit: add repository argument to commit_graft_pos
  object: move grafts to object parser
  object-store: move object access functions to object-store.h
2018-06-29 10:43:28 -07:00
Brandon Williams
516e2b76bd upload-pack: implement ref-in-want
Currently, while performing packfile negotiation, clients are only
allowed to specify their desired objects using object ids.  This causes
a vulnerability to failure when an object turns non-existent during
negotiation, which may happen if, for example, the desired repository is
provided by multiple Git servers in a load-balancing arrangement and
there exists replication delay.

In order to eliminate this vulnerability, implement the ref-in-want
feature for the 'fetch' command in protocol version 2.  This feature
enables the 'fetch' command to support requests in the form of ref names
through a new "want-ref <ref>" parameter.  At the conclusion of
negotiation, the server will send a list of all of the wanted references
(as provided by "want-ref" lines) in addition to the generated packfile.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-28 09:33:29 -07:00
Junio C Hamano
54db5c0e1e Merge branch 'jt/partial-clone-proto-v2'
Transfer protocol v2 learned to support the partial clone.

* jt/partial-clone-proto-v2:
  {fetch,upload}-pack: support filter in protocol v2
  upload-pack: read config when serving protocol v2
  upload-pack: fix error message typo
2018-05-30 14:04:10 +09:00
Junio C Hamano
42c8ce1c49 Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (42 commits)
  merge-one-file: compute empty blob object ID
  add--interactive: compute the empty tree value
  Update shell scripts to compute empty tree object ID
  sha1_file: only expose empty object constants through git_hash_algo
  dir: use the_hash_algo for empty blob object ID
  sequencer: use the_hash_algo for empty tree object ID
  cache-tree: use is_empty_tree_oid
  sha1_file: convert cached object code to struct object_id
  builtin/reset: convert use of EMPTY_TREE_SHA1_BIN
  builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
  wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
  submodule: convert several uses of EMPTY_TREE_SHA1_HEX
  sequencer: convert one use of EMPTY_TREE_SHA1_HEX
  merge: convert empty tree constant to the_hash_algo
  builtin/merge: switch tree functions to use object_id
  builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo
  sha1-file: add functions for hex empty tree and blob OIDs
  builtin/receive-pack: avoid hard-coded constants for push certs
  diff: specify abbreviation size in terms of the_hash_algo
  upload-pack: replace use of several hard-coded constants
  ...
2018-05-30 14:04:10 +09:00
Stefan Beller
c88134870e shallow: add repository argument to is_repository_shallow
Add a repository argument to allow callers of is_repository_shallow
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Stefan Beller
19143f139d shallow: add repository argument to register_shallow
Add a repository argument to allow callers of register_shallow
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Stefan Beller
cbd53a2193 object-store: move object access functions to object-store.h
This should make these functions easier to find and cache.h less
overwhelming to read.

In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file

As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise.  It would be better to #include wherever
identifiers from the header are used.  That can happen later
when we have better tooling for it.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
Junio C Hamano
9bfa0f9be3 Merge branch 'bw/protocol-v2'
The beginning of the next-gen transfer protocol.

* bw/protocol-v2: (35 commits)
  remote-curl: don't request v2 when pushing
  remote-curl: implement stateless-connect command
  http: eliminate "# service" line when using protocol v2
  http: don't always add Git-Protocol header
  http: allow providing extra headers for http requests
  remote-curl: store the protocol version the server responded with
  remote-curl: create copy of the service name
  pkt-line: add packet_buf_write_len function
  transport-helper: introduce stateless-connect
  transport-helper: refactor process_connect_service
  transport-helper: remove name parameter
  connect: don't request v2 when pushing
  connect: refactor git_connect to only get the protocol version once
  fetch-pack: support shallow requests
  fetch-pack: perform a fetch using v2
  upload-pack: introduce fetch server command
  push: pass ref prefixes when pushing
  fetch: pass ref prefixes when fetching
  ls-remote: pass ref prefixes when requesting a remote's refs
  transport: convert transport_get_remote_refs to take a list of ref prefixes
  ...
2018-05-08 15:59:16 +09:00
Jonathan Tan
ba95710a3b {fetch,upload}-pack: support filter in protocol v2
The fetch-pack/upload-pack protocol v2 was developed independently of
the filter parameter (used in partial fetches), thus it did not include
support for it. Add support for the filter parameter.

Like in the legacy protocol, the server advertises and supports "filter"
only if uploadpack.allowfilter is configured.

Like in the legacy protocol, the client continues with a warning if
"--filter" is specified, but the server does not advertise it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06 13:17:19 +09:00
Jonathan Tan
5459268751 upload-pack: read config when serving protocol v2
The upload-pack code paths never call git_config() with
upload_pack_config() when protocol v2 is used, causing options like
uploadpack.packobjectshook to not take effect. Ensure that this function
is called.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06 13:17:19 +09:00
Jonathan Tan
7cc6ed2d06 upload-pack: fix error message typo
Fix a typo in an error message.

Also, this line was introduced in 3145ea957d ("upload-pack: introduce
fetch server command", 2018-03-15), which did not contain a test for the
case which causes this error to be printed, so introduce a test.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 18:54:32 +09:00
Junio C Hamano
ea44c0a594 Merge branch 'bw/protocol-v2' into jt/partial-clone-proto-v2
The beginning of the next-gen transfer protocol.

* bw/protocol-v2: (35 commits)
  remote-curl: don't request v2 when pushing
  remote-curl: implement stateless-connect command
  http: eliminate "# service" line when using protocol v2
  http: don't always add Git-Protocol header
  http: allow providing extra headers for http requests
  remote-curl: store the protocol version the server responded with
  remote-curl: create copy of the service name
  pkt-line: add packet_buf_write_len function
  transport-helper: introduce stateless-connect
  transport-helper: refactor process_connect_service
  transport-helper: remove name parameter
  connect: don't request v2 when pushing
  connect: refactor git_connect to only get the protocol version once
  fetch-pack: support shallow requests
  fetch-pack: perform a fetch using v2
  upload-pack: introduce fetch server command
  push: pass ref prefixes when pushing
  fetch: pass ref prefixes when fetching
  ls-remote: pass ref prefixes when requesting a remote's refs
  transport: convert transport_get_remote_refs to take a list of ref prefixes
  ...
2018-05-02 18:54:10 +09:00
brian m. carlson
55dc227d16 upload-pack: replace use of several hard-coded constants
Update several uses of hard-coded 40-based constants to use either
the_hash_algo or GIT_MAX_HEXSZ, as appropriate.  Replace a combined use
of oid_to_hex and memcpy with oid_to_hex_r, which not only avoids the
need for a constant, but is more efficient.  Make use of parse_oid_hex
to eliminate the need for constants and simplify the code at the same
time.  Update some comments to no longer refer to SHA-1 as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 13:59:51 +09:00
Stefan Beller
d807c4a01d exec_cmd: rename to use dash in file name
This is more consistent with the project style. The majority of Git's
source files use dashes in preference to underscores in their file names.

Signed-off-by: Stefan Beller <sbeller@google.com>
2018-04-11 18:11:00 +09:00
Junio C Hamano
c2a499e6c3 Merge branch 'jh/partial-clone'
Hotfix.

* jh/partial-clone:
  upload-pack: disable object filtering when disabled by config
  unpack-trees: release oid_array after use in check_updates()
2018-03-29 15:39:59 -07:00
Jonathan Nieder
c7620bd0f3 upload-pack: disable object filtering when disabled by config
When upload-pack gained partial clone support (v2.17.0-rc0~132^2~12,
2017-12-08), it was guarded by the uploadpack.allowFilter config item
to allow server operators to control when they start supporting it.

That config item didn't go far enough, though: it controls whether the
'filter' capability is advertised, but if a (custom) client ignores
the capability advertisement and passes a filter specification anyway,
the server would handle that despite allowFilter being false.

This is particularly significant if a security bug is discovered in
this new experimental partial clone code.  Installations without
uploadpack.allowFilter ought not to be affected since they don't
intend to support partial clone, but they would be swept up into being
vulnerable.

Simplify and limit the attack surface by making uploadpack.allowFilter
disable the feature, not just the advertisement of it.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-29 15:39:31 -07:00
Brandon Williams
685fbd3291 fetch-pack: perform a fetch using v2
When communicating with a v2 server, perform a fetch by requesting the
'fetch' command.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 12:01:08 -07:00
Brandon Williams
3145ea957d upload-pack: introduce fetch server command
Introduce the 'fetch' server command.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 12:01:08 -07:00
Brandon Williams
ae2948f30c upload-pack: factor out processing lines
Factor out the logic for processing shallow, deepen, deepen_since, and
deepen_not lines into their own functions to simplify the
'receive_needs()' function in addition to making it easier to reuse some
of this logic when implementing protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 14:15:06 -07:00
Brandon Williams
a3d6b53e92 upload-pack: convert to a builtin
In order to allow for code sharing with the server-side of fetch in
protocol-v2 convert upload-pack to be a builtin.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 14:15:06 -07:00
Junio C Hamano
6bed209a20 Merge branch 'jh/partial-clone'
The machinery to clone & fetch, which in turn involves packing and
unpacking objects, have been told how to omit certain objects using
the filtering mechanism introduced by the jh/object-filtering
topic, and also mark the resulting pack as a promisor pack to
tolerate missing objects, taking advantage of the mechanism
introduced by the jh/fsck-promisors topic.

* jh/partial-clone:
  t5616: test bulk prefetch after partial fetch
  fetch: inherit filter-spec from partial clone
  t5616: end-to-end tests for partial clone
  fetch-pack: restore save_commit_buffer after use
  unpack-trees: batch fetching of missing blobs
  clone: partial clone
  partial-clone: define partial clone settings in config
  fetch: support filters
  fetch: refactor calculation of remote list
  fetch-pack: test support excluding large blobs
  fetch-pack: add --no-filter
  fetch-pack, index-pack, transport: partial clone
  upload-pack: add object filtering for partial clone
2018-02-13 13:39:04 -08:00
Jonathan Tan
0b6069fe0a fetch-pack: test support excluding large blobs
Created tests to verify fetch-pack and upload-pack support
for excluding large blobs using --filter=blobs:limit=<n>
parameter.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 09:58:51 -08:00
Jeff Hostetler
10ac85c785 upload-pack: add object filtering for partial clone
Teach upload-pack to negotiate object filtering over the protocol and
to send filter parameters to pack-objects.  This is intended for partial
clone and fetch.

The idea to make upload-pack configurable using uploadpack.allowFilter
comes from Jonathan Tan's work in [1].

[1] https://public-inbox.org/git/f211093280b422c32cc1b7034130072f35c5ed51.1506714999.git.jonathantanmy@google.com/

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 09:58:51 -08:00
Junio C Hamano
4c6dad0059 Merge branch 'bw/protocol-v1'
A new mechanism to upgrade the wire protocol in place is proposed
and demonstrated that it works with the older versions of Git
without harming them.

* bw/protocol-v1:
  Documentation: document Extra Parameters
  ssh: introduce a 'simple' ssh variant
  i5700: add interop test for protocol transition
  http: tell server that the client understands v1
  connect: tell server that the client understands v1
  connect: teach client to recognize v1 server response
  upload-pack, receive-pack: introduce protocol version 1
  daemon: recognize hidden request arguments
  protocol: introduce protocol extension mechanisms
  pkt-line: add packet_write function
  connect: in ref advertisement, shallows are last
2017-12-06 09:23:44 -08:00
Brandon Williams
aa9bab29b8 upload-pack, receive-pack: introduce protocol version 1
Teach upload-pack and receive-pack to understand and respond using
protocol version 1, if requested.

Protocol version 1 is simply the original and current protocol (what I'm
calling version 0) with the addition of a single packet line, which
precedes the ref advertisement, indicating the protocol version being
spoken.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17 10:51:29 +09:00
brian m. carlson
b420d90980 refs: convert peel_ref to struct object_id
Convert peel_ref (and its corresponding backend) to struct object_id.

This transformation was done with an update to the declaration,
definition, comments, and test helper and the following semantic patch:

@@
expression E1, E2;
@@
- peel_ref(E1, E2.hash)
+ peel_ref(E1, &E2)

@@
expression E1, E2;
@@
- peel_ref(E1, E2->hash)
+ peel_ref(E1, E2)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
brian m. carlson
cca5fa6406 refs: convert dwim_ref and expand_ref to struct object_id
All of the callers of these functions just pass the hash member of a
struct object_id, so convert them to use a pointer to struct object_id
directly.  Insert a check for NULL in expand_ref on a temporary basis;
this check can be removed when resolve_ref_unsafe is converted as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
Junio C Hamano
69c54c7284 Merge branch 'ma/leakplugs'
Memory leaks in various codepaths have been plugged.

* ma/leakplugs:
  pack-bitmap[-write]: use `object_array_clear()`, don't leak
  object_array: add and use `object_array_pop()`
  object_array: use `object_array_clear()`, not `free()`
  leak_pending: use `object_array_clear()`, not `free()`
  commit: fix memory leak in `reduce_heads()`
  builtin/commit: fix memory leak in `prepare_index()`
2017-09-29 11:23:43 +09:00
René Scharfe
744c040b19 refs: pass NULL to resolve_ref_unsafe() if hash is not needed
This allows us to get rid of some write-only variables, among them seven
SHA1 buffers.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24 10:18:21 +09:00
Martin Ågren
dcb572ab94 object_array: use object_array_clear(), not free()
Instead of freeing `foo.objects` for an object array `foo` (sometimes
conditionally), call `object_array_clear(&foo)`. This means we don't
poke as much into the implementation, which is already a good thing, but
also that we release the individual entries as well, thereby fixing at
least one memory-leak (in diff-lib.c).

If someone is holding on to a pointer to an element's `name` or `path`,
that is now a dangling pointer, i.e., we'd be turning an unpleasant
situation into an outright bug. To the best of my understanding no such
long-term pointers are being taken.

The way we handle `study` in builting/reflog.c still looks like it might
leak. That will be addressed in the next commit.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24 10:06:01 +09:00
Junio C Hamano
f31d23a399 Merge branch 'bw/config-h'
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-24 14:28:41 -07:00
Brandon Williams
b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Junio C Hamano
6b526ced6f Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (53 commits)
  object: convert parse_object* to take struct object_id
  tree: convert parse_tree_indirect to struct object_id
  sequencer: convert do_recursive_merge to struct object_id
  diff-lib: convert do_diff_cache to struct object_id
  builtin/ls-tree: convert to struct object_id
  merge: convert checkout_fast_forward to struct object_id
  sequencer: convert fast_forward_to to struct object_id
  builtin/ls-files: convert overlay_tree_on_cache to object_id
  builtin/read-tree: convert to struct object_id
  sha1_name: convert internals of peel_onion to object_id
  upload-pack: convert remaining parse_object callers to object_id
  revision: convert remaining parse_object callers to object_id
  revision: rename add_pending_sha1 to add_pending_oid
  http-push: convert process_ls_object and descendants to object_id
  refs/files-backend: convert many internals to struct object_id
  refs: convert struct ref_update to use struct object_id
  ref-filter: convert some static functions to struct object_id
  Convert struct ref_array_item to struct object_id
  Convert the verify_pack callback to struct object_id
  Convert lookup_tag to struct object_id
  ...
2017-05-29 12:34:43 +09:00
brian m. carlson
c251c83df2 object: convert parse_object* to take struct object_id
Make parse_object, parse_object_or_die, and parse_object_buffer take a
pointer to struct object_id.  Remove the temporary variables inserted
earlier, since they are no longer necessary.  Transform all of the
callers using the following semantic patch:

@@
expression E1;
@@
- parse_object(E1.hash)
+ parse_object(&E1)

@@
expression E1;
@@
- parse_object(E1->hash)
+ parse_object(E1)

@@
expression E1, E2;
@@
- parse_object_or_die(E1.hash, E2)
+ parse_object_or_die(&E1, E2)

@@
expression E1, E2;
@@
- parse_object_or_die(E1->hash, E2)
+ parse_object_or_die(E1, E2)

@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1.hash, E2, E3, E4, E5)
+ parse_object_buffer(&E1, E2, E3, E4, E5)

@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1->hash, E2, E3, E4, E5)
+ parse_object_buffer(E1, E2, E3, E4, E5)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
brian m. carlson
cf93982fae upload-pack: convert remaining parse_object callers to object_id
Convert the remaining parse_object callers to struct object_id.  Use
named constants for several hard-coded values.  In addition, rename
got_sha1 to got_oid to reflect the new argument.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
brian m. carlson
e92b848cb6 shallow: convert shallow registration functions to object_id
Convert register_shallow and unregister_shallow to take struct
object_id.  register_shallow is a caller of lookup_commit, which we will
convert later.  It doesn't make sense for the registration and
unregistration functions to have incompatible interfaces, so convert
them both.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:57 +09:00
Johannes Schindelin
dddbad728c timestamp_t: a new data type for timestamps
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).

So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.

By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.

As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-27 13:07:39 +09:00
Johannes Schindelin
cb71f8bdb5 PRItime: introduce a new "printf format" for timestamps
Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.

There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
away from the ill-specified `unsigned long` data type.

So let's introduce the pseudo format "PRItime" (currently simply being
defined to "lu") to make it easier to change the data type used for
timestamps.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-23 20:19:15 -07:00
Johannes Schindelin
1aeb7e756c parse_timestamp(): specify explicitly where we parse timestamps
Currently, Git's source code represents all timestamps as `unsigned
long`. In preparation for using a more appropriate data type, let's
introduce a symbol `parse_timestamp` (currently being defined to
`strtoul`) where appropriate, so that we can later easily switch to,
say, use `strtoull()` instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-23 20:19:15 -07:00
Jonathan Tan
bdb31eada7 upload-pack: report "not our ref" to client
Make upload-pack report "not our ref" errors to the client as an "ERR" line.
(If not, the client would be left waiting for a response when the server is
already dead.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-23 12:14:40 -08:00
David Turner
f8edeaa05d upload-pack: optionally allow fetching any sha1
It seems a little silly to do a reachabilty check in the case where we
trust the user to access absolutely everything in the repository.

Also, it's racy in a distributed system -- perhaps one server
advertises a ref, but another has since had a force-push to that ref,
and perhaps the two HTTP requests end up directed to these different
servers.

Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-18 13:06:14 -08:00
Junio C Hamano
dbaa6bdce2 Merge branch 'ls/filter-process'
The smudge/clean filter API expect an external process is spawned
to filter the contents for each path that has a filter defined.  A
new type of "process" filter API has been added to allow the first
request to run the filter for a path to spawn a single process, and
all filtering need is served by this single process for multiple
paths, reducing the process creation overhead.

* ls/filter-process:
  contrib/long-running-filter: add long running filter example
  convert: add filter.<driver>.process option
  convert: prepare filter.<driver>.process option
  convert: make apply_filter() adhere to standard Git error handling
  pkt-line: add functions to read/write flush terminated packet streams
  pkt-line: add packet_write_gently()
  pkt-line: add packet_flush_gently()
  pkt-line: add packet_write_fmt_gently()
  pkt-line: extract set_packet_header()
  pkt-line: rename packet_write() to packet_write_fmt()
  run-command: add clean_on_exit_handler
  run-command: move check_pipe() from write_or_die to run_command
  convert: modernize tests
  convert: quote filter names in error messages
2016-10-31 13:15:21 -07:00
Lars Schneider
81c634e94f pkt-line: rename packet_write() to packet_write_fmt()
packet_write() should be called packet_write_fmt() because it is a
printf-like function that takes a format string as first parameter.

packet_write_fmt() should be used for text strings only. Arbitrary
binary data should use a new packet_write() function that is introduced
in a subsequent patch.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17 11:36:50 -07:00
Jeff King
5411b10cef upload-pack: use priority queue in reachable() check
Like a lot of old commit-traversal code, this keeps a
commit_list in commit-date order, and and inserts parents
into the list. This means each insertion is potentially
linear, and the whole thing is quadratic (though the exact
runtime depends on the relationship between the commit dates
and the parent topology).

These days we have a priority queue, which can do the same
thing with a much better worst-case time.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-11 14:27:56 -07:00
Junio C Hamano
a460ea4a3c Merge branch 'nd/shallow-deepen'
The existing "git fetch --depth=<n>" option was hard to use
correctly when making the history of an existing shallow clone
deeper.  A new option, "--deepen=<n>", has been added to make this
easier to use.  "git clone" also learned "--shallow-since=<date>"
and "--shallow-exclude=<tag>" options to make it easier to specify
"I am interested only in the recent N months worth of history" and
"Give me only the history since that version".

* nd/shallow-deepen: (27 commits)
  fetch, upload-pack: --deepen=N extends shallow boundary by N commits
  upload-pack: add get_reachable_list()
  upload-pack: split check_unreachable() in two, prep for get_reachable_list()
  t5500, t5539: tests for shallow depth excluding a ref
  clone: define shallow clone boundary with --shallow-exclude
  fetch: define shallow boundary with --shallow-exclude
  upload-pack: support define shallow boundary by excluding revisions
  refs: add expand_ref()
  t5500, t5539: tests for shallow depth since a specific date
  clone: define shallow clone boundary based on time with --shallow-since
  fetch: define shallow boundary with --shallow-since
  upload-pack: add deepen-since to cut shallow repos based on time
  shallow.c: implement a generic shallow boundary finder based on rev-list
  fetch-pack: use a separate flag for fetch in deepening mode
  fetch-pack.c: mark strings for translating
  fetch-pack: use a common function for verbose printing
  fetch-pack: use skip_prefix() instead of starts_with()
  upload-pack: move rev-list code out of check_non_tip()
  upload-pack: make check_non_tip() clean things up on error
  upload-pack: tighten number parsing at "deepen" lines
  ...
2016-10-10 14:03:50 -07:00
Ville Skyttä
2e3a16b279 Spelling fixes
<BAD>                     <CORRECTED>
    accidently                accidentally
    commited                  committed
    dependancy                dependency
    emtpy                     empty
    existance                 existence
    explicitely               explicitly
    git-upload-achive         git-upload-archive
    hierachy                  hierarchy
    indegee                   indegree
    intial                    initial
    mulitple                  multiple
    non-existant              non-existent
    precendence.              precedence.
    priviledged               privileged
    programatically           programmatically
    psuedo-binary             pseudo-binary
    soemwhere                 somewhere
    successfull               successful
    transfering               transferring
    uncommited                uncommitted
    unkown                    unknown
    usefull                   useful
    writting                  writing

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11 14:35:42 -07:00
Junio C Hamano
d4c6375fd8 Merge branch 'jk/common-main'
There are certain house-keeping tasks that need to be performed at
the very beginning of any Git program, and programs that are not
built-in commands had to do them exactly the same way as "git"
potty does.  It was easy to make mistakes in one-off standalone
programs (like test helpers).  A common "main()" function that
calls cmd_main() of individual program has been introduced to
make it harder to make mistakes.

* jk/common-main:
  mingw: declare main()'s argv as const
  common-main: call git_setup_gettext()
  common-main: call restore_sigpipe_to_default()
  common-main: call sanitize_stdfds()
  common-main: call git_extract_argv0_path()
  add an extra level of indirection to main()
2016-07-19 13:22:19 -07:00
Junio C Hamano
1e4bf90789 Merge branch 'jk/upload-pack-hook'
"upload-pack" allows a custom "git pack-objects" replacement when
responding to "fetch/clone" via the uploadpack.packObjectsHook.

* jk/upload-pack-hook:
  upload-pack: provide a hook for running pack-objects
  t1308: do not get fooled by symbolic links to the source tree
  config: add a notion of "scope"
  config: return configset value for current_config_ functions
  config: set up config_source for command-line config
  git_config_parse_parameter: refactor cleanup code
  git_config_with_options: drop "found" counting
2016-07-06 13:38:11 -07:00
Junio C Hamano
35d213c87c Merge branch 'lf/sideband-returns-void'
A small internal API cleanup.

* lf/sideband-returns-void:
  upload-pack.c: make send_client_data() return void
  sideband.c: make send_sideband() return void
2016-07-06 13:38:09 -07:00
Junio C Hamano
de61cebde7 Merge branch 'jk/common-main-2.8' into jk/common-main
* jk/common-main-2.8:
  mingw: declare main()'s argv as const
  common-main: call git_setup_gettext()
  common-main: call restore_sigpipe_to_default()
  common-main: call sanitize_stdfds()
  common-main: call git_extract_argv0_path()
  add an extra level of indirection to main()
2016-07-06 10:02:57 -07:00
Jeff King
5ce5f5fa5a common-main: call git_setup_gettext()
This should be part of every program, as otherwise users do
not get translated error messages. However, some external
commands forgot to do so (e.g., git-credential-store). This
fixes them, and eliminates the repeated code in programs
that did remember to use it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 15:09:10 -07:00
Jeff King
650c449250 common-main: call git_extract_argv0_path()
Every program which links against libgit.a must call this
function, or risk hitting an assert() in system_path() that
checks whether we have configured argv0_path (though only
when RUNTIME_PREFIX is defined, so essentially only on
Windows).

Looking at the diff, you can see that putting it into the
common main() saves us having to do it individually in each
of the external commands. But what you can't see are the
cases where we _should_ have been doing so, but weren't
(e.g., git-credential-store, and all of the t/helper test
programs).

This has been an accident-waiting-to-happen for a long time,
but wasn't triggered until recently because it involves one
of those programs actually calling system_path(). That
happened with git-credential-store in v2.8.0 with ae5f677
(lazily load core.sharedrepository, 2016-03-11). The
program:

  - takes a lock file, which...

  - opens a tempfile, which...

  - calls adjust_shared_perm to fix permissions, which...

  - lazy-loads the config (as of ae5f677), which...

  - calls system_path() to find the location of
    /etc/gitconfig

On systems with RUNTIME_PREFIX, this means credential-store
reliably hits that assert() and cannot be used.

We never noticed in the test suite, because we set
GIT_CONFIG_NOSYSTEM there, which skips the system_path()
lookup entirely.  But if we were to tweak git_config() to
find /etc/gitconfig even when we aren't going to open it,
then the test suite shows multiple failures (for
credential-store, and for some other test helpers). I didn't
include that tweak here because it's way too specific to
this particular call to be worth carrying around what is
essentially dead code.

The implementation is fairly straightforward, with one
exception: there is exactly one caller (git.c) that actually
cares about the result of the function, and not the
side-effect of setting up argv0_path. We can accommodate
that by simply replacing the value of argv[0] in the array
we hand down to cmd_main().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 15:09:10 -07:00
Jeff King
3f2e2297b9 add an extra level of indirection to main()
There are certain startup tasks that we expect every git
process to do. In some cases this is just to improve the
quality of the program (e.g., setting up gettext()). In
others it is a requirement for using certain functions in
libgit.a (e.g., system_path() expects that you have called
git_extract_argv0_path()).

Most commands are builtins and are covered by the git.c
version of main(). However, there are still a few external
commands that use their own main(). Each of these has to
remember to include the correct startup sequence, and we are
not always consistent.

Rather than just fix the inconsistencies, let's make this
harder to get wrong by providing a common main() that can
run this standard startup.

We basically have two options to do this:

 - the compat/mingw.h file already does something like this by
   adding a #define that replaces the definition of main with a
   wrapper that calls mingw_startup().

   The upside is that the code in each program doesn't need
   to be changed at all; it's rewritten on the fly by the
   preprocessor.

   The downside is that it may make debugging of the startup
   sequence a bit more confusing, as the preprocessor is
   quietly inserting new code.

 - the builtin functions are all of the form cmd_foo(),
   and git.c's main() calls them.

   This is much more explicit, which may make things more
   obvious to somebody reading the code. It's also more
   flexible (because of course we have to figure out _which_
   cmd_foo() to call).

   The downside is that each of the builtins must define
   cmd_foo(), instead of just main().

This patch chooses the latter option, preferring the more
explicit approach, even though it is more invasive. We
introduce a new file common-main.c, with the "real" main. It
expects to call cmd_main() from whatever other objects it is
linked against.

We link common-main.o against anything that links against
libgit.a, since we know that such programs will need to do
this setup. Note that common-main.o can't actually go inside
libgit.a, as the linker would not pick up its main()
function automatically (it has no callers).

The rest of the patch is just adjusting all of the various
external programs (mostly in t/helper) to use cmd_main().
I've provided a global declaration for cmd_main(), which
means that all of the programs also need to match its
signature. In particular, many functions need to switch to
"const char **" instead of "char **" for argv. This effect
ripples out to a few other variables and functions, as well.

This makes the patch even more invasive, but the end result
is much better. We should be treating argv strings as const
anyway, and now all programs conform to the same signature
(which also matches the way builtins are defined).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 15:09:10 -07:00
Lukas Fleischer
fcf0fe9e69 upload-pack.c: make send_client_data() return void
The send_client_data() function uses write_or_die() for writing data
which immediately terminates the process on errors. If no such error
occurred, send_client_data() always returned the value that was passed
as third parameter prior to this commit. This value is already known to
the caller in any case, so let's turn send_client_data() into a void
function instead.

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-16 11:40:31 -07:00
Lukas Fleischer
4c4b7d1d3b sideband.c: make send_sideband() return void
The send_sideband() function uses write_or_die() for writing data which
immediately terminates the process on errors. If no such error occurred,
send_sideband() always returned the value that was passed as fourth
parameter prior to this commit. This value is already known to the
caller in any case, so let's turn send_sideband() into a void function
instead.

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-16 11:40:19 -07:00
Nguyễn Thái Ngọc Duy
cccf74e2da fetch, upload-pack: --deepen=N extends shallow boundary by N commits
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.

This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.

Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..

The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.

(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).

Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
079aa97e24 upload-pack: add get_reachable_list()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
2997178ee6 upload-pack: split check_unreachable() in two, prep for get_reachable_list()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
269a7a8316 upload-pack: support define shallow boundary by excluding revisions
This should allow the user to say "create a shallow clone of this branch
after version <some-tag>".

Short refs are accepted and expanded at the server side with expand_ref()
because we cannot expand (unknown) refs from the client side.

Like deepen-since, deepen-not cannot be used with deepen. But deepen-not
can be mixed with deepen-since. The result is exactly how you do the
command "git rev-list --since=... --not ref".

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
569e554be9 upload-pack: add deepen-since to cut shallow repos based on time
This should allow the user to say "create a shallow clone containing the
work from last year" (once the client side is fixed up, of course).

In theory deepen-since and deepen (aka --depth) can be used together to
draw the shallow boundary (whether it's intersection or union is up to
discussion, but if rev-list is used, it's likely intersection). However,
because deepen goes with a custom commit walker, we can't mix the two
yet.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
3f0f6624f5 upload-pack: move rev-list code out of check_non_tip()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
7fcbd37f9c upload-pack: make check_non_tip() clean things up on error
On error check_non_tip() will die and not closing file descriptors is no
big deal. The next patch will split the majority of this function out
for reuse in other cases, where die() may not be the only outcome. Same
story for popping SIGPIPE out of the signal chain. So let's make sure we
clean things up properly first.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
6e414e30fd upload-pack: tighten number parsing at "deepen" lines
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
8bf3b75841 upload-pack: use skip_prefix() instead of starts_with()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
873700c92e upload-pack: move "unshallow" sending code out of deepen()
Also add some more comments in this code because it takes too long to
understand what it does (to me, who should be familiar enough to
understand this code well!)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
ef635b9056 upload-pack: remove unused variable "backup"
After the last patch, "result" and "backup" are the same. "result" used
to move, but the movement is now contained in send_shallow(). Delete
this redundant variable.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
5c24cdea1e upload-pack: move "shallow" sending code out of deepen()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy
e8e44de787 upload-pack: move shallow deepen code out of receive_needs()
This is a prep step for further refactoring. Besides reindentation and
s/shallows\./shallows->/g, no other changes are expected.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-13 14:38:16 -07:00
Jeff King
20b20a22f8 upload-pack: provide a hook for running pack-objects
When upload-pack serves a client request, it turns to
pack-objects to do the heavy lifting of creating a
packfile. There's no easy way to intercept the call to
pack-objects, but there are a few good reasons to want to do
so:

  1. If you're debugging a client or server issue with
     fetching, you may want to store a copy of the generated
     packfile.

  2. If you're gathering data from real-world fetches for
     performance analysis or debugging, storing a copy of
     the arguments and stdin lets you replay the pack
     generation at your leisure.

  3. You may want to insert a caching layer around
     pack-objects; it is the most CPU- and memory-intensive
     part of serving a fetch, and its output is a pure
     function[1] of its input, making it an ideal place to
     consolidate identical requests.

This patch adds a simple "hook" interface to intercept calls
to pack-objects. The new test demonstrates how it can be
used for debugging (using it for caching is a
straightforward extension; the tricky part is writing the
actual caching layer).

This hook is unlike the normal hook scripts found in the
"hooks/" directory of a repository. Because we promise that
upload-pack is safe to run in an untrusted repository, we
cannot execute arbitrary code or commands found in the
repository (neither in hooks/, nor in the config). So
instead, this hook is triggered from a config variable that
is explicitly ignored in the per-repo config.

The config variable holds the actual shell command to run as
the hook.  Another approach would be to simply treat it as a
boolean: "should I respect the upload-pack hooks in this
repo?", and then run the script from "hooks/" as we usually
do. However, that isn't as flexible; there's no way to run a
hook approved by the site administrator (e.g., in
"/etc/gitconfig") on a repository whose contents are not
trusted. The approach taken by this patch is more
fine-grained, if a little less conventional for git hooks
(it does behave similar to other configured commands like
diff.external, etc).

[1] Pack-objects isn't _actually_ a pure function. Its
    output depends on the exact packing of the object
    database, and if multi-threading is used for delta
    compression, can even differ racily. But for the
    purposes of caching, that's OK; of the many possible
    outputs for a given input, it is sufficient only that we
    output one of them.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-02 15:22:24 -07:00
Antoine Queru
9812f2136b upload-pack.c: use parse-options API
Use the parse-options API rather than a hand-rolled option parser.

Description for --stateless-rpc and --advertise-refs come from
42526b4 (Add stateless RPC options to upload-pack,
receive-pack, 2009-10-30).

Signed-off-by: Antoine Queru <antoine.queru@grenoble-inp.org>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-31 10:17:20 -07:00
Junio C Hamano
40cfc95856 Merge branch 'nd/error-errno'
The code for warning_errno/die_errno has been refactored and a new
error_errno() reporting helper is introduced.

* nd/error-errno: (41 commits)
  wrapper.c: use warning_errno()
  vcs-svn: use error_errno()
  upload-pack.c: use error_errno()
  unpack-trees.c: use error_errno()
  transport-helper.c: use error_errno()
  sha1_file.c: use {error,die,warning}_errno()
  server-info.c: use error_errno()
  sequencer.c: use error_errno()
  run-command.c: use error_errno()
  rerere.c: use error_errno() and warning_errno()
  reachable.c: use error_errno()
  mailmap.c: use error_errno()
  ident.c: use warning_errno()
  http.c: use error_errno() and warning_errno()
  grep.c: use error_errno()
  gpg-interface.c: use error_errno()
  fast-import.c: use error_errno()
  entry.c: use error_errno()
  editor.c: use error_errno()
  diff-no-index.c: use error_errno()
  ...
2016-05-17 14:38:28 -07:00
Nguyễn Thái Ngọc Duy
d2b6afa2cb upload-pack.c: use error_errno()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-09 12:29:08 -07:00
Michael Procter
65a3629ea3 upload-pack: use argv_array for pack_objects
Use the argv_array in the child_process structure, to avoid having to
manually maintain an array size.

Signed-off-by: Michael Procter <michael@procter.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-25 14:20:25 -08:00
brian m. carlson
ed1c9977cb Remove get_object_hash.
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object.  This provides no
functional change, as it is essentially a macro substitution.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
f2fd0760f6 Convert struct object to object_id
struct object is one of the major data structures dealing with object
IDs.  Convert it to use struct object_id instead of an unsigned char
array.  Convert get_object_hash to refer to the new member as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
7999b2cf77 Add several uses of get_object_hash.
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash.  Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
Lukas Fleischer
78a766ab6e hideRefs: add support for matching full refs
In addition to matching stripped refs, one can now add hideRefs
patterns that the full (unstripped) ref is matched against. To
distinguish between stripped and full matches, those new patterns
must be prefixed with a circumflex (^).

This commit also removes support for the undocumented and unintended
hideRefs settings ".have" (suppressing all "have" lines) and
"capabilities^{}" (suppressing the capabilities line).

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-05 11:25:02 -08:00
Lukas Fleischer
00b293e519 upload-pack: strip refs before calling ref_is_hidden()
Make hideRefs handling in upload-pack consistent with the behavior
described in the documentation by stripping refs before comparing them
with prefixes in hideRefs.

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-05 11:25:02 -08:00
René Scharfe
e510ab8988 use pop_commit() for consuming the first entry of a struct commit_list
Instead of open-coding the function pop_commit() just call it.  This
makes the intent clearer and reduces code size.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-26 14:06:46 -07:00
Junio C Hamano
5455ee0573 Merge branch 'bc/object-id'
for_each_ref() callback functions were taught to name the objects
not with "unsigned char sha1[20]" but with "struct object_id".

* bc/object-id: (56 commits)
  struct ref_lock: convert old_sha1 member to object_id
  warn_if_dangling_symref(): convert local variable "junk" to object_id
  each_ref_fn_adapter(): remove adapter
  rev_list_insert_ref(): remove unneeded arguments
  rev_list_insert_ref_oid(): new function, taking an object_oid
  mark_complete(): remove unneeded arguments
  mark_complete_oid(): new function, taking an object_oid
  clear_marks(): rewrite to take an object_id argument
  mark_complete(): rewrite to take an object_id argument
  send_ref(): convert local variable "peeled" to object_id
  upload-pack: rewrite functions to take object_id arguments
  find_symref(): convert local variable "unused" to object_id
  find_symref(): rewrite to take an object_id argument
  write_one_ref(): rewrite to take an object_id argument
  write_refs_to_temp_dir(): convert local variable sha1 to object_id
  submodule: rewrite to take an object_id argument
  shallow: rewrite functions to take object_id arguments
  handle_one_ref(): rewrite to take an object_id argument
  add_info_ref(): rewrite to take an object_id argument
  handle_one_reflog(): rewrite to take an object_id argument
  ...
2015-06-05 12:17:37 -07:00
Junio C Hamano
a9d3493380 Merge branch 'fm/fetch-raw-sha1'
"git upload-pack" that serves "git fetch" can be told to serve
commits that are not at the tip of any ref, as long as they are
reachable from a ref, with uploadpack.allowReachableSHA1InWant
configuration variable.

* fm/fetch-raw-sha1:
  upload-pack: optionally allow fetching reachable sha1
  upload-pack: prepare to extend allow-tip-sha1-in-want
  config.txt: clarify allowTipSHA1InWant with camelCase
2015-06-01 12:45:19 -07:00
Michael Haggerty
21758affae send_ref(): convert local variable "peeled" to object_id
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:37 -07:00