Commit Graph

64404 Commits

Author SHA1 Message Date
Junio C Hamano
bbeca063cf Merge branch 'ar/submodule-add-more'
More parts of "git submodule add" has been rewritten in C.

* ar/submodule-add-more:
  submodule--helper: rename compute_submodule_clone_url()
  submodule--helper: remove resolve-relative-url subcommand
  submodule--helper: remove add-config subcommand
  submodule--helper: remove add-clone subcommand
  submodule--helper: convert the bulk of cmd_add() to C
  dir: libify and export helper functions from clone.c
  submodule--helper: remove repeated code in sync_submodule()
  submodule--helper: refactor resolve_relative_url() helper
  submodule--helper: add options for compute_submodule_clone_url()
2021-09-20 15:20:43 -07:00
Junio C Hamano
b5a36278f4 Merge branch 'ar/submodule-add-config'
Large part of "git submodule add" gets rewritten in C.

* ar/submodule-add-config:
  submodule--helper: introduce add-config subcommand
2021-09-20 15:20:42 -07:00
Junio C Hamano
67fc02be54 Merge branch 'ab/unbundle-progress'
Add progress display to "git bundle unbundle".

* ab/unbundle-progress:
  bundle: show progress on "unbundle"
  index-pack: add --progress-title option
  bundle API: change "flags" to be "extra_index_pack_args"
  bundle API: start writing API documentation
2021-09-20 15:20:42 -07:00
Junio C Hamano
a1af533323 Merge branch 'tb/pack-finalize-ordering'
The order in which various files that make up a single (conceptual)
packfile has been reevaluated and straightened up.  This matters in
correctness, as an incomplete set of files must not be shown to a
running Git.

* tb/pack-finalize-ordering:
  pack-objects: rename .idx files into place after .bitmap files
  pack-write: split up finish_tmp_packfile() function
  builtin/index-pack.c: move `.idx` files into place last
  index-pack: refactor renaming in final()
  builtin/repack.c: move `.idx` files into place last
  pack-write.c: rename `.idx` files after `*.rev`
  pack-write: refactor renaming in finish_tmp_packfile()
  bulk-checkin.c: store checksum directly
  pack.h: line-wrap the definition of finish_tmp_packfile()
2021-09-20 15:20:42 -07:00
Junio C Hamano
403192acb6 Merge branch 'cb/pedantic-build-for-developers'
Update the build procedure to use the "-pedantic" build when
DEVELOPER makefile macro is in effect.

* cb/pedantic-build-for-developers:
  developer: enable pedantic by default
  win32: allow building with pedantic mode enabled
  gettext: remove optional non-standard parens in N_() definition
2021-09-20 15:20:41 -07:00
Junio C Hamano
df0c308c1a Merge branch 'ab/progress-users-adjust-counters'
The code to show progress indicator in a few code paths did not
cover between 0-100%, which has been corrected.

* ab/progress-users-adjust-counters:
  entry: show finer-grained counter in "Filtering content" progress line
  commit-graph: fix bogus counter in "Scanning merged commits" progress line
2021-09-20 15:20:41 -07:00
Junio C Hamano
75405e7270 Merge branch 'dt/submodule-diff-fixes'
"git diff --submodule=diff" showed failure from run_command() when
trying to run diff inside a submodule, when the user manually
removes the submodule directory.

* dt/submodule-diff-fixes:
  diff --submodule=diff: don't print failure message twice
  diff --submodule=diff: do not fail on ever-initialied deleted submodules
  t4060: remove unused variable
2021-09-20 15:20:41 -07:00
Junio C Hamano
c2509c5407 Merge branch 'jv/pkt-line-batch'
Reduce number of write(2) system calls while sending the
ref advertisement.

* jv/pkt-line-batch:
  upload-pack: use stdio in send_ref callbacks
  pkt-line: add stdio packet write functions
2021-09-20 15:20:41 -07:00
Junio C Hamano
ed8794ef7a Merge branch 'lh/systemd-timers'
"git maintenance" scheduler learned to use systemd timers as a
possible backend.

* lh/systemd-timers:
  maintenance: add support for systemd timers on Linux
  maintenance: `git maintenance run` learned `--scheduler=<scheduler>`
  cache.h: Introduce a generic "xdg_config_home_for(…)" function
2021-09-20 15:20:40 -07:00
Junio C Hamano
76f5fdc203 Merge branch 'ab/tr2-leaks-and-fixes'
The tracing of process ancestry information has been enhanced.

* ab/tr2-leaks-and-fixes:
  tr2: log N parent process names on Linux
  tr2: do compiler enum check in trace2_collect_process_info()
  tr2: leave the parent list empty upon failure & don't leak memory
  tr2: stop leaking "thread_name" memory
  tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux
  tr2: remove NEEDSWORK comment for "non-procfs" implementations
2021-09-20 15:20:40 -07:00
Junio C Hamano
11e5d0a262 Merge branch 'jt/grep-wo-submodule-odb-as-alternate'
The code to make "git grep" recurse into submodules has been
updated to migrate away from the "add submodule's object store as
an alternate object store" mechanism (which is suboptimal).

* jt/grep-wo-submodule-odb-as-alternate:
  t7814: show lack of alternate ODB-adding
  submodule-config: pass repo upon blob config read
  grep: add repository to OID grep sources
  grep: allocate subrepos on heap
  grep: read submodule entry with explicit repo
  grep: typesafe versions of grep_source_init
  grep: use submodule-ODB-as-alternate lazy-addition
  submodule: lazily add submodule ODBs as alternates
2021-09-20 15:20:39 -07:00
Junio C Hamano
0649303820 Merge branch 'tb/multi-pack-bitmaps'
The reachability bitmap file used to be generated only for a single
pack, but now we've learned to generate bitmaps for history that
span across multiple packfiles.

* tb/multi-pack-bitmaps: (29 commits)
  pack-bitmap: drop bitmap_index argument from try_partial_reuse()
  pack-bitmap: drop repository argument from prepare_midx_bitmap_git()
  p5326: perf tests for MIDX bitmaps
  p5310: extract full and partial bitmap tests
  midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  t7700: update to work with MIDX bitmap test knob
  t5319: don't write MIDX bitmaps in t5319
  t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
  t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP
  t5326: test multi-pack bitmap behavior
  t/helper/test-read-midx.c: add --checksum mode
  t5310: move some tests to lib-bitmap.sh
  pack-bitmap: write multi-pack bitmaps
  pack-bitmap: read multi-pack bitmaps
  pack-bitmap.c: avoid redundant calls to try_partial_reuse
  pack-bitmap.c: introduce 'bitmap_is_preferred_refname()'
  pack-bitmap.c: introduce 'nth_bitmap_object_oid()'
  pack-bitmap.c: introduce 'bitmap_num_objects()'
  midx: avoid opening multiple MIDXs when writing
  midx: close linked MIDXs, avoid leaking memory
  ...
2021-09-20 15:20:39 -07:00
Junio C Hamano
deec8aa2d0 Merge branch 'ps/fetch-optim'
Optimize code that handles large number of refs in the "git fetch"
code path.

* ps/fetch-optim:
  fetch: avoid second connectivity check if we already have all objects
  fetch: merge fetching and consuming refs
  fetch: refactor fetch refs to be more extendable
  fetch-pack: optimize loading of refs via commit graph
  connected: refactor iterator to return next object ID directly
  fetch: avoid unpacking headers in object existence check
  fetch: speed up lookup of want refs via commit-graph
2021-09-20 15:20:39 -07:00
Jeff King
6b58df54cf clone: handle unborn branch in bare repos
When cloning a repository with an unborn HEAD, we'll set the local HEAD
to match it only if the local repository is non-bare. This is
inconsistent with all other combinations:

  remote HEAD       | local repo | local HEAD
  -----------------------------------------------
  points to commit  | non-bare   | same as remote
  points to commit  | bare       | same as remote
  unborn            | non-bare   | same as remote
  unborn            | bare       | local default

So I don't think this is some clever or subtle behavior, but just a bug
in 4f37d45706 (clone: respect remote unborn HEAD, 2021-02-05). And it's
easy to see how we ended up there. Before that commit, the code to set
up the HEAD for an empty repo was guarded by "if (!option_bare)". That's
because the only thing it did was call install_branch_config(), and we
don't want to do so for a bare repository (unborn HEAD or not).

That commit put the handling of unborn HEADs into the same block, since
those also need to call install_branch_config(). But the unborn case has
an additional side effect of calling create_symref(), and we want that
to happen whether we are bare or not.

This patch just pulls all of the "figure out the default branch" code
out of the "!option_bare" block. Only the actual config installation is
kept there.

Note that this does mean we might allocate "ref" and not use it (if the
remote is empty but did not advertise an unborn HEAD). But that's not
really a big deal since this isn't a hot code path, and it keeps the
code simple. The alternative would be handling unborn_head_target
separately, but that gets confusing since its memory ownership is
tangled up with the "ref" variable.

There's just one new test, for the case we're fixing. The other ones in
the table are handled elsewhere (the unborn non-bare case just above,
and the actually-born cases in t5601, t5606, and t5609, as they do not
require v2's "unborn" protocol extension).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 14:05:36 -07:00
Elijah Newren
3584cff71c merge-ort: fix completely wrong comment
Not sure what happened, but the comment is describing code elsewhere in
the file.  Fix the comment to actually discuss the code that follows.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 11:25:02 -07:00
Elijah Newren
b031f47802 trace2.h: fix trivial comment typo
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 11:25:01 -07:00
Taylor Blau
d5fdf3073a builtin/commit-graph.c: don't accept common --[no-]progress
In 84e4484f12 (commit-graph: use parse_options_concat(), 2021-08-23) we
unified common options of commit-graph's subcommands into a single
"common_opts" array.

But 84e4484f12 introduced a behavior change which is to accept the
"--[no-]progress" option before any sub-commands, e.g.,

    git commit-graph --progress write ...

Prior to that commit, the above would error out with "unknown option".

There are two issues with this behavior change. First is that the
top-level --[no-]progress is not always respected. This is because
isatty(2) is performed in the sub-commands, which unconditionally
overwrites any --[no-]progress that was given at the top-level.

But the second issue is that the existing sub-commands of commit-graph
only happen to both have a sensible interpretation of what `--progress`
or `--no-progress` means. If we ever added a sub-command which didn't
have a notion of progress, we would be forced to ignore the top-level
`--[no-]progress` altogether.

Since we haven't released a version of Git that supports --[no-]progress
as a top-level option for `git commit-graph`, let's remove it.

Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 11:01:23 -07:00
Wesley Schwengle
d1e894c6d7 Document rebase.forkpoint in rebase man page
The configuration option `rebase.forkpoint' is only mentioned in the man
page of git-config(1). Since it is a configuration for rebase, mention
it in the documentation of rebase at the --fork-point/--no-fork-point
section. This will help users set a preferred default for their
workflow.

Signed-off-by: Wesley Schwengle <wesley@opperschaap.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20 09:05:48 -07:00
Carlo Marcelo Arenas Belón
187fc8b8b6 unicode: update the width tables to Unicode 14
Released[0] after a long beta period and including several additional
zero/double width characters.

[0] https://home.unicode.org/announcing-the-unicode-standard-version-14-0/

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-17 17:26:21 -07:00
Carlo Marcelo Arenas Belón
4b81f690f6 Documentation: cleanup git-cvsserver
Fix a few typos and alignment issues, and while at it update the
example hashes to show most of the ones available in recent crypt(3).

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 20:47:48 -07:00
Carlo Marcelo Arenas Belón
bffcb4d9d6 git-cvsserver: protect against NULL in crypt(3)
Some versions of crypt(3) will return NULL when passed an unsupported
hash type (ex: OpenBSD with DES), so check for undef instead of using
it directly.

Also use this to probe the system and select a better hash function in
the tests, so it can pass successfully.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
[jc: <CAPUEspjqD5zy8TLuFA96usU7FYi=0wF84y7NgOVFqegtxL9zbw@mail.gmail.com>]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 20:47:23 -07:00
Carlo Marcelo Arenas Belón
a7775c7eb8 git-cvsserver: use crypt correctly to compare password hashes
c057bad370 (git-cvsserver: use a password file cvsserver pserver,
2010-05-15) adds a way for `git cvsserver` to provide authenticated
pserver accounts without having clear text passwords, but uses the
username instead of the password to the call for crypt(3).

Correct that, and make sure the documentation correctly indicates how
to obtain hashed passwords that could be used to populate this
configuration, as well as correcting the hash that was used for the
tests.

This change will require that any user of this feature updates the
hashes in their configuration, but has the advantage of using a more
similar format than cvs uses, probably also easying any migration.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 15:06:24 -07:00
Carlo Marcelo Arenas Belón
66c0c44df6 t0000: avoid masking git exit value through pipes
9af0b8dbe2 (t0000-basic: more commit-tree tests., 2006-04-26) adds
tests for commit-tree that mask the return exit from git as described
in a378fee5b0 (Documentation: add shell guidelines, 2018-10-05).

Fix the tests, to avoid pipes by using a temporary file instead.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 13:43:42 -07:00
Carlo Marcelo Arenas Belón
637799bf0a tree-diff: fix leak when not HAVE_ALLOCA_H
b8ba412bf7 (tree-diff: avoid alloca for large allocations, 2016-06-07)
adds a way to route some bigger allocations out of the stack and free
them through the addition of two conveniently named macros, but leaves
the calls to free the xalloca part, which could be also in the heap,
if the system doesn't HAVE_ALLOCA_H (ex: macOS and other BSD).

Add the missing free call, xalloca_free(), which is a noop if we
allocated memory in the stack frame, but a real free() if we
allocated in the heap instead, and while at it, change the expression
to match in both macros for ease of readability.

This avoids a leak reported by LSAN while running t0000 but that
wouldn't fail the test (which is fixed in the next patch):

  SUMMARY: LeakSanitizer: 1034 byte(s) leaked in 15 allocation(s).

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 13:43:42 -07:00
Kyle Zhao
afb32e8101 pack-revindex.h: correct the time complexity descriptions
Time complexities for pack_pos_to_midx and midx_to_pack_pos are swapped,
correct it.

Signed-off-by: Kyle Zhao <kylezhao@tencent.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 22:16:25 -07:00
Junio C Hamano
4c719308ce The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 13:15:28 -07:00
Junio C Hamano
44257f7b52 Merge branch 'jc/prefix-filename-allocates'
Leakfix.

* jc/prefix-filename-allocates:
  hash-object: prefix_filename() returns allocated memory these days
2021-09-15 13:15:28 -07:00
Junio C Hamano
3d141d8789 Merge branch 'rs/range-diff-avoid-segfault-with-I'
"git range-diff -I... <range> <range>" segfaulted, which has been
corrected.

* rs/range-diff-avoid-segfault-with-I:
  range-diff: avoid segfault with -I
2021-09-15 13:15:27 -07:00
Junio C Hamano
1ea5e46cb9 Merge branch 'ab/reverse-midx-optim'
The code that optionally creates the *.rev reverse index file has
been optimized to avoid needless computation when it is not writing
the file out.

* ab/reverse-midx-optim:
  pack-write: skip *.rev work when not writing *.rev
2021-09-15 13:15:27 -07:00
Junio C Hamano
5639a8d144 Merge branch 'bs/install-strip'
"make INSTALL_STRIP=-s install" allows the installation step to use
"install -s" to strip the binaries as they get installed.

* bs/install-strip:
  make: add INSTALL_STRIP option variable
2021-09-15 13:15:26 -07:00
Junio C Hamano
2b2af95908 Merge branch 'pb/test-use-user-env'
Teach "test_pause" and "debug" helpers to allow using the HOME and
TERM environment variables the user usually uses.

* pb/test-use-user-env:
  test-lib-functions: keep user's debugger config files and TERM in 'debug'
  test-lib-functions: optionally keep HOME, TERM and SHELL in 'test_pause'
  test-lib-functions: use 'TEST_SHELL_PATH' in 'test_pause'
2021-09-15 13:15:26 -07:00
Junio C Hamano
c76fcf3e46 Merge branch 'jc/trivial-threeway-binary-merge'
The "git apply -3" code path learned not to bother the lower level
merge machinery when the three-way merge can be trivially resolved
without the content level merge.

* jc/trivial-threeway-binary-merge:
  apply: resolve trivial merge without hitting ll-merge with "--3way"
2021-09-15 13:15:26 -07:00
Junio C Hamano
f696272e58 Merge branch 'bs/doc-bugreport-outdir'
Docfix.

* bs/doc-bugreport-outdir:
  Documentation: fix default directory of git bugreport -o
2021-09-15 13:15:25 -07:00
Junio C Hamano
59a29d1644 Merge branch 'ab/no-more-check-bindir'
Build simplification.

* ab/no-more-check-bindir:
  Makefile: remove the check_bindir script
2021-09-15 13:15:25 -07:00
Junio C Hamano
10de757a09 Merge branch 'ab/send-email-config-fix'
Regression fix.

* ab/send-email-config-fix:
  send-email: fix a "first config key wins" regression in v2.33.0
2021-09-15 13:15:24 -07:00
Junio C Hamano
e8332242b7 Merge branch 'so/diff-index-regression-fix'
Recent "diff -m" changes broke "gitk", which has been corrected.

* so/diff-index-regression-fix:
  diff-index: restore -c/--cc options handling
2021-09-15 13:15:24 -07:00
Jeff King
7c1200745b t1400: avoid SIGPIPE race condition on fifo
t1400.190 sometimes fails or even hangs because of the way it uses
fifos. Our goal is to interactively read and write lines from
update-ref, so we have two fifos, in and out. We open a descriptor
connected to "in" and redirect output to that, so that update-ref does
not see EOF as it would if we opened and closed it for each "echo" call.

But we don't do the same for the output. This leads to a race where our
"read response <out" has not yet opened the fifo, but update-ref tries
to write to it and gets SIGPIPE. This can result in the test failing, or
worse, hanging as we wait forever for somebody to write to the pipe.

This is the same proble we fixed in 4783e7ea83 (t0008: avoid SIGPIPE
race condition on fifo, 2013-07-12), and we can fix it the same way, by
opening a second long-running descriptor.

Before this patch, running:

  ./t1400-update-ref.sh --run=1,190 --stress

failed or hung within a few dozen iterations. After it, I ran it for
several hundred without problems.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 13:06:50 -07:00
Jonathan Tan
ce125d431a submodule: extract path to submodule gitdir func
We currently store each submodule gitdir in ".git/modules/<name>", but
this has problems with some submodule naming schemes, as described in a
comment in submodule_name_to_gitdir() in this patch.

Extract the determination of the location of a submodule's gitdir into
its own function submodule_name_to_gitdir(). For now, the problem
remains unsolved, but this puts us in a better position for finding a
solution.

This was motivated, at $DAYJOB, by a part of Android's repo hierarchy
[1]. In particular, there is a repo "build", and several repos of the
form "build/<name>".

This is based on earlier work by Brandon Williams [2].

[1] https://android.googlesource.com/platform/
[2] https://lore.kernel.org/git/20180808223323.79989-2-bmwill@google.com/

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:59:12 -07:00
Jeff King
ccf094788c ls-refs: reject unknown arguments
The v2 ls-refs command may receive extra arguments from the client, one
per pkt-line. The spec is pretty clear that the arguments must come from
a specified set, but we silently ignore any unknown entries. For a
well-behaved client this doesn't matter, but it makes testing and
debugging more confusing. Let's tighten this up to match the spec.

In theory this liberal behavior _could_ be useful for extending the
protocol. But:

  - every other part of the protocol requires that the server first
    indicate that it supports the argument; this includes the fetch and
    object-info commands, plus the "unborn" capability added to ls-refs
    itself

  - it's not a very good extension mechanism anyway; without the server
    advertising support, clients would have no idea if the argument was
    silently ignored, or accepted and simply had no effect

So we're not really losing anything by tightening this.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
Jeff King
0ab7eeccd9 serve: reject commands used as capabilities
Our table of v2 "capabilities" contains everything we might tell the
client we support. But there are differences in how we expect the client
to respond. Some of the entries are true capabilities (i.e., we expect
the client to say "yes, I support this"), and some are ones we expect
them to send as commands (with "command=ls-refs" or similar).

When we receive a capability used as a command, we complain about that.
But when we receive a command used as a capability (e.g., just "ls-refs"
in a pkt-line by itself), we silently ignore it.

This isn't really hurting anything (clients shouldn't send it, and we'll
ignore it), but we can tighten up the protocol to match what we expect
to happen.

There are two new tests here. The first one checks a capability used as
a command, which already passes. The second tests a command as a
capability, which this patch fixes.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
Jeff King
108c265f27 serve: reject bogus v2 "command=ls-refs=foo"
When we see a line from the client like "command=ls-refs", we parse
everything after the equals sign as a capability, which we check against
our capabilities table. If we don't recognize the command (e.g.,
"command=foo"), we'll reject it.

But in parse_command(), we use the same get_capability() parser for
parsing non-command lines. So if we see "command=ls-refs=foo", we will
feed "ls-refs=foo" to get_capability(), which will say "OK, that's
ls-refs, with value 'foo'". But then we simply ignore the value
entirely.

The client is violating the spec here, which says:

      command = PKT-LINE("command=" key LF)
      key = 1*(ALPHA | DIGIT | "-_")

I.e., the key is not even allowed to have an equals sign in it. Whereas
a real non-command capability does allow a value:

      capability = PKT-LINE(key[=value] LF)

So by reusing the same get_capability() parser, we are mixing up the
"key" and "capability" tokens. However, since that parser tells us
whether it saw an "=", we can still use it; we just need to reject any
input that produces a non-NULL value field.

The current behavior isn't really hurting anything (the client should
never send such a request, and if it does, we just ignore the "value"
part). But since it does violate the spec, let's tighten it up to
prevent any surprising behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
Jeff King
9db5fb4fb3 docs/protocol-v2: clarify some ls-refs ref-prefix details
We've never documented the fact that a client can provide multiple
ref-prefix capabilities. Let's describe the behavior.

We also never discussed the "best effort" nature of the prefixes. The
client side of git.git has always treated them this way, filtering the
result with local patterns. And indeed any client must do this, because
the prefix patterns are not sufficient to express the usual refspecs
(and so for "foo" we ask for "refs/heads/foo", "refs/tags/foo", and so
on).

So this may be considered a change in the spec with respect to client
expectations / requirements, but it's mostly codifying existing
behavior.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
Jeff King
7f0e4f6ac2 ls-refs: ignore very long ref-prefix counts
Because each "ref-prefix" capability from the client comes in its own
pkt-line, there's no limit to the number of them that a misbehaving
client may send. We read them all into a strvec, which means the client
can waste arbitrary amounts of our memory by just sending us "ref-prefix
foo" over and over.

One possible solution is to just drop the connection when the limit is
reached. If we set it high enough, then only misbehaving or malicious
clients would hit it. But "high enough" is vague, and it's unfriendly if
we guess wrong and a legitimate client hits this.

But we can do better. Since supporting the ref-prefix capability is
optional anyway, the client has to further cull the response based on
their own patterns. So we can simply ignore the patterns once we cross a
certain threshold. Note that we have to ignore _all_ patterns, not just
the ones past our limit (since otherwise we'd send too little data).

The limit here is fairly arbitrary, and probably much higher than anyone
would need in practice. It might be worth limiting it further, if only
because we check it linearly (so with "m" local refs and "n" patterns,
we do "m * n" string comparisons). But if we care about optimizing this,
an even better solution may be a more advanced data structure anyway.

I didn't bother making the limit configurable, since it's so high and
since Git should behave correctly in either case. It wouldn't be too
hard to do, but it makes both the code and documentation more complex.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
Jeff King
f0a35c9ce5 serve: drop "keys" strvec
We collect the set of capabilities the client sends us in a strvec.
While this is usually small, there's no limit to the number of
capabilities the client can send us (e.g., they could just send us
"agent" pkt-lines over and over, and we'd keep adding them to the list).

Since all code has been converted away from using this list, let's get
rid of it. This avoids a potential attack where clients waste our
memory.

Note that we do have to replace it with a flag, because some of the
flush-packet logic checks whether we've seen any valid commands or keys.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-15 12:25:19 -07:00
Jeff King
ab539c9094 serve: provide "receive" function for session-id capability
Rather than pulling the session-id string from the list of collected
capabilities, we can handle it as soon as we receive it. This gets us
closer to dropping the collected list entirely.

The behavior should be the same, with one exception. Previously if the
client sent us multiple session-id lines, we'd report only the first.
Now we'll pass each one along to trace2. This shouldn't matter in
practice, since clients shouldn't do that (and if they do, it's probably
sensible to log them all).

As this removes the last caller of the static has_capability(), we can
remove it, as well (and in fact we must to avoid -Wunused-function
complaining).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 17:12:05 -07:00
Jeff King
c7d3aabd27 serve: provide "receive" function for object-format capability
We get any "object-format" specified by the client by searching for it
in the collected list of capabilities the client sent. We can instead
just handle it as soon as they send it. This is slightly more efficient,
and gets us one step closer to dropping that collected list.

Note that we do still have to do our final hash check after receiving
all capabilities (because they might not have sent an object-format line
at all, and we still have to check that the default matches our
repository algorithm). Since the check_algorithm() function would now be
down to a single if() statement, I've just inlined it in its only
caller.

There should be no change of behavior here, except for two
broken-protocol cases:

  - if the client sends multiple conflicting object-format capabilities
    (which they should not), we'll now choose the last one rather than
    the first. We could also detect and complain about the duplicates
    quite easily now, which we could not before, but I didn't do so
    here.

  - if the client sends a bogus "object-format" with no equals sign,
    we'll now say so, rather than "unknown object format: ''"

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 17:12:05 -07:00
Jeff King
e56e53067f serve: add "receive" method for v2 capabilities table
We have a capabilities table that tells us what we should tell the
client we are capable of, and what to do when a client gives us a
particular command (e.g., "command=ls-refs"). But it doesn't tell us
what to do when the client sends us back a capability (e.g.,
"object-format=sha256"). We just collect them all in a strvec and hope
somebody can use them later.

Instead, let's provide a function pointer in the table to act on these.
This will eventually help us avoid collecting the strings, which will be
more efficient and less prone to mischief.

Using the new method is optional, which helps in two ways:

  - we can move existing capabilities over to this new system gradually
    in individual commits

  - some capabilities we don't actually do anything with anyway. For
    example, the client is free to say "agent=git/1.2.3" to us, but we
    do not act on the information in any way.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 10:56:19 -07:00
Jeff King
5ef260d2d1 serve: return capability "value" from get_capability()
When the client sends v2 capabilities, they may be simple, like:

  foo

or have a value like:

  foo=bar

(all of the current capabilities actually expect a value, but the
protocol allows for boolean ones).

We use get_capability() to make sure the client's pktline matches a
capability. In doing so, we parse enough to see the "=" and the value
(if any), but we immediately forget it. Nobody cares for now, because they end
up parsing the values out later using has_capability(). But in
preparation for changing that, let's pass back a pointer so the callers
know what we found.

Note that unlike has_capability(), we'll return NULL for a "simple"
capability. Distinguishing these will be useful for some future patches.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 10:56:19 -07:00
Jeff King
76804526f9 serve: rename is_command() to parse_command()
The is_command() function not only tells us whether the pktline is a
valid command string, but it also parses out the command (and complains
if we see a duplicate). Let's rename it to make those extra functions
a bit more obvious.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-14 10:56:19 -07:00
Junio C Hamano
0057847208 Merge branch 'ab/serve-cleanup' into jk/reduce-malloc-in-v2-servers
* ab/serve-cleanup:
  upload-pack: document and rename --advertise-refs
  serve.[ch]: remove "serve_options", split up --advertise-refs code
  {upload,receive}-pack tests: add --advertise-refs tests
  serve.c: move version line to advertise_capabilities()
  serve: move transfer.advertiseSID check into session_id_advertise()
  serve.[ch]: don't pass "struct strvec *keys" to commands
  serve: use designated initializers
  transport: use designated initializers
  transport: rename "fetch" in transport_vtable to "fetch_refs"
  serve: mark has_capability() as static
2021-09-14 10:56:05 -07:00