Commit Graph

19763 Commits

Author SHA1 Message Date
SZEDER Gábor
0350954482 builtin/gc.c: let parse-options parse 'git maintenance's subcommands
'git maintenanze' parses its subcommands with a couple of if
statements.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.

This change makes 'git maintenance' consistent with other commands in
that the help text shown for '-h' goes to standard output, not error,
in the exit code and error message on unknown subcommand, and the
error message on missing subcommand.  There is a test checking these,
which is now updated accordingly.

Note that some of the functions implementing each subcommand don't
accept any parameters, so add the (unused) 'argc', '**argv' and
'*prefix' parameters to make them match the type expected by
parse-options, and thus avoid casting function pointers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:15 -07:00
SZEDER Gábor
1c3b05170a builtin/commit-graph.c: let parse-options parse subcommands
'git commit-graph' parses its subcommands with an if-else if
statement.  parse-options has just learned to parse subcommands, so
let's use that facility instead, with the benefits of shorter code,
handling missing or unknown subcommands, and listing subcommands for
Bash completion.

Note that the functions implementing each subcommand only accept the
'argc' and '**argv' parameters, so add a (unused) '*prefix' parameter
to make them match the type expected by parse-options, and thus avoid
casting function pointers.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:15 -07:00
SZEDER Gábor
fa83cc834d parse-options: add support for parsing subcommands
Several Git commands have subcommands to implement mutually exclusive
"operation modes", and they usually parse their subcommand argument
with a bunch of if-else if statements.

Teach parse-options to handle subcommands as well, which will result
in shorter and simpler code with consistent error handling and error
messages on unknown or missing subcommand, and it will also make
possible for our Bash completion script to handle subcommands
programmatically.

The approach is guided by the following observations:

  - Most subcommands [1] are implemented in dedicated functions, and
    most of those functions [2] either have a signature matching the
    'int cmd_foo(int argc, const char **argc, const char *prefix)'
    signature of builtin commands or can be trivially converted to
    that signature, because they miss only that last prefix parameter
    or have no parameters at all.

  - Subcommand arguments only have long form, and they have no double
    dash prefix, no negated form, and no description, and they don't
    take any arguments, and can't be abbreviated.

  - There must be exactly one subcommand among the arguments, or zero
    if the command has a default operation mode.

  - All arguments following the subcommand are considered to be
    arguments of the subcommand, and, conversely, arguments meant for
    the subcommand may not preceed the subcommand.

So in the end subcommand declaration and parsing would look something
like this:

    parse_opt_subcommand_fn *fn = NULL;
    struct option builtin_commit_graph_options[] = {
        OPT_STRING(0, "object-dir", &opts.obj_dir, N_("dir"),
                   N_("the object directory to store the graph")),
        OPT_SUBCOMMAND("verify", &fn, graph_verify),
        OPT_SUBCOMMAND("write", &fn, graph_write),
        OPT_END(),
    };
    argc = parse_options(argc, argv, prefix, options,
                         builtin_commit_graph_usage, 0);
    return fn(argc, argv, prefix);

Here each OPT_SUBCOMMAND specifies the name of the subcommand and the
function implementing it, and the address of the same 'fn' subcommand
function pointer.  parse_options() then processes the arguments until
it finds the first argument matching one of the subcommands, sets 'fn'
to the function associated with that subcommand, and returns, leaving
the rest of the arguments unprocessed.  If none of the listed
subcommands is found among the arguments, parse_options() will show
usage and abort.

If a command has a default operation mode, 'fn' should be initialized
to the function implementing that mode, and parse_options() should be
invoked with the PARSE_OPT_SUBCOMMAND_OPTIONAL flag.  In this case
parse_options() won't error out when not finding any subcommands, but
will return leaving 'fn' unchanged.  Note that if that default
operation mode has any --options, then the PARSE_OPT_KEEP_UNKNOWN_OPT
flag is necessary as well (otherwise parse_options() would error out
upon seeing the unknown option meant to the default operation mode).

Some thoughts about the implementation:

  - The same pointer to 'fn' must be specified as 'value' for each
    OPT_SUBCOMMAND, because there can be only one set of mutually
    exclusive subcommands; parse_options() will BUG() otherwise.

    There are other ways to tell parse_options() where to put the
    function associated with the subcommand given on the command line,
    but I didn't like them:

      - Change parse_options()'s signature by adding a pointer to
        subcommand function to be set to the function associated with
        the given subcommand, affecting all callsites, even those that
        don't have subcommands.

      - Introduce a specific parse_options_and_subcommand() variant
        with that extra funcion parameter.

  - I decided against automatically calling the subcommand function
    from within parse_options(), because:

      - There are commands that have to perform additional actions
        after option parsing but before calling the function
        implementing the specified subcommand.

      - The return code of the subcommand is usually the return code
        of the git command, but preserving the return code of the
        automatically called subcommand function would have made the
        API awkward.

  - Also add a OPT_SUBCOMMAND_F() variant to allow specifying an
    option flag: we have two subcommands that are purposefully
    excluded from completion ('git remote rm' and 'git stash save'),
    so they'll have to be specified with the PARSE_OPT_NOCOMPLETE
    flag.

  - Some of the 'parse_opt_flags' don't make sense with subcommands,
    and using them is probably just an oversight or misunderstanding.
    Therefore parse_options() will BUG() when invoked with any of the
    following flags while the options array contains at least one
    OPT_SUBCOMMAND:

      - PARSE_OPT_KEEP_DASHDASH: parse_options() stops parsing
        arguments when encountering a "--" argument, so it doesn't
        make sense to expect and keep one before a subcommand, because
        it would prevent the parsing of the subcommand.

        However, this flag is allowed in combination with the
        PARSE_OPT_SUBCOMMAND_OPTIONAL flag, because the double dash
        might be meaningful for the command's default operation mode,
        e.g. to disambiguate refs and pathspecs.

      - PARSE_OPT_STOP_AT_NON_OPTION: As its name suggests, this flag
        tells parse_options() to stop as soon as it encouners a
        non-option argument, but subcommands are by definition not
        options...  so how could they be parsed, then?!

      - PARSE_OPT_KEEP_UNKNOWN: This flag can be used to collect any
        unknown --options and then pass them to a different command or
        subsystem.  Surely if a command has subcommands, then this
        functionality should rather be delegated to one of those
        subcommands, and not performed by the command itself.

        However, this flag is allowed in combination with the
        PARSE_OPT_SUBCOMMAND_OPTIONAL flag, making possible to pass
        --options to the default operation mode.

  - If the command with subcommands has a default operation mode, then
    all arguments to the command must preceed the arguments of the
    subcommand.

    AFAICT we don't have any commands where this makes a difference,
    because in those commands either only the command accepts any
    arguments ('notes' and 'remote'), or only the default subcommand
    ('reflog' and 'stash'), but never both.

  - The 'argv' array passed to subcommand functions currently starts
    with the name of the subcommand.  Keep this behavior.  AFAICT no
    subcommand functions depend on the actual content of 'argv[0]',
    but the parse_options() call handling their options expects that
    the options start at argv[1].

  - To support handling subcommands programmatically in our Bash
    completion script, 'git cmd --git-completion-helper' will now list
    both subcommands and regular --options, if any.  This means that
    the completion script will have to separate subcommands (i.e.
    words without a double dash prefix) from --options on its own, but
    that's rather easy to do, and it's not much work either, because
    the number of subcommands a command might have is rather low, and
    those commands accept only a single --option or none at all.  An
    alternative would be to introduce a separate option that lists
    only subcommands, but then the completion script would need not
    one but two git invocations and command substitutions for commands
    with subcommands.

    Note that this change doesn't affect the behavior of our Bash
    completion script, because when completing the --option of a
    command with subcommands, e.g. for 'git notes --<TAB>', then all
    subcommands will be filtered out anyway, as none of them will
    match the word to be completed starting with that double dash
    prefix.

[1] Except 'git rerere', because many of its subcommands are
    implemented in the bodies of the if-else if statements parsing the
    command's subcommand argument.

[2] Except 'credential', 'credential-store' and 'fsmonitor--daemon',
    because some of the functions implementing their subcommands take
    special parameters.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
SZEDER Gábor
99d86d60e5 parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options
The description of 'PARSE_OPT_KEEP_UNKNOWN' starts with "Keep unknown
arguments instead of erroring out".  This is a bit misleading, as this
flag only applies to unknown --options, while non-option arguments are
kept even without this flag.

Update the description to clarify this, and rename the flag to
PARSE_OPTIONS_KEEP_UNKNOWN_OPT to make this obvious just by looking at
the flag name.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
SZEDER Gábor
c1b117d31c t0040-parse-options: test parse_options() with various 'parse_opt_flags'
In 't0040-parse-options.sh' we thoroughly test the parsing of all
types and forms of options, but in all those tests parse_options() is
always invoked with a 0 flags parameter.

Add a few tests to demonstrate how various 'enum parse_opt_flags'
values are supposed to influence option parsing.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:14 -07:00
SZEDER Gábor
31a66c1964 t5505-remote.sh: check the behavior without a subcommand
'git remote' without a subcommand defaults to listing all remotes and
doesn't accept any arguments except the '-v|--verbose' option.

We are about to teach parse-options to handle subcommands, and update
'git remote' to make use of that new feature.  So let's add some tests
to make sure that the upcoming changes don't inadvertently change the
behavior in these cases.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:13 -07:00
SZEDER Gábor
9e4658d5c6 t3301-notes.sh: check that default operation mode doesn't take arguments
'git notes' without a subcommand defaults to listing all notes and
doesn't accept any arguments.

We are about to teach parse-options to handle subcommands, and update
'git notes' to make use of that new feature.  So let's add a test to
make sure that the upcoming changes don't inadvertenly change the
behavior in this corner case.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 11:13:13 -07:00
Junio C Hamano
a31dbaebb1 Merge branch 'js/ci-github-workflow-markup'
A fix for a regression in test framework.

* js/ci-github-workflow-markup:
  tests: fix incorrect --write-junit-xml code
2022-07-22 15:04:03 -07:00
Junio C Hamano
18bbc795fc Merge branch 'gc/bare-repo-discovery'
Introduce a discovery.barerepository configuration variable that
allows users to forbid discovery of bare repositories.

* gc/bare-repo-discovery:
  setup.c: create `safe.bareRepository`
  safe.directory: use git_protected_config()
  config: learn `git_protected_config()`
  Documentation: define protected configuration
  Documentation/git-config.txt: add SCOPES section
2022-07-22 15:04:02 -07:00
Junio C Hamano
4b8cdff8ba Merge branch 'll/curl-accept-language'
Earlier, HTTP transport clients learned to tell the server side
what locale they are in by sending Accept-Language HTTP header, but
this was done only for some requests but not others.

* ll/curl-accept-language:
  remote-curl: send Accept-Language header to server
2022-07-19 16:40:19 -07:00
Junio C Hamano
cf92cb29e9 Merge branch 'jk/clone-unborn-confusion'
"git clone" from a repository with some ref whose HEAD is unborn
did not set the HEAD in the resulting repository correctly, which
has been corrected.

* jk/clone-unborn-confusion:
  clone: move unborn head creation to update_head()
  clone: use remote branch if it matches default HEAD
  clone: propagate empty remote HEAD even with other branches
  clone: drop extra newline from warning message
2022-07-19 16:40:17 -07:00
Junio C Hamano
99c0d94eaa Merge branch 'hx/lookup-commit-in-graph-fix'
A corner case bug where lazily fetching objects from a promisor
remote resulted in infinite recursion has been corrected.

* hx/lookup-commit-in-graph-fix:
  t5330: remove run_with_limited_processses()
  commit-graph.c: no lazy fetch in lookup_commit_in_graph()
2022-07-19 16:40:16 -07:00
Junio C Hamano
418aef9055 Merge branch 'jc/resolve-undo'
The resolve-undo information in the index was not protected against
GC, which has been corrected.

* jc/resolve-undo:
  fsck: do not dereference NULL while checking resolve-undo data
  revision: mark blobs needed for resolve-undo as reachable
2022-07-19 16:40:16 -07:00
Junio C Hamano
6d003858e5 Merge branch 'gc/submodule-use-super-prefix'
Another step to rewrite more parts of "git submodule" in C.

* gc/submodule-use-super-prefix:
  submodule--helper: remove display path helper
  submodule--helper update: use --super-prefix
  submodule--helper: remove unused SUPPORT_SUPER_PREFIX flags
  submodule--helper: use correct display path helper
  submodule--helper: don't recreate recursive prefix
  submodule--helper update: use display path helper
  submodule--helper tests: add missing "display path" coverage
2022-07-18 13:31:56 -07:00
Junio C Hamano
e3349f2888 Merge branch 'en/merge-dual-dir-renames-fix'
Fixes a long-standing corner case bug around directory renames in
the merge-ort strategy.

* en/merge-dual-dir-renames-fix:
  merge-ort: fix issue with dual rename and add/add conflict
  merge-ort: shuffle the computation and cleanup of potential collisions
  merge-ort: make a separate function for freeing struct collisions
  merge-ort: small cleanups of check_for_directory_rename
  t6423: add tests of dual directory rename plus add/add conflict
2022-07-18 13:31:56 -07:00
Junio C Hamano
3d3874d537 Merge branch 'ab/test-without-templates'
Tweak tests so that they still work when the "git init" template
did not create .git/info directory.

* ab/test-without-templates:
  tests: don't assume a .git/info for .git/info/sparse-checkout
  tests: don't assume a .git/info for .git/info/exclude
  tests: don't assume a .git/info for .git/info/refs
  tests: don't assume a .git/info for .git/info/attributes
  tests: don't assume a .git/info for .git/info/grafts
  tests: don't depend on template-created .git/branches
  t0008: don't rely on default ".git/info/exclude"
2022-07-18 13:31:55 -07:00
Junio C Hamano
48e88a4862 Merge branch 'ab/build-gitweb'
Teach "make all" to build gitweb as well.

* ab/build-gitweb:
  gitweb/Makefile: add a "NO_GITWEB" parameter
  Makefile: build 'gitweb' in the default target
  gitweb/Makefile: include in top-level Makefile
  gitweb: remove "test" and "test-installed" targets
  gitweb/Makefile: prepare to merge into top-level Makefile
  gitweb/Makefile: clear up and de-duplicate the gitweb.{css,js} vars
  gitweb/Makefile: add a $(GITWEB_ALL) variable
  gitweb/Makefile: define all .PHONY prerequisites inline
2022-07-18 13:31:55 -07:00
Junio C Hamano
f63ac61fbf Merge branch 'ab/test-tool-leakfix'
Plug various memory leaks in test-tool commands.

* ab/test-tool-leakfix:
  test-tool delta: fix a memory leak
  test-tool ref-store: fix a memory leak
  test-tool bloom: fix memory leaks
  test-tool json-writer: fix memory leaks
  test-tool regex: call regfree(), fix memory leaks
  test-tool urlmatch-normalization: fix a memory leak
  test-tool {dump,scrap}-cache-tree: fix memory leaks
  test-tool path-utils: fix a memory leak
  test-tool test-hash: fix a memory leak
2022-07-18 13:31:54 -07:00
Junio C Hamano
44357f64f6 Merge branch 'ab/leakfix'
Plug various memory leaks.

* ab/leakfix:
  pull: fix a "struct oid_array" memory leak
  cat-file: fix a common "struct object_context" memory leak
  gc: fix a memory leak
  checkout: avoid "struct unpack_trees_options" leak
  merge-file: fix memory leaks on error path
  merge-file: refactor for subsequent memory leak fix
  cat-file: fix a memory leak in --batch-command mode
  revert: free "struct replay_opts" members
  submodule.c: free() memory from xgetcwd()
  clone: fix memory leak in wanted_peer_refs()
  check-ref-format: fix trivial memory leak
2022-07-18 13:31:54 -07:00
Glen Choo
8d1a744820 setup.c: create safe.bareRepository
There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.

A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.

Create a config variable, `safe.bareRepository`, that tells Git whether
or not to die() when working with a bare repository. This config is an
enum of:

- "all": allow all bare repositories (this is the default)
- "explicit": only allow bare repositories specified via --git-dir
  or GIT_DIR.

If we want to protect users from such attacks by default, neither value
will suffice - "all" provides no protection, but "explicit" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
Glen Choo
6061601d9f safe.directory: use git_protected_config()
Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected configuration only'.

As a result, `safe.directory` now respects "-c", so update the tests and
docs accordingly. It used to ignore "-c" due to how it was implemented,
not because of security or correctness concerns [1].

[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
Glen Choo
5b3c650777 config: learn git_protected_config()
`uploadpack.packObjectsHook` is the only 'protected configuration only'
variable today, but we've noted that `safe.directory` and the upcoming
`safe.bareRepository` should also be 'protected configuration only'. So,
for consistency, we'd like to have a single implementation for protected
configuration.

The primary constraints are:

1. Reading from protected configuration should be fast. Nearly all "git"
   commands inside a bare repository will read both `safe.directory` and
   `safe.bareRepository`, so we cannot afford to be slow.

2. Protected configuration must be readable when the gitdir is not
   known. `safe.directory` and `safe.bareRepository` both affect
   repository discovery and the gitdir is not known at that point [1].

The chosen implementation in this commit is to read protected
configuration and cache the values in a global configset. This is
similar to the caching behavior we get with the_repository->config.

Introduce git_protected_config(), which reads protected configuration
and caches them in the global configset protected_config. Then, refactor
`uploadpack.packObjectsHook` to use git_protected_config().

The protected configuration functions are named similarly to their
non-protected counterparts, e.g. git_protected_config_check_init() vs
git_config_check_init().

In light of constraint 1, this implementation can still be improved.
git_protected_config() iterates through every variable in
protected_config, which is wasteful, but it makes the conversion simple
because it matches existing patterns. We will likely implement constant
time lookup functions for protected configuration in a future series
(such functions already exist for non-protected configuration, i.e.
repo_config_get_*()).

An alternative that avoids introducing another configset is to continue
to read all config using git_config(), but only accept values that have
the correct config scope [2]. This technically fulfills constraint 2,
because git_config() simply ignores the local and worktree config when
the gitdir is not known. However, this would read incomplete config into
the_repository->config, which would need to be reset when the gitdir is
known and git_config() needs to read the local and worktree config.
Resetting the_repository->config might be reasonable while we only have
these 'protected configuration only' variables, but it's not clear
whether this extends well to future variables.

[1] In this case, we do have a candidate gitdir though, so with a little
refactoring, it might be possible to provide a gitdir.
[2] This is how `uploadpack.packObjectsHook` was implemented prior to
this commit.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
Junio C Hamano
361cbe6d6d Merge branch 'ab/submodule-cleanup'
Further preparation to turn git-submodule.sh into a builtin.

* ab/submodule-cleanup:
  git-sh-setup.sh: remove "say" function, change last users
  git-submodule.sh: use "$quiet", not "$GIT_QUIET"
  submodule--helper: eliminate internal "--update" option
  submodule--helper: understand --checkout, --merge and --rebase synonyms
  submodule--helper: report "submodule" as our name in some "-h" output
  submodule--helper: rename "absorb-git-dirs" to "absorbgitdirs"
  submodule update: remove "-v" option
  submodule--helper: have --require-init imply --init
  git-submodule.sh: remove unused top-level "--branch" argument
  git-submodule.sh: make the "$cached" variable a boolean
  git-submodule.sh: remove unused $prefix variable
  git-submodule.sh: remove unused sanitize_submodule_env()
2022-07-14 15:04:00 -07:00
Junio C Hamano
0455aad1e3 Merge branch 'sy/mv-out-of-cone'
"git mv A B" in a sparsely populated working tree can be asked to
move a path between directories that are "in cone" (i.e. expected
to be materialized in the working tree) and "out of cone"
(i.e. expected to be hidden).  The handling of such cases has been
improved.

* sy/mv-out-of-cone:
  mv: add check_dir_in_index() and solve general dir check issue
  mv: use flags mode for update_mode
  mv: check if <destination> exists in index to handle overwriting
  mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
  mv: decouple if/else-if checks using goto
  mv: update sparsity after moving from out-of-cone to in-cone
  t1092: mv directory from out-of-cone to in-cone
  t7002: add tests for moving out-of-cone file/directory
2022-07-14 15:04:00 -07:00
Junio C Hamano
73b9ef6ab1 Merge branch 'hx/unpack-streaming'
Allow large objects read from a packstream to be streamed into a
loose object file straight, without having to keep it in-core as a
whole.

* hx/unpack-streaming:
  unpack-objects: use stream_loose_object() to unpack large objects
  core doc: modernize core.bigFileThreshold documentation
  object-file.c: add "stream_loose_object()" to handle large object
  object-file.c: factor out deflate part of write_loose_object()
  object-file.c: refactor write_loose_object() to several steps
  unpack-objects: low memory footprint for get_data() in dry_run mode
2022-07-14 15:03:59 -07:00
Junio C Hamano
be733e1200 Merge branch 'en/merge-tree'
"git merge-tree" learned a new mode where it takes two commits and
computes a tree that would result in the merge commit, if the
histories leading to these two commits were to be merged.

* en/merge-tree:
  git-merge-tree.txt: add a section on potentional usage mistakes
  merge-tree: add a --allow-unrelated-histories flag
  merge-tree: allow `ls-files -u` style info to be NUL terminated
  merge-ort: optionally produce machine-readable output
  merge-ort: store more specific conflict information
  merge-ort: make `path_messages` a strmap to a string_list
  merge-ort: store messages in a list, not in a single strbuf
  merge-tree: provide easy access to `ls-files -u` style info
  merge-tree: provide a list of which files have conflicts
  merge-ort: remove command-line-centric submodule message from merge-ort
  merge-ort: provide a merge_get_conflicted_files() helper function
  merge-tree: support including merge messages in output
  merge-ort: split out a separate display_update_messages() function
  merge-tree: implement real merges
  merge-tree: add option parsing and initial shell for real merge function
  merge-tree: move logic for existing merge into new function
  merge-tree: rename merge_trees() to trivial_merge_trees()
2022-07-14 15:03:59 -07:00
Junio C Hamano
dc6315e1fc Merge branch 'gg/worktree-from-the-above'
In a non-bare repository, the behavior of Git when the
core.worktree configuration variable points at a directory that has
a repository as its subdirectory, regressed in Git 2.27 days.

* gg/worktree-from-the-above:
  dir: minor refactoring / clean-up
  dir: traverse into repository
2022-07-14 15:03:58 -07:00
Johannes Schindelin
7253f7ca9f tests: fix incorrect --write-junit-xml code
In 78d5e4cfb4 (tests: refactor --write-junit-xml code, 2022-05-21),
this developer refactored the `--write-junit-xml` code a bit, including
the part where the current test case's title was used in a `set`
invocation, but failed to account for the fact that some test cases'
titles start with a long option, which the `set` misinterprets as being
intended for parsing.

Let's fix this by using the `set -- <...>` form.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 10:02:06 -07:00
Junio C Hamano
8c4f65e0bf Merge branch 'cl/grep-max-count'
"git grep -m<max-hits>" is a way to limit the hits shown per file.

* cl/grep-max-count:
  grep: add --max-count command line option
2022-07-13 14:54:55 -07:00
Junio C Hamano
81705c4ee6 Merge branch 'zk/push-use-bitmaps'
"git push" sometimes perform poorly when reachability bitmaps are
used, even in a repository where other operations are helped by
bitmaps.  The push.useBitmaps configuration variable is introduced
to allow disabling use of reachability bitmaps only for "git push".

* zk/push-use-bitmaps:
  send-pack.c: add config push.useBitmaps
2022-07-13 14:54:54 -07:00
Junio C Hamano
33f448b5fc Merge branch 'jk/remote-show-with-negative-refspecs'
"git remote show [-n] frotz" now pays attention to negative
pathspec.

* jk/remote-show-with-negative-refspecs:
  remote: handle negative refspecs in git remote show
2022-07-13 14:54:54 -07:00
Junio C Hamano
ee493108e5 Merge branch 'll/ls-files-tests-update'
Test update.

* ll/ls-files-tests-update:
  ls-files: update test style
2022-07-13 14:54:53 -07:00
Junio C Hamano
92a25a8897 Merge branch 'ab/test-quoting-fix'
Fixes for tests when the source directory has unusual characters in
its path, e.g. whitespaces, double-quotes, etc.

* ab/test-quoting-fix:
  config tests: fix harmless but broken "rm -r" cleanup
  test-lib.sh: fix prepend_var() quoting issue
  tests: add missing double quotes to included library paths
2022-07-13 14:54:52 -07:00
Junio C Hamano
db791e6e8f Merge branch 'ds/t5510-brokequote'
Test fix.

* ds/t5510-brokequote:
  t5510: replace 'origin' with URL more carefully
2022-07-13 14:54:52 -07:00
Junio C Hamano
8da79e7250 Merge branch 'en/t6429-test-must-be-empty-fix'
A test fix.

* en/t6429-test-must-be-empty-fix:
  t6429: fix use of non-existent function
2022-07-13 14:54:51 -07:00
Han Xin
cb88b37cb9 t5330: remove run_with_limited_processses()
run_with_limited_processses() is used to end the loop faster when an
infinite loop happen. But "ulimit" is tied to the entire development
station, and the test will fail due to too many other processes or using
"--stress".

Without run_with_limited_processses() the infinite loop can also be
stopped due to global configrations or quotas, and the verification
still works fine. So let's remove run_with_limited_processses().

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-12 07:47:43 -07:00
Junio C Hamano
f2e5255fc2 Git 2.37.1
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmLDUXkACgkQsLXohpav
 5sv4SA/6AyzOuaMiBVTLtiYETFj9UU1Z3C12XtUlnu4qmw4Ddd1rq8/E4BgNDYJ7
 cC4MDVfOp7tvyaVGBBSfzIvIieBnGa7PIQl4z1eqZqIm0xt4T1D65jx1CHeJ+UIK
 k63L879dodQnIgwd1ThoGng0BUvmTREFODbGzX6JYPlRuEYkOpswLdvMO14epjLu
 T+TV9etoD5UELTrwnXDOq2ydH1MguyFj21g6NhMvBDTMCbZlQFb9skuF8dx2mC7T
 TktGntaEnFbm1UoZAoHg7g9AzO0iD+Vl6MVkkkolxJyAqiNUf030Ct6RQq9kRj1W
 7kYJDrgN9Oh3g93tJGsnTHmeOSoNiEJsHIeddH4HU0gzRtcx32ygL+KYE6exl2F6
 S1aoWJMdiQ9lYgQRef6aTQEHl2A08rIr3a3wFhVZBNZZk4NpwGblXfI2oIgKIVAx
 cFt0ABGX6RlokUUFGP+F/pk2noPi4m2tMaYfpUsd3sca+uKhFgtf26tQtmLpXAbq
 LKOA9FE+kjGgcOKMSIBjZYFod1HYHgZ+0F87JAURiUJnK2zAYSj+Sq4EatyvcHlb
 QcWQX5+Zxf+/rA1ACEBY+y4iZoSA0f1VQv8aANRmQwoIcjYjcu+W8dQKGCpJB68I
 ftTTxoM/OeQONkdzCcVNtS6ZbgjgKaaaaurvzLMwkQowtWEHd08=
 =pXYr
 -----END PGP SIGNATURE-----

Sync with Git 2.37.1
2022-07-11 16:08:49 -07:00
Junio C Hamano
b5a2d6cc49 Merge branch 'rs/archive-with-internal-gzip'
Teach "git archive" to (optionally and then by default) avoid
spawning an external "gzip" process when creating ".tar.gz" (and
".tgz") archives.

* rs/archive-with-internal-gzip:
  archive-tar: use internal gzip by default
  archive-tar: use OS_CODE 3 (Unix) for internal gzip
  archive-tar: add internal gzip implementation
  archive-tar: factor out write_block()
  archive: rename archiver data field to filter_command
  archive: update format documentation
2022-07-11 15:38:51 -07:00
Junio C Hamano
c2d01098fb Merge branch 'ds/branch-checked-out'
Introduce a helper to see if a branch is already being worked on
(hence should not be newly checked out in a working tree), which
performs much better than the existing find_shared_symref() to
replace many uses of the latter.

* ds/branch-checked-out:
  branch: drop unused worktrees variable
  fetch: stop passing around unused worktrees variable
  branch: fix branch_checked_out() leaks
  branch: use branch_checked_out() when deleting refs
  fetch: use new branch_checked_out() and add tests
  branch: check for bisects and rebases
  branch: add branch_checked_out() helper
2022-07-11 15:38:51 -07:00
Li Linchao
b0c4adcdd7 remote-curl: send Accept-Language header to server
Git server end's ability to accept Accept-Language header was introduced
in f18604bbf2 (http: add Accept-Language header if possible, 2015-01-28),
but this is only used by very early phase of the transfer, which is HTTP
GET request to discover references. For other phases, like POST request
in the smart HTTP, the server does not know what language the client
speaks.

Teach git client to learn end-user's preferred language and throw
accept-language header to the server side. Once the server gets this header,
it has the ability to talk to end-user with language they understand.
This would be very helpful for many non-English speakers.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-11 12:24:28 -07:00
Jeff King
cc8fcd1e1a clone: use remote branch if it matches default HEAD
Usually clone tries to use the same local HEAD as the remote (unless the
user has given --branch explicitly). Even if the remote HEAD is detached
or unborn, we can detect those situations with modern versions of Git.
If the remote is too old to support the "unborn" extension (or it has
been disabled via config), then we can't know the name of the remote's
unborn HEAD, and we fall back whatever the local default branch name is
configured to be.

But that leads to one weird corner case. It's rare because it needs a
number of factors:

  - the remote has an unborn HEAD

  - the remote is too old to support "unborn", or has disabled it

  - the remote has another branch "foo"

  - the local default branch name is "foo"

In that case you end up with a local clone on an unborn "foo" branch,
disconnected completely from the remote's "foo". This is rare in
practice, but the result is quite confusing.

When choosing "foo", we can double check whether the remote has such a
name, and if so, start our local "foo" at the same spot, rather than
making it unborn.

Note that this causes a test failure in t5605, which is cloning from a
bundle that doesn't contain HEAD (so it behaves like a remote that
doesn't support "unborn"), but has a single "main" branch. That test
expects that we end up in the weird "unborn main" case, where we don't
actually check out the remote branch of the same name. Even though we
have to update the test, this seems like an argument in favor of this
patch: checking out main is what I'd expect from such a bundle.

So this patch updates the test for the new behavior and adds an adjacent
one that checks what the original was going for: if there's no HEAD and
the bundle _doesn't_ have a branch that matches our local default name,
then we end up with nothing checked out.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-07 20:57:54 -07:00
Jeff King
3d8314f8d1 clone: propagate empty remote HEAD even with other branches
Unless "--branch" was given, clone generally tries to match the local
HEAD to the remote one. For most repositories, this is easy: the remote
tells us which branch HEAD was pointing to, and we call our local
checkout() function on that branch.

When cloning an empty repository, it's a little more tricky: we have
special code that checks the transport's "unborn" extension, or falls back
to our local idea of what the default branch should be. In either case,
we point the new HEAD to that, and set up the branch.* config.

But that leaves one case unhandled: when the remote repository _isn't_
empty, but its HEAD is unborn. The checkout() function is smart enough
to realize we didn't fetch the remote HEAD and it bails with a warning.
But we'll have ignored any information the remote gave us via the unborn
extension. This leads to nonsense outcomes:

  - If the remote has its HEAD pointing to an unborn "foo" and contains
    another branch "bar", cloning will get branch "bar" but leave the
    local HEAD pointing at "master" (or whatever our local default is),
    which is useless. The project does not use "master" as a branch.

  - Worse, if the other branch "bar" is instead called "master" (but
    again, the remote HEAD is not pointing to it), then we end up with a
    local unborn branch "master", which is not connected to the remote
    "master" (it shares no history, and there's no branch.* config).

Instead, we should try to use the remote's HEAD, even if its unborn, to
be consistent with the other cases.

The reason this case was missed is that cmd_clone() handles empty and
non-empty repositories on two different sides of a conditional:

  if (we have any refs) {
      fetch refs;
      check for --branch;
      otherwise, try to point our head at remote head;
      otherwise, our head is NULL;
  } else {
      check for --branch;
      otherwise, try to use "unborn" extension;
      otherwise, fall back to our default name name;
  }

So the smallest change would be to repeat the "unborn" logic at the end
of the first block. But we can note some other overlaps and
inconsistencies:

  - both sides have to handle --branch (though note that it's always an
    error for the empty repo case, since an empty repo by definition
    does not have a matching branch)

  - the fall back to the default name is much more explicit in the
    empty-repo case. The non-empty case eventually ends up bailing
    from checkout() with a warning, which produces a similar result, but
    fails to set up the branch config we do in the empty case.

So let's pull the HEAD setup out of this conditional entirely. This
de-duplicates some of the code and the result is easy to follow, because
helper functions like find_ref_by_name() do the right thing even in the
empty-repo case (i.e., by returning NULL).

There are two subtleties:

  - for a remote with a detached HEAD, it will advertise an oid for HEAD
    (which we store in our "remote_head" variable), but we won't find a
    matching refname (so our "remote_head_points_at" is NULL). In this
    case we make a local detached HEAD to match. Right now this happens
    implicitly by reaching update_head() with a non-NULL remote_head
    (since we skip all of the unborn-fallback). We'll now need to
    account for it explicitly before doing the fallback.

  - for an empty repo, we issue a warning to the user that they've
    cloned an empty repo. The text of that warning doesn't make sense
    for a non-empty repo with an unborn HEAD, so we'll have to
    differentiate the two cases there. We could just use different text,
    but instead let's allow the code to continue down to checkout(),
    which will issue an appropriate warning, like:

      remote HEAD refers to nonexistent ref, unable to checkout

    Continuing down to checkout() will make it easier to do more fixes
    on top (see below).

Note that this patch fixes the case where the other side reports an
unborn head to us using the protocol extension. It _doesn't_ fix the
case where the other side doesn't tell us, we locally guess "master",
and the other side happens to have a "master" which its HEAD doesn't
point. But it doesn't make anything worse there, and it should actually
make it easier to fix that problem on top.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-07 20:57:54 -07:00
Li Linchao
18337d406f ls-files: update test style
Update test style in t/t30[*].sh for uniformity, that's to
keep test title the same line with helper function itself,
and fix some indentions.

Add a new section "recommended style" in t/README to
encourage people to use more modern style in test.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 10:01:04 -07:00
Elijah Newren
751e165424 merge-ort: fix issue with dual rename and add/add conflict
There is code in both merge-recursive and merge-ort for avoiding doubly
transitive renames (i.e. one side renames directory A/ -> B/, and the
other side renames directory B/ -> C/), because this combination would
otherwise make a mess for new files added to A/ on the first side and
wondering which directory they end up in -- especially if there were
even more renames such as the first side renaming C/ -> D/.  In such
cases, it just turns "off" directory rename detection for the higher
order transitive cases.

The testcases added in t6423 a couple commits ago are slightly different
but similar in principle.  They involve a similar case of paired
renaming but instead of A/ -> B/ and B/ -> C/, the second side renames
a leading directory of B/ to C/.  And both sides add a new file
somewhere under the directory that the other side will rename.  While
the new files added start within different directories and thus could
logically end up within different directories, it is weird for a file
on one side to end up where the other one started and not move along
with it.  So, let's just turn off directory rename detection in this
case as well.

Another way to look at this is that if the source name involved in a
directory rename on one side is the target name of a directory rename
operation for a file from the other side, then we avoid the doubly
transitive rename.  (More concretely, if a directory rename on side D
wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D
already had a file named NEW_NAME, and a directory rename on side E
wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the
directory rename detection for NEW_NAME to prevent the
NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add
conflict on NEW_NAME.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 09:39:46 -07:00
Elijah Newren
0565cee5e4 t6423: add tests of dual directory rename plus add/add conflict
This is an attempt at minimalizing a testcase reported by Glen Choo
with tensorflow where merge-ort would report an assertion failure:

    Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410

reversing the direction of the merge provides a different error:

    error: cache entry has null sha1: ...
    fatal: unable to write .git/index

so we add testcases for both.  With these new testcases, the
recursive strategy differs in that it returns the latter error for
both merge directions.

These testcases are somehow a little different than Glen's original
tensorflow testcase in that these ones trigger a bug with the recursive
algorithm whereas his testcase didn't.  I figure that means these
testcases somehow manage to be more comprehensive.

Reported-by: Glen Choo <chooglen@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-06 09:39:46 -07:00
Junio C Hamano
a631e99807 Merge 'js/add-i-delete' into maint-2.37
Rewrite of "git add -i" in C that appeared in Git 2.25 didn't
correctly record a removed file to the index, which is an old
regression but has become widely known because the C version
has become the default in the latest release.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-04 13:40:59 -07:00
Junio C Hamano
0f0bc2124b Merge branch 'js/add-i-delete'
Rewrite of "git add -i" in C that appeared in Git 2.25 didn't
correctly record a removed file to the index, which was fixed.

* js/add-i-delete:
  add --interactive: allow `update` to stage deleted files
2022-07-02 21:56:08 -07:00
Shaoxuan Yuan
b91a2b6594 mv: add check_dir_in_index() and solve general dir check issue
Originally, moving a <source> directory which is not on-disk due
to its existence outside of sparse-checkout cone, "giv mv" command
errors out with "bad source".

Add a helper check_dir_in_index() function to see if a directory
name exists in the index. Also add a SKIP_WORKTREE_DIR bit to mark
such directories.

Change the checking logic, so that such <source> directory makes
"giv mv" command warns with "advise_on_updating_sparse_paths()"
instead of "bad source"; also user now can supply a "--sparse" flag so
this operation can be carried out successfully.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
Shaoxuan Yuan
8a26a3915f mv: check if <destination> exists in index to handle overwriting
Originally, moving a sparse file into cone can result in unwarned
overwrite of existing entry. The expected behavior is that if the
<destination> exists in the entry, user should be prompted to supply
a [-f|--force] to carry out the operation, or the operation should
fail.

Add a check mechanism to do that.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00
Shaoxuan Yuan
6645b03ca5 mv: check if out-of-cone file exists in index with SKIP_WORKTREE bit
Originally, moving a <source> file which is not on-disk but exists in
index as a SKIP_WORKTREE enabled cache entry, "giv mv" command errors
out with "bad source".

Change the checking logic, so that such <source>
file makes "giv mv" command warns with "advise_on_updating_sparse_paths()"
instead of "bad source"; also user now can supply a "--sparse" flag so
this operation can be carried out successfully.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-01 14:50:16 -07:00