Commit Graph

8971 Commits

Author SHA1 Message Date
Taylor Blau
4f6460df55 builtin/bugreport.c: use thread-safe localtime_r()
To generate its filename, the 'git bugreport' builtin asks the system
for the current time with 'localtime()'. Since this uses a shared
buffer, it is not thread-safe.

Even though 'git bugreport' is not multi-threaded, using localtime() can
trigger some static analysis tools to complain, and a quick

    $ git grep -oh 'localtime\(_.\)\?' -- **/*.c | sort | uniq -c

shows that the only usage of the thread-unsafe 'localtime' is in a piece
of documentation.

So, convert this instance to use the thread-safe version for
consistency, and to appease some analysis tools.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-01 13:05:37 -08:00
Junio C Hamano
3d8f81f21b Merge branch 'sa/credential-store-timeout'
Multiple "credential-store" backends can race to lock the same
file, causing everybody else but one to fail---reattempt locking
with some timeout to reduce the rate of the failure.

* sa/credential-store-timeout:
  crendential-store: use timeout when locking file
2020-11-30 14:49:45 -08:00
Junio C Hamano
fa27e2d103 Merge branch 'km/stash-error-message-fix'
Error message fix.

* km/stash-error-message-fix:
  stash: add missing space to an error message
2020-11-30 14:49:45 -08:00
Junio C Hamano
f73ee0c6be Merge branch 'mt/worktree-error-message-fix'
Fix formulation of an error message with two placeholders in "git
worktree add" subcommand.

* mt/worktree-error-message-fix:
  worktree: fix order of arguments in error message
2020-11-30 14:49:43 -08:00
Junio C Hamano
1c04cdd424 Merge branch 'ab/gc-keep-base-option'
Fix an option name in "gc" documentation.

* ab/gc-keep-base-option:
  gc: rename keep_base_pack variable for --keep-largest-pack
  gc docs: change --keep-base-pack to --keep-largest-pack
2020-11-30 14:49:43 -08:00
Junio C Hamano
290c94085b Merge branch 'js/pull-rebase-use-advise'
UI improvement.

* js/pull-rebase-use-advise:
  pull: colorize the hint about setting `pull.rebase`
2020-11-30 14:49:42 -08:00
Rafael Silva
e72f7defc4 maintenance: fix SEGFAULT when no repository
The "git maintenance run" and "git maintenance start/stop" commands
holds a file-based lock at the .git/maintenance.lock and
.git/schedule.lock respectively. These locks are used to ensure only
one maintenance process is executed at the time as both operations
involves writing data into the git repository.

The path to the lock file is built using
"the_repository->objects->odb->path" that results in SEGFAULT when we
have no repository available as "the_repository->objects->odb" is
set to NULL.

Let's teach maintenance command to use RUN_SETUP option that will
provide the validation and fail when running outside of a repository.
Hence fixing the SEGFAULT for all three operations and making the
behaviour consistent across all subcommands.

Setting the RUN_SETUP also provides the same protection for all
subcommands given that the "register" and "unregister" also requires to
be executed inside a repository.

Furthermore let's remove the local validation implemented by the
"register" and "unregister" as this will not be required anymore with
the new option.

Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-30 13:44:15 -08:00
Junio C Hamano
2ba70a330b Merge branch 'rs/gc-sort-func-cast-fix'
Fix broken sorting of maintenance tasks.

* rs/gc-sort-func-cast-fix:
  gc: fix cast in compare_tasks_by_selection()
2020-11-25 15:24:53 -08:00
Junio C Hamano
fcf26ef53a Merge branch 'jk/4gb-idx'
The code was not prepared to deal with pack .idx file that is
larger than 4GB.

* jk/4gb-idx:
  packfile: detect overflow in .idx file size checks
  block-sha1: take a size_t length parameter
  fsck: correctly compute checksums on idx files larger than 4GB
  use size_t to store pack .idx byte offsets
  compute pack .idx byte offsets using size_t
2020-11-25 15:24:52 -08:00
Junio C Hamano
8f8f10ac09 Merge branch 'jx/t5411-flake-fix'
The exchange between receive-pack and proc-receive hook did not
carefully check for errors.

* jx/t5411-flake-fix:
  receive-pack: use default version 0 for proc-receive
  receive-pack: gently write messages to proc-receive
  t5411: new helper filter_out_user_friendly_and_stable_output
2020-11-25 15:24:52 -08:00
Derrick Stolee
483a6d9b5d maintenance: use 'git config --fixed-value'
When a repository's leading directories contain regex metacharacters,
the config calls for 'git maintenance register' and 'git maintenance
unregister' are not careful enough. Use the new --fixed-value option
to direct the config machinery to use exact string matches. This is a
more robust option than escaping these arguments in a piecemeal fashion.

For the test, require that we are not running on Windows since the '+'
and '*' characters are not allowed on that filesystem.

Reported-by: Emily Shaffer <emilyshaffer@google.com>
Reported-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 15:04:55 -08:00
Derrick Stolee
3f1bae1dc3 config: implement --fixed-value with --get*
The config builtin does its own regex matching of values for the --get,
--get-all, and --get-regexp modes. Plumb the existing 'flags' parameter
to the get_value() method so we can initialize the value-pattern argument
as a fixed string instead of a regex pattern.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
c90702a1f6 config: plumb --fixed-value into config API
The git_config_set_multivar_in_file_gently() and related methods now
take a 'flags' bitfield, so add a new bit representing the --fixed-value
option from 'git config'. This alters the purpose of the value_pattern
parameter to be an exact string match. This requires some initialization
changes in git_config_set_multivar_in_file_gently() and a new strcmp()
call in the matches() method.

The new CONFIG_FLAGS_FIXED_VALUE flag is initialized in builtin/config.c
based on the --fixed-value option, and that needs to be updated in
several callers.

This patch only affects some of the modes of 'git config', and the rest
will be completed in the next change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
fda43942d7 config: add --fixed-value option, un-implemented
The 'git config' builtin takes a 'value-pattern' parameter for several
actions. This can cause confusion when expecting exact value matches
instead of regex matches, especially when the input string contains
metacharacters. While callers can escape the patterns themselves, it
would be more friendly to allow an argument to disable the pattern
matching in favor of an exact string match.

Add a new '--fixed-value' option that does not currently change the
behavior. The implementation will be filled in by later changes for
each appropriate action. For now, check and test that --fixed-value
will abort the command when included with an incompatible action or
without a 'value-pattern' argument.

The name '--fixed-value' was chosen over something simpler like
'--fixed' because some commands allow regular expressions on the
key in addition to the value.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
247e2f822e config: replace 'value_regex' with 'value_pattern'
The 'value_regex' argument in the 'git config' builtin is poorly named,
especially related to an upcoming change that allows exact string
matches instead of ERE pattern matches.

Perform a mostly mechanical change of every instance of 'value_regex' to
'value_pattern' in the codebase. This is only critical for documentation
and error messages, but it is best to be consistent inside the codebase,
too.

For documentation, use 'value-pattern' which is better punctuation. This
affects Documentation/git-config.txt and the usage in builtin/config.c,
which was already mixed between 'value_regex' and 'value-regex'.

I gave some thought to leaving the value_regex variables inside config.c
that are regex_t pointers. However, it is probably best to keep the name
consistent with the rest of the variables.

This does not update the translations inside the po/ directory, as that
creates conflicts with ongoing work. The input strings should
automatically update through automation, and a few of the output strings
currently use "[value_regex]" directly.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
504ee1290e config: convert multi_replace to flags
We will extend the flexibility of the config API. Before doing so, let's
take an existing 'int multi_replace' parameter and replace it with a new
'unsigned flags' parameter that can take multiple options as a bit field.

Update all callers that specified multi_replace to now specify the
CONFIG_FLAGS_MULTI_REPLACE flag. To add more clarity, extend the
documentation of git_config_set_multivar_in_file() including a clear
labeling of its arguments. Other config API methods in config.h require
only a change of the final parameter from 'int' to 'unsigned'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:47 -08:00
Simão Afonso
df7f915fb6 crendential-store: use timeout when locking file
When holding the lock for rewriting the credential file, use a timeout
to avoid race conditions when the credentials file needs to be updated
in parallel.

An example would be doing `fetch --all` on a repository with several
remotes that need credentials, using parallel fetching.

The timeout can be configured using "credentialStore.lockTimeoutMS",
defaulting to 1 second.

Signed-off-by: Simão Afonso <simao.afonso@powertools-tech.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 12:30:18 -08:00
Derrick Stolee
31345d5545 maintenance: extract platform-specific scheduling
The existing schedule mechanism using 'cron' is supported by POSIX
platforms, but not Windows. It also works slightly differently on
macOS to significant detriment of the user experience. To allow for
new implementations on these platforms, extract a method that
performs the platform-specific scheduling mechanism. This will be
swapped at compile time with new implementations on specialized
platforms.

As we add this generality, rename GIT_TEST_CRONTAB to
GIT_TEST_MAINT_SCHEDULER. Further, this variable is now parsed as
"<scheduler>:<command>" so we can test platform-specific scheduling
logic even when not on the correct platform. By specifying the
<scheduler> in this string, we will be able to test all three sets of
Git logic from a Linux machine.

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-24 13:02:29 -08:00
Kyle Meyer
eaf5341538 stash: add missing space to an error message
Restore a space that was lost in 8a0fc8d19d (stash: convert apply to
builtin, 2019-02-25).

Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-24 12:56:31 -08:00
Junio C Hamano
bf0a430f70 Merge branch 'en/strmap'
A specialization of hashmap that uses a string as key has been
introduced.  Hopefully it will see wider use over time.

* en/strmap:
  shortlog: use strset from strmap.h
  Use new HASHMAP_INIT macro to simplify hashmap initialization
  strmap: take advantage of FLEXPTR_ALLOC_STR when relevant
  strmap: enable allocations to come from a mem_pool
  strmap: add a strset sub-type
  strmap: split create_entry() out of strmap_put()
  strmap: add functions facilitating use as a string->int map
  strmap: enable faster clearing and reusing of strmaps
  strmap: add more utility functions
  strmap: new utility functions
  hashmap: provide deallocation function names
  hashmap: introduce a new hashmap_partial_clear()
  hashmap: allow re-use after hashmap_free()
  hashmap: adjust spacing to fix argument alignment
  hashmap: add usage documentation explaining hashmap_free[_entries]()
2020-11-21 15:14:38 -08:00
Junio C Hamano
0dd171f0bc Merge branch 'jk/rev-parse-end-of-options'
"git rev-parse" learned the "--end-of-options" to help scripts to
safely take a parameter that is supposed to be a revision, e.g.
"git rev-parse --verify -q --end-of-options $rev".

* jk/rev-parse-end-of-options:
  rev-parse: handle --end-of-options
  rev-parse: put all options under the "-" check
  rev-parse: don't accept options after dashdash
2020-11-21 15:14:38 -08:00
Junio C Hamano
473c6224c6 Merge branch 'jc/format-patch-name-max'
The maximum length of output filenames "git format-patch" creates
has become configurable (used to be capped at 64).

* jc/format-patch-name-max:
  format-patch: make output filename configurable
2020-11-21 15:14:38 -08:00
Martin Ågren
96313423a7 grep: use designated initializers for grep_defaults
In 15fabd1bbd ("builtin/grep.c: make configuration callback more
reusable", 2012-10-09), we learned to fill a `static struct grep_opt
grep_defaults` which we can use as a blueprint for other such structs.

At the time, we didn't consider designated initializers to be widely
useable, but these days, we do. (See, e.g., cbc0f81d96 ("strbuf: use
designated initializers in STRBUF_INIT", 2017-07-10).)

Use designated initializers to let the compiler set up the struct and so
that we don't need to remember to call `init_grep_defaults()`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21 14:50:33 -08:00
Martin Ågren
1d3878799f grep: don't set up a "default" repo for grep
`init_grep_defaults()` fills a `static struct grep_opt grep_defaults`.
This struct is then used by `grep_init()` as a blueprint for other such
structs. Notably, `grep_init()` takes a `struct repo *` and assigns it
into the target struct.

As a result, it is unnecessary for us to take a `struct repo *` in
`init_grep_defaults()` as well. We assign it into the default struct and
never look at it again. And in light of how we return early if we have
already set up the default struct, it's not just unnecessary, but is
also a bit confusing: If we are called twice and with different repos,
is it a bug or a feature that we ignore the second repo?

Drop the repo parameter for `init_grep_defaults()`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21 14:50:29 -08:00
Matheus Tavares
b86339b12b worktree: fix order of arguments in error message
`git worktree add` (without --force) errors out when given a path
that is already registered as a worktree and the path is missing on
disk. But the `cmd` and `path` strings are switched on the error
message. Let's fix that.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21 13:31:34 -08:00
Ævar Arnfjörð Bjarmason
793c1464d3 gc: rename keep_base_pack variable for --keep-largest-pack
As noted in an earlier change the keep_base_pack variable name is a
relic from an earlier on-list version of ae4e89e549 ("gc: add
--keep-largest-pack option", 2018-04-15) before it was renamed to
--keep-largest-pack.

Let's change the variable name to avoid that confusion, it's easier to
read the code if there's a 1=1 mapping between the variable name and
option name.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21 11:39:59 -08:00
Johannes Schindelin
e01ae2a4a7 pull: colorize the hint about setting pull.rebase
In d18c950a69 (pull: warn if the user didn't say whether to rebase or
to merge, 2020-03-09), a new hint was introduced to encourage users to
make a conscious decision about whether they want their pull to merge or
to rebase by configuring the `pull.rebase` setting.

This warning was clearly intended to advise users, but as pointed out in
https://lore.kernel.org/git/87ima2rdsm.fsf%40evledraar.gmail.com, it
uses `warning()` instead of `advise()`.

One consequence is that the advice is not colorized in the same manner
as other, similar messages. So let's use `advise()` instead.

Pointed-out-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 14:13:30 -08:00
René Scharfe
a1c74791d5 gc: fix cast in compare_tasks_by_selection()
compare_tasks_by_selection() is used with QSORT and gets passed pointers
to the elements of "static struct maintenance_task tasks[]".  It casts
the *addresses* of these passed pointers to element pointers, though,
and thus effectively compares some unrelated values from the stack.  Fix
the casts to actually compare array elements.

Detected by USan (make SANITIZE=undefined test).

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-18 14:15:58 -08:00
Junio C Hamano
3f6dc9c366 Merge branch 'pb/blame-funcname-range-userdiff'
"git blame -L :funcname -- path" did not work well for a path for
which a userdiff driver is defined.

* pb/blame-funcname-range-userdiff:
  blame: simplify 'setup_blame_bloom_data' interface
  blame: simplify 'setup_scoreboard' interface
  blame: enable funcname blaming with userdiff driver
  line-log: mention both modes in 'blame' and 'log' short help
  doc: add more pointers to gitattributes(5) for userdiff
  blame-options.txt: also mention 'funcname' in '-L' description
  doc: line-range: improve formatting
  doc: log, gitk: move '-L' description to 'line-range-options.txt'
2020-11-18 13:32:53 -08:00
Junio C Hamano
a1f95951ef Merge branch 'en/merge-ort-api-null-impl'
Preparation for a new merge strategy.

* en/merge-ort-api-null-impl:
  merge,rebase,revert: select ort or recursive by config or environment
  fast-rebase: demonstrate merge-ort's API via new test-tool command
  merge-ort-wrappers: new convience wrappers to mimic the old merge API
  merge-ort: barebones API of new merge strategy with empty implementation
2020-11-18 13:32:53 -08:00
Junio C Hamano
7660da1618 Merge branch 'ds/maintenance-part-3'
Parts of "git maintenance" to ease writing crontab entries (and
other scheduling system configuration) for it.

* ds/maintenance-part-3:
  maintenance: add troubleshooting guide to docs
  maintenance: use 'incremental' strategy by default
  maintenance: create maintenance.strategy config
  maintenance: add start/stop subcommands
  maintenance: add [un]register subcommands
  for-each-repo: run subcommands on configured repos
  maintenance: add --schedule option and config
  maintenance: optionally skip --auto process
2020-11-18 13:32:53 -08:00
Junio C Hamano
c042c455d4 Merge branch 'pw/rebase-i-orig-head'
"git rebase -i" did not store ORIG_HEAD correctly.

* pw/rebase-i-orig-head:
  rebase -i: simplify get_revision_ranges()
  rebase -i: use struct object_id when writing state
  rebase -i: use struct object_id rather than looking up commit
  rebase -i: stop overwriting ORIG_HEAD buffer
2020-11-18 13:32:53 -08:00
Junio C Hamano
5edc8bdc06 Merge branch 'jk/format-patch-output'
"git format-patch --output=there" did not work as expected and
instead crashed.  The option is now supported.

* jk/format-patch-output:
  format-patch: support --output option
  format-patch: tie file-opening logic to output_directory
  format-patch: refactor output selection
2020-11-18 13:32:52 -08:00
Junio C Hamano
f8a1cee7b3 Merge branch 'jc/line-log-takes-no-pathspec'
"git log -L<range>:<path>" is documented to take no pathspec, but
this was not enforced by the command line option parser, which has
been corrected.

* jc/line-log-takes-no-pathspec:
  log: diagnose -L used with pathspec as an error
2020-11-18 13:32:52 -08:00
Junio C Hamano
30f5257611 Merge branch 'rs/empty-reflog-check-fix'
The code to see if "git stash drop" can safely remove refs/stash
has been made more carerful.

* rs/empty-reflog-check-fix:
  stash: simplify reflog emptiness check
2020-11-18 13:32:52 -08:00
Taylor Blau
2fcb03b52d builtin/repack.c: don't move existing packs out of the way
When 'git repack' creates a pack with the same name as any existing
pack, it moves the existing one to 'old-pack-xxx.{pack,idx,...}' and
then renames the new one into place.

Eventually, it would be nice to have 'git repack' allow for writing a
multi-pack index at the critical time (after the new packs have been
written / moved into place, but before the old ones have been deleted).
Guessing that this option might be called '--write-midx', this makes the
following situation (where repacks are issued back-to-back without any
new objects) impossible:

    $ git repack -adb
    $ git repack -adb --write-midx

In the second repack, the existing packs are overwritten verbatim with
the same rename-to-old sequence. At that point, the current MIDX is
invalidated, since it refers to now-missing packs. So that code wants to
be run after the MIDX is re-written. But (prior to this patch) the new
MIDX can't be written until the new packs are moved into place. So, we
have a circular dependency.

This is all hypothetical, since no code currently exists to write a MIDX
safely during a 'git repack' (the 'GIT_TEST_MULTI_PACK_INDEX' does so
unsafely). Putting hypothetical aside, though: why do we need to rename
existing packs to be prefixed with 'old-' anyway?

This behavior dates all the way back to 2ad47d6 (git-repack: Be
careful when updating the same pack as an existing one., 2006-06-25).
2ad47d6 is mainly concerned about a case where a newly written pack
would have a different structure than its index. This used to be
possible when the pack name was a hash of the set of objects. Under this
naming scheme, two packs that store the same set of objects could differ
in delta selection, object positioning, or both. If this happened, then
any such packs would be unreadable in the instant between copying the
new pack and new index (i.e., either the index or pack will be stale
depending on the order that they were copied).

But since 1190a1a (pack-objects: name pack files after trailer hash,
2013-12-05), this is no longer possible, since pack files are named not
after their logical contents (i.e., the set of objects), but by the
actual checksum of their contents. So, this old- behavior can safely go,
which allows us to avoid our circular dependency above.

In addition to avoiding the circular dependency, this patch also makes
'git repack' a lot simpler, since we don't have to deal with failures
encountered when renaming existing packs to be prefixed with 'old-'.

This patch is mostly limited to removing code paths that deal with the
'old' prefixing, with the exception of files that include the pack's
name in their own filename, like .idx, .bitmap, and related files. The
exception is that we want to continue to trust what pack-objects wrote.
That is, it is not the case that we pretend as if pack-objects didn't
write files identical to ones that already exist, but rather that we
respect what pack-objects wrote as the source of truth. That cuts two
ways:

  - If pack-objects produced an identical pack to one that already
    exists with a bitmap, but did not produce a bitmap, we remove the
    bitmap that already exists. (This behavior is codified in t7700.14).

  - If pack-objects produced an identical pack to one that already
    exists, we trust the just-written version of the coresponding .idx,
    .promisor, and other files over the ones that already exist. This
    ensures that we use the most up-to-date versions of this files,
    which is safe even in the face of format changes in, say, the .idx
    file (which would not be reflected in the .idx file's name).

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-17 13:31:55 -08:00
Philippe Blain
5176f20ffe pull: check for local submodule modifications with the right range
Ever since 'git pull' learned '--recurse-submodules' in a6d7eb2c7a
(pull: optionally rebase submodules (remote submodule changes only),
2017-06-23), we check if there are local submodule modifications by
checking the revision range 'curr_head --not rebase_fork_point'.

The goal of this check is to abort the pull if there are submodule
modifications in the local commits being rebased, since this scenario is
not supported.

However, the actual range of commits being rebased is not
'rebase_fork_point..curr_head', as the logic in
'get_rebase_newbase_and_upstream' reveals, it is 'upstream..curr_head'.

If the 'git merge-base --fork-point' invocation in
'get_rebase_fork_point' fails to find a fork point between the current
branch and the remote-tracking branch we are pulling from,
'rebase_fork_point' is null and since 4d36f88be7 (submodule: do not pass
null OID to setup_revisions, 2018-05-24), 'submodule_touches_in_range'
checks 'curr_head' and all its ancestors for submodule modifications.

Since it is highly likely that there are submodule modifications in this
range (which is in effect the whole history of the current branch), this
prevents 'git pull --rebase --recurse-submodules' from succeeding if no
fork point exists between the current branch and the remote-tracking
branch being pulled. This can happen, for example, when the current
branch was forked from a commit which was never recorded in the reflog
of the remote-tracking branch we are pulling, as the last two paragraphs
of the "Discussion on fork-point mode" section in git-merge-base(1)
explain.

Fix this bug by passing 'upstream' instead of 'rebase_fork_point' as the
'excl_oid' argument to 'submodule_touches_in_range'.

Reported-by: Brice Goglin <bgoglin@free.fr>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 16:01:13 -08:00
Philippe Blain
4f66d79ae3 pull --rebase: compute rebase arguments in separate function
The function 'run_rebase' is responsible for constructing the
command line to be passed to 'git rebase'. This includes both forwarding
pass-through options given to 'git pull' as well computing the <newbase>
and <upstream> arguments to 'git rebase'.

A following commit will need to access the <upstream> argument in
'cmd_pull' to fix a bug with 'git pull --rebase --recurse-submodules'.
In order to do so, refactor the code so that the <newbase> and
<upstream> commits are computed in a new, separate function,
'get_rebase_newbase_and_upstream'.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 16:01:13 -08:00
Taylor Blau
704c4a5c07 builtin/repack.c: keep track of what pack-objects wrote
In the subsequent commit, it will become useful to keep track of which
metadata files were written by pack-objects. We already do this to an
extent with the 'exts' array, which only is used in the context of
existing packs.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 15:57:44 -08:00
Jeff King
63f4d5cf57 repack: make "exts" array available outside cmd_repack()
We'll use it in a helper function soon.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 15:57:43 -08:00
Patrick Steinhardt
8c4417f1cf update-ref: disallow "start" for ongoing transactions
It is currently possible to write multiple "start" commands into
git-update-ref(1) for a single session, but none of them except for the
first one actually have any effect.

Using such nested "start"s may eventually have a sensible effect. One
may imagine that it restarts the current transaction, effectively
emptying it and creating a new one. It may also allow for creation of
nested transactions. But currently, none of these are implemented.

Silently ignoring this misuse is making it hard to iterate in the future
if "start" is ever going to have meaningful semantics in such a context.
This commit thus makes sure to error out in case we see such use.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 13:44:01 -08:00
Patrick Steinhardt
262a4d28fe update-ref: allow creation of multiple transactions
While git-update-ref has recently grown commands which allow interactive
control of transactions in e48cf33b61 (update-ref: implement interactive
transaction handling, 2020-04-02), it is not yet possible to create
multiple transactions in a single session. To do so, one currently still
needs to invoke the executable multiple times.

This commit addresses this shortcoming by allowing the "start" command
to create a new transaction if the current transaction has already been
either committed or aborted.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 13:44:01 -08:00
Jeff King
a9bc372ef8 use size_t to store pack .idx byte offsets
We sometimes store the offset into a pack .idx file as an "unsigned
long", but the mmap'd size of a pack .idx file can exceed 4GB. This is
sufficient on LP64 systems like Linux, but will be too small on LLP64
systems like Windows, where "unsigned long" is still only 32 bits. Let's
use size_t, which is a better type for an offset into a memory buffer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 13:41:35 -08:00
Jeff King
f86f769550 compute pack .idx byte offsets using size_t
A pack and its matching .idx file are limited to 2^32 objects, because
the pack format contains a 32-bit field to store the number of objects.
Hence we use uint32_t in the code.

But the byte count of even a .idx file can be much larger than that,
because it stores at least a hash and an offset for each object. So
using SHA-1, a v2 .idx file will cross the 4GB boundary at 153,391,650
objects. This confuses load_idx(), which computes the minimum size like
this:

  unsigned long min_size = 8 + 4*256 + nr*(hashsz + 4 + 4) + hashsz + hashsz;

Even though min_size will be big enough on most 64-bit platforms, the
actual arithmetic is done as a uint32_t, resulting in a truncation. We
actually exceed that min_size, but then we do:

  unsigned long max_size = min_size;
  if (nr)
          max_size += (nr - 1)*8;

to account for the variable-sized table. That computation doesn't
overflow quite so low, but with the truncation for min_size, we end up
with a max_size that is much smaller than our actual size. So we
complain that the idx is invalid, and can't find any of its objects.

We can fix this case by casting "nr" to a size_t, which will do the
multiplication in 64-bits (assuming you're on a 64-bit platform; this
will never work on a 32-bit system since we couldn't map the whole .idx
anyway). Likewise, we don't have to worry about further additions,
because adding a smaller number to a size_t will convert the other side
to a size_t.

A few notes:

  - obviously we could just declare "nr" as a size_t in the first place
    (and likewise, packed_git.num_objects).  But it's conceptually a
    uint32_t because of the on-disk format, and we correctly treat it
    that way in other contexts that don't need to compute byte offsets
    (e.g., iterating over the set of objects should and generally does
    use a uint32_t). Switching to size_t would make all of those other
    cases look wrong.

  - it could be argued that the proper type is off_t to represent the
    file offset. But in practice the .idx file must fit within memory,
    because we mmap the whole thing. And the rest of the code (including
    the idx_size variable we're comparing against) uses size_t.

  - we'll add the same cast to the max_size arithmetic line. Even though
    we're adding to a larger type, which will convert our result, the
    multiplication is still done as a 32-bit value and can itself
    overflow. I didn't check this with my test case, since it would need
    an even larger pack (~530M objects), but looking at compiler output
    shows that it works this way. The standard should agree, but I
    couldn't find anything explicit in 6.3.1.8 ("usual arithmetic
    conversions").

The case in load_idx() was the most immediate one that I was able to
trigger. After fixing it, looking up actual objects (including the very
last one in sha1 order) works in a test repo with 153,725,110 objects.
That's because bsearch_hash() works with uint32_t entry indices, and the
actual byte access:

  int cmp = hashcmp(table + mi * stride, sha1);

is done with "stride" as a size_t, causing the uint32_t "mi" to be
promoted to a size_t. This is the way most code will access the index
data.

However, I audited all of the other byte-wise accesses of
packed_git.index_data, and many of the others are suspect (they are
similar to the max_size one, where we are adding to a properly sized
offset or directly to a pointer, but the multiplication in the
sub-expression can overflow). I didn't trigger any of these in practice,
but I believe they're potential problems, and certainly adding in the
cast is not going to hurt anything here.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 13:41:35 -08:00
Josh Steadmon
a2a066d96a receive-pack: log received client session ID
When receive-pack receives a session-id capability from the client, log
the received session ID via a trace2 data event.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 18:26:53 -08:00
Josh Steadmon
8073d75bbf receive-pack: advertise session ID in v0 capabilities
When transfer.advertiseSID is true, advertise receive-pack's session ID
via the new session-id capability.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 18:26:53 -08:00
Junio C Hamano
902f358555 Merge branch 'rs/clear-commit-marks-in-repo'
Code clean-up.

* rs/clear-commit-marks-in-repo:
  bisect: clear flags in passed repository
  object: allow clear_commit_marks_all to handle any repo
2020-11-11 13:18:37 -08:00
Elijah Newren
449a900969 shortlog: use strset from strmap.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 12:55:27 -08:00
Elijah Newren
b19315d8ab Use new HASHMAP_INIT macro to simplify hashmap initialization
Now that hashamp has lazy initialization and a HASHMAP_INIT macro,
hashmaps allocated on the stack can be initialized without a call to
hashmap_init() and in some cases makes the code a bit shorter.  Convert
some callsites over to take advantage of this.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 12:55:27 -08:00
Jiang Xin
80ffeb94f4 receive-pack: use default version 0 for proc-receive
In the verison negotiation phase between "receive-pack" and
"proc-receive", "proc-receive" can send an empty flush-pkt to end the
negotiation and use default version 0. Capabilities (such as
"push-options") are not supported in version 0.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 12:46:56 -08:00
Jiang Xin
f65003b4c4 receive-pack: gently write messages to proc-receive
Johannes found a flaky hang in `t5411/test-0013-bad-protocol.sh` in the
osx-clang job of the CI/PR builds, and ran into an issue when using
the `--stress` option with the following error messages:

    fatal: unable to write flush packet: Broken pipe
    send-pack: unexpected disconnect while reading sideband packet
    fatal: the remote end hung up unexpectedly

In this test case, the "proc-receive" hook sends an error message and
dies earlier. While "receive-pack" on the other side of the pipe
should forward the error message of the "proc-receive" hook to the
client side, but it fails to do so. This is because "receive-pack"
uses `packet_write_fmt()` and `packet_flush()` to write pkt-line
message to "proc-receive" hook, and these functions die immediately
when pipe is broken. Using "gently" forms for these functions will get
more predicable output.

Add more "--die-*" options to test helper to test different stages of
the protocol between "receive-pack" and "proc-receive" hook.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 12:46:56 -08:00
Jeff King
3a1f91cfd9 rev-parse: handle --end-of-options
We taught rev-list a new way to separate options from revisions in
19e8789b23 (revision: allow --end-of-options to end option parsing,
2019-08-06), but rev-parse uses its own parser. It should know about
--end-of-options not only for consistency, but because it may be
presented with similarly ambiguous cases. E.g., if a caller does:

  git rev-parse "$rev" -- "$path"

to parse an untrusted input, then it will get confused if $rev contains
an option-like string like "--local-env-vars". Or even "--not-real",
which we'd keep as an option to pass along to rev-list.

Or even more importantly:

  git rev-parse --verify "$rev"

can be confused by options, even though its purpose is safely parsing
untrusted input. On the plus side, it will always fail the --verify
part, as it will not have parsed a revision, so the caller will
generally "fail closed" rather than continue to use the untrusted
string. But it will still trigger whatever option was in "$rev"; this
should be mostly harmless, since rev-parse options are all read-only,
but I didn't carefully audit all paths.

This patch lets callers write:

  git rev-parse --end-of-options "$rev" -- "$path"

and:

  git rev-parse --verify --end-of-options "$rev"

which will both treat "$rev" always as a revision parameter. The latter
is a bit clunky. It would be nicer if we had defined "--verify" to
require that its next argument be the revision. But we have not
historically done so, and:

  git rev-parse --verify -q "$rev"

does currently work. I added a test here to confirm that we didn't break
that.

A few implementation notes:

 - We don't document --end-of-options explicitly in commands, but rather
   in gitcli(7). So I didn't give it its own section in git-rev-parse(1).
   But I did call it out specifically in the --verify section, and
   include it in the examples, which should show best practices.

 - We don't have to re-indent the main option-parsing block, because we
   can combine our "did we see end of options" check with "does it start
   with a dash". The exception is the pre-setup options, which need
   their own block.

 - We do however have to pull the "--" parsing out of the "does it start
   with dash" block, because we want to parse it even if we've seen
   --end-of-options.

 - We'll leave "--end-of-options" in the output. This is probably not
   technically necessary, as a careful caller will do:

     git rev-parse --end-of-options $revs -- $paths

   and anything in $revs will be resolved to an object id. However, it
   does help a slightly less careful caller like:

     git rev-parse --end-of-options $revs_or_paths

   where a path "--foo" will remain in the output as long as it also
   exists on disk. In that case, it's helpful to retain --end-of-options
   to get passed along to rev-list, s it would otherwise see just
   "--foo".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-10 13:46:27 -08:00
Jeff King
9033addfa6 rev-parse: put all options under the "-" check
The option-parsing loop of rev-parse checks whether the first character
of an arg is "-". If so, then it enters a series of conditionals
checking for individual options. But some options are inexplicably
outside of that outer conditional.

This doesn't produce the wrong behavior; the conditional is actually
redundant with the individual option checks, and it's really only its
fallback "continue" that we care about. But we should at least be
consistent.

One obvious alternative is that we could get rid of the conditional
entirely. But we'll be using the extra block it provides in the next
patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-10 13:46:27 -08:00
Jeff King
e05e2ae8fe rev-parse: don't accept options after dashdash
Because of the order in which we check options in rev-parse, there are a
few options we accept even after a "--". This is wrong, because the
whole point of "--" is to say "everything after here is a path". Let's
move the "did we see a dashdash" check (it's called "as_is" in the code)
to the top of the parsing loop.

Note there is one subtlety here. The options are ordered so that some
are checked before we even see if we're in a repository (they continue
the loop, and if we get past a certain point, then we do the repository
setup). By moving the as_is check higher, it's also in that "before
setup" section, even though it might look at the repository via
verify_filename(). However, this works out: we'd never set as_is until
we parse "--", and we don't parse that until after doing the setup.

An alternative here to avoid the subtlety is to put the as_is check at
the top of the post-setup options. But then every pre-setup option would
have to remember to check "if (!as_is && !strcmp(...))". So while this
is a bit magical, it's harder for future code to get wrong.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-10 13:46:27 -08:00
Junio C Hamano
3baf58bfb4 format-patch: make output filename configurable
For the past 15 years, we've used the hardcoded 64 as the length
limit of the filename of the output from the "git format-patch"
command.  Since the value is shorter than the 80-column terminal, it
could grow without line wrapping a bit.  At the same time, since the
value is longer than half of the 80-column terminal, we could fit
two or more of them in "ls" output on such a terminal if we allowed
to lower it.

Introduce a new command line option --filename-max-length=<n> and a
new configuration variable format.filenameMaxLength to override the
hardcoded default.

While we are at it, remove a check that the name of output directory
does not exceed PATH_MAX---this check is pointless in that by the
time control reaches the function, the caller would already have
done an equivalent of "mkdir -p", so if the system does not like an
overly long directory name, the control wouldn't have reached here,
and otherwise, we know that the system allowed the output directory
to exist.  In the worst case, we will get an error when we try to
open the output file and handle the error correctly anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-09 17:44:41 -08:00
Junio C Hamano
6a44c9c0d0 Merge branch 'jk/committer-date-is-author-date-fix-simplify'
Code simplification.

* jk/committer-date-is-author-date-fix-simplify:
  am, sequencer: stop parsing our own committer ident
2020-11-09 14:06:28 -08:00
Junio C Hamano
ecf95d938b Merge branch 'ab/git-remote-exit-code'
Exit codes from "git remote add" etc. were not usable by scripted
callers.

* ab/git-remote-exit-code:
  remote: add meaningful exit code on missing/existing
2020-11-09 14:06:26 -08:00
Junio C Hamano
92d6bd2e90 Merge branch 'jk/checkout-index-errors'
"git checkout-index" did not consistently signal an error with its
exit status.

* jk/checkout-index-errors:
  checkout-index: propagate errors to exit code
  checkout-index: drop error message from empty --stage=all
2020-11-09 14:06:26 -08:00
Junio C Hamano
cfdc70b299 Merge branch 'mr/bisect-in-c-3'
Rewriting "git bisect" in C continues.

* mr/bisect-in-c-3:
  bisect--helper: retire `--bisect-autostart` subcommand
  bisect--helper: retire `--write-terms` subcommand
  bisect--helper: retire `--check-expected-revs` subcommand
  bisect--helper: reimplement `bisect_state` & `bisect_head` shell functions in C
  bisect--helper: retire `--next-all` subcommand
  bisect--helper: retire `--bisect-clean-state` subcommand
  bisect--helper: finish porting `bisect_start()` to C
2020-11-09 14:06:25 -08:00
Phillip Wood
8843302307 rebase -i: simplify get_revision_ranges()
Now that all the external users of head_hash have been converted to
use a opts->orig_head instead we can stop returning head_hash from
get_revision_ranges().

Because we want to pass the full object names back to the caller in
`revisions` the find_unique_abbrev_r() call that was used to initialize
`head_hash` is replaced with oid_to_hex().

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:10:41 -08:00
Phillip Wood
a2bb10d06d rebase -i: use struct object_id when writing state
Rather than passing a string around pass the struct object_id that the
string was created from call oid_hex() when we write the file.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:10:41 -08:00
Phillip Wood
f3e27a02d5 rebase -i: use struct object_id rather than looking up commit
We already have a struct object_id containing the oid that we want to
set ORIG_HEAD to so use that rather than converting it to a string and
then calling get_oid() on that string.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:10:41 -08:00
Phillip Wood
e100bea481 rebase -i: stop overwriting ORIG_HEAD buffer
After rebasing, ORIG_HEAD is supposed to point to the old HEAD of the
rebased branch.  The code used find_unique_abbrev() to obtain the
object name of the old HEAD and wrote to both
.git/rebase-merge/orig-head (used by `rebase --abort` to go back to
the previous state) and to ORIG_HEAD.  The buffer find_unique_abbrev()
gives back is volatile, unfortunately, and was overwritten after the
former file is written but before ORIG_FILE is written, leaving an
incorrect object name in it.

Avoid relying on the volatile buffer of find_unique_abbrev(), and
instead supply our own buffer to keep the object name.

I think that all of the users of head_hash should actually be using
opts->orig_head instead as passing a string rather than a struct
object_id around is a hang over from the scripted implementation. This
patch just fixes the immediate bug and adds a regression test based on
Caspar's reproduction example[1]. The users will be converted to use
struct object_id and head_hash removed in the next few commits.

[1] https://lore.kernel.org/git/CAFzd1+7PDg2PZgKw7U0kdepdYuoML9wSN4kofmB_-8NHrbbrHg@mail.gmail.com

Reported-by: Caspar Duregger <herr.kaste@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:10:41 -08:00
Jeff King
dc1672dd10 format-patch: support --output option
We've never intended to support diff's --output option in format-patch.
And until baa4adc66a (parse-options: disable option abbreviation with
PARSE_OPT_KEEP_UNKNOWN, 2019-01-27), it was impossible to trigger. We
first parse the format-patch options before handing the remainder off to
setup_revisions(). Before that commit, we'd accept "--output=foo" as an
abbreviation for "--output-directory=foo". But afterwards, we don't
check abbreviations, and --output gets passed to the diff code.

This results in nonsense behavior and bugs. The diff code will have
opened a filehandle at rev.diffopt.file, but we'll overwrite that with
our own handles that we open for each individual patch file. So the
--output file will always just be empty. But worse, the diff code also
sets rev.diffopt.close_file, so log_tree_commit() will close the
filehandle itself. And then the main loop in cmd_format_patch() will try
to close it again, resulting in a double-free.

The simplest solution would be to just disallow --output with
format-patch, as nobody ever intended it to work. However, we have
accidentally documented it (because format-patch includes diff-options).
And it does work with "git log", which writes the whole output to the
specified file. It's easy enough to make that work for format-patch,
too: it's really the same as --stdout, but pointed at a specific file.

We can detect the use of the --output option by the "close_file" flag
(note that we can't use rev.diffopt.file, since the diff setup will
otherwise set it to stdout). So we just need to unset that flag, but
don't have to do anything else. Our situation is otherwise exactly like
--stdout (note that we don't fclose() the file, but nor does the stdout
case; exiting the program takes care of that for us).

Reported-by: Johannes Postler <johannes.postler@txture.io>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:05:29 -08:00
Jeff King
1e1693b2bb format-patch: tie file-opening logic to output_directory
In format-patch we're either outputting to stdout or to individual files
in an output directory (which may be just "./"). Our logic for whether
to open a new file for each patch is checked with "!use_stdout", but it
is equally correct to check for a non-NULL output_directory.

The distinction will matter when we add a new single-stream output in a
future patch, when only one of the three methods will want individual
files. Let's swap the logic here in preparation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:05:28 -08:00
Jeff King
4c6f781f9c format-patch: refactor output selection
The --stdout and --output-directory options are mutually exclusive, but
it's hard to tell from reading the code. We have three separate
conditionals that check for use_stdout, and it's only after we've set up
the output_directory fully that we check whether the user also specified
--stdout.

Instead, let's check the exclusion explicitly first, then have a single
conditional that handles stdout versus an output directory. This is
slightly easier to follow now, and also will keep things sane when we
add another output mode in a future patch.

We'll add a few tests as well, covering the mutual exclusion and the
fact that we are not confused by a configured output directory.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:05:28 -08:00
Junio C Hamano
39664cb0ac log: diagnose -L used with pathspec as an error
The -L option is documented to accept no pathspec, but the
command line option parser has allowed the combination without
checking so far.  Ensure that there is no pathspec when the -L
option is in effect to fix this.

Incidentally, this change fixes another bug in the command line
option parser, which has allowed the -L option used together
with the --follow option.  Because the latter requires exactly
one path given, but the former takes no pathspec, they become
mutually incompatible automatically.  Because the -L option
follows renames on its own, there is no reason to give --follow
at the same time.

The new tests say they may fail with "-L and --follow being
incompatible" instead of "-L and pathspec being incompatible".
Currently the expected failure can come only from the latter, but
this is to futureproof them, in case we decide to add code to
explicititly die on -L and --follow used together.

Heled-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 13:38:33 -08:00
Elijah Newren
14c4586c2d merge,rebase,revert: select ort or recursive by config or environment
Allow the testsuite to run where it treats requests for "recursive" or
the default merge algorithm via consulting the environment variable
GIT_TEST_MERGE_ALGORITHM which is expected to either be "recursive" (the
old traditional algorithm) or "ort" (the new algorithm).

Also, allow folks to pick the new algorithm via config setting.  It
turns out builtin/merge.c already had a way to allow users to specify a
different default merge algorithm: pull.twohead.  Rather odd
configuration name (especially to be in the 'pull' namespace rather than
'merge') but it's there.  Add that same configuration to rebase,
cherry-pick, and revert.

This required updating the various callsites that called merge_trees()
or merge_recursive() to conditionally call the new API, so this serves
as another demonstration of what the new API looks and feels like.
There are almost certainly some callsites that have not yet been
modified to work with the new merge algorithm, but this represents the
ones that I have been testing with thus far.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02 16:35:50 -08:00
Junio C Hamano
1ae0949a03 Merge branch 'mk/diff-ignore-regex'
"git diff" family of commands learned the "-I<regex>" option to
ignore hunks whose changed lines all match the given pattern.

* mk/diff-ignore-regex:
  diff: add -I<regex> that ignores matching changes
  merge-base, xdiff: zero out xpparam_t structures
2020-11-02 13:17:44 -08:00
Junio C Hamano
51830654fc Merge branch 'jk/fast-import-marks-cleanup'
Code clean-up.

* jk/fast-import-marks-cleanup:
  fast-import: remove duplicated option-parsing line
2020-11-02 13:17:40 -08:00
Junio C Hamano
e0f6ad2984 Merge branch 'tk/credential-config'
"git credential' didn't honor the core.askPass configuration
variable (among other things), which has been corrected.

* tk/credential-config:
  credential: load default config
2020-11-02 13:17:39 -08:00
Junio C Hamano
b6fb70c985 Merge branch 'dl/diff-merge-base'
"git diff A...B" learned "git diff --merge-base A B", which is a
longer short-hand to say the same thing.

* dl/diff-merge-base:
  contrib/completion: complete `git diff --merge-base`
  builtin/diff-tree: learn --merge-base
  builtin/diff-index: learn --merge-base
  t4068: add --merge-base tests
  diff-lib: define diff_get_merge_base()
  diff-lib: accept option flags in run_diff_index()
  contrib/completion: extract common diff/difftool options
  git-diff.txt: backtick quote command text
  git-diff-index.txt: make --cached description a proper sentence
  t4068: remove unnecessary >tmp
2020-11-02 13:17:39 -08:00
Junio C Hamano
761a4e9ab1 Merge branch 'bk/sob-dco'
Document that the meaning of a Signed-off-by trailer can vary from
project to project in the end-user documentation, and clarify what
it means to this project.

* bk/sob-dco:
  Documentation: stylistically normalize references to Signed-off-by:
  SubmittingPatches: clarify DCO is our --signoff rule
  Documentation: clarify and expand description of --signoff
  doc: preparatory clean-up of description on the sign-off option
2020-11-02 13:17:39 -08:00
Junio C Hamano
0be2d65132 Merge branch 'ds/maintenance-commit-graph-auto-fix'
Test-coverage enhancement of running commit-graph task "git
maintenance" as needed led to discovery and fix of a bug.

* ds/maintenance-commit-graph-auto-fix:
  maintenance: core.commitGraph=false prevents writes
  maintenance: test commit-graph auto condition
2020-11-02 13:17:39 -08:00
Junio C Hamano
cd47bbe164 Merge branch 'jk/fast-import-marks-alloc-fix'
"git fast-import" wasted a lot of memory when many marks were in use.

* jk/fast-import-marks-alloc-fix:
  fast-import: fix over-allocation of marks storage
2020-11-02 13:17:37 -08:00
Elijah Newren
6da1a25814 hashmap: provide deallocation function names
hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed
for a while, but aren't necessarily the clearest names, especially with
hashmap_partial_clear() being added to the mix and lazy-initialization
now being supported.  Peff suggested we adopt the following names[1]:

  - hashmap_clear() - remove all entries and de-allocate any
    hashmap-specific data, but be ready for reuse

  - hashmap_clear_and_free() - ditto, but free the entries themselves

  - hashmap_partial_clear() - remove all entries but don't deallocate
    table

  - hashmap_partial_clear_and_free() - ditto, but free the entries

This patch provides the new names and converts all existing callers over
to the new naming scheme.

[1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02 12:15:50 -08:00
Philippe Blain
3af31e8786 blame: simplify 'setup_blame_bloom_data' interface
The penultimate commit moved the initialization of 'sb.path' in
'builtin/blame.c::cmd_blame' before the call to
'blame.c::setup_blame_bloom_data'. Since 'cmd_blame' is the only caller
of 'setup_blame_bloom_data', it is now unnecessary for
'setup_blame_bloom_data' to receive 'path' as a separate argument, as
'sb.path' is already initialized.

Remove this argument from setup_blame_bloom_data's interface and use the
'path' field of the 'sb' 'struct blame_scoreboard' instead.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:15 -08:00
Philippe Blain
88894aaeea blame: simplify 'setup_scoreboard' interface
The previous commit moved the initialization of 'sb.path' in
'builtin/blame.c::cmd_blame' before the call to
'blame.c::setup_scoreboard'. Since 'cmd_blame' is the only caller of
'setup_scoreboard', it is now unnecessary for 'setup_scoreboard' to
receive 'path' as a separate argument, as 'sb.path' is already
initialized.

Remove this argument from setup_scoreboard's interface and use the
'path' field of the 'sb' 'struct blame_scoreboard' instead.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:15 -08:00
Philippe Blain
9466e3809d blame: enable funcname blaming with userdiff driver
In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct
blame_scoreboard' as the 'path' argument to
'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set
to the local variable 'path' a few lines later at line 1137.

This 'path' argument is only used in 'parse_range_arg' if we are blaming
a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it
is sent to 'parse_range_funcname', where it is used to determine if a
userdiff driver should be used for said <path> to match the given
funcname.

Since 'path' is yet unset, the userdiff driver is never used, so we fall
back to the default funcname regex, which is usually not appropriate for
paths that are set to use a specific userdiff driver, and thus either we
match some unrelated lines, or we die with

    fatal: -L parameter '<funcname>' starting at line 1: no match

This has been the case ever since `git blame` learned to blame a
funcname in 13b8f68c1f (log -L: :pattern:file syntax to find by
funcname, 2013-03-28).

Enable funcname blaming for paths using specific userdiff drivers by
initializing 'sb.path' earlier in 'cmd_blame', when some of its other
fields are initialized, so that it is set when passed to
'parse_range_arg'.

Add a regression test in 'annotate-tests.sh', which is sourced in
t8001-annotate.sh and t8002-blame.sh, leveraging an existing file used
to test the userdiff patterns in t4018-diff-funcname.

Also, use 'sb.path' instead of 'path' when constructing the error
message at line 1114, for consistency.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:15 -08:00
Philippe Blain
180d641d7d line-log: mention both modes in 'blame' and 'log' short help
'git blame -h' and 'git log -h' both show '-L <n,m>' and describe this
option as "Process only line range n,m, counting from 1". No hint is
given that a function name regex can also be used.

Use <range> instead, and expand the description of the option to mention
both modes. Remove "counting from 1" as it's uneeded; it's uncommon to
refer to the first line of a file as "line 0".

Also, for 'git log', improve the wording to better reflect the long help.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:14 -08:00
René Scharfe
4f44c5659b stash: simplify reflog emptiness check
Calling rev-parse to check if the drop subcommand removed the last stash
and treating its failure as confirmation is fragile, as the command can
fail for other reasons, e.g. because the system is out of memory.
Directly check if the reflog is empty instead, which is more robust.

Reported-by: Marek Mrva <mrva@eof-studios.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:51:31 -08:00
René Scharfe
cd8888452c object: allow clear_commit_marks_all to handle any repo
Allow callers to specify the repository to use.  Rename the function to
repo_clear_commit_marks to document its new scope.  No functional change
intended.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-31 10:46:34 -07:00
Junio C Hamano
4f9f7c1442 Merge branch 'jk/committer-date-is-author-date-fix' into maint
In 2.29, "--committer-date-is-author-date" option of "rebase" and
"am" subcommands lost the e-mail address by mistake, which has been
corrected.

* jk/committer-date-is-author-date-fix:
  rebase: fix broken email with --committer-date-is-author-date
  am: fix broken email with --committer-date-is-author-date
  t3436: check --committer-date-is-author-date result more carefully
2020-10-29 14:18:47 -07:00
Junio C Hamano
0e41cfad62 Merge branch 'dl/checkout-guess'
"git checkout" learned to use checkout.guess configuration variable
and enable/disable its "--[no-]guess" option accordingly.

* dl/checkout-guess:
  checkout: learn to respect checkout.guess
  Documentation/config/checkout: replace sq with backticks
2020-10-27 15:09:51 -07:00
Junio C Hamano
f3cfeb3078 Merge branch 'dl/checkout-p-merge-base'
"git checkout -p A...B [-- <path>]" did not work, even though the
same command without "-p" correctly used the merge-base between
commits A and B.

* dl/checkout-p-merge-base:
  t2016: add a NEEDSWORK about the PERL prerequisite
  add-patch: add NEEDSWORK about comparing commits
  Doc: document "A...B" form for <tree-ish> in checkout and switch
  builtin/checkout: fix `git checkout -p HEAD...` bug
2020-10-27 15:09:51 -07:00
Junio C Hamano
40696c6727 Merge branch 'sb/clone-origin'
"git clone" learned clone.defaultremotename configuration variable
to customize what nickname to use to call the remote the repository
was cloned from.

* sb/clone-origin:
  clone: allow configurable default for `-o`/`--origin`
  clone: read new remote name from remote_name instead of option_origin
  clone: validate --origin option before use
  refs: consolidate remote name validation
  remote: add tests for add and rename with invalid names
  clone: use more conventional config/option layering
  clone: add tests for --template and some disallowed option pairs
2020-10-27 15:09:50 -07:00
Junio C Hamano
de0a7effc8 Merge branch 'sk/force-if-includes'
"git push --force-with-lease[=<ref>]" can easily be misused to lose
commits unless the user takes good care of their own "git fetch".
A new option "--force-if-includes" attempts to ensure that what is
being force-pushed was created after examining the commit at the
tip of the remote ref that is about to be force-replaced.

* sk/force-if-includes:
  t, doc: update tests, reference for "--force-if-includes"
  push: parse and set flag for "--force-if-includes"
  push: add reflog check for "--force-if-includes"
2020-10-27 15:09:49 -07:00
Junio C Hamano
52b8c8c716 Merge branch 'ds/maintenance-part-2'
"git maintenance", an extended big brother of "git gc", continues
to evolve.

* ds/maintenance-part-2:
  maintenance: add incremental-repack auto condition
  maintenance: auto-size incremental-repack batch
  maintenance: add incremental-repack task
  midx: use start_delayed_progress()
  midx: enable core.multiPackIndex by default
  maintenance: create auto condition for loose-objects
  maintenance: add loose-objects task
  maintenance: add prefetch task
2020-10-27 15:09:47 -07:00
Junio C Hamano
26bb5437f6 Merge branch 'rs/worktree-list-show-locked'
"git worktree list" now shows if each worktree is locked.  This
possibly may open us to show other kinds of states in the future.

* rs/worktree-list-show-locked:
  worktree: teach `list` to annotate locked worktree
2020-10-27 15:09:47 -07:00
Junio C Hamano
ae84e924da Merge branch 'rs/tighten-callers-of-deref-tag'
Code clean-up.

* rs/tighten-callers-of-deref-tag:
  line-log: handle deref_tag() returning NULL
  blame: handle deref_tag() returning NULL
  grep: handle deref_tag() returning NULL
2020-10-27 15:09:46 -07:00
Jeff King
7e41061588 checkout-index: propagate errors to exit code
If we encounter an error while checking out an explicit path, we print a
message to stderr but do not actually exit with a non-zero code. While
this is a plumbing command and the behavior goes all the way back to
33db5f4d90 (Add a "checkout-cache" command which does what the name
suggests., 2005-04-09), this is almost certainly an oversight:

  - we _do_ return an exit code from checkout_file(); the caller just
    never reads it

  - errors while checking out all paths (with "-a") do result in a
    non-zero exit code.

  - it would be quite unusual not to use the exit code for an error,
    as otherwise the caller has no idea the command failed except by
    scraping stderr

To keep our tests simple and portable, we can use the most obvious
error: asking to checkout a path which is not in the index at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-27 12:41:56 -07:00
Jeff King
0b809c8248 checkout-index: drop error message from empty --stage=all
If checkout-index is given --stage=all for a specific path, it will try
to write stages 1-3 (if present) for that path to temporary files.
However, if the file is present only at stage 0, it writes nothing but
gives a confusing message:

  $ git checkout-index --stage=all -- Makefile
  git checkout-index: Makefile does not exist at stage 4

This is nonsense. There is no stage 4 (it's just an internal enum value
we use for "all"), and the documentation clearly states:

  Paths which only have a stage 0 entry will always be omitted from the
  output.

Here it's talking about the list of tempfiles written to stdout, but it
seems clear that this case was not meant to be an error. We even have a
test which covers it, but it only checks that the command reports an
exit code of 0, not its stderr. And it reports 0 only because of another
bug which fails to propagate errors (which will be fixed in a subsequent
patch).

So let's make the test more thorough. We'll also cover the case that we
found _no_ entry, not even a stage zero, which should still be an error.
However, because of the other bug, we'll have to mark this as expecting
failure for the moment.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-27 12:41:54 -07:00
Ævar Arnfjörð Bjarmason
9144ba4cf5 remote: add meaningful exit code on missing/existing
Change the exit code for the likes of "git remote add/rename" to exit
with 2 if the remote in question doesn't exist, and 3 if it
does. Before we'd just die() and exit with the general 128 exit code.

This changes the output message from e.g.:

    fatal: remote origin already exists.

To:

    error: remote origin already exists.

Which I believe is a feature, since we generally use "fatal" for the
generic errors, and "error" for the more specific ones with a custom
exit code, but this part of the change may break code that already
relies on stderr parsing (not that we ever supported that...).

The motivation for this is a discussion around some code in GitLab's
gitaly which wanted to check this, and had to parse stderr to do so:
https://gitlab.com/gitlab-org/gitaly/-/merge_requests/2695

It's worth noting as an aside that a method of checking this that
doesn't rely on that is to check with "git config" whether the value
in question does or doesn't exist. That introduces a TOCTOU race
condition, but on the other hand this code (e.g. "git remote add")
already has a TOCTOU race.

We go through the config.lock for the actual setting of the config,
but the pseudocode logic is:

    read_config();
    check_config_and_arg_sanity();
    save_config();

So e.g. if a sleep() is added right after the remote_is_configured()
check in add() we'll clobber remote.NAME.url, and add another (usually
duplicate) remote.NAME.fetch entry (and other values, depending on
invocation).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-27 11:40:33 -07:00
Junio C Hamano
f34687dc81 Merge branch 'jk/committer-date-is-author-date-fix'
In 2.29, "--committer-date-is-author-date" option of "rebase" and
"am" subcommands lost the e-mail address by mistake, which has been
corrected.

* jk/committer-date-is-author-date-fix:
  rebase: fix broken email with --committer-date-is-author-date
  am: fix broken email with --committer-date-is-author-date
  t3436: check --committer-date-is-author-date result more carefully
2020-10-26 14:59:58 -07:00
Jeff King
2020451c5b am, sequencer: stop parsing our own committer ident
For the --committer-date-is-author-date option of git-am and git-rebase,
we format the committer ident, then re-parse it to find the name and
email, and then feed those back to fmt_ident().

We can simplify this by handling it all at the time of the fmt_ident()
call. We pass in the appropriate getenv() results, and if they're not
present, then our WANT_COMMITTER_IDENT flag tells fmt_ident() to fill in
the appropriate value from the config. Which is exactly what
git_committer_ident() was doing under the hood.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26 09:59:57 -07:00
Jeff King
16b0bb99ea am: fix broken email with --committer-date-is-author-date
Commit e8cbe2118a (am: stop exporting GIT_COMMITTER_DATE, 2020-08-17)
rewrote the code for setting the committer date to use fmt_ident(),
rather than setting an environment variable and letting commit_tree()
handle it. But it introduced two bugs:

  - we use the author email string instead of the committer email

  - when parsing the committer ident, we used the wrong variable to
    compute the length of the email, resulting in it always being a
    zero-length string

This commit fixes both, which causes our test of this option via the
rebase "apply" backend to now succeed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-23 08:25:19 -07:00
Michał Kępień
ec7967cfaf merge-base, xdiff: zero out xpparam_t structures
xpparam_t structures are usually zero-initialized before their specific
fields are assigned to, but there are three locations in the tree where
that does not happen.  Add the missing memset() calls in order to make
initialization of xpparam_t structures consistent tree-wide and to
prevent stack garbage from being used as field values.

Signed-off-by: Michał Kępień <michal@isc.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20 12:53:26 -07:00
Bradley M. Kuhn
3abd4a67d9 Documentation: stylistically normalize references to Signed-off-by:
Ted reported an old typo in the git-commit.txt and merge-options.txt.
Namely, the phrase "Signed-off-by line" was used without either a
definite nor indefinite article.

Upon examination, it seems that the documentation (including items in
Documentation/, but also option help strings) have been quite
inconsistent on usage when referring to `Signed-off-by`.

First, very few places used a definite or indefinite article with the
phrase "Signed-off-by line", but that was the initial typo that led
to this investigation.  So, normalize using either an indefinite or
definite article consistently.

The original phrasing, in Commit 3f971fc425 (Documentation updates,
2005-08-14), is "Add Signed-off-by line".  Commit 6f855371a5 (Add
--signoff, --check, and long option-names. 2005-12-09) switched to
using "Add `Signed-off-by:` line", but didn't normalize the former
commit to match.  Later commits seem to have cut and pasted from one
or the other, which is likely how the usage became so inconsistent.

Junio stated on the git mailing list in
<xmqqy2k1dfoh.fsf@gitster.c.googlers.com> a preference to leave off
the colon.  Thus, prefer `Signed-off-by` (with backticks) for the
documentation files and Signed-off-by (without backticks) for option
help strings.

Additionally, Junio argued that "trailer" is now the standard term to
refer to `Signed-off-by`, saying that "becomes plenty clear that we
are not talking about any random line in the log message".  As such,
prefer "trailer" over "line" anywhere the former word fits.

However, leave alone those few places in documentation that use
Signed-off-by to refer to the process (rather than the specific
trailer), or in places where mail headers are generally discussed in
comparison with Signed-off-by.

Reported-by: "Theodore Y. Ts'o" <tytso@mit.edu>
Signed-off-by: Bradley M. Kuhn <bkuhn@sfconservancy.org>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20 11:57:40 -07:00
Thomas Koutcher
567ad2c0f9 credential: load default config
Make `git credential fill` honour the core.askPass variable.

Signed-off-by: Thomas Koutcher <thomas.koutcher@online.fr>
[jk: added test]
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:30:45 -07:00
Pranit Bauva
b0f6494f70 bisect--helper: retire --bisect-autostart subcommand
The `--bisect-autostart` subcommand is no longer used from the
git-bisect.sh shell script. Instead the function
`bisect_autostart()` is directly called from the C implementation.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:24:20 -07:00