Commit Graph

10183 Commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason
f58c7468cd ls-remote: set packet_trace_identity(<name>)
Set packet_trace_identity() for ls-remote. This replaces the generic
"git" identity in GIT_TRACE_PACKET=<file> traces to "ls-remote", e.g.:

    [...] packet:  upload-pack> version 2
    [...] packet:  upload-pack> agent=git/2.32.0-dev
    [...] packet:    ls-remote< version 2
    [...] packet:    ls-remote< agent=git/2.32.0-dev

Where in an "git ls-remote file://<path>" dialog ">" is the sender (or
"to the server") and "<" is the recipient (or "received by the
client").

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:47:07 -07:00
Derrick Stolee
a16eb6b1ff maintenance: skip bootout/bootstrap when plist is registered
On macOS, we use launchctl to manage the background maintenance
schedule. This uses a set of .plist files to describe the schedule, but
these files are also registered with 'launchctl bootstrap'. If multiple
'git maintenance start' commands run concurrently, then they can collide
replacing these schedule files and registering them with launchctl.

To avoid extra launchctl commands, do a check for the .plist files on
disk and check if they are registered using 'launchctl list <name>'.
This command will return with exit code 0 if it exists, or exit code 113
if it does not.

We can test this behavior using the GIT_TEST_MAINT_SCHEDULER environment
variable.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:16:58 -07:00
Johannes Schindelin
bb01122a82 maintenance: create launchctl configuration using a lock file
When two `git maintenance` processes try to write the `.plist` file, we
need to help them with serializing their efforts.

The 150ms time-out value was determined from thin air.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:16:57 -07:00
Atharva Raykar
c51f8f94e5 submodule--helper: run update procedures from C
Add a new submodule--helper subcommand `run-update-procedure` that runs
the update procedure if the SHA1 of the submodule does not match what
the superproject expects.

This is an intermediate change that works towards total conversion of
`submodule update` from shell to C.

Specific error codes are returned so that the shell script calling the
subcommand can take a decision on the control flow, and preserve the
error messages across subsequent recursive calls of `cmd_update`.

This change is more focused on doing a faithful conversion, so for now we
are not too concerned with trying to reduce subprocess spawns.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 14:03:58 -07:00
Taylor Blau
3ba3d0621b pack-bitmap-write.c: gracefully fail to write non-closed bitmaps
The set of objects covered by a bitmap must be closed under
reachability, since it must be the case that there is a valid bit
position assigned for every possible reachable object (otherwise the
bitmaps would be incomplete).

Pack bitmaps are never written from 'git repack' unless repacking
all-into-one, and so we never write non-closed bitmaps (except in the
case of partial clones where we aren't guaranteed to have all objects).

But multi-pack bitmaps change this, since it isn't known whether the
set of objects in the MIDX is closed under reachability until walking
them. Plumb through a bit that is set when a reachable object isn't
found.

As soon as a reachable object isn't found in the set of objects to
include in the bitmap, bitmap_writer_build() knows that the set is not
closed, and so it now fails gracefully.

A test is added in t0410 to trigger a bitmap write without full
reachability closure by removing local copies of some reachable objects
from a promisor remote.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24 13:21:13 -07:00
Joel Klinghed
8ef6aad664 commit: restore --edit when combined with --fixup
Recent changes to --fixup, adding amend suboption, caused the
--edit flag to be ignored as use_editor was always set to zero.

Restore edit_flag having higher priority than fixup_message when
deciding the value of use_editor by moving the edit flag condition
later in the method.

Signed-off-by: Joel Klinghed <the_jk@spawned.biz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-15 09:44:08 -07:00
Junio C Hamano
716f68ec33 Merge branch 'ds/add-with-sparse-index' into ds/sparse-index-ignored-files
* ds/add-with-sparse-index:
  add: remove ensure_full_index() with --renormalize
  add: ignore outside the sparse-checkout in refresh()
  pathspec: stop calling ensure_full_index
  add: allow operating on a sparse-only index
  t1092: test merge conflicts outside cone
2021-08-10 13:39:14 -07:00
Atharva Raykar
de0fcbe0f4 submodule--helper: rename compute_submodule_clone_url()
Let's rename 'compute_submodule_clone_url()' to 'resolve_relative_url()'
to make it clear that this internal helper need not be used exclusively
for computing submodule clone URLs.

Since the original 'resolve-relative-url' subcommand and its C entry
point has been removed in c461095ae3 (submodule--helper: remove
resolve-relative-url subcommand, 2021-07-02), this rename can be done
without causing any confusion about which function it actually binds to.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:12 -07:00
Atharva Raykar
15fe88d5a6 submodule--helper: remove resolve-relative-url subcommand
The shell subcommand `resolve-relative-url` is no longer required, as
its last caller has been removed when it was converted to C.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:12 -07:00
Atharva Raykar
ba8a3b019e submodule--helper: remove add-config subcommand
Also no longer needed is this subcommand, as all of its functionality is
being called by the newly-introduced `module_add()` directly within C.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:12 -07:00
Atharva Raykar
f006132c24 submodule--helper: remove add-clone subcommand
We no longer need this subcommand, as all of its functionality is being
called by the newly-introduced `module_add()` directly within C.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
Atharva Raykar
a6226fd772 submodule--helper: convert the bulk of cmd_add() to C
Introduce the 'add' subcommand to `submodule--helper.c` that does all
the work 'submodule add' past the parsing of flags.

We also remove the constness of the sm_path field of the `add_data`
struct. This is needed so that it can be modified by
normalize_path_copy().

As with the previous conversions, this is meant to be a faithful
conversion with no modification to the behaviour of `submodule add`.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
Atharva Raykar
ed86301f68 dir: libify and export helper functions from clone.c
These functions can be useful to other parts of Git. Let's move them to
dir.c, while renaming them to be make their functionality more explicit.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
Atharva Raykar
0c61041ed6 submodule--helper: remove repeated code in sync_submodule()
This part of `sync_submodule()` is doing the same thing that
`compute_submodule_clone_url()` is doing. Let's reuse that helper here.

Note that this change adds a small overhead where we allocate and free
the 'remote' twice, but that is a small price to pay for the higher
level of abstraction we get.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
Atharva Raykar
ab6f23b751 submodule--helper: refactor resolve_relative_url() helper
Refactor the helper function to resolve a relative url, by reusing the
existing `compute_submodule_clone_url()` function.

`compute_submodule_clone_url()` performs the same work that
`resolve_relative_url()` is doing, so we eliminate this code repetition
by moving the former function's definition up, and calling it inside
`resolve_relative_url()`.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
Atharva Raykar
6baf4e4da4 submodule--helper: add options for compute_submodule_clone_url()
Let's modify the interface to `compute_submodule_clone_url()` function
by adding two more arguments, so that we can reuse this in various parts
of `submodule--helper.c` that follow a common pattern, which is--read
the remote url configuration of the superproject and then call
`relative_url()`.

This function is nearly identical to `resolve_relative_url()`, the only
difference being the extra warning message. We can add a quiet flag to
it, to suppress that warning when not needed, and then refactor
`resolve_relative_url()` by using this function, something we will do in
the next patch.

We also rename the local variable 'relurl' to avoid potential confusion
with the 'rel_url' parameter while we are at it.

Having this functionality factored out will be useful for converting the
rest of `submodule add` in subsequent patches.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 11:45:11 -07:00
Junio C Hamano
59dcbb810c Merge branch 'ar/submodule-add-config' into ar/submodule-add
* ar/submodule-add-config:
  submodule--helper: introduce add-config subcommand
2021-08-10 11:01:19 -07:00
Atharva Raykar
a452128a36 submodule--helper: introduce add-config subcommand
Add a new "add-config" subcommand to `git submodule--helper` with the
goal of converting part of the shell code in git-submodule.sh related to
`git submodule add` into C code. This new subcommand sets the
configuration variables of a newly added submodule, by registering the
url in local git config, as well as the submodule name and path in the
.gitmodules file. It also sets 'submodule.<name>.active' to "true" if
the submodule path has not already been covered by any pathspec
specified in 'submodule.active'.

This is meant to be a faithful conversion from shell to C, although we
add comments to areas that could be improved in future patches, after
the conversion has settled.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10 10:57:57 -07:00
Carlo Marcelo Arenas Belón
00e302da76 builtin/merge: avoid -Wformat-extra-args from ancient Xcode
d540b70c85 (merge: cleanup messages like commit, 2019-04-17) adds
a way to change part of the helper text using a single call to
strbuf_add_commented_addf but with two formats with varying number
of parameters.

this trigger a warning in old versions of Xcode (ex 8.0), so use
instead two independent calls with a matching number of parameters

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-09 09:48:01 -07:00
Elijah Newren
6a5fb96672 Change default merge backend from recursive to ort
There are a few reasons to switch the default:
  * Correctness
  * Extensibility
  * Performance

I'll provide some summaries about each.

=== Correctness ===

The original impetus for a new merge backend was to fix issues that were
difficult to fix within recursive's design.  The success with this goal
is perhaps most easily demonstrated by running the following:

  $ git grep -2 KNOWN_FAILURE t/ | grep -A 4 GIT_TEST_MERGE_ALGORITHM
  $ git grep test_expect_merge_algorithm.failure.success t/
  $ git grep test_expect_merge_algorithm.success.failure t/

In order, these greps show:

  * Seven sets of submodule tests (10 total tests) that fail with
    recursive but succeed with ort
  * 22 other tests that fail with recursive, but succeed with ort
  * 0 tests that pass with recursive, but fail with ort

=== Extensibility ===

Being able to perform merges without touching the working tree or index
makes it possible to create new features that were difficult with the
old backend:

  * Merging, cherry-picking, rebasing, reverting in bare repositories...
    or just on branches that aren't checked out.

  * `git diff AUTO_MERGE` -- ability to see what changes the user has
    made to resolve conflicts so far (see commit 5291828df8 ("merge-ort:
    write $GIT_DIR/AUTO_MERGE whenever we hit a conflict", 2021-03-20)

  * A --remerge-diff option for log/show, used to show diffs for merges
    that display the difference between what an automatic merge would
    have created and what was recorded in the merge.  (This option will
    often result in an empty diff because many merges are clean, but for
    the non-clean ones it will show how conflicts were fixed including
    the removal of conflict markers, and also show additional changes
    made outside of conflict regions to e.g. fix semantic conflicts.)

  * A --remerge-diff-only option for log/show, similar to --remerge-diff
    but also showing how cherry-picks or reverts differed from what an
    automatic cherry-pick or revert would provide.

The last three have been implemented already (though only one has been
submitted upstream so far; the others were waiting for performance work
to complete), and I still plan to implement the first one.

=== Performance ===

I'll quote from the summary of my final optimization for merge-ort
(while fixing the testcase name from 'no-renames' to 'few-renames'):

                               Timings

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename    merge-ort
                 v2.30.0      current     detection     current
                ----------   ---------   -----------   ---------
few-renames:      18.912 s    18.030 s     11.699 s     198.3 ms
mega-renames:   5964.031 s   361.281 s    203.886 s     661.8 ms
just-one-mega:   149.583 s    11.009 s      7.553 s     264.6 ms

                           Speedup factors

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
few-renames:        1           1.05         1.6           95
mega-renames:       1          16.5         29           9012
just-one-mega:      1          13.6         20            565

And, for partial clone users:

             Factor reduction in number of objects needed

                                          Infinite
                 merge-       merge-     Parallelism
                recursive    recursive    of rename
                 v2.30.0      current     detection    merge-ort
                ----------   ---------   -----------   ---------
mega-renames:       1            1            1          181.3

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 15:35:02 -07:00
Patrick Steinhardt
29ef1f27fe revision: separate walk and unsorted flags
The `--no-walk` flag supports two modes: either it sorts the revisions
given as input input or it doesn't. This is reflected in a single
`no_walk` flag, which reflects one of the three states "walk", "don't
walk but without sorting" and "don't walk but with sorting".

Split up the flag into two separate bits, one indicating whether we
should walk or not and one indicating whether the input should be sorted
or not. This will allow us to more easily introduce a new flag
`--unsorted-input`, which only impacts the sorting bit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 09:37:28 -07:00
Ævar Arnfjörð Bjarmason
98e2d9d6f7 upload-pack: document and rename --advertise-refs
The --advertise-refs documentation in git-upload-pack added in
9812f2136b (upload-pack.c: use parse-options API, 2016-05-31) hasn't
been entirely true ever since v2 support was implemented in
e52449b672 (connect: request remote refs using v2, 2018-03-15). Under
v2 we don't advertise the refs at all, but rather dump the
capabilities header.

This option has always been an obscure internal implementation detail,
it wasn't even documented for git-receive-pack. Since it has exactly
one user let's rename it to --http-backend-info-refs, which is more
accurate and points the reader in the right direction. Let's also
cross-link this from the protocol v1 and v2 documentation.

I'm retaining a hidden --advertise-refs alias in case there's any
external users of this, and making both options hidden to the bash
completion (as with most other internal-only options).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
Ævar Arnfjörð Bjarmason
f234da8019 serve.[ch]: remove "serve_options", split up --advertise-refs code
The "advertise capabilities" mode of serve.c added in
ed10cb952d (serve: introduce git-serve, 2018-03-15) is only used by
the http-backend.c to call {upload,receive}-pack with the
--advertise-refs parameter. See 42526b478e (Add stateless RPC options
to upload-pack, receive-pack, 2009-10-30).

Let's just make cmd_upload_pack() take the two (v2) or three (v2)
parameters the the v2/v1 servicing functions need directly, and pass
those in via the function signature. The logic of whether daemon mode
is implied by the timeout belongs in the v1 function (only used
there).

Once we split up the "advertise v2 refs" from "serve v2 request" it
becomes clear that v2 never cared about those in combination. The only
time it mattered was for v1 to emit its ref advertisement, in that
case we wanted to emit the smart-http-only "no-done" capability.

Since we only do that in the --advertise-refs codepath let's just have
it set "do_done" itself in v1's upload_pack() just before send_ref(),
at that point --advertise-refs and --stateless-rpc in combination are
redundant (the only user is get_info_refs() in http-backend.c), so we
can just pass in --advertise-refs only.

Since we need to touch all the serve() and advertise_capabilities()
codepaths let's rename them to less clever and obvious names, it's
been suggested numerous times, the latest of which is [1]'s suggestion
for protocol_v2_serve_loop(). Let's go with that.

1. https://lore.kernel.org/git/CAFQ2z_NyGb8rju5CKzmo6KhZXD0Dp21u-BbyCb2aNxLEoSPRJw@mail.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:59:37 -07:00
Elijah Newren
81483fe613 Update error message and code comment
There were two locations in the code that referred to 'merge-recursive'
but which were also applicable to 'merge-ort'.  Update them to more
general wording.

Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-05 08:57:40 -07:00
Junio C Hamano
4dc964691f Merge branch 'rs/use-fspathhash'
Code simplification.

* rs/use-fspathhash:
  use fspathhash() everywhere
2021-08-04 13:28:54 -07:00
Junio C Hamano
5fef3b15db Merge branch 'pb/merge-autostash-more'
The local changes stashed by "git merge --autostash" were lost when
the merge failed in certain ways, which has been corrected.

* pb/merge-autostash-more:
  merge: apply autostash if merge strategy fails
  merge: apply autostash if fast-forward fails
  Documentation: define 'MERGE_AUTOSTASH'
  merge: add missing word "strategy" to a message
2021-08-04 13:28:54 -07:00
Junio C Hamano
506d2a354a Merge branch 'ds/commit-and-checkout-with-sparse-index'
"git checkout" and "git commit" learn to work without unnecessarily
expanding sparse indexes.

* ds/commit-and-checkout-with-sparse-index:
  unpack-trees: resolve sparse-directory/file conflicts
  t1092: document bad 'git checkout' behavior
  checkout: stop expanding sparse indexes
  sparse-index: recompute cache-tree
  commit: integrate with sparse-index
  p2000: compress repo names
  p2000: add 'git checkout -' test and decrease depth
2021-08-04 13:28:53 -07:00
Junio C Hamano
31f9acf9ce Merge branch 'ah/plugleaks'
Leak plugging.

* ah/plugleaks:
  reset: clear_unpack_trees_porcelain to plug leak
  builtin/rebase: fix options.strategy memory lifecycle
  builtin/merge: free found_ref when done
  builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
  convert: release strbuf to avoid leak
  read-cache: call diff_setup_done to avoid leak
  ref-filter: also free head for ATOM_HEAD to avoid leak
  diffcore-rename: move old_dir/new_dir definition to plug leak
  builtin/for-each-repo: remove unnecessary argv copy to plug leak
  builtin/submodule--helper: release unused strbuf to avoid leak
  environment: move strbuf into block to plug leak
  fmt-merge-msg: free newly allocated temporary strings when done
2021-08-04 13:28:52 -07:00
Junio C Hamano
10f57e0eb9 Merge branch 'ar/submodule-add'
Rewrite of "git submodule" in C continues.

* ar/submodule-add:
  submodule: drop unused sm_name parameter from show_fetch_remotes()
  submodule--helper: introduce add-clone subcommand
  submodule--helper: refactor module_clone()
  submodule: prefix die messages with 'fatal'
  t7400: test failure to add submodule in tracked path
2021-08-04 13:28:52 -07:00
René Scharfe
7431842325 use fspathhash() everywhere
cf2dc1c238 (speed up alt_odb_usable() with many alternates, 2021-07-07)
introduced the function fspathhash() for calculating path hashes while
respecting the configuration option core.ignorecase.  Call it instead of
open-coding it; the resulting code is shorter and less repetitive.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 12:14:27 -07:00
Derrick Stolee
42f8ed6ca2 add: remove ensure_full_index() with --renormalize
The --renormalize option updates the EOL conversions for the tracked
files. However, the loop already ignores files marked with the
SKIP_WORKTREE bit, so it will continue to do so with a sparse index
because the sparse directory entries also have this bit set.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-29 12:36:34 -07:00
Derrick Stolee
939fa07582 add: ignore outside the sparse-checkout in refresh()
Since b243012 (refresh_index(): add flag to ignore SKIP_WORKTREE
entries, 2021-04-08), 'git add --refresh <path>' will output a warning
message when the path is outside the sparse-checkout definition. The
implementation of this warning happened in parallel with the
sparse-index work to add ensure_full_index() calls throughout the
codebase.

Update this loop to have the proper logic that checks to see if the
pathspec is outside the sparse-checkout definition. This avoids the need
to expand the sparse directory entry and determine if the path is
tracked, untracked, or ignored. We simply avoid updating the stat()
information because there isn't even an entry that matches the path!

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-29 12:36:34 -07:00
Derrick Stolee
5e7cbab196 add: allow operating on a sparse-only index
Disable command_requires_full_index for 'git add'. This does not require
any additional removals of ensure_full_index(). The main reason is that
'git add' discovers changes based on the pathspec and the worktree
itself. These are then inserted into the index directly, and calls to
index_name_pos() or index_file_exists() already call expand_to_path() at
the appropriate time to support a sparse-index.

Add a test to check that 'git add -A' and 'git add <file>' does not
expand the index at all, as long as <file> is not within a sparse
directory. This does not help the global 'git add .' case.

We can measure the improvement using p2000-sparse-operations.sh with
these results:

Test                                  HEAD~1           HEAD
------------------------------------------------------------------------------
2000.6: git add -A (full-index-v3)    0.35(0.30+0.05)  0.37(0.29+0.06) +5.7%
2000.7: git add -A (full-index-v4)    0.31(0.26+0.06)  0.33(0.27+0.06) +6.5%
2000.8: git add -A (sparse-index-v3)  0.57(0.53+0.07)  0.05(0.04+0.08) -91.2%
2000.9: git add -A (sparse-index-v4)  0.58(0.55+0.06)  0.05(0.05+0.06) -91.4%

While the 91% improvement seems impressive, it's important to recognize
that previously we had significant overhead for expanding the
sparse-index. Comparing to the full index case, 'git add -A' goes from
0.37s to 0.05s, which is "only" an 86% improvement.

This modification to 'git add' creates some behavior change depending on
the use of a sparse index. We modify a test in t1092 to demonstrate
these changes which will be remedied in future changes.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-29 12:36:34 -07:00
Junio C Hamano
7f554a4f69 Merge branch 'tb/reverse-midx'
The code that gives an error message in "git multi-pack-index" when
no subcommand is given tried to print a NULL pointer as a strong,
which has been corrected.

* tb/reverse-midx:
  multi-pack-index: fix potential segfault without sub-command
2021-07-28 13:18:04 -07:00
Junio C Hamano
b271a3034f Merge branch 'ds/status-with-sparse-index'
"git status" codepath learned to work with sparsely populated index
without hydrating it fully.

* ds/status-with-sparse-index:
  t1092: document bad sparse-checkout behavior
  fsmonitor: integrate with sparse index
  wt-status: expand added sparse directory entries
  status: use sparse-index throughout
  status: skip sparse-checkout percentage with sparse-index
  diff-lib: handle index diffs with sparse dirs
  dir.c: accept a directory as part of cone-mode patterns
  unpack-trees: unpack sparse directory entries
  unpack-trees: rename unpack_nondirectories()
  unpack-trees: compare sparse directories correctly
  unpack-trees: preserve cache_bottom
  t1092: add tests for status/add and sparse files
  t1092: expand repository data shape
  t1092: replace incorrect 'echo' with 'cat'
  sparse-index: include EXTENDED flag when expanding
  sparse-index: skip indexes with unmerged entries
2021-07-28 13:18:02 -07:00
Junio C Hamano
6ca224f156 Merge branch 'dl/diff-merge-base'
"git diff --merge-base" documentation has been updated.

* dl/diff-merge-base:
  git-diff: fix missing --merge-base docs
2021-07-28 13:17:59 -07:00
Junio C Hamano
dd6d3c90ee Merge branch 'ab/attribute-format'
Many "printf"-like helper functions we have have been annotated
with __attribute__() to catch placeholder/parameter mismatches.

* ab/attribute-format:
  advice.h: add missing __attribute__((format)) & fix usage
  *.h: add a few missing __attribute__((format))
  *.c static functions: add missing __attribute__((format))
  sequencer.c: move static function to avoid forward decl
  *.c static functions: don't forward-declare __attribute__
2021-07-28 13:17:59 -07:00
Junio C Hamano
c9d6d8a193 Merge branch 'jk/log-decorate-optim'
Optimize "git log" for cases where we wasted cycles to load ref
decoration data that may not be needed.

* jk/log-decorate-optim:
  load_ref_decorations(): fix decoration with tags
  add_ref_decoration(): rename s/type/deco_type/
  load_ref_decorations(): avoid parsing non-tag objects
  object.h: add lookup_object_by_type() function
  object.h: expand docstring for lookup_unknown_object()
  log: avoid loading decorations for userformats that don't need it
  pretty.h: update and expand docstring for userformat_find_requirements()
2021-07-28 13:17:58 -07:00
Junio C Hamano
01369fdfd3 Merge branch 'sm/worktree-add-lock'
"git worktree add --lock" learned to record why the worktree is
locked with a custom message.

* sm/worktree-add-lock:
  worktree: teach `add` to accept --reason <string> with --lock
  worktree: mark lock strings with `_()` for translation
  t2400: clean up '"add" worktree with lock' test
2021-07-28 13:17:58 -07:00
Junio C Hamano
14793a4e37 Merge branch 'hj/commit-allow-empty-message'
"git commit --allow-empty-message" won't abort the operation upon
an empty message, but the hint shown in the editor said otherwise.

* hj/commit-allow-empty-message:
  commit: remove irrelavent prompt on `--allow-empty-message`
  commit: reorganise commit hint strings
2021-07-28 13:17:57 -07:00
Andrzej Hunt
b54cf3a766 builtin/rebase: fix options.strategy memory lifecycle
- cmd_rebase populates rebase_options.strategy with newly allocated
  strings, hence we need to free those strings at the end of cmd_rebase
  to avoid a leak.
- In some cases: get_replay_opts() is called, which prepares replay_opts
  using data from rebase_options. We used to simply copy the pointer
  from rebase_options.strategy,  however that would now result in a
  double-free because sequencer_remove_state() is eventually used to
  free replay_opts.strategy. To avoid this we xstrdup() strategy when
  adding it to replay_opts.

The original leak happens because we always populate
rebase_options.strategy, but we don't always enter the path that calls
get_replay_opts() and later sequencer_remove_state() - in  other words
we'd always allocate a new string into rebase_options.strategy but
only sometimes did we free it. We now make sure that rebase_options
and replay_opts both own their own copies of strategy, and each copy
is free'd independently.

This was first seen when running t0021 with LSAN, but t2012 helped catch
the fact that we can't just free(options.strategy) at the end of
cmd_rebase (as that can cause a double-free). LSAN output from t0021:

LSAN output from t0021:

Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xa71eb8 in xstrdup wrapper.c:29:14
    #2 0x61b1cc in cmd_rebase builtin/rebase.c:1779:22
    #3 0x4ce83e in run_builtin git.c:475:11
    #4 0x4ccafe in handle_builtin git.c:729:3
    #5 0x4cb01c in run_argv git.c:818:4
    #6 0x4cb01c in cmd_main git.c:949:19
    #7 0x6b3fad in main common-main.c:52:11
    #8 0x7f267b512349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:21 -07:00
Andrzej Hunt
8c05e42c7a builtin/merge: free found_ref when done
merge_name() calls dwim_ref(), which allocates a new string into
found_ref. Therefore add a free() to avoid leaking found_ref.

LSAN output from t0021:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x486804 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xa8beb8 in xstrdup wrapper.c:29:14
    #2 0x954054 in expand_ref refs.c:671:12
    #3 0x953cb6 in repo_dwim_ref refs.c:644:22
    #4 0x5d3759 in dwim_ref refs.h:162:9
    #5 0x5d3759 in merge_name builtin/merge.c:517:6
    #6 0x5d3759 in collect_parents builtin/merge.c:1214:5
    #7 0x5cf60d in cmd_merge builtin/merge.c:1458:16
    #8 0x4ce83e in run_builtin git.c:475:11
    #9 0x4ccafe in handle_builtin git.c:729:3
    #10 0x4cb01c in run_argv git.c:818:4
    #11 0x4cb01c in cmd_main git.c:949:19
    #12 0x6bdbfd in main common-main.c:52:11
    #13 0x7f0430502349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 16 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
Andrzej Hunt
ed3c566d97 builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
These leaks all happen at the end of cmd_mv, hence don't matter in any
way. But we still fix the easy ones and squash the rest to get us closer
to being able to run tests without leaks.

LSAN output from t0050:

Direct leak of 384 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa8c015 in xrealloc wrapper.c:126:8
    #2 0xa0a7e1 in add_entry string-list.c:44:2
    #3 0xa0a7e1 in string_list_insert string-list.c:58:14
    #4 0x5dac03 in cmd_mv builtin/mv.c:248:4
    #5 0x4ce83e in run_builtin git.c:475:11
    #6 0x4ccafe in handle_builtin git.c:729:3
    #7 0x4cb01c in run_argv git.c:818:4
    #8 0x4cb01c in cmd_main git.c:949:19
    #9 0x6bd9ad in main common-main.c:52:11
    #10 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0xa8bd09 in do_xmalloc wrapper.c:41:8
    #2 0x5dbc34 in internal_prefix_pathspec builtin/mv.c:32:2
    #3 0x5da575 in cmd_mv builtin/mv.c:158:14
    #4 0x4ce83e in run_builtin git.c:475:11
    #5 0x4ccafe in handle_builtin git.c:729:3
    #6 0x4cb01c in run_argv git.c:818:4
    #7 0x4cb01c in cmd_main git.c:949:19
    #8 0x6bd9ad in main common-main.c:52:11
    #9 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a82d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0xa8bd09 in do_xmalloc wrapper.c:41:8
    #2 0x5dbc34 in internal_prefix_pathspec builtin/mv.c:32:2
    #3 0x5da4e4 in cmd_mv builtin/mv.c:148:11
    #4 0x4ce83e in run_builtin git.c:475:11
    #5 0x4ccafe in handle_builtin git.c:729:3
    #6 0x4cb01c in run_argv git.c:818:4
    #7 0x4cb01c in cmd_main git.c:949:19
    #8 0x6bd9ad in main common-main.c:52:11
    #9 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x49a9a2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0xa8c119 in xcalloc wrapper.c:140:8
    #2 0x5da585 in cmd_mv builtin/mv.c:159:22
    #3 0x4ce83e in run_builtin git.c:475:11
    #4 0x4ccafe in handle_builtin git.c:729:3
    #5 0x4cb01c in run_argv git.c:818:4
    #6 0x4cb01c in cmd_main git.c:949:19
    #7 0x6bd9ad in main common-main.c:52:11
    #8 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x49a9a2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0xa8c119 in xcalloc wrapper.c:140:8
    #2 0x5da4f8 in cmd_mv builtin/mv.c:149:10
    #3 0x4ce83e in run_builtin git.c:475:11
    #4 0x4ccafe in handle_builtin git.c:729:3
    #5 0x4cb01c in run_argv git.c:818:4
    #6 0x4cb01c in cmd_main git.c:949:19
    #7 0x6bd9ad in main common-main.c:52:11
    #8 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa8c015 in xrealloc wrapper.c:126:8
    #2 0xa00226 in strbuf_grow strbuf.c:98:2
    #3 0xa00226 in strbuf_vaddf strbuf.c:394:3
    #4 0xa065c7 in xstrvfmt strbuf.c:981:2
    #5 0xa065c7 in xstrfmt strbuf.c:991:8
    #6 0x9e7ce7 in prefix_path_gently setup.c:115:15
    #7 0x9e7fa6 in prefix_path setup.c:128:12
    #8 0x5dbdbf in internal_prefix_pathspec builtin/mv.c:55:23
    #9 0x5da575 in cmd_mv builtin/mv.c:158:14
    #10 0x4ce83e in run_builtin git.c:475:11
    #11 0x4ccafe in handle_builtin git.c:729:3
    #12 0x4cb01c in run_argv git.c:818:4
    #13 0x4cb01c in cmd_main git.c:949:19
    #14 0x6bd9ad in main common-main.c:52:11
    #15 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab49 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0xa8c015 in xrealloc wrapper.c:126:8
    #2 0xa00226 in strbuf_grow strbuf.c:98:2
    #3 0xa00226 in strbuf_vaddf strbuf.c:394:3
    #4 0xa065c7 in xstrvfmt strbuf.c:981:2
    #5 0xa065c7 in xstrfmt strbuf.c:991:8
    #6 0x9e7ce7 in prefix_path_gently setup.c:115:15
    #7 0x9e7fa6 in prefix_path setup.c:128:12
    #8 0x5dbdbf in internal_prefix_pathspec builtin/mv.c:55:23
    #9 0x5da4e4 in cmd_mv builtin/mv.c:148:11
    #10 0x4ce83e in run_builtin git.c:475:11
    #11 0x4ccafe in handle_builtin git.c:729:3
    #12 0x4cb01c in run_argv git.c:818:4
    #13 0x4cb01c in cmd_main git.c:949:19
    #14 0x6bd9ad in main common-main.c:52:11
    #15 0x7fbfeffc4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 558 byte(s) leaked in 7 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
Andrzej Hunt
2b2999460c builtin/for-each-repo: remove unnecessary argv copy to plug leak
cmd_for_each_repo() copies argv into args (a strvec), which is later
passed into run_command_on_repo(), which in turn copies that strvec onto
the end of child.args. The initial copy is unnecessary (we never modify
args). We therefore choose to just pass argv directly into
run_command_on_repo(), which lets us avoid the copy and fixes the leak.

LSAN output from t0068:

Direct leak of 192 byte(s) in 1 object(s) allocated from:
    #0 0x7f63bd4ab8b0 in realloc (/usr/lib64/libasan.so.4+0xdc8b0)
    #1 0x98d7e6 in xrealloc wrapper.c:126
    #2 0x916914 in strvec_push_nodup strvec.c:19
    #3 0x916a6e in strvec_push strvec.c:26
    #4 0x4be4eb in cmd_for_each_repo builtin/for-each-repo.c:49
    #5 0x410dcd in run_builtin git.c:475
    #6 0x410dcd in handle_builtin git.c:729
    #7 0x414087 in run_argv git.c:818
    #8 0x414087 in cmd_main git.c:949
    #9 0x40e9ec in main common-main.c:52
    #10 0x7f63bc9fa349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 22 byte(s) in 2 object(s) allocated from:
    #0 0x7f63bd445e30 in __interceptor_strdup (/usr/lib64/libasan.so.4+0x76e30)
    #1 0x98d698 in xstrdup wrapper.c:29
    #2 0x916a63 in strvec_push strvec.c:26
    #3 0x4be4eb in cmd_for_each_repo builtin/for-each-repo.c:49
    #4 0x410dcd in run_builtin git.c:475
    #5 0x410dcd in handle_builtin git.c:729
    #6 0x414087 in run_argv git.c:818
    #7 0x414087 in cmd_main git.c:949
    #8 0x40e9ec in main common-main.c:52
    #9 0x7f63bc9fa349 in __libc_start_main (/lib64/libc.so.6+0x24349)

See also discussion about the original implementation below - this code
appears to have evolved from a callback explaining the double-strvec-copy
pattern, but there's no strong reason to keep that now:
  https://lore.kernel.org/git/68bbeca5-314b-08ee-ef36-040e3f3814e9@gmail.com/

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
Andrzej Hunt
edfc744918 builtin/submodule--helper: release unused strbuf to avoid leak
relative_url() populates sb. In the normal return path, its buffer is
detached using strbuf_detach(). However the early return path does
nothing with sb, which means that sb's memory is leaked - therefore
we add a release to avoid this leak.

The reset is also only necessary for the normal return path, hence we
move it down to after the early-return to avoid unnecessary work.

LSAN output from t0060:

Direct leak of 121 byte(s) in 1 object(s) allocated from:
    #0 0x7f31246f28b0 in realloc (/usr/lib64/libasan.so.4+0xdc8b0)
    #1 0x98d7d6 in xrealloc wrapper.c:126
    #2 0x909a60 in strbuf_grow strbuf.c:98
    #3 0x90bf00 in strbuf_vaddf strbuf.c:401
    #4 0x90c321 in strbuf_addf strbuf.c:335
    #5 0x5cb78d in relative_url builtin/submodule--helper.c:182
    #6 0x5cbe46 in resolve_relative_url_test builtin/submodule--helper.c:248
    #7 0x410dcd in run_builtin git.c:475
    #8 0x410dcd in handle_builtin git.c:729
    #9 0x414087 in run_argv git.c:818
    #10 0x414087 in cmd_main git.c:949
    #11 0x40e9ec in main common-main.c:52
    #12 0x7f3123c41349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 121 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:19:20 -07:00
Johannes Schindelin
a03b097d63 Use a better name for the function interpolating paths
It is not immediately clear what `expand_user_path()` means, so let's
rename it to `interpolate_path()`. This also opens the path for
interpolating more than just a home directory.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:17:16 -07:00
Jeff King
bbe3165f82 submodule: drop unused sm_name parameter from show_fetch_remotes()
This parameter has not been used since the function was introduced in
8c8195e9c3 (submodule--helper: introduce add-clone subcommand,
2021-07-10).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:03:44 -07:00
ZheNing Hu
e85fcb355a ref-filter: use non-const ref_format in *_atom_parser()
Use non-const ref_format in *_atom_parser(), which can help us
modify the members of ref_format in *_atom_parser().

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-26 12:01:26 -07:00
Philippe Blain
e082631e51 merge: apply autostash if merge strategy fails
Since 'git merge' learned '--autostash' in a03b55530a (merge: teach
--autostash option, 2020-04-07), 'cmd_merge', once it is determined that
we have to create a merge commit, calls 'create_autostash' if
'--autostash' is given.

As explained in a03b55530a, and made more abvious by the tests added in
that commit, the autostash is then applied if the merge succeeds, either
directly or by committing (after conflict resolution or if '--no-commit'
was given), or if the merge is aborted with 'git merge --abort'. In some
other cases, like the user calling 'git reset --merge' or 'git merge
--quit', the autostash is not applied, but saved in the stash list.

However, there exists a scenario that creates an autostash but does not
apply nor save it to the stash list: if the chosen merge strategy
completely fails to handle the merge, i.e. 'try_merge_strategy' returns
2.

Apply the autostash in that case also. An easy way to test that is to
try to merge more than two commits but explicitely ask for the 'recursive'
merge strategy.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 15:45:40 -07:00
Philippe Blain
12510bd5da merge: apply autostash if fast-forward fails
Since 'git merge' learned '--autostash' in a03b55530a (merge: teach
--autostash option, 2020-04-07), 'cmd_merge', in the fast-forward case,
calls 'create_autostash' before calling 'checkout_fast_forward' if
'--autostash' is given.

However, if 'checkout_fast_forward' fails, the autostash is not applied
to the working tree, nor saved in the stash list, since the code simply
calls 'goto done'.

Be more helpful to the user by applying the autostash in that case.

An easy way to test a failing fast-forward is when we are merging a
branch that has a tracked file that conflicts with an untracked file in
the working tree.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 15:45:38 -07:00
Philippe Blain
9938f30d13 merge: add missing word "strategy" to a message
The variable 'best_strategy' holds the name of the merge strategy that
resulted in fewer conflicts, if several strategies were tried. When
that's the case but the best strategy was not the first one tried, we
inform the user which strategy was the "best" one before recreating the
merge and leaving the conflicted files in the tree.

This informational message is missing the word "strategy", so it shows
something like:

    Using the recursive to prepare resolving by hand.

Fix that.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-23 15:45:33 -07:00
Junio C Hamano
fe3fec53a6 Merge branch 'bc/rev-list-without-commit-line'
"git rev-list" learns to omit the "commit <object-name>" header
lines from the output with the `--no-commit-header` option.

* bc/rev-list-without-commit-line:
  rev-list: add option for --pretty=format without header
2021-07-22 13:05:56 -07:00
Elijah Newren
6f843a3355 pull: fix handling of multiple heads
With multiple heads, we should not allow rebasing or fast-forwarding.
Make sure any fast-forward request calls out specifically the fact that
multiple branches are in play.  Also, since we cannot fast-forward to
multiple branches, fix our computation of can_ff.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:30 -07:00
Elijah Newren
359ff69389 pull: update docs & code for option compatibility with rebasing
git-pull.txt includes merge-options.txt, which is written assuming
merges will happen.  git-pull has allowed rebases for many years; update
the documentation to reflect that.

While at it, pass any `--signoff` flag through to the rebase backend too
so that we don't have to document it as merge-specific.  Rebase has
supported the --signoff flag for years now as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:30 -07:00
Elijah Newren
031e2f7ae1 pull: abort by default when fast-forwarding is not possible
We have for some time shown a long warning when the user does not
specify how to reconcile divergent branches with git pull.  Make it an
error now.

Initial-patch-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:29 -07:00
Elijah Newren
adc27d6a93 pull: make --rebase and --no-rebase override pull.ff=only
Fix the last few precedence tests failing in t7601 by now implementing
the logic to have --[no-]rebase override a pull.ff=only config setting.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:29 -07:00
Elijah Newren
e4dc25ed49 pull: since --ff-only overrides, handle it first
There are both merge and rebase branches in the logic, and previously
both had to handle fast-forwarding.  Merge handled that implicitly
(because git merge handles it directly), while in rebase it was
explicit.  Given that the --ff-only flag is meant to override any
--rebase or --no-rebase, make the code reflect that by handling
--ff-only before the merge-vs-rebase logic.

It turns out that this also fixes a bug for submodules.  Previously,
when --ff-only was given, the code would run `merge --ff-only` on the
main module, and then run `submodule update --recursive --rebase` on the
submodules.  With this change, we still run `merge --ff-only` on the
main module, but now run `submodule update --recursive --checkout` on
the submodules.  I believe this better reflects the intent of --ff-only
to have it apply to both the main module and the submodules.

(Sidenote: It is somewhat interesting that all merges pass `--checkout`
to submodule update, even when `--no-ff` is specified, meaning that it
will only do fast-forward merges for submodules.  This was discussed in
commit a6d7eb2c7a ("pull: optionally rebase submodules (remote submodule
changes only)", 2017-06-23).  The same limitations apply now as then, so
we are not trying to fix this at this time.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22 11:54:29 -07:00
Alex Henrie
3d5fc24dae pull: abort if --ff-only is given and fast-forwarding is impossible
The warning about pulling without specifying how to reconcile divergent
branches says that after setting pull.rebase to true, --ff-only can
still be passed on the command line to require a fast-forward. Make that
actually work.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
[en: updated tests; note 3 fixes and 1 new failure]
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20 21:43:12 -07:00
Taylor Blau
88617d11f9 multi-pack-index: fix potential segfault without sub-command
Since cd57bc41bb (builtin/multi-pack-index.c: display usage on
unrecognized command, 2021-03-30) we have used a "usage" label to avoid
having two separate callers of usage_with_options (one when no arguments
are given, and another for unrecognized sub-commands).

But the first caller has been broken since cd57bc41bb, since it will
happily jump to usage without arguments, and then pass argv[0] to the
"unrecognized subcommand" error.

Many compilers will save us from a segfault here, but the end result is
ugly, since it mentions an unrecognized subcommand when we didn't even
pass one, and (on GCC) includes "(null)" in its output.

Move the "usage" label down past the error about unrecognized
subcommands so that it is only triggered when it should be. While we're
at it, bulk up our test coverage in this area, too.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-19 15:24:01 -07:00
Junio C Hamano
bd4232fac3 Merge branch 'ab/struct-init'
Code cleanup around struct_type_init() functions.

* ab/struct-init:
  string-list.h users: change to use *_{nodup,dup}()
  string-list.[ch]: add a string_list_init_{nodup,dup}()
  dir.[ch]: replace dir_init() with DIR_INIT
  *.c *_init(): define in terms of corresponding *_INIT macro
  *.h: move some *_INIT to designated initializers
2021-07-16 17:42:53 -07:00
Junio C Hamano
8eb90d385c Merge branch 'ar/help-micro-cleanup'
Tiny code clean-up.

* ar/help-micro-cleanup:
  help: convert git_cmd to page in one place
2021-07-16 17:42:51 -07:00
Junio C Hamano
f90efd9981 Merge branch 'ar/submodule-helper-include-cleanup'
Code clean-up.

* ar/submodule-helper-include-cleanup:
  submodule--helper: remove redundant include
2021-07-16 17:42:51 -07:00
Junio C Hamano
cdeabf513a Merge branch 'ab/bundle-updates'
Code clean-up and leak plugging in "git bundle".

* ab/bundle-updates:
  bundle: remove "ref_list" in favor of string-list.c API
  bundle.c: use a temporary variable for OIDs and names
  bundle cmd: stop leaking memory from parse_options_cmd_bundle()
2021-07-16 17:42:49 -07:00
Junio C Hamano
1fb3445658 Merge branch 'ab/show-branch-tests'
Fill test gaps.

* ab/show-branch-tests:
  show-branch tests: add missing tests
  show-branch: don't <COLOR></RESET> for space characters
  show-branch tests: modernize test code
  show-branch tests: rename the one "show-branch" test file
2021-07-16 17:42:48 -07:00
Junio C Hamano
b2fc822629 Merge branch 'ab/fetch-negotiate-segv-fix'
Code recently added to support common ancestry negotiation during
"git push" did not sanity check its arguments carefully enough.

* ab/fetch-negotiate-segv-fix:
  fetch: fix segfault in --negotiate-only without --negotiation-tip=*
  fetch: document the --negotiate-only option
  send-pack.c: move "no refs in common" abort earlier
2021-07-16 17:42:48 -07:00
Stephen Manz
0db4961c49 worktree: teach add to accept --reason <string> with --lock
The default reason stored in the lock file, "added with --lock",
is unlikely to be what the user would have given in a separate
`git worktree lock` command. Allowing `--reason` to be specified
along with `--lock` when adding a working tree gives the user control
over the reason for locking without needing a second command.

Signed-off-by: Stephen Manz <smanz@alum.mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 13:30:59 -07:00
Jonathan Tan
82823118b9 fetch: die on invalid --negotiation-tip hash
If a full hexadecimal hash is given as a --negotiation-tip to "git
fetch", and that hash does not correspond to an object, "git fetch" will
segfault if --negotiate-only is given and will silently ignore that hash
otherwise. Make these cases fatal errors, just like the case when an
invalid ref name or abbreviated hash is given.

While at it, mark the error messages as translatable.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 11:58:52 -07:00
Jonathan Tan
74fab8ff54 send-pack: fix push.negotiate with remote helper
Commit 477673d6f3 ("send-pack: support push negotiation", 2021-05-05)
introduced the push.negotiate config variable and included a test. The
test only covered pushing without a remote helper, so the fact that
pushing with a remote helper doesn't work went unnoticed.

This is ultimately caused by the "url" field not being set in the args
struct. This field being unset probably went unnoticed because besides
push negotiation, this field is only used to generate a "pushee" line in
a push cert (and if not given, no such line is generated). Therefore,
set this field.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15 11:58:52 -07:00
Derrick Stolee
1ba5f45132 checkout: stop expanding sparse indexes
Previous changes did the necessary improvements to unpack-trees.c and
diff-lib.c in order to modify a sparse index based on its comparision
with a tree. The only remaining work is to remove some
ensure_full_index() calls and add tests that verify that the index is
not expanded in our interesting cases. Include 'switch' and 'restore' in
these tests, as they share a base implementation with 'checkout'.

Here are the relevant performance results from
p2000-sparse-operations.sh:

Test                                     HEAD~1           HEAD
--------------------------------------------------------------------------------
2000.18: git checkout -f - (full-v3)     0.49(0.43+0.03)  0.47(0.39+0.05) -4.1%
2000.19: git checkout -f - (full-v4)     0.45(0.37+0.06)  0.42(0.37+0.05) -6.7%
2000.20: git checkout -f - (sparse-v3)   0.76(0.71+0.07)  0.04(0.03+0.04) -94.7%
2000.21: git checkout -f - (sparse-v4)   0.75(0.72+0.04)  0.05(0.06+0.04) -93.3%

It is important to compare the full index case to the sparse index case,
as the previous results for the sparse index were inflated by the index
expansion. For index v4, this is an 88% improvement.

On an internal repository with over two million paths at HEAD and a
sparse-checkout definition containing ~60,000 of those paths, 'git
checkout' went from 3.5s to 297ms with this change. The theoretical
optimum where only those ~60,000 paths exist was 275ms, so the extra
sparse directory entries contribute a 22ms overhead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:05:53 -07:00
Derrick Stolee
daa1acefc5 commit: integrate with sparse-index
Update 'git commit' to allow using the sparse-index in memory without
expanding to a full one. The only place that had an ensure_full_index()
call was in cache_tree_update(). The recursive algorithm for
update_one() was already updated in 2de37c536 (cache-tree: integrate
with sparse directory entries, 2021-03-03) to handle sparse directory
entries in the index.

Most of this change involves testing different command-line options that
allow specifying which on-disk changes should be included in the commit.
This includes no options (only take currently-staged changes), -a (take
all tracked changes), and --include (take a list of specific changes).
To simplify testing that these options do not expand the index, update
the test that previously verified that 'git status' does not expand the
index with a helper method, ensure_not_expanded().

This allows 'git commit' to operate much faster when the sparse-checkout
cone is much smaller than the full list of files at HEAD.

Here are the relevant lines from p2000-sparse-operations.sh:

Test                                      HEAD~1           HEAD
----------------------------------------------------------------------------------
2000.14: git commit -a -m A (full-v3)     0.35(0.26+0.06)  0.36(0.28+0.07) +2.9%
2000.15: git commit -a -m A (full-v4)     0.32(0.26+0.05)  0.34(0.28+0.06) +6.3%
2000.16: git commit -a -m A (sparse-v3)   0.63(0.59+0.06)  0.04(0.05+0.05) -93.7%
2000.17: git commit -a -m A (sparse-v4)   0.64(0.59+0.08)  0.04(0.04+0.04) -93.8%

It is important to compare the full-index case to the sparse-index case,
so the improvement for index version v4 is actually an 88% improvement in
this synthetic example.

In a real repository with over two million files at HEAD and 60,000
files in the sparse-checkout definition, the time for 'git commit -a'
went from 2.61 seconds to 134ms. I compared this to the result if the
index only contained the paths in the sparse-checkout definition and
found the theoretical optimum to be 120ms, so the out-of-cone paths only
add a 12% overhead.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 15:05:53 -07:00
Derrick Stolee
d76723ee53 status: use sparse-index throughout
By testing 'git -c core.fsmonitor= status -uno', we can check for the
simplest index operations that can be made sparse-aware. The necessary
implementation details are already integrated with sparse-checkout, so
modify command_requires_full_index to be zero for cmd_status().

In refresh_index(), we loop through the index entries to refresh their
stat() information. However, sparse directories have no stat()
information to populate. Ignore these entries.

This allows 'git status' to no longer expand a sparse index to a full
one. This is further tested by dropping the "-uno" option and adding an
untracked file into the worktree.

The performance test p2000-sparse-checkout-operations.sh demonstrates
these improvements:

Test                                  HEAD~1           HEAD
-----------------------------------------------------------------------------
2000.2: git status (full-index-v3)    0.31(0.30+0.05)  0.31(0.29+0.06) +0.0%
2000.3: git status (full-index-v4)    0.31(0.29+0.07)  0.34(0.30+0.08) +9.7%
2000.4: git status (sparse-index-v3)  2.35(2.28+0.10)  0.04(0.04+0.05) -98.3%
2000.5: git status (sparse-index-v4)  2.35(2.24+0.15)  0.05(0.04+0.06) -97.9%

Note that since HEAD~1 was expanding the sparse index by parsing trees,
it was artificially slower than the full index case. Thus, the 98%
improvement is misleading, and instead we should celebrate the 0.34s to
0.05s improvement of 85%. This is more indicative of the peformance
gains we are expecting by using a sparse index.

Note: we are dropping the assignment of core.fsmonitor here. This is not
necessary for the test script as we are not altering the config any
other way. Correct integration with FS Monitor will be validated in
later changes.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 13:42:49 -07:00
Stephen Manz
f7c35ea2a1 worktree: mark lock strings with _() for translation
- default lock string, "added with --lock"
- temporary lock string, "initializing"

Signed-off-by: Stephen Manz <smanz@alum.mit.edu>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-14 09:29:59 -07:00
Junio C Hamano
4da281e84d Merge branch 'ab/pickaxe-pcre2'
Rewrite the backend for "diff -G/-S" to use pcre2 engine when
available.

* ab/pickaxe-pcre2: (22 commits)
  xdiff-interface: replace discard_hunk_line() with a flag
  xdiff users: use designated initializers for out_line
  pickaxe -G: don't special-case create/delete
  pickaxe -G: terminate early on matching lines
  xdiff-interface: allow early return from xdiff_emit_line_fn
  xdiff-interface: prepare for allowing early return
  pickaxe -S: slightly optimize contains()
  pickaxe: rename variables in has_changes() for brevity
  pickaxe -S: support content with NULs under --pickaxe-regex
  pickaxe: assert that we must have a needle under -G or -S
  pickaxe: refactor function selection in diffcore-pickaxe()
  perf: add performance test for pickaxe
  pickaxe/style: consolidate declarations and assignments
  diff.h: move pickaxe fields together again
  pickaxe: die when --find-object and --pickaxe-all are combined
  pickaxe: die when -G and --pickaxe-regex are combined
  pickaxe tests: add missing test for --no-pickaxe-regex being an error
  pickaxe tests: test for -G, -S and --find-object incompatibility
  pickaxe tests: add test for "log -S" not being a regex
  pickaxe tests: add test for diffgrep_consume() internals
  ...
2021-07-13 16:52:50 -07:00
Junio C Hamano
0659866a09 Merge branch 'fc/push-simple-updates-cleanup'
Some more code and doc clarification around "git push".

* fc/push-simple-updates-cleanup:
  push: don't get a full remote object
  push: only check same_remote when needed
  push: remove trivial function
  push: remove redundant check
  push: factor out the typical case
  push: get rid of all the setup_push_* functions
  push: trivial simplifications
  push: make setup_push_* return the dst
  push: only get the branch when needed
  push: factor out null branch check
  push: split switch cases
  push: return immediately in trivial switch case
  push: create new get_upstream_ref() helper
2021-07-13 16:52:50 -07:00
Junio C Hamano
07e230d762 Merge branch 'fc/push-simple-updates'
Some code and doc clarification around "git push".

* fc/push-simple-updates:
  doc: push: explain default=simple correctly
  push: remove unused code in setup_push_upstream()
  push: simplify setup_push_simple()
  push: reorganize setup_push_simple()
  push: copy code to setup_push_simple()
  push: hedge code of default=simple
  push: rename !triangular to same_remote
2021-07-13 16:52:49 -07:00
Junio C Hamano
5d96bcbc06 Merge branch 'zh/cat-file-batch-fix'
"git cat-file --batch-all-objects"" misbehaved when "--batch" is in
use and did not ask for certain object traits.

* zh/cat-file-batch-fix:
  cat-file: merge two block into one
  cat-file: handle trivial --batch format with --batch-all-objects
2021-07-13 16:52:49 -07:00
Ævar Arnfjörð Bjarmason
48ca53cac4 *.c static functions: add missing __attribute__((format))
Add missing __attribute__((format)) function attributes to various
"static" functions that take printf arguments.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13 15:20:20 -07:00
Denton Liu
eb448631fb git-diff: fix missing --merge-base docs
When `git diff --merge-base` was introduced at around Git 2.30, the
documentation included a few errors.

In the example given for `git diff --cached --merge-base`, the
`--cached` flag was omitted for the `--merge-base` example. Add the
missing flag.

In the `git diff <commit>` case, we failed to mention that
`--merge-base` is an available option. Give the usage of `--merge-base`
as an option there.

Finally, there are two errors in the usage of `git diff`. Firstly, we do
not mention `--merge-base` in the `git diff --cached` case. Mention it
so that it's consistent with the documentation. Secondly, we put the
`[--merge-base]` in between `<commit>` and `[<commit>...]`. Move the
`[--merge-base]` so that it's beside `[<options>]` which is a more
logical grouping.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 13:55:29 -07:00
Ævar Arnfjörð Bjarmason
103e02c700 *.c static functions: don't forward-declare __attribute__
9cf6d3357a (Add git-index-pack utility, 2005-10-12) and
466dbc42f5 (receive-pack: Send internal errors over side-band #2,
2010-02-10) we added these static functions and forward-declared their
__attribute__((printf)).

I think this may have been to work around some compiler limitation at
the time, but in any case we have a lot of code that uses the briefer
way of declaring these that I'm using here, so if we had any such
issues with compilers we'd have seen them already.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 12:09:53 -07:00
Atharva Raykar
8c8195e9c3 submodule--helper: introduce add-clone subcommand
Let's add a new "add-clone" subcommand to `git submodule--helper` with
the goal of converting part of the shell code in git-submodule.sh
related to `git submodule add` into C code. This new subcommand clones
the repository that is to be added, and checks out to the appropriate
branch.

This is meant to be a faithful conversion that leaves the behaviour of
'cmd_add()' script unchanged.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Shourya Shukla <periperidip@gmail.com>
Based-on-patch-by: Prathamesh Chavan <pc44800@gmail.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 12:06:21 -07:00
Atharva Raykar
a98b02c112 submodule--helper: refactor module_clone()
Separate out the core logic of module_clone() from the flag
parsing---this way we can call the equivalent of the `submodule--helper
clone` subcommand directly within C, without needing to push arguments
in a strvec.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 12:06:21 -07:00
brian m. carlson
d1c5ae78ce rev-list: add option for --pretty=format without header
In general, we encourage users to use plumbing commands, like git
rev-list, over porcelain commands, like git log, when scripting.
However, git rev-list has one glaring problem that prevents it from
being used in certain cases: when --pretty is used with a custom format,
it always prints out a line containing "commit" and the object ID.  This
makes it unsuitable for many scripting needs, and forces users to use
git log instead.

While we can't change this behavior for backwards compatibility, we can
add an option to suppress this behavior, so let's do so, and call it
"--no-commit-header".  Additionally, add the corresponding positive
option to switch it back on.

Note that this option doesn't affect the built-in formats, only custom
formats.  This is exactly the same behavior as users already have from
git log and is what most users will be used to.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-12 10:12:31 -07:00
Hu Jialun
6f70f00b4f commit: remove irrelavent prompt on --allow-empty-message
Even when the `--allow-empty-message` option is given, "git commit"
offers an interactive editor session with prefilled message that says
the commit will be aborted if the buffer is emptied, which is wrong.

Remove the "an empty message aborts" part from the message when the
option is given to fix it.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Hu Jialun <hujialun@comp.nus.edu.sg>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-09 12:08:18 -07:00
Hu Jialun
54ba2f1862 commit: reorganise commit hint strings
Strings of hint messages inserted into editor on interactive commit was
scattered in-line, rendering the code harder to understand at first
glance.

Extract those messages out into separate variables to make the code
outline easier to follow.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Hu Jialun <hujialun@comp.nus.edu.sg>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-09 12:04:51 -07:00
Ævar Arnfjörð Bjarmason
561fa03529 pack-objects: fix segfault in --stdin-packs option
Fix a segfault in the --stdin-packs option added in
339bce27f4 (builtin/pack-objects.c: add '--stdin-packs' option,
2021-02-22).

The read_packs_list_from_stdin() function didn't check that the lines
it was reading were valid packs, and thus when doing the QSORT() with
pack_mtime_cmp() we'd have a NULL "util" field. The "util" field is
used to associate the names of included/excluded packs with the
packed_git structs they correspond to.

The logic error was in assuming that we could iterate all packs and
annotate the excluded and included packs we got, as opposed to
checking the lines we got on stdin. There was a check for excluded
packs, but included packs were simply assumed to be valid.

As noted in the test we'll not report the first bad line, but whatever
line sorted first according to the string-list.c API. In this case I
think that's fine. There was further discussion of alternate
approaches in [1].

Even though we're being lazy let's assert the line we do expect to get
in the test, since whoever changes this code in the future might miss
this case, and would want to update the test and comments.

1. http://lore.kernel.org/git/YND3h2l10PlnSNGJ@nand.local

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-09 11:53:40 -07:00
Junio C Hamano
e867110340 Merge branch 'ab/cmd-foo-should-return'
Code clean-up.

* ab/cmd-foo-should-return:
  builtins + test helpers: use return instead of exit() in cmd_*
2021-07-08 13:15:04 -07:00
Junio C Hamano
221ec24e9b Merge branch 'fc/pull-cleanups'
Code cleanup.

* fc/pull-cleanups:
  pull: trivial whitespace style fix
  pull: trivial cleanup
  pull: cleanup autostash check
2021-07-08 13:15:00 -07:00
Junio C Hamano
a515f26eac Merge branch 'ar/typofix'
Typofixes.

* ar/typofix:
  *: fix typos which duplicate a word
2021-07-08 13:14:59 -07:00
Junio C Hamano
1d38852b11 Merge branch 'ah/uninitialized-reads-fix'
Make the codebase MSAN clean.

* ah/uninitialized-reads-fix:
  builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints
  split-index: use oideq instead of memcmp to compare object_id's
  bulk-checkin: make buffer reuse more obvious and safer
2021-07-08 13:14:59 -07:00
Ævar Arnfjörð Bjarmason
eff40457a4 fetch: fix segfault in --negotiate-only without --negotiation-tip=*
The recent --negotiate-only option would segfault in the call to
oid_array_for_each() in negotiate_using_fetch() unless one or more
--negotiation-tip=* options were provided.

All of the other tests for the feature combine both, but nothing was
checking this assumption, let's do that and add a test for it. Fixes a
bug in 9c1e657a8f (fetch: teach independent negotiation (no
packfile), 2021-05-04).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-08 08:20:16 -07:00
Atharva Raykar
6a24cc71ed submodule--helper: remove redundant include
"dir.h" should have been included only once.

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Shourya Shukla <periperidip@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 14:01:08 -07:00
Andrei Rybak
d5659f856f help: convert git_cmd to page in one place
Depending on the chosen format of help pages, git-help uses function
show_man_page, show_info_page, or show_html_page.  The first thing all
three functions do is to convert given `git_cmd` to a `page` using
function cmd_to_page.

Move the common part of these three functions to function cmd_help to
avoid code duplication.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Reviewed-by: Felipe Contreras <felipe.contreras@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 13:09:20 -07:00
Ævar Arnfjörð Bjarmason
10b635b773 bundle: remove "ref_list" in favor of string-list.c API
Move away from the "struct ref_list" in bundle.c in favor of the
almost identical string-list.c API.

That API fits this use-case perfectly, but did not exist in its
current form when this code was added in 2e0afafebd (Add git-bundle:
move objects and references by archive, 2007-02-22), with hindsight we
could have used the path-list API, which later got renamed to
string-list. See 8fd2cb4069 (Extract helper bits from
c-merge-recursive work, 2006-07-25)

We need to change "name" to "string" and "oid" to "util" to make this
conversion, but other than that the APIs are pretty much identical for
what bundle.c made use of.

Let's also replace the memset(..,0,...) pattern with a more idiomatic
"INIT" macro, and finally add a *_release() function so to free the
allocated memory.

Before this the add_to_ref_list() would leak memory, now e.g. "bundle
list-heads" reports no memory leaks at all under valgrind.

In the bundle_header_init() function we're using a clever trick to
memcpy() what we'd get from the corresponding
BUNDLE_HEADER_INIT. There is a concurrent series to make use of that
pattern more generally, see [1].

1. https://lore.kernel.org/git/cover-0.5-00000000000-20210701T104855Z-avarab@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:10:17 -07:00
Ævar Arnfjörð Bjarmason
db6bfb9fe8 bundle cmd: stop leaking memory from parse_options_cmd_bundle()
Fix a memory leak from the prefix_filename() function introduced with
its use in 3b754eedd5 (bundle: use prefix_filename with bundle path,
2017-03-20).

As noted in that commit the leak was intentional as a part of being
sloppy about freeing resources just before we exit, I'm changing this
because I'll be fixing other memory leaks in the bundle API (including
the library version) in subsequent commits. It's easier to reason
about those fixes if valgrind runs cleanly at the end without any
leaks whatsoever.

An earlier version of this change[1] went out of its way to not leak
memory on the die() codepaths here, but doing so will only avoid
reports of potential leaks under heap-only leak trackers such as
valgrind, not the SANITIZE=leak mode.

Avoiding those leaks as well might be useful to enable us to run
cleanly under the likes of valgrind in the future. But for now the
relative verbosity of the resulting code, and the fact that we don't
have some valgrind or SANITIZE=leak mode as part of our CI (it's only
run ad-hoc, see [2]), means we're not worrying about that for now.

1. https://lore.kernel.org/git/87v95vdxrc.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06 12:10:16 -07:00
Ævar Arnfjörð Bjarmason
ce93a4c612 dir.[ch]: replace dir_init() with DIR_INIT
Remove the dir_init() function and replace it with a DIR_INIT
macro. In many cases in the codebase we need to initialize things with
a function for good reasons, e.g. needing to call another function on
initialization. The "dir_init()" function was not one such case, and
could trivially be replaced with a more idiomatic macro initialization
pattern.

The only place where we made use of its use of memset() was in
dir_clear() itself, which resets the contents of an an existing struct
pointer. Let's use the new "memcpy() a 'blank' struct on the stack"
idiom to do that reset.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01 12:32:22 -07:00
Junio C Hamano
bff9703f0a Merge branch 'zh/cat-file-batch-fix' into zh/ref-filter-raw-data
* zh/cat-file-batch-fix:
  cat-file: merge two block into one
  cat-file: handle trivial --batch format with --batch-all-objects
2021-07-01 12:16:43 -07:00
Jeff King
b2086b5183 log: avoid loading decorations for userformats that don't need it
If no --decorate option is given, we default to auto-decoration. And
when that kicks in, cmd_log_init_finish() will unconditionally load the
decoration refs.

However, if we are using a user-format that does not include "%d" or
"%D", we won't show the decorations at all, so we don't need to load
them. We can detect this case and auto-disable them by adding a new
field to our userformat_want helper. We can do this even when the user
explicitly asked for --decorate, because it can't affect the output at
all.

This patch consistently reduces the time to run "git log -1 --format=%H"
on my git.git clone (with ~2k refs) from 34ms to 7ms. On a much more
extreme real-world repository (with ~220k refs), it goes from 2.5s to
4ms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 20:30:17 -07:00
Ævar Arnfjörð Bjarmason
4465690cd8 show-branch: don't <COLOR></RESET> for space characters
Change the colored output introduced in ab07ba2a24 (show-branch: color
the commit status signs, 2009-04-22) to not color and reset each
individual space character we use for padding. The intent is to color
just the "!", "+" etc. characters.

This makes the output easier to test, so let's do that now. The test
would be much more verbose without a color/reset for each space
character. Since the coloring cycles through colors we previously had
a "rainbow of space characters".

In theory this breaks things for anyone who's relying on the exact
colored output of show-branch, in practice I'd think anyone parsing it
isn't actively turning on the colored output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-28 09:33:06 -07:00
Felipe Contreras
a7d18a1109 pull: trivial whitespace style fix
Two spaces unaligned to anything is not part of the coding-style. A
single tab is.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 16:36:17 +09:00
Felipe Contreras
a751e0296f pull: trivial cleanup
There's no need to store ran_ff. Now it's obvious from the conditionals.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 16:36:17 +09:00
Felipe Contreras
340062243a pull: cleanup autostash check
Currently "git pull --rebase" takes a shortcut in the case a
fast-forward merge is possible; run_merge() is called with --ff-only.

However, "git merge" didn't have an --autostash option, so, when "git
pull --rebase --autostash" was called *and* the fast-forward merge
shortcut was taken, then the pull failed.

This was fixed in commit f15e7cf5cc (pull: ff --rebase --autostash
works in dirty repo, 2017-06-01) by simply skipping the fast-forward
merge shortcut.

Later on "git merge" learned the --autostash option [a03b55530a
(merge: teach --autostash option, 2020-04-07)], and so did "git pull"
[d9f15d37f1 (pull: pass --autostash to merge, 2020-04-07)].

Therefore it's not necessary to skip the fast-forward merge shortcut
anymore when called with --rebase --autostash.

Let's always take the fast-forward merge shortcut by essentially
reverting f15e7cf5cc.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-19 16:36:16 +09:00
Andrzej Hunt
4dbc55e87d builtin/checkout--worker: zero-initialise struct to avoid MSAN complaints
report_result() sends a struct to the parent process, but that struct
would contain uninitialised padding bytes. Running this code under MSAN
rightly triggers a warning - but we don't particularly care about this
warning because we control the receiving code, and we therefore know
that those padding bytes won't be read on the receiving end.

We could simply suppress this warning under MSAN with the approporiate
ifdef'd attributes, but a less intrusive solution is to 0-initialise the
struct, which guarantees that the padding will also be initialised.

Interestingly, in the error-case branch, we only try to copy the first
two members of pc_item_result, by copying only PC_ITEM_RESULT_BASE_SIZE
bytes. However PC_ITEM_RESULT_BASE_SIZE is defined as
'offsetof(the_last_member)', which means that we're copying padding bytes
after the end of the second last member. We could avoid doing this by
redefining PC_ITEM_RESULT_BASE_SIZE as
'offsetof(second_last_member) + sizeof(second_last_member)', but there's
no huge benefit to doing so (and this patch silences the MSAN warning in
this scenario either way).

MSAN output from t2080 (partially interleaved due to the
parallel work :) ):

Uninitialized bytes in __interceptor_write at offset 12 inside [0x7fff37d83408, 160)
==23279==WARNING: MemorySanitizer: use-of-uninitialized-value
Uninitialized bytes in __interceptor_write at offset 12 inside [0x7ffdb8a07ec8, 160)
==23280==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0xd5ac28 in xwrite /home/ahunt/git/git/wrapper.c:256:8
    #1 0xd5b327 in write_in_full /home/ahunt/git/git/wrapper.c:311:21
    #2 0xb0a8c4 in do_packet_write /home/ahunt/git/git/pkt-line.c:221:6
    #3 0xb0a5fd in packet_write /home/ahunt/git/git/pkt-line.c:242:6
    #4 0x4f7441 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:69:2
    #5 0x4f6be6 in worker_loop /home/ahunt/git/git/builtin/checkout--worker.c💯3
    #6 0x4f68d3 in cmd_checkout__worker /home/ahunt/git/git/builtin/checkout--worker.c:143:2
    #7 0x4a1e76 in run_builtin /home/ahunt/git/git/git.c:461:11
    #8 0x49e1e7 in handle_builtin /home/ahunt/git/git/git.c:714:3
    #9 0x4a0c08 in run_argv /home/ahunt/git/git/git.c:781:4
    #10 0x49d5a8 in cmd_main /home/ahunt/git/git/git.c:912:19
    #11 0x7974da in main /home/ahunt/git/git/common-main.c:52:11
    #12 0x7f8778114349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #13 0x421bd9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120

  Uninitialized value was created by an allocation of 'res' in the stack frame of function 'report_result'
    #0 0x4f72c0 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:55

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/ahunt/git/git/wrapper.c:256:8 in xwrite
Exiting
    #0 0xd5ac28 in xwrite /home/ahunt/git/git/wrapper.c:256:8
    #1 0xd5b327 in write_in_full /home/ahunt/git/git/wrapper.c:311:21
    #2 0xb0a8c4 in do_packet_write /home/ahunt/git/git/pkt-line.c:221:6
    #3 0xb0a5fd in packet_write /home/ahunt/git/git/pkt-line.c:242:6
    #4 0x4f7441 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:69:2
    #5 0x4f6be6 in worker_loop /home/ahunt/git/git/builtin/checkout--worker.c💯3
    #6 0x4f68d3 in cmd_checkout__worker /home/ahunt/git/git/builtin/checkout--worker.c:143:2
    #7 0x4a1e76 in run_builtin /home/ahunt/git/git/git.c:461:11
    #8 0x49e1e7 in handle_builtin /home/ahunt/git/git/git.c:714:3
    #9 0x4a0c08 in run_argv /home/ahunt/git/git/git.c:781:4
    #10 0x49d5a8 in cmd_main /home/ahunt/git/git/git.c:912:19
    #11 0x7974da in main /home/ahunt/git/git/common-main.c:52:11
    #12 0x7f2749a0e349 in __libc_start_main (/lib64/libc.so.6+0x24349)
    #13 0x421bd9 in _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120

  Uninitialized value was created by an allocation of 'res' in the stack frame of function 'report_result'
    #0 0x4f72c0 in report_result /home/ahunt/git/git/builtin/checkout--worker.c:55

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/ahunt/git/git/wrapper.c:256:8 in xwrite

Signed-off-by: Andrzej Hunt <andrzej@ahunt.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-15 12:07:56 +09:00
Junio C Hamano
8e444e66df Merge branch 'so/log-m-implies-p'
The "-m" option in "git log -m" that does not specify which format,
if any, of diff is desired did not have any visible effect; it now
implies some form of diff (by default "--patch") is produced.

* so/log-m-implies-p:
  diff-merges: let "-m" imply "-p"
  diff-merges: rename "combined_imply_patch" to "merges_imply_patch"
  stash list: stop passing "-m" to "git log"
  git-svn: stop passing "-m" to "git rev-list"
  diff-merges: move specific diff-index "-m" handling to diff-index
  t4013: test "git diff-index -m"
  t4013: test "git diff-tree -m"
  t4013: test "git log -m --stat"
  t4013: test "git log -m --raw"
  t4013: test that "-m" alone has no effect in "git log"
2021-06-14 13:33:27 +09:00
Junio C Hamano
f4f7304b44 Merge branch 'jk/clone-clean-upon-transport-error'
Recent "git clone" left a temporary directory behind when the
transport layer returned an failure.

* jk/clone-clean-upon-transport-error:
  clone: clean up directory after transport_fetch_refs() failure
2021-06-14 13:33:26 +09:00
Andrei Rybak
abcb66c614 *: fix typos which duplicate a word
Fix typos in documentation, code comments, and RelNotes which repeat
various words.  In trivial cases, just delete the duplicated word and
rewrap text, if needed.  Reword the affected sentence in
Documentation/RelNotes/1.8.4.txt for it to make sense.

Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-14 10:16:06 +09:00
Junio C Hamano
3153c83c77 Merge branch 'ah/fetch-reject-warning-grammofix'
Message update.

* ah/fetch-reject-warning-grammofix:
  fetch: improve grammar of "shallow roots" message
2021-06-10 12:04:27 +09:00
Junio C Hamano
e4b5d2a83f Merge branch 'wm/rev-parse-die-i18n'
Quite a many die() messages in rev-parse haven't been marked for
translation.

* wm/rev-parse-die-i18n:
  rev-parse: mark die() messages for translation
2021-06-10 12:04:25 +09:00
Junio C Hamano
f44416c823 Merge branch 'ah/submodule-helper-module-summary-parseopt'
Message update.

* ah/submodule-helper-module-summary-parseopt:
  submodule: use the imperative mood to describe the --files option
2021-06-10 12:04:24 +09:00
Junio C Hamano
ce885c5342 Merge branch 'ah/stash-usage-i18n-fix'
i18n update.

* ah/stash-usage-i18n-fix:
  stash: don't translate literal commands
2021-06-10 12:04:24 +09:00
Junio C Hamano
b03709eae7 Merge branch 'ah/merge-usage-i18n-fix'
i18n update.

* ah/merge-usage-i18n-fix:
  merge: don't translate literal commands
2021-06-10 12:04:23 +09:00
Ævar Arnfjörð Bjarmason
338abb0f04 builtins + test helpers: use return instead of exit() in cmd_*
Change various cmd_* functions that claim to return an "int" to use
"return" instead of exit() to indicate an exit code. These were not
marked with NORETURN, and by directly exit()-ing we'll skip the
cleanup git.c would otherwise do (e.g. closing fd's, erroring if we
can't). See run_builtin() in git.c.

In the case of shell.c and sh-i18n--envsubst.c this was the result of
an incomplete migration to using a cmd_main() in 3f2e2297b9 (add an
extra level of indirection to main(), 2016-07-01).

This was spotted by SunCC 12.5 on Solaris 10 (gcc210 on the gccfarm).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 09:15:58 +09:00
ZheNing Hu
ee02ac6164 cat-file: merge two block into one
There are two "if (opt->all_objects)" blocks next
to each other, merge them into one to provide better
readability.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-04 07:50:26 +09:00
ZheNing Hu
e16acc80a7 cat-file: handle trivial --batch format with --batch-all-objects
The --batch code to print an object assumes we found out the type of
the object from calling oid_object_info_extended(). This is true for
the default format, but even in a custom format, we manually modify
the object_info struct to ask for the type.

This assumption was broken by 845de33a5b (cat-file: avoid noop calls
to sha1_object_info_extended, 2016-05-18). That commit skips the call
to oid_object_info_extended() entirely when --batch-all-objects is in
use, and the custom format does not include any placeholders that
require calling it.

Or when the custom format only include placeholders like %(objectname) or
%(rest), oid_object_info_extended() will not get the type of the object.

This results in an error when we try to confirm that the type didn't
change:

$ git cat-file --batch=batman --batch-all-objects
batman
fatal: object 000023961a changed type!?

and also has other subtle effects (e.g., we'd fail to stream a blob,
since we don't realize it's a blob in the first place).

We can fix this by flipping the order of the setup. The check for "do
we need to get the object info" must come _after_ we've decided
whether we need to look up the type.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-04 07:50:26 +09:00
Felipe Contreras
7088ce7191 push: don't get a full remote object
All we need to know is that their names are the same.

Additionally this might be easier to parse for some since
remote_for_branch is more descriptive than remote_get(NULL).

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:03 +09:00
Felipe Contreras
e0c91cffde push: only check same_remote when needed
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
Felipe Contreras
c5b09cf771 push: remove trivial function
It's a single line that is used in a single place, and the variable has
the same name as the function.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
Felipe Contreras
1afd78fb5c push: remove redundant check
If fetch_remote is NULL (i.e. the branch remote is invalid), then it
can't possibly be same as remote, which can't be NULL.

The check is redundant, and so is the extra variable.

Also, fix the Yoda condition: we want to check if remote is the same as
the branch remote, not the other way around.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
Felipe Contreras
1f934725f7 push: factor out the typical case
Only override dst on the odd case.

This allows a preemptive break on the `simple` case.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
Felipe Contreras
0add899baf push: get rid of all the setup_push_* functions
Their code is much simpler now and can move into the parent function.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:02 +09:00
Felipe Contreras
d371a9ef4c push: trivial simplifications
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
Felipe Contreras
00458dc5f1 push: make setup_push_* return the dst
All of the setup_push_* functions are appending a refspec. Do this only
once on the parent function.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
Felipe Contreras
65c63a0054 push: only get the branch when needed
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
Felipe Contreras
cc16f95d21 push: factor out null branch check
No need to do it in every single function.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
Felipe Contreras
04159fba42 push: split switch cases
We want all the cases that don't do anything with a branch first, and
then the rest. That way we will be able to get the branch and die if
there's a problem in the parent function, instead of inside the function
of each mode.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
Felipe Contreras
72739680fc push: return immediately in trivial switch case
There's no need to break when nothing else will be executed.

Will help next patches.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:01 +09:00
Felipe Contreras
533e0325ab push: create new get_upstream_ref() helper
This code is duplicated among multiple functions.

No functional changes.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:12:00 +09:00
Felipe Contreras
7e6d72bb11 push: remove unused code in setup_push_upstream()
Now it's not used for the simple mode.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:52 +09:00
Felipe Contreras
b8e8b98647 push: simplify setup_push_simple()
There's a safety check to make sure branch->refname isn't different
from branch->merge[0]->src, otherwise we die().

Therefore we always push to branch->refname.

Suggestions-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:52 +09:00
Felipe Contreras
6b010c80a2 push: reorganize setup_push_simple()
Simply move the code around and remove dead code. In particular the
'!same_remote' conditional is a no-op since that part of the code is the
same_remote leg of the conditional beforehand.

No functional changes.

Suggestions-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:51 +09:00
Felipe Contreras
d099b9c9c7 push: copy code to setup_push_simple()
In order to avoid doing unnecessary things and simplify it in further
patches. In particular moving the additional name safety out of
setup_push_upstream() and into setup_push_simple() and thus making both
more straightforward.

The code is copied exactly as-is; no functional changes.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:51 +09:00
Felipe Contreras
3b9fd8361f push: hedge code of default=simple
`simple` is the most important mode so move the relevant code to its own
function to make it easier to see what it's doing.

Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:51 +09:00
Felipe Contreras
050f76b9af push: rename !triangular to same_remote
The typical case is what git was designed for: distributed remotes.

It's only the atypical case--fetching and pushing to the same
remote--that we need to keep an eye on.

No functional changes.

Liked-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 10:09:51 +09:00
Junio C Hamano
ed125c4f07 Merge branch 'ab/fsck-api-cleanup'
Last minute compilation fix.

* ab/fsck-api-cleanup:
  builtin/fsck.c: don't conflate "int" and "enum" in callback
2021-06-02 07:34:27 +09:00
Ævar Arnfjörð Bjarmason
28abf260a5 builtin/fsck.c: don't conflate "int" and "enum" in callback
Fix a warning on AIX's xlc compiler that's been emitted since my
a1aad71601 (fsck.h: use "enum object_type" instead of "int",
2021-03-28):

    "builtin/fsck.c", line 805.32: 1506-068 (W) Operation between
    types "int(*)(struct object*,enum object_type,void*,struct
    fsck_options*)" and "int(*)(struct object*,int,void*,struct
    fsck_options*)" is not allowed.

I.e. it complains about us assigning a function with a prototype "int"
where we're expecting "enum object_type".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-02 05:59:15 +09:00
Junio C Hamano
5afd72a96f Merge branch 'ab/pack-linkage-fix'
"ld" on Solaris fails to link some test helpers, which has been
worked around by reshuffling the inline function definitions from a
header file to a source file that is the only user of them.

* ab/pack-linkage-fix:
  pack-objects: move static inline from a header to the sole consumer
2021-05-27 12:36:58 +09:00
Ævar Arnfjörð Bjarmason
7d089fb9b7 pack-objects: move static inline from a header to the sole consumer
Move the code that is only used in builtin/pack-objects.c out of
pack-objects.h.

This fixes an issue where Solaris's SunCC hasn't been able to compile
git since 483fa7f42d (t/helper/test-bitmap.c: initial commit,
2021-03-31).

The real origin of that issue is that in 898eba5e63 (pack-objects:
refer to delta objects by index instead of pointer, 2018-04-14)
utility functions only needed by builtin/pack-objects.c were added to
pack-objects.h. Since then the header has been used in a few other
places, but 483fa7f42d was the first time it was used by test helper.

Since Solaris is stricter about linking and the oe_get_size_slow()
function lives in builtin/pack-objects.c the build started failing
with:

    Undefined                       first referenced
     symbol                             in file
    oe_get_size_slow                    t/helper/test-bitmap.o
    ld: fatal: symbol referencing errors. No output written to t/helper/test-tool

On other platforms this is presumably OK because the compiler and/or
linker detects that the "static inline" functions that reference
oe_get_size_slow() aren't used.

Let's solve this by moving the relevant code from pack-objects.h to
builtin/pack-objects.c. This is almost entirely a code-only move, but
because of the early macro definitions in that file referencing some
of these inline functions we need to move the definition of "static
struct packing_data to_pack" earlier, and declare these inline
functions above the macros.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-27 12:14:41 +09:00
Junio C Hamano
5d5b147345 Merge branch 'mt/init-template-userpath-fix'
Regression fix.

* mt/init-template-userpath-fix:
  init: fix bug regarding ~/ expansion in init.templateDir
2021-05-25 16:21:20 +09:00
Matheus Tavares
a185dd58ec init: fix bug regarding ~/ expansion in init.templateDir
We used to read the init.templateDir setting at builtin/init-db.c using
a git_config() callback that, in turn, called git_config_pathname(). To
simplify the config reading logic at this file and plug a memory leak,
this was replaced by a direct call to git_config_get_value() at
e4de4502e6 ("init: remove git_init_db_config() while fixing leaks",
2021-03-14). However, this function doesn't provide path expanding
semantics, like git_config_pathname() does, so paths with '~/' and
'~user/' are treated literally. This makes 'git init' fail to handle
init.templateDir paths using these constructs:

	$ git config init.templateDir '~/templates_dir'
	$ git init
	'warning: templates not found in ~/templates_dir'

Replace the git_config_get_value() call by git_config_get_pathname(),
which does the '~/' and '~user/' expansions. Also add a regression test.
Note that unlike git_config_get_value(), the config cache does not own
the memory for the path returned by git_config_get_pathname(), so we
must free() it.

Reported on IRC by rkta.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-25 13:22:08 +09:00
Junio C Hamano
378c7c6ad4 Merge branch 'dl/stash-show-untracked-fixup'
Another brown paper bag inconsistency fix for a new feature
introduced during this cycle.

* dl/stash-show-untracked-fixup:
  stash show: use stash.showIncludeUntracked even when diff options given
2021-05-22 18:29:01 +09:00
Junio C Hamano
99fe1c6069 Merge branch 'wm/rev-parse-path-format-wo-arg'
The "rev-parse" command did not diagnose the lack of argument to
"--path-format" option, which was introduced in v2.31 era, which
has been corrected.

* wm/rev-parse-path-format-wo-arg:
  rev-parse: fix segfault with missing --path-format argument
2021-05-22 18:29:00 +09:00
Denton Liu
af5cd44b6f stash show: use stash.showIncludeUntracked even when diff options given
If options pertaining to how the diff is displayed is provided to
`git stash show`, the command will ignore the stash.showIncludeUntracked
configuration variable, defaulting to not showing any untracked files.
This is unintuitive behaviour since the format of the diff output and
whether or not to display untracked files are orthogonal.

Use stash.showIncludeUntracked even when diff options are given. Of
course, this is still overridable via the command-line options.

Update the documentation to explicitly say which configuration variables
will be overridden when a diff options are given.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-22 17:56:46 +09:00
Sergey Organov
1e20a407fe stash list: stop passing "-m" to "git log"
Passing "-m" in "git log --first-parent -m" is not needed as
--first-parent implies --diff-merges=first-parent anyway. OTOH, it
will stop being harmless once we let "-m" imply "-p".

While we are at it, fix corresponding test description in t3903-stash
to match what it actually tests.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:14 +09:00
Sergey Organov
19b2517f95 diff-merges: move specific diff-index "-m" handling to diff-index
Move specific handling of "-m" for diff-index to diff-index.c, so
diff-merges is left to handle only diff for merges options.

Being a better design by itself, this is especially essential in
preparation for letting -m imply -p, as "diff-index -m" obviously
should not imply -p, as it's entirely unrelated.

To handle this, in addition to moving specific diff-index "-m" code
out of diff-merges, we introduce new

  diff_merges_suppress_options_parsing()

and call it before generic options processing in cmd_diff_index().

This new diff_merges_suppress_options_parsing() could then be reused
and called before invocations of setup_revisions() for other commands
that don't need --diff-merges options, but that's outside of the scope
of these patch series.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-21 09:24:14 +09:00
Alex Henrie
09667e9516 fetch: improve grammar of "shallow roots" message
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20 13:36:33 +09:00
Junio C Hamano
33be431c0c Merge branch 'en/dir-traversal'
"git clean" and "git ls-files -i" had confusion around working on
or showing ignored paths inside an ignored directory, which has
been corrected.

* en/dir-traversal:
  dir: introduce readdir_skip_dot_and_dotdot() helper
  dir: update stale description of treat_directory()
  dir: traverse into untracked directories if they may have ignored subfiles
  dir: avoid unnecessary traversal into ignored directory
  t3001, t7300: add testcase showcasing missed directory traversal
  t7300: add testcase showing unnecessary traversal into ignored directory
  ls-files: error out on -i unless -o or -c are specified
  dir: report number of visited directories and paths with trace2
  dir: convert trace calls to trace2 equivalents
2021-05-20 08:54:59 +09:00
Jeff King
6aacb7d861 clone: clean up directory after transport_fetch_refs() failure
git-clone started respecting errors from the transport subsystem in
aab179d937 (builtin/clone.c: don't ignore transport_fetch_refs() errors,
2020-12-03). However, that commit didn't handle the cleanup of the
filesystem quite right.

The cleanup of the directory that cmd_clone() creates is done by an
atexit() handler, which we control with a flag. It starts as
JUNK_LEAVE_NONE ("clean up everything"), then progresses to
JUNK_LEAVE_REPO when we know we have a valid repo but not working tree,
and then finally JUNK_LEAVE_ALL when we have a successful checkout.

Most errors cause us to die(), which then triggers the handler to do the
right thing based on how far into cmd_clone() we got. But the checks
added by aab179d937 instead set the "err" variable and then jump to a
new "cleanup" label, which then returns our non-zero status. However,
the code after the cleanup label includes setting the flag to
JUNK_LEAVE_ALL, and so we accidentally leave the repository and working
tree in place.

One obvious option to fix this is to reorder the end of the function to
set the flag first, before cleanup code, and put the label between them.

But we can observe another small bug: the error return from
transport_fetch_refs() is generally "-1", and we propagate that to the
return value of cmd_clone(), which ultimately becomes the exit code of
the process. And we try to avoid transmitting negative values via exit
codes (only the low 8 bits are passed along as an unsigned value, though
in practice for "-1" this at least retains the property that it's
non-zero).

Instead, let's just die(). That makes us consistent with rest of the
code in the function. It does add a new "fatal:" line to the output, but
I'd argue that's a good thing:

  - in the rare case that the transport code didn't say anything, now
    the user gets _some_ error message

  - even if the transport code said something like "error: ssh died of
    signal 9", it's nice to also say "fatal" to indicate that we
    considered that to be a show-stopper.

Triggering this in the test suite turns out to be surprisingly
difficult. Almost every error we'd encounter, including ones deep inside
the transport code, cause us to just die() right there! However, one way
is to put a fake wrapper around git-upload-pack that sends the complete
packfile but exits with a failure code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-19 21:14:59 +09:00
Wolfgang Müller
e2c5993744 rev-parse: mark die() messages for translation
These error messages are intended for the user. Let's touch them up
since we're here from the previous commit.

Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 18:39:53 +09:00
Wolfgang Müller
99fc555188 rev-parse: fix segfault with missing --path-format argument
Calling "git rev-parse --path-format" without an argument segfaults
instead of giving an error message. Commit fac60b8925 (rev-parse: add
option for absolute or relative path formatting, 2020-12-13) added the
argument parsing code but forgot to handle NULL.

Returning an error makes sense here because there is no default value we
could use. Add a test case to verify.

Signed-off-by: Wolfgang Müller <wolf@oriole.systems>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 18:39:29 +09:00
Alex Henrie
f5f5a61d5a submodule: use the imperative mood to describe the --files option
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 07:31:40 +09:00
Alex Henrie
4901884a23 stash: don't translate literal commands
Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-17 07:21:04 +09:00
Junio C Hamano
a8a2491e62 Merge branch 'dl/stash-show-untracked-fixup'
The code to handle options recently added to "git stash show"
around untracked part of the stash segfaulted when these options
were used on a stash entry that does not record untracked part.

* dl/stash-show-untracked-fixup:
  stash show: fix segfault with --{include,only}-untracked
  t3905: correct test title
2021-05-16 21:05:24 +09:00
Junio C Hamano
94294e92e1 Merge branch 'lh/maintenance-leakfix'
* lh/maintenance-leakfix:
  maintenance: fix two memory leaks
2021-05-16 21:05:24 +09:00
Junio C Hamano
483932a3d8 Merge branch 'dd/mailinfo-quoted-cr'
"git mailinfo" (hence "git am") learned the "--quoted-cr" option to
control how lines ending with CRLF wrapped in base64 or qp are
handled.

* dd/mailinfo-quoted-cr:
  am: learn to process quoted lines that ends with CRLF
  mailinfo: allow stripping quoted CR without warning
  mailinfo: allow squelching quoted CRLF warning
  mailinfo: warn if CRLF found in decoded base64/QP email
  mailinfo: stop parsing options manually
  mailinfo: load default metainfo_charset lazily
2021-05-16 21:05:23 +09:00
Junio C Hamano
a737e1f1d2 Merge branch 'mt/parallel-checkout-part-3'
The final part of "parallel checkout".

* mt/parallel-checkout-part-3:
  ci: run test round with parallel-checkout enabled
  parallel-checkout: add tests related to .gitattributes
  t0028: extract encoding helpers to lib-encoding.sh
  parallel-checkout: add tests related to path collisions
  parallel-checkout: add tests for basic operations
  checkout-index: add parallel checkout support
  builtin/checkout.c: complete parallel checkout support
  make_transient_cache_entry(): optionally alloc from mem_pool
2021-05-16 21:05:23 +09:00
Junio C Hamano
644f4a2046 Merge branch 'jt/push-negotiation'
"git push" learns to discover common ancestor with the receiving
end over protocol v2.

* jt/push-negotiation:
  send-pack: support push negotiation
  fetch: teach independent negotiation (no packfile)
  fetch-pack: refactor command and capability write
  fetch-pack: refactor add_haves()
  fetch-pack: refactor process_acks()
2021-05-16 21:05:22 +09:00
Alex Henrie
a30e43f61a merge: don't translate literal commands
These strings have not been modified in any translation, nor should they
be.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-16 13:00:28 +09:00
Junio C Hamano
52371bf449 Merge branch 'mt/clean-clean'
Code clean-up.

* mt/clean-clean:
  clean: remove unnecessary variable
2021-05-14 08:26:11 +09:00
Junio C Hamano
47fa106617 Merge branch 'ow/no-dryrun-in-add-i'
"git add -i --dry-run" does not dry-run, which was surprising.  The
combination of options has taught to error out.

* ow/no-dryrun-in-add-i:
  add: die if both --dry-run and --interactive are given
2021-05-14 08:26:09 +09:00
Denton Liu
1ff595d218 stash show: fix segfault with --{include,only}-untracked
When `git stash show --include-untracked` or
`git stash show --only-untracked` is run on a stash that doesn't include
an untracked entry, a segfault occurs. This happens because we do not
check whether the untracked entry is actually present and just attempt
to blindly dereference it.

Ensure that the untracked entry is present before actually attempting to
dereference it.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:48:59 +09:00
Elijah Newren
b548f0f156 dir: introduce readdir_skip_dot_and_dotdot() helper
Many places in the code were doing
    while ((d = readdir(dir)) != NULL) {
        if (is_dot_or_dotdot(d->d_name))
            continue;
        ...process d...
    }
Introduce a readdir_skip_dot_and_dotdot() helper to make that a one-liner:
    while ((d = readdir_skip_dot_and_dotdot(dir)) != NULL) {
        ...process d...
    }

This helper particularly simplifies checks for empty directories.

Also use this helper in read_cached_dir() so that our statistics are
consistent across platforms.  (In other words, read_cached_dir() should
have been using is_dot_or_dotdot() and skipping such entries, but did
not and left it to treat_path() to detect and mark such entries as
path_none.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
Elijah Newren
b338e9f668 ls-files: error out on -i unless -o or -c are specified
ls-files --ignored can be used together with either --others or
--cached.  After being perplexed for a bit and digging in to the code, I
assumed that ls-files -i was just broken and not printing anything and
I had a nice patch ready to submit when I finally realized that -i can be
used with --cached to find tracked ignores.

While that was a mistake on my part, and a careful reading of the
documentation could have made this more clear, I suspect this is an
error others are likely to make as well.  In fact, of two uses in our
testsuite, I believe one of the two did make this error.  In t1306.13,
there are NO tracked files, and all the excludes built up and used in
that test and in previous tests thus have to be about untracked files.
However, since they were looking for an empty result, the mistake went
unnoticed as their erroneous command also just happened to give an empty
answer.

-i will most the time be used with -o, which would suggest we could just
make -i imply -o in the absence of either a -o or -c, but that would be
a backward incompatible break.  Instead, let's just flag -i without
either a -o or -c as an error, and update the two relevant testcases to
specify their intent.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-13 08:45:03 +09:00
Lénaïc Huard
c5d0b12a4c maintenance: fix two memory leaks
Fixes two memory leaks when running `git maintenance start` or `git
maintenance stop` in `update_background_schedule`:

$ valgrind --leak-check=full ~/git/bin/git maintenance start
==76584== Memcheck, a memory error detector
==76584== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==76584== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==76584== Command: /home/lenaic/git/bin/git maintenance start
==76584==
==76584==
==76584== HEAP SUMMARY:
==76584==     in use at exit: 34,880 bytes in 252 blocks
==76584==   total heap usage: 820 allocs, 568 frees, 146,414 bytes allocated
==76584==
==76584== 65 bytes in 1 blocks are definitely lost in loss record 17 of 39
==76584==    at 0x483E6AF: malloc (vg_replace_malloc.c:306)
==76584==    by 0x3DC39C: xrealloc (wrapper.c:126)
==76584==    by 0x3992CC: strbuf_grow (strbuf.c:98)
==76584==    by 0x39A473: strbuf_vaddf (strbuf.c:392)
==76584==    by 0x39BC54: xstrvfmt (strbuf.c:979)
==76584==    by 0x39BD2C: xstrfmt (strbuf.c:989)
==76584==    by 0x18451B: update_background_schedule (gc.c:1977)
==76584==    by 0x1846F6: maintenance_start (gc.c:2011)
==76584==    by 0x1847B4: cmd_maintenance (gc.c:2030)
==76584==    by 0x127A2E: run_builtin (git.c:453)
==76584==    by 0x127E81: handle_builtin (git.c:704)
==76584==    by 0x128142: run_argv (git.c:771)
==76584==
==76584== 240 bytes in 1 blocks are definitely lost in loss record 29 of 39
==76584==    at 0x4840D7B: realloc (vg_replace_malloc.c:834)
==76584==    by 0x491CE5D: getdelim (in /usr/lib/libc-2.33.so)
==76584==    by 0x39ADD7: strbuf_getwholeline (strbuf.c:635)
==76584==    by 0x39AF31: strbuf_getdelim (strbuf.c:706)
==76584==    by 0x39B064: strbuf_getline_lf (strbuf.c:727)
==76584==    by 0x184273: crontab_update_schedule (gc.c:1919)
==76584==    by 0x184678: update_background_schedule (gc.c:1997)
==76584==    by 0x1846F6: maintenance_start (gc.c:2011)
==76584==    by 0x1847B4: cmd_maintenance (gc.c:2030)
==76584==    by 0x127A2E: run_builtin (git.c:453)
==76584==    by 0x127E81: handle_builtin (git.c:704)
==76584==    by 0x128142: run_argv (git.c:771)
==76584==
==76584== LEAK SUMMARY:
==76584==    definitely lost: 305 bytes in 2 blocks
==76584==    indirectly lost: 0 bytes in 0 blocks
==76584==      possibly lost: 0 bytes in 0 blocks
==76584==    still reachable: 34,575 bytes in 250 blocks
==76584==         suppressed: 0 bytes in 0 blocks
==76584== Reachable blocks (those to which a pointer was found) are not shown.
==76584== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==76584==
==76584== For lists of detected and suppressed errors, rerun with: -s
==76584== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr>
Acked-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-12 07:00:45 +09:00
Junio C Hamano
1af57f5d32 Merge branch 'jk/pack-objects-negative-options-fix'
Options to "git pack-objects" that take numeric values like
--window and --depth should not accept negative values; the input
validation has been tightened.

* jk/pack-objects-negative-options-fix:
  pack-objects: clamp negative depth to 0
  t5316: check behavior of pack-objects --depth=0
  pack-objects: clamp negative window size to 0
  t5300: check that we produced expected number of deltas
  t5300: modernize basic tests
2021-05-11 15:27:23 +09:00
Junio C Hamano
5feebddd86 Merge branch 'js/merge-already-up-to-date-message-reword'
A few variants of informational message "Already up-to-date" has
been rephrased.

* js/merge-already-up-to-date-message-reword:
  merge: fix swapped "up to date" message components
  merge(s): apply consistent punctuation to "up to date" messages
2021-05-11 15:27:22 +09:00
Junio C Hamano
8ca4771dd0 Merge branch 'rj/bisect-skip-honor-terms'
"git bisect skip" when custom words are used for new/old did not
work, which has been corrected.

* rj/bisect-skip-honor-terms:
  bisect--helper: use BISECT_TERMS in 'bisect skip' command
2021-05-11 15:27:22 +09:00
Ævar Arnfjörð Bjarmason
22233d43eb xdiff users: use designated initializers for out_line
Amend the code added in 611e42a598 (xdiff: provide a separate emit
callback for hunks, 2018-11-02) to be more readable by using
designated initializers.

This changes "priv" in rerere.c to be initialized to NULL as we did in
merge-tree.c. That's not needed as we'll only use it if the callback
is defined, but being consistent here is better and less verbose.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-11 12:47:31 +09:00
Junio C Hamano
a0f521b56c Merge branch 'rs/repack-without-loosening-promised-objects'
"git repack -A -d" in a partial clone unnecessarily loosened
objects in promisor pack.

* rs/repack-without-loosening-promised-objects:
  repack: avoid loosening promisor objects in partial clones
2021-05-10 16:59:47 +09:00
Junio C Hamano
aaa3c8065d Merge branch 'bc/hash-transition-interop-part-1'
SHA-256 transition.

* bc/hash-transition-interop-part-1:
  hex: print objects using the hash algorithm member
  hex: default to the_hash_algo on zero algorithm value
  builtin/pack-objects: avoid using struct object_id for pack hash
  commit-graph: don't store file hashes as struct object_id
  builtin/show-index: set the algorithm for object IDs
  hash: provide per-algorithm null OIDs
  hash: set, copy, and use algo field in struct object_id
  builtin/pack-redundant: avoid casting buffers to struct object_id
  Use the final_oid_fn to finalize hashing of object IDs
  hash: add a function to finalize object IDs
  http-push: set algorithm when reading object ID
  Always use oidread to read into struct object_id
  hash: add an algo member to struct object_id
2021-05-10 16:59:46 +09:00
Đoàn Trần Công Danh
59b519ab7e am: learn to process quoted lines that ends with CRLF
In previous changes, mailinfo has learnt to process lines that decoded
from base64 or quoted-printable, and ends with CRLF.

Let's teach "am" that new trick, too.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
Đoàn Trần Công Danh
f1aa299443 mailinfo: allow squelching quoted CRLF warning
In previous change, Git starts to warn for quoted CRLF in decoded
base64/QP email. Despite those warnings are usually helpful,
quoted CRLF could be part of some users' workflow.

Let's give them an option to turn off the warning completely.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-10 15:06:22 +09:00
Junio C Hamano
936e58851a Merge branch 'ah/plugleaks'
Plug various leans reported by LSAN.

* ah/plugleaks:
  builtin/rm: avoid leaking pathspec and seen
  builtin/rebase: release git_format_patch_opt too
  builtin/for-each-ref: free filter and UNLEAK sorting.
  mailinfo: also free strbuf lists when clearing mailinfo
  builtin/checkout: clear pending objects after diffing
  builtin/check-ignore: clear_pathspec before returning
  builtin/bugreport: don't leak prefixed filename
  branch: FREE_AND_NULL instead of NULL'ing real_ref
  bloom: clear each bloom_key after use
  ls-files: free max_prefix when done
  wt-status: fix multiple small leaks
  revision: free remainder of old commit list in limit_list
2021-05-07 12:47:41 +09:00
Junio C Hamano
8585d6c04a Merge branch 'ps/rev-list-object-type-filter'
"git rev-list" learns the "--filter=object:type=<type>" option,
which can be used to exclude objects of the given kind from the
packfile generated by pack-objects.

* ps/rev-list-object-type-filter:
  rev-list: allow filtering of provided items
  pack-bitmap: implement combined filter
  pack-bitmap: implement object type filter
  list-objects: implement object type filter
  list-objects: support filtering by tag and commit
  list-objects: move tag processing into its own function
  revision: mark commit parents as NOT_USER_GIVEN
  uploadpack.txt: document implication of `uploadpackfilter.allow`
2021-05-07 12:47:41 +09:00
Junio C Hamano
fe069dce62 Merge branch 'mt/add-rm-in-sparse-checkout'
"git add" and "git rm" learned not to touch those paths that are
outside of sparse checkout.

* mt/add-rm-in-sparse-checkout:
  rm: honor sparse checkout patterns
  add: warn when asked to update SKIP_WORKTREE entries
  refresh_index(): add flag to ignore SKIP_WORKTREE entries
  pathspec: allow to ignore SKIP_WORKTREE entries on index matching
  add: make --chmod and --renormalize honor sparse checkouts
  t3705: add tests for `git add` in sparse checkouts
  add: include magic part of pathspec on --refresh error
2021-05-07 12:47:40 +09:00
Junio C Hamano
e706aaf3bc Merge branch 'ps/config-global-override'
Replace GIT_CONFIG_NOSYSTEM mechanism to decline from reading the
system-wide configuration file with GIT_CONFIG_SYSTEM that lets
users specify from which file to read the system-wide configuration
(setting it to an empty file would essentially be the same as
setting NOSYSTEM), and introduce GIT_CONFIG_GLOBAL to override the
per-user configuration in $HOME/.gitconfig.

* ps/config-global-override:
  t1300: fix unset of GIT_CONFIG_NOSYSTEM leaking into subsequent tests
  config: allow overriding of global and system configuration
  config: unify code paths to get global config paths
  config: rename `git_etc_config()`
2021-05-07 12:47:39 +09:00
Junio C Hamano
c108c8c2f2 Merge branch 'zh/format-ref-array-optim'
"git (branch|tag) --format=..." has been micro-optimized.

* zh/format-ref-array-optim:
  ref-filter: reuse output buffer
  ref-filter: get rid of show_ref_array_item
2021-05-07 12:47:39 +09:00
Matheus Tavares
3a7f0908b6 clean: remove unnecessary variable
The variable `matches` used to hold the return of a `dir_path_match()`
call that was removed in 95c11ecc73 ("Fix error-prone fill_directory()
API; make it only return matches", 2020-04-01). Now `matches` will
always hold 0, which is the value it's initialized with; and the
condition `matches != MATCHED_EXACTLY` will always evaluate to true. So
let's remove this unnecessary variable.

Interestingly, it seems that `matches != MATCHED_EXACTLY` was already
unnecessary before 95c11ecc73. That's because `remove_directories` is
always set to 1 when we have pathspecs; So, in the condition
`!remove_directories && matches != MATCHED_EXACTLY`, we would either:

- have pathspecs (or have been given `-d`) and ignore `matches` because
  `remove_directories` is 1; or

- not have pathspecs (nor `-d`) and end up just checking that
  `0 != MATCHED_EXACTLY`, as `matches` would never get reassigned
  after its zero initialization (because there is no pathspec to match).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 07:48:11 +09:00
Đoàn Trần Công Danh
dd9323b7fb mailinfo: stop parsing options manually
In a later change, mailinfo will learn more options, let's switch to our
robust parse_options framework before that step.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:40:26 +09:00
Đoàn Trần Công Danh
d582992e80 mailinfo: load default metainfo_charset lazily
In a later change, we will use parse_option to parse mailinfo's options.
In mailinfo, both "-u", "-n", and "--encoding" try to set the same
field, with "-u" reset that field to some default value from
configuration variable "i18n.commitEncoding".

Let's delay the setting of that field until we finish processing all
options. By doing that, "i18n.commitEncoding" can be parsed on demand.
More importantly, it cleans the way for using parse_option.

This change introduces some inconsistent brackets "{}" in "if/else if"
construct, however, we will rewrite them in the next few changes.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:40:25 +09:00
Øystein Walle
a1989cf7b8 add: die if both --dry-run and --interactive are given
The interactive machinery does not obey --dry-run. Die appropriately
if both flags are passed.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-07 06:14:04 +09:00
Matheus Tavares
70b052b209 checkout-index: add parallel checkout support
Allow checkout-index to use the parallel checkout framework, honoring
the checkout.workers configuration.

There are two code paths in checkout-index which call
`checkout_entry()`, and thus, can make use of parallel checkout:
`checkout_file()`, which is used to write paths explicitly given at the
command line; and `checkout_all()`, which is used to write all paths in
the index, when the `--all` option is given.

In both operation modes, checkout-index doesn't abort immediately on a
`checkout_entry()` failure. Instead, it tries to check out all remaining
paths before exiting with a non-zero exit code. To keep this behavior
when parallel checkout is being used, we must allow
`run_parallel_checkout()` to try writing the queued entries before we
exit, even if we already got an error code from a previous
`checkout_entry()` call.

However, `checkout_all()` doesn't return on errors, it calls `exit()`
with code 128. We could make it call `run_parallel_checkout()` before
exiting, but it makes the code easier to follow if we unify the exit
path for both checkout-index modes at `cmd_checkout_index()`, and let
this function take care of the interactions with the parallel checkout
API. So let's do that.

With this change, we also have to consider whether we want to keep using
128 as the error code for `git checkout-index --all`, while we use 1 for
`git checkout-index <path>` (even when the actual error is the same).
Since there is not much value in having code 128 only for `--all`, and
there is no mention about it in the docs (so it's unlikely that changing
it will break any existing script), let's make both modes exit with code
1 on `checkout_entry()` errors.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:36 +09:00
Matheus Tavares
6053950632 builtin/checkout.c: complete parallel checkout support
Pathspec-limited checkouts (like `git checkout *.txt`) are performed by
a code path that doesn't yet support parallel checkout because it calls
checkout_entry() directly, instead of unpack_trees(). Let's add parallel
checkout support for this code path too.

The transient cache entries allocated in checkout_merged() are now
allocated in a mem_pool which is only discarded after parallel checkout
finishes. This is done because the entries need to be valid when
run_parallel_checkout() is called.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:26:33 +09:00
Matheus Tavares
9616882780 make_transient_cache_entry(): optionally alloc from mem_pool
Allow make_transient_cache_entry() to optionally receive a mem_pool
struct in which it should allocate the entry. This will be used in the
following patch, to store some transient entries which should persist
until parallel checkout finishes.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 12:25:25 +09:00
Jonathan Tan
9c1e657a8f fetch: teach independent negotiation (no packfile)
Currently, the packfile negotiation step within a Git fetch cannot be
done independent of sending the packfile, even though there is at least
one application wherein this is useful. Therefore, make it possible for
this negotiation step to be done independently. A subsequent commit will
use this for one such application - push negotiation.

This feature is for protocol v2 only. (An implementation for protocol v0
would require a separate implementation in the fetch, transport, and
transport helper code.)

In the protocol, the main hindrance towards independent negotiation is
that the server can unilaterally decide to send the packfile. This is
solved by a "wait-for-done" argument: the server will then wait for the
client to say "done". In practice, the client will never say it; instead
it will cease requests once it is satisfied.

In the client, the main change lies in the transport and transport
helper code. fetch_refs_via_pack() performs everything needed - protocol
version and capability checks, and the negotiation itself.

There are 2 code paths that do not go through fetch_refs_via_pack() that
needed to be individually excluded: the bundle transport (excluded
through requiring smart_options, which the bundle transport doesn't
support) and transport helpers that do not support takeover. If or when
we support independent negotiation for protocol v0, we will need to
modify these 2 code paths to support it. But for now, report failure if
independent negotiation is requested in these cases.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-05 10:41:29 +09:00
Jeff King
6d52b6a5df pack-objects: clamp negative depth to 0
A negative delta depth makes no sense, and the code is not prepared to
handle it. If passed "--depth=-1" on the command line, then this line
from break_delta_chains():

	cur->depth = (total_depth--) % (depth + 1);

triggers a divide-by-zero. This is undefined behavior according to the C
standard, but on POSIX systems results in SIGFPE killing the process.
This is certainly one way to inform the use that the command was
invalid, but it's a bit friendlier to just treat it as "don't allow any
deltas", which we already do for --depth=0.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:30:46 +09:00
Jeff King
953aa54e1a pack-objects: clamp negative window size to 0
A negative window size makes no sense, and the code in find_deltas() is
not prepared to handle it. If you pass "-1", for example, we end up
generate a 0-length array of "struct unpacked", but our loop assumes it
has at least one entry in it (and we end up reading garbage memory).

We could complain to the user about this, but it's more forgiving to
just clamp it to 0, which means "do not find any deltas at all". The
0-case is already tested earlier in the script, so we'll make sure this
does the same thing.

Reported-by: Yiyuan guo <yguoaz@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:27 +09:00
Josh Soref
ad9322da03 merge: fix swapped "up to date" message components
The rewrite of git-merge from shell to C in 1c7b76be7d (Build in merge,
2008-07-07) accidentally transformed the message:

    Already up-to-date. (nothing to squash)

to:

    (nothing to squash)Already up-to-date.

due to reversed printf() arguments. This problem has gone unnoticed
despite being touched over the years by 7f87aff22c (Teach/Fix pull/fetch
-q/-v options, 2008-11-15) and bacec47845 (i18n: git-merge basic
messages, 2011-02-22), and tangentially by bef4830e88 (i18n: merge: mark
messages for translation, 2016-06-17) and 7560f547e6 (treewide: correct
several "up-to-date" to "up to date", 2017-08-23).

Fix it by restoring the message to its intended order. While at it, help
translators out by avoiding "sentence Lego".

[es: rewrote commit message]

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:14:58 +09:00
Eric Sunshine
80cde95eec merge(s): apply consistent punctuation to "up to date" messages
Although the various "Already up to date" messages resulting from merge
attempts share identical phrasing, they use a mix of punctuation ranging
from "." to "!" and even "Yeeah!", which leads to extra work for
translators. Ease the job of translators by settling upon "." as
punctuation for all such messages.

While at it, take advantage of printf_ln() to further ease the
translation task so translators need not worry about line termination,
and fix a case of missing line termination in the (unused)
merge_ort_nonrecursive() function.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:14:56 +09:00
Junio C Hamano
a1cac26cc6 Merge branch 'mt/parallel-checkout-part-2'
The checkout machinery has been taught to perform the actual
write-out of the files in parallel when able.

* mt/parallel-checkout-part-2:
  parallel-checkout: add design documentation
  parallel-checkout: support progress displaying
  parallel-checkout: add configuration options
  parallel-checkout: make it truly parallel
  unpack-trees: add basic support for parallel checkout
2021-04-30 13:50:26 +09:00
Junio C Hamano
59bb0aa93e Merge branch 'so/log-diff-merge'
"git log" learned "--diff-merges=<style>" option, with an
associated configuration variable log.diffMerges.

* so/log-diff-merge:
  doc/diff-options: document new --diff-merges features
  diff-merges: introduce log.diffMerges config variable
  diff-merges: adapt -m to enable default diff format
  diff-merges: refactor set_diff_merges()
  diff-merges: introduce --diff-merges=on
2021-04-30 13:50:26 +09:00
Junio C Hamano
8e97852919 Merge branch 'ds/sparse-index-protections'
Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.

* ds/sparse-index-protections: (47 commits)
  name-hash: use expand_to_path()
  sparse-index: expand_to_path()
  name-hash: don't add directories to name_hash
  revision: ensure full index
  resolve-undo: ensure full index
  read-cache: ensure full index
  pathspec: ensure full index
  merge-recursive: ensure full index
  entry: ensure full index
  dir: ensure full index
  update-index: ensure full index
  stash: ensure full index
  rm: ensure full index
  merge-index: ensure full index
  ls-files: ensure full index
  grep: ensure full index
  fsck: ensure full index
  difftool: ensure full index
  commit: ensure full index
  checkout: ensure full index
  ...
2021-04-30 13:50:26 +09:00
Junio C Hamano
d250f90359 Merge branch 'ds/maintenance-prefetch-fix'
The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.

* ds/maintenance-prefetch-fix:
  maintenance: respect remote.*.skipFetchAll
  maintenance: use 'git fetch --prefetch'
  fetch: add --prefetch option
  maintenance: simplify prefetch logic
2021-04-30 13:50:25 +09:00
Junio C Hamano
13158b9910 Merge branch 'jk/promisor-optim'
Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).

* jk/promisor-optim:
  revision: avoid parsing with --exclude-promisor-objects
  lookup_unknown_object(): take a repository argument
  is_promisor_object(): free tree buffer after parsing
2021-04-30 13:50:24 +09:00
Ramsay Jones
4cd66e7d6b bisect--helper: use BISECT_TERMS in 'bisect skip' command
Commit e4c7b33747 ("bisect--helper: reimplement `bisect_skip` shell
function in C", 2021-02-03), as part of the shell-to-C conversion,
forgot to read the 'terms' file (.git/BISECT_TERMS) during the new
'bisect skip' command implementation. As a result, the 'bisect skip'
command will use the default 'bad'/'good' terms. If the bisection
terms have been set to non-default values (for example by the
'bisect start' command), then the 'bisect skip' command will fail.

In order to correct this problem, we insert a call to the get_terms()
function, which reads the non-default terms from that file (if set),
in the '--bisect-skip' command implementation of 'bisect--helper'.

Also, add a test[1] to protect against potential future regression.

[1] https://lore.kernel.org/git/xmqqim45h585.fsf@gitster.g/T/#m207791568054b0f8cf1a3942878ea36293273c7d

Reported-by: Trygve Aaberge <trygveaa@gmail.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30 09:56:42 +09:00
Rafael Silva
a643157d5a repack: avoid loosening promisor objects in partial clones
When `git repack -A -d` is run in a partial clone, `pack-objects`
is invoked twice: once to repack all promisor objects, and once to
repack all non-promisor objects. The latter `pack-objects` invocation
is with --exclude-promisor-objects and --unpack-unreachable, which
loosens all objects unused during this invocation. Unfortunately,
this includes promisor objects.

Because the -d argument to `git repack` subsequently deletes all loose
objects also in packs, these just-loosened promisor objects will be
immediately deleted. However, this extra disk churn is unnecessary in
the first place.  For example, in a newly-cloned partial repo that
filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up
unpacking all trees and commits into the filesystem because every
object, in this particular case, is a promisor object. Depending on
the repo size, this increases the disk usage considerably: In my copy
of the linux.git, the object directory peaked 26GB of more disk usage.

In order to avoid this extra disk churn, pass the names of the promisor
packfiles as --keep-pack arguments to the second invocation of
`pack-objects`. This informs `pack-objects` that the promisor objects
are already in a safe packfile and, therefore, do not need to be
loosened.

For testing, we need to validate whether any object was loosened.
However, the "evidence" (loosened objects) is deleted during the
process which prevents us from inspecting the object directory.
Instead, let's teach `pack-objects` to count loosened objects and
emit via trace2 thus allowing inspecting the debug events after the
process is finished. This new event is used on the added regression
test.

Lastly, add a new perf test to evaluate the performance impact
made by this changes (tested on git.git):

     Test          HEAD^                 HEAD
     ----------------------------------------------------------
     5600.3: gc    134.38(41.93+90.95)   7.80(6.72+1.35) -94.2%

For a bigger repository, such as linux.git, the improvement is
even bigger:

     Test          HEAD^                     HEAD
     -------------------------------------------------------------------
     5600.3: gc    6833.00(918.07+3162.74)   268.79(227.02+39.18) -96.1%

These improvements are particular big because every object in the
newly-cloned partial repository is a promisor object.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 13:36:13 +09:00
Andrzej Hunt
37be11994f builtin/rm: avoid leaking pathspec and seen
parse_pathspec() populates pathspec, hence we need to clear it once it's
no longer needed. seen is xcalloc'd within the same function and
likewise needs to be freed once its no longer needed.

cmd_rm() has multiple early returns, therefore we need to clear or free
as soon as this data is no longer needed, as opposed to doing a cleanup
at the end.

LSAN output from t0020:

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ac0a4 in do_xmalloc wrapper.c:41:8
    #2 0x9ac07a in xmalloc wrapper.c:62:9
    #3 0x873277 in parse_pathspec pathspec.c:582:2
    #4 0x646ffa in cmd_rm builtin/rm.c:266:2
    #5 0x4cd91d in run_builtin git.c:467:11
    #6 0x4cb5f3 in handle_builtin git.c:719:3
    #7 0x4ccf47 in run_argv git.c:808:4
    #8 0x4caf49 in cmd_main git.c:939:19
    #9 0x69dc0e in main common-main.c:52:11
    #10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ac2a6 in xrealloc wrapper.c:126:8
    #2 0x93b14d in strbuf_grow strbuf.c:98:2
    #3 0x93ccf6 in strbuf_vaddf strbuf.c:392:3
    #4 0x93f726 in xstrvfmt strbuf.c:979:2
    #5 0x93f8b3 in xstrfmt strbuf.c:989:8
    #6 0x92ad8a in prefix_path_gently setup.c:115:15
    #7 0x873a8d in init_pathspec_item pathspec.c:439:11
    #8 0x87334f in parse_pathspec pathspec.c:589:3
    #9 0x646ffa in cmd_rm builtin/rm.c:266:2
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dc0e in main common-main.c:52:11
    #15 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 15 byte(s) in 1 object(s) allocated from:
    #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ac048 in xstrdup wrapper.c:29:14
    #2 0x873ba2 in init_pathspec_item pathspec.c:468:20
    #3 0x87334f in parse_pathspec pathspec.c:589:3
    #4 0x646ffa in cmd_rm builtin/rm.c:266:2
    #5 0x4cd91d in run_builtin git.c:467:11
    #6 0x4cb5f3 in handle_builtin git.c:719:3
    #7 0x4ccf47 in run_argv git.c:808:4
    #8 0x4caf49 in cmd_main git.c:939:19
    #9 0x69dc0e in main common-main.c:52:11
    #10 0x7f948825b349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 1 byte(s) in 1 object(s) allocated from:
    #0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9ac392 in xcalloc wrapper.c:140:8
    #2 0x647108 in cmd_rm builtin/rm.c:294:9
    #3 0x4cd91d in run_builtin git.c:467:11
    #4 0x4cb5f3 in handle_builtin git.c:719:3
    #5 0x4ccf47 in run_argv git.c:808:4
    #6 0x4caf49 in cmd_main git.c:939:19
    #7 0x69dbfe in main common-main.c:52:11
    #8 0x7f4fac1b0349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
Andrzej Hunt
805b789a69 builtin/rebase: release git_format_patch_opt too
options.git_format_patch_opt can be populated during cmd_rebase's setup,
and will therefore leak on return. Although we could just UNLEAK all of
options, we choose to strbuf_release() the individual member, which matches
the existing pattern (where we're freeing invidual members of options).

Leak found when running t0021:

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ac296 in xrealloc wrapper.c:126:8
    #2 0x93b13d in strbuf_grow strbuf.c:98:2
    #3 0x93bd3a in strbuf_add strbuf.c:295:2
    #4 0x60ae92 in strbuf_addstr strbuf.h:304:2
    #5 0x605f17 in cmd_rebase builtin/rebase.c:1759:3
    #6 0x4cd91d in run_builtin git.c:467:11
    #7 0x4cb5f3 in handle_builtin git.c:719:3
    #8 0x4ccf47 in run_argv git.c:808:4
    #9 0x4caf49 in cmd_main git.c:939:19
    #10 0x69dbfe in main common-main.c:52:11
    #11 0x7f66dae91349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 24 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
Andrzej Hunt
a317a553b8 builtin/for-each-ref: free filter and UNLEAK sorting.
sorting might be a list allocated in ref_default_sorting() (in this case
it's a fixed single item list, which has nevertheless been xcalloc'd),
or it might be a list allocated in parse_opt_ref_sorting(). In either
case we could free these lists - but instead we UNLEAK as we're at the
end of cmd_for_each_ref. (There's no existing implementation of
clear_ref_sorting(), and writing a loop to free the list seems more
trouble than it's worth.)

filter.with_commit/no_commit are populated via
OPT_CONTAINS/OPT_NO_CONTAINS, both of which create new entries via
parse_opt_commits(), and also need to be free'd or UNLEAK'd. Because
free_commit_list() already exists, we choose to use that over an UNLEAK.

LSAN output from t0041:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a9d2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9ac252 in xcalloc wrapper.c:140:8
    #2 0x8a4a55 in ref_default_sorting ref-filter.c:2486:32
    #3 0x56c6b1 in cmd_for_each_ref builtin/for-each-ref.c:72:13
    #4 0x4cd91d in run_builtin git.c:467:11
    #5 0x4cb5f3 in handle_builtin git.c:719:3
    #6 0x4ccf47 in run_argv git.c:808:4
    #7 0x4caf49 in cmd_main git.c:939:19
    #8 0x69dabe in main common-main.c:52:11
    #9 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9abf54 in do_xmalloc wrapper.c:41:8
    #2 0x9abf2a in xmalloc wrapper.c:62:9
    #3 0x717486 in commit_list_insert commit.c:540:33
    #4 0x8644cf in parse_opt_commits parse-options-cb.c:98:2
    #5 0x869bb5 in get_value parse-options.c:181:11
    #6 0x8677dc in parse_long_opt parse-options.c:378:10
    #7 0x8659bd in parse_options_step parse-options.c:817:11
    #8 0x867fcd in parse_options parse-options.c:870:10
    #9 0x56c62b in cmd_for_each_ref builtin/for-each-ref.c:59:2
    #10 0x4cd91d in run_builtin git.c:467:11
    #11 0x4cb5f3 in handle_builtin git.c:719:3
    #12 0x4ccf47 in run_argv git.c:808:4
    #13 0x4caf49 in cmd_main git.c:939:19
    #14 0x69dabe in main common-main.c:52:11
    #15 0x7f2bdc570349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
Andrzej Hunt
52a9436aa7 builtin/checkout: clear pending objects after diffing
add_pending_object() populates rev.pending, we need to take care of
clearing it once we're done.

This code is run close to the end of a checkout, therefore this leak
seems like it would have very little impact. See also LSAN output
from t0020 below:

Direct leak of 2048 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9acc46 in xrealloc wrapper.c:126:8
    #2 0x83e3a3 in add_object_array_with_path object.c:337:3
    #3 0x8f672a in add_pending_object_with_path revision.c:329:2
    #4 0x8eaeab in add_pending_object_with_mode revision.c:336:2
    #5 0x8eae9d in add_pending_object revision.c:342:2
    #6 0x5154a0 in show_local_changes builtin/checkout.c:602:2
    #7 0x513b00 in merge_working_tree builtin/checkout.c:979:3
    #8 0x512cb3 in switch_branches builtin/checkout.c:1242:9
    #9 0x50f8de in checkout_branch builtin/checkout.c:1646:9
    #10 0x50ba12 in checkout_main builtin/checkout.c:2003:9
    #11 0x5086c0 in cmd_checkout builtin/checkout.c:2055:8
    #12 0x4cd91d in run_builtin git.c:467:11
    #13 0x4cb5f3 in handle_builtin git.c:719:3
    #14 0x4ccf47 in run_argv git.c:808:4
    #15 0x4caf49 in cmd_main git.c:939:19
    #16 0x69e43e in main common-main.c:52:11
    #17 0x7f5dd1d50349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 2048 byte(s) leaked in 1 allocation(s).
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
Andrzej Hunt
265644367f builtin/check-ignore: clear_pathspec before returning
parse_pathspec() allocates new memory into pathspec, therefore we need
to free it when we're done.

An UNLEAK would probably be just as good here - but clear_pathspec() is
not much more work so we might as well use it. check_ignore() is either
called once directly from cmd_check_ignore() (in which case the leak
really doesnt matter), or it can be called multiple times in a loop from
check_ignore_stdin_paths(), in which case we're potentially leaking
multiple times - but even in this scenario the leak is so small as to
have no real consequence.

Found while running t0008:

Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9aca44 in do_xmalloc wrapper.c:41:8
    #2 0x9aca1a in xmalloc wrapper.c:62:9
    #3 0x873c17 in parse_pathspec pathspec.c:582:2
    #4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
    #5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
    #6 0x4cd91d in run_builtin git.c:467:11
    #7 0x4cb5f3 in handle_builtin git.c:719:3
    #8 0x4ccf47 in run_argv git.c:808:4
    #9 0x4caf49 in cmd_main git.c:939:19
    #10 0x69e43e in main common-main.c:52:11
    #11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9acc46 in xrealloc wrapper.c:126:8
    #2 0x93baed in strbuf_grow strbuf.c:98:2
    #3 0x93d696 in strbuf_vaddf strbuf.c:392:3
    #4 0x9400c6 in xstrvfmt strbuf.c:979:2
    #5 0x940253 in xstrfmt strbuf.c:989:8
    #6 0x92b72a in prefix_path_gently setup.c:115:15
    #7 0x87442d in init_pathspec_item pathspec.c:439:11
    #8 0x873cef in parse_pathspec pathspec.c:589:3
    #9 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
    #10 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
    #11 0x4cd91d in run_builtin git.c:467:11
    #12 0x4cb5f3 in handle_builtin git.c:719:3
    #13 0x4ccf47 in run_argv git.c:808:4
    #14 0x4caf49 in cmd_main git.c:939:19
    #15 0x69e43e in main common-main.c:52:11
    #16 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Indirect leak of 2 byte(s) in 1 object(s) allocated from:
    #0 0x486834 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ac9e8 in xstrdup wrapper.c:29:14
    #2 0x874542 in init_pathspec_item pathspec.c:468:20
    #3 0x873cef in parse_pathspec pathspec.c:589:3
    #4 0x503eb8 in check_ignore builtin/check-ignore.c:90:2
    #5 0x5038af in cmd_check_ignore builtin/check-ignore.c:190:17
    #6 0x4cd91d in run_builtin git.c:467:11
    #7 0x4cb5f3 in handle_builtin git.c:719:3
    #8 0x4ccf47 in run_argv git.c:808:4
    #9 0x4caf49 in cmd_main git.c:939:19
    #10 0x69e43e in main common-main.c:52:11
    #11 0x7f18bb0dd349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 179 byte(s) leaked in 3 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
Andrzej Hunt
4fa268738c builtin/bugreport: don't leak prefixed filename
prefix_filename() returns newly allocated memory, and strbuf_addstr()
doesn't take ownership of its inputs. Therefore we have to make sure to
store and free prefix_filename()'s result.

As this leak is in cmd_bugreport(), we could just as well UNLEAK the
prefix - but there's no good reason not to just free it properly. This
leak was found while running t0091, see output below:

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x49ab79 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9acc66 in xrealloc wrapper.c:126:8
    #2 0x93baed in strbuf_grow strbuf.c:98:2
    #3 0x93c6ea in strbuf_add strbuf.c:295:2
    #4 0x69f162 in strbuf_addstr ./strbuf.h:304:2
    #5 0x69f083 in prefix_filename abspath.c:277:2
    #6 0x4fb275 in cmd_bugreport builtin/bugreport.c:146:9
    #7 0x4cd91d in run_builtin git.c:467:11
    #8 0x4cb5f3 in handle_builtin git.c:719:3
    #9 0x4ccf47 in run_argv git.c:808:4
    #10 0x4caf49 in cmd_main git.c:939:19
    #11 0x69df9e in main common-main.c:52:11
    #12 0x7f523a987349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:45 +09:00
Andrzej Hunt
4c217a4c34 ls-files: free max_prefix when done
common_prefix() returns a new string, which we store in max_prefix -
this string needs to be freed to avoid a leak. This leak is happening
in cmd_ls_files, hence is of no real consequence - an UNLEAK would be
just as good, but we might as well free the string properly.

Leak found while running t0002, see output below:

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x49a85d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9ab1b4 in do_xmalloc wrapper.c:41:8
    #2 0x9ab248 in do_xmallocz wrapper.c:75:8
    #3 0x9ab22a in xmallocz wrapper.c:83:9
    #4 0x9ab2d7 in xmemdupz wrapper.c:99:16
    #5 0x78d6a4 in common_prefix dir.c:191:15
    #6 0x5aca48 in cmd_ls_files builtin/ls-files.c:669:16
    #7 0x4cd92d in run_builtin git.c:453:11
    #8 0x4cb5fa in handle_builtin git.c:704:3
    #9 0x4ccf57 in run_argv git.c:771:4
    #10 0x4caf49 in cmd_main git.c:902:19
    #11 0x69ce2e in main common-main.c:52:11
    #12 0x7f64d4d94349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-28 09:25:44 +09:00
brian m. carlson
71b7672b67 builtin/pack-objects: avoid using struct object_id for pack hash
We use struct object_id for the names of objects.  It isn't intended to
be used for other hash values that don't name objects such as the pack
hash.

Because struct object_id will soon need to have its algorithm member
set, using it in this code path would mean that we didn't set that
member, only the hash member, which would result in a crash.  For both
of these reasons, switch to using an unsigned char array of size
GIT_MAX_RAWSZ.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
brian m. carlson
dd15f4f457 builtin/show-index: set the algorithm for object IDs
In most cases, when we load the hash of an object into a struct
object_id, we load it using one of the oid* or *_oid_hex functions.
However, for git show-index, we read it in directly using fread.  As a
consequence, set the algorithm correctly so the objects can be used
correctly both now and in the future.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
brian m. carlson
14228447c9 hash: provide per-algorithm null OIDs
Up until recently, object IDs did not have an algorithm member, only a
hash.  Consequently, it was possible to share one null (all-zeros)
object ID among all hash algorithms.  Now that we're going to be
handling objects from multiple hash algorithms, it's important to make
sure that all object IDs have a correct algorithm field.

Introduce a per-algorithm null OID, and add it to struct hash_algo.
Introduce a wrapper function as well, and use it everywhere we used to
use the null_oid constant.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:39 +09:00
brian m. carlson
0e5e2284f1 builtin/pack-redundant: avoid casting buffers to struct object_id
Now that we need our instances of struct object_id to be zero padded, we
can no longer cast unsigned char buffers to be pointers to struct
object_id.  This file reads data out of the pack objects and then
inserts it directly into a linked list item which is a pointer to struct
object_id.  Instead, let's have the linked list item hold its own struct
object_id and copy the data into it.

In addition, since these are not really pointers to struct object_id,
stop passing them around as such, and call them what they really are:
pointers to unsigned char.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
brian m. carlson
5951bf467e Use the final_oid_fn to finalize hashing of object IDs
When we're hashing a value which is going to be an object ID, we want to
zero-pad that value if necessary.  To do so, use the final_oid_fn
instead of the final_fn anytime we're going to create an object ID to
ensure we perform this operation.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
brian m. carlson
92e2cab96b Always use oidread to read into struct object_id
In the future, we'll want oidread to automatically set the hash
algorithm member for an object ID we read into it, so ensure we use
oidread instead of hashcpy everywhere we're copying a hash value into a
struct object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27 16:31:38 +09:00
Junio C Hamano
2eebac2c49 Merge branch 'jk/pack-objects-bitmap-progress-fix'
When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.

* jk/pack-objects-bitmap-progress-fix:
  pack-objects: update "nr_seen" progress based on pack-reused count
2021-04-20 17:23:35 -07:00
ZheNing Hu
844c3f0b0b ref-filter: reuse output buffer
When we use `git for-each-ref`, every ref will allocate
its own output strbuf and error strbuf. But we can reuse
the final strbuf for each step ref's output. The error
buffer will also be reused, despite the fact that the git
will exit when `format_ref_array_item()` return a non-zero
value and output the contents of the error buffer.

The performance for `git for-each-ref` on the Git repository
itself with performance testing tool `hyperfine` changes from
23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms. Optimization is relatively
minor.

At the same time, we apply this optimization to `git tag -l`
and `git branch -l`.

This approach is similar to the one used by 79ed0a5
(cat-file: use a single strbuf for all output, 2018-08-14)
to speed up the cat-file builtin.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20 11:09:50 -07:00
ZheNing Hu
22f69a85ed ref-filter: get rid of show_ref_array_item
Inlining the exported function `show_ref_array_item()`,
which is not providing the right level of abstraction,
simplifies the API and can unlock improvements at the
former call sites.

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 15:08:00 -07:00
Patrick Steinhardt
1e06eb9b5d config: unify code paths to get global config paths
There's two callsites which assemble global config paths, once in the
config loading code and once in the git-config(1) builtin. We're about
to implement a way to override global config paths via an environment
variable which would require us to adjust both sites.

Unify both code paths into a single `git_global_config()` function which
returns both paths for `~/.gitconfig` and the XDG config file. This will
make the subsequent patch which introduces the new envvar easier to
implement.

No functional changes are expected from this patch.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:16:59 -07:00
Patrick Steinhardt
c62a999c6e config: rename git_etc_config()
The `git_etc_gitconfig()` function retrieves the system-level path of
the configuration file. We're about to introduce a way to override it
via an environment variable, at which point the name of this function
would start to become misleading.

Rename the function to `git_system_config()` as a preparatory step.
While at it, the function is also refactored to pass memory ownership to
the caller. This is done to better match semantics of
`git_global_config()`, which is going to be introduced in the next
commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:16:59 -07:00
Patrick Steinhardt
9cf68b27d5 rev-list: allow filtering of provided items
When providing an object filter, it is currently impossible to also
filter provided items. E.g. when executing `git rev-list HEAD` , the
commit this reference points to will be treated as user-provided and is
thus excluded from the filtering mechanism. This makes it harder than
necessary to properly use the new `--filter=object:type` filter given
that even if the user wants to only see blobs, he'll still see commits
of provided references.

Improve this by introducing a new `--filter-provided-objects` option
to the git-rev-parse(1) command. If given, then all user-provided
references will be subject to filtering.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 14:09:11 -07:00
Matheus Tavares
e9e8adf1a8 parallel-checkout: make it truly parallel
Use multiple worker processes to distribute the queued entries and call
write_pc_item() in parallel for them. The items are distributed
uniformly in contiguous chunks. This minimizes the chances of two
workers writing to the same directory simultaneously, which could affect
performance due to lock contention in the kernel. Work stealing (or any
other format of re-distribution) is not implemented yet.

The protocol between the main process and the workers is quite simple.
They exchange binary messages packed in pkt-line format, and use
PKT-FLUSH to mark the end of input (from both sides). The main process
starts the communication by sending N pkt-lines, each corresponding to
an item that needs to be written. These packets contain all the
necessary information to load, smudge, and write the blob associated
with each item. Then it waits for the worker to send back N pkt-lines
containing the results for each item. The resulting packet must contain:
the identification number of the item that it refers to, the status of
the operation, and the lstat() data gathered after writing the file (iff
the operation was successful).

For now, checkout always uses a hardcoded value of 2 workers, only to
demonstrate that the parallel checkout framework correctly divides and
writes the queued entries. The next patch will add user configurations
and define a more reasonable default, based on tests with the said
settings.

Co-authored-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-19 11:57:05 -07:00
Sergey Organov
17c13e60fd diff-merges: introduce log.diffMerges config variable
New log.diffMerges configuration variable sets the format that
--diff-merges=on will be using. The default is "separate".

t4013: add the following tests for log.diffMerges config:

* Test that wrong values are denied.

* Test that the value of log.diffMerges properly affects both
--diff-merges=on and -m.

t9902: fix completion tests for log.d* to match log.diffMerges.

Added documentation for log.diffMerges.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
Junio C Hamano
7bec8e7fa6 Merge branch 'en/ort-readiness'
Plug the ort merge backend throughout the rest of the system, and
start testing it as a replacement for the recursive backend.

* en/ort-readiness:
  Add testing with merge-ort merge strategy
  t6423: mark remaining expected failure under merge-ort as such
  Revert "merge-ort: ignore the directory rename split conflict for now"
  merge-recursive: add a bunch of FIXME comments documenting known bugs
  merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
  t: mark several submodule merging tests as fixed under merge-ort
  merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
  t6428: new test for SKIP_WORKTREE handling and conflicts
  merge-ort: support subtree shifting
  merge-ort: let renormalization change modify/delete into clean delete
  merge-ort: have ll_merge() use a special attr_index for renormalization
  merge-ort: add a special minimal index just for renormalization
  merge-ort: use STABLE_QSORT instead of QSORT where required
2021-04-16 13:53:34 -07:00
Derrick Stolee
32f67888d8 maintenance: respect remote.*.skipFetchAll
If a remote has the skipFetchAll setting enabled, then that remote is
not intended for frequent fetching. It makes sense to not fetch that
data during the 'prefetch' maintenance task. Skip that remote in the
iteration without error. The skip_default_update member is initialized
in remote.c:handle_config() as part of initializing the 'struct remote'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
Derrick Stolee
cfd781ea22 maintenance: use 'git fetch --prefetch'
The 'prefetch' maintenance task previously forced the following refspec
for each remote:

	+refs/heads/*:refs/prefetch/<remote>/*

If a user has specified a more strict refspec for the remote, then this
prefetch task downloads more objects than necessary.

The previous change introduced the '--prefetch' option to 'git fetch'
which manipulates the remote's refspec to place all resulting refs into
refs/prefetch/, with further partitioning based on the destinations of
those refspecs.

Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch/ remains.

Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
Derrick Stolee
2e03115d0c fetch: add --prefetch option
The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.

Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:

 * Negative refspecs are preserved.
 * Refspecs without a destination are removed.
 * Refspecs whose source starts with "refs/tags/" are removed.
 * Other refspecs are placed within "refs/prefetch/".

Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.

There are some interesting cases that are worth testing.

An earlier version of this change dropped the "i--" from the loop that
deletes a refspec item and shifts the remaining entries down. This
allowed some refspecs to not be modified. The subtle part about the
first --prefetch test is that the "refs/tags/*" refspec appears directly
before the "refs/heads/bogus/*" refspec. Without that "i--", this
ordering would remove the "refs/tags/*" refspec and leave the last one
unmodified, placing the result in "refs/heads/*".

It is possible to have an empty refspec. This is typically the case for
remotes other than the origin, where users want to fetch a specific tag
or branch. To correctly test this case, we need to further remove the
upstream remote for the local branch. Thus, we are testing a refspec
that will be deleted, leaving nothing to fetch.

Helped-by: Tom Saeger <tom.saeger@oracle.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
Derrick Stolee
2508df0272 update-index: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:29 -07:00
Derrick Stolee
a02912019a stash: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:26 -07:00
Derrick Stolee
e43e2a17d2 rm: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:24 -07:00
Derrick Stolee
299e2c4561 merge-index: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:21 -07:00
Derrick Stolee
42f44e84eb ls-files: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid missing files.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:17 -07:00
Derrick Stolee
46eb6e31ef grep: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one so we do not miss blobs to scan. Later, this can
integrate more carefully with sparse indexes with proper testing.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:13 -07:00
Derrick Stolee
2227ea175f fsck: ensure full index
When verifying all blobs reachable from the index, ensure that a sparse
index has been expanded to a full one to avoid missing some blobs.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:11 -07:00
Derrick Stolee
48b3c7da6c difftool: ensure full index
Before iterating over all cache entries, ensure that a sparse index has
been expanded to a full one to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:09 -07:00
Derrick Stolee
cb8388df5b commit: ensure full index
These two loops iterate over all cache entries, so ensure that a sparse
index is expanded to a full index before we do so.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:06 -07:00
Derrick Stolee
0f6d3ba6bd checkout: ensure full index
Before iterating over all cache entries in the checkout builtin, ensure
that we have a full index to avoid any unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:47:03 -07:00
Derrick Stolee
1b850d37f4 checkout-index: ensure full index
Before we iterate over all cache entries, ensure that the index is not
sparse. This loop in checkout_all() might be safe to iterate over a
sparse index, but let's put this protection here until it can be
carefully tested.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:59 -07:00
Derrick Stolee
54beed24d2 add: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:48 -07:00
Derrick Stolee
847a9e5d4f *: remove 'const' qualifier for struct index_state
Several methods specify that they take a 'struct index_state' pointer
with the 'const' qualifier because they intend to only query the data,
not change it. However, we will be introducing a step very low in the
method stack that might modify a sparse-index to become a full index in
the case that our queries venture inside a sparse-directory entry.

This change only removes the 'const' qualifiers that are necessary for
the following change which will actually modify the implementation of
index_name_stage_pos().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-14 13:46:00 -07:00
Junio C Hamano
0623669fc6 Merge branch 'tb/pack-preferred-tips-to-give-bitmap'
A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.

* tb/pack-preferred-tips-to-give-bitmap:
  builtin/pack-objects.c: respect 'pack.preferBitmapTips'
  t/helper/test-bitmap.c: initial commit
  pack-bitmap: add 'test_bitmap_commits()' helper
2021-04-13 15:28:50 -07:00
Jeff King
45a187cc34 lookup_unknown_object(): take a repository argument
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.

We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:18:46 -07:00
Jeff King
8e118e8490 pack-objects: update "nr_seen" progress based on pack-reused count
When serving a clone or fetch with bitmaps, after deciding which objects
need to be sent our "pack reuse" mechanism kicks in: we try to send
more-or-less verbatim a bunch of objects from the beginning of the
bitmapped packfile without even adding them to the to_pack.objects
array.

After deciding which objects will be in the "reused" portion, we update
nr_result to account for those, and then trigger display_progress() to
show the user (who is undoubtedly dazzled that we managed to enumerate
so many objects so quickly).

But then something confusing happens: the "Enumerating objects" progress
meter jumps _backwards_, counting up from zero the number of objects we
actually add into to_pack.objects.

This worked correctly once upon a time, but was broken in 5af050437a
(pack-objects: show some progress when counting kept objects,
2018-04-15), when the latter half of that progress meter switched to
using a separate nr_seen counter, rather than nr_result. Nobody noticed
for two reasons:

  - prior to the pack-reuse fixes from a14aebeac3 (Merge branch
    'jk/packfile-reuse-cleanup', 2020-02-14), the reuse code almost
    never kicked in anyway

  - the output looks _kind of_ correct. The "backwards" moment is hard
    to catch, because we overwrite the old progress number with the new
    one, and the larger number is displayed only for a second. So unless
    you look at that exact second, you just see the much smaller value,
    counting up to the number of non-reused objects (though of course if
    you catch it in stderr, or look at GIT_TRACE_PACKET from a server
    with bitmaps, you can see both values).

This smaller output isn't wrong per se, but isn't counting what we ever
intended to. We should give the user the whole number of objects we
considered (which, as per 5af050437a's original purpose, is already
_not_ a count of what goes into to_pack.objects). The follow-on
"Counting objects" meter shows the actual number of objects we feed into
that array.

We can easily fix this by bumping (and showing) nr_seen for the
pack-reused objects. When the included test is run without this patch,
the second pack-objects invocation produces "Enumerating objects: 1" to
show the one loose object, even though the resulting pack has hundreds
of objects in it. With it, we jump to "Enumerating objects: 674" after
deciding on reuse, and then "675" when we add in the loose object.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 11:31:30 -07:00
Matheus Tavares
d5f4b8260f rm: honor sparse checkout patterns
`git add` refrains from adding or updating index entries that are
outside the current sparse checkout, but `git rm` doesn't follow the
same restriction. This is somewhat counter-intuitive and inconsistent.
So make `rm` honor the sparsity rules and advise on how to remove
SKIP_WORKTREE entries just like `add` does. Also add some tests for the
new behavior.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
Matheus Tavares
a20f70478f add: warn when asked to update SKIP_WORKTREE entries
`git add` already refrains from updating SKIP_WORKTREE entries, but it
silently exits with zero code when it is asked to do so. Instead, let's
warn the user and display a hint on how to update these entries.

Note that we only warn the user whey they give a pathspec item that
matches no eligible path for updating, but it does match one or more
SKIP_WORKTREE entries. A warning was chosen over erroring out right away
to reproduce the same behavior `add` already exhibits with ignored
files. This also allow users to continue their workflow without having
to invoke `add` again with only the eligible paths (as those will have
already been added).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
Matheus Tavares
719630eb48 pathspec: allow to ignore SKIP_WORKTREE entries on index matching
Add a new enum parameter to `add_pathspec_matches_against_index()` and
`find_pathspecs_matching_against_index()`, allowing callers to specify
whether these function should attempt to match SKIP_WORKTREE entries or
not. This will be used in a future patch to make `git add` display a
warning when it is asked to update SKIP_WORKTREE entries.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
Matheus Tavares
d73dbafc2c add: make --chmod and --renormalize honor sparse checkouts
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
Matheus Tavares
4e95698349 add: include magic part of pathspec on --refresh error
When `git add --refresh <pathspec>` doesn't find any matches for the
given pathspec, it prints an error message using the `match` field of
the `struct pathspec_item`. However, this field doesn't contain the
magic part of the pathspec. Instead, let's use the `original` field.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 14:18:03 -07:00
Junio C Hamano
82fd285e46 Merge branch 'en/sequencer-edit-upon-conflict-fix'
"git cherry-pick/revert" with or without "--[no-]edit" did not spawn
the editor as expected (e.g. "revert --no-edit" after a conflict
still asked to edit the message), which has been corrected.

* en/sequencer-edit-upon-conflict-fix:
  sequencer: fix edit handling for cherry-pick and revert messages
2021-04-08 13:23:26 -07:00
Junio C Hamano
22eee7f455 Merge branch 'll/clone-reject-shallow'
"git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.

* ll/clone-reject-shallow:
  builtin/clone.c: add --reject-shallow option
2021-04-08 13:23:25 -07:00
Junio C Hamano
e6b971fcf5 Merge branch 'tb/reverse-midx'
An on-disk reverse-index to map the in-pack location of an object
back to its object name across multiple packfiles is introduced.

* tb/reverse-midx:
  midx.c: improve cache locality in midx_pack_order_cmp()
  pack-revindex: write multi-pack reverse indexes
  pack-write.c: extract 'write_rev_file_order'
  pack-revindex: read multi-pack reverse indexes
  Documentation/technical: describe multi-pack reverse indexes
  midx: make some functions non-static
  midx: keep track of the checksum
  midx: don't free midx_name early
  midx: allow marking a pack as preferred
  t/helper/test-read-midx.c: add '--show-objects'
  builtin/multi-pack-index.c: display usage on unrecognized command
  builtin/multi-pack-index.c: don't enter bogus cmd_mode
  builtin/multi-pack-index.c: split sub-commands
  builtin/multi-pack-index.c: define common usage with a macro
  builtin/multi-pack-index.c: don't handle 'progress' separately
  builtin/multi-pack-index.c: inline 'flags' with options
2021-04-08 13:23:25 -07:00
Junio C Hamano
5644419d04 Merge branch 'ab/fsck-api-cleanup'
Fsck API clean-up.

* ab/fsck-api-cleanup:
  fetch-pack: use new fsck API to printing dangling submodules
  fetch-pack: use file-scope static struct for fsck_options
  fetch-pack: don't needlessly copy fsck_options
  fsck.c: move gitmodules_{found,done} into fsck_options
  fsck.c: add an fsck_set_msg_type() API that takes enums
  fsck.c: pass along the fsck_msg_id in the fsck_error callback
  fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from *.c to *.h
  fsck.c: give "FOREACH_MSG_ID" a more specific name
  fsck.c: undefine temporary STR macro after use
  fsck.c: call parse_msg_type() early in fsck_set_msg_type()
  fsck.h: re-order and re-assign "enum fsck_msg_type"
  fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum
  fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type"
  fsck.c: rename remaining fsck_msg_id "id" to "msg_id"
  fsck.c: remove (mostly) redundant append_msg_id() function
  fsck.c: rename variables in fsck_set_msg_type() for less confusion
  fsck.h: use "enum object_type" instead of "int"
  fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT}
  fsck.c: refactor and rename common config callback
2021-04-07 16:54:09 -07:00
Junio C Hamano
d637a267d8 Merge branch 'cc/downcase-opt-help'
A few option description strings started with capital letters,
which were corrected.

* cc/downcase-opt-help:
  column, range-diff: downcase option description
2021-04-07 16:54:09 -07:00
Junio C Hamano
68e15e0c23 Merge branch 'zh/commit-trailer'
"git commit" learned "--trailer <key>[=<value>]" option; together
with the interpret-trailers command, this will make it easier to
support custom trailers.

* zh/commit-trailer:
  commit: add --trailer option
2021-04-07 16:54:08 -07:00
Junio C Hamano
642a40019c Merge branch 'ah/plugleaks'
Plug or annotate remaining leaks that trigger while running the
very basic set of tests.

* ah/plugleaks:
  transport: also free remote_refs in transport_disconnect()
  parse-options: don't leak alias help messages
  parse-options: convert bitfield values to use binary shift
  init-db: silence template_dir leak when converting to absolute path
  init: remove git_init_db_config() while fixing leaks
  worktree: fix leak in dwim_branch()
  clone: free or UNLEAK further pointers when finished
  reset: free instead of leaking unneeded ref
  symbolic-ref: don't leak shortened refname in check_symref()
2021-04-07 16:54:08 -07:00
Derrick Stolee
a039a1fcf9 maintenance: simplify prefetch logic
The previous logic filled a string list with the names of each remote,
but instead we could simply run the appropriate 'git fetch' data
directly in the remote iterator. Do this for reduced code size, but also
because it sets up an upcoming change to use the remote's refspec. This
data is accessible from the 'struct remote' data that is now accessible
in fetch_remote().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 14:23:47 -07:00
Junio C Hamano
8a4394d1c1 Merge branch 'zh/format-patch-fractional-reroll-count'
"git format-patch -v<n>" learned to allow a reroll count that is
not an integer.

* zh/format-patch-fractional-reroll-count:
  format-patch: allow a non-integral version numbers
2021-04-02 14:43:14 -07:00
Junio C Hamano
861794b60d Merge branch 'jh/simple-ipc'
A simple IPC interface gets introduced to build services like
fsmonitor on top.

* jh/simple-ipc:
  t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
  simple-ipc: add Unix domain socket implementation
  unix-stream-server: create unix domain socket under lock
  unix-socket: disallow chdir() when creating unix domain sockets
  unix-socket: add backlog size option to unix_stream_listen()
  unix-socket: eliminate static unix_stream_socket() helper function
  simple-ipc: add win32 implementation
  simple-ipc: design documentation for new IPC mechanism
  pkt-line: add options argument to read_packetized_to_strbuf()
  pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
  pkt-line: do not issue flush packets in write_packetized_*()
  pkt-line: eliminate the need for static buffer in packet_write_gently()
2021-04-02 14:43:14 -07:00
Junio C Hamano
c47679d040 Merge branch 'mt/parallel-checkout-part-1'
Preparatory API changes for parallel checkout.

* mt/parallel-checkout-part-1:
  entry: add checkout_entry_ca() taking preloaded conv_attrs
  entry: move conv_attrs lookup up to checkout_entry()
  entry: extract update_ce_after_write() from write_entry()
  entry: make fstat_output() and read_blob_entry() public
  entry: extract a header file for entry.c functions
  convert: add classification for conv_attrs struct
  convert: add get_stream_filter_ca() variant
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: make convert_attrs() and convert structs public
2021-04-02 14:43:14 -07:00
Taylor Blau
9218c6a40c midx: allow marking a pack as preferred
When multiple packs in the multi-pack index contain the same object, the
MIDX machinery must make a choice about which pack it associates with
that object. Prior to this patch, the lowest-ordered[1] pack was always
selected.

Pack selection for duplicate objects is relatively unimportant today,
but it will become important for multi-pack bitmaps. This is because we
can only invoke the pack-reuse mechanism when all of the bits for reused
objects come from the reuse pack (in order to ensure that all reused
deltas can find their base objects in the same pack).

To encourage the pack selection process to prefer one pack over another
(the pack to be preferred is the one a caller would like to later use as
a reuse pack), introduce the concept of a "preferred pack". When
provided, the MIDX code will always prefer an object found in a
preferred pack over any other.

No format changes are required to store the preferred pack, since it
will be able to be inferred with a corresponding MIDX bitmap, by looking
up the pack associated with the object in the first bit position (this
ordering is described in detail in a subsequent commit).

[1]: the ordering is specified by MIDX internals; for our purposes we
can consider the "lowest ordered" pack to be "the one with the
most-recent mtime.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00
Li Linchao
4fe788b1b0 builtin/clone.c: add --reject-shallow option
In some scenarios, users may want more history than the repository
offered for cloning, which happens to be a shallow repository, can
give them. But because users don't know it is a shallow repository
until they download it to local, we may want to refuse to clone
this kind of repository, without creating any unnecessary files.

The '--depth=x' option cannot be used as a solution; the source may
be deep enough to give us 'x' commits when cloned, but the user may
later need to deepen the history to arbitrary depth.

Teach '--reject-shallow' option to "git clone" to abort as soon as
we find out that we are cloning from a shallow repository.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 12:58:58 -07:00
Taylor Blau
3f267a1128 builtin/pack-objects.c: respect 'pack.preferBitmapTips'
When writing a new pack with a bitmap, it is sometimes convenient to
indicate some reference prefixes which should receive priority when
selecting which commits to receive bitmaps.

A truly motivated caller could accomplish this by setting
'pack.islandCore', (since all commits in the core island are similarly
marked as preferred) but this requires callers to opt into using delta
islands, which they may or may not want to do.

Introduce a new multi-valued configuration, 'pack.preferBitmapTips' to
allow callers to specify a list of reference prefixes. All references
which have a prefix contained in 'pack.preferBitmapTips' will mark their
tips as "preferred" in the same way as commits are marked as preferred
for selection by 'pack.islandCore'.

The choice of the verb "prefer" is intentional: marking the NEEDS_BITMAP
flag on an object does *not* guarantee that that object will receive a
bitmap. It merely guarantees that that commit will receive a bitmap over
any *other* commit in the same window by bitmap_writer_select_commits().

The test this patch adds reflects this quirk, too. It only tests that
a commit (which didn't receive bitmaps by default) is selected for
bitmaps after changing the value of 'pack.preferBitmapTips' to include
it. Other commits may lose their bitmaps as a byproduct of how the
selection process works (bitmap_writer_select_commits() ignores the
remainder of a window after seeing a commit with the NEEDS_BITMAP flag).

This configuration will aide in selecting important references for
multi-pack bitmaps, since they do not respect the same pack.islandCore
configuration. (They could, but doing so may be confusing, since it is
packs--not bitmaps--which are influenced by the delta-islands
configuration).

In a fork network repository (one which lists all forks of a given
repository as remotes), for example, it is useful to set
pack.preferBitmapTips to 'refs/remotes/<root>/heads' and
'refs/remotes/<root>/tags', where '<root>' is an opaque identifier
referring to the repository which is at the base of the fork chain.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 23:14:03 -07:00
Elijah Newren
39edfd5cbc sequencer: fix edit handling for cherry-pick and revert messages
save_opts() should save any non-default values.  It was intended to do
this, but since most options in struct replay_opts default to 0, it only
saved non-zero values.  Unfortunately, this does not always work for
options.edit.  Roughly speaking, options.edit had a default value of 0
for cherry-pick but a default value of 1 for revert.  Make save_opts()
record a value whenever it differs from the default.

options.edit was also overly simplistic; we had more than two cases.
The behavior that previously existed was as follows:

                       Non-conflict commits    Right after Conflict
    revert             Edit iff isatty(0)      Edit (ignore isatty(0))
    cherry-pick        No edit                 See above
    Specify --edit     Edit (ignore isatty(0)) See above
    Specify --no-edit  (*)                     See above

    (*) Before stopping for conflicts, No edit is the behavior.  After
        stopping for conflicts, the --no-edit flag is not saved so see
        the first two rows.

However, the expected behavior is:

                       Non-conflict commits    Right after Conflict
    revert             Edit iff isatty(0)      Edit iff isatty(0)
    cherry-pick        No edit                 Edit iff isatty(0)
    Specify --edit     Edit (ignore isatty(0)) Edit (ignore isatty(0))
    Specify --no-edit  No edit                 No edit

In order to get the expected behavior, we need to change options.edit
to a tri-state: unspecified, false, or true.  When specified, we follow
what it says.  When unspecified, we need to check whether the current
commit being created is resolving a conflict as well as consulting
options.action and isatty(0).  While at it, add a should_edit() utility
function that compresses options.edit down to a boolean based on the
additional information for the non-conflict case.

continue_single_pick() is the function responsible for resuming after
conflict cases, regardless of whether there is one commit being picked
or many.  Make this function stop assuming edit behavior in all cases,
so that it can correctly handle !isatty(0) and specific requests to not
edit the commit message.

Reported-by: Renato Botelho <garga@freebsd.org>
Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-31 14:10:50 -07:00
Junio C Hamano
4730c5e273 Merge branch 'hx/pack-objects-chunk-comment'
Comment update.

* hx/pack-objects-chunk-comment:
  pack-objects: fix comment of reused_chunk.difference
2021-03-30 14:35:37 -07:00
Junio C Hamano
dc2a073036 Merge branch 'ab/remove-rebase-usebuiltin'
Remove the final hint that we used to have a scripted "git rebase".

* ab/remove-rebase-usebuiltin:
  rebase: remove transitory rebase.useBuiltin setting & env
2021-03-30 14:35:37 -07:00
Junio C Hamano
ad16f748f2 Merge branch 'ab/read-tree'
Code simplification by removing support for a caller that is long gone.

* ab/read-tree:
  tree.h API: simplify read_tree_recursive() signature
  tree.h API: expose read_tree_1() as read_tree_at()
  archive: stop passing "stage" through read_tree_recursive()
  ls-files: refactor away read_tree()
  ls-files: don't needlessly pass around stage variable
  tree.c API: move read_tree() into builtin/ls-files.c
  ls-files tests: add meaningful --with-tree tests
  show tests: add test for "git show <tree>"
2021-03-30 14:35:37 -07:00
Derrick Stolee
dcc5fd5fd2 sparse-checkout: disable sparse-index
We use 'git sparse-checkout init --cone --sparse-index' to toggle the
sparse-index feature. It makes sense to also disable it when running
'git sparse-checkout disable'. This is particularly important because it
removes the extensions.sparseIndex config option, allowing other tools
to use this Git repository again.

This does mean that 'git sparse-checkout init' will not re-enable the
sparse-index feature, even if it was previously enabled.

While testing this feature, I noticed that the sparse-index was not
being written on the first run, but by a second. This was caught by the
call to 'test-tool read-cache --table'. This requires adjusting some
assignments to core_apply_sparse_checkout and pl.use_cone_patterns in
the sparse_checkout_init() logic.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
Derrick Stolee
122ba1f7b5 sparse-checkout: toggle sparse index from builtin
The sparse index extension is used to signal that index writes should be
in sparse mode. This was only updated using GIT_TEST_SPARSE_INDEX=1.

Add a '--[no-]sparse-index' option to 'git sparse-checkout init' that
specifies if the sparse index should be used. It also updates the index
to use the correct format, either way. Add a warning in the
documentation that the use of a repository extension might reduce
compatibility with third-party tools. 'git sparse-checkout init' already
sets extension.worktreeConfig, which places most sparse-checkout users
outside of the scope of most third-party tools.

Update t1092-sparse-checkout-compatibility.sh to use this CLI instead of
GIT_TEST_SPARSE_INDEX=1.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:48 -07:00
Derrick Stolee
836e25c51b sparse-checkout: hold pattern list in index
As we modify the sparse-checkout definition, we perform index operations
on a pattern_list that only exists in-memory. This allows easy backing
out in case the index update fails.

However, if the index write itself cares about the sparse-checkout
pattern set, we need access to that in-memory copy. Place a pointer to
a 'struct pattern_list' in the index so we can access this on-demand.
This will be used in the next change which uses the sparse-checkout
definition to filter out directories that are outside the sparse cone.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:57:46 -07:00
Taylor Blau
cd57bc41bb builtin/multi-pack-index.c: display usage on unrecognized command
When given a sub-command that it doesn't understand, 'git
multi-pack-index' dies with the following message:

    $ git multi-pack-index bogus
    fatal: unrecognized subcommand: bogus

Instead of 'die()'-ing, we can display the usage text, which is much
more helpful:

    $ git.compile multi-pack-index bogus
    error: unrecognized subcommand: bogus
    usage: git multi-pack-index [<options>] write
       or: git multi-pack-index [<options>] verify
       or: git multi-pack-index [<options>] expire
       or: git multi-pack-index [<options>] repack [--batch-size=<size>]

        --object-dir <file>   object directory containing set of packfile and pack-index pairs
        --progress            force progress reporting

While we're at it, clean up some duplication between the "no sub-command"
and "unrecognized sub-command" conditionals.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
Taylor Blau
690eb05719 builtin/multi-pack-index.c: don't enter bogus cmd_mode
Even before the recent refactoring, 'git multi-pack-index' calls
'trace2_cmd_mode()' before verifying that the sub-command is recognized.

Push this call down into the individual sub-commands so that we don't
enter a bogus command mode.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
Taylor Blau
60ca94769c builtin/multi-pack-index.c: split sub-commands
Handle sub-commands of the 'git multi-pack-index' builtin (e.g.,
"write", "repack", etc.) separately from one another. This allows
sub-commands with unique options, without forcing cmd_multi_pack_index()
to reject invalid combinations itself.

This comes at the cost of some duplication and boilerplate. Luckily, the
duplication is reduced to a minimum, since common options are shared
among sub-commands due to a suggestion by Ævar. (Sub-commands do have to
retain the common options, too, since this builtin accepts common
options on either side of the sub-command).

Roughly speaking, cmd_multi_pack_index() parses options (including
common ones), and stops at the first non-option, which is the
sub-command. It then dispatches to the appropriate sub-command, which
parses the remaining options (also including common options).

Unknown options are kept by the sub-commands in order to detect their
presence (and complain that too many arguments were given).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
Taylor Blau
b25b727494 builtin/multi-pack-index.c: define common usage with a macro
Factor out the usage message into pieces corresponding to each mode.
This avoids options specific to one sub-command from being shared with
another in the usage.

A subsequent commit will use these #define macros to have usage
variables for each sub-command without duplicating their contents.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
Taylor Blau
cf1f5389ec builtin/multi-pack-index.c: don't handle 'progress' separately
Now that there is a shared 'flags' member in the options structure,
there is no need to keep track of whether to force progress or not,
since ultimately the decision of whether or not to show a progress meter
is controlled by a bit in the flags member.

Manipulate that bit directly, and drop the now-unnecessary 'progress'
field while we're at it.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
Taylor Blau
f7c4d63e35 builtin/multi-pack-index.c: inline 'flags' with options
Subcommands of the 'git multi-pack-index' command (e.g., 'write',
'verify', etc.) will want to optionally change a set of shared flags
that are eventually passed to the MIDX libraries.

Right now, options and flags are handled separately. That's fine, since
the options structure is never passed around. But a future patch will
make it so that common options shared by all sub-commands are defined in
a common location. That means that "flags" would have to become a global
variable.

Group it with the options structure so that we reduce the number of
global variables we have overall.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30 12:16:56 -07:00
Chinmoy Chakraborty
5ee90326dc column, range-diff: downcase option description
It is customary not to begin the help text for each option given to
the parse-options API with a capital letter. Various (sub)commands'
option arrays don't follow the guideline provided by the parse_options
Documentation regarding the descriptions.

Downcase the first word of some option descriptions for "column"
and "range-diff".

Signed-off-by: Chinmoy Chakraborty <chinmoy12c@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-29 14:06:08 -07:00
Ævar Arnfjörð Bjarmason
3745e2693d fetch-pack: use new fsck API to printing dangling submodules
Refactor the check added in 5476e1efde (fetch-pack: print and use
dangling .gitmodules, 2021-02-22) to make use of us now passing the
"msg_id" to the user defined "error_func". We can now compare against
the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated
message.

Let's also replace register_found_gitmodules() with directly
manipulating the "gitmodules_found" member. A recent commit moved it
into "fsck_options" so we could do this here.

I'm sticking this callback in fsck.c. Perhaps in the future we'd like
to accumulate such callbacks into another file (maybe fsck-cb.c,
similar to parse-options-cb.c?), but while we've got just the one
let's just put it into fsck.c.

A better alternative in this case would be some library some more
obvious library shared by fetch-pack.c ad builtin/index-pack.c, but
there isn't such a thing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason
462f5cae0f fetch-pack: don't needlessly copy fsck_options
Change the behavior of the .gitmodules validation added in
5476e1efde (fetch-pack: print and use dangling .gitmodules,
2021-02-22) so we're using one "fsck_options".

I found that code confusing to read. One might think that not setting
up the error_func earlier means that we're relying on the "error_func"
not being set in some code in between the two hunks being modified
here.

But we're not, all we're doing in the rest of "cmd_index_pack()" is
further setup by calling fsck_set_msg_types(), and assigning to
do_fsck_object.

So there was no reason in 5476e1efde to make a shallow copy of the
fsck_options struct before setting error_func. Let's just do this
setup at the top of the function, along with the "walk" assignment.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason
53692df2b8 fsck.c: add an fsck_set_msg_type() API that takes enums
Change code I added in acf9de4c94 (mktag: use fsck instead of custom
verify_tag(), 2021-01-05) to make use of a new API function that takes
the fsck_msg_{id,type} types, instead of arbitrary strings that
we'll (hopefully) parse into those types.

At the time that the fsck_set_msg_type() API was introduced in
0282f4dced (fsck: offer a function to demote fsck errors to warnings,
2015-06-22) it was only intended to be used to parse user-supplied
data.

For things that are purely internal to the C code it makes sense to
have the compiler check these arguments, and to skip the sanity
checking of the data in fsck_set_msg_type() which is redundant to
checks we get from the compiler.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason
394d5d31b0 fsck.c: pass along the fsck_msg_id in the fsck_error callback
Change the fsck_error callback to also pass along the
fsck_msg_id. Before this change the only way to get the message id was
to parse it back out of the "message".

Let's pass it down explicitly for the benefit of callers that might
want to use it, as discussed in [1].

Passing the msg_type is now redundant, as you can always get it back
from the msg_id, but I'm not changing that convention. It's really
common to need the msg_type, and the report() function itself (which
calls "fsck_error") needs to call fsck_msg_type() to discover
it. Let's not needlessly re-do that work in the user callback.

1. https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason
1b32b59f9b fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum
Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new
fsck_msg_type enum.

These defines were originally introduced in:

 - ba002f3b28 (builtin-fsck: move common object checking code to
   fsck.c, 2008-02-25)
 - f50c440730 (fsck: disallow demoting grave fsck errors to warnings,
   2015-06-22)
 - efaba7cc77 (fsck: optionally ignore specific fsck issues
   completely, 2015-06-22)
 - f27d05b170 (fsck: allow upgrading fsck warnings to errors,
   2015-06-22)

The reason these were defined in two different places is because we
use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are
used by external callbacks.

Untangling that would take some more work, since we expose the new
"enum fsck_msg_type" to both. Similar to "enum object_type" it's not
worth structuring the API in such a way that only those who need
FSCK_{ERROR,WARN} pass around a different type.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason
a1aad71601 fsck.h: use "enum object_type" instead of "int"
Change the fsck_walk_func to use an "enum object_type" instead of an
"int" type. The types are compatible, and ever since this was added in
355885d531 (add generic, type aware object chain walker, 2008-02-25)
we've used entries from object_type (OBJ_BLOB etc.).

So this doesn't really change anything as far as the generated code is
concerned, it just gives the compiler more information and makes this
easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-28 19:03:10 -07:00
Junio C Hamano
89519f662c Merge branch 'cm/rebase-i-fixup-amend-reword'
"git commit --fixup=<commit>", which was to tweak the changes made
to the contents while keeping the original log message intact,
learned "--fixup=(amend|reword):<commit>", that can be used to
tweak both the message and the contents, and only the message,
respectively.

* cm/rebase-i-fixup-amend-reword:
  doc/git-commit: add documentation for fixup=[amend|reword] options
  t3437: use --fixup with options to create amend! commit
  t7500: add tests for --fixup=[amend|reword] options
  commit: add a reword suboption to --fixup
  commit: add amend suboption to --fixup to create amend! commit
  sequencer: export and rename subject_length()
2021-03-26 14:59:03 -07:00
Junio C Hamano
2744383cbd Merge branch 'tb/geometric-repack'
"git repack" so far has been only capable of repacking everything
under the sun into a single pack (or split by size).  A cleverer
strategy to reduce the cost of repacking a repository has been
introduced.

* tb/geometric-repack:
  builtin/pack-objects.c: ignore missing links with --stdin-packs
  builtin/repack.c: reword comment around pack-objects flags
  builtin/repack.c: be more conservative with unsigned overflows
  builtin/repack.c: assign pack split later
  t7703: test --geometric repack with loose objects
  builtin/repack.c: do not repack single packs with --geometric
  builtin/repack.c: add '--geometric' option
  packfile: add kept-pack cache for find_kept_pack_entry()
  builtin/pack-objects.c: rewrite honor-pack-keep logic
  p5303: measure time to repack with keep
  p5303: add missing &&-chains
  builtin/pack-objects.c: add '--stdin-packs' option
  revision: learn '--no-kept-objects'
  packfile: introduce 'find_kept_pack_entry()'
2021-03-24 14:36:27 -07:00
Han Xin
bf12013f1a pack-objects: fix comment of reused_chunk.difference
As record_reused_object(offset, offset - hashfile_total(out)) said,
reused_chunk.difference should be the offset of original packfile minus
the offset of the generated packfile. But the comment presented an opposite way.

Signed-off-by: Han Xin <hanxin.hx@alibaba-inc.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-24 13:03:22 -07:00
Ævar Arnfjörð Bjarmason
9bcde4d531 rebase: remove transitory rebase.useBuiltin setting & env
Remove the rebase.useBuiltin setting and the now-obsolete
GIT_TEST_REBASE_USE_BUILTIN test flag.

This was left in place after my d03ebd411c (rebase: remove the
rebase.useBuiltin setting, 2019-03-18) to help anyone who'd used the
experimental flag and wanted to know that it was the default, or that
they should transition their test environment to use the builtin
rebase unconditionally.

It's been more than long enough for those users to get a headsup about
this. So remove all the scaffolding that was left inplace after
d03ebd411c. I'm also removing the documentation entry, if anyone
still has this left in their configuration they can do some source
archaeology to figure out what it used to do, which makes more sense
than exposing every git user reading the documentation to this legacy
configuration switch.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 14:05:58 -07:00
ZheNing Hu
db91988aa1 format-patch: allow a non-integral version numbers
The `-v<n>` option of `format-patch` can give nothing but an
integral iteration number to patches in a series.  Some people,
however, prefer to mark a new iteration with only a small fixup
with a non integral iteration number (e.g. an "oops, that was
wrong" fix-up patch for v4 iteration may be labeled as "v4.1").

Allow `format-patch` to take such a non-integral iteration
number.

`<n>` can be any string, such as '3.1' or '4rev2'. In the case
where it is a non-integral value, the "Range-diff" and "Interdiff"
headers will not include the previous version.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 12:49:47 -07:00
Matheus Tavares
d052cc0382 entry: extract a header file for entry.c functions
The declarations of entry.c's public functions and structures currently
reside in cache.h. Although not many, they contribute to the size of
cache.h and, when changed, cause the unnecessary recompilation of
modules that don't really use these functions. So let's move them to a
new entry.h header. While at it let's also move a comment related to
checkout_entry() from entry.c to entry.h as it's more useful to describe
the function there.

Original-patch-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:34:05 -07:00
ZheNing Hu
2daae3d1d1 commit: add --trailer option
Historically, Git has supported the 'Signed-off-by' commit trailer
using the '--signoff' and the '-s' option from the command line.
But users may need to provide other trailer information from the
command line such as "Helped-by", "Reported-by", "Mentored-by",

Now implement a new `--trailer <token>[(=|:)<value>]` option to pass
other trailers to `interpret-trailers` and insert them into commit
messages.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-23 10:31:38 -07:00
Junio C Hamano
3099d4faa3 Merge branch 'bc/clone-bare-with-conflicting-config'
"git -c core.bare=false clone --bare ..." would have segfaulted,
which has been corrected.

* bc/clone-bare-with-conflicting-config:
  builtin/init-db: handle bare clones when core.bare set to false
2021-03-22 14:00:25 -07:00
Junio C Hamano
f5c73f69fd Merge branch 'dl/stash-show-untracked'
"git stash show" learned to optionally show untracked part of the
stash.

* dl/stash-show-untracked:
  stash show: learn stash.showIncludeUntracked
  stash show: teach --include-untracked and --only-untracked
2021-03-22 14:00:24 -07:00
Junio C Hamano
24119d9d7b Merge branch 'ab/grep-pcre2-allocfix'
Updates to memory allocation code around the use of pcre2 library.

* ab/grep-pcre2-allocfix:
  grep/pcre2: move definitions of pcre2_{malloc,free}
  grep/pcre2: move back to thread-only PCREv2 structures
  grep/pcre2: actually make pcre2 use custom allocator
  grep/pcre2: use pcre2_maketables_free() function
  grep/pcre2: use compile-time PCREv2 version test
  grep/pcre2: add GREP_PCRE2_DEBUG_MALLOC debug mode
  grep/pcre2: prepare to add debugging to pcre2_malloc()
  grep/pcre2: correct reference to grep_init() in comment
  grep/pcre2: drop needless assignment to NULL
  grep/pcre2: drop needless assignment + assert() on opt->pcre2
2021-03-22 14:00:23 -07:00
Junio C Hamano
52182e3b1f Merge branch 'ab/remote-write-config-in-camel-case'
Update C code that sets a few configuration variables when a remote
is configured so that it spells configuration variable names in the
canonical camelCase.

* ab/remote-write-config-in-camel-case:
  remote: write camel-cased *.pushRemote on rename
  remote: add camel-cased *.tagOpt key, like clone
2021-03-22 14:00:23 -07:00
Junio C Hamano
204333b015 Merge branch 'jk/open-dotgitx-with-nofollow'
It does not make sense to make ".gitattributes", ".gitignore" and
".mailmap" symlinks, as they are supposed to be usable from the
object store (think: bare repositories where HEAD:.mailmap etc. are
used).  When these files are symbolic links, we used to read the
contents of the files pointed by them by mistake, which has been
corrected.

* jk/open-dotgitx-with-nofollow:
  mailmap: do not respect symlinks for in-tree .mailmap
  exclude: do not respect symlinks for in-tree .gitignore
  attr: do not respect symlinks for in-tree .gitattributes
  exclude: add flags parameter to add_patterns()
  attr: convert "macro_ok" into a flags field
  add open_nofollow() helper
2021-03-22 14:00:22 -07:00
Andrzej Hunt
68ffe095a2 transport: also free remote_refs in transport_disconnect()
transport_get_remote_refs() can populate the transport struct's
remote_refs. transport_disconnect() is already responsible for most of
transport's cleanup - therefore we also take care of freeing remote_refs
there.

There are 2 locations where transport_disconnect() is called before
we're done using the returned remote_refs. This patch changes those
callsites to only call transport_disconnect() after the returned refs
are no longer being used - which is necessary to safely be able to
free remote_refs during transport_disconnect().

This commit fixes the following leak which was found while running
t0000, but is expected to also fix the same pattern of leak in all
locations that use transport_get_remote_refs():

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
    #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
    #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
    #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
    #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
    #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
    #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
    #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
    #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
    #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-21 14:39:10 -07:00
Ævar Arnfjörð Bjarmason
47957485b3 tree.h API: simplify read_tree_recursive() signature
Simplify the signature of read_tree_recursive() to omit the "base",
"baselen" and "stage" arguments. No callers of it use these parameters
for anything anymore.

The last function to call read_tree_recursive() with a non-"" path was
read_tree_recursive() itself, but that was changed in
ffd31f661d (Reimplement read_tree_recursive() using
tree_entry_interesting(), 2011-03-25).

The last user of the "stage" parameter went away in the last commit,
and even that use was mere boilerplate.

So let's remove those and rename the read_tree_recursive() function to
just read_tree(). We had another read_tree() function that I've
refactored away in preceding commits, since all in-tree users read
trees recursively with a callback we can change the name to signify
that this is the norm.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
Ævar Arnfjörð Bjarmason
9614ad3ce0 ls-files: refactor away read_tree()
Refactor away the read_tree() function into its only user,
overlay_tree_on_index().

First, change read_one_entry_opt() to use the strbuf parameter
read_tree_recursive() passes down in place. This finishes up a partial
refactoring started in 6a0b0b6de9 (tree.c: update read_tree_recursive
callback to pass strbuf as base, 2014-11-30).

Moving the rest into overlay_tree_on_index() makes this index juggling
we're doing easier to read.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
Ævar Arnfjörð Bjarmason
fcc7c12f11 ls-files: don't needlessly pass around stage variable
Now that read_tree() has been moved to ls-files.c we can get rid of
the stage != 1 case that'll never happen.

Let's not use read_tree_recursive() as a pass-through to pass "stage =
1" either. For now we'll pass an unused "stage = 0" for consistency
with other read_tree_recursive() callers, that argument will be
removed in a follow-up commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
Ævar Arnfjörð Bjarmason
eefadd18e1 tree.c API: move read_tree() into builtin/ls-files.c
Since the read_tree() API was added around the same time as
read_tree_recursive() in 94537c78a8 (Move "read_tree()" to
"tree.c"[...], 2005-04-22) and b12ec373b8 ([PATCH] Teach read-tree
about commit objects, 2005-04-20) things have gradually migrated over
to the read_tree_recursive() version.

Now builtin/ls-files.c is the last user of this code, let's move all
the relevant code there. This allows for subsequent simplification of
it, and an eventual move to read_tree_recursive().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:25 -07:00
Elijah Newren
5291828df8 merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
There are a variety of questions users might ask while resolving
conflicts:
  * What changes have been made since the previous (first) parent?
  * What changes are staged?
  * What is still unstaged? (or what is still conflicted?)
  * What changes did I make to resolve conflicts so far?
The first three of these have simple answers:
  * git diff HEAD
  * git diff --cached
  * git diff
There was no way to answer the final question previously.  Adding one
is trivial in merge-ort, since it works by creating a tree representing
what should be written to the working copy complete with conflict
markers.  Simply write that tree to .git/AUTO_MERGE, allowing users to
answer the fourth question with
  * git diff AUTO_MERGE

I avoided using a name like "MERGE_AUTO", because that would be
merge-specific (much like MERGE_HEAD, REBASE_HEAD, REVERT_HEAD,
CHERRY_PICK_HEAD) and I wanted a name that didn't change depending on
which type of operation the merge was part of.

Ensure that paths which clean out other temporary operation-specific
files (e.g. CHERRY_PICK_HEAD, MERGE_MSG, rebase-merge/ state directory)
also clean out this AUTO_MERGE file.

Signed-off-by: Elijah Newren <newren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 12:35:40 -07:00
Junio C Hamano
ef486a9ecf Merge branch 'tb/git-mv-icase-fix'
Fix a corner case bug in "git mv" on case insensitive systems,
which was introduced in 2.29 timeframe.

* tb/git-mv-icase-fix:
  git mv foo FOO ; git mv foo bar gave an assert
2021-03-19 15:25:40 -07:00
Junio C Hamano
92ccd7b752 Merge branch 'rs/calloc-array'
CALLOC_ARRAY() macro replaces many uses of xcalloc().

* rs/calloc-array:
  cocci: allow xcalloc(1, size)
  use CALLOC_ARRAY
  git-compat-util.h: drop trailing semicolon from macro definition
2021-03-19 15:25:38 -07:00
Junio C Hamano
35381b13da Merge branch 'jk/bisect-peel-tag-fix'
"git bisect" reimplemented more in C during 2.30 timeframe did not
take an annotated tag as a good/bad endpoint well.  This regression
has been corrected.

* jk/bisect-peel-tag-fix:
  bisect: peel annotated tags to commits
2021-03-19 15:25:37 -07:00
Junio C Hamano
eabacfd9cb Merge branch 'jc/calloc-fix'
Code clean-up.

* jc/calloc-fix:
  xcalloc: use CALLOC_ARRAY() when applicable
2021-03-19 15:25:37 -07:00
Taylor Blau
14e7b8344f builtin/pack-objects.c: ignore missing links with --stdin-packs
When 'git pack-objects --stdin-packs' encounters a commit in a pack, it
marks it as a starting point of a best-effort reachability traversal
that is used to populate the name-hash of the objects listed in the
given packs.

The traversal expects that it should be able to walk the ancestors of
all commits in a pack without issue. Ordinarily this is the case, but it
is possible to having missing parents from an unreachable part of the
repository. In that case, we'd consider any missing objects in the
unreachable portion of the graph to be junk.

This should be handled gracefully: since the traversal is best-effort
(i.e., we don't strictly need to fill in all of the name-hash fields),
we should simply ignore any missing links.

This patch does that (by setting the 'ignore_missing_links' bit on the
rev_info struct), and ensures we don't regress in the future by adding a
test which demonstrates this case.

It is a little over-eager, since it will also ignore missing links in
reachable parts of the packs (which would indicate a corrupted
repository), but '--stdin-packs' is explicitly *not* about reachability.
So this step isn't making anything worse for a repository which contains
packs missing reachable objects (since we never drop objects with
'--stdin-packs').

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-19 11:19:29 -07:00
Ævar Arnfjörð Bjarmason
fb79f5bff7 fsck.c: refactor and rename common config callback
Refactor code I recently changed in 1f3299fda9 (fsck: make
fsck_config() re-usable, 2021-01-05) so that I could use fsck's config
callback in mktag in 1f3299fda9 (fsck: make fsck_config() re-usable,
2021-01-05).

I don't know what I was thinking in structuring the code this way, but
it clearly makes no sense to have an fsck_config_internal() at all
just so it can get a fsck_options when git_config() already supports
passing along some void* data.

Let's just make use of that instead, which gets us rid of the two
wrapper functions, and brings fsck's common config callback in line
with other such reusable config callbacks.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 14:02:43 -07:00
Jeff King
7730f85594 bisect: peel annotated tags to commits
This patch fixes a bug where git-bisect doesn't handle receiving
annotated tags as "git bisect good <tag>", etc. It's a regression in
27257bc466 (bisect--helper: reimplement `bisect_state` & `bisect_head`
shell functions in C, 2020-10-15).

The original shell code called:

  sha=$(git rev-parse --verify "$rev^{commit}") ||
          die "$(eval_gettext "Bad rev input: \$rev")"

which will peel the input to a commit (or complain if that's not
possible). But the C code just calls get_oid(), which will yield the oid
of the tag.

The fix is to peel to a commit. The error message here is a little
non-idiomatic for Git (since it starts with a capital). I've mostly left
it, as it matches the other converted messages (like the "Bad rev input"
we print when get_oid() fails), though I did add an indication that it
was the peeling that was the problem. It might be worth taking a pass
through this converted code to modernize some of the error messages.

Note also that the test does a bare "grep" (not i18ngrep) on the
expected "X is the first bad commit" output message. This matches the
rest of the test script.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-17 11:24:08 -07:00
Junio C Hamano
486f4bd183 xcalloc: use CALLOC_ARRAY() when applicable
These are for codebase before Git 2.31

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 17:51:10 -07:00
Jeff Hostetler
77e522caae unix-socket: disallow chdir() when creating unix domain sockets
Calls to `chdir()` are dangerous in a multi-threaded context.  If
`unix_stream_listen()` or `unix_stream_connect()` is given a socket
pathname that is too long to fit in a `sockaddr_un` structure, it will
`chdir()` to the parent directory of the requested socket pathname,
create the socket using a relative pathname, and then `chdir()` back.
This is not thread-safe.

Teach `unix_sockaddr_init()` to not allow calls to `chdir()` when this
flag is set.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
Jeff Hostetler
55144ccb0a unix-socket: add backlog size option to unix_stream_listen()
Update `unix_stream_listen()` to take an options structure to override
default behaviors.  This commit includes the size of the `listen()` backlog.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:32:51 -07:00
Charvi Mendiratta
3270ae82ac commit: add a reword suboption to --fixup
`git commit --fixup=reword:<commit>` aliases
`--fixup=amend:<commit> --only`, where it creates an empty "amend!"
commit that will reword <commit> without changing its contents when
it is rebased with `--autosquash`.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
Charvi Mendiratta
494d314a05 commit: add amend suboption to --fixup to create amend! commit
`git commit --fixup=amend:<commit>` will create an "amend!" commit.
The resulting commit message subject will be "amend! ..." where
"..." is the subject line of <commit> and the initial message
body will be <commit>'s message.

The "amend!" commit when rebased with --autosquash will fixup the
contents and replace the commit message of <commit> with the
"amend!" commit's message body.

In order to prevent rebase from creating commits with an empty
message we refuse to create an "amend!" commit if commit message
body is empty.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Charvi Mendiratta <charvi077@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-15 14:29:35 -07:00
Andrzej Hunt
04fe4d75fa init-db: silence template_dir leak when converting to absolute path
template_dir starts off pointing to either argv or nothing. However if
the value supplied in argv is a relative path, absolute_pathdup() is
used to turn it into an absolute path. absolute_pathdup() allocates
a new string, and we then "leak" it when cmd_init_db() completes.

We don't bother to actually free the return value (instead we UNLEAK
it), because there's no significant advantage to doing so here.
Correctly freeing it would require more significant changes to code flow
which would be more noisy than beneficial.

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:58:00 -07:00
Andrzej Hunt
e4de4502e6 init: remove git_init_db_config() while fixing leaks
The primary goal of this change is to stop leaking init_db_template_dir.
This leak can happen because:
 1. git_init_db_config() allocates new memory into init_db_template_dir
    without first freeing the existing value.
 2. init_db_template_dir might already contain data, either because:
  2.1 git_config() can be invoked twice with this callback in a single
      process - at least 2 allocations are likely.
  2.2 A single git_config() allocation can invoke the callback multiple
      times for a given key (see further explanation in the function
      docs) - each of those calls will trigger another leak.

The simplest fix for the leak would be to free(init_db_template_dir)
before overwriting it. Instead we choose to convert to fetching
init.templatedir via git_config_get_value() as that is more explicit,
more efficient, and avoids allocations (the returned result is owned by
the config cache, so we aren't responsible for freeing it).

If we remove init_db_template_dir, git_init_db_config() ends up being
responsible only for forwarding core.* config values to
platform_core_config(). However platform_core_config() already ignores
non-core.* config values, so we can safely remove git_init_db_config()
and invoke git_config() directly with platform_core_config() as the
callback.

The platform_core_config forwarding was originally added in:
  287853392a (mingw: respect core.hidedotfiles = false in git-init again, 2019-03-11
And I suspect the potential for a leak existed since the original
implementation of git_init_db_config in:
  90b45187ba (Add `init.templatedir` configuration variable., 2010-02-17)

LSAN output from t0001:

Direct leak of 73 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9a7276 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x9362ad in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x936eaa in strbuf_add /home/ahunt/oss-fuzz/git/strbuf.c:295:2
    #4 0x868112 in strbuf_addstr /home/ahunt/oss-fuzz/git/./strbuf.h:304:2
    #5 0x86a8ad in expand_user_path /home/ahunt/oss-fuzz/git/path.c:758:2
    #6 0x720bb1 in git_config_pathname /home/ahunt/oss-fuzz/git/config.c:1287:10
    #7 0x5960e2 in git_init_db_config /home/ahunt/oss-fuzz/git/builtin/init-db.c:161:11
    #8 0x7255b8 in configset_iter /home/ahunt/oss-fuzz/git/config.c:1982:7
    #9 0x7253fc in repo_config /home/ahunt/oss-fuzz/git/config.c:2311:2
    #10 0x725ca7 in git_config /home/ahunt/oss-fuzz/git/config.c:2399:2
    #11 0x593e8d in create_default_files /home/ahunt/oss-fuzz/git/builtin/init-db.c:225:2
    #12 0x5935c6 in init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:449:11
    #13 0x59588e in cmd_init_db /home/ahunt/oss-fuzz/git/builtin/init-db.c:714:9
    #14 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #15 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #16 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #17 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #18 0x69c4de in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #19 0x7f23552d6349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
Andrzej Hunt
aa1b63971a worktree: fix leak in dwim_branch()
Make sure that we release the temporary strbuf during dwim_branch() for
all codepaths (and not just for the early return).

This leak appears to have been introduced in:
  f60a7b763f (worktree: teach "add" to check out existing branches, 2018-04-24)

Note that UNLEAK(branchname) is still needed: the returned result is
used in add(), and is stored in a pointer which is used to point at one
of:
  - a string literal ("HEAD")
  - member of argv (whatever the user specified in their invocation)
  - or our newly allocated string returned from dwim_branch()
Fixing the branchname leak isn't impossible, but does not seem
worthwhile given that add() is called directly from cmd_main(), and
cmd_main() returns immediately thereafter - UNLEAK is good enough.

This leak was found when running t0001 with LSAN, see also LSAN output
below:

Direct leak of 60 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9ab076 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x939fcd in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x93af53 in strbuf_splice /home/ahunt/oss-fuzz/git/strbuf.c:239:3
    #4 0x83559a in strbuf_check_branch_ref /home/ahunt/oss-fuzz/git/object-name.c:1593:2
    #5 0x6988b9 in dwim_branch /home/ahunt/oss-fuzz/git/builtin/worktree.c:454:20
    #6 0x695f8f in add /home/ahunt/oss-fuzz/git/builtin/worktree.c:525:19
    #7 0x694a04 in cmd_worktree /home/ahunt/oss-fuzz/git/builtin/worktree.c:1036:10
    #8 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #9 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #10 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #11 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #12 0x69caee in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #13 0x7f7b7dd10349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
Andrzej Hunt
0c4542738e clone: free or UNLEAK further pointers when finished
Most of these pointers can safely be freed when cmd_clone() completes,
therefore we make sure to free them. The one exception is that we
have to UNLEAK(repo) because it can point either to argv[0], or a
malloc'd string returned by absolute_pathdup().

We also have to free(path) in the middle of cmd_clone(): later during
cmd_clone(), path is unconditionally overwritten with a different path,
triggering a leak. Freeing the first path immediately after use (but
only in the case where it contains data) seems like the cleanest
solution, as opposed to freeing it unconditionally before path is reused
for another path. This leak appears to have been introduced in:
  f38aa83f9a (use local cloning if insteadOf makes a local URL, 2014-07-17)

These leaks were found when running t0001 with LSAN, see also an excerpt
of the LSAN output below (the full list is omitted because it's far too
long, and mostly consists of indirect leakage of members of the refs we
are freeing).

Direct leak of 178 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
    #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6fc4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6f9a in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce266 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x51e9bd in wanted_peer_refs /home/ahunt/oss-fuzz/git/builtin/clone.c:574:21
    #5 0x51cfe1 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1284:17
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c42e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f8fef0c2349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 178 byte(s) in 1 object(s) allocated from:
    #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
    #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8
    #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9
    #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8
    #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10
    #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 165 byte(s) in 1 object(s) allocated from:
    #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
    #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8
    #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20
    #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9
    #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8
    #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8
    #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4
    #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9
    #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4
    #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9
    #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Direct leak of 105 byte(s) in 1 object(s) allocated from:
    #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
    #1 0x9a71f6 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8
    #2 0x93622d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2
    #3 0x937a73 in strbuf_addch /home/ahunt/oss-fuzz/git/./strbuf.h:231:3
    #4 0x939fcd in strbuf_add_absolute_path /home/ahunt/oss-fuzz/git/strbuf.c:911:4
    #5 0x69d3ce in absolute_pathdup /home/ahunt/oss-fuzz/git/abspath.c:261:2
    #6 0x51c688 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1021:10
    #7 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #8 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #9 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #10 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #11 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #12 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
Andrzej Hunt
e901de6816 reset: free instead of leaking unneeded ref
dwim_ref() allocs a new string into ref. Instead of setting to NULL to
discard it, we can FREE_AND_NULL.

This leak appears to have been introduced in:
4cf76f6bbf (builtin/reset: compute checkout metadata for reset, 2020-03-16)

This leak was found when running t0001 with LSAN, see also LSAN output below:

Direct leak of 5 byte(s) in 1 object(s) allocated from:
    #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9a7108 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14
    #2 0x8add6b in expand_ref /home/ahunt/oss-fuzz/git/refs.c:670:12
    #3 0x8ad777 in repo_dwim_ref /home/ahunt/oss-fuzz/git/refs.c:644:22
    #4 0x6394af in dwim_ref /home/ahunt/oss-fuzz/git/./refs.h:162:9
    #5 0x637e5c in cmd_reset /home/ahunt/oss-fuzz/git/builtin/reset.c:426:4
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69c5ce in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f57ebb9d349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
Andrzej Hunt
f63b88867a symbolic-ref: don't leak shortened refname in check_symref()
shorten_unambiguous_ref() returns an allocated string. We have to
track it separately from the const refname.

This leak has existed since:
9ab55daa55 (git symbolic-ref --delete $symref, 2012-10-21)

This leak was found when running t0001 with LSAN, see also LSAN output
below:

Direct leak of 19 byte(s) in 1 object(s) allocated from:
    #0 0x486514 in strdup /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0x9ab048 in xstrdup /home/ahunt/oss-fuzz/git/wrapper.c:29:14
    #2 0x8b452f in refs_shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c
    #3 0x8b47e8 in shorten_unambiguous_ref /home/ahunt/oss-fuzz/git/refs.c:1287:9
    #4 0x679fce in check_symref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:28:14
    #5 0x679ad8 in cmd_symbolic_ref /home/ahunt/oss-fuzz/git/builtin/symbolic-ref.c:70:9
    #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11
    #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3
    #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4
    #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19
    #10 0x69cc6e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11
    #11 0x7f98388a4349 in __libc_start_main (/lib64/libc.so.6+0x24349)

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14 15:57:59 -07:00
René Scharfe
ca56dadb4b use CALLOC_ARRAY
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead.  It shortens the code and infers the
element size automatically.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 16:00:09 -08:00
brian m. carlson
75555676ad builtin/init-db: handle bare clones when core.bare set to false
In 552955ed7f ("clone: use more conventional config/option layering",
2020-10-01), clone learned to read configuration options earlier in its
execution, before creating the new repository.  However, that led to a
problem: if the core.bare setting is set to false in the global config,
cloning a bare repository segfaults.  This happens because the
repository is falsely thought to be non-bare, but clone has set the work
tree to NULL, which is then dereferenced.

The code to initialize the repository already considers the fact that a
user might want to override the --bare option for git init, but it
doesn't take into account clone, which uses a different option.  Let's
just check that the work tree is not NULL, since that's how clone
indicates that the repository is bare.  This is also the case for git
init, so we won't be regressing that case.

Reported-by: Joseph Vusich <jvusich@amazon.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-10 15:06:48 -08:00
Denton Liu
0af760e261 stash show: learn stash.showIncludeUntracked
The previous commit teaches `git stash show --include-untracked`. It
may be desirable for a user to be able to always enable the
--include-untracked behavior. Teach the stash.showIncludeUntracked
config option which allows users to do this in a similar manner to
stash.showPatch.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 14:31:27 -08:00
Denton Liu
d3c7bf73bd stash show: teach --include-untracked and --only-untracked
Stash entries can be made with untracked files via
`git stash push --include-untracked`. However, because the untracked
files are stored in the third parent of the stash entry and not the
stash entry itself, running `git stash show` does not include the
untracked files as part of the diff.

With --include-untracked, untracked paths, which are recorded in the
third-parent if it exists, are shown in addition to the paths that have
modifications between the stash base and the working tree in the stash.

It is possible to manually craft a malformed stash entry where duplicate
untracked files in the stash entry will mask tracked files. We detect
and error out in that case via a custom unpack_trees() callback:
stash_worktree_untracked_merge().

Also, teach stash the --only-untracked option which only shows the
untracked files of a stash entry. This is similar to `git show stash^3`
but it is nice to provide a convenient abstraction for it so that users
do not have to think about the underlying implementation.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 14:31:26 -08:00
Junio C Hamano
ccae01cab8 builtin/repack.c: reword comment around pack-objects flags
The comment in this block is meant to indicate that passing '--all',
'--reflog', and so on aren't necessary when repacking with the
'--geometric' option.

But, it has two problems: first, it is factually incorrect ('--all' is
*not* incompatible with '--stdin-packs' as the comment suggests);
second, it is quite focused on the geometric case for a block that is
guarding against it.

Reword this comment to address both issues.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
Taylor Blau
2a15964128 builtin/repack.c: be more conservative with unsigned overflows
There are a number of places in the geometric repack code where we
multiply the number of objects in a pack by another unsigned value. We
trust that the number of objects in a pack is always representable by a
uint32_t, but we don't necessarily trust that that number can be
multiplied without overflow.

Sprinkle some unsigned_add_overflows() and unsigned_mult_overflows() in
split_pack_geometry() to check that we never overflow any unsigned types
when adding or multiplying them.

Arguably these checks are a little too conservative, and certainly they
do not help the readability of this function. But they are serving a
useful purpose, so I think they are worthwhile overall.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
Taylor Blau
13d746a303 builtin/repack.c: assign pack split later
To determine the where to place the split when repacking with the
'--geometric' option, split_pack_geometry() assigns the "split" variable
and then decrements it in a loop.

It would be equivalent (and more readable) to assign the split to the
loop position after exiting the loop, so do that instead.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
Taylor Blau
f25e33c156 builtin/repack.c: do not repack single packs with --geometric
In 0fabafd0b9 (builtin/repack.c: add '--geometric' option, 2021-02-22),
the 'git repack --geometric' code aborts early when there is zero or one
pack.

When there are no packs, this code does the right thing by placing the
split at "0". But when there is exactly one pack, the split is placed at
"1", which means that "git repack --geometric" (with any factor)
repacks all of the objects in a single pack.

This is wasteful, and the remaining code in split_pack_geometry() does
the right thing (not repacking the objects in a single pack) even when
only one pack is present.

Loosen the guard to only stop when there aren't any packs, and let the
rest of the code do the right thing. Add a test to ensure that this is
the case.

Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-05 11:33:52 -08:00
Torsten Bögershausen
93c3d297b5 git mv foo FOO ; git mv foo bar gave an assert
The following sequence, on a case-insensitive file system,
(strictly speeking with core.ignorecase=true)
leads to an assertion failure and leaves .git/index.lock behind.

git init
echo foo >foo
git add foo
git mv foo FOO
git mv foo bar

This regression was introduced in Commit 9b906af657,
"git-mv: improve error message for conflicted file"

The bugfix is to change the "file exist case-insensitive in the index"
into the correct "file exist (case-sensitive) in the index".

This avoids the "assert" later in the code and keeps setting up the
"ce" pointer for ce_stage(ce) done in the next else if.

This fixes
https://github.com/git-for-windows/git/issues/2920

Reported-By: Dan Moseley <Dan.Moseley@microsoft.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-03 17:07:12 -08:00
Junio C Hamano
6ee353d42f Merge branch 'jt/transfer-fsck-across-packs'
The approach to "fsck" the incoming objects in "index-pack" is
attractive for performance reasons (we have them already in core,
inflated and ready to be inspected), but fundamentally cannot be
applied fully when we receive more than one pack stream, as a tree
object in one pack may refer to a blob object in another pack as
".gitmodules", when we want to inspect blobs that are used as
".gitmodules" file, for example.  Teach "index-pack" to emit
objects that must be inspected later and check them in the calling
"fetch-pack" process.

* jt/transfer-fsck-across-packs:
  fetch-pack: print and use dangling .gitmodules
  fetch-pack: with packfile URIs, use index-pack arg
  http-fetch: allow custom index-pack args
  http: allow custom index-pack args
2021-03-01 14:02:57 -08:00
Junio C Hamano
140045821a Merge branch 'jc/push-delete-nothing'
"git push $there --delete ''" should have been diagnosed as an
error, but instead turned into a matching push, which has been
corrected.

* jc/push-delete-nothing:
  push: do not turn --delete '' into a matching push
2021-02-25 16:43:33 -08:00
Junio C Hamano
1c8f5dfa42 Merge branch 'js/params-vs-args'
Messages update.

* js/params-vs-args:
  replace "parameters" by "arguments" in error messages
2021-02-25 16:43:32 -08:00
Junio C Hamano
d166e8c1d4 Merge branch 'es/maintenance-of-bare-repositories'
The "git maintenance register" command had trouble registering bare
repositories, which had been corrected.

* es/maintenance-of-bare-repositories:
  maintenance: fix incorrect `maintenance.repo` path with bare repository
2021-02-25 16:43:32 -08:00
Junio C Hamano
f277234860 Merge branch 'mt/add-chmod-fixes'
Various fixes on "git add --chmod".

* mt/add-chmod-fixes:
  add: propagate --chmod errors to exit status
  add: mark --chmod error string for translation
  add --chmod: don't update index when --dry-run is used
2021-02-25 16:43:31 -08:00
Junio C Hamano
682bbad64d Merge branch 'ah/rebase-no-fork-point-config'
"git rebase --[no-]fork-point" gained a configuration variable
rebase.forkPoint so that users do not have to keep specifying a
non-default setting.

* ah/rebase-no-fork-point-config:
  rebase: add a config option for --no-fork-point
2021-02-25 16:43:31 -08:00
Junio C Hamano
628c13ccee Merge branch 'mt/grep-sparse-checkout'
"git grep" has been tweaked to be limited to the sparse checkout
paths.

* mt/grep-sparse-checkout:
  grep: honor sparse-checkout on working tree searches
2021-02-25 16:43:31 -08:00
Junio C Hamano
845d6030f8 Merge branch 'jc/diffcore-rotate'
"git {diff,log} --{skip,rotate}-to=<path>" allows the user to
discard diff output for early paths or move them to the end of the
output.

* jc/diffcore-rotate:
  diff: --{rotate,skip}-to=<path>
2021-02-25 16:43:30 -08:00
Junio C Hamano
3da165ca28 Merge branch 'mt/checkout-index-corner-cases'
The error codepath around the "--temp/--prefix" feature of "git
checkout-index" has been improved.

* mt/checkout-index-corner-cases:
  checkout-index: omit entries with no tempname from --temp output
  write_entry(): fix misuses of `path` in error messages
2021-02-25 16:43:30 -08:00
Junio C Hamano
18decfd11d Merge branch 'rs/blame-optim'
Optimization in "git blame"

* rs/blame-optim:
  blame: remove unnecessary use of get_commit_info()
2021-02-25 16:43:29 -08:00
Junio C Hamano
6fe12b5215 Merge branch 'jk/rev-list-disk-usage'
"git rev-list" command learned "--disk-usage" option.

* jk/rev-list-disk-usage:
  docs/rev-list: add some examples of --disk-usage
  docs/rev-list: add an examples section
  rev-list: add --disk-usage option for calculating disk usage
  t: add --no-tag option to test_commit
2021-02-25 16:43:29 -08:00
Ævar Arnfjörð Bjarmason
0f1da600e6 remote: write camel-cased *.pushRemote on rename
When a remote is renamed don't change the canonical "*.pushRemote"
form to "*.pushremote". Fixes and tests for a minor bug in
923d4a5ca4 (remote rename/remove: handle branch.<name>.pushRemote
config values, 2020-01-27). See the preceding commit for why this does
& doesn't matter.

While we're at it let's also test that we handle the "*.pushDefault"
key correctly. The code to handle that was added in
b3fd6cbf29 (remote rename/remove: gently handle remote.pushDefault
config, 2020-02-01) and does the right thing, but nothing tested that
we wrote out the canonical camel-cased form.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 19:03:00 -08:00
Ævar Arnfjörð Bjarmason
bfa9148ff7 remote: add camel-cased *.tagOpt key, like clone
Change "git remote add" so that it adds a *.tagOpt key, and not the
lower-cased *.tagopt on "git remote add --no-tags", just as "git clone
--no-tags" would do.

This doesn't matter for anything that reads the config. It's just
prettier if we write config keys in their documented camelCase form to
user-readable config files.

When I added support for "clone -no-tags" in 0dab2468ee (clone: add a
--no-tags option to clone without tags, 2017-04-26) I made it use
the *.tagOpt form, but the older "git remote add" added in
111fb85865 (remote add: add a --[no-]tags option, 2010-04-20) has
been using *.tagopt all this time.

It's easy enough to add a test for this, so let's do that. We can't
use "git config -l" there, because it'll normalize the keys to their
lower-cased form. Let's add the test for "git clone" too for good
measure, not just to the "git remote" codepath we're fixing.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 19:02:58 -08:00
Matheus Tavares
9ebd7fe158 add: propagate --chmod errors to exit status
If `add` encounters an error while applying the --chmod changes, it
prints a message to stderr, but exits with a success code. This might
have been an oversight, as the command does exit with a non-zero code in
other situations where it cannot (or refuses to) update all of the
requested paths (e.g. when some of the given paths are ignored). So make
the exit behavior more consistent by also propagating --chmod errors to
the exit status.

Note: the test "all statuses changed in folder if . is given" uses paths
added by previous test cases, some of which might be symbolic links.
Because `git add --chmod` will now fail with such paths, this test would
depend on whether all the previous tests were executed, or only some
of them. Avoid that by running the test on a fresh repo with only
regular files.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:14:51 -08:00
Matheus Tavares
48960894f5 add: mark --chmod error string for translation
This error message is intended for humans, so mark it for translation.
Also use error() instead of fprintf(stderr, ...), to make the
corresponding line a bit cleaner, and to display the "error:" prefix,
which helps classifying the nature/severity of the message.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:14:51 -08:00
Matheus Tavares
c937d70bfb add --chmod: don't update index when --dry-run is used
`git add --chmod` applies the mode changes even when `--dry-run` is
used. Fix that and add some tests for this option combination.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 12:14:51 -08:00
Alex Henrie
2803d800d2 rebase: add a config option for --no-fork-point
Some users (myself included) would prefer to have this feature off by
default because it can silently drop commits.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-24 11:49:10 -08:00
Junio C Hamano
20e416409f push: do not turn --delete '' into a matching push
When we added a syntax sugar "git push remote --delete <ref>" to
"git push" as a synonym to the canonical "git push remote :<ref>"
syntax at f517f1f2 (builtin-push: add --delete as syntactic sugar
for :foo, 2009-12-30), we weren't careful enough to make sure that
<ref> is not empty.

Blindly rewriting "--delete <ref>" to ":<ref>" means that an empty
string <ref> results in refspec ":", which is the syntax to ask for
"matching" push that does not delete anything.

Worse yet, if there were matching refs that can be fast-forwarded,
they would have been published prematurely, even if the user feels
that they are not ready yet to be pushed out, which would be a real
disaster.

Noticed-by: Tilman Vogel <tilman.vogel@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 15:19:34 -08:00
Johannes Sixt
b865734760 replace "parameters" by "arguments" in error messages
When an error message informs the user about an incorrect command
invocation, it should refer to "arguments", not "parameters".

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 13:30:45 -08:00
Eric Sunshine
26c7974376 maintenance: fix incorrect maintenance.repo path with bare repository
The periodic maintenance tasks configured by `git maintenance start`
invoke `git for-each-repo` to run `git maintenance run` on each path
specified by the multi-value global configuration variable
`maintenance.repo`. Because `git for-each-repo` will likely be run
outside of the repositories which require periodic maintenance, it is
mandatory that the repository paths specified by `maintenance.repo` are
absolute.

Unfortunately, however, `git maintenance register` does nothing to
ensure that the paths it assigns to `maintenance.repo` are indeed
absolute, and may in fact -- especially in the case of a bare repository
-- assign a relative path to `maintenance.repo` instead. Fix this
problem by converting all paths to absolute before assigning them to
`maintenance.repo`.

While at it, also fix `git maintenance unregister` to convert paths to
absolute, as well, in order to ensure that it can correctly remove from
`maintenance.repo` a path assigned via `git maintenance register`.

Reported-by: Clement Moyroud <clement.moyroud@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 00:22:45 -08:00
Taylor Blau
0fabafd0b9 builtin/repack.c: add '--geometric' option
Often it is useful to both:

  - have relatively few packfiles in a repository, and

  - avoid having so few packfiles in a repository that we repack its
    entire contents regularly

This patch implements a '--geometric=<n>' option in 'git repack'. This
allows the caller to specify that they would like each pack to be at
least a factor times as large as the previous largest pack (by object
count).

Concretely, say that a repository has 'n' packfiles, labeled P1, P2,
..., up to Pn. Each packfile has an object count equal to 'objects(Pn)'.
With a geometric factor of 'r', it should be that:

  objects(Pi) > r*objects(P(i-1))

for all i in [1, n], where the packs are sorted by

  objects(P1) <= objects(P2) <= ... <= objects(Pn).

Since finding a true optimal repacking is NP-hard, we approximate it
along two directions:

  1. We assume that there is a cutoff of packs _before starting the
     repack_ where everything to the right of that cut-off already forms
     a geometric progression (or no cutoff exists and everything must be
     repacked).

  2. We assume that everything smaller than the cutoff count must be
     repacked. This forms our base assumption, but it can also cause
     even the "heavy" packs to get repacked, for e.g., if we have 6
     packs containing the following number of objects:

       1, 1, 1, 2, 4, 32

     then we would place the cutoff between '1, 1' and '1, 2, 4, 32',
     rolling up the first two packs into a pack with 2 objects. That
     breaks our progression and leaves us:

       2, 1, 2, 4, 32
         ^

     (where the '^' indicates the position of our split). To restore a
     progression, we move the split forward (towards larger packs)
     joining each pack into our new pack until a geometric progression
     is restored. Here, that looks like:

       2, 1, 2, 4, 32  ~>  3, 2, 4, 32  ~>  5, 4, 32  ~> ... ~> 9, 32
         ^                   ^                ^                   ^

This has the advantage of not repacking the heavy-side of packs too
often while also only creating one new pack at a time. Another wrinkle
is that we assume that loose, indexed, and reflog'd objects are
insignificant, and lump them into any new pack that we create. This can
lead to non-idempotent results.

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
Jeff King
6325da14af builtin/pack-objects.c: rewrite honor-pack-keep logic
Now that we have find_kept_pack_entry(), we don't have to manually keep
hunting through every pack to find a possible "kept" duplicate of the
object. This should be faster, assuming only a portion of your total
packs are actually kept.

Note that we have to re-order the logic a bit here; we can deal with the
disqualifying situations first (e.g., finding the object in a non-local
pack with --local), then "kept" situation(s), and then just fall back to
other "--local" conditions.

Here are the results from p5303 (measurements again taken on the
kernel):

  Test                                        HEAD^                   HEAD
  -----------------------------------------------------------------------------------------------
  5303.5: repack (1)                          57.26(54.59+10.84)      57.34(54.66+10.88) +0.1%
  5303.6: repack with kept (1)                57.33(54.80+10.51)      57.38(54.83+10.49) +0.1%
  5303.11: repack (50)                        71.54(88.57+4.84)       71.70(88.99+4.74) +0.2%
  5303.12: repack with kept (50)              85.12(102.05+4.94)      72.58(89.61+4.78) -14.7%
  5303.17: repack (1000)                      216.87(490.79+14.57)    217.19(491.72+14.25) +0.1%
  5303.18: repack with kept (1000)            665.63(938.87+15.76)    246.12(520.07+14.93) -63.0%

and the --stdin-packs timings:

  5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)         0.00(0.00+0.00) -100.0%
  5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)        3.43(11.75+0.24) -2.8%
  5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)     130.50(307.15+7.66) -33.4%

So our repack with an empty .keep pack is roughly as fast as one without
a .keep pack up to 50 packs. But the --stdin-packs case scales a little
better, too.

Notably, it is faster than a repack of the same size and a kept pack. It
looks at fewer objects, of course, but the penalty for looking at many
packs isn't as costly.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
Taylor Blau
339bce27f4 builtin/pack-objects.c: add '--stdin-packs' option
In an upcoming commit, 'git repack' will want to create a pack comprised
of all of the objects in some packs (the included packs) excluding any
objects in some other packs (the excluded packs).

This caller could iterate those packs themselves and feed the objects it
finds to 'git pack-objects' directly over stdin, but this approach has a
few downsides:

  - It requires every caller that wants to drive 'git pack-objects' in
    this way to implement pack iteration themselves. This forces the
    caller to think about details like what order objects are fed to
    pack-objects, which callers would likely rather not do.

  - If the set of objects in included packs is large, it requires
    sending a lot of data over a pipe, which is inefficient.

  - The caller is forced to keep track of the excluded objects, too, and
    make sure that it doesn't send any objects that appear in both
    included and excluded packs.

But the biggest downside is the lack of a reachability traversal.
Because the caller passes in a list of objects directly, those objects
don't get a namehash assigned to them, which can have a negative impact
on the delta selection process, causing 'git pack-objects' to fail to
find good deltas even when they exist.

The caller could formulate a reachability traversal themselves, but the
only way to drive 'git pack-objects' in this way is to do a full
traversal, and then remove objects in the excluded packs after the
traversal is complete. This can be detrimental to callers who care
about performance, especially in repositories with many objects.

Introduce 'git pack-objects --stdin-packs' which remedies these four
concerns.

'git pack-objects --stdin-packs' expects a list of pack names on stdin,
where 'pack-xyz.pack' denotes that pack as included, and
'^pack-xyz.pack' denotes it as excluded. The resulting pack includes all
objects that are present in at least one included pack, and aren't
present in any excluded pack.

To address the delta selection problem, 'git pack-objects --stdin-packs'
works as follows. First, it assembles a list of objects that it is going
to pack, as above. Then, a reachability traversal is started, whose tips
are any commits mentioned in included packs. Upon visiting an object, we
find its corresponding object_entry in the to_pack list, and set its
namehash parameter appropriately.

To avoid the traversal visiting more objects than it needs to, the
traversal is halted upon encountering an object which can be found in an
excluded pack (by marking the excluded packs as kept in-core, and
passing --no-kept-objects=in-core to the revision machinery).

This can cause the traversal to halt early, for example if an object in
an included pack is an ancestor of ones in excluded packs. But stopping
early is OK, since filling in the namehash fields of objects in the
to_pack list is only additive (i.e., having it helps the delta selection
process, but leaving it blank doesn't impact the correctness of the
resulting pack).

Even still, it is unlikely that this hurts us much in practice, since
the 'git repack --geometric' caller (which is introduced in a later
commit) marks small packs as included, and large ones as excluded.
During ordinary use, the small packs usually represent pushes after a
large repack, and so are unlikely to be ancestors of objects that
already exist in the repository.

(I found it convenient while developing this patch to have 'git
pack-objects' report the number of objects which were visited and got
their namehash fields filled in during traversal. This is also included
in the below patch via trace2 data lines).

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
Junio C Hamano
45df6c4d75 Merge branch 'ab/diff-deferred-free'
A small memleak in "diff -I<regexp>" has been corrected.

* ab/diff-deferred-free:
  diff: plug memory leak from regcomp() on {log,diff} -I
  diff: add an API for deferred freeing
2021-02-22 16:12:43 -08:00
Junio C Hamano
15af6e6fee Merge branch 'bc/signed-objects-with-both-hashes'
Signed commits and tags now allow verification of objects, whose
two object names (one in SHA-1, the other in SHA-256) are both
signed.

* bc/signed-objects-with-both-hashes:
  gpg-interface: remove other signature headers before verifying
  ref-filter: hoist signature parsing
  commit: allow parsing arbitrary buffers with headers
  gpg-interface: improve interface for parsing tags
  commit: ignore additional signatures when parsing signed commits
  ref-filter: switch some uses of unsigned long to size_t
2021-02-22 16:12:42 -08:00
Junio C Hamano
b9554c03a0 Merge branch 'dl/stash-cleanup'
Documentation, code and test clean-up around "git stash".

* dl/stash-cleanup:
  stash: declare ref_stash as an array
  t3905: use test_cmp() to check file contents
  t3905: replace test -s with test_file_not_empty
  t3905: remove nested git in command substitution
  t3905: move all commands into test cases
  t3905: remove spaces after redirect operators
  git-stash.txt: be explicit about subcommand options
2021-02-22 16:12:42 -08:00
Jonathan Tan
5476e1efde fetch-pack: print and use dangling .gitmodules
Teach index-pack to print dangling .gitmodules links after its "keep" or
"pack" line instead of declaring an error, and teach fetch-pack to check
such lines printed.

This allows the tree side of the .gitmodules link to be in one packfile
and the blob side to be in another without failing the fsck check,
because it is now fetch-pack which checks such objects after all
packfiles have been downloaded and indexed (and not index-pack on an
individual packfile, as it is before this commit).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 12:07:40 -08:00
Junio C Hamano
e68f62be8d Merge branch 'js/reflog-expire-stale-fix'
"git reflog expire --stale-fix" can be used to repair the reflog by
removing entries that refer to objects that have been pruned away,
but was not careful to tolerate missing objects.

* js/reflog-expire-stale-fix:
  reflog expire --stale-fix: be generous about missing objects
2021-02-17 17:21:43 -08:00
Junio C Hamano
d494433d26 Merge branch 'ds/maintenance-pack-refs'
"git maintenance" tool learned a new "pack-refs" maintenance task.

* ds/maintenance-pack-refs:
  maintenance: incremental strategy runs pack-refs weekly
  maintenance: add pack-refs task
2021-02-17 17:21:42 -08:00
Junio C Hamano
f712632a51 Merge branch 'mt/grep-cached-untracked'
"git grep --untracked" is meant to be "let's ALSO find in these
files on the filesystem" when looking for matches in the working
tree files, and does not make any sense if the primary search is
done against the index, or the tree objects.  The "--cached" and
"--untracked" options have been marked as mutually incompatible.

* mt/grep-cached-untracked:
  grep: error out if --untracked is used with --cached
2021-02-17 17:21:41 -08:00
Junio C Hamano
dadc91ff0c Merge branch 'js/range-diff-one-side-only'
The "git range-diff" command learned "--(left|right)-only" option
to show only one side of the compared range.

* js/range-diff-one-side-only:
  range-diff: offer --left-only/--right-only options
  range-diff: move the diffopt initialization down one layer
  range-diff: combine all options in a single data structure
  range-diff: simplify code spawning `git log`
  range-diff: libify the read_patches() function again
  range-diff: avoid leaking memory in two error code paths
2021-02-17 17:21:41 -08:00
Junio C Hamano
77348b0e6e Merge branch 'js/range-diff-wo-dotdot'
There are other ways than ".." for a single token to denote a
"commit range", namely "<rev>^!" and "<rev>^-<n>", but "git
range-diff" did not understand them.

* js/range-diff-wo-dotdot:
  range-diff(docs): explain how to specify commit ranges
  range-diff/format-patch: handle commit ranges other than A..B
  range-diff/format-patch: refactor check for commit range
2021-02-17 17:21:41 -08:00
Junio C Hamano
69571dfe21 Merge branch 'jt/clone-unborn-head'
"git clone" tries to locally check out the branch pointed at by
HEAD of the remote repository after it is done, but the protocol
did not convey the information necessary to do so when copying an
empty repository.  The protocol v2 learned how to do so.

* jt/clone-unborn-head:
  clone: respect remote unborn HEAD
  connect, transport: encapsulate arg in struct
  ls-refs: report unborn targets of symrefs
2021-02-17 17:21:40 -08:00
Junio C Hamano
0871fb9af5 Merge branch 'mr/bisect-in-c-4'
Piecemeal of rewrite of "git bisect" in C continues.

* mr/bisect-in-c-4:
  bisect--helper: retire `--check-and-set-terms` subcommand
  bisect--helper: reimplement `bisect_skip` shell function in C
  bisect--helper: retire `--bisect-auto-next` subcommand
  bisect--helper: use `res` instead of return in BISECT_RESET case option
  bisect--helper: retire `--bisect-write` subcommand
  bisect--helper: reimplement `bisect_replay` shell function in C
  bisect--helper: reimplement `bisect_log` shell function in C
2021-02-17 17:21:40 -08:00
Ævar Arnfjörð Bjarmason
cbe81e653f grep/pcre2: move back to thread-only PCREv2 structures
Change the setup of the "pcre2_general_context" to happen per-thread
in compile_pcre2_pattern() instead of in grep_init().

This change brings it in line with how the rest of the pcre2_* members
in the grep_pat structure are set up.

As noted in the preceding commit the approach 513f2b0bbd (grep: make
PCRE2 aware of custom allocator, 2019-10-16) took to allocate the
pcre2_general_context seems to have been initially based on a
misunderstanding of how PCREv2 memory allocation works.

The approach of creating a global context in grep_init() is just added
complexity for almost zero gain. On my system it's 24 bytes saved
per-thread. For comparison PCREv2 will then go on to allocate at least
a kilobyte for its own thread-local state.

As noted in 6d423dd542 (grep: don't redundantly compile throwaway
patterns under threading, 2017-05-25) the grep code is intentionally
not trying to micro-optimize allocations by e.g. sharing some PCREv2
structures globally, while making others thread-local.

So let's remove this special case and make all of them thread-local
again for simplicity. With this change we could move the
pcre2_{malloc,free} functions around to live closer to their current
use. I'm not doing that here to keep this change small, that cleanup
will be done in a follow-up commit.

See also the discussion in 94da9193a6 (grep: add support for PCRE v2,
2017-06-01) about thread safety, and Johannes's comments[1] to the
effect that we should be doing what this patch is doing.

1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.1908052120302.46@tvgsbejvaqbjf.bet/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 16:32:19 -08:00
Rafael Silva
8e16effe97 blame: remove unnecessary use of get_commit_info()
When `git blame --color-by-age`, the determine_line_heat() is called to
select how to color the output based on the commit's author date.  It
uses the get_commit_info() to parse the information into a `commit_info`
structure, however, this is actually unnecessary because the
determine_line_heat() caller also does the same.

Instead, let's change the determine_line_heat() to take a `commit_info`
structure and remove the internal call to get_commit_info() thus
cleaning up and optimizing the code path.

Enabling Git's trace2 API in order to record the execution time for
every call to determine_line_heat() function:

   + trace2_region_enter("blame", "determine_line_heat", the_repository);
     determine_line_heat(ent, &default_color);
   + trace2_region_enter("blame", "determine_line_heat", the_repository);

Then, running `git blame` for "kernel/fork.c" in linux.git and summing
all the execution time for every call (around 1.3k calls) resulted in
2.6x faster execution (best out 3):

   git built from 328c109303 (The eighth batch, 2021-02-12) = 42ms
   git built from 328c109303 + this change                  = 16ms

Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17 11:04:17 -08:00
Matheus Tavares
3f7ba60350 checkout-index: omit entries with no tempname from --temp output
With --temp (or --stage=all, which implies --temp), checkout-index
writes a list to stdout associating temporary file names to the entries'
names. But if it fails to write an entry, and the failure happens before
even assigning a temporary filename to that entry, we get an odd output
line. This can be seen when trying to check out a symlink whose blob is
missing:

$ missing_blob=$(git hash-object --stdin </dev/null)
$ git update-index --add --cacheinfo 120000,$missing_blob,foo
$ git checkout-index --temp foo
error: unable to read sha1 file of foo (e69de29bb2)
        foo

The 'TAB foo' line is not much useful and it might break scripts that
expect the 'tempname TAB foo' output. So let's omit such entries from
the stdout list (but leaving the error message on stderr).

We could also consider omitting _all_ failed entries from the output
list, but that's probably not a good idea as the associated tempfiles
may have been created even when checkout failed, so scripts may want to
use the output list for cleanup.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 11:27:18 -08:00
Jeff King
1679d60bfc exclude: add flags parameter to add_patterns()
There are a number of callers of add_patterns() and its sibling
functions. Let's give them a "flags" parameter for adding new options
without having to touch each caller. We'll use this in a future patch to
add O_NOFOLLOW support. But for now each caller just passes 0.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:41:33 -08:00
Junio C Hamano
1eb4136ac2 diff: --{rotate,skip}-to=<path>
In the implementation of "git difftool", there is a case where the
user wants to start viewing the diffs at a specific path and
continue on to the rest, optionally wrapping around to the
beginning.  Since it is somewhat cumbersome to implement such a
feature as a post-processing step of "git diff" output, let's
support it internally with two new options.

 - "git diff --rotate-to=C", when the resulting patch would show
   paths A B C D E without the option, would "rotate" the paths to
   shows patch to C D E A B instead.  It is an error when there is
   no patch for C is shown.

 - "git diff --skip-to=C" would instead "skip" the paths before C,
   and shows patch to C D E.  Again, it is an error when there is no
   patch for C is shown.

 - "git log [-p]" also accepts these two options, but it is not an
   error if there is no change to the specified path.  Instead, the
   set of output paths are rotated or skipped to the specified path
   or the first path that sorts after the specified path.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16 09:30:42 -08:00
Junio C Hamano
8b25dee615 Merge branch 'tb/precompose-prefix-too'
When commands are started from a subdirectory, they may have to
compare the path to the subdirectory (called prefix and found out
from $(pwd)) with the tracked paths.  On macOS, $(pwd) and
readdir() yield decomposed path, while the tracked paths are
usually normalized to the precomposed form, causing mismatch.  This
has been fixed by taking the same approach used to normalize the
command line arguments.

* tb/precompose-prefix-too:
  MacOS: precompose_argv_prefix()
2021-02-12 14:21:04 -08:00
Junio C Hamano
3c12d0b885 Merge branch 'tb/pack-revindex-on-disk'
Introduce an on-disk file to record revindex for packdata, which
traditionally was always created on the fly and only in-core.

* tb/pack-revindex-on-disk:
  t5325: check both on-disk and in-memory reverse index
  pack-revindex: ensure that on-disk reverse indexes are given precedence
  t: support GIT_TEST_WRITE_REV_INDEX
  t: prepare for GIT_TEST_WRITE_REV_INDEX
  Documentation/config/pack.txt: advertise 'pack.writeReverseIndex'
  builtin/pack-objects.c: respect 'pack.writeReverseIndex'
  builtin/index-pack.c: write reverse indexes
  builtin/index-pack.c: allow stripping arbitrary extensions
  pack-write.c: prepare to write 'pack-*.rev' files
  packfile: prepare for the existence of '*.rev' files
2021-02-12 14:21:04 -08:00
Denton Liu
3e885f0277 stash: declare ref_stash as an array
Save sizeof(const char *) bytes by declaring ref_stash as an array
instead of having a redundant pointer to an array.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 13:34:58 -08:00
Jeff King
16950f8384 rev-list: add --disk-usage option for calculating disk usage
It can sometimes be useful to see which refs are contributing to the
overall repository size (e.g., does some branch have a bunch of objects
not found elsewhere in history, which indicates that deleting it would
shrink the size of a clone).

You can find that out by generating a list of objects, getting their
sizes from cat-file, and then summing them, like:

    git rev-list --objects --no-object-names main..branch
    git cat-file --batch-check='%(objectsize:disk)' |
    perl -lne '$total += $_; END { print $total }'

Though note that the caveats from git-cat-file(1) apply here. We "blame"
base objects more than their deltas, even though the relationship could
easily be flipped. Still, it can be a useful rough measure.

But one problem is that it's slow to run. Teaching rev-list to sum up
the sizes can be much faster for two reasons:

  1. It skips all of the piping of object names and sizes.

  2. If bitmaps are in use, for objects that are in the
     bitmapped packfile we can skip the oid_object_info()
     lookup entirely, and just ask the revindex for the
     on-disk size.

This patch implements a --disk-usage option which produces the same
answer in a fraction of the time. Here are some timings using a clone of
torvalds/linux:

  [rev-list piped to cat-file, no bitmaps]
  $ time git rev-list --objects --no-object-names --all |
    git cat-file --buffer --batch-check='%(objectsize:disk)' |
    perl -lne '$total += $_; END { print $total }'
  1459938510
  real	0m29.635s
  user	0m38.003s
  sys	0m1.093s

  [internal, no bitmaps]
  $ time git rev-list --disk-usage --objects --all
  1459938510
  real	0m31.262s
  user	0m30.885s
  sys	0m0.376s

Even though the wall-clock time is slightly worse due to parallelism,
notice the CPU savings between the two. We saved 21% of the CPU just by
avoiding the pipes.

But the real win is with bitmaps. If we use them without the new option:

  [rev-list piped to cat-file, bitmaps]
  $ time git rev-list --objects --no-object-names --all --use-bitmap-index |
    git cat-file --batch-check='%(objectsize:disk)' |
    perl -lne '$total += $_; END { print $total }'
  1459938510
  real	0m6.244s
  user	0m8.452s
  sys	0m0.311s

then we're faster to generate the list of objects, but we still spend a
lot of time piping and looking things up. But if we do both together:

  [internal, bitmaps]
  $ time git rev-list --disk-usage --objects --all --use-bitmap-index
  1459938510
  real	0m0.219s
  user	0m0.169s
  sys	0m0.049s

then we get the same answer much faster.

For "--all", that answer will correspond closely to "du objects/pack",
of course. But we're actually checking reachability here, so we're still
fast when we ask for more interesting things:

  $ time git rev-list --disk-usage --use-bitmap-index v5.0..v5.10
  374798628
  real	0m0.429s
  user	0m0.356s
  sys	0m0.072s

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:57:55 -08:00
Johannes Schindelin
c809798b2a reflog expire --stale-fix: be generous about missing objects
Whenever a user runs `git reflog expire --stale-fix`, the most likely
reason is that their repository is at least _somewhat_ corrupt. Which
means that it is more than just possible that some objects are missing.

If that is the case, that can currently let the command abort through
the phase where it tries to mark all reachable objects.

Instead of adding insult to injury, let's be gentle and continue as best
as we can in such a scenario, simply by ignoring the missing objects and
moving on.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:21:52 -08:00
Ævar Arnfjörð Bjarmason
e900d494dc diff: add an API for deferred freeing
Add a diff_free() function to free anything we may have allocated in
the "diff_options" struct, and the ability to make calling it a noop
by setting "no_free" in "diff_options".

This is required because when e.g. "git diff" is run we'll allocate
things in that struct, use the diff machinery once, and then exit.

But if we run e.g. "git log -p" we're going to re-use what we
allocated across multiple diff_flush() calls, and only want to free
things at the end.

We've thus ended up with features like the recently added "diff -I"[1]
where we'll leak memory. As it turns out it could have simply used the
pattern established in 6ea57703f6 (log: prepare log/log-tree to reuse
the diffopt.close_file attribute, 2016-06-22).

Manually adding more such flags to things log_tree_commit() every time
we need to allocate something would be tedious. Let's instead move
that fclose() code it to a new diff_free(), in anticipation of freeing
more things in that function in follow-up commits.

Some functions such as log_tree_commit() need an idiom of optionally
retaining a previous "no_free", as they may either free the memory
themselves, or their caller may do so. I'm keeping that idiom in
log_show_early() for good measure, even though I don't think it's
currently called in this manner. It also gets passed an existing
"struct rev_info", so future callers may want to set the "no_free"
flag.

This change is a bit hard to read because while the freeing pattern
we're introducing isn't unusual, the "file" member is a special
snowflake. We usually don't want to fclose() it. This is because
"file" is usually stdout, in which case we don't want to fclose()
it. We only want to opt-in to closing it when we e.g. open a file on
the filesystem. Thus the opt-in "close_file" flag.

So the API in general just needs a "no_free" flag to defer freeing,
but the "file" member still needs its "close_file" flag. This is made
more confusing because while refactoring this code we could replace
some "close_file=0" with "no_free=1", whereas others need to set both
flags.

This is because there were some cases where an existing "close_file=0"
meant "let's defer deallocation", and others where it meant "we don't
want to close this file handle at all".

1. 296d4a94e7 (diff: add -I<regex> that ignores matching changes,
   2020-10-20)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-11 09:21:05 -08:00
brian m. carlson
482c119186 gpg-interface: improve interface for parsing tags
We have a function which parses a buffer with a signature at the end,
parse_signature, and this function is used for signed tags.  However,
we'll need to store values for multiple algorithms, and we'll do this by
using a header for the non-default algorithm.

Adjust the parse_signature interface to store the parsed data in two
strbufs and turn the existing function into parse_signed_buffer.  The
latter is still used in places where we know we always have a signed
buffer, such as push certs.

Adjust all the callers to deal with this new interface.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-10 23:35:42 -08:00
Junio C Hamano
2f794620f5 Merge branch 'ds/more-index-cleanups'
Cleaning various codepaths up.

* ds/more-index-cleanups:
  t1092: test interesting sparse-checkout scenarios
  test-lib: test_region looks for trace2 regions
  sparse-checkout: load sparse-checkout patterns
  name-hash: use trace2 regions for init
  repository: add repo reference to index_state
  fsmonitor: de-duplicate BUG()s around dirty bits
  cache-tree: extract subtree_pos()
  cache-tree: simplify verify_cache() prototype
  cache-tree: clean up cache_tree_update()
2021-02-10 14:48:33 -08:00
Junio C Hamano
02fb21617e Merge branch 'rs/worktree-list-verbose'
`git worktree list` now annotates worktrees as prunable, shows
locked and prunable attributes in --porcelain mode, and gained
a --verbose option.

* rs/worktree-list-verbose:
  worktree: teach `list` verbose mode
  worktree: teach `list` to annotate prunable worktree
  worktree: teach `list --porcelain` to annotate locked worktree
  t2402: ensure locked worktree is properly cleaned up
  worktree: teach worktree_lock_reason() to gently handle main worktree
  worktree: teach worktree to lazy-load "prunable" reason
  worktree: libify should_prune_worktree()
2021-02-10 14:48:32 -08:00
Junio C Hamano
c9f94ab4fa Merge branch 'ab/lose-grep-debug'
Lose the debugging aid that may have been useful in the past, but
no longer is, in the "grep" codepaths.

* ab/lose-grep-debug:
  grep/log: remove hidden --debug and --grep-debug options
2021-02-10 14:48:31 -08:00
Junio C Hamano
9d5b1c06ac Merge branch 'jk/use-oid-pos'
Code clean-up to ensure our use of hashtables using object names as
keys use the "struct object_id" objects, not the raw hash values.

* jk/use-oid-pos:
  oid_pos(): access table through const pointers
  hash_pos(): convert to oid_pos()
  rerere: use strmap to store rerere directories
  rerere: tighten rr-cache dirname check
  rerere: check dirname format while iterating rr_cache directory
  commit_graft_pos(): take an oid instead of a bare hash
2021-02-10 14:48:31 -08:00
Matheus Tavares
42d906bec4 grep: honor sparse-checkout on working tree searches
On a sparse checked out repository, `git grep` (without --cached) ends
up searching the cache when an entry matches the search pathspec and has
the SKIP_WORKTREE bit set. This is confusing both because the sparse
paths are not expected to be in a working tree search (as they are not
checked out), and because the output mixes working tree and cache
results without distinguishing them. (Note that grep also resorts to the
cache on working tree searches that include --assume-unchanged paths.
But the whole point in that case is to assume that the contents of the
index entry and the file are the same. This does not apply to the case
of sparse paths, where the file isn't even expected to be present.)

Fix that by teaching grep to honor the sparse-checkout rules for working
tree searches. If the user wants to grep paths outside the current
sparse-checkout definition, they may either update the sparsity rules to
materialize the files, or use --cached to search all blobs registered in
the index.

Note: it might also be interesting to add a configuration option that
allow users to search paths that are present despite having the
SKIP_WORKTREE bit set, and/or to restrict searches in the index and past
revisions too. These ideas are left as future improvements to avoid
conflicting with other sparse-checkout topics currently in flight.

Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 23:10:51 -08:00
Derrick Stolee
acc1c4d5d4 maintenance: incremental strategy runs pack-refs weekly
When the 'maintenance.strategy' config option is set to 'incremental',
a default maintenance schedule is enabled. Add the 'pack-refs' task to
that strategy at the weekly cadence.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 23:09:29 -08:00
Derrick Stolee
41abfe15d9 maintenance: add pack-refs task
It is valuable to collect loose refs into a more compressed form. This
is typically the packed-refs file, although this could be the reftable
in the future. Having packed refs can be extremely valuable in repos
with many tags or remote branches that are not modified by the local
user, but still are necessary for other queries.

For instance, with many exploded refs, commands such as

	git describe --tags --exact-match HEAD

can be very slow (multiple seconds). This command in particular is used
by terminal prompts to show when a detatched HEAD is pointing to an
existing tag, so having it be slow causes significant delays for users.

Add a new 'pack-refs' maintenance task. It runs 'git pack-refs --all
--prune' to move loose refs into a packed form. For now, that is the
packed-refs file, but could adjust to other file formats in the future.

This is the first of several sub-tasks of the 'gc' task that could be
extracted to their own tasks. In this process, we should not change the
behavior of the 'gc' task since that remains the default way to keep
repositories maintained. Creating a new task for one of these sub-tasks
only provides more customization options for those choosing to not use
the 'gc' task. It is certainly possible to have both the 'gc' and
'pack-refs' tasks enabled and run regularly. While they may repeat
effort, they do not conflict in a destructive way.

The 'auto_condition' function pointer is left NULL for now. We could
extend this in the future to have a condition check if pack-refs should
be run during 'git maintenance run --auto'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 23:09:24 -08:00
Matheus Tavares
0c5d83b248 grep: error out if --untracked is used with --cached
The options --untracked and --cached are not compatible, but if they are
used together, grep just silently ignores --cached and searches the
working tree. Error out, instead, to avoid any potential confusion.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-09 12:39:06 -08:00
Junio C Hamano
f2d156dc48 Merge branch 'ab/branch-sort' into maint
The implementation of "git branch --sort" wrt the detached HEAD
display has always been hacky, which has been cleaned up.

* ab/branch-sort:
  branch: show "HEAD detached" first under reverse sort
  branch: sort detached HEAD based on a flag
  ref-filter: move ref_sorting flags to a bitfield
  ref-filter: move "cmp_fn" assignment into "else if" arm
  ref-filter: add braces to if/else if/else chain
  branch tests: add to --sort tests
  branch: change "--local" to "--list" in comment
2021-02-08 14:05:55 -08:00
Junio C Hamano
171675a6c5 Merge branch 'ma/more-opaque-lock-file' into maint
Code clean-up.

* ma/more-opaque-lock-file:
  read-cache: try not to peek into `struct {lock_,temp}file`
  refs/files-backend: don't peek into `struct lock_file`
  midx: don't peek into `struct lock_file`
  commit-graph: don't peek into `struct lock_file`
  builtin/gc: don't peek into `struct lock_file`
2021-02-08 14:05:55 -08:00
Junio C Hamano
3e52ab222a Merge branch 'zh/arg-help-format' into maint
Clean up option descriptions in "git cmd --help".

* zh/arg-help-format:
  builtin/*: update usage format
  parse-options: format argh like error messages
2021-02-08 14:05:54 -08:00
Johannes Schindelin
1e79f97326 range-diff: offer --left-only/--right-only options
When comparing commit ranges, one is frequently interested only in one
side, such as asking the question "Has this patch that I submitted to
the Git mailing list been applied?": one would only care about the part
of the output that corresponds to the commits in a local branch.

To make that possible, imitate the `git rev-list` options `--left-only`
and `--right-only`.

This addresses https://github.com/gitgitgadget/git/issues/206

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:14:31 -08:00
Johannes Schindelin
f1ce6c191e range-diff: combine all options in a single data structure
This will make it easier to implement the `--left-only` and
`--right-only` options.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 21:14:31 -08:00
Junio C Hamano
77db59c2f9 Merge branch 'jv/pack-objects-narrower-ref-iteration'
The "pack-objects" command needs to iterate over all the tags when
automatic tag following is enabled, but it actually iterated over
all refs and then discarded everything outside "refs/tags/"
hierarchy, which was quite wasteful.

* jv/pack-objects-narrower-ref-iteration:
  builtin/pack-objects.c: avoid iterating all refs
2021-02-05 16:40:45 -08:00
Junio C Hamano
f6ef8baba2 Merge branch 'ph/use-delete-refs'
When removing many branches and tags, the code used to do so one
ref at a time.  There is another API it can use to delete multiple
refs, and it makes quite a lot of performance difference when the
refs are packed.

* ph/use-delete-refs:
  use delete_refs when deleting tags or branches
2021-02-05 16:40:45 -08:00
Junio C Hamano
5198426d91 Merge branch 'zh/ls-files-deduplicate'
"git ls-files" can and does show multiple entries when the index is
unmerged, which is a source for confusion unless -s/-u option is in
use.  A new option --deduplicate has been introduced.

* zh/ls-files-deduplicate:
  ls-files.c: add --deduplicate option
  ls_files.c: consolidate two for loops into one
  ls_files.c: bugfix for --deleted and --modified
2021-02-05 16:40:44 -08:00
Junio C Hamano
aac006aa99 Merge branch 'so/log-diff-merge'
"git log" learned a new "--diff-merges=<how>" option.

* so/log-diff-merge: (32 commits)
  t4013: add tests for --diff-merges=first-parent
  doc/git-show: include --diff-merges description
  doc/rev-list-options: document --first-parent changes merges format
  doc/diff-generate-patch: mention new --diff-merges option
  doc/git-log: describe new --diff-merges options
  diff-merges: add '--diff-merges=1' as synonym for 'first-parent'
  diff-merges: add old mnemonic counterparts to --diff-merges
  diff-merges: let new options enable diff without -p
  diff-merges: do not imply -p for new options
  diff-merges: implement new values for --diff-merges
  diff-merges: make -m/-c/--cc explicitly mutually exclusive
  diff-merges: refactor opt settings into separate functions
  diff-merges: get rid of now empty diff_merges_init_revs()
  diff-merges: group diff-merge flags next to each other inside 'rev_info'
  diff-merges: split 'ignore_merges' field
  diff-merges: fix -m to properly override -c/--cc
  t4013: add tests for -m failing to override -c/--cc
  t4013: support test_expect_failure through ':failure' magic
  diff-merges: revise revs->diff flag handling
  diff-merges: handle imply -p on -c/--cc logic for log.c
  ...
2021-02-05 16:40:44 -08:00
Junio C Hamano
897d28bcc2 Merge branch 'ds/for-each-repo-noopfix' into maint
"git for-each-repo --config=<var> <cmd>" should not run <cmd> for
any repository when the configuration variable <var> is not defined
even once.

* ds/for-each-repo-noopfix:
  for-each-repo: do nothing on empty config
2021-02-05 16:31:23 -08:00
Junio C Hamano
a4031f6dc0 Merge branch 'en/stash-apply-sparse-checkout' into maint
"git stash" did not work well in a sparsely checked out working
tree.

* en/stash-apply-sparse-checkout:
  stash: fix stash application in sparse-checkouts
  stash: remove unnecessary process forking
  t7012: add a testcase demonstrating stash apply bugs in sparse checkouts
2021-02-05 16:31:22 -08:00
Junio C Hamano
a08832f16e Merge branch 'rs/rebase-commit-validation' into maint
Diagnose command line error of "git rebase" early.

* rs/rebase-commit-validation:
  rebase: verify commit parameter
2021-02-05 16:31:22 -08:00
Jonathan Tan
4f37d45706 clone: respect remote unborn HEAD
Teach Git to use the "unborn" feature introduced in a previous patch as
follows: Git will always send the "unborn" argument if it is supported
by the server. During "git clone", if cloning an empty repository, Git
will use the new information to determine the local branch to create. In
all other cases, Git will ignore it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 13:49:55 -08:00
Jonathan Tan
39835409d1 connect, transport: encapsulate arg in struct
In a future patch we plan to return the name of an unborn current branch
from deep in the callchain to a caller via a new pointer parameter that
points at a variable in the caller when the caller calls
get_remote_refs() and transport_get_remote_refs().

In preparation for that, encapsulate the existing ref_prefixes
parameter into a struct. The aforementioned unborn current branch will
go into this new struct in the future patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05 13:49:54 -08:00
Junio C Hamano
973e20b83f Merge branch 'jk/peel-iterated-oid'
The peel_ref() API has been replaced with peel_iterated_oid().

* jk/peel-iterated-oid:
  refs: switch peel_ref() to peel_iterated_oid()
2021-02-03 15:04:49 -08:00
Junio C Hamano
15bf48b987 Merge branch 'ds/maintenance-prefetch-cleanup'
Test clean-up plus UI improvement by hiding extra refs that
the prefetch task uses from "log --decorate" output.

* ds/maintenance-prefetch-cleanup:
  t7900: clean up some broken refs
  maintenance: set log.excludeDecoration durin prefetch
2021-02-03 15:04:48 -08:00
Pranit Bauva
97b8294474 bisect--helper: retire --check-and-set-terms subcommand
The `--check-and-set-terms` subcommand is no longer from the
git-bisect.sh shell script. Instead the function
`check_and_set_terms()` is called from the C implementation.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:09 -08:00
Pranit Bauva
e4c7b33747 bisect--helper: reimplement bisect_skip shell function in C
Reimplement the `bisect_skip()` shell function in C and also add
`bisect-skip` subcommand to `git bisect--helper` to call it from
git-bisect.sh

Using `--bisect-skip` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:09 -08:00
Pranit Bauva
9feea34810 bisect--helper: retire --bisect-auto-next subcommand
The --bisect-auto-next subcommand is no longer used from the
git-bisect.sh shell script. Instead the function bisect_auto_next()
is directly called from the C implementation.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:09 -08:00
Pranit Bauva
b7a6f163d6 bisect--helper: use res instead of return in BISECT_RESET case option
Use `res` variable to store `bisect_reset()` output in BISECT_RESET
case option to make bisect--helper.c more consistent.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:09 -08:00
Pranit Bauva
68efed8c8a bisect--helper: retire --bisect-write subcommand
The `--bisect-write` subcommand is no longer used from the
git-bisect.sh shell script. Instead the function `bisect_write()`
is directly called from the C implementation.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:08 -08:00
Pranit Bauva
2b1fd947f6 bisect--helper: reimplement bisect_replay shell function in C
Reimplement the `bisect_replay` shell function in C and also add
`--bisect-replay` subcommand to `git bisect--helper` to call it from
git-bisect.sh

Using `--bisect-replay` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:08 -08:00
Pranit Bauva
97d5ba6a39 bisect--helper: reimplement bisect_log shell function in C
Reimplement the `bisect_log()` shell function in C and also add
`--bisect-log` subcommand to `git bisect--helper` to call it from
git-bisect.sh .

Using `--bisect-log` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:52:08 -08:00
Torsten Bögershausen
5c327502db MacOS: precompose_argv_prefix()
The following sequence leads to a "BUG" assertion running under MacOS:

  DIR=git-test-restore-p
  Adiarnfd=$(printf 'A\314\210')
  DIRNAME=xx${Adiarnfd}yy
  mkdir $DIR &&
  cd $DIR &&
  git init &&
  mkdir $DIRNAME &&
  cd $DIRNAME &&
  echo "Initial" >file &&
  git add file &&
  echo "One more line" >>file &&
  echo y | git restore -p .

 Initialized empty Git repository in /tmp/git-test-restore-p/.git/
 BUG: pathspec.c:495: error initializing pathspec_item
 Cannot close git diff-index --cached --numstat
 [snip]

The command `git restore` is run from a directory inside a Git repo.
Git needs to split the $CWD into 2 parts:
The path to the repo and "the rest", if any.
"The rest" becomes a "prefix" later used inside the pathspec code.

As an example, "/path/to/repo/dir-inside-repå" would determine
"/path/to/repo" as the root of the repo, the place where the
configuration file .git/config is found.

The rest becomes the prefix ("dir-inside-repå"), from where the
pathspec machinery expands the ".", more about this later.
If there is a decomposed form, (making the decomposing visible like this),
"dir-inside-rep°a" doesn't match "dir-inside-repå".

Git commands need to:

 (a) read the configuration variable "core.precomposeunicode"
 (b) precocompose argv[]
 (c) precompose the prefix, if there was any

The first commit,
76759c7dff "git on Mac OS and precomposed unicode"
addressed (a) and (b).

The call to precompose_argv() was added into parse-options.c,
because that seemed to be a good place when the patch was written.

Commands that don't use parse-options need to do (a) and (b) themselfs.

The commands `diff-files`, `diff-index`, `diff-tree` and `diff`
learned (a) and (b) in
commit 90a78b83e0 "diff: run arguments through precompose_argv"

Branch names (or refs in general) using decomposed code points
resulting in decomposed file names had been fixed in
commit 8e712ef6fc "Honor core.precomposeUnicode in more places"

The bug report from above shows 2 things:
- more commands need to handle precomposed unicode
- (c) should be implemented for all commands using pathspecs

Solution:
precompose_argv() now handles the prefix (if needed), and is renamed into
precompose_argv_prefix().

Inside this function the config variable core.precomposeunicode is read
into the global variable precomposed_unicode, as before.
This reading is skipped if precomposed_unicode had been read before.

The original patch for preocomposed unicode, 76759c7dff, placed
precompose_argv() into parse-options.c

Now add it into git.c::run_builtin() as well.  Existing precompose
calls in diff-files.c and others may become redundant, and if we
audit the callflows that reach these places to make sure that they
can never be reached without going through the new call added to
run_builtin(), we might be able to remove these existing ones.

But in this commit, we do not bother to do so and leave these
precompose callsites as they are.  Because precompose() is
idempotent and can be called on an already precomposed string
safely, this is safer than removing existing calls without fully
vetting the callflows.

There is certainly room for cleanups - this change intends to be a bug fix.
Cleanups needs more tests in e.g. t/t3910-mac-os-precompose.sh, and should
be done in future commits.

[1] git-bugreport-2021-01-06-1209.txt (git can't deal with special characters)
[2] https://lore.kernel.org/git/A102844A-9501-4A86-854D-E3B387D378AA@icloud.com/

Reported-by: Daniel Troger <random_n0body@icloud.com>
Helped-By: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-03 14:09:37 -08:00
Rafael Silva
076b444a62 worktree: teach list verbose mode
"git worktree list" annotates each worktree according to its state such
as "prunable" or "locked", however it is not immediately obvious why
these worktrees are being annotated. For prunable worktrees a reason
is available that is returned by should_prune_worktree() and for locked
worktrees a reason might be available provided by the user via `lock`
command.

Let's teach "git worktree list" a --verbose mode that outputs the reason
why the worktrees are being annotated. The reason is a text that can take
virtually any size and appending the text on the default columned format
will make it difficult to extend the command with other annotations and
not fit nicely on the screen. In order to address this shortcoming the
annotation is then moved to the next line indented followed by the reason
If the reason is not available the annotation stays on the same line as
the worktree itself.

The output of "git worktree list" with verbose becomes like so:

    $ git worktree list --verbose
    ...
    /path/to/locked-no-reason    acb124 [branch-a] locked
    /path/to/locked-with-reason  acc125 [branch-b]
        locked: worktree with a locked reason
    /path/to/prunable-reason     ace127 [branch-d]
        prunable: gitdir file points to non-existent location
    ...

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:40 -08:00
Rafael Silva
9b19a58f66 worktree: teach list to annotate prunable worktree
The "git worktree list" command shows the absolute path to the worktree,
the commit that is checked out, the name of the branch, and a "locked"
annotation if the worktree is locked, however, it does not indicate
whether the worktree is prunable.

The "prune" command will remove a worktree if it is prunable unless
`--dry-run` option is specified. This could lead to a worktree being
removed without the user realizing before it is too late, in case the
user forgets to pass --dry-run for instance. If the "list" command shows
which worktree is prunable, the user could verify before running
"git worktree prune" and hopefully prevents the working tree to be
removed "accidentally" on the worse case scenario.

Let's teach "git worktree list" to show when a worktree is a prunable
candidate for both default and porcelain format.

In the default format a "prunable" text is appended:

    $ git worktree list
    /path/to/main      aba123 [main]
    /path/to/linked    123abc [branch-a]
    /path/to/prunable  ace127 (detached HEAD) prunable

In the --porcelain format a prunable label is added followed by
its reason:

    $ git worktree list --porcelain
    ...
    worktree /path/to/prunable
    HEAD abc1234abc1234abc1234abc1234abc1234abc12
    detached
    prunable gitdir file points to non-existent location
    ...

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:35 -08:00
Rafael Silva
862c723d18 worktree: teach list --porcelain to annotate locked worktree
Commit c57b3367be (worktree: teach `list` to annotate locked worktree,
2020-10-11) taught "git worktree list" to annotate locked worktrees by
appending "locked" text to its output, however, this is not listed in
the --porcelain format.

Teach "list --porcelain" to do the same and add a "locked" attribute
followed by its reason, thus making both default and porcelain format
consistent. If the locked reason is not available then only "locked"
is shown.

The output of the "git worktree list --porcelain" becomes like so:

    $ git worktree list --porcelain
    ...
    worktree /path/to/locked
    HEAD 123abcdea123abcd123acbd123acbda123abcd12
    detached
    locked

    worktree /path/to/locked-with-reason
    HEAD abc123abc123abc123abc123abc123abc123abc1
    detached
    locked reason why it is locked
    ...

In porcelain mode, if the lock reason contains special characters
such as newlines, they are escaped with backslashes and the entire
reason is enclosed in double quotes. For example:

   $ git worktree list --porcelain
   ...
   locked "worktree's path mounted in\nremovable device"
   ...

Furthermore, let's update the documentation to state that some
attributes in the porcelain format might be listed alone or together
with its value depending whether the value is available or not. Thus
documenting the case of the new "locked" attribute.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:29 -08:00
Rafael Silva
eb36135af7 worktree: teach worktree_lock_reason() to gently handle main worktree
worktree_lock_reason() aborts with an assertion failure when called on
the main worktree since locking the main worktree is nonsensical. Not
only is this behavior undocumented, thus callers might not even be aware
that the call could potentially crash the program, but it also forces
clients to be extra careful:

    if (!is_main_worktree(wt) && worktree_locked_reason(...))
        ...

Since we know that locking makes no sense in the context of the main
worktree, we can simply return false for the main worktree, thus making
client code less complex by eliminating the need for the callers to have
inside knowledge about the implementation:

    if (worktree_lock_reason(...))
        ...

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:20 -08:00
Rafael Silva
a29a8b7574 worktree: libify should_prune_worktree()
As part of teaching "git worktree list" to annotate worktree that is a
candidate for pruning, let's move should_prune_worktree() from
builtin/worktree.c to worktree.c in order to make part of the worktree
public API.

should_prune_worktree() knows how to select the given worktree for
pruning based on an expiration date, however the expiration value is
stored in a static file-scope variable and it is not local to the
function. In order to move the function, teach should_prune_worktree()
to take the expiration date as an argument and document the new
parameter that is not immediately obvious.

Also, change the function comment to clearly state that the worktree's
path is returned in `wtpath` argument.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-30 09:57:08 -08:00
Jeff King
8380dcd700 oid_pos(): access table through const pointers
When we are looking up an oid in an array, we obviously don't need to
write to the array. Let's mark it as const in the function interfaces,
as well as in the local variables we use to derference the void pointer
(note a few cases use pointers-to-pointers, so we mark everything
const).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 12:03:26 -08:00
Jeff King
45ee13b942 hash_pos(): convert to oid_pos()
All of our callers are actually looking up an object_id, not a bare
hash. Likewise, the arrays they are looking in are actual arrays of
object_id (not just raw bytes of hashes, as we might find in a pack
.idx; those are handled by bsearch_hash()).

Using an object_id gives us more type safety, and makes the callers
slightly shorter. It also gets rid of the word "sha1" from several
access functions, though we could obviously also rename those with
s/sha1/hash/.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28 12:02:39 -08:00
Johannes Schindelin
679b5916cd range-diff/format-patch: refactor check for commit range
Currently, when called with exactly two arguments, `git range-diff`
tests for a literal `..` in each of the two. Likewise, the argument
provided via `--range-diff` to `git format-patch` is checked in the same
manner.

However, `<commit>^!` is a perfectly valid commit range, equivalent to
`<commit>^..<commit>` according to the `SPECIFYING RANGES` section of
gitrevisions[7].

In preparation for allowing more sophisticated ways to specify commit
ranges, let's refactor the check into its own function.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-27 22:01:49 -08:00
Ævar Arnfjörð Bjarmason
15c9649730 grep/log: remove hidden --debug and --grep-debug options
Remove the hidden "grep --debug" and "log --grep-debug" options added
in 17bf35a3c7 (grep: teach --debug option to dump the parse tree,
2012-09-13).

At the time these options seem to have been intended to go along with
a documentation discussion and to help the author of relevant tests to
perform ad-hoc debugging on them[1].

Reasons to want this gone:

 1. They were never documented, and the only (rather trivial) use of
    them in our own codebase for testing is something I removed back
    in e01b4dab01 (grep: change non-ASCII -i test to stop using
    --debug, 2017-05-20).

 2. Googling around doesn't show any in-the-wild uses I could dig up,
    and on the Git ML the only mentions after the original discussion
    seem to have been when they came up in unrelated diff contexts, or
    that test commit of mine.

 3. An exception to that is c581e4a749 (grep: under --debug, show
    whether PCRE JIT is enabled, 2019-08-18) where we added the
    ability to dump out when PCREv2 has the JIT in effect.

    The combination of that and my earlier b65abcafc7 (grep: use PCRE
    v2 for optimized fixed-string search, 2019-07-01) means Git prints
    this out in its most common in-the-wild configuration:

        $ git log  --grep-debug --grep=foo --grep=bar --grep=baz --all-match
        pcre2_jit_on=1
        pcre2_jit_on=1
        pcre2_jit_on=1
        [all-match]
        (or
         pattern_body<body>foo
         (or
          pattern_body<body>bar
          pattern_body<body>baz
         )
        )

        $ git grep --debug \( -e foo --and -e bar \) --or -e baz
        pcre2_jit_on=1
        pcre2_jit_on=1
        pcre2_jit_on=1
        (or
         (and
          patternfoo
          patternbar
         )
         patternbaz
        )

I.e. for each pattern we're considering for the and/or/--all-match
etc. debugging we'll now diligently spew out another identical line
saying whether the PCREv2 JIT is on or not.

I think that nobody's complained about that rather glaringly obviously
bad output says something about how much this is used, i.e. it's
not.

The need for this debugging aid for the composed grep/log patterns
seems to have passed, and the desire to dump the JIT config seems to
have been another one-off around the time we had JIT-related issues on
the PCREv2 codepath. That the original author of this debugging
facility seemingly hasn't noticed the bad output since then[2] is
probably some indicator.

1. https://lore.kernel.org/git/cover.1347615361.git.git@drmicha.warpmail.net/
2. https://lore.kernel.org/git/xmqqk1b8x0ac.fsf@gitster-ct.c.googlers.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-26 11:36:20 -08:00
Taylor Blau
e8c58f894b t: support GIT_TEST_WRITE_REV_INDEX
Add a new option that unconditionally enables the pack.writeReverseIndex
setting in order to run the whole test suite in a mode that generates
on-disk reverse indexes. Additionally, enable this mode in the second
run of tests under linux-gcc in 'ci/run-build-and-tests.sh'.

Once on-disk reverse indexes are proven out over several releases, we
can change the default value of that configuration to 'true', and drop
this patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:44 -08:00
Taylor Blau
c97733435a builtin/pack-objects.c: respect 'pack.writeReverseIndex'
Now that we have an implementation that can write the new reverse index
format, enable writing a .rev file in 'git pack-objects' by consulting
the pack.writeReverseIndex configuration variable.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
Taylor Blau
e37d0b8730 builtin/index-pack.c: write reverse indexes
Teach 'git index-pack' to optionally write and verify reverse index with
'--[no-]rev-index', as well as respecting the 'pack.writeReverseIndex'
configuration option.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
Taylor Blau
84d544943c builtin/index-pack.c: allow stripping arbitrary extensions
To derive the filename for a .idx file, 'git index-pack' uses
derive_filename() to strip the '.pack' suffix and add the new suffix.

Prepare for stripping off suffixes other than '.pack' by making the
suffix to strip a parameter of derive_filename(). In order to make this
consistent with the "suffix" parameter which does not begin with a ".",
an additional check in derive_filename.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
Taylor Blau
2f4ba2a867 packfile: prepare for the existence of '*.rev' files
Specify the format of the on-disk reverse index 'pack-*.rev' file, as
well as prepare the code for the existence of such files.

The reverse index maps from pack relative positions (i.e., an index into
the array of object which is sorted by their offsets within the
packfile) to their position within the 'pack-*.idx' file. Today, this is
done by building up a list of (off_t, uint32_t) tuples for each object
(the off_t corresponding to that object's offset, and the uint32_t
corresponding to its position in the index). To convert between pack and
index position quickly, this array of tuples is radix sorted based on
its offset.

This has two major drawbacks:

First, the in-memory cost scales linearly with the number of objects in
a pack.  Each 'struct revindex_entry' is sizeof(off_t) +
sizeof(uint32_t) + padding bytes for a total of 16.

To observe this, force Git to load the reverse index by, for e.g.,
running 'git cat-file --batch-check="%(objectsize:disk)"'. When asking
for a single object in a fresh clone of the kernel, Git needs to
allocate 120+ MB of memory in order to hold the reverse index in memory.

Second, the cost to sort also scales with the size of the pack.
Luckily, this is a linear function since 'load_pack_revindex()' uses a
radix sort, but this cost still must be paid once per pack per process.

As an example, it takes ~60x longer to print the _size_ of an object as
it does to print that entire object's _contents_:

  Benchmark #1: git.compile cat-file --batch <obj
    Time (mean ± σ):       3.4 ms ±   0.1 ms    [User: 3.3 ms, System: 2.1 ms]
    Range (min … max):     3.2 ms …   3.7 ms    726 runs

  Benchmark #2: git.compile cat-file --batch-check="%(objectsize:disk)" <obj
    Time (mean ± σ):     210.3 ms ±   8.9 ms    [User: 188.2 ms, System: 23.2 ms]
    Range (min … max):   193.7 ms … 224.4 ms    13 runs

Instead, avoid computing and sorting the revindex once per process by
writing it to a file when the pack itself is generated.

The format is relatively straightforward. It contains an array of
uint32_t's, the length of which is equal to the number of objects in the
pack.  The ith entry in this table contains the index position of the
ith object in the pack, where "ith object in the pack" is determined by
pack offset.

One thing that the on-disk format does _not_ contain is the full (up to)
eight-byte offset corresponding to each object. This is something that
the in-memory revindex contains (it stores an off_t in 'struct
revindex_entry' along with the same uint32_t that the on-disk format
has). Omit it in the on-disk format, since knowing the index position
for some object is sufficient to get a constant-time lookup in the
pack-*.idx file to ask for an object's offset within the pack.

This trades off between the on-disk size of the 'pack-*.rev' file for
runtime to chase down the offset for some object. Even though the lookup
is constant time, the constant is heavier, since it can potentially
involve two pointer walks in v2 indexes (one to access the 4-byte offset
table, and potentially a second to access the double wide offset table).

Consider trying to map an object's pack offset to a relative position
within that pack. In a cold-cache scenario, more page faults occur while
switching between binary searching through the reverse index and
searching through the *.idx file for an object's offset. Sure enough,
with a cold cache (writing '3' into '/proc/sys/vm/drop_caches' after
'sync'ing), printing out the entire object's contents is still
marginally faster than printing its size:

  Benchmark #1: git.compile cat-file --batch-check="%(objectsize:disk)" <obj >/dev/null
    Time (mean ± σ):      22.6 ms ±   0.5 ms    [User: 2.4 ms, System: 7.9 ms]
    Range (min … max):    21.4 ms …  23.5 ms    41 runs

  Benchmark #2: git.compile cat-file --batch <obj >/dev/null
    Time (mean ± σ):      17.2 ms ±   0.7 ms    [User: 2.8 ms, System: 5.5 ms]
    Range (min … max):    15.6 ms …  18.2 ms    45 runs

(Numbers taken in the kernel after cheating and using the next patch to
generate a reverse index). There are a couple of approaches to improve
cold cache performance not pursued here:

  - We could include the object offsets in the reverse index format.
    Predictably, this does result in fewer page faults, but it triples
    the size of the file, while simultaneously duplicating a ton of data
    already available in the .idx file. (This was the original way I
    implemented the format, and it did show
    `--batch-check='%(objectsize:disk)'` winning out against `--batch`.)

    On the other hand, this increase in size also results in a large
    block-cache footprint, which could potentially hurt other workloads.

  - We could store the mapping from pack to index position in more
    cache-friendly way, like constructing a binary search tree from the
    table and writing the values in breadth-first order. This would
    result in much better locality, but the price you pay is trading
    O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you
    can no longer directly index the table).

So, neither of these approaches are taken here. (Thankfully, the format
is versioned, so we are free to pursue these in the future.) But, cold
cache performance likely isn't interesting outside of one-off cases like
asking for the size of an object directly. In real-world usage, Git is
often performing many operations in the revindex (i.e., asking about
many objects rather than a single one).

The trade-off is worth it, since we will avoid the vast majority of the
cost of generating the revindex that the extra pointer chase will look
like noise in the following patch's benchmarks.

This patch describes the format and prepares callers (like in
pack-revindex.c) to be able to read *.rev files once they exist. An
implementation of the writer will appear in the next patch, and callers
will gradually begin to start using the writer in the patches that
follow after that.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-25 18:32:43 -08:00
Junio C Hamano
bcaaf972e6 Merge branch 'tb/pack-revindex-api'
Abstract accesses to in-core revindex that allows enumerating
objects stored in a packfile in the order they appear in the pack,
in preparation for introducing an on-disk precomputed revindex.

* tb/pack-revindex-api: (21 commits)
  for_each_object_in_pack(): clarify pack vs index ordering
  pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()'
  pack-revindex: hide the definition of 'revindex_entry'
  pack-revindex: remove unused 'find_revindex_position()'
  pack-revindex: remove unused 'find_pack_revindex()'
  builtin/gc.c: guess the size of the revindex
  for_each_object_in_pack(): convert to new revindex API
  unpack_entry(): convert to new revindex API
  packed_object_info(): convert to new revindex API
  retry_bad_packed_offset(): convert to new revindex API
  get_delta_base_oid(): convert to new revindex API
  rebuild_existing_bitmaps(): convert to new revindex API
  try_partial_reuse(): convert to new revindex API
  get_size_by_pos(): convert to new revindex API
  show_objects_for_type(): convert to new revindex API
  bitmap_position_packfile(): convert to new revindex API
  check_object(): convert to new revindex API
  write_reused_pack_verbatim(): convert to new revindex API
  write_reused_pack_one(): convert to new revindex API
  write_reuse_object(): convert to new revindex API
  ...
2021-01-25 14:19:20 -08:00
Junio C Hamano
7eefa1349b Merge branch 'cc/write-promisor-file'
A bit of code refactoring.

* cc/write-promisor-file:
  pack-write: die on error in write_promisor_file()
  fetch-pack: refactor writing promisor file
  fetch-pack: rename helper to create_promisor_file()
2021-01-25 14:19:19 -08:00
Junio C Hamano
42342b3ee6 Merge branch 'ab/mailmap'
Clean-up docs, codepaths and tests around mailmap.

* ab/mailmap: (22 commits)
  shortlog: remove unused(?) "repo-abbrev" feature
  mailmap doc + tests: document and test for case-insensitivity
  mailmap tests: add tests for empty "<>" syntax
  mailmap tests: add tests for whitespace syntax
  mailmap tests: add a test for comment syntax
  mailmap doc + tests: add better examples & test them
  tests: refactor a few tests to use "test_commit --append"
  test-lib functions: add an --append option to test_commit
  test-lib functions: add --author support to test_commit
  test-lib functions: document arguments to test_commit
  test-lib functions: expand "test_commit" comment template
  mailmap: test for silent exiting on missing file/blob
  mailmap tests: get rid of overly complex blame fuzzing
  mailmap tests: add a test for "not a blob" error
  mailmap tests: remove redundant entry in test
  mailmap tests: improve --stdin tests
  mailmap tests: modernize syntax & test idioms
  mailmap tests: use our preferred whitespace syntax
  mailmap doc: start by mentioning the comment syntax
  check-mailmap doc: note config options
  ...
2021-01-25 14:19:19 -08:00
Junio C Hamano
60ecad090d Merge branch 'ps/fetch-atomic'
"git fetch" learns to treat ref updates atomically in all-or-none
fashion, just like "git push" does, with the new "--atomic" option.

* ps/fetch-atomic:
  fetch: implement support for atomic reference updates
  fetch: allow passing a transaction to `s_update_ref()`
  fetch: refactor `s_update_ref` to use common exit path
  fetch: use strbuf to format FETCH_HEAD updates
  fetch: extract writing to FETCH_HEAD
2021-01-25 14:19:19 -08:00
Junio C Hamano
dfcd905069 Merge branch 'jc/deprecate-pack-redundant'
Warn loudly when the "pack-redundant" command, which has been left
stale with almost unusable performance issues, gets used, as we no
longer want to recommend its use (instead just "repack -d" instead).

* jc/deprecate-pack-redundant:
  pack-redundant: gauge the usage before proposing its removal
2021-01-25 14:19:18 -08:00
Junio C Hamano
9e409d7e07 Merge branch 'ab/branch-sort'
The implementation of "git branch --sort" wrt the detached HEAD
display has always been hacky, which has been cleaned up.

* ab/branch-sort:
  branch: show "HEAD detached" first under reverse sort
  branch: sort detached HEAD based on a flag
  ref-filter: move ref_sorting flags to a bitfield
  ref-filter: move "cmp_fn" assignment into "else if" arm
  ref-filter: add braces to if/else if/else chain
  branch tests: add to --sort tests
  branch: change "--local" to "--list" in comment
2021-01-25 14:19:17 -08:00
Junio C Hamano
58e2ce9112 Merge branch 'ma/more-opaque-lock-file'
Code clean-up.

* ma/more-opaque-lock-file:
  read-cache: try not to peek into `struct {lock_,temp}file`
  refs/files-backend: don't peek into `struct lock_file`
  midx: don't peek into `struct lock_file`
  commit-graph: don't peek into `struct lock_file`
  builtin/gc: don't peek into `struct lock_file`
2021-01-25 14:19:17 -08:00
Junio C Hamano
c7d6d419b0 Merge branch 'ab/mktag'
"git mktag" validates its input using its own rules before writing
a tag object---it has been updated to share the logic with "git
fsck".

* ab/mktag: (23 commits)
  mktag: add a --[no-]strict option
  mktag: mark strings for translation
  mktag: convert to parse-options
  mktag: allow omitting the header/body \n separator
  mktag: allow turning off fsck.extraHeaderEntry
  fsck: make fsck_config() re-usable
  mktag: use fsck instead of custom verify_tag()
  mktag: use puts(str) instead of printf("%s\n", str)
  mktag: remove redundant braces in one-line body "if"
  mktag: use default strbuf_read() hint
  mktag tests: test verify_object() with replaced objects
  mktag tests: improve verify_object() test coverage
  mktag tests: test "hash-object" compatibility
  mktag tests: stress test whitespace handling
  mktag tests: run "fsck" after creating "mytag"
  mktag tests: don't create "mytag" twice
  mktag tests: don't redirect stderr to a file needlessly
  mktag tests: remove needless SHA-1 hardcoding
  mktag tests: use "test_commit" helper
  mktag tests: don't needlessly use a subshell
  ...
2021-01-25 14:19:17 -08:00
Derrick Stolee
dd23022acb sparse-checkout: load sparse-checkout patterns
A future feature will want to load the sparse-checkout patterns into a
pattern_list, but the current mechanism to do so is a bit complicated.
This is made difficult due to needing to find the sparse-checkout file
in different ways throughout the codebase.

The logic implemented in the new get_sparse_checkout_patterns() was
duplicated in populate_from_existing_patterns() in unpack-trees.c. Use
the new method instead, keeping the logic around handling the struct
unpack_trees_options.

The callers to get_sparse_checkout_filename() in
builtin/sparse-checkout.c manipulate the sparse-checkout file directly,
so it is not appropriate to replace logic in that file with
get_sparse_checkout_patterns().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:07 -08:00
Derrick Stolee
fb0882648e cache-tree: clean up cache_tree_update()
Make the method safer by allocating a cache_tree member for the given
index_state if it is not already present. This is preferrable to a
BUG() statement or returning with an error because future callers will
want to populate an empty cache-tree using this method.

Callers can also remove their conditional allocations of cache_tree.

Also drop local variables that can be found directly from the 'istate'
parameter.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 17:14:07 -08:00
ZheNing Hu
93a7d9835f ls-files.c: add --deduplicate option
During a merge conflict, the name of a file may appear multiple
times in "git ls-files" output, once for each stage.  If you use
both `--delete` and `--modify` at the same time, the output may
mention a deleted file twice.

When none of the '-t', '-u', or '-s' options is in use, these
duplicate entries do not add much value to the output.

Introduce a new '--deduplicate' option to suppress them.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: extended doc and rewritten commit log]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 11:48:20 -08:00
ZheNing Hu
ed644d1666 ls_files.c: consolidate two for loops into one
This will make it easier to show only one entry per filename in the
next step.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: corrected the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 11:48:20 -08:00
ZheNing Hu
f1c462ea41 ls_files.c: bugfix for --deleted and --modified
This situation may occur in the original code: lstat() failed
but we use `&st` to feed ie_modified() later.

Therefore, we can directly execute show_ce without the judgment of
ie_modified() when lstat() has failed.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: fixed misindented code]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-23 11:48:11 -08:00
Jacob Vosmaer
be18153b97 builtin/pack-objects.c: avoid iterating all refs
In git-pack-objects, we iterate over all the tags if the --include-tag
option is passed on the command line. For some reason this uses
for_each_ref which is expensive if the repo has many refs. We should
use for_each_tag_ref instead.

Because the add_ref_tag callback will now only visit tags we
simplified it a bit.

The motivation for this change is that we observed performance issues
with a repository on gitlab.com that has 500,000 refs but only 2,000
tags. The fetch traffic on that repo is dominated by CI, and when we
changed CI to fetch with 'git fetch --no-tags' we saw a dramatic
change in the CPU profile of git-pack-objects. This lead us to this
particular ref walk. More details in:
https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/746#note_483546598

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-22 17:27:42 -08:00
Phil Hord
8198907795 use delete_refs when deleting tags or branches
'git tag -d' accepts one or more tag refs to delete, but each deletion
is done by calling `delete_ref` on each argv. This is very slow when
removing from packed refs. Use delete_refs instead so all the removals
can be done inside a single transaction with a single update.

Do the same for 'git branch -d'.

Since delete_refs performs all the packed-refs delete operations
inside a single transaction, if any of the deletes fail then all
them will be skipped. In practice, none of them should fail since
we verify the hash of each one before calling delete_refs, but some
network error or odd permissions problem could have different results
after this change.

Also, since the file-backed deletions are not performed in the same
transaction, those could succeed even when the packed-refs transaction
fails.

After deleting branches, remove the branch config only if the branch
ref was removed and was not subsequently added back in.

A manual test deleting 24,000 tags took about 30 minutes using
delete_ref.  It takes about 5 seconds using delete_refs.

Acked-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Phil Hord <phil.hord@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 16:05:05 -08:00
Jeff King
36a317929b refs: switch peel_ref() to peel_iterated_oid()
The peel_ref() interface is confusing and error-prone:

  - it's typically used by ref iteration callbacks that have both a
    refname and oid. But since they pass only the refname, we may load
    the ref value from the filesystem again. This is inefficient, but
    also means we are open to a race if somebody simultaneously updates
    the ref. E.g., this:

      int some_ref_cb(const char *refname, const struct object_id *oid, ...)
      {
              if (!peel_ref(refname, &peeled))
                      printf("%s peels to %s",
                             oid_to_hex(oid), oid_to_hex(&peeled);
      }

    could print nonsense. It is correct to say "refname peels to..."
    (you may see the "before" value or the "after" value, either of
    which is consistent), but mentioning both oids may be mixing
    before/after values.

    Worse, whether this is possible depends on whether the optimization
    to read from the current iterator value kicks in. So it is actually
    not possible with:

      for_each_ref(some_ref_cb);

    but it _is_ possible with:

      head_ref(some_ref_cb);

    which does not use the iterator mechanism (though in practice, HEAD
    should never peel to anything, so this may not be triggerable).

  - it must take a fully-qualified refname for the read_ref_full() code
    path to work. Yet we routinely pass it partial refnames from
    callbacks to for_each_tag_ref(), etc. This happens to work when
    iterating because there we do not call read_ref_full() at all, and
    only use the passed refname to check if it is the same as the
    iterator. But the requirements for the function parameters are quite
    unclear.

Instead of taking a refname, let's instead take an oid. That fixes both
problems. It's a little funny for a "ref" function not to involve refs
at all. The key thing is that it's optimizing under the hood based on
having access to the ref iterator. So let's change the name to make it
clear why you'd want this function versus just peel_object().

There are two other directions I considered but rejected:

  - we could pass the peel information into the each_ref_fn callback.
    However, we don't know if the caller actually wants it or not. For
    packed-refs, providing it is essentially free. But for loose refs,
    we actually have to peel the object, which would be wasteful in most
    cases. We could likewise pass in a flag to the callback indicating
    whether the peeled information is known, but that complicates those
    callbacks, as they then have to decide whether to manually peel
    themselves. Plus it requires changing the interface of every
    callback, whether they care about peeling or not, and there are many
    of them.

  - we could make a function to return the peeled value of the current
    iterated ref (computing it if necessary), and BUG() otherwise. I.e.:

      int peel_current_iterated_ref(struct object_id *out);

    Each of the current callers is an each_ref_fn callback, so they'd
    mostly be happy. But:

      - we use those callbacks with functions like head_ref(), which do
        not use the iteration code. So we'd need to handle the fallback
        case there, anyway.

      - it's possible that a caller would want to call into generic code
        that sometimes is used during iteration and sometimes not. This
        encapsulates the logic to do the fast thing when possible, and
        fallback when necessary.

The implementation is mostly obvious, but I want to call out a few
things in the patch:

  - the test-tool coverage for peel_ref() is now meaningless, as it all
    collapses to a single peel_object() call (arguably they were pretty
    uninteresting before; the tricky part of that function is the
    fast-path we see during iteration, but these calls didn't trigger
    that). I've just dropped it entirely, though note that some other
    tests relied on the tags we created; I've moved that creation to the
    tests where it matters.

  - we no longer need to take a ref_store parameter, since we'd never
    look up a ref now. We do still rely on a global "current iterator"
    variable which _could_ be kept per-ref-store. But in practice this
    is only useful if there are multiple recursive iterations, at which
    point the more appropriate solution is probably a stack of
    iterators. No caller used the actual ref-store parameter anyway
    (they all call the wrapper that passes the_repository).

  - the original only kicked in the optimization when the "refname"
    pointer matched (i.e., not string comparison). We do likewise with
    the "oid" parameter here, but fall back to doing an actual oideq()
    call. This in theory lets us kick in the optimization more often,
    though in practice no current caller cares. It should never be
    wrong, though (peeling is a property of an object, so two refs
    pointing to the same object would peel identically).

  - the original took care not to touch the peeled out-parameter unless
    we found something to put in it. But no caller cares about this, and
    anyway, it is enforced by peel_object() itself (and even in the
    optimized iterator case, that's where we eventually end up). We can
    shorten the code and avoid an extra copy by just passing the
    out-parameter through the stack.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 15:51:31 -08:00
Derrick Stolee
96eaffebbf maintenance: set log.excludeDecoration durin prefetch
The 'prefetch' task fetches refs from all remotes and places them in the
refs/prefetch/<remote>/ refspace. As this task is intended to run in the
background, this allows users to keep their local data very close to the
remote servers' data while not updating the users' understanding of the
remote refs in refs/remotes/<remote>/.

However, this can clutter 'git log' decorations with copies of the refs
with the full name 'refs/prefetch/<remote>/<branch>'.

The log.excludeDecoration config option was added in a6be5e67 (log: add
log.excludeDecoration config option, 2020-05-16) for exactly this
purpose.

Ensure we set this only for users that would benefit from it by
assigning it at the beginning of the prefetch task. Other alternatives
would be during 'git maintenance register' or 'git maintenance start',
but those might assign the config even when the prefetch task is
disabled by existing config. Further, users could run 'git maintenance
run --task=prefetch' using their own scripting or scheduling. This
provides the best coverage to automatically update the config when
valuable.

It is improbable, but possible, that users might want to run the
prefetch task _and_ see these refs in their log decorations. This seems
incredibly unlikely to me, but users can always opt-in on a
command-by-command basis using --decorate-refs=refs/prefetch/.

Test that this works in a few cases. In particular, ensure that our
assignment of log.excludeDecoration=refs/prefetch/ is additive to other
existing exclusions. Further, ensure we do not add multiple copies in
multiple runs.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-20 18:46:22 -08:00
Junio C Hamano
aa08688362 Merge branch 'ds/for-each-repo-noopfix'
"git for-each-repo --config=<var> <cmd>" should not run <cmd> for
any repository when the configuration variable <var> is not defined
even once.

* ds/for-each-repo-noopfix:
  for-each-repo: do nothing on empty config
2021-01-15 21:48:46 -08:00
Junio C Hamano
b2ace18759 Merge branch 'ds/maintenance-part-4'
Follow-up on the "maintenance part-3" which introduced scheduled
maintenance tasks to support platforms whose native scheduling
methods are not 'cron'.

* ds/maintenance-part-4:
  maintenance: use Windows scheduled tasks
  maintenance: use launchctl on macOS
  maintenance: include 'cron' details in docs
  maintenance: extract platform-specific scheduling
2021-01-15 21:48:45 -08:00
Junio C Hamano
62fb47a4d3 Merge branch 'en/stash-apply-sparse-checkout'
"git stash" did not work well in a sparsely checked out working
tree.

* en/stash-apply-sparse-checkout:
  stash: fix stash application in sparse-checkouts
  stash: remove unnecessary process forking
  t7012: add a testcase demonstrating stash apply bugs in sparse checkouts
2021-01-15 15:20:29 -08:00
Junio C Hamano
2ce8de6bf9 Merge branch 'zh/arg-help-format'
Clean up option descriptions in "git cmd --help".

* zh/arg-help-format:
  builtin/*: update usage format
  parse-options: format argh like error messages
2021-01-15 15:20:29 -08:00
Junio C Hamano
df26861c56 Merge branch 'rs/rebase-commit-validation'
Diagnose command line error of "git rebase" early.

* rs/rebase-commit-validation:
  rebase: verify commit parameter
2021-01-15 15:20:29 -08:00
Junio C Hamano
8b327f1784 Merge branch 'ma/sha1-is-a-hash'
Retire more names with "sha1" in it.

* ma/sha1-is-a-hash:
  hash-lookup: rename from sha1-lookup
  sha1-lookup: rename `sha1_pos()` as `hash_pos()`
  object-file.c: rename from sha1-file.c
  object-name.c: rename from sha1-name.c
2021-01-15 15:20:29 -08:00
Junio C Hamano
9ba366f12b Merge branch 'bc/rev-parse-path-format'
"git rev-parse" can be explicitly told to give output as absolute
or relative path with the `--path-format=(absolute|relative)` option.

* bc/rev-parse-path-format:
  rev-parse: add option for absolute or relative path formatting
  abspath: add a function to resolve paths with missing components
2021-01-15 15:20:28 -08:00
Taylor Blau
2891b434ac builtin/gc.c: guess the size of the revindex
'estimate_repack_memory()' takes into account the amount of memory
required to load the reverse index in memory by multiplying the assumed
number of objects by the size of the 'revindex_entry' struct.

Prepare for hiding the definition of 'struct revindex_entry' by removing
a 'sizeof()' of that type from outside of pack-revindex.c. Instead,
guess that one off_t and one uint32_t are required per object. Strictly
speaking, this is a worse guess than asking for 'sizeof(struct
revindex_entry)' directly, since the true size of this struct is 16
bytes with padding on the end of the struct in order to align the offset
field.

But, this is an approximation anyway, and it does remove a use of the
'struct revindex_entry' from outside of pack-revindex internals.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:47 -08:00
Taylor Blau
eb3fd99efd check_object(): convert to new revindex API
Replace direct accesses to the revindex with calls to
'offset_to_pack_pos()' and 'pack_pos_to_index()'.

Since this caller already had some error checking (it can jump to the
'give_up' label if it encounters an error), we can easily check whether
or not the provided offset points to an object in the given pack. This
error checking existed prior to this patch, too, since the caller checks
whether the return value from 'find_pack_revindex()' was NULL or not.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:45 -08:00
Taylor Blau
6a5c10c45f write_reused_pack_verbatim(): convert to new revindex API
Replace a direct access to the revindex array with
'pack_pos_to_offset()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:45 -08:00
Taylor Blau
66cbd3e2fb write_reused_pack_one(): convert to new revindex API
Replace direct revindex accesses with calls to 'pack_pos_to_offset()'
and 'pack_pos_to_index()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:45 -08:00
Taylor Blau
952fc6870d write_reuse_object(): convert to new revindex API
First replace 'find_pack_revindex()' with its replacement
'offset_to_pack_pos()'. This prevents any bogus OFS_DELTA that may make
its way through until 'write_reuse_object()' from causing a bad memory
read (if 'revidx' is 'NULL')

Next, replace a direct access of '->nr' with the wrapper function
'pack_pos_to_index()'.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-13 21:53:45 -08:00
Christian Couder
33add2ad7d fetch-pack: refactor writing promisor file
Let's replace the 2 different pieces of code that write a
promisor file in 'builtin/repack.c' and 'fetch-pack.c'
with a new function called 'write_promisor_file()' in
'pack-write.c' and 'pack.h'.

This might also help us in the future, if we want to put
back the ref names and associated hashes that were in
the promisor files we are repacking in 'builtin/repack.c'
as suggested by a NEEDSWORK comment just above the code
we are refactoring.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 16:01:07 -08:00
Ævar Arnfjörð Bjarmason
4e168333a8 shortlog: remove unused(?) "repo-abbrev" feature
Remove support for the magical "repo-abbrev" comment in .mailmap
files. This was added to .mailmap parsing in [1], as a generalized
feature of the git-shortlog Perl script added earlier in [2].

There was no documentation or tests for this feature, and I don't
think it's used in practice anymore.

What it did was to allow you to specify a single string to be
search-replaced with "/.../" in the .mailmap file. E.g. for
linux.git's current .mailmap:

    git archive --remote=git@gitlab.com:linux-kernel/linux.git \
        HEAD -- .mailmap | grep -a repo-abbrev
    # repo-abbrev: /pub/scm/linux/kernel/git/

Then when running e.g.:

    git shortlog --merges --author=Linus -1 v5.10-rc7..v5.10 | grep Merge

We'd emit (the [...] is mine):

      Merge tag [...]git://git.kernel.org/.../tip/tip

But will now emit:

      Merge tag [...]git.kernel.org/pub/scm/linux/kernel/git/tip/tip

I think at this point this is just a historical artifact we can get
rid of. It was initially meant for Linus's own use when we integrated
the Perl script[2], but since then it seems he's stopped using it.

Digging through Linus's release announcements on the LKML[3] the last
release I can find that made use of this output is Linux 2.6.25-rc6
back in March 2008[4]. Later on Linus started using --no-merges[5],
and nowadays seems to prefer some custom not-quite-shortlog format of
merges from lieutenants[6].

You will still see it on linux.git if you run "git shortlog" manually
yourself with --merges, with this removed you can still get the same
output with:

    git log --pretty=fuller v5.10-rc7..v5.10 |
    sed 's!/pub/scm/linux/kernel/git/!/.../!g' |
    git shortlog

Arguably we should do the same for the search-replacing of "[PATCH]"
at the beginning with "". That seems to be another relic of a bygone
era when linux.git patches would have their E-Mail subject lines
applied as-is by "git am" or whatever. But we documented that feature
in "git-shortlog(1)", and it seems more widely applicable than
something purely kernel-specific.

1. 7595e2ee6e (git-shortlog: make common repository prefix
   configurable with .mailmap, 2006-11-25)
2. fa375c7f1b (Add git-shortlog perl script, 2005-06-04)
3. https://lore.kernel.org/lkml/
4. https://lore.kernel.org/lkml/alpine.LFD.1.00.0803161651350.3020@woody.linux-foundation.org/
5. https://lore.kernel.org/lkml/BANLkTinrbh7Xi27an3uY7pDWrNKhJRYmEA@mail.gmail.com/
6. https://lore.kernel.org/lkml/CAHk-=wg1+kf1AVzXA-RQX0zjM6t9J2Kay9xyuNqcFHWV-y5ZYw@mail.gmail.com/

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 14:04:42 -08:00
Patrick Steinhardt
c7b190dabd fetch: implement support for atomic reference updates
When executing a fetch, then git will currently allocate one reference
transaction per reference update and directly commit it. This means that
fetches are non-atomic: even if some of the reference updates fail,
others may still succeed and modify local references.

This is fine in many scenarios, but this strategy has its downsides.

- The view of remote references may be inconsistent and may show a
  bastardized state of the remote repository.

- Batching together updates may improve performance in certain
  scenarios. While the impact probably isn't as pronounced with loose
  references, the upcoming reftable backend may benefit as it needs to
  write less files in case the update is batched.

- The reference-update hook is currently being executed twice per
  updated reference. While this doesn't matter when there is no such
  hook, we have seen severe performance regressions when doing a
  git-fetch(1) with reference-transaction hook when the remote
  repository has hundreds of thousands of references.

Similar to `git push --atomic`, this commit thus introduces atomic
fetches. Instead of allocating one reference transaction per updated
reference, it causes us to only allocate a single transaction and commit
it as soon as all updates were received. If locking of any reference
fails, then we abort the complete transaction and don't update any
reference, which gives us an all-or-nothing fetch.

Note that this may not completely fix the first of above downsides, as
the consistent view also depends on the server-side. If the server
doesn't have a consistent view of its own references during the
reference negotiation phase, then the client would get the same
inconsistent view the server has. This is a separate problem though and,
if it actually exists, can be fixed at a later point.

This commit also changes the way we write FETCH_HEAD in case `--atomic`
is passed. Instead of writing changes as we go, we need to accumulate
all changes first and only commit them at the end when we know that all
reference updates succeeded. Ideally, we'd just do so via a temporary
file so that we don't need to carry all updates in-memory. This isn't
trivially doable though considering the `--append` mode, where we do not
truncate the file but simply append to it. And given that we support
concurrent processes appending to FETCH_HEAD at the same time without
any loss of data, seeding the temporary file with current contents of
FETCH_HEAD initially and then doing a rename wouldn't work either. So
this commit implements the simple strategy of buffering all changes and
appending them to the file on commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:15 -08:00
Patrick Steinhardt
d4c8db8f1b fetch: allow passing a transaction to s_update_ref()
The handling of ref updates is completely handled by `s_update_ref()`,
which will manage the complete lifecycle of the reference transaction.
This is fine right now given that git-fetch(1) does not support atomic
fetches, so each reference gets its own transaction. It is quite
inflexible though, as `s_update_ref()` only knows about a single
reference update at a time, so it doesn't allow us to alter the
strategy.

This commit prepares `s_update_ref()` and its only caller
`update_local_ref()` to allow passing an external transaction. If none
is given, then the existing behaviour is triggered which creates a new
transaction and directly commits it. Otherwise, if the caller provides a
transaction, then we only queue the update but don't commit it. This
optionally allows the caller to manage when a transaction will be
committed.

Given that `update_local_ref()` is always called with a `NULL`
transaction for now, no change in behaviour is expected from this
change.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:15 -08:00
Patrick Steinhardt
c45889f104 fetch: refactor s_update_ref to use common exit path
The cleanup code in `s_update_ref()` is currently duplicated for both
succesful and erroneous exit paths. This commit refactors the function
to have a shared exit path for both cases to remove the duplication.

Suggested-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:15 -08:00
Patrick Steinhardt
929d044575 fetch: use strbuf to format FETCH_HEAD updates
This commit refactors `append_fetch_head()` to use a `struct strbuf` for
formatting the update which we're about to append to the FETCH_HEAD
file. While the refactoring doesn't have much of a benefit right now, it
serves as a preparatory step to implement atomic fetches where we need
to buffer all updates to FETCH_HEAD and only flush them out if all
reference updates succeeded.

No change in behaviour is expected from this commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:14 -08:00
Patrick Steinhardt
58a646a368 fetch: extract writing to FETCH_HEAD
When performing a fetch with the default `--write-fetch-head` option, we
write all updated references to FETCH_HEAD while the updates are
performed. Given that updates are not performed atomically, it means
that we we write to FETCH_HEAD even if some or all of the reference
updates fail.

Given that we simply update FETCH_HEAD ad-hoc with each reference, the
logic is completely contained in `store_update_refs` and thus quite hard
to extend. This can already be seen by the way we skip writing to the
FETCH_HEAD: instead of having a conditional which simply skips writing,
we instead open "/dev/null" and needlessly write all updates there.

We are about to extend git-fetch(1) to accept an `--atomic` flag which
will make the fetch an all-or-nothing operation with regards to the
reference updates. This will also require us to make the updates to
FETCH_HEAD an all-or-nothing operation, but as explained doing so is not
easy with the current layout. This commit thus refactors the wa we write
to FETCH_HEAD and pulls out the logic to open, append to, commit and
close the file. While this may seem rather over-the top at first,
pulling out this logic will make it a lot easier to update the code in a
subsequent commit. It also allows us to easily skip writing completely
in case `--no-write-fetch-head` was passed.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12 12:06:14 -08:00
Derrick Stolee
6c62f01552 for-each-repo: do nothing on empty config
'git for-each-repo --config=X' should return success without calling any
subcommands when the config key 'X' has no value. The current
implementation instead segfaults.

A user could run into this issue if they used 'git maintenance start' to
initialize their cron schedule using 'git for-each-repo
--config=maintenance.repo ...' but then using 'git maintenance
unregister' to remove the config option. (Note: 'git maintenance stop'
would remove the config _and_ remove the cron schedule.)

Add a simple test to ensure this works. Use 'git help --no-such-option'
as the potential subcommand to ensure that we will hit a failure if the
subcommand is ever run.

Reported-by: Andreas Bühmann <dev@uuml.de>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 19:12:02 -08:00
Ævar Arnfjörð Bjarmason
2708ce62d2 branch: sort detached HEAD based on a flag
Change the ref-filter sorting of detached HEAD to check the
FILTER_REFS_DETACHED_HEAD flag, instead of relying on the ref
description filled-in by get_head_description() to start with "(",
which in turn we expect to ASCII-sort before any other reference.

For context, we'd like the detached line to appear first at the start
of "git branch -l", e.g.:

    $ git branch -l
    * (HEAD detached at <hash>)
      master

This doesn't change that, but improves on a fix made in
28438e84e0 (ref-filter: sort detached HEAD lines firstly, 2019-06-18)
and gives the Chinese translation the ability to use its preferred
punctuation marks again.

In Chinese the fullwidth versions of punctuation like "()" are
typically written as (U+FF08 fullwidth left parenthesis), (U+FF09
fullwidth right parenthesis) instead[1]. This form is used in both
po/zh_{CN,TW}.po in most cases where "()" is translated in a string.

Aside from that improvement to the Chinese translation, it also just
makes for cleaner code that we mark any special cases in the ref_array
we're sorting with flags and make the sort function aware of them,
instead of piggy-backing on the general-case of strcmp() doing the
right thing.

As seen in the amended tests this made reverse sorting a bit more
consistent. Before this we'd sometimes sort this message in the
middle, now it's consistently at the beginning or end, depending on
whether we're doing a normal or reverse sort. Having it at the end
doesn't make much sense either, but at least it behaves consistently
now. A follow-up commit will make this behavior under reverse sorting
even better.

I'm removing the "TRANSLATORS" comments that were in the old code
while I'm at it. Those were added in d4919bb288 (ref-filter: move
get_head_description() from branch.c, 2017-01-10). I think it's
obvious from context, string and translation memory in typical
translation tools that these are the same or similar string.

1. https://en.wikipedia.org/wiki/Chinese_punctuation#Marks_similar_to_European_punctuation

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:13:21 -08:00
Ævar Arnfjörð Bjarmason
7c269a7b16 ref-filter: move ref_sorting flags to a bitfield
Change the reverse/ignore_case/version sort flags in the ref_sorting
struct into a bitfield. Having three of them was already a bit
unwieldy, but it would be even more so if another flag needed a
function like ref_sorting_icase_all() introduced in
76f9e569ad (ref-filter: apply --ignore-case to all sorting keys,
2020-05-03).

A follow-up change will introduce such a flag, so let's move this over
to a bitfield. Instead of using the usual '#define' pattern I'm using
the "enum" pattern from builtin/rebase.c's b4c8eb024a (builtin
rebase: support --quiet, 2018-09-04).

Perhaps there's a more idiomatic way of doing the "for each in list
amend mask" pattern than this "mask/on" variable combo. This function
doesn't allow us to e.g. do any arbitrary changes to the bitfield for
multiple flags, but I think in this case that's fine. The common case
is that we're calling this with a list of one.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-07 15:13:21 -08:00
Junio C Hamano
8664fcb83b Merge branch 'es/worktree-repair-both-moved'
"git worktree repair" learned to deal with the case where both the
repository and the worktree moved.

* es/worktree-repair-both-moved:
  worktree: teach `repair` to fix multi-directional breakage
2021-01-06 23:33:44 -08:00
Junio C Hamano
d3fa84d528 Merge branch 'fc/pull-merge-rebase'
When a user does not tell "git pull" to use rebase or merge, the
command gives a loud message telling a user to choose between
rebase or merge but creates a merge anyway, forcing users who would
want to rebase to redo the operation.  Fix an early part of this
problem by tightening the condition to give the message---there is
no reason to stop or force the user to choose between rebase or
merge if the history fast-forwards.

* fc/pull-merge-rebase:
  pull: display default warning only when non-ff
  pull: correct condition to trigger non-ff advice
  pull: get rid of unnecessary global variable
  pull: give the advice for choosing rebase/merge much later
  pull: refactor fast-forward check
2021-01-06 23:33:44 -08:00
Junio C Hamano
c256631065 Merge branch 'tb/pack-bitmap'
Various improvements to the codepath that writes out pack bitmaps.

* tb/pack-bitmap: (24 commits)
  pack-bitmap-write: better reuse bitmaps
  pack-bitmap-write: relax unique revwalk condition
  pack-bitmap-write: use existing bitmaps
  pack-bitmap: factor out 'add_commit_to_bitmap()'
  pack-bitmap: factor out 'bitmap_for_commit()'
  pack-bitmap-write: ignore BITMAP_FLAG_REUSE
  pack-bitmap-write: build fewer intermediate bitmaps
  pack-bitmap.c: check reads more aggressively when loading
  pack-bitmap-write: rename children to reverse_edges
  t5310: add branch-based checks
  commit: implement commit_list_contains()
  bitmap: implement bitmap_is_subset()
  pack-bitmap-write: fill bitmap with commit history
  pack-bitmap-write: pass ownership of intermediate bitmaps
  pack-bitmap-write: reimplement bitmap writing
  ewah: add bitmap_dup() function
  ewah: implement bitmap_or()
  ewah: make bitmap growth less aggressive
  ewah: factor out bitmap growth
  rev-list: die when --test-bitmap detects a mismatch
  ...
2021-01-06 23:33:43 -08:00
Ævar Arnfjörð Bjarmason
ffdd02a55d branch: change "--local" to "--list" in comment
There has never been a "git branch --local", this is just a typo for
"--list". Fixes a comment added in 23e714df91 (branch: roll
show_detached HEAD into regular ref_list, 2015-09-23).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:15:39 -08:00
ZheNing Hu
e73fe3dd02 builtin/*: update usage format
According to the guidelines in parse-options.h,
we should not end in a full stop or start with
a capital letter. Fix old error and usage
messages to match this expectation.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 15:10:49 -08:00
Ævar Arnfjörð Bjarmason
06ce79152b mktag: add a --[no-]strict option
Now that mktag has been migrated to use the fsck machinery to check
its input, it makes sense to teach it to run in the equivalent of "git
fsck"'s default mode.

For cases where mktag is used to (re)create a tag object using data
from an existing and malformed tag object, the validation may
optionally have to be loosened. Teach the command to take the
"--[no-]strict" option to do so.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 14:22:24 -08:00
Martin Ågren
d4a4976648 builtin/gc: don't peek into struct lock_file
A `struct lock_file` is pretty much just a wrapper around a tempfile.
But it's easy enough to avoid relying on this. Use the wrappers that the
lock file API provides rather than peeking at the temp file or even into
*its* internals.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06 13:53:32 -08:00
Ævar Arnfjörð Bjarmason
2aa9425fbe mktag: mark strings for translation
Mark the errors mktag might emit for translation. This is a plumbing
command, but the errors it emits are intended to be human-readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
3f390a366c mktag: convert to parse-options
Convert the "mktag" command to use parse-options.h instead of its own
ad-hoc argc handling. This doesn't matter much in practice since it
doesn't support any options, but removes another special-case in our
codebase, and makes it easier to add options to it in the future.

It does marginally improve the situation for programs that want to
execute git commands in a consistent manner and e.g. always use
--end-of-options. E.g. "gitaly" does that, and has a blacklist of
built-ins that don't support --end-of-options. This is one less
special case for it and other similar programs to support.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
acfc01332b mktag: allow turning off fsck.extraHeaderEntry
In earlier commits mktag learned to use the fsck machinery, at which
point we needed to add fsck.extraHeaderEntry so it could be as strict
about extra headers as it's been ever since it was implemented.

But it's not nice to need to switch away from "mktag" to "hash-object"
+ manual "fsck" just because you'd like to have an extra header. So
let's support turning it off by getting "fsck.*" variables from the
config.

Pedantically speaking it's still not possible to make "mktag" behave
just like "hash-object -t tag" does, since we're unconditionally going
to check the referenced object in verify_object_in_tag(), which is our
own check, and not one that exists in fsck.c.

But the spirit of "this works like fsck" is preserved, in that if you
created such a tag with "hash-object" and did a full "fsck" on the
repository it would also error out about that invalid object, it just
wouldn't emit the same message as fsck does.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
1f3299fda9 fsck: make fsck_config() re-usable
Move the fsck_config() function from builtin/fsck.c to fsck.[ch]. This
allows for re-using it in other tools that expose fsck logic and want
to support its configuration variables.

A logical continuation of this change would be to use a common
function for all of {fetch,receive}.fsck.* and fsck.*. See
5d477a334a (fsck (receive-pack): allow demoting errors to warnings,
2015-06-22) and my own 1362df0d41 (fetch: implement fetch.fsck.*,
2018-07-27) for the relevant code.

However, those routines want to not parse the fsck.skipList into OIDs,
but rather pass them along with the --strict option to another
process. It would be possible to refactor that whole thing so we
support e.g. a "fetch." prefix, then just keep track of the skiplist
as a filename instead of parsing it, and learn to spew that all out
from our internal structures into something we can append to the
--strict option.

But instead I'm planning to re-use this in "mktag", which'll just
re-use these "fsck.*" variables as-is.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
acf9de4c94 mktag: use fsck instead of custom verify_tag()
Change the validation logic in "mktag" to use fsck's fsck_tag()
instead of its own custom parser. Curiously the logic for both dates
back to the same commit[1]. Let's unify them so we're not maintaining
two sets functions to verify that a tag is OK.

The behavior of fsck_tag() and the old "mktag" code being removed here
is different in few aspects.

I think it makes sense to remove some of those checks, namely:

 A. fsck only cares that the timezone matches [-+][0-9]{4}. The mktag
    code disallowed values larger than 1400.

    Yes there's currently no timezone with a greater offset[2], but
    since we allow any number of non-offical timezones (e.g. +1234)
    passing this through seems fine. Git also won't break in the
    future if e.g. French Polynesia decides it needs to outdo the Line
    Islands when it comes to timezone extravagance.

 B. fsck allows missing author names such as "tagger <email>", mktag
    wouldn't, but would allow e.g. "tagger [2 spaces] <email>" (but
    not "tagger [1 space] <email>"). Now we allow all of these.

 C. Like B, but "mktag" disallowed spaces in the <email> part, fsck
    allows it.

In some ways fsck_tag() is stricter than "mktag" was, namely:

 D. fsck disallows zero-padded dates, but mktag didn't care. So
    e.g. the timestamp "0000000000 +0000" produces an error now. A
    test in "t1006-cat-file.sh" relied on this, it's been changed to
    use "hash-object" (without fsck) instead.

There was one check I deemed worth keeping by porting it over to
fsck_tag():

 E. "mktag" did not allow any custom headers, and by extension (as an
    empty commit is allowed) also forbade an extra stray trailing
    newline after the headers it knew about.

    Add a new check in the "ignore" category to fsck and use it. This
    somewhat abuses the facility added in efaba7cc77 (fsck:
    optionally ignore specific fsck issues completely, 2015-06-22).

    This is somewhat of hack, but probably the least invasive change
    we can make here. The fsck command will shuffle these categories
    around, e.g. under --strict the "info" becomes a "warn" and "warn"
    becomes "error". Existing users of fsck's (and others,
    e.g. index-pack) --strict option rely on this.

    So we need to put something into a category that'll be ignored by
    all existing users of the API. Pretending that
    fsck.extraHeaderEntry=error ("ignore" by default) was set serves
    to do this for us.

1. ec4465adb3 (Add "tag" objects that can be used to sign other
   objects., 2005-04-25)

2. https://en.wikipedia.org/wiki/List_of_UTC_time_offsets

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
40ef015a27 mktag: use puts(str) instead of printf("%s\n", str)
This introduces no functional change, but refactors the print-out of
the hash at the end to do the same thing with less code.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
dfe3948728 mktag: remove redundant braces in one-line body "if"
This minor stylistic churn is usually something we'd avoid, but if we
don't do this then the file after changes in subsequent commits will
only have this minor style inconsistency, so let's change this while
we're at it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason
0c439117bb mktag: use default strbuf_read() hint
Change the hardcoded hint of 2^12 to 0. The default strbuf hint is
perfectly fine here, and the only reason we were hardcoding it is
because it survived migration from a pre-strbuf fixed-sized buffer.

See fd17f5b5f7 (Replace all read_fd use with strbuf_read, and get rid
of it., 2007-09-10) for that migration.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:58:29 -08:00
Derrick Stolee
3797a0a7b7 maintenance: use Windows scheduled tasks
Git's background maintenance uses cron by default, but this is not
available on Windows. Instead, integrate with Task Scheduler.

Tasks can be scheduled using the 'schtasks' command. There are several
command-line options that can allow for some advanced scheduling, but
unfortunately these seem to all require authenticating using a password.

Instead, use the "/xml" option to pass an XML file that contains the
configuration for the necessary schedule. These XML files are based on
some that I exported after constructing a schedule in the Task Scheduler
GUI. These options only run background maintenance when the user is
logged in, and more fields are populated with the current username and
SID at run-time by 'schtasks'.

Since the GIT_TEST_MAINT_SCHEDULER environment variable allows us to
specify 'schtasks' as the scheduler, we can test the Windows-specific
logic on other platforms. Thus, add a check that the XML file written
by Git is valid when xmllint exists on the system.

Since we use a temporary file for the XML files sent to 'schtasks', we
prefix the random characters with the frequency so it is easier to
examine the proper file during tests. Instead of an exact match on the
'args' file, we 'grep' for the arguments other than the filename.

There is a deficiency in the current design. Windows has two kinds of
applications: GUI applications that start by "winmain()" and console
applications that start by "main()". Console applications are attached
to a new Console window if they are not already associated with a GUI
application. This means that every hour the scheudled task launches a
command window for the scheduled tasks. Not only is this visually
obtrusive, but it also takes focus from whatever else the user is
doing!

A simple fix would be to insert a GUI application that acts as a shim
between the scheduled task and Git. This is currently possible in Git
for Windows by setting the <Command> tag equal to

  C:\Program Files\Git\git-bash.exe

with options "--hide --no-needs-console --command=cmd\git.exe"
followed by the arguments currently used. Since git-bash.exe is not
included in Windows builds of core Git, I chose to leave out this
feature. My plan is to submit a small patch to Git for Windows that
converts the use of git.exe with this use of git-bash.exe in the
short term. In the long term, we can consider creating this GUI
shim application within core Git, perhaps in contrib/.

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:38:02 -08:00
Derrick Stolee
2afe7e3567 maintenance: use launchctl on macOS
The existing mechanism for scheduling background maintenance is done
through cron. The 'crontab -e' command allows updating the schedule
while cron itself runs those commands. While this is technically
supported by macOS, it has some significant deficiencies:

1. Every run of 'crontab -e' must request elevated privileges through
   the user interface. When running 'git maintenance start' from the
   Terminal app, it presents a dialog box saying "Terminal.app would
   like to administer your computer. Administration can include
   modifying passwords, networking, and system settings." This is more
   alarming than what we are hoping to achieve. If this alert had some
   information about how "git" is trying to run "crontab" then we would
   have some reason to believe that this dialog might be fine. However,
   it also doesn't help that some scenarios just leave Git waiting for
   a response without presenting anything to the user. I experienced
   this when executing the command from a Bash terminal view inside
   Visual Studio Code.

2. While cron initializes a user environment enough for "git config
   --global --show-origin" to show the correct config file information,
   it does not set up the environment enough for Git Credential Manager
   Core to load credentials during a 'prefetch' task. My prefetches
   against private repositories required re-authenticating through UI
   pop-ups in a way that should not be required.

The solution is to switch from cron to the Apple-recommended [1]
'launchd' tool.

[1] https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/ScheduledJobs.html

The basics of this tool is that we need to create XML-formatted
"plist" files inside "~/Library/LaunchAgents/" and then use the
'launchctl' tool to make launchd aware of them. The plist files
include all of the scheduling information, along with the command-line
arguments split across an array of <string> tags.

For example, here is my plist file for the weekly scheduled tasks:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0"><dict>
<key>Label</key><string>org.git-scm.git.weekly</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/libexec/git-core/git</string>
<string>--exec-path=/usr/local/libexec/git-core</string>
<string>for-each-repo</string>
<string>--config=maintenance.repo</string>
<string>maintenance</string>
<string>run</string>
<string>--schedule=weekly</string>
</array>
<key>StartCalendarInterval</key>
<array>
<dict>
<key>Day</key><integer>0</integer>
<key>Hour</key><integer>0</integer>
<key>Minute</key><integer>0</integer>
</dict>
</array>
</dict>
</plist>

The schedules for the daily and hourly tasks are more complicated
since we need to use an array for the StartCalendarInterval with
an entry for each of the six days other than the 0th day (to avoid
colliding with the weekly task), and each of the 23 hours other
than the 0th hour (to avoid colliding with the daily task).

The "Label" value is currently filled with "org.git-scm.git.X"
where X is the frequency. We need a different plist file for each
frequency.

The launchctl command needs to be aligned with a user id in order
to initialize the command environment. This must be done using
the 'launchctl bootstrap' subcommand. This subcommand is new as
of macOS 10.11, which was released in September 2015. Before that
release the 'launchctl load' subcommand was recommended. The best
source of information on this transition I have seen is available
at [2]. The current design does not preclude a future version that
detects the available fatures of 'launchctl' to use the older
commands. However, it is best to rely on the newest version since
Apple might completely remove the deprecated version on short
notice.

[2] https://babodee.wordpress.com/2016/04/09/launchctl-2-0-syntax/

To remove a schedule, we must run 'launchctl bootout' with a valid
plist file. We also need to 'bootout' a task before the 'bootstrap'
subcommand will succeed, if such a task already exists.

The need for a user id requires us to run 'id -u' which works on
POSIX systems but not Windows. Further, the need for fully-qualitifed
path names including $HOME behaves differently in the Git internals and
the external test suite. The $HOME variable starts with "C:\..." instead
of the "/c/..." that is provided by Git in these subcommands. The test
therefore has a prerequisite that we are not on Windows. The cross-
platform logic still allows us to test the macOS logic on a Linux
machine.

We can verify the commands that were run by 'git maintenance start'
and 'git maintenance stop' by injecting a script that writes the
command-line arguments into GIT_TEST_MAINT_SCHEDULER.

An earlier version of this patch accidentally had an opening
"<dict>" tag when it should have had a closing "</dict>" tag. This
was caught during manual testing with actual 'launchctl' commands,
but we do not want to update developers' tasks when running tests.
It appears that macOS includes the "xmllint" tool which can verify
the XML format. This is useful for any system that might contain
the tool, so use it whenever it is available.

We strive to make these tests work on all platforms, but Windows caused
some headaches. In particular, the value of getuid() called by the C
code is not guaranteed to be the same as `$(id -u)` invoked by a test.
This is because `git.exe` is a native Windows program, whereas the
utility programs run by the test script mostly utilize the MSYS2 runtime,
which emulates a POSIX-like environment. Since the purpose of the test
is to check that the input to the hook is well-formed, the actual user
ID is immaterial, thus we can work around the problem by making the the
test UID-agnostic. Another subtle issue is the $HOME environment
variable being a Windows-style path instead of a Unix-style path. We can
be more flexible here instead of expecting exact path matches.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-05 14:38:02 -08:00
René Scharfe
ca5120c339 rebase: verify commit parameter
If the user specifies a base commit to switch to, check if it actually
references a commit right away to avoid getting confused later on when
it turns out to be an invalid object.

Reported-by: LeSeulArtichaut <leseulartichaut@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 15:24:13 -08:00
Martin Ågren
bc62692757 hash-lookup: rename from sha1-lookup
Change all remnants of "sha1" in hash-lookup.c and .h and rename them to
reflect that we're not just able to handle SHA-1 these days.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
Martin Ågren
7a7d992d0d sha1-lookup: rename sha1_pos() as hash_pos()
Rename this function to reflect that we're not just able to handle SHA-1
these days. There are a few instances of "sha1" left in sha1-lookup.[ch]
after this, but those will be addressed in the next commit.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
Martin Ågren
e5afd4449d object-file.c: rename from sha1-file.c
Drop the last remnant of "sha1" in this file and rename it to reflect
that we're not just able to handle SHA-1 these days.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
Junio C Hamano
7a50265295 Merge branch 'ma/maintenance-crontab-fix'
Hotfix for a topic of this cycle.

* ma/maintenance-crontab-fix:
  t7900-maintenance: test for magic markers
  gc: fix handling of crontab magic markers
  git-maintenance.txt: add missing word
2020-12-23 13:59:46 -08:00
Junio C Hamano
04cd999638 Merge branch 'dl/checkout-p-merge-base'
Fix to a regression introduced during this cycle.

* dl/checkout-p-merge-base:
  checkout -p: handle tree arguments correctly again
2020-12-23 13:59:46 -08:00
Junio C Hamano
c46f849f8a Merge branch 'jx/pack-redundant-on-single-pack'
"git pack-redandant" when there is only one packfile used to crash,
which has been corrected.

* jx/pack-redundant-on-single-pack:
  pack-redundant: fix crash when one packfile in repo
2020-12-23 13:59:46 -08:00
Martin Ågren
66dc0a3625 gc: fix handling of crontab magic markers
On `git maintenance start`, we add a few entries to the user's cron
table. We wrap our entries using two magic markers, "# BEGIN GIT
MAINTENANCE SCHEDULE" and "# END GIT MAINTENANCE SCHEDULE". At a later
`git maintenance stop`, we will go through the table and remove these
lines. Or rather, we will remove the "BEGIN" marker, the "END" marker
and everything between them.

Alas, we have a bug in how we detect the "END" marker: we don't. As we
loop through all the lines of the crontab, if we are in the "old
region", i.e., the region we're aiming to remove, we make an early
`continue` and don't get as far as checking for the "END" marker. Thus,
once we've seen our "BEGIN", we remove everything until the end of the
file.

Rewrite the logic for identifying these markers. There are four cases
that are mutually exclusive: The current line starts a region or it ends
it, or it's firmly within the region, or it's outside of it (and should
be printed).

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 14:33:08 -08:00
Johannes Schindelin
5c29f19cda checkout -p: handle tree arguments correctly again
This fixes a segmentation fault.

The bug is caused by dereferencing `new_branch_info->commit` when it is
`NULL`, which is the case when the tree-ish argument is actually a tree,
not a commit-ish. This was introduced in 5602b500c3 (builtin/checkout:
fix `git checkout -p HEAD...` bug, 2020-10-07), where we tried to ensure
that the special tree-ish `HEAD...` is handled correctly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 14:06:09 -08:00
Sergey Organov
0c627f5d3c diff-merges: handle imply -p on -c/--cc logic for log.c
Move logic that handles implying -p on -c/--cc from
log_setup_revisions_tweak() to diff_merges_setup_revs(), where it
belongs.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
Sergey Organov
3b6c17b5c0 diff-merges: new function diff_merges_set_dense_combined_if_unset()
Call it where given functionality is needed instead of direct
checking/tweaking of diff merges related fields.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
Sergey Organov
09322b1da9 diff-merges: new function diff_merges_suppress()
This function sets all the relevant flags to disabled state, so that
no code that checks only one of them get it wrong.

Then we call this new function everywhere where diff merges output
suppression is needed.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
Sergey Organov
4f54544d73 diff-merges: rename diff_merges_default_to_enable() to match semantics
Rename diff_merges_default_to_enable() to
diff_merges_default_to_first_parent() to match its semantics.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:31 -08:00
Sergey Organov
7acf0d06f5 diff-merges: move checks for first_parent_only out of the module
The checks for first_parent_only don't in fact belong to this module,
as the primary purpose of this flag is history traversal limiting, so
get it out of this module and rename the

diff_merges_first_parent_defaults_to_enable()

to

diff_merges_default_to_enable()

to match new semantics.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
Sergey Organov
18f09473bf diff-merges: rename all functions to have common prefix
Use the same "diff_merges" prefix for all the diff merges function
names.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
Sergey Organov
a37eec6333 revision: move diff merges functions to its own diff-merges.c
Create separate diff-merges.c and diff-merges.h files, and move all
the code related to handling of diff merges there.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
Sergey Organov
3d4fd94363 revision: provide implementation for diff merges tweaks
Use these implementations from show_setup_revisions_tweak() and
log_setup_revisions_tweak() in builtin/log.c.

This completes moving of management of diff merges parameters to a
single place, where we can finally observe them simultaneously.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:47:30 -08:00
Eric Sunshine
cf76baea41 worktree: teach repair to fix multi-directional breakage
`git worktree repair` knows how to repair the two-way links between the
repository and a worktree as long as a link in one or the other
direction is sound. For instance, if a linked worktree is moved (without
using `git worktree move`), repair is possible because the worktree
still knows the location of the repository even though the repository no
longer knows where the worktree is. Similarly, if the repository is
moved, repair is possible since the repository still knows the locations
of the worktrees even though the worktrees no longer know where the
repository is.

However, if both the repository and the worktrees are moved, then links
are severed in both directions, and no repair is possible. This is the
case even when the new worktree locations are specified as arguments to
`git worktree repair`. The reason for this limitation is twofold. First,
when `repair` consults the worktree's gitfile (/path/to/worktree/.git)
to determine the corresponding <repo>/worktrees/<id>/gitdir file to fix,
<repo> is the old path to the repository, thus it is unable to fix the
`gitdir` file at its new location since it doesn't know where it is.
Second, when `repair` consults <repo>/worktrees/<id>/gitdir to find the
location of the worktree's gitfile (/path/to/worktree/.git), the path
recorded in `gitdir` is the old location of the worktree's gitfile, thus
it is unable to repair the gitfile since it doesn't know where it is.

Fix these shortcomings by teaching `repair` to attempt to infer the new
location of the <repo>/worktrees/<id>/gitdir file when the location
recorded in the worktree's gitfile has become stale but the file is
otherwise well-formed. The inference is intentionally simple-minded.
For each worktree path specified as an argument, `git worktree repair`
manually reads the ".git" gitfile at that location and, if it is
well-formed, extracts the <id>. It then searches for a corresponding
<id> in <repo>/worktrees/ and, if found, concludes that there is a
reasonable match and updates <repo>/worktrees/<id>/gitdir to point at
the specified worktree path. In order for <repo> to be known, `git
worktree repair` must be run in the main worktree or bare repository.

`git worktree repair` first attempts to repair each incoming
/path/to/worktree/.git gitfile to point at the repository, and then
attempts to repair outgoing <repo>/worktrees/<id>/gitdir files to point
at the worktrees. This sequence was chosen arbitrarily when originally
implemented since the order of fixes is immaterial as long as one side
of the two-way link between the repository and a worktree is sound.
However, for this new repair technique to work, the order must be
reversed. This is because the new inference mechanism, when it is
successful, allows the outgoing <repo>/worktrees/<id>/gitdir file to be
repaired, thus fixing one side of the two-way link. Once that side is
fixed, the other side can be fixed by the existing repair mechanism,
hence the order of repairs is now significant.

Two safeguards are employed to avoid hijacking a worktree from a
different repository if the user accidentally specifies a foreign
worktree as an argument. The first, as described above, is that it
requires an <id> match between the repository and the worktree. That
itself is not foolproof for preventing hijack, so the second safeguard
is that the inference will only kick in if the worktree's
/path/to/worktree/.git gitfile does not point at a repository.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-21 13:44:28 -08:00
Junio C Hamano
3517022568 Merge branch 'ab/unreachable-break'
Code clean-up.

* ab/unreachable-break:
  style: do not "break" in switch() after "return"
2020-12-18 15:15:18 -08:00
Junio C Hamano
772bdcd429 Merge branch 'js/init-defaultbranch-advice'
Our users are going to be trained to prepare for future change of
init.defaultBranch configuration variable.

* js/init-defaultbranch-advice:
  init: provide useful advice about init.defaultBranch
  get_default_branch_name(): prepare for showing some advice
  branch -m: allow renaming a yet-unborn branch
  init: document `init.defaultBranch` better
2020-12-18 15:15:17 -08:00
Junio C Hamano
21127fa982 Merge branch 'tb/partial-clone-filters-fix'
Fix potential server side resource deallocation issues when
responding to a partial clone request.

* tb/partial-clone-filters-fix:
  upload-pack.c: don't free allowed_filters util pointers
  builtin/clone.c: don't ignore transport_fetch_refs() errors
2020-12-17 15:06:40 -08:00
Jiang Xin
0696232390 pack-redundant: fix crash when one packfile in repo
Command `git pack-redundant --all` will crash if there is only one
packfile in the repository.  This is because, if there is only one
packfile in local_packs, `cmp_local_packs` will do nothing and will
leave `pl->unique_objects` as uninitialized.

Also add testcases for repository with no packfile and one packfile
in t5323.

Reported-by: Daniel C. Klauer <daniel.c.klauer@web.de>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-16 21:21:06 -08:00
Felipe Contreras
c525de335e pull: display default warning only when non-ff
There's no need to display the annoying warning on every pull... only
the ones that are not fast-forward.

The current warning tests still pass, but not because of the arguments
or the configuration, but because they are all fast-forward.

We need to test non-fast-forward situations now.

Suggestions-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:39:42 -08:00
Junio C Hamano
7539fdc629 pull: correct condition to trigger non-ff advice
Refactor the advise() call that teaches users how they can choose
between merge and rebase into a helper function.  This revealed that
the caller's logic needs to be further clarified to allow future
actions (like "erroring out" instead of the current "go ahead and
merge anyway") that should happen whether the advice message is
squelched out.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:39:42 -08:00
Junio C Hamano
b044db9172 pull: get rid of unnecessary global variable
It is easy enough to do, and gives a more descriptive name to the
variable that is scoped in a more focused way.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 17:39:17 -08:00
Ævar Arnfjörð Bjarmason
56f56ac50b style: do not "break" in switch() after "return"
Remove this unreachable code. It was found by SunCC, it's found by a
non-fatal warning emitted by SunCC. It's one of the things it's more
vehement about than GCC & Clang.

It complains about a lot of other similarly unreachable code, e.g. a
BUG(...) without a "return", and a "return 0" after a long if/else,
both of whom have "return" statements. Those are also genuine
redundancies to a compiler, but arguably make the code a bit easier to
read & less fragile to maintain.

These return/break cases are just unnecessary however, and as seen
here the surrounding code just did a plain "return" without a "break"
already.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 16:32:50 -08:00
Junio C Hamano
c3b58472be pack-redundant: gauge the usage before proposing its removal
The subcommand is unusably slow and the reason why nobody reports it
as a performance bug is suspected to be the absense of users.  Let's
show a big message that asks the user to tell us that they still
care about the command when an attempt is made to run the command,
with an escape hatch to override it with a command line option.

In a few releases, we may turn it into an error and keep it for a
few more releases before finally removing it (during the whole time,
the plan to remove it would be interrupted by end user raising hand).

Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-15 14:30:11 -08:00
Felipe Contreras
278f4be806 pull: give the advice for choosing rebase/merge much later
Eventually we want to be omit the advice when we can fast-forward
in which case there is no reason to require the user to choose
between rebase or merge.

In order to do so, we need to delay giving the advice up to the
point where we can check if we can fast-forward or not.

Additionally, config_get_rebase() was probably never its true home.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 09:03:17 -08:00
Felipe Contreras
77a7ec6329 pull: refactor fast-forward check
We would like to be able to make this check before the decision to
rebase is made in a future step.  Besides, using a separate helper
makes the code easier to follow.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-14 08:59:40 -08:00
Johannes Schindelin
cc0f13c57d get_default_branch_name(): prepare for showing some advice
We are about to introduce a message giving users running `git init` some
advice about `init.defaultBranch`. This will necessarily be done in
`repo_default_branch_name()`.

Not all code paths want to show that advice, though. In particular, the
`git clone` codepath _specifically_ asks for `init_db()` to be quiet,
via the `INIT_DB_QUIET` flag.

In preparation for showing users above-mentioned advice, let's change
the function signature of `get_default_branch_name()` to accept the
parameter `quiet`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 15:53:50 -08:00
Johannes Schindelin
cfaff3aac8 branch -m: allow renaming a yet-unborn branch
In one of the next commits, we would like to give users some advice
regarding the initial branch name, and how to modify it.

To that end, it would be good if `git branch -m <name>` worked in a
freshly initialized repository without any commits. Let's make it so.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-13 15:53:50 -08:00
brian m. carlson
fac60b8925 rev-parse: add option for absolute or relative path formatting
git rev-parse has several options which print various paths.  Some of
these paths are printed relative to the current working directory, and
some are absolute.

Normally, this is not a problem, but there are times when one wants
paths entirely in one format or another.  This can be done trivially if
the paths are canonical, but canonicalizing paths is not possible on
some shell scripting environments which lack realpath(1) and also in Go,
which lacks functions that properly canonicalize paths on Windows.

To help out the scripter, let's provide an option which turns most of
the paths printed by git rev-parse to be either relative to the current
working directory or absolute and canonical.  Document which options are
affected and which are not so that users are not confused.

This approach is cleaner and tidier than providing duplicates of
existing options which are either relative or absolute.

Note that if the user needs both forms, it is possible to pass an
additional option in the middle of the command line which changes the
behavior of subsequent operations.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-12 23:35:51 -08:00
Junio C Hamano
bb48056cb2 Merge branch 'tb/bugreport-no-localtime'
Use of non-reentrant localtime() has been removed.

* tb/bugreport-no-localtime:
  builtin/bugreport.c: use thread-safe localtime_r()
2020-12-08 15:11:21 -08:00
Junio C Hamano
f2a75cb312 Merge branch 'rs/maintenance-run-outside-repo'
"git maintenance run/start/stop" needed to be run in a repository
to hold the lockfile they use, but didn't make sure they are
actually in a repository, which has been corrected.

* rs/maintenance-run-outside-repo:
  t7900: fix typo: "test_execpt_success"
  maintenance: fix SEGFAULT when no repository
2020-12-08 15:11:21 -08:00
Junio C Hamano
5dfb976460 Merge branch 'ma/grep-init-default'
Code clean-up.

* ma/grep-init-default:
  MyFirstObjectWalk: drop `init_walken_defaults()`
  grep: copy struct in one fell swoop
  grep: use designated initializers for `grep_defaults`
  grep: don't set up a "default" repo for grep
2020-12-08 15:11:20 -08:00
Junio C Hamano
01b8886a62 Merge branch 'js/trace2-session-id'
The transport layer was taught to optionally exchange the session
ID assigned by the trace2 subsystem during fetch/push transactions.

* js/trace2-session-id:
  receive-pack: log received client session ID
  send-pack: advertise session ID in capabilities
  upload-pack, serve: log received client session ID
  fetch-pack: advertise session ID in capabilities
  transport: log received server session ID
  serve: advertise session ID in v2 capabilities
  receive-pack: advertise session ID in v0 capabilities
  upload-pack: advertise session ID in v0 capabilities
  trace2: add a public function for getting the SID
  docs: new transfer.advertiseSID option
  docs: new capability to advertise session IDs
2020-12-08 15:11:20 -08:00
Junio C Hamano
d702cb9e89 Merge branch 'ds/maintenance-part-3'
"git maintenance" command had trouble working in a directory whose
pathname contained an ERE metacharacter like '+'.

* ds/maintenance-part-3:
  maintenance: use 'git config --fixed-value'
2020-12-08 15:11:19 -08:00
Junio C Hamano
a10e7842ab Merge branch 'ds/config-literal-value'
Various subcommands of "git config" that takes value_regex
learn the "--literal-value" option to take the value_regex option
as a literal string.

* ds/config-literal-value:
  config doc: value-pattern is not necessarily a regexp
  config: implement --fixed-value with --get*
  config: plumb --fixed-value into config API
  config: add --fixed-value option, un-implemented
  t1300: add test for --replace-all with value-pattern
  t1300: test "set all" mode with value-pattern
  config: replace 'value_regex' with 'value_pattern'
  config: convert multi_replace to flags
2020-12-08 15:11:19 -08:00
Junio C Hamano
1bc550effe Merge branch 'ps/update-ref-multi-transaction'
"git update-ref --stdin" learns to take multiple transactions in a
single session.

* ps/update-ref-multi-transaction:
  update-ref: disallow "start" for ongoing transactions
  p1400: use `git-update-ref --stdin` to test multiple transactions
  update-ref: allow creation of multiple transactions
  t1400: avoid touching refs on filesystem
2020-12-08 15:11:17 -08:00
Jeff King
449fa5ee06 pack-bitmap-write: ignore BITMAP_FLAG_REUSE
The on-disk bitmap format has a flag to mark a bitmap to be "reused".
This is a rather curious feature, and works like this:

  - a run of pack-objects would decide to mark the last 80% of the
    bitmaps it generates with the reuse flag

  - the next time we generate bitmaps, we'd see those reuse flags from
    the last run, and mark those commits as special:

      - we'd be more likely to select those commits to get bitmaps in
        the new output

      - when generating the bitmap for a selected commit, we'd reuse the
        old bitmap as-is (rearranging the bits to match the new pack, of
        course)

However, neither of these behaviors particularly makes sense.

Just because a commit happened to be bitmapped last time does not make
it a good candidate for having a bitmap this time. In particular, we may
choose bitmaps based on how recent they are in history, or whether a ref
tip points to them, and those things will change. We're better off
re-considering fresh which commits are good candidates.

Reusing the existing bitmap _is_ a reasonable thing to do to save
computation. But only reusing exact bitmaps is a weak form of this. If
we have an old bitmap for A and now want a new bitmap for its child, we
should be able to compute that only by looking at trees and that are new
to the child. But this code would consider only exact reuse (which is
perhaps why it was eager to select those commits in the first place).

Furthermore, the recent switch to the reverse-edge algorithm for
generating bitmaps dropped this optimization entirely (and yet still
performs better).

So let's do a few cleanups:

 - drop the whole "reusing bitmaps" phase of generating bitmaps. It's
   not helping anything, and is mostly unused code (or worse, code that
   is using CPU but not doing anything useful)

 - drop the use of the on-disk reuse flag to select commits to bitmap

 - stop setting the on-disk reuse flag in bitmaps we generate (since
   nothing respects it anymore)

We will keep a few innards of the reuse code, which will help us
implement a more capable version of the "reuse" optimization:

 - simplify rebuild_existing_bitmaps() into a function that only builds
   the mapping of bits between the old and new orders, but doesn't
   actually convert any bitmaps

 - make rebuild_bitmap() public; we'll call it lazily to convert bitmaps
   as we traverse (using the mapping created above)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08 14:48:17 -08:00
Taylor Blau
aab179d937 builtin/clone.c: don't ignore transport_fetch_refs() errors
If 'git clone' couldn't execute 'transport_fetch_refs()' (e.g., because
of an error on the remote's side in 'git upload-pack'), then it will
silently ignore it.

Even though this has been the case at least since clone was ported to C
(way back in 8434c2f1af (Build in clone, 2008-04-27)), 'git fetch'
doesn't ignore these and reports any failures it sees.

That suggests that ignoring the return value in 'git clone' is simply an
oversight that should be corrected. That's exactly what this patch does.
(Noticing and fixing this is no coincidence, we'll want it in the next
patch in order to demonstrate a regression in 'git upload-pack' via a
'git clone'.)

There's no additional logging here, but that matches how 'git fetch'
handles the same case. An assumption there is that whichever part of
transport_fetch_refs() fails will complain loudly, so any additional
logging here is redundant.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-03 12:42:29 -08:00
Junio C Hamano
39d38a5c5f Merge branch 'tb/repack-simplify'
Simplify the logic to deal with a repack operation that ended up
creating the same packfile.

* tb/repack-simplify:
  builtin/repack.c: don't move existing packs out of the way
  builtin/repack.c: keep track of what pack-objects wrote
  repack: make "exts" array available outside cmd_repack()
2020-12-03 00:18:06 -08:00
Junio C Hamano
c692e1b673 Merge branch 'pb/pull-rebase-recurse-submodules'
"git pull --rebase --recurse-submodules" checked for local changes
in a wrong range and failed to run correctly when it should.

* pb/pull-rebase-recurse-submodules:
  pull: check for local submodule modifications with the right range
  t5572: describe '--rebase' tests a little more
  t5572: add notes on a peculiar test
  pull --rebase: compute rebase arguments in separate function
2020-12-03 00:18:06 -08:00
Elijah Newren
ba359fd507 stash: fix stash application in sparse-checkouts
sparse-checkouts are built on the patterns in the
$GIT_DIR/info/sparse-checkout file, where commands have modified
behavior for paths that do not match those patterns.  The differences in
behavior, as far as the bugs concerned here, fall into three different
categories (with git subcommands that fall into each category listed):

  * commands that only look at files matching the patterns:
      * status
      * diff
      * clean
      * update-index
  * commands that remove files from the working tree that do not match
    the patterns, and restore files that do match them:
      * read-tree
      * switch
      * checkout
      * reset (--hard)
  * commands that omit writing files to the working tree that do not
    match the patterns, unless those files are not clean:
      * merge
      * rebase
      * cherry-pick
      * revert

There are some caveats above, e.g. a plain `git diff` ignores files
outside the sparsity patterns but will show diffs for paths outside the
sparsity patterns when revision arguments are passed.  (Technically,
diff is treating the sparse paths as matching HEAD.)  So, there is some
internal inconsistency among these commands.  There are also additional
commands that should behave differently in the face of sparse-checkouts,
as the sparse-checkout documentation alludes to, but the above is
sufficient for me to explain how `git stash` is affected.

What is relevant here is that logically 'stash' should behave like a
merge; it three-way merges the changes the user had in progress at stash
creation time, the HEAD at the time the stash was created, and the
current HEAD, in order to get the stashed changes applied to the current
branch.  However, this simplistic view doesn't quite work in practice,
because stash tweaks it a bit due to two factors: (1) flags like
--keep-index and --include-untracked (why we used two different verbs,
'keep' and 'include', is a rant for another day) modify what should be
staged at the end and include more things that should be quasi-merged,
(2) stash generally wants changes to NOT be staged.  It only provides
exceptions when (a) some of the changes had conflicts and thus we want
to use stages to denote the clean merges and higher order stages to
mark the conflicts, or (b) if there is a brand new file we don't want
it to become untracked.

stash has traditionally gotten this special behavior by first doing a
merge, and then when it's clean, applying a pipeline of commands to
modify the result.  This series of commands for
unstaging-non-newly-added-files came from the following commands:

    git diff-index --cached --name-only --diff-filter=A $CTREE >"$a"
    git read-tree --reset $CTREE
    git update-index --add --stdin <"$a"
    rm -f "$a"

Looking back at the different types of special sparsity handling listed
at the beginning of this message, you may note that we have at least one
of each type covered here: merge, diff-index, and read-tree.  The weird
mix-and-match led to 3 different bugs:

(1) If a path merged cleanly and it didn't match the sparsity patterns,
the merge backend would know to avoid writing it to the working tree and
keep the SKIP_WORKTREE bit, simply only updating it in the index.
Unfortunately, the subsequent commands would essentially undo the
changes in the index and thus simply toss the changes altogether since
there was nothing left in the working tree.  This means the stash is
only partially applied.

(2) If a path existed in the worktree before `git stash apply` despite
having the SKIP_WORKTREE bit set, then the `git read-tree --reset` would
print an error message of the form
      error: Entry 'modified' not uptodate. Cannot merge.
and cause stash to abort early.

(3) If there was a brand new file added by the stash, then the
diff-index command would save that pathname to the temporary file, the
read-tree --reset would remove it from the index, and the update-index
command would barf due to no such file being present in the working
copy; it would print a message of the form:
      error: NEWFILE: does not exist and --remove not passed
      fatal: Unable to process path NEWFILE
and then cause stash to abort early.

Basically, the whole idea of unstage-unless-brand-new requires special
care when you are dealing with a sparse-checkout.  Fix these problems
by applying the following simple rule:

  When we unstage files, if they have the SKIP_WORKTREE bit set,
  clear that bit and write the file out to the working directory.

  (*) If there's already a file present in the way, rename it first.

This fixes all three problems in t7012.13 and allows us to mark it as
passing.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-01 14:39:04 -08:00
Elijah Newren
b34ab4a43b stash: remove unnecessary process forking
When stash was converted from shell to a builtin, it merely
transliterated the forking of various git commands from shell to a C
program that would fork the same commands.  Some of those were converted
over to actual library calls, but much of the pipeline-of-commands
design still remains.  Fix some of this by replacing the portion
corresponding to

    git diff-index --cached --name-only --diff-filter=A $CTREE >"$a"
    git read-tree --reset $CTREE
    git update-index --add --stdin <"$a"
    rm -f "$a"

into a library function that does the same thing.  (The read-tree
--reset was already partially converted over to a library call, but as
an independent piece.)  Note here that this came after a merge operation
was performed.  The merge machinery always stages anything that cleanly
merges, and the above code only runs if there are no conflicts.  Its
purpose is to make it so that when there are no conflicts, all the
changes from the stash are unstaged.  However, that causes brand new
files from the stash to become untracked, so the code above first saves
those files off and then re-adds them afterwards.

We replace the whole series of commands with a simple function that will
unstage files that are not newly added.  This doesn't fix any bugs in
the usage of these commands, it simply matches the existing behavior but
makes it into a single atomic operation that we can then operate on as a
whole.  A subsequent commit will take advantage of this to fix issues
with these commands in sparse-checkouts.

This conversion incidentally fixes t3906.1, because the separate
update-index process would die with the following error messages:
    error: uninitialized_sub: is a directory - add files inside instead
    fatal: Unable to process path uninitialized_sub
The unstaging of the directory as a submodule meant it was no longer
tracked, and thus as an uninitialized directory it could not be added
back using `git update-index --add`, thus resulting in this error and
early abort.  Most of the submodule tests in 3906 continue to fail after
this change, this change was just enough to push the first of those
tests to success.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-01 14:39:04 -08:00
Taylor Blau
4f6460df55 builtin/bugreport.c: use thread-safe localtime_r()
To generate its filename, the 'git bugreport' builtin asks the system
for the current time with 'localtime()'. Since this uses a shared
buffer, it is not thread-safe.

Even though 'git bugreport' is not multi-threaded, using localtime() can
trigger some static analysis tools to complain, and a quick

    $ git grep -oh 'localtime\(_.\)\?' -- **/*.c | sort | uniq -c

shows that the only usage of the thread-unsafe 'localtime' is in a piece
of documentation.

So, convert this instance to use the thread-safe version for
consistency, and to appease some analysis tools.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-01 13:05:37 -08:00
Junio C Hamano
3d8f81f21b Merge branch 'sa/credential-store-timeout'
Multiple "credential-store" backends can race to lock the same
file, causing everybody else but one to fail---reattempt locking
with some timeout to reduce the rate of the failure.

* sa/credential-store-timeout:
  crendential-store: use timeout when locking file
2020-11-30 14:49:45 -08:00
Junio C Hamano
fa27e2d103 Merge branch 'km/stash-error-message-fix'
Error message fix.

* km/stash-error-message-fix:
  stash: add missing space to an error message
2020-11-30 14:49:45 -08:00
Junio C Hamano
f73ee0c6be Merge branch 'mt/worktree-error-message-fix'
Fix formulation of an error message with two placeholders in "git
worktree add" subcommand.

* mt/worktree-error-message-fix:
  worktree: fix order of arguments in error message
2020-11-30 14:49:43 -08:00
Junio C Hamano
1c04cdd424 Merge branch 'ab/gc-keep-base-option'
Fix an option name in "gc" documentation.

* ab/gc-keep-base-option:
  gc: rename keep_base_pack variable for --keep-largest-pack
  gc docs: change --keep-base-pack to --keep-largest-pack
2020-11-30 14:49:43 -08:00
Junio C Hamano
290c94085b Merge branch 'js/pull-rebase-use-advise'
UI improvement.

* js/pull-rebase-use-advise:
  pull: colorize the hint about setting `pull.rebase`
2020-11-30 14:49:42 -08:00
Rafael Silva
e72f7defc4 maintenance: fix SEGFAULT when no repository
The "git maintenance run" and "git maintenance start/stop" commands
holds a file-based lock at the .git/maintenance.lock and
.git/schedule.lock respectively. These locks are used to ensure only
one maintenance process is executed at the time as both operations
involves writing data into the git repository.

The path to the lock file is built using
"the_repository->objects->odb->path" that results in SEGFAULT when we
have no repository available as "the_repository->objects->odb" is
set to NULL.

Let's teach maintenance command to use RUN_SETUP option that will
provide the validation and fail when running outside of a repository.
Hence fixing the SEGFAULT for all three operations and making the
behaviour consistent across all subcommands.

Setting the RUN_SETUP also provides the same protection for all
subcommands given that the "register" and "unregister" also requires to
be executed inside a repository.

Furthermore let's remove the local validation implemented by the
"register" and "unregister" as this will not be required anymore with
the new option.

Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-30 13:44:15 -08:00
Junio C Hamano
2ba70a330b Merge branch 'rs/gc-sort-func-cast-fix'
Fix broken sorting of maintenance tasks.

* rs/gc-sort-func-cast-fix:
  gc: fix cast in compare_tasks_by_selection()
2020-11-25 15:24:53 -08:00
Junio C Hamano
fcf26ef53a Merge branch 'jk/4gb-idx'
The code was not prepared to deal with pack .idx file that is
larger than 4GB.

* jk/4gb-idx:
  packfile: detect overflow in .idx file size checks
  block-sha1: take a size_t length parameter
  fsck: correctly compute checksums on idx files larger than 4GB
  use size_t to store pack .idx byte offsets
  compute pack .idx byte offsets using size_t
2020-11-25 15:24:52 -08:00
Junio C Hamano
8f8f10ac09 Merge branch 'jx/t5411-flake-fix'
The exchange between receive-pack and proc-receive hook did not
carefully check for errors.

* jx/t5411-flake-fix:
  receive-pack: use default version 0 for proc-receive
  receive-pack: gently write messages to proc-receive
  t5411: new helper filter_out_user_friendly_and_stable_output
2020-11-25 15:24:52 -08:00
Derrick Stolee
483a6d9b5d maintenance: use 'git config --fixed-value'
When a repository's leading directories contain regex metacharacters,
the config calls for 'git maintenance register' and 'git maintenance
unregister' are not careful enough. Use the new --fixed-value option
to direct the config machinery to use exact string matches. This is a
more robust option than escaping these arguments in a piecemeal fashion.

For the test, require that we are not running on Windows since the '+'
and '*' characters are not allowed on that filesystem.

Reported-by: Emily Shaffer <emilyshaffer@google.com>
Reported-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 15:04:55 -08:00
Derrick Stolee
3f1bae1dc3 config: implement --fixed-value with --get*
The config builtin does its own regex matching of values for the --get,
--get-all, and --get-regexp modes. Plumb the existing 'flags' parameter
to the get_value() method so we can initialize the value-pattern argument
as a fixed string instead of a regex pattern.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
c90702a1f6 config: plumb --fixed-value into config API
The git_config_set_multivar_in_file_gently() and related methods now
take a 'flags' bitfield, so add a new bit representing the --fixed-value
option from 'git config'. This alters the purpose of the value_pattern
parameter to be an exact string match. This requires some initialization
changes in git_config_set_multivar_in_file_gently() and a new strcmp()
call in the matches() method.

The new CONFIG_FLAGS_FIXED_VALUE flag is initialized in builtin/config.c
based on the --fixed-value option, and that needs to be updated in
several callers.

This patch only affects some of the modes of 'git config', and the rest
will be completed in the next change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
fda43942d7 config: add --fixed-value option, un-implemented
The 'git config' builtin takes a 'value-pattern' parameter for several
actions. This can cause confusion when expecting exact value matches
instead of regex matches, especially when the input string contains
metacharacters. While callers can escape the patterns themselves, it
would be more friendly to allow an argument to disable the pattern
matching in favor of an exact string match.

Add a new '--fixed-value' option that does not currently change the
behavior. The implementation will be filled in by later changes for
each appropriate action. For now, check and test that --fixed-value
will abort the command when included with an incompatible action or
without a 'value-pattern' argument.

The name '--fixed-value' was chosen over something simpler like
'--fixed' because some commands allow regular expressions on the
key in addition to the value.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
247e2f822e config: replace 'value_regex' with 'value_pattern'
The 'value_regex' argument in the 'git config' builtin is poorly named,
especially related to an upcoming change that allows exact string
matches instead of ERE pattern matches.

Perform a mostly mechanical change of every instance of 'value_regex' to
'value_pattern' in the codebase. This is only critical for documentation
and error messages, but it is best to be consistent inside the codebase,
too.

For documentation, use 'value-pattern' which is better punctuation. This
affects Documentation/git-config.txt and the usage in builtin/config.c,
which was already mixed between 'value_regex' and 'value-regex'.

I gave some thought to leaving the value_regex variables inside config.c
that are regex_t pointers. However, it is probably best to keep the name
consistent with the rest of the variables.

This does not update the translations inside the po/ directory, as that
creates conflicts with ongoing work. The input strings should
automatically update through automation, and a few of the output strings
currently use "[value_regex]" directly.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:48 -08:00
Derrick Stolee
504ee1290e config: convert multi_replace to flags
We will extend the flexibility of the config API. Before doing so, let's
take an existing 'int multi_replace' parameter and replace it with a new
'unsigned flags' parameter that can take multiple options as a bit field.

Update all callers that specified multi_replace to now specify the
CONFIG_FLAGS_MULTI_REPLACE flag. To add more clarity, extend the
documentation of git_config_set_multivar_in_file() including a clear
labeling of its arguments. Other config API methods in config.h require
only a change of the final parameter from 'int' to 'unsigned'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 14:43:47 -08:00
Simão Afonso
df7f915fb6 crendential-store: use timeout when locking file
When holding the lock for rewriting the credential file, use a timeout
to avoid race conditions when the credentials file needs to be updated
in parallel.

An example would be doing `fetch --all` on a repository with several
remotes that need credentials, using parallel fetching.

The timeout can be configured using "credentialStore.lockTimeoutMS",
defaulting to 1 second.

Signed-off-by: Simão Afonso <simao.afonso@powertools-tech.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-25 12:30:18 -08:00
Derrick Stolee
31345d5545 maintenance: extract platform-specific scheduling
The existing schedule mechanism using 'cron' is supported by POSIX
platforms, but not Windows. It also works slightly differently on
macOS to significant detriment of the user experience. To allow for
new implementations on these platforms, extract a method that
performs the platform-specific scheduling mechanism. This will be
swapped at compile time with new implementations on specialized
platforms.

As we add this generality, rename GIT_TEST_CRONTAB to
GIT_TEST_MAINT_SCHEDULER. Further, this variable is now parsed as
"<scheduler>:<command>" so we can test platform-specific scheduling
logic even when not on the correct platform. By specifying the
<scheduler> in this string, we will be able to test all three sets of
Git logic from a Linux machine.

Co-authored-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-24 13:02:29 -08:00
Kyle Meyer
eaf5341538 stash: add missing space to an error message
Restore a space that was lost in 8a0fc8d19d (stash: convert apply to
builtin, 2019-02-25).

Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-24 12:56:31 -08:00
Junio C Hamano
bf0a430f70 Merge branch 'en/strmap'
A specialization of hashmap that uses a string as key has been
introduced.  Hopefully it will see wider use over time.

* en/strmap:
  shortlog: use strset from strmap.h
  Use new HASHMAP_INIT macro to simplify hashmap initialization
  strmap: take advantage of FLEXPTR_ALLOC_STR when relevant
  strmap: enable allocations to come from a mem_pool
  strmap: add a strset sub-type
  strmap: split create_entry() out of strmap_put()
  strmap: add functions facilitating use as a string->int map
  strmap: enable faster clearing and reusing of strmaps
  strmap: add more utility functions
  strmap: new utility functions
  hashmap: provide deallocation function names
  hashmap: introduce a new hashmap_partial_clear()
  hashmap: allow re-use after hashmap_free()
  hashmap: adjust spacing to fix argument alignment
  hashmap: add usage documentation explaining hashmap_free[_entries]()
2020-11-21 15:14:38 -08:00
Junio C Hamano
0dd171f0bc Merge branch 'jk/rev-parse-end-of-options'
"git rev-parse" learned the "--end-of-options" to help scripts to
safely take a parameter that is supposed to be a revision, e.g.
"git rev-parse --verify -q --end-of-options $rev".

* jk/rev-parse-end-of-options:
  rev-parse: handle --end-of-options
  rev-parse: put all options under the "-" check
  rev-parse: don't accept options after dashdash
2020-11-21 15:14:38 -08:00
Junio C Hamano
473c6224c6 Merge branch 'jc/format-patch-name-max'
The maximum length of output filenames "git format-patch" creates
has become configurable (used to be capped at 64).

* jc/format-patch-name-max:
  format-patch: make output filename configurable
2020-11-21 15:14:38 -08:00
Martin Ågren
96313423a7 grep: use designated initializers for grep_defaults
In 15fabd1bbd ("builtin/grep.c: make configuration callback more
reusable", 2012-10-09), we learned to fill a `static struct grep_opt
grep_defaults` which we can use as a blueprint for other such structs.

At the time, we didn't consider designated initializers to be widely
useable, but these days, we do. (See, e.g., cbc0f81d96 ("strbuf: use
designated initializers in STRBUF_INIT", 2017-07-10).)

Use designated initializers to let the compiler set up the struct and so
that we don't need to remember to call `init_grep_defaults()`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21 14:50:33 -08:00
Martin Ågren
1d3878799f grep: don't set up a "default" repo for grep
`init_grep_defaults()` fills a `static struct grep_opt grep_defaults`.
This struct is then used by `grep_init()` as a blueprint for other such
structs. Notably, `grep_init()` takes a `struct repo *` and assigns it
into the target struct.

As a result, it is unnecessary for us to take a `struct repo *` in
`init_grep_defaults()` as well. We assign it into the default struct and
never look at it again. And in light of how we return early if we have
already set up the default struct, it's not just unnecessary, but is
also a bit confusing: If we are called twice and with different repos,
is it a bug or a feature that we ignore the second repo?

Drop the repo parameter for `init_grep_defaults()`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21 14:50:29 -08:00
Matheus Tavares
b86339b12b worktree: fix order of arguments in error message
`git worktree add` (without --force) errors out when given a path
that is already registered as a worktree and the path is missing on
disk. But the `cmd` and `path` strings are switched on the error
message. Let's fix that.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21 13:31:34 -08:00
Ævar Arnfjörð Bjarmason
793c1464d3 gc: rename keep_base_pack variable for --keep-largest-pack
As noted in an earlier change the keep_base_pack variable name is a
relic from an earlier on-list version of ae4e89e549 ("gc: add
--keep-largest-pack option", 2018-04-15) before it was renamed to
--keep-largest-pack.

Let's change the variable name to avoid that confusion, it's easier to
read the code if there's a 1=1 mapping between the variable name and
option name.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-21 11:39:59 -08:00
Johannes Schindelin
e01ae2a4a7 pull: colorize the hint about setting pull.rebase
In d18c950a69 (pull: warn if the user didn't say whether to rebase or
to merge, 2020-03-09), a new hint was introduced to encourage users to
make a conscious decision about whether they want their pull to merge or
to rebase by configuring the `pull.rebase` setting.

This warning was clearly intended to advise users, but as pointed out in
https://lore.kernel.org/git/87ima2rdsm.fsf%40evledraar.gmail.com, it
uses `warning()` instead of `advise()`.

One consequence is that the advice is not colorized in the same manner
as other, similar messages. So let's use `advise()` instead.

Pointed-out-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-19 14:13:30 -08:00
René Scharfe
a1c74791d5 gc: fix cast in compare_tasks_by_selection()
compare_tasks_by_selection() is used with QSORT and gets passed pointers
to the elements of "static struct maintenance_task tasks[]".  It casts
the *addresses* of these passed pointers to element pointers, though,
and thus effectively compares some unrelated values from the stack.  Fix
the casts to actually compare array elements.

Detected by USan (make SANITIZE=undefined test).

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-18 14:15:58 -08:00
Junio C Hamano
3f6dc9c366 Merge branch 'pb/blame-funcname-range-userdiff'
"git blame -L :funcname -- path" did not work well for a path for
which a userdiff driver is defined.

* pb/blame-funcname-range-userdiff:
  blame: simplify 'setup_blame_bloom_data' interface
  blame: simplify 'setup_scoreboard' interface
  blame: enable funcname blaming with userdiff driver
  line-log: mention both modes in 'blame' and 'log' short help
  doc: add more pointers to gitattributes(5) for userdiff
  blame-options.txt: also mention 'funcname' in '-L' description
  doc: line-range: improve formatting
  doc: log, gitk: move '-L' description to 'line-range-options.txt'
2020-11-18 13:32:53 -08:00
Junio C Hamano
a1f95951ef Merge branch 'en/merge-ort-api-null-impl'
Preparation for a new merge strategy.

* en/merge-ort-api-null-impl:
  merge,rebase,revert: select ort or recursive by config or environment
  fast-rebase: demonstrate merge-ort's API via new test-tool command
  merge-ort-wrappers: new convience wrappers to mimic the old merge API
  merge-ort: barebones API of new merge strategy with empty implementation
2020-11-18 13:32:53 -08:00
Junio C Hamano
7660da1618 Merge branch 'ds/maintenance-part-3'
Parts of "git maintenance" to ease writing crontab entries (and
other scheduling system configuration) for it.

* ds/maintenance-part-3:
  maintenance: add troubleshooting guide to docs
  maintenance: use 'incremental' strategy by default
  maintenance: create maintenance.strategy config
  maintenance: add start/stop subcommands
  maintenance: add [un]register subcommands
  for-each-repo: run subcommands on configured repos
  maintenance: add --schedule option and config
  maintenance: optionally skip --auto process
2020-11-18 13:32:53 -08:00
Junio C Hamano
c042c455d4 Merge branch 'pw/rebase-i-orig-head'
"git rebase -i" did not store ORIG_HEAD correctly.

* pw/rebase-i-orig-head:
  rebase -i: simplify get_revision_ranges()
  rebase -i: use struct object_id when writing state
  rebase -i: use struct object_id rather than looking up commit
  rebase -i: stop overwriting ORIG_HEAD buffer
2020-11-18 13:32:53 -08:00
Junio C Hamano
5edc8bdc06 Merge branch 'jk/format-patch-output'
"git format-patch --output=there" did not work as expected and
instead crashed.  The option is now supported.

* jk/format-patch-output:
  format-patch: support --output option
  format-patch: tie file-opening logic to output_directory
  format-patch: refactor output selection
2020-11-18 13:32:52 -08:00
Junio C Hamano
f8a1cee7b3 Merge branch 'jc/line-log-takes-no-pathspec'
"git log -L<range>:<path>" is documented to take no pathspec, but
this was not enforced by the command line option parser, which has
been corrected.

* jc/line-log-takes-no-pathspec:
  log: diagnose -L used with pathspec as an error
2020-11-18 13:32:52 -08:00
Junio C Hamano
30f5257611 Merge branch 'rs/empty-reflog-check-fix'
The code to see if "git stash drop" can safely remove refs/stash
has been made more carerful.

* rs/empty-reflog-check-fix:
  stash: simplify reflog emptiness check
2020-11-18 13:32:52 -08:00
Taylor Blau
2fcb03b52d builtin/repack.c: don't move existing packs out of the way
When 'git repack' creates a pack with the same name as any existing
pack, it moves the existing one to 'old-pack-xxx.{pack,idx,...}' and
then renames the new one into place.

Eventually, it would be nice to have 'git repack' allow for writing a
multi-pack index at the critical time (after the new packs have been
written / moved into place, but before the old ones have been deleted).
Guessing that this option might be called '--write-midx', this makes the
following situation (where repacks are issued back-to-back without any
new objects) impossible:

    $ git repack -adb
    $ git repack -adb --write-midx

In the second repack, the existing packs are overwritten verbatim with
the same rename-to-old sequence. At that point, the current MIDX is
invalidated, since it refers to now-missing packs. So that code wants to
be run after the MIDX is re-written. But (prior to this patch) the new
MIDX can't be written until the new packs are moved into place. So, we
have a circular dependency.

This is all hypothetical, since no code currently exists to write a MIDX
safely during a 'git repack' (the 'GIT_TEST_MULTI_PACK_INDEX' does so
unsafely). Putting hypothetical aside, though: why do we need to rename
existing packs to be prefixed with 'old-' anyway?

This behavior dates all the way back to 2ad47d6 (git-repack: Be
careful when updating the same pack as an existing one., 2006-06-25).
2ad47d6 is mainly concerned about a case where a newly written pack
would have a different structure than its index. This used to be
possible when the pack name was a hash of the set of objects. Under this
naming scheme, two packs that store the same set of objects could differ
in delta selection, object positioning, or both. If this happened, then
any such packs would be unreadable in the instant between copying the
new pack and new index (i.e., either the index or pack will be stale
depending on the order that they were copied).

But since 1190a1a (pack-objects: name pack files after trailer hash,
2013-12-05), this is no longer possible, since pack files are named not
after their logical contents (i.e., the set of objects), but by the
actual checksum of their contents. So, this old- behavior can safely go,
which allows us to avoid our circular dependency above.

In addition to avoiding the circular dependency, this patch also makes
'git repack' a lot simpler, since we don't have to deal with failures
encountered when renaming existing packs to be prefixed with 'old-'.

This patch is mostly limited to removing code paths that deal with the
'old' prefixing, with the exception of files that include the pack's
name in their own filename, like .idx, .bitmap, and related files. The
exception is that we want to continue to trust what pack-objects wrote.
That is, it is not the case that we pretend as if pack-objects didn't
write files identical to ones that already exist, but rather that we
respect what pack-objects wrote as the source of truth. That cuts two
ways:

  - If pack-objects produced an identical pack to one that already
    exists with a bitmap, but did not produce a bitmap, we remove the
    bitmap that already exists. (This behavior is codified in t7700.14).

  - If pack-objects produced an identical pack to one that already
    exists, we trust the just-written version of the coresponding .idx,
    .promisor, and other files over the ones that already exist. This
    ensures that we use the most up-to-date versions of this files,
    which is safe even in the face of format changes in, say, the .idx
    file (which would not be reflected in the .idx file's name).

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-17 13:31:55 -08:00
Philippe Blain
5176f20ffe pull: check for local submodule modifications with the right range
Ever since 'git pull' learned '--recurse-submodules' in a6d7eb2c7a
(pull: optionally rebase submodules (remote submodule changes only),
2017-06-23), we check if there are local submodule modifications by
checking the revision range 'curr_head --not rebase_fork_point'.

The goal of this check is to abort the pull if there are submodule
modifications in the local commits being rebased, since this scenario is
not supported.

However, the actual range of commits being rebased is not
'rebase_fork_point..curr_head', as the logic in
'get_rebase_newbase_and_upstream' reveals, it is 'upstream..curr_head'.

If the 'git merge-base --fork-point' invocation in
'get_rebase_fork_point' fails to find a fork point between the current
branch and the remote-tracking branch we are pulling from,
'rebase_fork_point' is null and since 4d36f88be7 (submodule: do not pass
null OID to setup_revisions, 2018-05-24), 'submodule_touches_in_range'
checks 'curr_head' and all its ancestors for submodule modifications.

Since it is highly likely that there are submodule modifications in this
range (which is in effect the whole history of the current branch), this
prevents 'git pull --rebase --recurse-submodules' from succeeding if no
fork point exists between the current branch and the remote-tracking
branch being pulled. This can happen, for example, when the current
branch was forked from a commit which was never recorded in the reflog
of the remote-tracking branch we are pulling, as the last two paragraphs
of the "Discussion on fork-point mode" section in git-merge-base(1)
explain.

Fix this bug by passing 'upstream' instead of 'rebase_fork_point' as the
'excl_oid' argument to 'submodule_touches_in_range'.

Reported-by: Brice Goglin <bgoglin@free.fr>
Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 16:01:13 -08:00
Philippe Blain
4f66d79ae3 pull --rebase: compute rebase arguments in separate function
The function 'run_rebase' is responsible for constructing the
command line to be passed to 'git rebase'. This includes both forwarding
pass-through options given to 'git pull' as well computing the <newbase>
and <upstream> arguments to 'git rebase'.

A following commit will need to access the <upstream> argument in
'cmd_pull' to fix a bug with 'git pull --rebase --recurse-submodules'.
In order to do so, refactor the code so that the <newbase> and
<upstream> commits are computed in a new, separate function,
'get_rebase_newbase_and_upstream'.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 16:01:13 -08:00
Taylor Blau
704c4a5c07 builtin/repack.c: keep track of what pack-objects wrote
In the subsequent commit, it will become useful to keep track of which
metadata files were written by pack-objects. We already do this to an
extent with the 'exts' array, which only is used in the context of
existing packs.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 15:57:44 -08:00
Jeff King
63f4d5cf57 repack: make "exts" array available outside cmd_repack()
We'll use it in a helper function soon.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 15:57:43 -08:00
Patrick Steinhardt
8c4417f1cf update-ref: disallow "start" for ongoing transactions
It is currently possible to write multiple "start" commands into
git-update-ref(1) for a single session, but none of them except for the
first one actually have any effect.

Using such nested "start"s may eventually have a sensible effect. One
may imagine that it restarts the current transaction, effectively
emptying it and creating a new one. It may also allow for creation of
nested transactions. But currently, none of these are implemented.

Silently ignoring this misuse is making it hard to iterate in the future
if "start" is ever going to have meaningful semantics in such a context.
This commit thus makes sure to error out in case we see such use.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 13:44:01 -08:00
Patrick Steinhardt
262a4d28fe update-ref: allow creation of multiple transactions
While git-update-ref has recently grown commands which allow interactive
control of transactions in e48cf33b61 (update-ref: implement interactive
transaction handling, 2020-04-02), it is not yet possible to create
multiple transactions in a single session. To do so, one currently still
needs to invoke the executable multiple times.

This commit addresses this shortcoming by allowing the "start" command
to create a new transaction if the current transaction has already been
either committed or aborted.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 13:44:01 -08:00
Jeff King
a9bc372ef8 use size_t to store pack .idx byte offsets
We sometimes store the offset into a pack .idx file as an "unsigned
long", but the mmap'd size of a pack .idx file can exceed 4GB. This is
sufficient on LP64 systems like Linux, but will be too small on LLP64
systems like Windows, where "unsigned long" is still only 32 bits. Let's
use size_t, which is a better type for an offset into a memory buffer.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 13:41:35 -08:00
Jeff King
f86f769550 compute pack .idx byte offsets using size_t
A pack and its matching .idx file are limited to 2^32 objects, because
the pack format contains a 32-bit field to store the number of objects.
Hence we use uint32_t in the code.

But the byte count of even a .idx file can be much larger than that,
because it stores at least a hash and an offset for each object. So
using SHA-1, a v2 .idx file will cross the 4GB boundary at 153,391,650
objects. This confuses load_idx(), which computes the minimum size like
this:

  unsigned long min_size = 8 + 4*256 + nr*(hashsz + 4 + 4) + hashsz + hashsz;

Even though min_size will be big enough on most 64-bit platforms, the
actual arithmetic is done as a uint32_t, resulting in a truncation. We
actually exceed that min_size, but then we do:

  unsigned long max_size = min_size;
  if (nr)
          max_size += (nr - 1)*8;

to account for the variable-sized table. That computation doesn't
overflow quite so low, but with the truncation for min_size, we end up
with a max_size that is much smaller than our actual size. So we
complain that the idx is invalid, and can't find any of its objects.

We can fix this case by casting "nr" to a size_t, which will do the
multiplication in 64-bits (assuming you're on a 64-bit platform; this
will never work on a 32-bit system since we couldn't map the whole .idx
anyway). Likewise, we don't have to worry about further additions,
because adding a smaller number to a size_t will convert the other side
to a size_t.

A few notes:

  - obviously we could just declare "nr" as a size_t in the first place
    (and likewise, packed_git.num_objects).  But it's conceptually a
    uint32_t because of the on-disk format, and we correctly treat it
    that way in other contexts that don't need to compute byte offsets
    (e.g., iterating over the set of objects should and generally does
    use a uint32_t). Switching to size_t would make all of those other
    cases look wrong.

  - it could be argued that the proper type is off_t to represent the
    file offset. But in practice the .idx file must fit within memory,
    because we mmap the whole thing. And the rest of the code (including
    the idx_size variable we're comparing against) uses size_t.

  - we'll add the same cast to the max_size arithmetic line. Even though
    we're adding to a larger type, which will convert our result, the
    multiplication is still done as a 32-bit value and can itself
    overflow. I didn't check this with my test case, since it would need
    an even larger pack (~530M objects), but looking at compiler output
    shows that it works this way. The standard should agree, but I
    couldn't find anything explicit in 6.3.1.8 ("usual arithmetic
    conversions").

The case in load_idx() was the most immediate one that I was able to
trigger. After fixing it, looking up actual objects (including the very
last one in sha1 order) works in a test repo with 153,725,110 objects.
That's because bsearch_hash() works with uint32_t entry indices, and the
actual byte access:

  int cmp = hashcmp(table + mi * stride, sha1);

is done with "stride" as a size_t, causing the uint32_t "mi" to be
promoted to a size_t. This is the way most code will access the index
data.

However, I audited all of the other byte-wise accesses of
packed_git.index_data, and many of the others are suspect (they are
similar to the max_size one, where we are adding to a properly sized
offset or directly to a pointer, but the multiplication in the
sub-expression can overflow). I didn't trigger any of these in practice,
but I believe they're potential problems, and certainly adding in the
cast is not going to hurt anything here.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 13:41:35 -08:00
Josh Steadmon
a2a066d96a receive-pack: log received client session ID
When receive-pack receives a session-id capability from the client, log
the received session ID via a trace2 data event.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 18:26:53 -08:00
Josh Steadmon
8073d75bbf receive-pack: advertise session ID in v0 capabilities
When transfer.advertiseSID is true, advertise receive-pack's session ID
via the new session-id capability.

Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 18:26:53 -08:00
Junio C Hamano
902f358555 Merge branch 'rs/clear-commit-marks-in-repo'
Code clean-up.

* rs/clear-commit-marks-in-repo:
  bisect: clear flags in passed repository
  object: allow clear_commit_marks_all to handle any repo
2020-11-11 13:18:37 -08:00
Elijah Newren
449a900969 shortlog: use strset from strmap.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 12:55:27 -08:00
Elijah Newren
b19315d8ab Use new HASHMAP_INIT macro to simplify hashmap initialization
Now that hashamp has lazy initialization and a HASHMAP_INIT macro,
hashmaps allocated on the stack can be initialized without a call to
hashmap_init() and in some cases makes the code a bit shorter.  Convert
some callsites over to take advantage of this.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 12:55:27 -08:00
Jiang Xin
80ffeb94f4 receive-pack: use default version 0 for proc-receive
In the verison negotiation phase between "receive-pack" and
"proc-receive", "proc-receive" can send an empty flush-pkt to end the
negotiation and use default version 0. Capabilities (such as
"push-options") are not supported in version 0.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 12:46:56 -08:00
Jiang Xin
f65003b4c4 receive-pack: gently write messages to proc-receive
Johannes found a flaky hang in `t5411/test-0013-bad-protocol.sh` in the
osx-clang job of the CI/PR builds, and ran into an issue when using
the `--stress` option with the following error messages:

    fatal: unable to write flush packet: Broken pipe
    send-pack: unexpected disconnect while reading sideband packet
    fatal: the remote end hung up unexpectedly

In this test case, the "proc-receive" hook sends an error message and
dies earlier. While "receive-pack" on the other side of the pipe
should forward the error message of the "proc-receive" hook to the
client side, but it fails to do so. This is because "receive-pack"
uses `packet_write_fmt()` and `packet_flush()` to write pkt-line
message to "proc-receive" hook, and these functions die immediately
when pipe is broken. Using "gently" forms for these functions will get
more predicable output.

Add more "--die-*" options to test helper to test different stages of
the protocol between "receive-pack" and "proc-receive" hook.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11 12:46:56 -08:00
Jeff King
3a1f91cfd9 rev-parse: handle --end-of-options
We taught rev-list a new way to separate options from revisions in
19e8789b23 (revision: allow --end-of-options to end option parsing,
2019-08-06), but rev-parse uses its own parser. It should know about
--end-of-options not only for consistency, but because it may be
presented with similarly ambiguous cases. E.g., if a caller does:

  git rev-parse "$rev" -- "$path"

to parse an untrusted input, then it will get confused if $rev contains
an option-like string like "--local-env-vars". Or even "--not-real",
which we'd keep as an option to pass along to rev-list.

Or even more importantly:

  git rev-parse --verify "$rev"

can be confused by options, even though its purpose is safely parsing
untrusted input. On the plus side, it will always fail the --verify
part, as it will not have parsed a revision, so the caller will
generally "fail closed" rather than continue to use the untrusted
string. But it will still trigger whatever option was in "$rev"; this
should be mostly harmless, since rev-parse options are all read-only,
but I didn't carefully audit all paths.

This patch lets callers write:

  git rev-parse --end-of-options "$rev" -- "$path"

and:

  git rev-parse --verify --end-of-options "$rev"

which will both treat "$rev" always as a revision parameter. The latter
is a bit clunky. It would be nicer if we had defined "--verify" to
require that its next argument be the revision. But we have not
historically done so, and:

  git rev-parse --verify -q "$rev"

does currently work. I added a test here to confirm that we didn't break
that.

A few implementation notes:

 - We don't document --end-of-options explicitly in commands, but rather
   in gitcli(7). So I didn't give it its own section in git-rev-parse(1).
   But I did call it out specifically in the --verify section, and
   include it in the examples, which should show best practices.

 - We don't have to re-indent the main option-parsing block, because we
   can combine our "did we see end of options" check with "does it start
   with a dash". The exception is the pre-setup options, which need
   their own block.

 - We do however have to pull the "--" parsing out of the "does it start
   with dash" block, because we want to parse it even if we've seen
   --end-of-options.

 - We'll leave "--end-of-options" in the output. This is probably not
   technically necessary, as a careful caller will do:

     git rev-parse --end-of-options $revs -- $paths

   and anything in $revs will be resolved to an object id. However, it
   does help a slightly less careful caller like:

     git rev-parse --end-of-options $revs_or_paths

   where a path "--foo" will remain in the output as long as it also
   exists on disk. In that case, it's helpful to retain --end-of-options
   to get passed along to rev-list, s it would otherwise see just
   "--foo".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-10 13:46:27 -08:00
Jeff King
9033addfa6 rev-parse: put all options under the "-" check
The option-parsing loop of rev-parse checks whether the first character
of an arg is "-". If so, then it enters a series of conditionals
checking for individual options. But some options are inexplicably
outside of that outer conditional.

This doesn't produce the wrong behavior; the conditional is actually
redundant with the individual option checks, and it's really only its
fallback "continue" that we care about. But we should at least be
consistent.

One obvious alternative is that we could get rid of the conditional
entirely. But we'll be using the extra block it provides in the next
patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-10 13:46:27 -08:00
Jeff King
e05e2ae8fe rev-parse: don't accept options after dashdash
Because of the order in which we check options in rev-parse, there are a
few options we accept even after a "--". This is wrong, because the
whole point of "--" is to say "everything after here is a path". Let's
move the "did we see a dashdash" check (it's called "as_is" in the code)
to the top of the parsing loop.

Note there is one subtlety here. The options are ordered so that some
are checked before we even see if we're in a repository (they continue
the loop, and if we get past a certain point, then we do the repository
setup). By moving the as_is check higher, it's also in that "before
setup" section, even though it might look at the repository via
verify_filename(). However, this works out: we'd never set as_is until
we parse "--", and we don't parse that until after doing the setup.

An alternative here to avoid the subtlety is to put the as_is check at
the top of the post-setup options. But then every pre-setup option would
have to remember to check "if (!as_is && !strcmp(...))". So while this
is a bit magical, it's harder for future code to get wrong.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-10 13:46:27 -08:00
Junio C Hamano
3baf58bfb4 format-patch: make output filename configurable
For the past 15 years, we've used the hardcoded 64 as the length
limit of the filename of the output from the "git format-patch"
command.  Since the value is shorter than the 80-column terminal, it
could grow without line wrapping a bit.  At the same time, since the
value is longer than half of the 80-column terminal, we could fit
two or more of them in "ls" output on such a terminal if we allowed
to lower it.

Introduce a new command line option --filename-max-length=<n> and a
new configuration variable format.filenameMaxLength to override the
hardcoded default.

While we are at it, remove a check that the name of output directory
does not exceed PATH_MAX---this check is pointless in that by the
time control reaches the function, the caller would already have
done an equivalent of "mkdir -p", so if the system does not like an
overly long directory name, the control wouldn't have reached here,
and otherwise, we know that the system allowed the output directory
to exist.  In the worst case, we will get an error when we try to
open the output file and handle the error correctly anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-09 17:44:41 -08:00
Junio C Hamano
6a44c9c0d0 Merge branch 'jk/committer-date-is-author-date-fix-simplify'
Code simplification.

* jk/committer-date-is-author-date-fix-simplify:
  am, sequencer: stop parsing our own committer ident
2020-11-09 14:06:28 -08:00
Junio C Hamano
ecf95d938b Merge branch 'ab/git-remote-exit-code'
Exit codes from "git remote add" etc. were not usable by scripted
callers.

* ab/git-remote-exit-code:
  remote: add meaningful exit code on missing/existing
2020-11-09 14:06:26 -08:00
Junio C Hamano
92d6bd2e90 Merge branch 'jk/checkout-index-errors'
"git checkout-index" did not consistently signal an error with its
exit status.

* jk/checkout-index-errors:
  checkout-index: propagate errors to exit code
  checkout-index: drop error message from empty --stage=all
2020-11-09 14:06:26 -08:00
Junio C Hamano
cfdc70b299 Merge branch 'mr/bisect-in-c-3'
Rewriting "git bisect" in C continues.

* mr/bisect-in-c-3:
  bisect--helper: retire `--bisect-autostart` subcommand
  bisect--helper: retire `--write-terms` subcommand
  bisect--helper: retire `--check-expected-revs` subcommand
  bisect--helper: reimplement `bisect_state` & `bisect_head` shell functions in C
  bisect--helper: retire `--next-all` subcommand
  bisect--helper: retire `--bisect-clean-state` subcommand
  bisect--helper: finish porting `bisect_start()` to C
2020-11-09 14:06:25 -08:00
Phillip Wood
8843302307 rebase -i: simplify get_revision_ranges()
Now that all the external users of head_hash have been converted to
use a opts->orig_head instead we can stop returning head_hash from
get_revision_ranges().

Because we want to pass the full object names back to the caller in
`revisions` the find_unique_abbrev_r() call that was used to initialize
`head_hash` is replaced with oid_to_hex().

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:10:41 -08:00
Phillip Wood
a2bb10d06d rebase -i: use struct object_id when writing state
Rather than passing a string around pass the struct object_id that the
string was created from call oid_hex() when we write the file.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:10:41 -08:00
Phillip Wood
f3e27a02d5 rebase -i: use struct object_id rather than looking up commit
We already have a struct object_id containing the oid that we want to
set ORIG_HEAD to so use that rather than converting it to a string and
then calling get_oid() on that string.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:10:41 -08:00
Phillip Wood
e100bea481 rebase -i: stop overwriting ORIG_HEAD buffer
After rebasing, ORIG_HEAD is supposed to point to the old HEAD of the
rebased branch.  The code used find_unique_abbrev() to obtain the
object name of the old HEAD and wrote to both
.git/rebase-merge/orig-head (used by `rebase --abort` to go back to
the previous state) and to ORIG_HEAD.  The buffer find_unique_abbrev()
gives back is volatile, unfortunately, and was overwritten after the
former file is written but before ORIG_FILE is written, leaving an
incorrect object name in it.

Avoid relying on the volatile buffer of find_unique_abbrev(), and
instead supply our own buffer to keep the object name.

I think that all of the users of head_hash should actually be using
opts->orig_head instead as passing a string rather than a struct
object_id around is a hang over from the scripted implementation. This
patch just fixes the immediate bug and adds a regression test based on
Caspar's reproduction example[1]. The users will be converted to use
struct object_id and head_hash removed in the next few commits.

[1] https://lore.kernel.org/git/CAFzd1+7PDg2PZgKw7U0kdepdYuoML9wSN4kofmB_-8NHrbbrHg@mail.gmail.com

Reported-by: Caspar Duregger <herr.kaste@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:10:41 -08:00
Jeff King
dc1672dd10 format-patch: support --output option
We've never intended to support diff's --output option in format-patch.
And until baa4adc66a (parse-options: disable option abbreviation with
PARSE_OPT_KEEP_UNKNOWN, 2019-01-27), it was impossible to trigger. We
first parse the format-patch options before handing the remainder off to
setup_revisions(). Before that commit, we'd accept "--output=foo" as an
abbreviation for "--output-directory=foo". But afterwards, we don't
check abbreviations, and --output gets passed to the diff code.

This results in nonsense behavior and bugs. The diff code will have
opened a filehandle at rev.diffopt.file, but we'll overwrite that with
our own handles that we open for each individual patch file. So the
--output file will always just be empty. But worse, the diff code also
sets rev.diffopt.close_file, so log_tree_commit() will close the
filehandle itself. And then the main loop in cmd_format_patch() will try
to close it again, resulting in a double-free.

The simplest solution would be to just disallow --output with
format-patch, as nobody ever intended it to work. However, we have
accidentally documented it (because format-patch includes diff-options).
And it does work with "git log", which writes the whole output to the
specified file. It's easy enough to make that work for format-patch,
too: it's really the same as --stdout, but pointed at a specific file.

We can detect the use of the --output option by the "close_file" flag
(note that we can't use rev.diffopt.file, since the diff setup will
otherwise set it to stdout). So we just need to unset that flag, but
don't have to do anything else. Our situation is otherwise exactly like
--stdout (note that we don't fclose() the file, but nor does the stdout
case; exiting the program takes care of that for us).

Reported-by: Johannes Postler <johannes.postler@txture.io>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:05:29 -08:00
Jeff King
1e1693b2bb format-patch: tie file-opening logic to output_directory
In format-patch we're either outputting to stdout or to individual files
in an output directory (which may be just "./"). Our logic for whether
to open a new file for each patch is checked with "!use_stdout", but it
is equally correct to check for a non-NULL output_directory.

The distinction will matter when we add a new single-stream output in a
future patch, when only one of the three methods will want individual
files. Let's swap the logic here in preparation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:05:28 -08:00
Jeff King
4c6f781f9c format-patch: refactor output selection
The --stdout and --output-directory options are mutually exclusive, but
it's hard to tell from reading the code. We have three separate
conditionals that check for use_stdout, and it's only after we've set up
the output_directory fully that we check whether the user also specified
--stdout.

Instead, let's check the exclusion explicitly first, then have a single
conditional that handles stdout versus an output directory. This is
slightly easier to follow now, and also will keep things sane when we
add another output mode in a future patch.

We'll add a few tests as well, covering the mutual exclusion and the
fact that we are not confused by a configured output directory.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 14:05:28 -08:00
Junio C Hamano
39664cb0ac log: diagnose -L used with pathspec as an error
The -L option is documented to accept no pathspec, but the
command line option parser has allowed the combination without
checking so far.  Ensure that there is no pathspec when the -L
option is in effect to fix this.

Incidentally, this change fixes another bug in the command line
option parser, which has allowed the -L option used together
with the --follow option.  Because the latter requires exactly
one path given, but the former takes no pathspec, they become
mutually incompatible automatically.  Because the -L option
follows renames on its own, there is no reason to give --follow
at the same time.

The new tests say they may fail with "-L and --follow being
incompatible" instead of "-L and pathspec being incompatible".
Currently the expected failure can come only from the latter, but
this is to futureproof them, in case we decide to add code to
explicititly die on -L and --follow used together.

Heled-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-04 13:38:33 -08:00
Elijah Newren
14c4586c2d merge,rebase,revert: select ort or recursive by config or environment
Allow the testsuite to run where it treats requests for "recursive" or
the default merge algorithm via consulting the environment variable
GIT_TEST_MERGE_ALGORITHM which is expected to either be "recursive" (the
old traditional algorithm) or "ort" (the new algorithm).

Also, allow folks to pick the new algorithm via config setting.  It
turns out builtin/merge.c already had a way to allow users to specify a
different default merge algorithm: pull.twohead.  Rather odd
configuration name (especially to be in the 'pull' namespace rather than
'merge') but it's there.  Add that same configuration to rebase,
cherry-pick, and revert.

This required updating the various callsites that called merge_trees()
or merge_recursive() to conditionally call the new API, so this serves
as another demonstration of what the new API looks and feels like.
There are almost certainly some callsites that have not yet been
modified to work with the new merge algorithm, but this represents the
ones that I have been testing with thus far.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02 16:35:50 -08:00
Junio C Hamano
1ae0949a03 Merge branch 'mk/diff-ignore-regex'
"git diff" family of commands learned the "-I<regex>" option to
ignore hunks whose changed lines all match the given pattern.

* mk/diff-ignore-regex:
  diff: add -I<regex> that ignores matching changes
  merge-base, xdiff: zero out xpparam_t structures
2020-11-02 13:17:44 -08:00
Junio C Hamano
51830654fc Merge branch 'jk/fast-import-marks-cleanup'
Code clean-up.

* jk/fast-import-marks-cleanup:
  fast-import: remove duplicated option-parsing line
2020-11-02 13:17:40 -08:00
Junio C Hamano
e0f6ad2984 Merge branch 'tk/credential-config'
"git credential' didn't honor the core.askPass configuration
variable (among other things), which has been corrected.

* tk/credential-config:
  credential: load default config
2020-11-02 13:17:39 -08:00
Junio C Hamano
b6fb70c985 Merge branch 'dl/diff-merge-base'
"git diff A...B" learned "git diff --merge-base A B", which is a
longer short-hand to say the same thing.

* dl/diff-merge-base:
  contrib/completion: complete `git diff --merge-base`
  builtin/diff-tree: learn --merge-base
  builtin/diff-index: learn --merge-base
  t4068: add --merge-base tests
  diff-lib: define diff_get_merge_base()
  diff-lib: accept option flags in run_diff_index()
  contrib/completion: extract common diff/difftool options
  git-diff.txt: backtick quote command text
  git-diff-index.txt: make --cached description a proper sentence
  t4068: remove unnecessary >tmp
2020-11-02 13:17:39 -08:00
Junio C Hamano
761a4e9ab1 Merge branch 'bk/sob-dco'
Document that the meaning of a Signed-off-by trailer can vary from
project to project in the end-user documentation, and clarify what
it means to this project.

* bk/sob-dco:
  Documentation: stylistically normalize references to Signed-off-by:
  SubmittingPatches: clarify DCO is our --signoff rule
  Documentation: clarify and expand description of --signoff
  doc: preparatory clean-up of description on the sign-off option
2020-11-02 13:17:39 -08:00
Junio C Hamano
0be2d65132 Merge branch 'ds/maintenance-commit-graph-auto-fix'
Test-coverage enhancement of running commit-graph task "git
maintenance" as needed led to discovery and fix of a bug.

* ds/maintenance-commit-graph-auto-fix:
  maintenance: core.commitGraph=false prevents writes
  maintenance: test commit-graph auto condition
2020-11-02 13:17:39 -08:00
Junio C Hamano
cd47bbe164 Merge branch 'jk/fast-import-marks-alloc-fix'
"git fast-import" wasted a lot of memory when many marks were in use.

* jk/fast-import-marks-alloc-fix:
  fast-import: fix over-allocation of marks storage
2020-11-02 13:17:37 -08:00
Elijah Newren
6da1a25814 hashmap: provide deallocation function names
hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed
for a while, but aren't necessarily the clearest names, especially with
hashmap_partial_clear() being added to the mix and lazy-initialization
now being supported.  Peff suggested we adopt the following names[1]:

  - hashmap_clear() - remove all entries and de-allocate any
    hashmap-specific data, but be ready for reuse

  - hashmap_clear_and_free() - ditto, but free the entries themselves

  - hashmap_partial_clear() - remove all entries but don't deallocate
    table

  - hashmap_partial_clear_and_free() - ditto, but free the entries

This patch provides the new names and converts all existing callers over
to the new naming scheme.

[1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02 12:15:50 -08:00
Philippe Blain
3af31e8786 blame: simplify 'setup_blame_bloom_data' interface
The penultimate commit moved the initialization of 'sb.path' in
'builtin/blame.c::cmd_blame' before the call to
'blame.c::setup_blame_bloom_data'. Since 'cmd_blame' is the only caller
of 'setup_blame_bloom_data', it is now unnecessary for
'setup_blame_bloom_data' to receive 'path' as a separate argument, as
'sb.path' is already initialized.

Remove this argument from setup_blame_bloom_data's interface and use the
'path' field of the 'sb' 'struct blame_scoreboard' instead.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:15 -08:00
Philippe Blain
88894aaeea blame: simplify 'setup_scoreboard' interface
The previous commit moved the initialization of 'sb.path' in
'builtin/blame.c::cmd_blame' before the call to
'blame.c::setup_scoreboard'. Since 'cmd_blame' is the only caller of
'setup_scoreboard', it is now unnecessary for 'setup_scoreboard' to
receive 'path' as a separate argument, as 'sb.path' is already
initialized.

Remove this argument from setup_scoreboard's interface and use the
'path' field of the 'sb' 'struct blame_scoreboard' instead.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:15 -08:00
Philippe Blain
9466e3809d blame: enable funcname blaming with userdiff driver
In blame.c::cmd_blame, we send the 'path' field of the 'sb' 'struct
blame_scoreboard' as the 'path' argument to
'line-range.c::parse_range_arg', but 'sb.path' is not set yet; it's set
to the local variable 'path' a few lines later at line 1137.

This 'path' argument is only used in 'parse_range_arg' if we are blaming
a funcname, i.e. `git blame -L :<funcname> <path>`, and in that case it
is sent to 'parse_range_funcname', where it is used to determine if a
userdiff driver should be used for said <path> to match the given
funcname.

Since 'path' is yet unset, the userdiff driver is never used, so we fall
back to the default funcname regex, which is usually not appropriate for
paths that are set to use a specific userdiff driver, and thus either we
match some unrelated lines, or we die with

    fatal: -L parameter '<funcname>' starting at line 1: no match

This has been the case ever since `git blame` learned to blame a
funcname in 13b8f68c1f (log -L: :pattern:file syntax to find by
funcname, 2013-03-28).

Enable funcname blaming for paths using specific userdiff drivers by
initializing 'sb.path' earlier in 'cmd_blame', when some of its other
fields are initialized, so that it is set when passed to
'parse_range_arg'.

Add a regression test in 'annotate-tests.sh', which is sourced in
t8001-annotate.sh and t8002-blame.sh, leveraging an existing file used
to test the userdiff patterns in t4018-diff-funcname.

Also, use 'sb.path' instead of 'path' when constructing the error
message at line 1114, for consistency.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:15 -08:00
Philippe Blain
180d641d7d line-log: mention both modes in 'blame' and 'log' short help
'git blame -h' and 'git log -h' both show '-L <n,m>' and describe this
option as "Process only line range n,m, counting from 1". No hint is
given that a function name regex can also be used.

Use <range> instead, and expand the description of the option to mention
both modes. Remove "counting from 1" as it's uneeded; it's uncommon to
refer to the first line of a file as "line 0".

Also, for 'git log', improve the wording to better reflect the long help.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:54:14 -08:00
René Scharfe
4f44c5659b stash: simplify reflog emptiness check
Calling rev-parse to check if the drop subcommand removed the last stash
and treating its failure as confirmation is fragile, as the command can
fail for other reasons, e.g. because the system is out of memory.
Directly check if the reflog is empty instead, which is more robust.

Reported-by: Marek Mrva <mrva@eof-studios.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-01 15:51:31 -08:00
René Scharfe
cd8888452c object: allow clear_commit_marks_all to handle any repo
Allow callers to specify the repository to use.  Rename the function to
repo_clear_commit_marks to document its new scope.  No functional change
intended.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-31 10:46:34 -07:00
Junio C Hamano
4f9f7c1442 Merge branch 'jk/committer-date-is-author-date-fix' into maint
In 2.29, "--committer-date-is-author-date" option of "rebase" and
"am" subcommands lost the e-mail address by mistake, which has been
corrected.

* jk/committer-date-is-author-date-fix:
  rebase: fix broken email with --committer-date-is-author-date
  am: fix broken email with --committer-date-is-author-date
  t3436: check --committer-date-is-author-date result more carefully
2020-10-29 14:18:47 -07:00
Junio C Hamano
0e41cfad62 Merge branch 'dl/checkout-guess'
"git checkout" learned to use checkout.guess configuration variable
and enable/disable its "--[no-]guess" option accordingly.

* dl/checkout-guess:
  checkout: learn to respect checkout.guess
  Documentation/config/checkout: replace sq with backticks
2020-10-27 15:09:51 -07:00
Junio C Hamano
f3cfeb3078 Merge branch 'dl/checkout-p-merge-base'
"git checkout -p A...B [-- <path>]" did not work, even though the
same command without "-p" correctly used the merge-base between
commits A and B.

* dl/checkout-p-merge-base:
  t2016: add a NEEDSWORK about the PERL prerequisite
  add-patch: add NEEDSWORK about comparing commits
  Doc: document "A...B" form for <tree-ish> in checkout and switch
  builtin/checkout: fix `git checkout -p HEAD...` bug
2020-10-27 15:09:51 -07:00
Junio C Hamano
40696c6727 Merge branch 'sb/clone-origin'
"git clone" learned clone.defaultremotename configuration variable
to customize what nickname to use to call the remote the repository
was cloned from.

* sb/clone-origin:
  clone: allow configurable default for `-o`/`--origin`
  clone: read new remote name from remote_name instead of option_origin
  clone: validate --origin option before use
  refs: consolidate remote name validation
  remote: add tests for add and rename with invalid names
  clone: use more conventional config/option layering
  clone: add tests for --template and some disallowed option pairs
2020-10-27 15:09:50 -07:00
Junio C Hamano
de0a7effc8 Merge branch 'sk/force-if-includes'
"git push --force-with-lease[=<ref>]" can easily be misused to lose
commits unless the user takes good care of their own "git fetch".
A new option "--force-if-includes" attempts to ensure that what is
being force-pushed was created after examining the commit at the
tip of the remote ref that is about to be force-replaced.

* sk/force-if-includes:
  t, doc: update tests, reference for "--force-if-includes"
  push: parse and set flag for "--force-if-includes"
  push: add reflog check for "--force-if-includes"
2020-10-27 15:09:49 -07:00
Junio C Hamano
52b8c8c716 Merge branch 'ds/maintenance-part-2'
"git maintenance", an extended big brother of "git gc", continues
to evolve.

* ds/maintenance-part-2:
  maintenance: add incremental-repack auto condition
  maintenance: auto-size incremental-repack batch
  maintenance: add incremental-repack task
  midx: use start_delayed_progress()
  midx: enable core.multiPackIndex by default
  maintenance: create auto condition for loose-objects
  maintenance: add loose-objects task
  maintenance: add prefetch task
2020-10-27 15:09:47 -07:00
Junio C Hamano
26bb5437f6 Merge branch 'rs/worktree-list-show-locked'
"git worktree list" now shows if each worktree is locked.  This
possibly may open us to show other kinds of states in the future.

* rs/worktree-list-show-locked:
  worktree: teach `list` to annotate locked worktree
2020-10-27 15:09:47 -07:00
Junio C Hamano
ae84e924da Merge branch 'rs/tighten-callers-of-deref-tag'
Code clean-up.

* rs/tighten-callers-of-deref-tag:
  line-log: handle deref_tag() returning NULL
  blame: handle deref_tag() returning NULL
  grep: handle deref_tag() returning NULL
2020-10-27 15:09:46 -07:00
Jeff King
7e41061588 checkout-index: propagate errors to exit code
If we encounter an error while checking out an explicit path, we print a
message to stderr but do not actually exit with a non-zero code. While
this is a plumbing command and the behavior goes all the way back to
33db5f4d90 (Add a "checkout-cache" command which does what the name
suggests., 2005-04-09), this is almost certainly an oversight:

  - we _do_ return an exit code from checkout_file(); the caller just
    never reads it

  - errors while checking out all paths (with "-a") do result in a
    non-zero exit code.

  - it would be quite unusual not to use the exit code for an error,
    as otherwise the caller has no idea the command failed except by
    scraping stderr

To keep our tests simple and portable, we can use the most obvious
error: asking to checkout a path which is not in the index at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-27 12:41:56 -07:00
Jeff King
0b809c8248 checkout-index: drop error message from empty --stage=all
If checkout-index is given --stage=all for a specific path, it will try
to write stages 1-3 (if present) for that path to temporary files.
However, if the file is present only at stage 0, it writes nothing but
gives a confusing message:

  $ git checkout-index --stage=all -- Makefile
  git checkout-index: Makefile does not exist at stage 4

This is nonsense. There is no stage 4 (it's just an internal enum value
we use for "all"), and the documentation clearly states:

  Paths which only have a stage 0 entry will always be omitted from the
  output.

Here it's talking about the list of tempfiles written to stdout, but it
seems clear that this case was not meant to be an error. We even have a
test which covers it, but it only checks that the command reports an
exit code of 0, not its stderr. And it reports 0 only because of another
bug which fails to propagate errors (which will be fixed in a subsequent
patch).

So let's make the test more thorough. We'll also cover the case that we
found _no_ entry, not even a stage zero, which should still be an error.
However, because of the other bug, we'll have to mark this as expecting
failure for the moment.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-27 12:41:54 -07:00
Ævar Arnfjörð Bjarmason
9144ba4cf5 remote: add meaningful exit code on missing/existing
Change the exit code for the likes of "git remote add/rename" to exit
with 2 if the remote in question doesn't exist, and 3 if it
does. Before we'd just die() and exit with the general 128 exit code.

This changes the output message from e.g.:

    fatal: remote origin already exists.

To:

    error: remote origin already exists.

Which I believe is a feature, since we generally use "fatal" for the
generic errors, and "error" for the more specific ones with a custom
exit code, but this part of the change may break code that already
relies on stderr parsing (not that we ever supported that...).

The motivation for this is a discussion around some code in GitLab's
gitaly which wanted to check this, and had to parse stderr to do so:
https://gitlab.com/gitlab-org/gitaly/-/merge_requests/2695

It's worth noting as an aside that a method of checking this that
doesn't rely on that is to check with "git config" whether the value
in question does or doesn't exist. That introduces a TOCTOU race
condition, but on the other hand this code (e.g. "git remote add")
already has a TOCTOU race.

We go through the config.lock for the actual setting of the config,
but the pseudocode logic is:

    read_config();
    check_config_and_arg_sanity();
    save_config();

So e.g. if a sleep() is added right after the remote_is_configured()
check in add() we'll clobber remote.NAME.url, and add another (usually
duplicate) remote.NAME.fetch entry (and other values, depending on
invocation).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-27 11:40:33 -07:00
Junio C Hamano
f34687dc81 Merge branch 'jk/committer-date-is-author-date-fix'
In 2.29, "--committer-date-is-author-date" option of "rebase" and
"am" subcommands lost the e-mail address by mistake, which has been
corrected.

* jk/committer-date-is-author-date-fix:
  rebase: fix broken email with --committer-date-is-author-date
  am: fix broken email with --committer-date-is-author-date
  t3436: check --committer-date-is-author-date result more carefully
2020-10-26 14:59:58 -07:00
Jeff King
2020451c5b am, sequencer: stop parsing our own committer ident
For the --committer-date-is-author-date option of git-am and git-rebase,
we format the committer ident, then re-parse it to find the name and
email, and then feed those back to fmt_ident().

We can simplify this by handling it all at the time of the fmt_ident()
call. We pass in the appropriate getenv() results, and if they're not
present, then our WANT_COMMITTER_IDENT flag tells fmt_ident() to fill in
the appropriate value from the config. Which is exactly what
git_committer_ident() was doing under the hood.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-26 09:59:57 -07:00
Jeff King
16b0bb99ea am: fix broken email with --committer-date-is-author-date
Commit e8cbe2118a (am: stop exporting GIT_COMMITTER_DATE, 2020-08-17)
rewrote the code for setting the committer date to use fmt_ident(),
rather than setting an environment variable and letting commit_tree()
handle it. But it introduced two bugs:

  - we use the author email string instead of the committer email

  - when parsing the committer ident, we used the wrong variable to
    compute the length of the email, resulting in it always being a
    zero-length string

This commit fixes both, which causes our test of this option via the
rebase "apply" backend to now succeed.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-23 08:25:19 -07:00
Michał Kępień
ec7967cfaf merge-base, xdiff: zero out xpparam_t structures
xpparam_t structures are usually zero-initialized before their specific
fields are assigned to, but there are three locations in the tree where
that does not happen.  Add the missing memset() calls in order to make
initialization of xpparam_t structures consistent tree-wide and to
prevent stack garbage from being used as field values.

Signed-off-by: Michał Kępień <michal@isc.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20 12:53:26 -07:00
Bradley M. Kuhn
3abd4a67d9 Documentation: stylistically normalize references to Signed-off-by:
Ted reported an old typo in the git-commit.txt and merge-options.txt.
Namely, the phrase "Signed-off-by line" was used without either a
definite nor indefinite article.

Upon examination, it seems that the documentation (including items in
Documentation/, but also option help strings) have been quite
inconsistent on usage when referring to `Signed-off-by`.

First, very few places used a definite or indefinite article with the
phrase "Signed-off-by line", but that was the initial typo that led
to this investigation.  So, normalize using either an indefinite or
definite article consistently.

The original phrasing, in Commit 3f971fc425 (Documentation updates,
2005-08-14), is "Add Signed-off-by line".  Commit 6f855371a5 (Add
--signoff, --check, and long option-names. 2005-12-09) switched to
using "Add `Signed-off-by:` line", but didn't normalize the former
commit to match.  Later commits seem to have cut and pasted from one
or the other, which is likely how the usage became so inconsistent.

Junio stated on the git mailing list in
<xmqqy2k1dfoh.fsf@gitster.c.googlers.com> a preference to leave off
the colon.  Thus, prefer `Signed-off-by` (with backticks) for the
documentation files and Signed-off-by (without backticks) for option
help strings.

Additionally, Junio argued that "trailer" is now the standard term to
refer to `Signed-off-by`, saying that "becomes plenty clear that we
are not talking about any random line in the log message".  As such,
prefer "trailer" over "line" anywhere the former word fits.

However, leave alone those few places in documentation that use
Signed-off-by to refer to the process (rather than the specific
trailer), or in places where mail headers are generally discussed in
comparison with Signed-off-by.

Reported-by: "Theodore Y. Ts'o" <tytso@mit.edu>
Signed-off-by: Bradley M. Kuhn <bkuhn@sfconservancy.org>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20 11:57:40 -07:00
Thomas Koutcher
567ad2c0f9 credential: load default config
Make `git credential fill` honour the core.askPass variable.

Signed-off-by: Thomas Koutcher <thomas.koutcher@online.fr>
[jk: added test]
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:30:45 -07:00
Pranit Bauva
b0f6494f70 bisect--helper: retire --bisect-autostart subcommand
The `--bisect-autostart` subcommand is no longer used from the
git-bisect.sh shell script. Instead the function
`bisect_autostart()` is directly called from the C implementation.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:24:20 -07:00
Pranit Bauva
5c517fe345 bisect--helper: retire --write-terms subcommand
The `--write-terms` subcommand is no longer used from the
git-bisect.sh shell script. Instead the function `write_terms()`
is called from the C implementation of `set_terms()` and
`bisect_start()`.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:24:20 -07:00
Pranit Bauva
9b437b056d bisect--helper: retire --check-expected-revs subcommand
The `--check-expected-revs` subcommand is no longer used from the
git-bisect.sh shell script. Functions `check_expected_revs` and
`is_expected_revs` are also deleted.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:24:20 -07:00
Pranit Bauva
27257bc466 bisect--helper: reimplement bisect_state & bisect_head shell functions in C
Reimplement the `bisect_state()` shell functions in C and also add a
subcommand `--bisect-state` to `git-bisect--helper` to call them from
git-bisect.sh .

Using `--bisect-state` subcommand is a temporary measure to port shell
function to C so as to use the existing test suite. As more functions
are ported, this subcommand will be retired and will be called by some
other methods.

`bisect_head()` is only called from `bisect_state()`, thus it is not
required to introduce another subcommand.

Note that the `eval` in the changed line of `git-bisect.sh` cannot be
dropped: it is necessary because the `rev` and the `tail`
variables may contain multiple, quoted arguments that need to be
passed to `bisect--helper` (without the quotes, naturally).

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:24:20 -07:00
Pranit Bauva
04774b4e70 bisect--helper: retire --next-all subcommand
The `--next-all` subcommand is no longer used from the git-bisect.sh
shell script. Instead the function `bisect_next_all()` is called from
the C implementation of `bisect_next()`.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:24:20 -07:00
Pranit Bauva
e4396072e7 bisect--helper: retire --bisect-clean-state subcommand
The `--bisect-clean-state` subcommand is no longer used from the
git-bisect.sh shell script. Instead the function
`bisect_clean_state()` is directly called from the C
implementation.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:24:20 -07:00
Pranit Bauva
88ad372fc0 bisect--helper: finish porting bisect_start() to C
Add the subcommand to `git bisect--helper` and call it from
git-bisect.sh.

With the conversion of `bisect_auto_next()` from shell to C in a
previous commit, `bisect_start()` can now be fully ported to C.

So let's complete the `--bisect-start` subcommand of
`git bisect--helper` so that it fully implements `bisect_start()`,
and let's use this subcommand in `git-bisect.sh` instead of
`bisect_start()`.

Note that the `eval` in the changed line of `git-bisect.sh` cannot be
dropped: it is necessary because the `rev` and the `tail`
variables may contain multiple, quoted arguments that need to be
passed to `bisect--helper` (without the quotes, naturally).

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 12:24:20 -07:00
Jeff King
6db29ab213 fast-import: remove duplicated option-parsing line
Commit 1bdca81641 (fast-import: add options for rewriting submodules,
2020-02-22) accidentally added two lines parsing the option
"rewrite-submodules-from". This didn't do anything in practice, because
they're in an if/else chain and so the second one can never trigger.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 08:48:47 -07:00
Derrick Stolee
61f7a383d3 maintenance: use 'incremental' strategy by default
The 'git maintenance (register|start)' subcommands add the current
repository to the global Git config so maintenance will operate on that
repository. It does not specify what maintenance should occur or how
often.

To make it simple for users to start background maintenance with a
recommended schedlue, update the 'maintenance.strategy' config option in
both the 'register' and 'start' subcommands. This allows users to
customize beyond the defaults using individual
'maintenance.<task>.schedule' options, but also the user can opt-out of
this strategy using 'maintenance.strategy=none'.

Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 08:36:42 -07:00
Derrick Stolee
a4cb1a2339 maintenance: create maintenance.strategy config
To provide an on-ramp for users to use background maintenance without
several 'git config' commands, create a 'maintenance.strategy' config
option. Currently, the only important value is 'incremental' which
assigns the following schedule:

* gc: never
* prefetch: hourly
* commit-graph: hourly
* loose-objects: daily
* incremental-repack: daily

These tasks are chosen to minimize disruptions to foreground Git
commands and use few compute resources.

The 'maintenance.strategy' is intended as a baseline that can be
customzied further by manually assigning 'maintenance.<task>.enabled'
and 'maintenance.<task>.schedule' config options, which will override
any recommendation from 'maintenance.strategy'. This operates similarly
to config options like 'feature.experimental' which operate as "meta"
config options that change default config values.

This presents a way forward for updating the 'incremental' strategy in
the future or adding new strategies. For example, a potential strategy
could be to include a 'full' strategy that runs the 'gc' task weekly
and no other tasks by default.

Helped-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-16 08:36:42 -07:00
Jeff King
3f018ec716 fast-import: fix over-allocation of marks storage
Fast-import stores its marks in a trie-like structure made of mark_set
structs. Each struct has a fixed size (1024). If our id number is too
large to fit in the struct, then we allocate a new struct which shifts
the id number by 10 bits. Our original struct becomes a child node
of this new layer, and the new struct becomes the top level of the trie.

This scheme was broken by ddddf8d7e2 (fast-import: permit reading
multiple marks files, 2020-02-22). Before then, we had a top-level
"marks" pointer, and the push-down worked by assigning the new top-level
struct to "marks". But after that commit, insert_mark() takes a pointer
to the mark_set, rather than using the global "marks". It continued to
assign to the global "marks" variable during the push down, which was
wrong for two reasons:

  - we added a call in option_rewrite_submodules() which uses a separate
    mark set; pushing down on "marks" is outright wrong here. We'd
    corrupt the "marks" set, and we'd fail to correctly store any
    submodule mappings with an id over 1024.

  - the other callers passed "marks", but the push-down was still wrong.
    In read_mark_file(), we take the pointer to the mark_set as a
    parameter. So even though insert_mark() was updating the global
    "marks", the local pointer we had in read_mark_file() was not
    updated. As a result, we'd add a new level when needed, but then the
    next call to insert_mark() wouldn't see it! It would then allocate a
    new layer, which would also not be seen, and so on. Lookups for the
    lost layers obviously wouldn't work, but before we even hit any
    lookup stage, we'd generally run out of memory and die.

Our tests didn't notice either of these cases because they didn't have
enough marks to trigger the push-down behavior. The new tests in t9304
cover both cases (and fail without this patch).

We can solve the problem by having insert_mark() take a pointer-to-pointer
of the top-level of the set. Then our push down can assign to it in a
way that the caller actually sees. Note the subtle reordering in
option_rewrite_submodules(). Our call to read_mark_file() may modify our
top-level set pointer, so we have to wait until after it returns to
assign its value into the string_list.

Reported-by: Sergey Brester <serg.brester@sebres.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-15 10:30:53 -07:00
René Scharfe
db7d07f610 blame: handle deref_tag() returning NULL
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-12 12:25:14 -07:00
René Scharfe
e30b1525fb grep: handle deref_tag() returning NULL
deref_tag() can return NULL.  Exit gracefully in that case instead
of blindly dereferencing the return value.

.name shouldn't ever be NULL, but grep_object() handles that case
explicitly, so let's be defensive here as well and show the broken
object's ID if it happens to lack a name after all.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-12 12:25:14 -07:00
Rafael Silva
c57b3367be worktree: teach list to annotate locked worktree
The "git worktree list" shows the absolute path to the working tree,
the commit that is checked out and the name of the branch. It is not
immediately obvious which of the worktrees, if any, are locked.

"git worktree remove" refuses to remove a locked worktree with
an error message. If "git worktree list" told which worktrees
are locked in its output, the user would not even attempt to
remove such a worktree, or would realize that
"git worktree remove -f -f <path>" is required.

Teach "git worktree list" to append "locked" to its output.
The output from the command becomes like so:

    $ git worktree list
    /path/to/main             abc123 [master]
    /path/to/worktree         456def (detached HEAD)
    /path/to/locked-worktree  123abc (detached HEAD) locked

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-12 12:24:29 -07:00
Derrick Stolee
d334107c5d maintenance: core.commitGraph=false prevents writes
Recently, a user had an issue due to combining
fetch.writeCommitGraph=true with core.commitGraph=false. The root bug
has been resolved by preventing commit-graph writes when
core.commitGraph is disabled. This happens inside the 'git commit-graph
write' command, but we can be more aware of this situation and prevent
that process from ever starting in the 'commit-graph' maintenance task.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-12 12:13:21 -07:00
Junio C Hamano
d620daaa34 Merge branch 'ja/misc-doc-fixes'
Doc fixes.

* ja/misc-doc-fixes:
  doc: fix the bnf like style of some commands
  doc: git-remote fix ups
  doc: use linkgit macro where needed.
  git-bisect-lk2009: make continuation of list indented
2020-10-08 21:53:26 -07:00
Junio C Hamano
c7ac8c0a7c Merge branch 'jk/index-pack-hotfixes'
Hotfix and clean-up for the jt/threaded-index-pack topic that has
graduated to v2.29-rc0.

* jk/index-pack-hotfixes:
  index-pack: make get_base_data() comment clearer
  index-pack: drop type_cas mutex
  index-pack: restore "resolving deltas" progress meter
2020-10-08 21:53:26 -07:00
Jean-Noël Avila
9f443f5531 doc: fix the bnf like style of some commands
In command line options, variables are entered between < and >

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-08 14:01:19 -07:00
Derrick Stolee
8f801804be maintenance: test commit-graph auto condition
The auto condition for the commit-graph maintenance task walks refs
looking for commits that are not in the commit-graph file. This was
added in 4ddc79b2 (maintenance: add auto condition for commit-graph
task, 2020-09-17) but was left untested.

The initial goal of this change was to demonstrate the feature works
properly by adding tests. However, there was an off-by-one error that
caused the basic tests around maintenance.commit-graph.auto=1 to fail
when it should work.

The subtlety is that if a ref tip is not in the commit-graph, then we
were not adding that to the total count. In the test, we see that we
have only added one commit since our last commit-graph write, so the
auto condition would say there is nothing to do.

The fix is simple: add the check for the commit-graph position to see
that the tip is not in the commit-graph file before starting our walk.
Since this happens before adding to the DFS stack, we do not need to
clear our (currently empty) commit list.

This does add some extra complexity for the test, because we also want
to verify that the walk along the parents actually does some work. This
means we need to add at least two commits in a row without writing the
commit-graph. However, we also need to make sure no additional refs are
pointing to the middle of this list or else the for_each_ref() in
should_write_commit_graph() might visit these commits as tips instead of
doing a DFS walk. Hence, the last two commits are added with "git
commit" instead of "test_commit".

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-08 10:24:40 -07:00
Denton Liu
64f1f58fe7 checkout: learn to respect checkout.guess
The current behavior of git checkout/switch is that --guess is currently
enabled by default. However, some users may not wish for this to happen
automatically. Instead of forcing users to specify --no-guess manually
each time, teach these commands the checkout.guess configuration
variable that gives users the option to set a default behavior.

Teach the completion script to recognize the new config variable and
disable DWIM logic if it is set to false.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-08 09:25:29 -07:00
Jonathan Tan
ec6a8f9705 index-pack: make get_base_data() comment clearer
A comment mentions that we may free cached delta bases via
find_unresolved_deltas(), but that function went away in f08cbf60fe
(index-pack: make quantum of work smaller, 2020-09-08). Since we need to
rewrite that comment anyway, make the entire comment clearer.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-07 13:32:27 -07:00
Jeff King
bebe171947 index-pack: drop type_cas mutex
The type_cas lock lost all of its callers in f08cbf60fe (index-pack:
make quantum of work smaller, 2020-09-08), so we can safely delete it.
The compiler didn't alert us that the variable became unused, because we
still call pthread_mutex_init() and pthread_mutex_destroy() on it.

It's worth considering also whether that commit was in error to remove
the use of the lock. Why don't we need it now, if we did before, as
described in ab791dd138 (index-pack: fix race condition with duplicate
bases, 2014-08-29)? I think the answer is that we now look at and assign
the child_obj->real_type field in the main thread while holding the
work_lock(). So we don't have to worry about racing with the worker
threads.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-07 11:51:26 -07:00
Jeff King
cea69151a4 index-pack: restore "resolving deltas" progress meter
Commit f08cbf60fe (index-pack: make quantum of work smaller, 2020-09-08)
refactored the main loop in threaded_second_pass(), but also deleted the
call to display_progress() at the top of the loop. This means that users
typically see no progress at all during the delta resolution phase (and
for large repositories, Git appears to hang).

This looks like an accident that was unrelated to the intended change of
that commit, since we continue to update nr_resolved_deltas in
resolve_delta(). Let's restore the call to get that progress back.

We'll also add a test that confirms we generate the expected progress.
This isn't perfect, as it wouldn't catch a bug where progress was
delayed to the end. That was probably possible to trigger when receiving
a thin pack, because we'd eventually call display_progress() from
fix_unresolved_deltas(), but only once after doing all the work.
However, since our test case generates a complete pack, it reliably
demonstrates this particular bug and its fix. And we can't do better
without making the test racy.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-07 11:50:09 -07:00
Denton Liu
5602b500c3 builtin/checkout: fix git checkout -p HEAD... bug
Running `git checkout -p` with a merge-base rev results in an error:

	$ git checkout -p HEAD...
	usage: git diff-index [-m] [--cached] [<common-diff-options>] <tree-ish> [<path>...]
	common diff options:
	  -z            output diff-raw with lines terminated with NUL.
	  -p            output patch format.
	  -u            synonym for -p.
	  --patch-with-raw
			output both a patch and the diff-raw format.
	  --stat        show diffstat instead of patch.
	  --numstat     show numeric diffstat instead of patch.
	  --patch-with-stat
			output a patch and prepend its diffstat.
	  --name-only   show only names of changed files.
	  --name-status show names and status of changed files.
	  --full-index  show full object name on index lines.
	  --abbrev=<n>  abbreviate object names in diff-tree header and diff-raw.
	  -R            swap input file pairs.
	  -B            detect complete rewrites.
	  -M            detect renames.
	  -C            detect copies.
	  --find-copies-harder
			try unchanged files as candidate for copy detection.
	  -l<n>         limit rename attempts up to <n> paths.
	  -O<file>      reorder diffs according to the <file>.
	  -S<string>    find filepair whose only one side contains the string.
	  --pickaxe-all
			show all files diff when -S is used and hit is found.
	  -a  --text    treat all files as text.

	Cannot close git diff-index --cached --numstat --summary HEAD... -- () at <redacted>/libexec/git-core/git-add--interactive line 183.

This happens because checkout passes the literal argument (in the
example, `HEAD...`) to diff-index which does not recognise merge-base
revs.

Fix this by using the hex of the found commit instead of the given name.
Note that "HEAD" is handled specially in run_add_interactive() so it's
explicitly not changed.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-07 09:49:05 -07:00
Junio C Hamano
5f8c70a148 Merge branch 'jk/format-auto-base-when-able'
"git format-patch" learns to take "whenAble" as a possible value
for the format.useAutoBase configuration variable to become no-op
when the  automatically computed base does not make sense.

* jk/format-auto-base-when-able:
  format-patch: teach format.useAutoBase "whenAble" option
2020-10-05 14:01:55 -07:00
Junio C Hamano
8e3ec76a20 Merge branch 'jk/refspecs-negative'
"git fetch" and "git push" support negative refspecs.

* jk/refspecs-negative:
  refspec: add support for negative refspecs
2020-10-05 14:01:54 -07:00
Junio C Hamano
e68f0a4e57 Merge branch 'jt/keep-partial-clone-filter-upon-lazy-fetch'
The lazy fetching done internally to make missing objects available
in a partial clone incorrectly made permanent damage to the partial
clone filter in the repository, which has been corrected.

* jt/keep-partial-clone-filter-upon-lazy-fetch:
  fetch: do not override partial clone filter
  promisor-remote: remove unused variable
2020-10-05 14:01:53 -07:00
Junio C Hamano
19dd352d03 Merge branch 'jk/unused'
Code cleanup.

* jk/unused:
  dir.c: drop unused "untracked" from treat_path_fast()
  sequencer: handle ignore_footer when parsing trailers
  test-advise: check argument count with argc instead of argv
  sparse-checkout: fill in some options boilerplate
  sequencer: drop repository argument from run_git_commit()
  push: drop unused repo argument to do_push()
  assert PARSE_OPT_NONEG in parse-options callbacks
  env--helper: write to opt->value in parseopt helper
  drop unused argc parameters
  convert: drop unused crlf_action from check_global_conv_flags_eol()
2020-10-05 14:01:52 -07:00
Junio C Hamano
34415c76c8 Merge branch 'so/combine-diff-simplify'
Code simplification.

* so/combine-diff-simplify:
  diff: get rid of redundant 'dense' argument
2020-10-05 14:01:51 -07:00
Junio C Hamano
58138d3f26 Merge branch 'js/default-branch-name-part-2'
Update the tests to drop word 'master' from them.

* js/default-branch-name-part-2:
  t9902: avoid using the branch name `master`
  tests: avoid variations of the `master` branch name
  t3200: avoid variations of the `master` branch name
  fast-export: avoid using unnecessary language in a code comment
  t/test-terminal: avoid non-inclusive language
2020-10-05 14:01:50 -07:00
Junio C Hamano
2fa8aacc72 Merge branch 'jk/shortlog-group-by-trailer'
"git shortlog" has been taught to group commits by the contents of
the trailer lines, like "Reviewed-by:", "Coauthored-by:", etc.

* jk/shortlog-group-by-trailer:
  shortlog: allow multiple groups to be specified
  shortlog: parse trailer idents
  shortlog: rename parse_stdin_ident()
  shortlog: de-duplicate trailer values
  shortlog: match commit trailers with --group
  trailer: add interface for iterating over commit trailers
  shortlog: add grouping option
  shortlog: change "author" variables to "ident"
2020-10-04 12:49:14 -07:00
Junio C Hamano
f4cc68cbd0 Merge branch 'mr/bisect-in-c-2'
Rewrite of the "git bisect" script in C continues.

* mr/bisect-in-c-2:
  bisect--helper: reimplement `bisect_next` and `bisect_auto_next` shell functions in C
  bisect: call 'clear_commit_marks_all()' in 'bisect_next_all()'
  bisect--helper: reimplement `bisect_autostart` shell function in C
  bisect--helper: introduce new `write_in_file()` function
  bisect--helper: use '-res' in 'cmd_bisect__helper' return
  bisect--helper: BUG() in cmd_*() on invalid subcommand
2020-10-04 12:49:08 -07:00
Junio C Hamano
03a01824a4 Merge branch 'cc/bisect-start-fix'
"git bisect start X Y", when X and Y are not valid committish
object names, should take X and Y as pathspec, but didn't.

* cc/bisect-start-fix:
  bisect: don't use invalid oid as rev when starting
2020-10-04 12:49:08 -07:00
Junio C Hamano
230ff3e997 Merge branch 'jc/blame-ignore-fix'
"git blame --ignore-rev/--ignore-revs-file" failed to validate
their input are valid revision, and failed to take into account
that the user may want to give an annotated tag instead of a
commit, which has been corrected.

* jc/blame-ignore-fix:
  blame: validate and peel the object names on the ignore list
  t8013: minimum preparatory clean-up
2020-10-04 12:49:07 -07:00
Junio C Hamano
86cca370e1 Merge branch 'jk/drop-unaligned-loads'
Compilation fix around type punning.

* jk/drop-unaligned-loads:
  Revert "fast-export: use local array to store anonymized oid"
  bswap.h: drop unaligned loads
2020-10-04 12:49:06 -07:00
Srinidhi Kaushik
3b990aa645 push: parse and set flag for "--force-if-includes"
The previous commit added the necessary machinery to implement the
"--force-if-includes" protection, when "--force-with-lease" is used
without giving exact object the remote still ought to have. Surface
the feature by adding a command line option and a configuration
variable to enable it.

 - Add a flag: "TRANSPORT_PUSH_FORCE_IF_INCLUDES" to indicate that the
   new option was passed from the command line of via configuration
   settings; update command line and configuration parsers to set the
   new flag accordingly.

 - Introduce a new configuration option "push.useForceIfIncludes", which
   is equivalent to setting "--force-if-includes" in the command line.

 - Update "remote-curl" to recognize and pass this option to "send-pack"
   when enabled.

 - Update "advise" to catch the reject reason "REJECT_REF_NEEDS_UPDATE",
   set when the ref status is "REF_STATUS_REJECT_REMOTE_UPDATED" and
   (optionally) print a help message when the push fails.

 - The new option is a "no-op" in the following scenarios:
    * When used without "--force-with-lease".
    * When used with "--force-with-lease", and if the expected commit
      on the remote side is specified as an argument.

Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-03 09:59:19 -07:00
Srinidhi Kaushik
99a1f9ae10 push: add reflog check for "--force-if-includes"
Add a check to verify if the remote-tracking ref of the local branch
is reachable from one of its "reflog" entries.

The check iterates through the local ref's reflog to see if there
is an entry for the remote-tracking ref and collecting any commits
that are seen, into a list; the iteration stops if an entry in the
reflog matches the remote ref or if the entry timestamp is older
the latest entry of the remote ref's "reflog". If there wasn't an
entry found for the remote ref, "in_merge_bases_many()" is called
to check if it is reachable from the list of collected commits.

When a local branch that is based on a remote ref, has been rewound
and is to be force pushed on the remote, "--force-if-includes" runs
a check that ensures any updates to the remote-tracking ref that may
have happened (by push from another repository) in-between the time
of the last update to the local branch (via "git-pull", for instance)
and right before the time of push, have been integrated locally
before allowing a forced update.

If the new option is passed without specifying "--force-with-lease",
or specified along with "--force-with-lease=<refname>:<expect>" it
is a "no-op".

Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-03 09:59:18 -07:00
Jacob Keller
7efba5fa39 format-patch: teach format.useAutoBase "whenAble" option
The format.useAutoBase configuration option exists to allow users to
enable '--base=auto' for format-patch by default.

This can sometimes lead to poor workflow, due to unexpected failures
when attempting to format an ancient patch:

    $ git format-patch -1 <an old commit>
    fatal: base commit shouldn't be in revision list

This can be very confusing, as it is not necessarily immediately obvious
that the user requested a --base (since this was in the configuration,
not on the command line).

We do want --base=auto to fail when it cannot provide a suitable base,
as it would be equally confusing if a formatted patch did not include
the base information when it was requested.

Teach format.useAutoBase a new mode, "whenAble". This mode will cause
format-patch to attempt to include a base commit when it can. However,
if no valid base commit can be found, then format-patch will continue
formatting the patch without a base commit.

In order to avoid making yet another branch name unusable with --base,
do not teach --base=whenAble or --base=whenable.

Instead, refactor the base_commit option to use a callback, and rely on
the global configuration variable auto_base.

This does mean that a user cannot request this optional base commit
generation from the command line. However, this is likely not too
valuable. If the user requests base information manually, they will be
immediately informed of the failure to acquire a suitable base commit.
This allows the user to make an informed choice about whether to
continue the format.

Add tests to cover the new mode of operation for --base.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-01 15:22:10 -07:00
Sean Barag
de9ed3ef37 clone: allow configurable default for -o/--origin
While the default remote name of "origin" can be changed at clone-time
with `git clone`'s `--origin` option, it was previously not possible
to specify a default value for the name of that remote.  Add support for
a new `clone.defaultRemoteName` config, with the newly-created remote
name resolved in priority order:

1. (Highest priority) A remote name passed directly to `git clone -o`
2. A `clone.defaultRemoteName=new_name` in config `git clone -c`
3. A `clone.defaultRemoteName` value set in `/path/to/template/config`,
   where `--template=/path/to/template` is provided
4. A `clone.defaultRemoteName` value set in a non-template config file
5. The default value of `origin`

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Andrei Rybak <rybak.a.v@gmail.com>
Signed-off-by: Sean Barag <sean@barag.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 22:09:13 -07:00
Sean Barag
75ca3906b1 clone: read new remote name from remote_name instead of option_origin
In a future patch, the name of the remote created by `git clone` may
come from multiple sources.  To avoid confusion, convert most uses of
option_origin to remote_name, leaving option_origin to exclusively
represent the -o/--origin option.

Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Sean Barag <sean@barag.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 22:09:13 -07:00
Sean Barag
ebe7e28a36 clone: validate --origin option before use
Providing a bad origin name to `git clone` currently reports an
'invalid refspec' error instead of a more explicit message explaining
that the `--origin` option was malformed.  This behavior dates back to
since 8434c2f1 (Build in clone, 2008-04-27).  Reintroduce
validation for the provided `--origin` option, but notably _don't_
include a multi-level check (e.g. "foo/bar") that was present in the
original `git-clone.sh`.  `git remote` allows multi-level remote names
since at least 46220ca100 (remote.c: Fix overtight refspec validation,
2008-03-20), so that appears to be the desired behavior.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Derrick Stolee <stolee@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Sean Barag <sean@barag.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 22:09:13 -07:00
Sean Barag
f2c6fda886 refs: consolidate remote name validation
In preparation for a future patch, extract from remote.c a function that
validates possible remote names so that its rules can be used
consistently in other places.

Helped-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Sean Barag <sean@barag.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 22:09:13 -07:00
Sean Barag
552955ed7f clone: use more conventional config/option layering
Parsing command-line options before reading from config required careful
handling to ensure CLI options were treated with higher priority.  Read
config first to let parsed CLI naively overwrite matching config values.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Sean Barag <sean@barag.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 22:09:13 -07:00
Jacob Keller
c0192df630 refspec: add support for negative refspecs
Both fetch and push support pattern refspecs which allow fetching or
pushing references that match a specific pattern. Because these patterns
are globs, they have somewhat limited ability to express more complex
situations.

For example, suppose you wish to fetch all branches from a remote except
for a specific one. To allow this, you must setup a set of refspecs
which match only the branches you want. Because refspecs are either
explicit name matches, or simple globs, many patterns cannot be
expressed.

Add support for a new type of refspec, referred to as "negative"
refspecs. These are prefixed with a '^' and mean "exclude any ref
matching this refspec". They can only have one "side" which always
refers to the source. During a fetch, this refers to the name of the ref
on the remote. During a push, this refers to the name of the ref on the
local side.

With negative refspecs, users can express more complex patterns. For
example:

 git fetch origin refs/heads/*:refs/remotes/origin/* ^refs/heads/dontwant

will fetch all branches on origin into remotes/origin, but will exclude
fetching the branch named dontwant.

Refspecs today are commutative, meaning that order doesn't expressly
matter. Rather than forcing an implied order, negative refspecs will
always be applied last. That is, in order to match, a ref must match at
least one positive refspec, and match none of the negative refspecs.
This is similar to how negative pathspecs work.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 14:52:00 -07:00
Jeff King
75d3bee157 sparse-checkout: fill in some options boilerplate
The sparse-checkout passes along argv and argc to its sub-command helper
functions. Many of these sub-commands do not yet take any command-line
options, and ignore those parameters.

Let's instead add empty option lists and make sure we call
parse_options(). That will give a useful error message for something
like:

  git sparse-checkout list --nonsense

which currently just silently ignores the unknown option.

As a bonus, it also silences some -Wunused-parameter warnings.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 12:53:48 -07:00
Jeff King
5b9427e0ac push: drop unused repo argument to do_push()
We stopped using the "repo" argument in 8e4c8af058 (push: disallow --all
and refspecs when remote.<name>.mirror is set, 2019-09-02), which moved
the pushremote handling to its caller.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 12:53:47 -07:00
Jeff King
8d2aa8dfac assert PARSE_OPT_NONEG in parse-options callbacks
In the spirit of 517fe807d6 (assert NOARG/NONEG behavior of
parse-options callbacks, 2018-11-05), let's cover some parse-options
callbacks which expect to be used with PARSE_OPT_NONEG but don't
explicitly assert that this is the case. These callbacks are all used
correctly in the current code, but this will help document their
expectations and future-proof the code.

As a bonus, it also silences -Wunused-parameters (these were added since
the initial sweep of 517fe807d6, and we can't yet turn on
-Wunused-parameters to remind people because it has too many existing
false positives).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 12:53:47 -07:00
Jeff King
424e28fcad env--helper: write to opt->value in parseopt helper
We use OPT_CALLBACK_F() to call the option_parse_type() callback,
passing it the address of "cmdmode" as the value to write to. But the
callback doesn't look at opt->value at all, and instead writes to a
global variable.

This works out because that's the same global variable we happen to pass
in, but it's rather confusing.  Let's use the passed-in value instead.
We'll also make "cmdmode" a local variable of the main function,
ensuring we can't make the same mistake again.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 12:53:47 -07:00
Jeff King
e885a84f1b drop unused argc parameters
Many functions take an argv/argc pair, but never actually look at argc.
This makes it useless at best (we use the NULL sentinel in argv to find
the end of the array), and misleading at worst (what happens if the argc
count does not match the argv NULL?).

In each of these instances, the argv NULL does match the argc count, so
there are no bugs here. But let's tighten the interfaces to make it
harder to get wrong (and to reduce some -Wunused-parameter complaints).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30 12:53:47 -07:00
Junio C Hamano
299deeac8a Merge branch 'ah/pull'
Earlier we taught "git pull" to warn when the user does not say the
histories need to be merged, rebased or accepts only fast-
forwarding, but the warning triggered for those who have set the
pull.ff configuration variable.

* ah/pull:
  pull: don't warn if pull.ff has been set
2020-09-29 14:01:22 -07:00
Junio C Hamano
b28919c7bc Merge branch 'bc/clone-with-git-default-hash-fix'
"git clone" that clones from SHA-1 repository, while
GIT_DEFAULT_HASH set to use SHA-256 already, resulted in an
unusable repository that half-claims to be SHA-256 repository
with SHA-1 objects and refs.  This has been corrected.

* bc/clone-with-git-default-hash-fix:
  builtin/clone: avoid failure with GIT_DEFAULT_HASH
2020-09-29 14:01:20 -07:00
Junio C Hamano
288ed98bf7 Merge branch 'tb/bloom-improvements'
"git commit-graph write" learned to limit the number of bloom
filters that are computed from scratch with the --max-new-filters
option.

* tb/bloom-improvements:
  commit-graph: introduce 'commitGraph.maxNewFilters'
  builtin/commit-graph.c: introduce '--max-new-filters=<n>'
  commit-graph: rename 'split_commit_graph_opts'
  bloom: encode out-of-bounds filters as non-empty
  bloom/diff: properly short-circuit on max_changes
  bloom: use provided 'struct bloom_filter_settings'
  bloom: split 'get_bloom_filter()' in two
  commit-graph.c: store maximum changed paths
  commit-graph: respect 'commitGraph.readChangedPaths'
  t/helper/test-read-graph.c: prepare repo settings
  commit-graph: pass a 'struct repository *' in more places
  t4216: use an '&&'-chain
  commit-graph: introduce 'get_bloom_filter_settings()'
2020-09-29 14:01:20 -07:00
Sergey Organov
d01141de5a diff: get rid of redundant 'dense' argument
Get rid of 'dense' argument that is redundant for every function that has
'struct rev_info *rev' argument as well, as the value of 'dense' passed is
always taken from 'rev->dense_combined_merges' field.

The only place where this was not the case is in 'submodule.c' where
'diff_tree_combined_merge()' was called with '1' for 'dense' argument. However,
at that call the 'revs' instance used is local to the function, and we now just
set 'revs->dense_combined_merges' to 1 in this local instance.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-29 11:54:53 -07:00
Jonathan Tan
23547c4051 fetch: do not override partial clone filter
When a fetch with the --filter argument is made, the configured default
filter is set even if one already exists. This change was made in
5e46139376 ("builtin/fetch: remove unique promisor remote limitation",
2019-06-25) - in particular, changing from:

 * If this is the FIRST partial-fetch request, we enable partial
 * on this repo and remember the given filter-spec as the default
 * for subsequent fetches to this remote.

to:

 * If this is a partial-fetch request, we enable partial on
 * this repo if not already enabled and remember the given
 * filter-spec as the default for subsequent fetches to this
 * remote.

(The given filter-spec is "remembered" even if there is already an
existing one.)

This is problematic whenever a lazy fetch is made, because lazy fetches
are made using "git fetch --filter=blob:none", but this will also happen
if the user invokes "git fetch --filter=<filter>" manually. Therefore,
restore the behavior prior to 5e46139376, which writes a filter-spec
only if the current fetch request is the first partial-fetch one (for
that remote).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-28 16:11:59 -07:00
Jeff King
63d24fa0b0 shortlog: allow multiple groups to be specified
Now that shortlog supports reading from trailers, it can be useful to
combine counts from multiple trailers, or between trailers and authors.
This can be done manually by post-processing the output from multiple
runs, but it's non-trivial to make sure that each name/commit pair is
counted only once.

This patch teaches shortlog to accept multiple --group options on the
command line, and pull data from all of them. That makes it possible to
run:

  git shortlog -ns --group=author --group=trailer:co-authored-by

to get a shortlog that counts authors and co-authors equally.

The implementation is mostly straightforward. The "group" enum becomes a
bitfield, and the trailer key becomes a list. I didn't bother
implementing the multi-group semantics for reading from stdin. It would
be possible to do, but the existing matching code makes it awkward, and
I doubt anybody cares.

The duplicate suppression we used for trailers now covers authors and
committers as well (though in non-trailer single-group mode we can skip
the hash insertion and lookup, since we only see one value per commit).

There is one subtlety: we now care about the case when no group bit is
set (in which case we default to showing the author). The caller in
builtin/log.c needs to be adapted to ask explicitly for authors, rather
than relying on shortlog_init(). It would be possible with some
gymnastics to make this keep working as-is, but it's not worth it for a
single caller.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27 12:21:05 -07:00
Jeff King
56d5dde752 shortlog: parse trailer idents
Trailers don't necessarily contain name/email identity values, so
shortlog has so far treated them as opaque strings. However, since many
trailers do contain identities, it's useful to treat them as such when
they can be parsed. That lets "-e" work as usual, as well as mailmap.

When they can't be parsed, we'll continue with the old behavior of
treating them as a single string (there's no new test for that here,
since the existing tests cover a trailer like this).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27 12:21:05 -07:00
Jeff King
87abb96222 shortlog: rename parse_stdin_ident()
This function is actually useful for parsing any identity, whether from
stdin or not. We'll need it for handling trailers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27 12:21:05 -07:00
Jeff King
f17b0b99bf shortlog: de-duplicate trailer values
The current documentation is vague about what happens with
--group=trailer:signed-off-by when we see a commit with:

  Signed-off-by: One
  Signed-off-by: Two
  Signed-off-by: One

We clearly should credit both "One" and "Two", but should "One" get
credited twice? The current code does so, but mostly because that was
the easiest thing to do. It's probably more useful to count each commit
at most once. This will become especially important when we allow
values from multiple sources in a future patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27 12:21:05 -07:00
Jeff King
47beb37bc6 shortlog: match commit trailers with --group
If a project uses commit trailers, this patch lets you use
shortlog to see who is performing each action. For example,
running:

  git shortlog -ns --group=trailer:reviewed-by

in git.git shows who has reviewed. You can even use a custom
format to see things like who has helped whom:

  git shortlog --format="...helped %an (%ad)" \
               --group=trailer:helped-by

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27 12:21:05 -07:00
Jeff King
92338c450b shortlog: add grouping option
In preparation for adding more grouping types, let's refactor the
committer/author grouping code and add a user-facing option that binds
them together. In particular:

  - the main option is now "--group", to make it clear
    that the various group types are mutually exclusive. The
    "--committer" option is an alias for "--group=committer".

  - we keep an enum rather than a binary flag, to prepare
    for more values

  - we prefer switch statements to ternary assignment, since
    other group types will need more custom code

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27 12:21:05 -07:00
Junio C Hamano
6c430a647c Merge branch 'jx/proc-receive-hook'
"git receive-pack" that accepts requests by "git push" learned to
outsource most of the ref updates to the new "proc-receive" hook.

* jx/proc-receive-hook:
  doc: add documentation for the proc-receive hook
  transport: parse report options for tracking refs
  t5411: test updates of remote-tracking branches
  receive-pack: new config receive.procReceiveRefs
  doc: add document for capability report-status-v2
  New capability "report-status-v2" for git-push
  receive-pack: feed report options to post-receive
  receive-pack: add new proc-receive hook
  t5411: add basic test cases for proc-receive hook
  transport: not report a non-head push as a branch
2020-09-25 15:25:39 -07:00
Junio C Hamano
48794acc50 Merge branch 'ds/maintenance-part-1'
A "git gc"'s big brother has been introduced to take care of more
repository maintenance tasks, not limited to the object database
cleaning.

* ds/maintenance-part-1:
  maintenance: add trace2 regions for task execution
  maintenance: add auto condition for commit-graph task
  maintenance: use pointers to check --auto
  maintenance: create maintenance.<task>.enabled config
  maintenance: take a lock on the objects directory
  maintenance: add --task option
  maintenance: add commit-graph task
  maintenance: initialize task array
  maintenance: replace run_auto_gc()
  maintenance: add --quiet option
  maintenance: create basic maintenance runner
2020-09-25 15:25:38 -07:00
Derrick Stolee
2fec604f8d maintenance: add start/stop subcommands
Add new subcommands to 'git maintenance' that start or stop background
maintenance using 'cron', when available. This integration is as simple
as I could make it, barring some implementation complications.

The schedule is laid out as follows:

  0 1-23 * * *   $cmd maintenance run --schedule=hourly
  0 0    * * 1-6 $cmd maintenance run --schedule=daily
  0 0    * * 0   $cmd maintenance run --schedule=weekly

where $cmd is a properly-qualified 'git for-each-repo' execution:

$cmd=$path/git --exec-path=$path for-each-repo --config=maintenance.repo

where $path points to the location of the Git executable running 'git
maintenance start'. This is critical for systems with multiple versions
of Git. Specifically, macOS has a system version at '/usr/bin/git' while
the version that users can install resides at '/usr/local/bin/git'
(symlinked to '/usr/local/libexec/git-core/git'). This will also use
your locally-built version if you build and run this in your development
environment without installing first.

This conditional schedule avoids having cron launch multiple 'git
for-each-repo' commands in parallel. Such parallel commands would likely
lead to the 'hourly' and 'daily' tasks competing over the object
database lock. This could lead to to some tasks never being run! Since
the --schedule=<frequency> argument will run all tasks with _at least_
the given frequency, the daily runs will also run the hourly tasks.
Similarly, the weekly runs will also run the daily and hourly tasks.

The GIT_TEST_CRONTAB environment variable is not intended for users to
edit, but instead as a way to mock the 'crontab [-l]' command. This
variable is set in test-lib.sh to avoid a future test from accidentally
running anything with the cron integration from modifying the user's
schedule. We use GIT_TEST_CRONTAB='test-tool crontab <file>' in our
tests to check how the schedule is modified in 'git maintenance
(start|stop)' commands.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:59:44 -07:00
Derrick Stolee
0c18b70081 maintenance: add [un]register subcommands
In preparation for launching background maintenance from the 'git
maintenance' builtin, create register/unregister subcommands. These
commands update the new 'maintenance.repos' config option in the global
config so the background maintenance job knows which repositories to
maintain.

These commands allow users to add a repository to the background
maintenance list without disrupting the actual maintenance mechanism.

For example, a user can run 'git maintenance register' when no
background maintenance is running and it will not start the background
maintenance. A later update to start running background maintenance will
then pick up this repository automatically.

The opposite example is that a user can run 'git maintenance unregister'
to remove the current repository from background maintenance without
halting maintenance for other repositories.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:59:44 -07:00
Derrick Stolee
4950b2a2b5 for-each-repo: run subcommands on configured repos
It can be helpful to store a list of repositories in global or system
config and then iterate Git commands on that list. Create a new builtin
that makes this process simple for experts. We will use this builtin to
run scheduled maintenance on all configured repositories in a future
change.

The test is very simple, but does highlight that the "--" argument is
optional.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:59:44 -07:00
Derrick Stolee
b08ff1fee0 maintenance: add --schedule option and config
Maintenance currently triggers when certain data-size thresholds are
met, such as number of pack-files or loose objects. Users may want to
run certain maintenance tasks based on frequency instead. For example,
a user may want to perform a 'prefetch' task every hour, or 'gc' task
every day. To help these users, update the 'git maintenance run' command
to include a '--schedule=<frequency>' option. The allowed frequencies
are 'hourly', 'daily', and 'weekly'. These values are also allowed in a
new config value 'maintenance.<task>.schedule'.

The 'git maintenance run --schedule=<frequency>' checks the '*.schedule'
config value for each enabled task to see if the configured frequency is
at least as frequent as the frequency from the '--schedule' argument. We
use the following order, for full clarity:

	'hourly' > 'daily' > 'weekly'

Use new 'enum schedule_priority' to track these values numerically.

The following cron table would run the scheduled tasks with the correct
frequencies:

  0 1-23 * * *    git -C <repo> maintenance run --schedule=hourly
  0 0    * * 1-6  git -C <repo> maintenance run --schedule=daily
  0 0    * * 0    git -C <repo> maintenance run --schedule=weekly

This cron schedule will run --schedule=hourly every hour except at
midnight. This avoids a concurrent run with the --schedule=daily that
runs at midnight every day except the first day of the week. This avoids
a concurrent run with the --schedule=weekly that runs at midnight on
the first day of the week. Since --schedule=daily also runs the
'hourly' tasks and --schedule=weekly runs the 'hourly' and 'daily'
tasks, we will still see all tasks run with the proper frequencies.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:59:44 -07:00
Derrick Stolee
e841a79a13 maintenance: add incremental-repack auto condition
The incremental-repack task updates the multi-pack-index by deleting pack-
files that have been replaced with new packs, then repacking a batch of
small pack-files into a larger pack-file. This incremental repack is faster
than rewriting all object data, but is slower than some other
maintenance activities.

The 'maintenance.incremental-repack.auto' config option specifies how many
pack-files should exist outside of the multi-pack-index before running
the step. These pack-files could be created by 'git fetch' commands or
by the loose-objects task. The default value is 10.

Setting the option to zero disables the task with the '--auto' option,
and a negative value makes the task run every time.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:53:05 -07:00
Derrick Stolee
a13e3d0ec8 maintenance: auto-size incremental-repack batch
When repacking during the 'incremental-repack' task, we use the
--batch-size option in 'git multi-pack-index repack'. The initial setting
used --batch-size=0 to repack everything into a single pack-file. This is
not sustainable for a large repository. The amount of work required is
also likely to use too many system resources for a background job.

Update the 'incremental-repack' task by dynamically computing a
--batch-size option based on the current pack-file structure.

The dynamic default size is computed with this idea in mind for a client
repository that was cloned from a very large remote: there is likely one
"big" pack-file that was created at clone time. Thus, do not try
repacking it as it is likely packed efficiently by the server.

Instead, we select the second-largest pack-file, and create a batch size
that is one larger than that pack-file. If there are three or more
pack-files, then this guarantees that at least two will be combined into
a new pack-file.

Of course, this means that the second-largest pack-file size is likely
to grow over time and may eventually surpass the initially-cloned
pack-file. Recall that the pack-file batch is selected in a greedy
manner: the packs are considered from oldest to newest and are selected
if they have size smaller than the batch size until the total selected
size is larger than the batch size. Thus, that oldest "clone" pack will
be first to repack after the new data creates a pack larger than that.

We also want to place some limits on how large these pack-files become,
in order to bound the amount of time spent repacking. A maximum
batch-size of two gigabytes means that large repositories will never be
packed into a single pack-file using this job, but also that repack is
rather expensive. This is a trade-off that is valuable to have if the
maintenance is being run automatically or in the background. Users who
truly want to optimize for space and performance (and are willing to pay
the upfront cost of a full repack) can use the 'gc' task to do so.

Create a test for this two gigabyte limit by creating an EXPENSIVE test
that generates two pack-files of roughly 2.5 gigabytes in size, then
performs an incremental repack. Check that the --batch-size argument in
the subcommand uses the hard-coded maximum.

Helped-by: Chris Torek <chris.torek@gmail.com>
Reported-by: Son Luong Ngoc <sluongng@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:53:05 -07:00
Derrick Stolee
52fe41ff1c maintenance: add incremental-repack task
The previous change cleaned up loose objects using the
'loose-objects' that can be run safely in the background. Add a
similar job that performs similar cleanups for pack-files.

One issue with running 'git repack' is that it is designed to
repack all pack-files into a single pack-file. While this is the
most space-efficient way to store object data, it is not time or
memory efficient. This becomes extremely important if the repo is
so large that a user struggles to store two copies of the pack on
their disk.

Instead, perform an "incremental" repack by collecting a few small
pack-files into a new pack-file. The multi-pack-index facilitates
this process ever since 'git multi-pack-index expire' was added in
19575c7 (multi-pack-index: implement 'expire' subcommand,
2019-06-10) and 'git multi-pack-index repack' was added in ce1e4a1
(midx: implement midx_repack(), 2019-06-10).

The 'incremental-repack' task runs the following steps:

1. 'git multi-pack-index write' creates a multi-pack-index file if
   one did not exist, and otherwise will update the multi-pack-index
   with any new pack-files that appeared since the last write. This
   is particularly relevant with the background fetch job.

   When the multi-pack-index sees two copies of the same object, it
   stores the offset data into the newer pack-file. This means that
   some old pack-files could become "unreferenced" which I will use
   to mean "a pack-file that is in the pack-file list of the
   multi-pack-index but none of the objects in the multi-pack-index
   reference a location inside that pack-file."

2. 'git multi-pack-index expire' deletes any unreferenced pack-files
   and updaes the multi-pack-index to drop those pack-files from the
   list. This is safe to do as concurrent Git processes will see the
   multi-pack-index and not open those packs when looking for object
   contents. (Similar to the 'loose-objects' job, there are some Git
   commands that open pack-files regardless of the multi-pack-index,
   but they are rarely used. Further, a user that self-selects to
   use background operations would likely refrain from using those
   commands.)

3. 'git multi-pack-index repack --bacth-size=<size>' collects a set
   of pack-files that are listed in the multi-pack-index and creates
   a new pack-file containing the objects whose offsets are listed
   by the multi-pack-index to be in those objects. The set of pack-
   files is selected greedily by sorting the pack-files by modified
   time and adding a pack-file to the set if its "expected size" is
   smaller than the batch size until the total expected size of the
   selected pack-files is at least the batch size. The "expected
   size" is calculated by taking the size of the pack-file divided
   by the number of objects in the pack-file and multiplied by the
   number of objects from the multi-pack-index with offset in that
   pack-file. The expected size approximates how much data from that
   pack-file will contribute to the resulting pack-file size. The
   intention is that the resulting pack-file will be close in size
   to the provided batch size.

   The next run of the incremental-repack task will delete these
   repacked pack-files during the 'expire' step.

   In this version, the batch size is set to "0" which ignores the
   size restrictions when selecting the pack-files. It instead
   selects all pack-files and repacks all packed objects into a
   single pack-file. This will be updated in the next change, but
   it requires doing some calculations that are better isolated to
   a separate change.

These steps are based on a similar background maintenance step in
Scalar (and VFS for Git) [1]. This was incredibly effective for
users of the Windows OS repository. After using the same VFS for Git
repository for over a year, some users had _thousands_ of pack-files
that combined to up to 250 GB of data. We noticed a few users were
running into the open file descriptor limits (due in part to a bug
in the multi-pack-index fixed by af96fe3 (midx: add packs to
packed_git linked list, 2019-04-29).

These pack-files were mostly small since they contained the commits
and trees that were pushed to the origin in a given hour. The GVFS
protocol includes a "prefetch" step that asks for pre-computed pack-
files containing commits and trees by timestamp. These pack-files
were grouped into "daily" pack-files once a day for up to 30 days.
If a user did not request prefetch packs for over 30 days, then they
would get the entire history of commits and trees in a new, large
pack-file. This led to a large number of pack-files that had poor
delta compression.

By running this pack-file maintenance step once per day, these repos
with thousands of packs spanning 200+ GB dropped to dozens of pack-
files spanning 30-50 GB. This was done all without removing objects
from the system and using a constant batch size of two gigabytes.
Once the work was done to reduce the pack-files to small sizes, the
batch size of two gigabytes means that not every run triggers a
repack operation, so the following run will not expire a pack-file.
This has kept these repos in a "clean" state.

[1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/PackfileMaintenanceStep.cs

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:53:04 -07:00
Derrick Stolee
3e220e6069 maintenance: create auto condition for loose-objects
The loose-objects task deletes loose objects that already exist in a
pack-file, then place the remaining loose objects into a new pack-file.
If this step runs all the time, then we risk creating pack-files with
very few objects with every 'git commit' process. To prevent
overwhelming the packs directory with small pack-files, place a minimum
number of objects to justify the task.

The 'maintenance.loose-objects.auto' config option specifies a minimum
number of loose objects to justify the task to run under the '--auto'
option. This defaults to 100 loose objects. Setting the value to zero
will prevent the step from running under '--auto' while a negative value
will force it to run every time.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:53:04 -07:00
Derrick Stolee
252cfb7cb8 maintenance: add loose-objects task
One goal of background maintenance jobs is to allow a user to
disable auto-gc (gc.auto=0) but keep their repository in a clean
state. Without any cleanup, loose objects will clutter the object
database and slow operations. In addition, the loose objects will
take up extra space because they are not stored with deltas against
similar objects.

Create a 'loose-objects' task for the 'git maintenance run' command.
This helps clean up loose objects without disrupting concurrent Git
commands using the following sequence of events:

1. Run 'git prune-packed' to delete any loose objects that exist
   in a pack-file. Concurrent commands will prefer the packed
   version of the object to the loose version. (Of course, there
   are exceptions for commands that specifically care about the
   location of an object. These are rare for a user to run on
   purpose, and we hope a user that has selected background
   maintenance will not be trying to do foreground maintenance.)

2. Run 'git pack-objects' on a batch of loose objects. These
   objects are grouped by scanning the loose object directories in
   lexicographic order until listing all loose objects -or-
   reaching 50,000 objects. This is more than enough if the loose
   objects are created only by a user doing normal development.
   We noticed users with _millions_ of loose objects because VFS
   for Git downloads blobs on-demand when a file read operation
   requires populating a virtual file.

This step is based on a similar step in Scalar [1] and VFS for Git.
[1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/LooseObjectsStep.cs

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:53:04 -07:00
Derrick Stolee
28cb5e66dd maintenance: add prefetch task
When working with very large repositories, an incremental 'git fetch'
command can download a large amount of data. If there are many other
users pushing to a common repo, then this data can rival the initial
pack-file size of a 'git clone' of a medium-size repo.

Users may want to keep the data on their local repos as close as
possible to the data on the remote repos by fetching periodically in
the background. This can break up a large daily fetch into several
smaller hourly fetches.

The task is called "prefetch" because it is work done in advance
of a foreground fetch to make that 'git fetch' command much faster.

However, if we simply ran 'git fetch <remote>' in the background,
then the user running a foreground 'git fetch <remote>' would lose
some important feedback when a new branch appears or an existing
branch updates. This is especially true if a remote branch is
force-updated and this isn't noticed by the user because it occurred
in the background. Further, the functionality of 'git push
--force-with-lease' becomes suspect.

When running 'git fetch <remote> <options>' in the background, use
the following options for careful updating:

1. --no-tags prevents getting a new tag when a user wants to see
   the new tags appear in their foreground fetches.

2. --refmap= removes the configured refspec which usually updates
   refs/remotes/<remote>/* with the refs advertised by the remote.
   While this looks confusing, this was documented and tested by
   b40a50264a (fetch: document and test --refmap="", 2020-01-21),
   including this sentence in the documentation:

	Providing an empty `<refspec>` to the `--refmap` option
	causes Git to ignore the configured refspecs and rely
	entirely on the refspecs supplied as command-line arguments.

3. By adding a new refspec "+refs/heads/*:refs/prefetch/<remote>/*"
   we can ensure that we actually load the new values somewhere in
   our refspace while not updating refs/heads or refs/remotes. By
   storing these refs here, the commit-graph job will update the
   commit-graph with the commits from these hidden refs.

4. --prune will delete the refs/prefetch/<remote> refs that no
   longer appear on the remote.

5. --no-write-fetch-head prevents updating FETCH_HEAD.

We've been using this step as a critical background job in Scalar
[1] (and VFS for Git). This solved a pain point that was showing up
in user reports: fetching was a pain! Users do not like waiting to
download the data that was created while they were away from their
machines. After implementing background fetch, the foreground fetch
commands sped up significantly because they mostly just update refs
and download a small amount of new data. The effect is especially
dramatic when paried with --no-show-forced-udpates (through
fetch.showForcedUpdates=false).

[1] https://github.com/microsoft/scalar/blob/master/Scalar.Common/Maintenance/FetchStep.cs

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:53:04 -07:00
Jeff King
45d93eb824 shortlog: change "author" variables to "ident"
We already match "committer", and we're about to start
matching more things. Let's use a more neutral variable to
avoid confusion.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 10:47:50 -07:00
Christian Couder
73c6de06af bisect: don't use invalid oid as rev when starting
In 06f5608c14 (bisect--helper: `bisect_start` shell function
partially in C, 2019-01-02), we changed the following shell
code:

-      rev=$(git rev-parse -q --verify "$arg^{commit}") || {
-              test $has_double_dash -eq 1 &&
-              die "$(eval_gettext "'\$arg' does not appear to be a valid revision")"
-              break
-      }
-      revs="$revs $rev"

into:

+      char *commit_id = xstrfmt("%s^{commit}", arg);
+      if (get_oid(commit_id, &oid) && has_double_dash)
+              die(_("'%s' does not appear to be a valid "
+                    "revision"), arg);
+
+      string_list_append(&revs, oid_to_hex(&oid));
+      free(commit_id);

In case of an invalid "arg" when "has_double_dash" is false, the old
code would "break" out of the argument loop.

In the new C code though, `oid_to_hex(&oid)` is unconditonally
appended to "revs". This is wrong first because "oid" is junk as
`get_oid(commit_id, &oid)` failed and second because it doesn't break
out of the argument loop.

Not breaking out of the argument loop means that "arg" is then not
treated as a path restriction (which is wrong).

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25 09:57:48 -07:00
Alex Henrie
54200cef86 pull: don't warn if pull.ff has been set
A user who understands enough to set pull.ff does not need additional
instructions.

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 23:04:27 -07:00
Junio C Hamano
610e2b9240 blame: validate and peel the object names on the ignore list
The command reads list of object names to place on the ignore list
either from the command line or from a file, but they are not
checked with their object type (those read from the file are not
even checked for object existence).

Extend the oidset_parse_file() API and allow it to take a callback
that can be used to die (e.g. when an inappropriate input is read)
or modify the object name read (e.g. when a tag pointing at a commit
is read, and the caller wants a commit object name), and use it in
the code that handles ignore list.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 22:20:58 -07:00
Jeff King
176380fd11 Revert "fast-export: use local array to store anonymized oid"
This reverts commit f39ad38410.

That commit was trying to silence a type-punning warning on older
versions of gcc. However, its analysis was all wrong. I didn't notice
that we _were_ in fact type-punning because there are two versions of
put_be32(): one that uses casts and unaligned loads, and another that
uses bitshifts. I looked at the latter, but on my platform we were
defaulting to the former.

However, as of the previous commit, we'll always use the bitshift
version. So we can drop this hackery to avoid the warning, making the
code slightly cleaner.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 12:30:11 -07:00
Pranit Bauva
517ecb3161 bisect--helper: reimplement bisect_next and bisect_auto_next shell functions in C
Reimplement the `bisect_next()` and the `bisect_auto_next()` shell functions
in C and add the subcommands to `git bisect--helper` to call them from
git-bisect.sh .

bisect_auto_next() function returns an enum bisect_error type as whole
`git bisect` can exit with an error code when bisect_next() does.

Return an error when `bisect_next()` fails, that fix a bug on shell script
version.

Using `--bisect-next` and `--bisect-auto-next` subcommands is a
temporary measure to port shell function to C so as to use the existing
test suite. As more functions are ported, `--bisect-auto-next`
subcommand will be retired and will be called by some other methods.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 12:06:30 -07:00
Pranit Bauva
09535f056b bisect--helper: reimplement bisect_autostart shell function in C
Reimplement the `bisect_autostart()` shell function in C and add the
C implementation from `bisect_next()` which was previously left
uncovered.

Add `--bisect-autostart` subcommand to be called from git-bisect.sh.
Using `--bisect-autostart` subcommand is a temporary measure to port
the shell function to C so as to use the existing test suite. As more
functions are ported, this subcommand will be retired and
bisect_autostart() will be called directly by `bisect_state()`.

Change behavior of shell script that returned success when user aborted
the bisection.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 12:06:30 -07:00
Junio C Hamano
6854689e65 Merge branch 'ar/fetch-ipversion-in-all'
"git fetch --all --ipv4/--ipv6" forgot to pass the protocol options
to instances of the "git fetch" that talk to individual remotes,
which has been corrected.

* ar/fetch-ipversion-in-all:
  fetch: pass --ipv4 and --ipv6 options to sub-fetches
2020-09-22 12:36:34 -07:00
Junio C Hamano
39149df364 Merge branch 'cs/don-t-pretend-a-failed-remote-set-head-succeeded'
"git remote set-head" that failed still said something that hints
the operation went through, which was misleading.

* cs/don-t-pretend-a-failed-remote-set-head-succeeded:
  remote: don't show success message when set-head fails
2020-09-22 12:36:32 -07:00
Junio C Hamano
26a3728bed Merge branch 'al/ref-filter-merged-and-no-merged'
"git for-each-ref" and friends that list refs used to allow only
one --merged or --no-merged to filter them; they learned to take
combination of both kind of filtering.

* al/ref-filter-merged-and-no-merged:
  Doc: prefer more specific file name
  ref-filter: make internal reachable-filter API more precise
  ref-filter: allow merged and no-merged filters
  Doc: cover multiple contains/no-contains filters
  t3201: test multiple branch filter combinations
2020-09-22 12:36:31 -07:00
Junio C Hamano
4aff18a3f0 Merge branch 'ls/mergetool-meld-auto-merge'
The 'meld' backend of the "git mergetool" learned to give the
underlying 'meld' the '--auto-merge' option, which would help
reduce the amount of text that requires manual merging.

* ls/mergetool-meld-auto-merge:
  mergetool: allow auto-merge for meld to follow the vim-diff behavior
2020-09-22 12:36:29 -07:00
Junio C Hamano
b7e65b51e5 Merge branch 'jt/threaded-index-pack'
"git index-pack" learned to resolve deltified objects with greater
parallelism.

* jt/threaded-index-pack:
  index-pack: make quantum of work smaller
  index-pack: make resolve_delta() assume base data
  index-pack: calculate {ref,ofs}_{first,last} early
  index-pack: remove redundant child field
  index-pack: unify threaded and unthreaded code
  index-pack: remove redundant parameter
  Documentation: deltaBaseCacheLimit is per-thread
2020-09-22 12:36:28 -07:00
Junio C Hamano
634e0084fa Merge branch 'es/format-patch-interdiff-cleanup'
"format-patch --range-diff=<prev> <origin>..HEAD" has been taught
not to ignore <origin> when <prev> is a single version.

* es/format-patch-interdiff-cleanup:
  format-patch: use 'origin' as start of current-series-range when known
  diff-lib: tighten show_interdiff()'s interface
  diff: move show_interdiff() from its own file to diff-lib
2020-09-22 12:36:28 -07:00
Junio C Hamano
bcb68bff80 Merge branch 'os/fetch-submodule-optim'
Optimization around submodule handling.

* os/fetch-submodule-optim:
  fetch: do not look for submodule changes in unchanged refs
2020-09-22 12:36:28 -07:00
brian m. carlson
47ac970309 builtin/clone: avoid failure with GIT_DEFAULT_HASH
If a user is cloning a SHA-1 repository with GIT_DEFAULT_HASH set to
"sha256", then we can end up with a repository where the repository
format version is 0 but the extensions.objectformat key is set to
"sha256".  This is both wrong (the user has a SHA-1 repository) and
nonfunctional (because the extension cannot be used in a v0 repository).

This happens because in a clone, we initially set up the repository, and
then change its algorithm based on what the remote side tells us it's
using.  We've initially set up the repository as SHA-256 in this case,
and then later on reset the repository version without clearing the
extension.

We could just always set the extension in this case, but that would mean
that our SHA-1 repositories weren't compatible with older Git versions,
even though there's no reason why they shouldn't be.  And we also don't
want to initialize the repository as SHA-1 initially, since that means
if we're cloning an empty repository, we'll have failed to honor the
GIT_DEFAULT_HASH variable and will end up with a SHA-1 repository, not a
SHA-256 repository.

Neither of those are appealing, so let's tell the repository
initialization code if we're doing a reinit like this, and if so, to
clear the extension if we're using SHA-1.  This makes sure we produce a
valid and functional repository and doesn't break any of our other use
cases.

Reported-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-22 09:22:32 -07:00
Johannes Schindelin
5a0c32bd4b fast-export: avoid using unnecessary language in a code comment
In an ongoing effort to avoid non-inclusive language, let's avoid using
the branch name "master" in a code comment.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-21 15:19:27 -07:00
Denton Liu
3d09c22869 builtin/diff-tree: learn --merge-base
The previous commit introduced ---merge-base a way to take the diff
between the working tree or index and the merge base between an arbitrary
commit and HEAD. It makes sense to extend this option to support the
case where two commits are given too and behave in a manner identical to
`git diff A...B`.

Introduce the --merge-base flag as an alternative to triple-dot
notation. Thus, we would be able to write the above as
`git diff --merge-base A B`.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-21 13:37:03 -07:00
Denton Liu
0f5a1d449b builtin/diff-index: learn --merge-base
There is currently no easy way to take the diff between the working tree
or index and the merge base between an arbitrary commit and HEAD. Even
diff's `...` notation doesn't allow this because it only works between
commits. However, the ability to do this would be desirable to a user
who would like to see all the changes they've made on a branch plus
uncommitted changes without taking into account changes made in the
upstream branch.

Teach diff-index and diff (with one commit) the --merge-base option
which allows a user to use the merge base of a commit and HEAD as the
"before" side.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-20 21:30:26 -07:00
Denton Liu
4c3fe82ef1 diff-lib: accept option flags in run_diff_index()
In a future commit, we will teach run_diff_index() to accept more
options via flag bits. For now, change `cached` into a flag in the
`option` bitfield. The behaviour should remain exactly the same.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-20 21:30:26 -07:00
Junio C Hamano
80cacaec41 Merge branch 'mt/config-fail-nongit-early'
Unlike "git config --local", "git config --worktree" did not fail
early and cleanly when started outside a git repository.

* mt/config-fail-nongit-early:
  config: complain about --worktree outside of a git repo
2020-09-18 17:58:06 -07:00
Junio C Hamano
9d4e7ec4d9 Merge branch 'jc/quote-path-cleanup'
"git status --short" quoted a path with SP in it when tracked, but
not those that are untracked, ignored or unmerged.  They are all
shown quoted consistently.

* jc/quote-path-cleanup:
  quote: turn 'nodq' parameter into a set of flags
  quote: rename misnamed sq_lookup[] to cq_lookup[]
  wt-status: consistently quote paths in "status --short" output
  quote_path: code clarification
  quote_path: optionally allow quoting a path with SP in it
  quote_path: give flags parameter to quote_path()
  quote_path: rename quote_path_relative() to quote_path()
2020-09-18 17:58:04 -07:00
Junio C Hamano
45f462b5c5 Merge branch 'es/wt-add-detach'
"git worktree add" learns that the "-d" is a synonym to "--detach"
option to create a new worktree without being on a branch.

* es/wt-add-detach:
  git-worktree.txt: discuss branch-based vs. throwaway worktrees
  worktree: teach `add` to recognize -d as shorthand for --detach
  git-checkout.txt: document -d short option for --detach
2020-09-18 17:58:04 -07:00
Junio C Hamano
e96b271d18 Merge branch 'jc/add-i-use-builtin-experimental'
The "add -i/-p" machinery has been written in C but it is not used
by default yet.  It is made default to those who are participating
in feature.experimental experiment.

* jc/add-i-use-builtin-experimental:
  add -i: use the built-in version when feature.experimental is set
2020-09-18 17:58:02 -07:00
Junio C Hamano
21de7e9c50 Merge branch 'rs/refspec-leakfix'
Leakfix.

* rs/refspec-leakfix:
  refspec: add and use refspec_appendf()
  push: release strbufs used for refspec formatting
2020-09-18 17:58:00 -07:00
Junio C Hamano
9b8074427b Merge branch 'rs/misc-cleanups'
Misc cleanups.

* rs/misc-cleanups:
  pack-bitmap-write: use hashwrite_be32() in write_hash_cache()
  midx: use hashwrite_u8() in write_midx_header()
  fast-import: use write_pack_header()
2020-09-18 17:58:00 -07:00
Taylor Blau
d356d5debe commit-graph: introduce 'commitGraph.maxNewFilters'
Introduce a configuration variable to specify a default value for the
recently-introduce '--max-new-filters' option of 'git commit-graph
write'.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-18 10:39:22 -07:00
Taylor Blau
809e0327f5 builtin/commit-graph.c: introduce '--max-new-filters=<n>'
Introduce a command-line flag to specify the maximum number of new Bloom
filters that a 'git commit-graph write' is willing to compute from
scratch.

Prior to this patch, a commit-graph write with '--changed-paths' would
compute Bloom filters for all selected commits which haven't already
been computed (i.e., by a previous commit-graph write with '--split'
such that a roll-up or replacement is performed).

This behavior can cause prohibitively-long commit-graph writes for a
variety of reasons:

  * There may be lots of filters whose diffs take a long time to
    generate (for example, they have close to the maximum number of
    changes, diffing itself takes a long time, etc).

  * Old-style commit-graphs (which encode filters with too many entries
    as not having been computed at all) cause us to waste time
    recomputing filters that appear to have not been computed only to
    discover that they are too-large.

This can make the upper-bound of the time it takes for 'git commit-graph
write --changed-paths' to be rather unpredictable.

To make this command behave more predictably, introduce
'--max-new-filters=<n>' to allow computing at most '<n>' Bloom filters
from scratch. This lets "computing" already-known filters proceed
quickly, while bounding the number of slow tasks that Git is willing to
do.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-18 10:35:39 -07:00
Taylor Blau
98bb796191 commit-graph: rename 'split_commit_graph_opts'
In the subsequent commit, additional options will be added to the
commit-graph API which have nothing to do with splitting.

Rename the 'split_commit_graph_opts' structure to the more-generic
'commit_graph_opts' to encompass both. Likewise, rename the 'flags'
member to instead be 'split_flags' to clarify that it only has to do
with the behavior implied by '--split'.

Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 21:55:50 -07:00
Christian Schlack
5a07c6c3c2 remote: don't show success message when set-head fails
Suppress the message 'origin/HEAD set to master' in case of an error.

  $ git remote set-head origin -a
  error: Not a valid ref: refs/remotes/origin/master
  origin/HEAD set to master

Signed-off-by: Christian Schlack <christian@backhub.co>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:40:17 -07:00
Derrick Stolee
25914c4fde maintenance: add trace2 regions for task execution
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
4ddc79b2da maintenance: add auto condition for commit-graph task
Instead of writing a new commit-graph in every 'git maintenance run
--auto' process (when maintenance.commit-graph.enalbed is configured to
be true), only write when there are "enough" commits not in a
commit-graph file.

This count is controlled by the maintenance.commit-graph.auto config
option.

To compute the count, use a depth-first search starting at each ref, and
leaving markers using the SEEN flag. If this count reaches the limit,
then terminate early and start the task. Otherwise, this operation will
peel every ref and parse the commit it points to. If these are all in
the commit-graph, then this is typically a very fast operation. Users
with many refs might feel a slow-down, and hence could consider updating
their limit to be very small. A negative value will force the step to
run every time.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
916d0626c2 maintenance: use pointers to check --auto
The 'git maintenance run' command has an '--auto' option. This is used
by other Git commands such as 'git commit' or 'git fetch' to check if
maintenance should be run after adding data to the repository.

Previously, this --auto option was only used to add the argument to the
'git gc' command as part of the 'gc' task. We will be expanding the
other tasks to perform a check to see if they should do work as part of
the --auto flag, when they are enabled by config.

First, update the 'gc' task to perform the auto check inside the
maintenance process. This prevents running an extra 'git gc --auto'
command when not needed. It also shows a model for other tasks.

Second, use the 'auto_condition' function pointer as a signal for
whether we enable the maintenance task under '--auto'. For instance, we
do not want to enable the 'fetch' task in '--auto' mode, so that
function pointer will remain NULL.

Now that we are not automatically calling 'git gc', a test in
t5514-fetch-multiple.sh must be changed to watch for 'git maintenance'
instead.

We continue to pass the '--auto' option to the 'git gc' command when
necessary, because of the gc.autoDetach config option changes behavior.
Likely, we will want to absorb the daemonizing behavior implied by
gc.autoDetach as a maintenance.autoDetach config option.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
65d655b52d maintenance: create maintenance.<task>.enabled config
Currently, a normal run of "git maintenance run" will only run the 'gc'
task, as it is the only one enabled. This is mostly for backwards-
compatible reasons since "git maintenance run --auto" commands replaced
previous "git gc --auto" commands after some Git processes. Users could
manually run specific maintenance tasks by calling "git maintenance run
--task=<task>" directly.

Allow users to customize which steps are run automatically using config.
The 'maintenance.<task>.enabled' option then can turn on these other
tasks (or turn off the 'gc' task).

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
d7514f6ed5 maintenance: take a lock on the objects directory
Performing maintenance on a Git repository involves writing data to the
.git directory, which is not safe to do with multiple writers attempting
the same operation. Ensure that only one 'git maintenance' process is
running at a time by holding a file-based lock. Simply the presence of
the .git/maintenance.lock file will prevent future maintenance. This
lock is never committed, since it does not represent meaningful data.
Instead, it is only a placeholder.

If the lock file already exists, then no maintenance tasks are
attempted. This will become very important later when we implement the
'prefetch' task, as this is our stop-gap from creating a recursive process
loop between 'git fetch' and 'git maintenance run --auto'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
090511bc0b maintenance: add --task option
A user may want to only run certain maintenance tasks in a certain
order. Add the --task=<task> option, which allows a user to specify an
ordered list of tasks to run. These cannot be run multiple times,
however.

Here is where our array of maintenance_task pointers becomes critical.
We can sort the array of pointers based on the task order, but we do not
want to move the struct data itself in order to preserve the hashmap
references. We use the hashmap to match the --task=<task> arguments into
the task struct data.

Keep in mind that the 'enabled' member of the maintenance_task struct is
a placeholder for a future 'maintenance.<task>.enabled' config option.
Thus, we use the 'enabled' member to specify which tasks are run when
the user does not specify any --task=<task> arguments. The 'enabled'
member should be ignored if --task=<task> appears.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
663b2b1b90 maintenance: add commit-graph task
The first new task in the 'git maintenance' builtin is the
'commit-graph' task. This updates the commit-graph file
incrementally with the command

	git commit-graph write --reachable --split

By writing an incremental commit-graph file using the "--split"
option we minimize the disruption from this operation. The default
behavior is to merge layers until the new "top" layer is less than
half the size of the layer below. This provides quick writes most
of the time, with the longer writes following a power law
distribution.

Most importantly, concurrent Git processes only look at the
commit-graph-chain file for a very short amount of time, so they
will verly likely not be holding a handle to the file when we try
to replace it. (This only matters on Windows.)

If a concurrent process reads the old commit-graph-chain file, but
our job expires some of the .graph files before they can be read,
then those processes will see a warning message (but not fail).
This could be avoided by a future update to use the --expire-time
argument when writing the commit-graph.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
3103e9848f maintenance: initialize task array
In anticipation of implementing multiple maintenance tasks inside the
'maintenance' builtin, use a list of structs to describe the work to be
done.

The struct maintenance_task stores the name of the task (as given by a
future command-line argument) along with a function pointer to its
implementation and a boolean for whether the step is enabled.

A list these structs are initialized with the full list of implemented
tasks along with a default order. For now, this list only contains the
"gc" task. This task is also the only task enabled by default.

The run subcommand will return a nonzero exit code if any task fails.
However, it will attempt all tasks in its loop before returning with the
failure. Also each failed task will print an error message.

Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
a95ce12430 maintenance: replace run_auto_gc()
The run_auto_gc() method is used in several places to trigger a check
for repo maintenance after some Git commands, such as 'git commit' or
'git fetch'.

To allow for extra customization of this maintenance activity, replace
the 'git gc --auto [--quiet]' call with one to 'git maintenance run
--auto [--quiet]'. As we extend the maintenance builtin with other
steps, users will be able to select different maintenance activities.

Rename run_auto_gc() to run_auto_maintenance() to be clearer what is
happening on this call, and to expose all callers in the current diff.
Rewrite the method to use a struct child_process to simplify the calls
slightly.

Since 'git fetch' already allows disabling the 'git gc --auto'
subprocess, add an equivalent option with a different name to be more
descriptive of the new behavior: '--[no-]maintenance'. Update the
documentation to include these options at the same time.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
3ddaad0e06 maintenance: add --quiet option
Maintenance activities are commonly used as steps in larger scripts.
Providing a '--quiet' option allows those scripts to be less noisy when
run on a terminal window. Turn this mode on by default when stderr is
not a terminal.

Pipe the option to the 'git gc' child process.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:05 -07:00
Derrick Stolee
2057d75038 maintenance: create basic maintenance runner
The 'gc' builtin is our current entrypoint for automatically maintaining
a repository. This one tool does many operations, such as repacking the
repository, packing refs, and rewriting the commit-graph file. The name
implies it performs "garbage collection" which means several different
things, and some users may not want to use this operation that rewrites
the entire object database.

Create a new 'maintenance' builtin that will become a more general-
purpose command. To start, it will only support the 'run' subcommand,
but will later expand to add subcommands for scheduling maintenance in
the background.

For now, the 'maintenance' builtin is a thin shim over the 'gc' builtin.
In fact, the only option is the '--auto' toggle, which is handed
directly to the 'gc' builtin. The current change is isolated to this
simple operation to prevent more interesting logic from being lost in
all of the boilerplate of adding a new builtin.

Use existing builtin/gc.c file because we want to share code between the
two builtins. It is possible that we will have 'maintenance' replace the
'gc' builtin entirely at some point, leaving 'git gc' as an alias for
some specific arguments to 'git maintenance run'.

Create a new test_subcommand helper that allows us to test if a certain
subcommand was run. It requires storing the GIT_TRACE2_EVENT logs in a
file. A negation mode is available that will be used in later tests.

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 11:30:04 -07:00
Lin Sun
dbd8c09bfe mergetool: allow auto-merge for meld to follow the vim-diff behavior
Make the mergetool used with "meld" backend behave similarly to "vimdiff" by
telling it to auto-merge non-conflicting parts and highlight the conflicting
parts when `mergetool.meld.useAutoMerge` is configured with `true`, or `auto`
for detecting the `--auto-merge` option automatically.

Helped-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Helped-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Lin Sun <lin.sun@zoom.us>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-16 17:11:20 -07:00
Aaron Lipman
21bf933928 ref-filter: allow merged and no-merged filters
Enable ref-filter to process multiple merged and no-merged filters, and
extend functionality to git branch, git tag and git for-each-ref. This
provides an easy way to check for branches that are "graduation
candidates:"

$ git branch --no-merged master --merged next

If passed more than one merged (or more than one no-merged) filter, refs
must be reachable from any one of the merged commits, and reachable from
none of the no-merged commits.

Signed-off-by: Aaron Lipman <alipman88@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-16 12:38:10 -07:00
Alex Riesen
4e735c1326 fetch: pass --ipv4 and --ipv6 options to sub-fetches
The options indicate user intent for the whole fetch operation, and
ignoring them in sub-fetches (i.e. "--all" and recursive fetching of
submodules) is quite unexpected when, for instance, it is intended
to limit all of the communication to a specific transport protocol
for some reason.

Signed-off-by: Alex Riesen <alexander.riesen@cetitec.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-15 14:15:05 -07:00
Junio C Hamano
88910c9939 quote_path: give flags parameter to quote_path()
The quote_path() function computes a path (relative to its base
directory) and c-quotes the result if necessary.  Teach it to take a
flags parameter to allow its behaviour to be enriched later.

No behaviour change intended.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-10 10:49:19 -07:00
Junio C Hamano
c34d24b8a4 quote_path: rename quote_path_relative() to quote_path()
There is no quote_path_absolute() or anything that causes confusion,
and one of the two large consumers already rename the long name
locally with a preprocessor macro.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-10 10:49:17 -07:00
Junio C Hamano
0df670bc0b Merge branch 'jt/interpret-branch-name-fallback'
"git status" has trouble showing where it came from by interpreting
reflog entries that recordcertain events, e.g. "checkout @{u}", and
gives a hard/fatal error.  Even though it inherently is impossible
to give a correct answer because the reflog entries lose some
information (e.g. "@{u}" does not record what branch the user was
on hence which branch 'the upstream' needs to be computed, and even
if the record were available, the relationship between branches may
have changed), at least hide the error to allow "status" show its
output.

* jt/interpret-branch-name-fallback:
  wt-status: tolerate dangling marks
  refs: move dwim_ref() to header file
  sha1-name: replace unsigned int with option struct
2020-09-09 13:53:09 -07:00
Junio C Hamano
9f7833fd55 Merge branch 'ss/submodule-summary-in-c-fixes'
Fixups to a topic in 'next'.

* ss/submodule-summary-in-c-fixes:
  t7421: eliminate 'grep' check in t7421.4 for mingw compatibility
  submodule: fix style in function definition
  submodule: eliminate unused parameters from print_submodule_summary()
2020-09-09 13:53:07 -07:00
Junio C Hamano
eb7460fd31 Merge branch 'es/worktree-repair'
"git worktree" gained a "repair" subcommand to help users recover
after moving the worktrees or repository manually without telling
Git.  Also, "git init --separate-git-dir" no longer corrupts
administrative data related to linked worktrees.

* es/worktree-repair:
  init: make --separate-git-dir work from within linked worktree
  init: teach --separate-git-dir to repair linked worktrees
  worktree: teach "repair" to fix outgoing links to worktrees
  worktree: teach "repair" to fix worktree back-links to main worktree
  worktree: add skeleton "repair" command
2020-09-09 13:53:07 -07:00
Junio C Hamano
1aadb47aad Merge branch 'jk/worktree-check-clean-leakfix'
Leakfix.

* jk/worktree-check-clean-leakfix:
  worktree: fix leak in check_clean_worktree()
2020-09-09 13:53:07 -07:00
Junio C Hamano
a31677dde3 Merge branch 'tb/repack-clearing-midx'
When a packfile is removed by "git repack", multi-pack-index gets
cleared; the code was taught to do so less aggressively by first
checking if the midx actually refers to a pack that no longer
exists.

* tb/repack-clearing-midx:
  midx: traverse the local MIDX first
  builtin/repack.c: invalidate MIDX only when necessary
2020-09-09 13:53:06 -07:00
Junio C Hamano
bbdba3d883 Merge branch 'ss/submodule-summary-in-c'
Yet another subcommand of "git submodule" is getting rewritten in C.

* ss/submodule-summary-in-c:
  submodule: port submodule subcommand 'summary' from shell to C
  t7421: introduce a test script for verifying 'summary' output
  submodule: rename helper functions to avoid ambiguity
  submodule: remove extra line feeds between callback struct and macro
2020-09-09 13:53:05 -07:00
Taylor Blau
ab14d0676c commit-graph: pass a 'struct repository *' in more places
In a future commit, some commit-graph internals will want access to
'r->settings', but we only have the 'struct object_directory *'
corresponding to that repository.

Add an additional parameter to pass the repository around in more
places.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-09 12:51:48 -07:00
Matheus Tavares
378fe5fc3d config: complain about --worktree outside of a git repo
Running `git config --worktree` outside of a git repository hits a BUG()
when trying to enumerate the worktrees. Let's catch this error earlier
and die() with a friendlier message.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-09 12:47:47 -07:00
Jonathan Tan
f08cbf60fe index-pack: make quantum of work smaller
Currently, when index-pack resolves deltas, it does not split up delta
trees into threads: each delta base root (an object that is not a
REF_DELTA or OFS_DELTA) can go into its own thread, but all deltas on
that root (direct or indirect) are processed in the same thread.

This is a problem when a repository contains a large text file (thus,
delta-able) that is modified many times - delta resolution time during
fetching is dominated by processing the deltas corresponding to that
text file.

This patch contains a solution to that. When cloning using

  git -c core.deltabasecachelimit=1g clone \
    https://fuchsia.googlesource.com/third_party/vulkan-cts

on my laptop, clone time improved from 3m2s to 2m5s (using 3 threads,
which is the default).

The solution is to have a global work stack. This stack contains delta
bases (objects, whether appearing directly in the packfile or generated
by delta resolution, that themselves have delta children) that need to
be processed; whenever a thread needs work, it peeks at the top of the
stack and processes its next unprocessed child. If a thread finds the
stack empty, it will look for more delta base roots to push on the stack
instead.

The main weakness of having a global work stack is that more time is
spent in the mutex, but profiling has shown that most time is spent in
the resolution of the deltas themselves, so this shouldn't be an issue
in practice. In any case, experimentation (as described in the clone
command above) shows that this patch is a net improvement.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-08 15:52:17 -07:00
Eric Sunshine
07a7f8debf format-patch: use 'origin' as start of current-series-range when known
When formatting a patch series over `origin..HEAD`, one would expect
that range to be used as the current-series-range when computing a
range-diff between the previous and current versions of a patch series.
However, infer_range_diff_ranges() ignores `origin..HEAD` when
--range-diff=<prev> specifies a single revision rather than a range, and
instead unexpectedly computes the current-series-range based upon
<prev>. Address this anomaly by unconditionally using `origin..HEAD` as
the current-series-range regardless of <prev> as long as `origin` is
known, and only fall back to basing current-series-range on <prev> when
`origin` is not known.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-08 15:03:27 -07:00
Eric Sunshine
72a7239016 diff-lib: tighten show_interdiff()'s interface
To compute and show an interdiff, show_interdiff() needs only the two
OID's to compare and a diffopts, yet it expects callers to supply an
entire rev_info. The demand for rev_info is not only overkill, but also
places unnecessary burden on potential future callers which might not
otherwise have a rev_info at hand. Address this by tightening its
signature to require only the items it needs instead of a full rev_info.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-08 15:03:27 -07:00
Eric Sunshine
cdffbdc217 diff: move show_interdiff() from its own file to diff-lib
show_interdiff() is a relatively small function and not likely to grow
larger or more complicated. Rather than dedicating an entire source file
to it, relocate it to diff-lib.c which houses other "take two things and
compare them" functions meant to be re-used but not so low-level as to
reside in the core diff implementation.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-08 15:03:26 -07:00
Junio C Hamano
2df2d81ddd add -i: use the built-in version when feature.experimental is set
We have had parallel implementations of "add -i/-p" since 2.25 and
have been using them from various codepaths since 2.26 days, but
never made the built-in version the default.

We have found and fixed a handful of corner case bugs in the
built-in version, and it may be a good time to start switching over
the user base from the scripted version to the built-in version.
Let's enable the built-in version for those who opt into the
feature.experimental guinea-pig program to give wider exposure.

Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-08 14:53:36 -07:00
Eric Sunshine
c670aa47df worktree: teach add to recognize -d as shorthand for --detach
Like `git switch` and `git checkout`, `git worktree add` can check out a
branch or set up a detached HEAD. However, unlike those other commands,
`git worktree add` does not understand -d as shorthand for --detach,
which may confound users accustomed to using -d for this purpose.
Address this shortcoming by teaching `add` to recognize -d for --detach,
thus bringing it in line with the other commands.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-06 18:53:56 -07:00
René Scharfe
ccb181d0f0 fast-import: use write_pack_header()
Call write_pack_header() to hash and write a pack header instead of
open-coding this function.  This gets rid of duplicate code and of the
magic version number 2 -- which has been used here since c90be46abd
(Changed fast-import's pack header creation to use pack.h, 2006-08-16)
and in pack.h (again) since 29f049a0c2 (Revert "move pack creation to
version 3", 2006-10-14).

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-06 13:40:37 -07:00
René Scharfe
1af8b8c0a5 refspec: add and use refspec_appendf()
Add a function for building a refspec using printf-style formatting.  It
frees callers from managing their own buffer.  Use it throughout the
tree to shorten and simplify its callers.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-06 13:15:46 -07:00
René Scharfe
30035d9c66 push: release strbufs used for refspec formatting
map_refspec() either returns the passed in ref string or a detached
strbuf.  This makes it hard for callers to release the possibly
allocated memory, and set_refspecs() consequently leaks it.

Let map_refspec() append any refspecs directly and release its own
strbufs after use.  Rename it to refspec_append_mapped() and don't
return anything to reflect its increased responsibility.

set_refspecs() also leaks its strbufs.  Do the same here and directly
call refspec_append() in each if branch instead of holding onto a
detached strbuf, then dispose of the allocated memory after use.  We
need to add an else branch for the final call because all the other
conditional branches already add their formatted refspec now.

setup_push_upstream() and setup_push_current() forgot to release their
strbufs as well; plug these leaks, too, while at it.

None of these leaks were likely to impact users, because the number
and sizes of refspecs are usually small and the allocations are only
done once per program run.  Clean them up nevertheless, as another
step on the long road towards zero memory leaks.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-06 13:15:45 -07:00
Orgad Shaneh
7ea0c2f44d fetch: do not look for submodule changes in unchanged refs
When fetching recursively with submodules, for each ref in the
superproject, we call check_for_new_submodule_commits() which collects all
the objects that have to be checked for submodule changes on
calculate_changed_submodule_paths(). On the first call, it also collects all
the existing refs for excluding them from the scan.

calculate_changed_submodule_paths() creates an argument array with all the
collected new objects, followed by --not and all the old objects. This argv
is passed to setup_revisions, which parses each argument, converts it back
to an oid and resolves the object. The parsing itself also does redundant
work, because it is treated like user input, while in fact it is a full
oid. So it needlessly attempts to look it up as ref (checks if it has ^, ~
etc.), checks if it is a file name etc.

For a repository with many refs, all of this is expensive. But if the fetch
in the superproject did not update the ref (i.e. the objects that are
required to exist in the submodule did not change), there is no need to
include it in the list.

Before commit be76c212 (fetch: ensure submodule objects fetched,
2018-12-06), submodule reference changes were only detected for refs that
were changed, but not for new refs. This commit covered also this case, but
what it did was to just include every ref.

This change should reduce the number of scanned refs by about half (except
the case of a no-op fetch, which will not scan any ref), because all the
existing refs will still be listed after --not.

The regression was reported here:
https://public-inbox.org/git/CAGHpTBKSUJzFSWc=uznSu2zB33qCSmKXM-
iAjxRCpqNK5bnhRg@mail.gmail.com/

Signed-off-by: Orgad Shaneh <orgads@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-06 09:50:49 -07:00
Junio C Hamano
da6b99c39a Merge branch 'hl/bisect-doc-clarify-bad-good-ordering'
Doc update.

* hl/bisect-doc-clarify-bad-good-ordering:
  bisect: swap command-line options in documentation
2020-09-03 12:37:06 -07:00
Junio C Hamano
b4100f366c Merge branch 'jt/lazy-fetch'
Updates to on-demand fetching code in lazily cloned repositories.

* jt/lazy-fetch:
  fetch: no FETCH_HEAD display if --no-write-fetch-head
  fetch-pack: remove no_dependents code
  promisor-remote: lazy-fetch objects in subprocess
  fetch-pack: do not lazy-fetch during ref iteration
  fetch: only populate existing_refs if needed
  fetch: avoid reading submodule config until needed
  fetch: allow refspecs specified through stdin
  negotiator/noop: add noop fetch negotiator
2020-09-03 12:37:04 -07:00
Junio C Hamano
18aff08e04 Merge branch 'jc/undash-in-tree-git-callers'
A handful of places in in-tree code still relied on being able to
execute the git subcommands, especially built-ins, in "git-foo"
form, which have been corrected.

* jc/undash-in-tree-git-callers:
  credential-cache: use child_process.args
  cvsexportcommit: do not run git programs in dashed form
  transport-helper: do not run git-remote-ext etc. in dashed form
2020-09-03 12:37:03 -07:00
Junio C Hamano
afd49c39dd Merge branch 'jk/slimmed-down'
Trim an unused binary and turn a bunch of commands into built-in.

* jk/slimmed-down:
  drop vcs-svn experiment
  make git-fast-import a builtin
  make git-bugreport a builtin
  make credential helpers builtins
  Makefile: drop builtins from MSVC pdb list
2020-09-03 12:37:02 -07:00
Junio C Hamano
9c31b19dd0 Merge branch 'pw/rebase-i-more-options'
"git rebase -i" learns a bit more options.

* pw/rebase-i-more-options:
  t3436: do not run git-merge-recursive in dashed form
  rebase: add --reset-author-date
  rebase -i: support --ignore-date
  rebase -i: support --committer-date-is-author-date
  am: stop exporting GIT_COMMITTER_DATE
  rebase -i: add --ignore-whitespace flag
2020-09-03 12:37:01 -07:00
Jonathan Tan
f24c30e0b6 wt-status: tolerate dangling marks
When a user checks out the upstream branch of HEAD, the upstream branch
not being a local branch, and then runs "git status", like this:

  git clone $URL client
  cd client
  git checkout @{u}
  git status

no status is printed, but instead an error message:

  fatal: HEAD does not point to a branch

(This error message when running "git branch" persists even after
checking out other things - it only stops after checking out a branch.)

This is because "git status" reads the reflog when determining the "HEAD
detached" message, and thus attempts to DWIM "@{u}", but that doesn't
work because HEAD no longer points to a branch.

Therefore, when calculating the status of a worktree, tolerate dangling
marks. This is done by adding an additional parameter to
dwim_ref() and repo_dwim_ref().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-02 14:39:25 -07:00
Jonathan Tan
db3c293ecd fetch: no FETCH_HEAD display if --no-write-fetch-head
887952b8c6 ("fetch: optionally allow disabling FETCH_HEAD update",
2020-08-18) introduced the ability to disable writing to FETCH_HEAD
during fetch, but did not suppress the "<source> -> FETCH_HEAD" message
when this ability is used. This message is misleading in this case,
because FETCH_HEAD is not written. Also, because "fetch" is used to
lazy-fetch missing objects in a partial clone, this significantly
clutters up the output in that case since the objects to be fetched are
potentially numerous.

Therefore, suppress this message when --no-write-fetch-head is passed
(but not when --dry-run is set).

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-02 14:26:55 -07:00
Junio C Hamano
c57afd73ef Merge branch 'rs/checkout-no-overlay-pathspec-fix'
"git restore/checkout --no-overlay" with wildcarded pathspec
mistakenly removed matching paths in subdirectories, which has been
corrected.

* rs/checkout-no-overlay-pathspec-fix:
  checkout, restore: make pathspec recursive
2020-08-31 15:49:50 -07:00
Junio C Hamano
cca424ba90 Merge branch 'jk/refspecs-cleanup'
Preliminary code clean-up before introducing "negative refspec".

* jk/refspecs-cleanup:
  refspec: make sure stack refspec_item variables are zeroed
  refspec: fix documentation referring to refspec_item
2020-08-31 15:49:48 -07:00
Junio C Hamano
e699684cf6 Merge branch 'hn/refs-pseudorefs'
Accesses to two pseudorefs have been updated to properly use ref
API.

* hn/refs-pseudorefs:
  sequencer: treat REVERT_HEAD as a pseudo ref
  builtin/commit: suggest update-ref for pseudoref removal
  sequencer: treat CHERRY_PICK_HEAD as a pseudo ref
  refs: make refs_ref_exists public
2020-08-31 15:49:48 -07:00
Junio C Hamano
53015c9dd4 Merge branch 'jk/index-pack-w-more-threads'
Long ago, we decided to use 3 threads by default when running the
index-pack task in parallel, which has been adjusted a bit upwards.

* jk/index-pack-w-more-threads:
  index-pack: adjust default threading cap
  p5302: count up to online-cpus for thread tests
  p5302: disable thread-count parameter tests by default
2020-08-31 15:49:48 -07:00
Eric Sunshine
59d876ccd6 init: make --separate-git-dir work from within linked worktree
The intention of `git init --separate-work-dir=<path>` is to move the
.git/ directory to a location outside of the main worktree. When used
within a linked worktree, however, rather than moving the .git/
directory as intended, it instead incorrectly moves the worktree's
.git/worktrees/<id> directory to <path>, thus disconnecting the linked
worktree from its parent repository and breaking the worktree in the
process since its local .git file no longer points at a location at
which it can find the object database. Fix this broken behavior.

An intentional side-effect of this change is that it also closes a
loophole not caught by ccf236a23a (init: disallow --separate-git-dir
with bare repository, 2020-08-09) in which the check to prevent
--separate-git-dir being used in conjunction with a bare repository was
unable to detect the invalid combination when invoked from within a
linked worktree. Therefore, add a test to verify that this loophole is
closed, as well.

Reported-by: Henré Botha <henrebotha@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-31 11:47:45 -07:00
Eric Sunshine
42264bc841 init: teach --separate-git-dir to repair linked worktrees
A linked worktree's .git file is a "gitfile" pointing at the
.git/worktrees/<id> directory within the repository. When `git init
--separate-git-dir=<path>` is used on an existing repository to relocate
the repository's .git/ directory to a different location, it neglects to
update the .git files of linked worktrees, thus breaking the worktrees
by making it impossible for them to locate the repository. Fix this by
teaching --separate-git-dir to repair the .git file of each linked
worktree to point at the new repository location.

Reported-by: Henré Botha <henrebotha@gmail.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-31 11:47:45 -07:00
Eric Sunshine
b214ab5aa5 worktree: teach "repair" to fix outgoing links to worktrees
The .git/worktrees/<id>/gitdir file points at the location of a linked
worktree's .git file. Its content must be of the form
/path/to/worktree/.git (from which the location of the worktree itself
can be derived by stripping the "/.git" suffix). If the gitdir file is
deleted or becomes corrupted or outdated, then Git will be unable to
find the linked worktree. An easy way for the gitdir file to become
outdated is for the user to move the worktree manually (without using
"git worktree move"). Although it is possible to manually update the
gitdir file to reflect the new linked worktree location, doing so
requires a level of knowledge about worktree internals beyond what a
user should be expected to know offhand.

Therefore, teach "git worktree repair" how to repair broken or outdated
.git/worktrees/<id>/gitdir files automatically. (For this to work, the
command must either be invoked from within the worktree whose gitdir
file requires repair, or from within the main or any linked worktree by
providing the path of the broken worktree as an argument to "git
worktree repair".)

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-31 11:47:45 -07:00
Eric Sunshine
bdd1f3e4da worktree: teach "repair" to fix worktree back-links to main worktree
The .git file in a linked worktree is a "gitfile" which points back to
the .git/worktrees/<id> entry in the main worktree or bare repository.
If a worktree's .git file is deleted or becomes corrupted or outdated,
then the linked worktree won't know how to find the repository or any of
its own administrative files (such as 'index', 'HEAD', etc.). An easy
way for the .git file to become outdated is for the user to move the
main worktree or bare repository. Although it is possible to manually
update each linked worktree's .git file to reflect the new repository
location, doing so requires a level of knowledge about worktree
internals beyond what a user should be expected to know offhand.

Therefore, teach "git worktree repair" how to repair broken or outdated
worktree .git files automatically. (For this to work, the command must
be invoked from within the main worktree or bare repository, or from
within a worktree which has not become disconnected from the repository
-- such as one which was created after the repository was moved.)

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-31 11:47:45 -07:00
Miriam Rubio
7b4de74b5d bisect--helper: introduce new write_in_file() function
Let's refactor code adding a new `write_in_file()` function
that opens a file for writing a message and closes it and a
wrapper for writing mode.

This helper will be used in later steps and makes the code
simpler and easier to understand.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-28 16:21:16 -07:00
Miriam Rubio
30276765c1 bisect--helper: use '-res' in 'cmd_bisect__helper' return
Following 'enum bisect_error' vocabulary, return variable 'res' is
always non-positive.
Let's use '-res' instead of 'abs(res)' to make the code clearer.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-28 16:21:16 -07:00
Miriam Rubio
ef5aef5ee0 bisect--helper: BUG() in cmd_*() on invalid subcommand
In cmd_bisect__helper() function, if an invalid or no
subcommand is passed there is a BUG.

BUG() out instead of returning an error.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-28 16:21:15 -07:00
Taylor Blau
59552fb3e2 midx: traverse the local MIDX first
When a repository has an alternate object directory configured, callers
can traverse through each alternate's MIDX by walking the '->next'
pointer.

But, when 'prepare_multi_pack_index_one()' loads multiple MIDXs, it
places the new ones at the front of this pointer chain, not at the end.
This can be confusing for callers such as 'git repack -ad', causing test
failures like in t7700.6 with 'GIT_TEST_MULTI_PACK_INDEX=1'.

The occurs when dropping a pack known to the local MIDX with alternates
configured that have their own MIDX. Since the alternate's MIDX is
returned via 'get_multi_pack_index()', 'midx_contains_pack()' returns
true (which is correct, since it traverses through the '->next' pointer
to find the MIDX in the chain that does contain the requested object).
But, we call 'clear_midx_file()' on 'the_repository', which drops the
MIDX at the path of the first MIDX in the chain, which (in the case of
t7700.6 is the one in the alternate).

This patch addresses that by:

  - placing the local MIDX first in the chain when calling
    'prepare_multi_pack_index_one()', and

  - introducing a new 'get_local_multi_pack_index()', which explicitly
    returns the repository-local MIDX, if any.

Don't impose an additional order on the MIDX's '->next' pointer beyond
that the first item in the chain must be local if one exists so that we
avoid a quadratic insertion.

Likewise, use 'get_local_multi_pack_index()' in
'remove_redundant_pack()' to fix the formerly broken t7700.6 when run
with 'GIT_TEST_MULTI_PACK_INDEX=1'.

Finally, note that the MIDX ordering invariant is only preserved by the
insertion order in 'prepare_packed_git()', which traverses through the
ODB's '->next' pointer, meaning we visit the local object store first.
This fragility makes this an undesirable long-term solution if more
callers are added, but it is acceptable for now since this is the only
caller.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-28 14:07:09 -07:00
Hugo Locurcio
ef4d9f8a32 bisect: swap command-line options in documentation
The positional arguments are specified in this order: "bad" then "good".
To avoid confusion, the options above the positional arguments
are now specified in the same order. They can still be specified in any
order since they're options, not positional arguments.

Signed-off-by: Hugo Locurcio <hugo.locurcio@hugo.pro>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-28 14:06:06 -07:00
Junio C Hamano
0d9a8e33f9 Merge branch 'jk/leakfix'
Code clean-up.

* jk/leakfix:
  submodule--helper: fix leak of core.worktree value
  config: fix leak in git_config_get_expiry_in_days()
  config: drop git_config_get_string_const()
  config: fix leaks from git_config_get_string_const()
  checkout: fix leak of non-existent branch names
  submodule--helper: use strbuf_release() to free strbufs
  clear_pattern_list(): clear embedded hashmaps
2020-08-27 14:04:49 -07:00
Jiang Xin
31e8595a11 receive-pack: new config receive.procReceiveRefs
Add a new multi-valued config variable "receive.procReceiveRefs"
for `receive-pack` command, like the follows:

    git config --system --add receive.procReceiveRefs refs/for
    git config --system --add receive.procReceiveRefs refs/drafts

If the specific prefix strings given by the config variables match the
reference names of the commands which are sent from git client to
`receive-pack`, these commands will be executed by an external hook
(named "proc-receive"), instead of the internal `execute_commands`
function.

For example, if it is set to "refs/for", pushing to a reference such as
"refs/for/master" will not create or update reference "refs/for/master",
but may create or update a pull request directly by running the hook
"proc-receive".

Optional modifiers can be provided in the beginning of the value to
filter commands for specific actions: create (a), modify (m),
delete (d). A `!` can be included in the modifiers to negate the
reference prefix entry. E.g.:

    git config --system --add receive.procReceiveRefs ad:refs/heads
    git config --system --add receive.procReceiveRefs !:refs/heads

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27 12:47:47 -07:00
Jiang Xin
63518a574a New capability "report-status-v2" for git-push
The new introduced "proc-receive" hook may handle a command for a
pseudo-reference with a zero-old as its old-oid, while the hook may
create or update a reference with different name, different new-oid,
and different old-oid (the reference may exist already with a non-zero
old-oid).  Current "report-status" protocol cannot report the status for
such reference rewrite.

Add new capability "report-status-v2" and new report protocol which is
not backward compatible for report of git-push.

If a user pushes to a pseudo-reference "refs/for/master/topic", and
"receive-pack" creates two new references "refs/changes/23/123/1" and
"refs/changes/24/124/1", for client without the knowledge of
"report-status-v2", "receive-pack" will only send "ok/ng" directives in
the report, such as:

    ok ref/for/master/topic

But for client which has the knowledge of "report-status-v2",
"receive-pack" will use "option" directives to report more attributes
for the reference given by the above "ok/ng" directive.

    ok refs/for/master/topic
    option refname refs/changes/23/123/1
    option new-oid <new-oid>
    ok refs/for/master/topic
    option refname refs/changes/24/124/1
    option new-oid <new-oid>

The client will report two new created references to the end user.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27 12:47:47 -07:00
Jiang Xin
195d6eaea3 receive-pack: feed report options to post-receive
When commands are fed to the "post-receive" hook, report options will
be parsed and the real old-oid, new-oid, reference name will feed to
the "post-receive" hook.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27 12:47:47 -07:00
Jiang Xin
15d3af5e22 receive-pack: add new proc-receive hook
Git calls an internal `execute_commands` function to handle commands
sent from client to `git-receive-pack`.  Regardless of what references
the user pushes, git creates or updates the corresponding references if
the user has write-permission.  A contributor who has no
write-permission, cannot push to the repository directly.  So, the
contributor has to write commits to an alternate location, and sends
pull request by emails or by other ways.  We call this workflow as a
distributed workflow.

It would be more convenient to work in a centralized workflow like what
Gerrit provided for some cases.  For example, a read-only user who
cannot push to a branch directly can run the following `git push`
command to push commits to a pseudo reference (has a prefix "refs/for/",
not "refs/heads/") to create a code review.

    git push origin \
        HEAD:refs/for/<branch-name>/<session>

The `<branch-name>` in the above example can be as simple as "master",
or a more complicated branch name like "foo/bar".  The `<session>` in
the above example command can be the local branch name of the client
side, such as "my/topic".

We cannot implement a centralized workflow elegantly by using
"pre-receive" + "post-receive", because Git will call the internal
function "execute_commands" to create references (even the special
pseudo reference) between these two hooks.  Even though we can delete
the temporarily created pseudo reference via the "post-receive" hook,
having a temporary reference is not safe for concurrent pushes.

So, add a filter and a new handler to support this kind of workflow.
The filter will check the prefix of the reference name, and if the
command has a special reference name, the filter will turn a specific
field (`run_proc_receive`) on for the command.  Commands with this filed
turned on will be executed by a new handler (a hook named
"proc-receive") instead of the internal `execute_commands` function.
We can use this "proc-receive" command to create pull requests or send
emails for code review.

Suggested by Junio, this "proc-receive" hook reads the commands,
push-options (optional), and send result using a protocol in pkt-line
format.  In the following example, the letter "S" stands for
"receive-pack" and letter "H" stands for the hook.

    # Version and features negotiation.
    S: PKT-LINE(version=1\0push-options atomic...)
    S: flush-pkt
    H: PKT-LINE(version=1\0push-options...)
    H: flush-pkt

    # Send commands from server to the hook.
    S: PKT-LINE(<old-oid> <new-oid> <ref>)
    S: ... ...
    S: flush-pkt
    # Send push-options only if the 'push-options' feature is enabled.
    S: PKT-LINE(push-option)
    S: ... ...
    S: flush-pkt

    # Receive result from the hook.
    # OK, run this command successfully.
    H: PKT-LINE(ok <ref>)
    # NO, I reject it.
    H: PKT-LINE(ng <ref> <reason>)
    # Fall through, let 'receive-pack' to execute it.
    H: PKT-LINE(ok <ref>)
    H: PKT-LINE(option fall-through)
    # OK, but has an alternate reference.  The alternate reference name
    # and other status can be given in options
    H: PKT-LINE(ok <ref>)
    H: PKT-LINE(option refname <refname>)
    H: PKT-LINE(option old-oid <old-oid>)
    H: PKT-LINE(option new-oid <new-oid>)
    H: PKT-LINE(option forced-update)
    H: ... ...
    H: flush-pkt

After receiving a command, the hook will execute the command, and may
create/update different reference.  For example, a command for a pseudo
reference "refs/for/master/topic" may create/update different reference
such as "refs/pull/123/head".  The alternate reference name and other
status are given in option lines.

The list of commands returned from "proc-receive" will replace the
relevant commands that are sent from user to "receive-pack", and
"receive-pack" will continue to run the "execute_commands" function and
other routines.  Finally, the result of the execution of these commands
will be reported to end user.

The reporting function from "receive-pack" to "send-pack" will be
extended in latter commit just like what the "proc-receive" hook reports
to "receive-pack".

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27 12:47:47 -07:00
Shourya Shukla
d79b145569 t7421: eliminate 'grep' check in t7421.4 for mingw compatibility
The 'grep' check in test 4 of t7421 resulted in the failure of t7421 on
Windows due to a different error message

    error: cannot spawn git: No such file or directory

instead of

    fatal: exec 'rev-parse': cd to 'my-subm' failed: No such file or directory

Tighten up the check to compute 'src_abbrev' by guarding the
'verify_submodule_committish()' call using `p->status !='D'`, so that
the former isn't called in case of non-existent submodule directory,
consequently, there is no such error message on any execution
environment. The same need not be implemented for 'dst_abbrev' and is
rather redundant since the conditional 'if (S_ISGITLINK(p->mod_dst))'
already guards the 'verify_submodule_committish()' when we have a
status of 'D'.

Therefore, eliminate the 'grep' check in t7421. Instead, verify the
absence of an error message by doing a 'test_must_be_empty' on the
file containing the error.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27 11:47:10 -07:00
Eric Sunshine
e8e1ff24c5 worktree: add skeleton "repair" command
Worktree administrative files can become corrupted or outdated due to
external factors. Although, it is often possible to recover from such
situations by hand-tweaking these files, doing so requires intimate
knowledge of worktree internals. While information necessary to make
such repairs manually can be obtained from git-worktree.txt and
gitrepository-layout.txt, we can assist users more directly by teaching
git-worktree how to repair its administrative files itself (at least to
some extent). Therefore, add a "git worktree repair" command which
attempts to correct common problems which may arise due to factors
beyond Git's control.

At this stage, the "repair" command is a mere skeleton; subsequent
commits will flesh out the functionality.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27 08:59:13 -07:00
Jeff King
27ed6ccc12 worktree: fix leak in check_clean_worktree()
We allocate a child_env strvec but never free its memory. Instead, let's
just use the strvec that our child_process struct provides, which is
cleaned up automatically when we run the command.

And while we're moving the initialization of the child_process around,
let's switch it to use the official init function (zero-initializing it
works OK, since strvec is happy enough with that, but it sets a bad
example).

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27 08:30:17 -07:00
Taylor Blau
e08f7bb093 builtin/repack.c: invalidate MIDX only when necessary
In 525e18c04b (midx: clear midx on repack, 2018-07-12), 'git repack'
learned to remove a multi-pack-index file if it added or removed a pack
from the object store.

This mechanism is a little over-eager, since it is only necessary to
drop a MIDX if 'git repack' removes a pack that the MIDX references.
Adding a pack outside of the MIDX does not require invalidating the
MIDX, and likewise for removing a pack the MIDX does not know about.

Teach 'git repack' to check for this by loading the MIDX, and checking
whether the to-be-removed pack is known to the MIDX. This requires a
slightly odd alternation to a test in t5319, which is explained with a
comment. A new test is added to show that the MIDX is left alone when
both packs known to it are marked as .keep, but two packs unknown to it
are removed and combined into one new pack.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-26 13:55:46 -07:00
Shourya Shukla
f0c6b6467d submodule: fix style in function definition
The definitions of 'verify_submodule_committish()' and
'print_submodule_summary()' had wrong styling in terms of the asterisk
placement. Amend them.

Also, the warning printed in case of an unexpected file mode printed the
mode in decimal. Print it in octal for enhanced readability.

Reported-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-25 13:43:21 -07:00
Shourya Shukla
e0f7ae564e submodule: eliminate unused parameters from print_submodule_summary()
Eliminate the parameters 'missing_{src,dst}' from the
'print_submodule_summary()' function call since they are not used
anywhere in the function.

Reported-by: Jeff King <peff@peff.net>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-25 13:43:10 -07:00
Junio C Hamano
ad00f44f54 Merge branch 'en/dir-clear'
Leakfix with code clean-up.

* en/dir-clear:
  dir: fix problematic API to avoid memory leaks
  dir: make clear_directory() free all relevant memory
2020-08-24 14:54:34 -07:00
Junio C Hamano
b556050733 Merge branch 'jc/no-update-fetch-head'
"git fetch" learned --no-write-fetch-head option to avoid writing
the FETCH_HEAD file.

* jc/no-update-fetch-head:
  fetch: optionally allow disabling FETCH_HEAD update
2020-08-24 14:54:31 -07:00
Junio C Hamano
ff20794402 Merge branch 'jk/unleak-fixes'
Fix some incorrect UNLEAK() annotations.

* jk/unleak-fixes:
  ls-remote: simplify UNLEAK() usage
  stop calling UNLEAK() before die()
2020-08-24 14:54:30 -07:00
Junio C Hamano
a654836d96 Merge branch 'es/init-no-separate-git-dir-in-bare'
The purpose of "git init --separate-git-dir" is to initialize a
new project with the repository separate from the working tree,
or, in the case of an existing project, to move the repository
(the .git/ directory) out of the working tree. It does not make
sense to use --separate-git-dir with a bare repository for which
there is no working tree, so disallow its use with bare
repositories.

* es/init-no-separate-git-dir-in-bare:
  init: disallow --separate-git-dir with bare repository
2020-08-24 14:54:28 -07:00
Jonathan Tan
ee6f058384 index-pack: make resolve_delta() assume base data
A subsequent commit will make the quantum of work smaller, necessitating
more locking. This commit allows resolve_delta() to be called outside
the lock.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-24 14:14:52 -07:00
Jonathan Tan
b4718cae51 index-pack: calculate {ref,ofs}_{first,last} early
This is refactoring 2 of 2 to simplify struct base_data.

Whenever we make a struct base_data, immediately calculate its delta
children. This eliminates confusion as to when the
{ref,ofs}_{first,last} fields are initialized.

Before this patch, the delta children were calculated at the last
possible moment. This allowed the members of struct base_data to be
populated in any order, superficially useful when we have the object
contents before the struct object_entry. But this makes reasoning about
the state of struct base_data more complicated, hence this patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-24 14:12:58 -07:00
Jonathan Tan
a7f7e84a49 index-pack: remove redundant child field
This is refactoring 1 of 2 to simplify struct base_data.

In index-pack, each thread maintains a doubly-linked list of the delta
chain that it is currently processing (the "base" and "child" pointers
in struct base_data). When a thread exceeds the delta base cache limit
and needs to reclaim memory, it uses the "child" pointers to traverse
the lineage, reclaiming the memory of the eldest delta bases first.

A subsequent patch will perform memory reclaiming in a different way and
will thus no longer need the "child" pointer. Because the "child"
pointer is redundant even now, remove it so that the aforementioned
subsequent patch will be clearer. In the meantime, reclaim memory in the
reverse order of the "base" pointers.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-24 14:11:14 -07:00
Jonathan Tan
46e6fb1e44 index-pack: unify threaded and unthreaded code
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-24 14:02:31 -07:00
Jonathan Tan
fc968e26c2 index-pack: remove redundant parameter
find_{ref,ofs}_delta_{,children} take an enum object_type parameter, but
the object type is already present in the name of the function. Remove
that parameter from these functions.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-24 13:55:57 -07:00
René Scharfe
bfda204ade checkout, restore: make pathspec recursive
The pathspec given to git checkout and git restore is used with both
tree_entry_interesting (via read_tree_recursive) and match_pathspec
(via ce_path_match).  The latter effectively only supports recursive
matching regardless of the value of the pathspec flag "recursive",
which is unset here.

That causes different match results for pathspecs with wildcards, and
can lead checkout and restore in no-overlay mode to remove entries
instead of modifying them.  Enable recursive matching for both checkout
and restore to make matching consistent.

Setting the flag in checkout_main() technically also affects git switch,
but since that command doesn't accept pathspecs at all this has no
actual consequence.

Reported-by: Sergii Shkarnikov <sergii.shkarnikov@globallogic.com>
Initial-test-by: Sergii Shkarnikov <sergii.shkarnikov@globallogic.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-22 13:37:43 -07:00
Jeff King
fbff95b67f index-pack: adjust default threading cap
Commit b8a2486f15 (index-pack: support multithreaded delta resolving,
2012-05-06) describes an experiment that shows that setting the number
of threads for index-pack higher than 3 does not help.

I repeated that experiment using a more modern version of Git and a more
modern CPU and got different results.

Here are timings for p5302 against linux.git run on my laptop, a Core
i9-9880H with 8 cores plus hyperthreading (so online-cpus returns 16):

  5302.3: index-pack 0 threads                   256.28(253.41+2.79)
  5302.4: index-pack 1 threads                   257.03(254.03+2.91)
  5302.5: index-pack 2 threads                   149.39(268.34+3.06)
  5302.6: index-pack 4 threads                   94.96(294.10+3.23)
  5302.7: index-pack 8 threads                   68.12(339.26+3.89)
  5302.8: index-pack 16 threads                  70.90(655.03+7.21)
  5302.9: index-pack default number of threads   116.91(290.05+3.21)

You can see that wall-clock times continue to improve dramatically up to
the number of cores, but bumping beyond that (into hyperthreading
territory) does not help (and in fact hurts a little).

Here's the same experiment on a machine with dual Xeon 6230's, totaling
40 cores (80 with hyperthreading):

  5302.3: index-pack 0 threads                    310.04(302.73+6.90)
  5302.4: index-pack 1 threads                    310.55(302.68+7.40)
  5302.5: index-pack 2 threads                    178.17(304.89+8.20)
  5302.6: index-pack 5 threads                    99.53(315.54+9.56)
  5302.7: index-pack 10 threads                   72.80(327.37+12.79)
  5302.8: index-pack 20 threads                   60.68(357.74+21.66)
  5302.9: index-pack 40 threads                   58.07(454.44+67.96)
  5302.10: index-pack 80 threads                  59.81(720.45+334.52)
  5302.11: index-pack default number of threads   134.18(309.32+7.98)

The results are similar; things stop improving at 40 threads. Curiously,
going from 20 to 40 really doesn't help much, either (and increases CPU
time considerably). So that may represent an actual barrier to
parallelism, where we lose out due to context-switching and loss of
cache locality, but don't reap the wall-clock benefits due to contention
of our coarse-grained locks.

So what's a good default value? It's clear that the current cap of 3 is
too low; our default values are 42% and 57% slower than the best times
on each machine. The results on the 40-core machine imply that 20
threads is an actual barrier regardless of the number of cores, so we'll
take that as a maximum. We get the best results on these machines at
half of the online-cpus value. That's presumably a result of the
hyperthreading. That's common on multi-core Intel processors, but not
necessarily elsewhere. But if we take it as an assumption, we can
perform optimally on hyperthreaded machines and still do much better
than the status quo on other machines, as long as we never half below
the current value of 3.

So that's what this patch does.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-21 12:02:36 -07:00
Han-Wen Nienhuys
b6d2558c9e builtin/commit: suggest update-ref for pseudoref removal
When pseudorefs move to a different ref storage mechanism, pseudorefs no longer
can be removed with 'rm'. Instead, suggest a "update-ref -d" command, which will
work regardless of ref storage backend.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-21 11:20:10 -07:00
Han-Wen Nienhuys
c8e4159efd sequencer: treat CHERRY_PICK_HEAD as a pseudo ref
Check for existence and delete CHERRY_PICK_HEAD through ref functions.
This will help cherry-pick work with alternate ref storage backends.

Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-21 11:20:10 -07:00
Junio C Hamano
2a978f8273 Merge branch 'jc/object-names-are-not-sha-1'
A few end-user facing messages have been updated to be
hash-algorithm agnostic.

* jc/object-names-are-not-sha-1:
  messages: avoid SHA-1 in end-user facing messages
2020-08-19 16:14:52 -07:00
Rohit Ashiwal
27126692ba rebase: add --reset-author-date
The previous commit introduced --ignore-date flag to rebase -i, but the
name is rather vague as it does not say whether the author date or the
committer date is ignored. Add an alias to convey the precise purpose.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-19 15:22:56 -07:00
Phillip Wood
a3894aad67 rebase -i: support --ignore-date
Rebase is implemented with two different backends - 'apply' and
'merge' each of which support a different set of options. In
particular the apply backend supports a number of options implemented
by 'git am' that are not implemented in the merge backend. This means
that the available options are different depending on which backend is
used which is confusing. This patch adds support for the --ignore-date
option to the merge backend. This option uses the current time as the
author date rather than reusing the original author date when
rewriting commits. We take care to handle the combination of
--ignore-date and --committer-date-is-author-date in the same way as
the apply backend.

Original-patch-by: Rohit Ashiwal <rohit.ashiwal265@gmail.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-19 15:19:59 -07:00
Elijah Newren
eceba53214 dir: fix problematic API to avoid memory leaks
The dir structure seemed to have a number of leaks and problems around
it.  First I noticed that parent_hashmap and recursive_hashmap were
being leaked (though Peff noticed and submitted fixes before me).  Then
I noticed in the previous commit that clear_directory() was only taking
responsibility for a subset of fields within dir_struct, despite the
fact that entries[] and ignored[] we allocated internally to dir.c.
That, of course, resulted in many callers either leaking or haphazardly
trying to free these arrays and their contents.

Digging further, I found that despite the pretty clear documentation
near the top of dir.h that folks were supposed to call clear_directory()
when the user no longer needed the dir_struct, there were four callers
that didn't bother doing that at all.  However, two of them clearly
thought about leaks since they had an UNLEAK(dir) directive, which to me
suggests that the method to free the data was too unclear.  I suspect
the non-obviousness of the API and its holes led folks to avoid it,
which then snowballed into further problems with the entries[],
ignored[], parent_hashmap, and recursive_hashmap problems.

Rename clear_directory() to dir_clear() to be more in line with other
data structures in git, and introduce a dir_init() to handle the
suggested memsetting of dir_struct to all zeroes.  I hope that a name
like "dir_clear()" is more clear, and that the presence of dir_init()
will provide a hint to those looking at the code that they need to look
for either a dir_clear() or a dir_free() and lead them to find
dir_clear().

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18 17:17:31 -07:00
Elijah Newren
dad4f23ce5 dir: make clear_directory() free all relevant memory
The calling convention for the dir API is supposed to end with a call to
clear_directory() to free up no longer needed memory.  However,
clear_directory() didn't free dir->entries or dir->ignored.  I believe
this was an oversight, but a number of callers noticed memory leaks and
started free'ing these.  Unfortunately, they did so somewhat haphazardly
(sometimes freeing the entries in the arrays, and sometimes only
free'ing the arrays themselves).  This suggests the callers weren't
trying to make sure any possible memory used might be free'd, but just
the memory they noticed their usecase definitely had allocated.

Fix this mess by moving all the duplicated free'ing logic into
clear_directory().  End by resetting dir to a pristine state so it could
be reused if desired.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18 17:17:29 -07:00
Jonathan Tan
9dfa8dbeee fetch-pack: remove no_dependents code
Now that Git has switched to using a subprocess to lazy-fetch missing
objects, remove the no_dependents code as it is no longer used.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18 16:46:53 -07:00
Jonathan Tan
abcb7eeb31 fetch: only populate existing_refs if needed
In "fetch", get_ref_map() iterates over all refs to populate
"existing_refs" in order to populate peer_ref->old_oid in the returned
refmap, even if the refmap has no peer_ref set - which is the case when
only literal hashes (i.e. no refs by name) are fetched.

Iterating over refs causes the targets of those refs to be checked for
existence. Avoiding this is especially important when we use "git fetch"
to perform lazy fetches in a partial clone because a target of such a
ref may need to be itself lazy-fetched (and otherwise causing an
infinite loop).

Therefore, avoid populating "existing_refs" until necessary. With this
patch, because Git lazy-fetches objects by literal hashes (to be done in
a subsequent commit), it will then be able to guarantee avoiding reading
targets of refs.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18 13:25:05 -07:00
Jonathan Tan
e5b942136e fetch: avoid reading submodule config until needed
In "fetch", there are two parameters submodule_fetch_jobs_config and
recurse_submodules that can be set in a variety of ways: through
.gitmodules, through .git/config, and through the command line.
Currently "fetch" handles this by first reading .gitmodules, then
reading .git/config (allowing it to overwrite existing values), then
reading the command line (allowing it to overwrite existing values).

Notice that we can avoid reading .gitmodules if .git/config and/or the
command line already provides us with what we need. In addition, if
recurse_submodules is found to be "no", we do not need the value of
submodule_fetch_jobs_config.

Avoiding reading .gitmodules is especially important when we use "git
fetch" to perform lazy fetches in a partial clone because the
.gitmodules file itself might need to be lazy fetched (and otherwise
causing an infinite loop).

In light of all this, avoid reading .gitmodules until necessary. When
reading it, we may only need one of the two parameters it provides, so
teach fetch_config_from_gitmodules() to support NULL arguments. With
this patch, users (including Git itself when invoking "git fetch" to
lazy-fetch) will be able to guarantee avoiding reading .gitmodules by
passing --recurse-submodules=no.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18 13:25:05 -07:00
Jonathan Tan
2b713c272c fetch: allow refspecs specified through stdin
In a subsequent patch, partial clones will be taught to fetch missing
objects using a "git fetch" subprocess. Because the number of objects
fetched may be too numerous to fit on the command line, teach "fetch" to
accept refspecs passed through stdin.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18 13:25:05 -07:00
Junio C Hamano
887952b8c6 fetch: optionally allow disabling FETCH_HEAD update
If you run fetch but record the result in remote-tracking branches,
and either if you do nothing with the fetched refs (e.g. you are
merely mirroring) or if you always work from the remote-tracking
refs (e.g. you fetch and then merge origin/branchname separately),
you can get away with having no FETCH_HEAD at all.

Teach "git fetch" a command line option "--[no-]write-fetch-head".
The default is to write FETCH_HEAD, and the option is primarily
meant to be used with the "--no-" prefix to override this default,
because there is no matching fetch.writeFetchHEAD configuration
variable to flip the default to off (in which case, the positive
form may become necessary to defeat it).

Note that under "--dry-run" mode, FETCH_HEAD is never written;
otherwise you'd see list of objects in the file that you do not
actually have.  Passing `--write-fetch-head` does not force `git
fetch` to write the file.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18 12:56:57 -07:00