Commit Graph

17927 Commits

Author SHA1 Message Date
Jeff King
953aa54e1a pack-objects: clamp negative window size to 0
A negative window size makes no sense, and the code in find_deltas() is
not prepared to handle it. If you pass "-1", for example, we end up
generate a 0-length array of "struct unpacked", but our loop assumes it
has at least one entry in it (and we end up reading garbage memory).

We could complain to the user about this, but it's more forgiving to
just clamp it to 0, which means "do not find any deltas at all". The
0-case is already tested earlier in the script, so we'll make sure this
does the same thing.

Reported-by: Yiyuan guo <yguoaz@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:27 +09:00
Jeff King
95356789ee t5300: check that we produced expected number of deltas
We pack a set of objects both with and without --window=0, assuming that
the 0-length window will cause us not to produce any deltas. Let's
confirm that this is the case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:16 +09:00
Jeff King
5489899812 t5300: modernize basic tests
The first set of tests in t5300 goes back to 2005, and doesn't use some
of our customary style and tools these days. In preparation for touching
them, let's modernize a few things:

  - titles go on the line with test_expect_success, with a hanging
    open-quote to start the test body

  - test bodies should be indented with tabs

  - opening braces for shell blocks in &&-chains go on their own line

  - no space between redirect operators and files (">foo", not "> foo")

  - avoid doing work outside of test blocks; in this case, we can stick
    the setup of ".git2" into the appropriate blocks

  - avoid modifying and then cleaning up the environment or current
    directory by using subshells and "git -C"

  - this test does a curious thing when testing the unpacking: it sets
    GIT_OBJECT_DIRECTORY, and then does a "git init" in the _original_
    directory, creating a weird mixed situation. Instead, it's much
    simpler to just "git init --bare" a new repository to unpack into,
    and check the results there. I renamed this "git2" instead of
    ".git2" to make it more clear it's a separate repo.

  - we can observe that the bodies of the no-delta, ref_delta, and
    ofs_delta cases are all virtually identical except for the pack
    creation, and factor out shared helper functions. I collapsed "do
    the unpack" and "check the results of the unpack" into a single
    test, since that makes the expected lifetime of the "git2" temporary
    directory more clear (that also lets us use test_when_finished to
    clean it up). This does make the "-v" output slightly less useful,
    but the improvement in reading the actual test code makes it worth
    it.

  - I dropped the "pwd" calls from some tests. These don't do anything
    functional, and I suspect may have been an aid for debugging when
    the script was more cavalier about leaving the working directory
    changed between tests.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-03 14:29:16 +09:00
Junio C Hamano
59bb0aa93e Merge branch 'so/log-diff-merge'
"git log" learned "--diff-merges=<style>" option, with an
associated configuration variable log.diffMerges.

* so/log-diff-merge:
  doc/diff-options: document new --diff-merges features
  diff-merges: introduce log.diffMerges config variable
  diff-merges: adapt -m to enable default diff format
  diff-merges: refactor set_diff_merges()
  diff-merges: introduce --diff-merges=on
2021-04-30 13:50:26 +09:00
Junio C Hamano
8e97852919 Merge branch 'ds/sparse-index-protections'
Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.

* ds/sparse-index-protections: (47 commits)
  name-hash: use expand_to_path()
  sparse-index: expand_to_path()
  name-hash: don't add directories to name_hash
  revision: ensure full index
  resolve-undo: ensure full index
  read-cache: ensure full index
  pathspec: ensure full index
  merge-recursive: ensure full index
  entry: ensure full index
  dir: ensure full index
  update-index: ensure full index
  stash: ensure full index
  rm: ensure full index
  merge-index: ensure full index
  ls-files: ensure full index
  grep: ensure full index
  fsck: ensure full index
  difftool: ensure full index
  commit: ensure full index
  checkout: ensure full index
  ...
2021-04-30 13:50:26 +09:00
Junio C Hamano
d250f90359 Merge branch 'ds/maintenance-prefetch-fix'
The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.

* ds/maintenance-prefetch-fix:
  maintenance: respect remote.*.skipFetchAll
  maintenance: use 'git fetch --prefetch'
  fetch: add --prefetch option
  maintenance: simplify prefetch logic
2021-04-30 13:50:25 +09:00
Junio C Hamano
a819e2b3ef Merge branch 'ow/push-quiet-set-upstream'
"git push --quiet --set-upstream" was not quiet when setting the
upstream branch configuration, which has been corrected.

* ow/push-quiet-set-upstream:
  transport: respect verbosity when setting upstream
2021-04-30 13:50:25 +09:00
Junio C Hamano
13158b9910 Merge branch 'jk/promisor-optim'
Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).

* jk/promisor-optim:
  revision: avoid parsing with --exclude-promisor-objects
  lookup_unknown_object(): take a repository argument
  is_promisor_object(): free tree buffer after parsing
2021-04-30 13:50:24 +09:00
Junio C Hamano
522010b573 Merge branch 'ab/detox-gettext-tests'
Test clean-up.

* ab/detox-gettext-tests:
  tests: remove all uses of test_i18cmp
2021-04-20 17:23:36 -07:00
Junio C Hamano
2eebac2c49 Merge branch 'jk/pack-objects-bitmap-progress-fix'
When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.

* jk/pack-objects-bitmap-progress-fix:
  pack-objects: update "nr_seen" progress based on pack-reused count
2021-04-20 17:23:35 -07:00
Junio C Hamano
ab99efc817 Merge branch 'ab/userdiff-tests'
A bit of code clean-up and a lot of test clean-up around userdiff
area.

* ab/userdiff-tests:
  blame tests: simplify userdiff driver test
  blame tests: don't rely on t/t4018/ directory
  userdiff: remove support for "broken" tests
  userdiff tests: list builtin drivers via test-tool
  userdiff tests: explicitly test "default" pattern
  userdiff: add and use for_each_userdiff_driver()
  userdiff style: normalize pascal regex declaration
  userdiff style: declare patterns with consistent style
  userdiff style: re-order drivers in alphabetical order
2021-04-20 17:23:34 -07:00
Junio C Hamano
6d7a62d74d Merge branch 'ar/userdiff-scheme'
Userdiff patterns for "Scheme" has been added.

* ar/userdiff-scheme:
  userdiff: add support for Scheme
2021-04-20 17:23:34 -07:00
Sergey Organov
17c13e60fd diff-merges: introduce log.diffMerges config variable
New log.diffMerges configuration variable sets the format that
--diff-merges=on will be using. The default is "separate".

t4013: add the following tests for log.diffMerges config:

* Test that wrong values are denied.

* Test that the value of log.diffMerges properly affects both
--diff-merges=on and -m.

t9902: fix completion tests for log.d* to match log.diffMerges.

Added documentation for log.diffMerges.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
Sergey Organov
4320815eb9 diff-merges: introduce --diff-merges=on
Introduce the notion of default diff format for merges, and the option
"on" to select it. The default format is "separate" and can't yet
be changed, so effectively "on" is just a synonym for "separate"
for now. Add corresponding test to t4013.

This is in preparation for introducing log.diffMerges configuration
option that will let --diff-merges=on to be configured to any
supported format.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 23:38:35 -07:00
Junio C Hamano
7bec8e7fa6 Merge branch 'en/ort-readiness'
Plug the ort merge backend throughout the rest of the system, and
start testing it as a replacement for the recursive backend.

* en/ort-readiness:
  Add testing with merge-ort merge strategy
  t6423: mark remaining expected failure under merge-ort as such
  Revert "merge-ort: ignore the directory rename split conflict for now"
  merge-recursive: add a bunch of FIXME comments documenting known bugs
  merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
  t: mark several submodule merging tests as fixed under merge-ort
  merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
  t6428: new test for SKIP_WORKTREE handling and conflicts
  merge-ort: support subtree shifting
  merge-ort: let renormalization change modify/delete into clean delete
  merge-ort: have ll_merge() use a special attr_index for renormalization
  merge-ort: add a special minimal index just for renormalization
  merge-ort: use STABLE_QSORT instead of QSORT where required
2021-04-16 13:53:34 -07:00
Derrick Stolee
32f67888d8 maintenance: respect remote.*.skipFetchAll
If a remote has the skipFetchAll setting enabled, then that remote is
not intended for frequent fetching. It makes sense to not fetch that
data during the 'prefetch' maintenance task. Skip that remote in the
iteration without error. The skip_default_update member is initialized
in remote.c:handle_config() as part of initializing the 'struct remote'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
Derrick Stolee
cfd781ea22 maintenance: use 'git fetch --prefetch'
The 'prefetch' maintenance task previously forced the following refspec
for each remote:

	+refs/heads/*:refs/prefetch/<remote>/*

If a user has specified a more strict refspec for the remote, then this
prefetch task downloads more objects than necessary.

The previous change introduced the '--prefetch' option to 'git fetch'
which manipulates the remote's refspec to place all resulting refs into
refs/prefetch/, with further partitioning based on the destinations of
those refspecs.

Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch/ remains.

Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
Derrick Stolee
2e03115d0c fetch: add --prefetch option
The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.

Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:

 * Negative refspecs are preserved.
 * Refspecs without a destination are removed.
 * Refspecs whose source starts with "refs/tags/" are removed.
 * Other refspecs are placed within "refs/prefetch/".

Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.

There are some interesting cases that are worth testing.

An earlier version of this change dropped the "i--" from the loop that
deletes a refspec item and shifts the remaining entries down. This
allowed some refspecs to not be modified. The subtle part about the
first --prefetch test is that the "refs/tags/*" refspec appears directly
before the "refs/heads/bogus/*" refspec. Without that "i--", this
ordering would remove the "refs/tags/*" refspec and leave the last one
unmodified, placing the result in "refs/heads/*".

It is possible to have an empty refspec. This is typically the case for
remotes other than the origin, where users want to fetch a specific tag
or branch. To correctly test this case, we need to further remove the
upstream remote for the local branch. Thus, we are testing a refspec
that will be deleted, leaving nothing to fetch.

Helped-by: Tom Saeger <tom.saeger@oracle.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-16 13:36:55 -07:00
Junio C Hamano
5a7e52bed2 Merge branch 'jz/apply-3way-cached'
"git apply" now takes "--3way" and "--cached" at the same time, and
work and record results only in the index.

* jz/apply-3way-cached:
  git-apply: allow simultaneous --cached and --3way options
2021-04-15 13:36:01 -07:00
Junio C Hamano
771c758e8a Merge branch 'jz/apply-run-3way-first'
"git apply --3way" has always been "to fall back to 3-way merge
only when straight application fails". Swap the order of falling
back so that 3-way is always attempted first (only when the option
is given, of course) and then straight patch application is used as
a fallback when it fails.

* jz/apply-run-3way-first:
  git-apply: try threeway first when "--3way" is used
2021-04-15 13:36:00 -07:00
Øystein Walle
f3cce896a8 transport: respect verbosity when setting upstream
A command such as `git push -qu origin feature` will print "Branch
'feature' set up to track remote branch 'feature' from 'origin'." even
when --quiet is passed. In this case it's because install_branch_config() is
always called with BRANCH_CONFIG_VERBOSE.

struct transport keeps track of the desired verbosity. Fix the above
issue by passing BRANCH_CONFIG_VERBOSE conditionally based on that.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-15 12:52:49 -07:00
Junio C Hamano
8446b388b1 Merge branch 'cc/test-helper-bloom-usage-fix'
Usage message fix for a test helper.

* cc/test-helper-bloom-usage-fix:
  test-bloom: fix missing 'bloom' from usage string
2021-04-13 15:28:52 -07:00
Junio C Hamano
2279289e95 Merge branch 'ab/send-email-validate-errors'
Clean-up codepaths that implements "git send-email --validate"
option and improves the message from it.

* ab/send-email-validate-errors:
  git-send-email: improve --validate error output
  git-send-email: refactor duplicate $? checks into a function
  git-send-email: test full --validate output
2021-04-13 15:28:51 -07:00
Junio C Hamano
0623669fc6 Merge branch 'tb/pack-preferred-tips-to-give-bitmap'
A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.

* tb/pack-preferred-tips-to-give-bitmap:
  builtin/pack-objects.c: respect 'pack.preferBitmapTips'
  t/helper/test-bitmap.c: initial commit
  pack-bitmap: add 'test_bitmap_commits()' helper
2021-04-13 15:28:50 -07:00
Junio C Hamano
f63add4aa8 Merge branch 'jk/ref-filter-segfault-fix'
A NULL-dereference bug has been corrected in an error codepath in
"git for-each-ref", "git branch --list" etc.

* jk/ref-filter-segfault-fix:
  ref-filter: fix NULL check for parse object failure
2021-04-13 15:28:50 -07:00
Ævar Arnfjörð Bjarmason
feeb03bce6 tests: remove all uses of test_i18cmp
Finish the removal I started in 1108cea7f8 (tests: remove most uses
of test_i18ncmp, 2021-02-11). At that time the function wasn't removed
due to disruption with in-flight changes, remove the occurrences that
have landed since then.

As of writing this there are no test_i18ncmp uses between "master" and
"seen", so let's also remove the function to finally put it to rest.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 14:41:24 -07:00
Jeff King
c1fa951d7e revision: avoid parsing with --exclude-promisor-objects
When --exclude-promisor-objects is given, before traversing any objects
we iterate over all of the objects in any promisor packs, marking them
as UNINTERESTING and SEEN. We turn the oid we get from iterating the
pack into an object with parse_object(), but this has two problems:

  - it's slow; we are zlib inflating (and reconstructing from deltas)
    every byte of every object in the packfile

  - it leaves the tree buffers attached to their structs, which means
    our heap usage will grow to store every uncompressed tree
    simultaneously. This can be gigabytes.

We can obviously fix the second by freeing the tree buffers after we've
parsed them. But we can observe that the function doesn't look at the
object contents at all! The only reason we call parse_object() is that
we need a "struct object" on which to set the flags. There are two
options here:

  - we can look up just the object type via oid_object_info(), and then
    call the appropriate lookup_foo() function

  - we can call lookup_unknown_object(), which gives us an OBJ_NONE
    struct (which will get auto-converted later by object_as_type() via
    calls to lookup_commit(), etc).

The first one is closer to the current code, but we do pay the price to
look up the type for each object. The latter should be more efficient in
CPU, though it wastes a little bit of memory (the "unknown" object
structs are a union of all object types, so some of the structs are
bigger than they need to be). It also runs the risk of triggering a
latent bug in code that calls lookup_object() directly but isn't ready
to handle OBJ_NONE (such code would already be buggy, but we use
lookup_unknown_object() infrequently enough that it might be hiding).

I went with the second option here. I don't think the risk is high (and
we'd want to find and fix any such bugs anyway), and it should be more
efficient overall.

The new tests in p5600 show off the improvement (this is on git.git):

  Test                                 HEAD^               HEAD
  -------------------------------------------------------------------------------
  5600.5: count commits                0.37(0.37+0.00)     0.38(0.38+0.00) +2.7%
  5600.6: count non-promisor commits   11.74(11.37+0.37)   0.04(0.03+0.00) -99.7%

The improvement is particularly big in this script because _every_
object in the newly-cloned partial repo is a promisor object. So after
marking them all, there's nothing left to traverse.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:22:37 -07:00
Jeff King
45a187cc34 lookup_unknown_object(): take a repository argument
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.

We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:18:46 -07:00
Jeff King
fcc07e980b is_promisor_object(): free tree buffer after parsing
To get the list of all promisor objects, we not only include all objects
in promisor packs, but also parse each of those objects to see which
objects they reference. After parsing a tree object, the tree->buffer
field will remain populated until we explicitly free it. So in a partial
clone of blob:none, for example, we are essentially reading every tree
in the repository (since they're all in the initial promisor pack), and
keeping all of their uncompressed contents in memory at once.

This patch frees the tree buffers after we've finished marking all of
their reachable objects. We shouldn't need to do this for any other
object type. While we are using some extra memory to store the structs,
no other object type stores the whole contents in its parsed form (we do
sometimes hold on to commit buffers, but less so these days due to
commit graphs, plus most commands which care about promisor objects turn
off the save_commit_buffer global).

Even for a moderate-sized repository like git.git, this patch drops the
peak heap (as measured by massif) for git-fsck from ~1.7GB to ~138MB.
Fsck is a good candidate for measuring here because it doesn't interact
with the promisor code except to call is_promisor_object(), so we can
isolate just this problem.

The added perf test shows only a tiny improvement on my machine for
git.git, since 1.7GB isn't enough to cause any real memory pressure:

  Test                                 HEAD^               HEAD
  --------------------------------------------------------------------------------
  5600.4: fsck                         21.26(20.90+0.35)   20.84(20.79+0.04) -2.0%

With linux.git the absolute change is a bit bigger, though still a small
percentage:

  Test                          HEAD^                 HEAD
  -----------------------------------------------------------------------------
  5600.4: fsck                  262.26(259.13+3.12)   254.92(254.62+0.29) -2.8%

I didn't have the patience to run it under massif with linux.git, but
it's probably on the order of about 14GB improvement, since that's the
sum of the sizes of all of the uncompressed trees (but still isn't
enough to create memory pressure on this particular machine, which has
64GB of RAM). Smaller machines would probably see a bigger effect on
runtime (and sadly our perf suite does not measure peak heap).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13 13:16:39 -07:00
Jeff King
8e118e8490 pack-objects: update "nr_seen" progress based on pack-reused count
When serving a clone or fetch with bitmaps, after deciding which objects
need to be sent our "pack reuse" mechanism kicks in: we try to send
more-or-less verbatim a bunch of objects from the beginning of the
bitmapped packfile without even adding them to the to_pack.objects
array.

After deciding which objects will be in the "reused" portion, we update
nr_result to account for those, and then trigger display_progress() to
show the user (who is undoubtedly dazzled that we managed to enumerate
so many objects so quickly).

But then something confusing happens: the "Enumerating objects" progress
meter jumps _backwards_, counting up from zero the number of objects we
actually add into to_pack.objects.

This worked correctly once upon a time, but was broken in 5af050437a
(pack-objects: show some progress when counting kept objects,
2018-04-15), when the latter half of that progress meter switched to
using a separate nr_seen counter, rather than nr_result. Nobody noticed
for two reasons:

  - prior to the pack-reuse fixes from a14aebeac3 (Merge branch
    'jk/packfile-reuse-cleanup', 2020-02-14), the reuse code almost
    never kicked in anyway

  - the output looks _kind of_ correct. The "backwards" moment is hard
    to catch, because we overwrite the old progress number with the new
    one, and the larger number is displayed only for a second. So unless
    you look at that exact second, you just see the much smaller value,
    counting up to the number of non-reused objects (though of course if
    you catch it in stderr, or look at GIT_TRACE_PACKET from a server
    with bitmaps, you can see both values).

This smaller output isn't wrong per se, but isn't counting what we ever
intended to. We should give the user the whole number of objects we
considered (which, as per 5af050437a's original purpose, is already
_not_ a count of what goes into to_pack.objects). The follow-on
"Counting objects" meter shows the actual number of objects we feed into
that array.

We can easily fix this by bumping (and showing) nr_seen for the
pack-reused objects. When the included test is run without this patch,
the second pack-objects invocation produces "Enumerating objects: 1" to
show the one loose object, even though the resulting pack has hundreds
of objects in it. With it, we jump to "Enumerating objects: 674" after
deciding on reuse, and then "675" when we add in the loose object.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-12 11:31:30 -07:00
Atharva Raykar
a437390310 userdiff: add support for Scheme
Add a diff driver for Scheme-like languages which recognizes top level
and local `define` forms, whether it is a function definition, binding,
syntax definition or a user-defined `define-xyzzy` form.

Also supports R6RS `library` forms, `module` forms along with class and
struct declarations used in Racket (PLT Scheme).

Alternate "def" syntax such as those in Gerbil Scheme are also
supported, like defstruct, defsyntax and so on.

The rationale for picking `define` forms for the hunk headers is because
it is usually the only significant form for defining the structure of
the program, and it is a common pattern for schemers to have local
function definitions to hide their visibility, so it is not only the top
level `define`'s that are of interest. Schemers also extend the language
with macros to provide their own define forms (for example, something
like a `define-test-suite`) which is also captured in the hunk header.

Since it is common practice to extend syntax with variants of a form
like `module+`, `class*` etc, those have been supported as well.

The word regex is a best-effort attempt to conform to R7RS[1] valid
identifiers, symbols and numbers.

[1] https://small.r7rs.org/attachment/r7rs.pdf (section 2.1)

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 13:56:09 -07:00
Junio C Hamano
1b31224e59 Merge branch 'en/ort-perf-batch-9'
The ort merge backend has been optimized by skipping irrelevant
renames.

* en/ort-perf-batch-9:
  diffcore-rename: avoid doing basename comparisons for irrelevant sources
  merge-ort: skip rename detection entirely if possible
  merge-ort: use relevant_sources to filter possible rename sources
  merge-ort: precompute whether directory rename detection is needed
  merge-ort: introduce wrappers for alternate tree traversal
  merge-ort: add data structures for an alternate tree traversal
  merge-ort: precompute subset of sources for which we need rename detection
  diffcore-rename: enable filtering possible rename sources
2021-04-08 13:23:26 -07:00
Junio C Hamano
82fd285e46 Merge branch 'en/sequencer-edit-upon-conflict-fix'
"git cherry-pick/revert" with or without "--[no-]edit" did not spawn
the editor as expected (e.g. "revert --no-edit" after a conflict
still asked to edit the message), which has been corrected.

* en/sequencer-edit-upon-conflict-fix:
  sequencer: fix edit handling for cherry-pick and revert messages
2021-04-08 13:23:26 -07:00
Junio C Hamano
22eee7f455 Merge branch 'll/clone-reject-shallow'
"git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.

* ll/clone-reject-shallow:
  builtin/clone.c: add --reject-shallow option
2021-04-08 13:23:25 -07:00
Junio C Hamano
e6b971fcf5 Merge branch 'tb/reverse-midx'
An on-disk reverse-index to map the in-pack location of an object
back to its object name across multiple packfiles is introduced.

* tb/reverse-midx:
  midx.c: improve cache locality in midx_pack_order_cmp()
  pack-revindex: write multi-pack reverse indexes
  pack-write.c: extract 'write_rev_file_order'
  pack-revindex: read multi-pack reverse indexes
  Documentation/technical: describe multi-pack reverse indexes
  midx: make some functions non-static
  midx: keep track of the checksum
  midx: don't free midx_name early
  midx: allow marking a pack as preferred
  t/helper/test-read-midx.c: add '--show-objects'
  builtin/multi-pack-index.c: display usage on unrecognized command
  builtin/multi-pack-index.c: don't enter bogus cmd_mode
  builtin/multi-pack-index.c: split sub-commands
  builtin/multi-pack-index.c: define common usage with a macro
  builtin/multi-pack-index.c: don't handle 'progress' separately
  builtin/multi-pack-index.c: inline 'flags' with options
2021-04-08 13:23:25 -07:00
Ævar Arnfjörð Bjarmason
f08b4013c3 blame tests: simplify userdiff driver test
Simplify the test added in 9466e3809d (blame: enable funcname blaming
with userdiff driver, 2020-11-01) to use the --author support recently
added in 999cfc4f45 (test-lib functions: add --author support to
test_commit, 2021-01-12).

We also did not need the full fortran-external-function content. Let's
cut it down to just the important parts.

I'm modifying it to demonstrate that the fortran-specific userdiff
function is in effect by adding "DO NOT MATCH ..." and "AS THE ..."
lines surrounding the "RIGHT" one.

This is to check that we're using the userdiff "fortran" driver, as
opposed to the default driver which would match on those lines as part
of the general heuristic of matching a line that doesn't begin with
whitespace.

The test had also been leaving behind a .gitattributes file for later
tests to possibly trip over, let's clean it up with
"test_when_finished".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason
b269441be2 blame tests: don't rely on t/t4018/ directory
Refactor a test added in 9466e3809d (blame: enable funcname blaming
with userdiff driver, 2020-11-01) so that the blame tests don't rely
on stealing the contents of "t/t4018/fortran-external-function".

I have another patch series that'll possibly (or not) refactor that
file, but having this test inter-dependency makes things simple in any
case by making this test more readable.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason
6cb77966ec userdiff: remove support for "broken" tests
There have been no "broken" tests since 75c3b6b2e8 (userdiff: improve
Fortran xfuncname regex, 2020-08-12). Let's remove the test support
for them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason
28e8f0d5e5 userdiff tests: list builtin drivers via test-tool
Change the userdiff test to list the builtin drivers via the
test-tool, using the new for_each_userdiff_driver() API function.

This gets rid of the need to modify this part of the test every time a
new pattern is added, see 2ff6c34612 (userdiff: support Bash,
2020-10-22) and 09dad9256a (userdiff: support Markdown, 2020-05-02)
for two recent examples.

I only need the "list-builtin-drivers "argument here, but let's add
"list-custom-drivers" and "list-drivers" too, just because it's easy.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
Ævar Arnfjörð Bjarmason
132bf25989 userdiff tests: explicitly test "default" pattern
Since 122aa6f9c0 (diff: introduce diff.<driver>.binary, 2008-10-05)
the internals of the userdiff.c code have understood a "default" name,
which is invoked as userdiff_find_by_name("default") and present in
the "builtin_drivers" struct. Let's test for this special case.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08 12:19:10 -07:00
Jerry Zhang
c0c2a37ac2 git-apply: allow simultaneous --cached and --3way options
"git apply" does not allow "--cached" and "--3way" to be used
together, since "--3way" writes conflict markers into the working
tree.

Allow "git apply" to accept "--cached" and "--3way" at the same
time.  When a single file auto-resolves cleanly, the result is
placed in the index at stage #0 and the command exits with 0 status.

For a file that has a conflict which cannot be cleanly
auto-resolved, the original contents from common ancestor (stage
conflict at the content level, and the command exists with non-zero
status, because there is no place (like the working tree) to leave a
half-resolved merge for the user to resolve.

The user can use `git diff` to view the contents of the conflict, or
`git checkout -m -- .` to regenerate the conflict markers in the
working directory.

Don't attempt rerere in this case since it depends on conflict
markers written to file for its database storage and lookup. There
would be two main changes required to get rerere working:

1. Allow the rerere api to accept in memory object rather than
   files, which would allow us to pass in the conflict markers
   contained in the result from ll_merge().

2. Rerere can't write to the working directory, so it would have to
   apply the result to cache stage #0 directly. A flag would be
   needed to control this.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-07 22:20:33 -07:00
Junio C Hamano
58840e62a4 Merge branch 'ps/pack-bitmap-optim'
Optimize "rev-list --use-bitmap-index --objects" corner case that
uses negative tags as the stopping points.

* ps/pack-bitmap-optim:
  pack-bitmap: avoid traversal of objects referenced by uninteresting tag
2021-04-07 16:54:09 -07:00
Junio C Hamano
68e15e0c23 Merge branch 'zh/commit-trailer'
"git commit" learned "--trailer <key>[=<value>]" option; together
with the interpret-trailers command, this will make it easier to
support custom trailers.

* zh/commit-trailer:
  commit: add --trailer option
2021-04-07 16:54:08 -07:00
Jerry Zhang
923cd87ac8 git-apply: try threeway first when "--3way" is used
The apply_fragments() method of "git apply"
can silently apply patches incorrectly if
a file has repeating contents. In these
cases a three-way merge is capable of applying
it correctly in more situations, and will
show a conflict rather than applying it
incorrectly. However, because the patches
apply "successfully" using apply_fragments(),
git will never fall back to the merge, even
if the "--3way" flag is used, and the user has
no way to ensure correctness by forcing the
three-way merge method.

Change the behavior so that when "--3way" is used,
git will always try the three-way merge first and
will only fall back to apply_fragments() in cases
where blobs are not available or some other error
(but not in the case of a merge conflict).

Since user-facing results will be different,
this has backwards compatibility implications
for users depending on the old behavior. In
addition, the three-way merge will be slower
than direct patch application.

Signed-off-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 17:11:41 -07:00
Ævar Arnfjörð Bjarmason
ea7811b37e git-send-email: improve --validate error output
Improve the output we emit on --validate error to:

 * Say "FILE:LINE" instead of "FILE: LINE", to match "grep -n",
   compiler error messages etc.

 * Don't say "patch contains a" after just mentioning the filename,
   just leave it at "FILE:LINE: is longer than[...]. The "contains a"
   sounded like we were talking about the file in general, when we're
   actually checking it line-by-line.

 * Don't just say "rejected by sendemail-validate hook", but combine
   that with the system_or_msg() output to say what exit code the hook
   died with.

I had an aborted attempt to make the line length checker note all
lines that were longer than the limit. I didn't think that was worth
the effort, but I've left in the testing change to check that we die
as soon as we spot the first long line.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 12:57:06 -07:00
Ævar Arnfjörð Bjarmason
e585210e1b git-send-email: test full --validate output
Change the tests that grep substrings out of the output to use a full
test_cmp, in preparation for improving the output.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-06 12:57:05 -07:00
Christian Couder
dba94e3a85 test-bloom: fix missing 'bloom' from usage string
Like 'get_murmur3' and 'generate_filter', 'get_filter_for_commit' is a
subcommand of `test-tool bloom` not of `test-tool` itself.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-05 22:54:34 -07:00
Junio C Hamano
8a4394d1c1 Merge branch 'zh/format-patch-fractional-reroll-count'
"git format-patch -v<n>" learned to allow a reroll count that is
not an integer.

* zh/format-patch-fractional-reroll-count:
  format-patch: allow a non-integral version numbers
2021-04-02 14:43:14 -07:00
Junio C Hamano
861794b60d Merge branch 'jh/simple-ipc'
A simple IPC interface gets introduced to build services like
fsmonitor on top.

* jh/simple-ipc:
  t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
  simple-ipc: add Unix domain socket implementation
  unix-stream-server: create unix domain socket under lock
  unix-socket: disallow chdir() when creating unix domain sockets
  unix-socket: add backlog size option to unix_stream_listen()
  unix-socket: eliminate static unix_stream_socket() helper function
  simple-ipc: add win32 implementation
  simple-ipc: design documentation for new IPC mechanism
  pkt-line: add options argument to read_packetized_to_strbuf()
  pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
  pkt-line: do not issue flush packets in write_packetized_*()
  pkt-line: eliminate the need for static buffer in packet_write_gently()
2021-04-02 14:43:14 -07:00
Taylor Blau
9218c6a40c midx: allow marking a pack as preferred
When multiple packs in the multi-pack index contain the same object, the
MIDX machinery must make a choice about which pack it associates with
that object. Prior to this patch, the lowest-ordered[1] pack was always
selected.

Pack selection for duplicate objects is relatively unimportant today,
but it will become important for multi-pack bitmaps. This is because we
can only invoke the pack-reuse mechanism when all of the bits for reused
objects come from the reuse pack (in order to ensure that all reused
deltas can find their base objects in the same pack).

To encourage the pack selection process to prefer one pack over another
(the pack to be preferred is the one a caller would like to later use as
a reuse pack), introduce the concept of a "preferred pack". When
provided, the MIDX code will always prefer an object found in a
preferred pack over any other.

No format changes are required to store the preferred pack, since it
will be able to be inferred with a corresponding MIDX bitmap, by looking
up the pack associated with the object in the first bit position (this
ordering is described in detail in a subsequent commit).

[1]: the ordering is specified by MIDX internals; for our purposes we
can consider the "lowest ordered" pack to be "the one with the
most-recent mtime.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01 13:07:37 -07:00