Make it clear that `ort` is the default merge strategy now rather than
`recursive`, including moving `ort` to the front of the list of merge
strategies.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few reasons to switch the default:
* Correctness
* Extensibility
* Performance
I'll provide some summaries about each.
=== Correctness ===
The original impetus for a new merge backend was to fix issues that were
difficult to fix within recursive's design. The success with this goal
is perhaps most easily demonstrated by running the following:
$ git grep -2 KNOWN_FAILURE t/ | grep -A 4 GIT_TEST_MERGE_ALGORITHM
$ git grep test_expect_merge_algorithm.failure.success t/
$ git grep test_expect_merge_algorithm.success.failure t/
In order, these greps show:
* Seven sets of submodule tests (10 total tests) that fail with
recursive but succeed with ort
* 22 other tests that fail with recursive, but succeed with ort
* 0 tests that pass with recursive, but fail with ort
=== Extensibility ===
Being able to perform merges without touching the working tree or index
makes it possible to create new features that were difficult with the
old backend:
* Merging, cherry-picking, rebasing, reverting in bare repositories...
or just on branches that aren't checked out.
* `git diff AUTO_MERGE` -- ability to see what changes the user has
made to resolve conflicts so far (see commit 5291828df8 ("merge-ort:
write $GIT_DIR/AUTO_MERGE whenever we hit a conflict", 2021-03-20)
* A --remerge-diff option for log/show, used to show diffs for merges
that display the difference between what an automatic merge would
have created and what was recorded in the merge. (This option will
often result in an empty diff because many merges are clean, but for
the non-clean ones it will show how conflicts were fixed including
the removal of conflict markers, and also show additional changes
made outside of conflict regions to e.g. fix semantic conflicts.)
* A --remerge-diff-only option for log/show, similar to --remerge-diff
but also showing how cherry-picks or reverts differed from what an
automatic cherry-pick or revert would provide.
The last three have been implemented already (though only one has been
submitted upstream so far; the others were waiting for performance work
to complete), and I still plan to implement the first one.
=== Performance ===
I'll quote from the summary of my final optimization for merge-ort
(while fixing the testcase name from 'no-renames' to 'few-renames'):
Timings
Infinite
merge- merge- Parallelism
recursive recursive of rename merge-ort
v2.30.0 current detection current
---------- --------- ----------- ---------
few-renames: 18.912 s 18.030 s 11.699 s 198.3 ms
mega-renames: 5964.031 s 361.281 s 203.886 s 661.8 ms
just-one-mega: 149.583 s 11.009 s 7.553 s 264.6 ms
Speedup factors
Infinite
merge- merge- Parallelism
recursive recursive of rename
v2.30.0 current detection merge-ort
---------- --------- ----------- ---------
few-renames: 1 1.05 1.6 95
mega-renames: 1 16.5 29 9012
just-one-mega: 1 13.6 20 565
And, for partial clone users:
Factor reduction in number of objects needed
Infinite
merge- merge- Parallelism
recursive recursive of rename
v2.30.0 current detection merge-ort
---------- --------- ----------- ---------
mega-renames: 1 1 1 181.3
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There were two locations in the code that referred to 'merge-recursive'
but which were also applicable to 'merge-ort'. Update them to more
general wording.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 58634dbff8 ("rebase: Allow merge strategies to be used when
rebasing", 2006-06-21) added the --merge option to git-rebase so that
renames could be detected (at least when using the `recursive` merge
backend). However, git-am -3 gained that same ability in commit
579c9bb198 ("Use merge-recursive in git-am -3.", 2006-12-28). As such,
the comment about being able to detect renames is not particularly
noteworthy. Remove it.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have diff-algorithm that explains why there are special diff
algorithms, so we do not need to re-explain patience. patience exists
as its own toplevel option for historical reasons, but there's no reason
to give it special preference or document it again and suggest it's more
important than other diff algorithms, so just refer to it as a
deprecated shorthand for `diff-algorithm=patience`.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stating that the recursive strategy "currently cannot make use of
detected copies" implies that this is a technical shortcoming of the
current algorithm. I disagree with that. I don't see how copies could
possibly be used in a sane fashion in a merge algorithm -- would we
propagate changes in one file on one side of history to each copy of
that file when merging? That makes no sense to me. I cannot think of
anything else that would make sense either. Change the wording to
simply state that we ignore any copies.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is probably helpful to cover the default merge strategy first, so
move the text for the resolve strategy to later in the document.
Further, the wording for "resolve" claimed that it was "considered
generally safe and fast", which might imply in some readers minds that
the same is not true of other strategies. Rather than adding this text
to all the strategies, just remove it from this one.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A few places in the documentation referred to the "`recursive` strategy"
using the phrase "`git merge-recursive`", suggesting that it was forking
subprocesses to call a toplevel builtin. Perhaps that was relevant to
when rebase was a shell script, but it seems like a rather indirect way
to refer to the `recursive` strategy. Simplify the references.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit 0c4fd732f0 ("Move computation of dir_rename_count from
merge-ort to diffcore-rename", 2021-02-27), much of the logic for
computing directory renames moved into diffcore-rename.
directory-rename-detection.txt had claims that all of that logic was
found in merge-recursive. Update the documentation.
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --rebase-merges was first introduced, it only worked with the
`recursive` strategy. Some time later, it gained support for merges
using the `octopus` strategy. The limitation of only supporting these
two strategies was documented in 25cff9f109 ("rebase -i --rebase-merges:
add a section to the man page", 2018-04-25) and lifted in e145d99347
("rebase -r: support merge strategies other than `recursive`",
2019-07-31). However, when the limitation was lifted, the documentation
was not updated. Update it now.
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Windows rmdir() equivalent behaves differently from POSIX ones in
that when used on a symbolic link that points at a directory, the
target directory gets removed, which has been corrected.
* tb/mingw-rmdir-symlink-to-directory:
mingw: align symlinks-related rmdir() behavior with Linux
The local changes stashed by "git merge --autostash" were lost when
the merge failed in certain ways, which has been corrected.
* pb/merge-autostash-more:
merge: apply autostash if merge strategy fails
merge: apply autostash if fast-forward fails
Documentation: define 'MERGE_AUTOSTASH'
merge: add missing word "strategy" to a message
Further optimization on "merge -sort" backend.
* en/ort-perf-batch-14:
merge-ort: restart merge with cached renames to reduce process entry cost
merge-ort: avoid recursing into directories when we don't need to
merge-ort: defer recursing into directories when merge base is matched
merge-ort: add a handle_deferred_entries() helper function
merge-ort: add data structures for allowable trivial directory resolves
merge-ort: add some more explanations in collect_merge_info_callback()
merge-ort: resolve paths early when we have sufficient information
Reorganize and update the SubmitingPatches document.
* ab/update-submitting-patches:
SubmittingPatches: replace discussion of Travis with GitHub Actions
SubmittingPatches: move discussion of Signed-off-by above "send"
Leak plugging.
* ah/plugleaks:
reset: clear_unpack_trees_porcelain to plug leak
builtin/rebase: fix options.strategy memory lifecycle
builtin/merge: free found_ref when done
builtin/mv: free or UNLEAK multiple pointers at end of cmd_mv
convert: release strbuf to avoid leak
read-cache: call diff_setup_done to avoid leak
ref-filter: also free head for ATOM_HEAD to avoid leak
diffcore-rename: move old_dir/new_dir definition to plug leak
builtin/for-each-repo: remove unnecessary argv copy to plug leak
builtin/submodule--helper: release unused strbuf to avoid leak
environment: move strbuf into block to plug leak
fmt-merge-msg: free newly allocated temporary strings when done
Rewrite of "git submodule" in C continues.
* ar/submodule-add:
submodule: drop unused sm_name parameter from show_fetch_remotes()
submodule--helper: introduce add-clone subcommand
submodule--helper: refactor module_clone()
submodule: prefix die messages with 'fatal'
t7400: test failure to add submodule in tracked path
When doing reference negotiation, git-fetch-pack(1) is loading all refs
from disk in order to determine which commits it has in common with the
remote repository. This can be quite expensive in repositories with many
references though: in a real-world repository with around 2.2 million
refs, fetching a single commit by its ID takes around 44 seconds.
Dominating the loading time is decompression and parsing of the objects
which are referenced by commits. Given the fact that we only care about
commits (or tags which can be peeled to one) in this context, there is
thus an easy performance win by switching the parsing logic to make use
of the commit graph in case we have one available. Like this, we avoid
hitting the object database to parse these commits but instead only load
them from the commit-graph. This results in a significant performance
boost when executing git-fetch in said repository with 2.2 million refs:
Benchmark #1: HEAD~: git fetch $remote $commit
Time (mean ± σ): 44.168 s ± 0.341 s [User: 42.985 s, System: 1.106 s]
Range (min … max): 43.565 s … 44.577 s 10 runs
Benchmark #2: HEAD: git fetch $remote $commit
Time (mean ± σ): 19.498 s ± 0.724 s [User: 18.751 s, System: 0.690 s]
Range (min … max): 18.629 s … 20.454 s 10 runs
Summary
'HEAD: git fetch $remote $commit' ran
2.27 ± 0.09 times faster than 'HEAD~: git fetch $remote $commit'
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When I was fixing fuzzies as I updating po/id.po for 2.33.0 l10n round,
I noticed a triple-dash typo (--pickaxe-all) at diff.c, which according
to git-diff(1) manpage, the correct option name should be --pickaxe-all.
Fix the typo.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify code maintenance by removing the ability to toggle between
usage of memory pools and direct allocations. This allows us to also
remove paths_to_free since it was solely about bookkeeping to make sure
we freed the necessary paths, and allows us to remove some auxiliary
functions.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing a rebase, rmdir() is called on the folder .git/logs. On
Unix rmdir() exits without deleting anything in case .git/logs is a
symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir
and RemoveDirectoryW) do not behave the same and remove the folder if it
is symlinked even if it is not empty.
This creates issues when folders in .git/ are symlinks which is
especially the case when git-repo[1] is used: It replaces `.git/logs/`
with a symlink.
One such issue is that the _target_ of that symlink is removed e.g.
during a `git rebase`, where `delete_reflog("REBASE_HEAD")` will not
only try to remove `.git/logs/REBASE_HEAD` but then recursively try to
remove the parent directories until an error occurs, a technique that
obviously relies on `rmdir()` refusing to remove a symlink.
This was reported in https://github.com/git-for-windows/git/issues/2967.
This commit updates mingw_rmdir() so that its behavior is the same as
Linux rmdir() in case of symbolic links.
To verify that Git does not regress on the reported issue, this patch
adds a regression test for the `git rebase` symptom, even if the same
`rmdir()` behavior is quite likely to cause potential problems in other
Git commands as well.
[1]: git-repo is a python tool built on top of Git which helps manage
many Git repositories. It stores all the .git/ folders in a central
place by taking advantage of symbolic links.
More information: https://gerrit.googlesource.com/git-repo/
Signed-off-by: Thomas Bétous <tomspycell@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
24c30e0b6 (wt-status: tolerate dangling marks, 2020-09-01) adds a test
that uses a BRE which breaks at least with OpenBSD's grep.
switch to an ERE as it is done for similar checks and while at it, remove
the now obsolete test_i18ngrep call.
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commits we introduced new documentation that talks
about "[commit|object] prerequsite(s)", but also faithfully moved
around existing documentation that talks about the "basis".
Let's change both that moved-around documentation and other existing
documentation in the file to consistently use "[commit|object]"
prerequisite(s)" instead of talking about "basis". The mention of
"basis" isn't wrong, but readers will be helped by us using only one
term throughout the document for this concept.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elaborate on the restriction that you cannot provide a revision that
doesn't resolve to a reference in the "SPECIFYING REFERENCES" section
with examples.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split out the discussion bout "object prerequisites" into its own
section, and add some more examples of the common cases.
See 2e0afafebd (Add git-bundle: move objects and references by
archive, 2007-02-22) for the introduction of the documentation being
changed here.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rewrite the "DESCRIPTION" section for "git bundle" to start by talking
about what bundles are in general terms, rather than diving directly
into one example of what they might be used for.
This changes documentation that's been substantially the same ever
since the command was added in 2e0afafebd (Add git-bundle: move
objects and references by archive, 2007-02-22).
I've split up the DESCRIPTION into that section and a "BUNDLE FORMAT"
section, it briefly discusses the format, but then links to the
technical/bundle-format.txt documentation.
The "the user must specify a basis" part of this is discussed below in
"SPECIFYING REFERENCES", and will be further elaborated on in a
subsequent commit. So I'm removing that part and letting the mention
of "revision exclusions" suffice.
There was a discussion about whether to say anything at all about
"thin packs" here[1]. I think it's good to mention it for the curious
reader willing to read the technical docs, but let's explicitly say
that there's no "thick pack", and that the difference shouldn't
matter.
1. http://lore.kernel.org/git/xmqqk0mbt5rj.fsf@gitster.g
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A race between repacking and using pack bitmaps has been corrected.
* jk/check-pack-valid-before-opening-bitmap:
pack-bitmap: check pack validity when opening bitmap
"git read-tree" had a codepath where blobs are fetched one-by-one
from the promisor remote, which has been corrected to fetch in bulk.
* jt/bulk-prefetch:
cache-tree: prefetch in partial clone read-tree
unpack-trees: refactor prefetching code