Add the ability to control rename detection for merge via a config setting.
This setting behaves the same and defaults to the value of diff.renames but only
applies to merge.
Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the documentation to better indicate that the renameLimit setting is
ignored if rename detection is turned off via command line options or config
settings.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The can-working-tree-updates-be-skipped check has had a long and blemished
history. The update can be skipped iff:
a) The merge is clean
b) The merge matches what was in HEAD (content, mode, pathname)
c) The target path is usable (i.e. not involved in D/F conflict)
Traditionally, we split b into parts:
b1) The merged result matches the content and mode found in HEAD
b2) The merged target path existed in HEAD
Steps a & b1 are easy to check; we have always gotten those right. While
it is easy to overlook step c, this was fixed seven years ago with commit
4ab9a157d0 ("merge_content(): Check whether D/F conflicts are still
present", 2010-09-20). merge-recursive didn't have a readily available
way to directly check step b2, so various approximations were used:
* In commit b2c8c0a762 ("merge-recursive: When we detect we can skip
an update, actually skip it", 2011-02-28), it was noted that although
the code claimed it was skipping the update, it did not actually skip
the update. The code was made to skip it, but used lstat(path, ...)
as an approximation to path-was-tracked-in-index-before-merge.
* In commit 5b448b8530 ("merge-recursive: When we detect we can skip
an update, actually skip it", 2011-08-11), the problem with using
lstat was noted. It was changed to the approximation
path2 && strcmp(path, path2)
which is also wrong. !path2 || strcmp(path, path2) would have been
better, but would have fallen short with directory renames.
* In c5b761fb27 ("merge-recursive: ensure we write updates for
directory-renamed file", 2018-02-14), the problem with the previous
approximation was noted and changed to
was_tracked(path)
That looks close to what we were trying to answer, but was_tracked()
as implemented at the time should have been named is_tracked(); it
returned something different than what we were looking for.
* To make matters more complex, fixing was_tracked() isn't sufficient
because the splitting of b into b1 and b2 is wrong. Consider the
following merge with a rename/add conflict:
side A: modify foo, add unrelated bar
side B: rename foo->bar (but don't modify the mode or contents)
In this case, the three-way merge of original foo, A's foo, and B's
bar will result in a desired pathname of bar with the same
mode/contents that A had for foo. Thus, A had the right mode and
contents for the file, and it had the right pathname present (namely,
bar), but the bar that was present was unrelated to the contents, so
the working tree update was not skippable.
Fix this by introducing a new function:
was_tracked_and_matches(o, path, &mfi.oid, mfi.mode)
and use it to directly check for condition b.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, merge_content() would print "Auto-merging" whenever the final
content and mode aren't already available from HEAD. There are a few
problems with this:
1) There are other code paths doing merges that should probably have the
same message printed, in particular rename/rename(2to1) which cannot
call into the normal rename logic.
2) If both sides of the merge have modifications, then a content merge
is needed. It may turn out that the end result matches one of the
sides (because the other only had a subset of the same changes), but
the merge was still needed. Currently, the message will not print in
that case, though it seems like it should.
Move the printing of this message to merge_file_1() in order to address
both issues.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
was_dirty() uses was_tracked(), which has been updated to use the original
index rather than the current one. However, was_dirty() also had a
separate call to cache_file_exists(), causing it to still implicitly use
the current index. Update that to instead use index_file_exists().
Also, was_dirty() had a hack where it would mark any file as non-dirty if
we simply didn't know its modification time. This was due to using the
current index rather than the original index, because D/F conflicts and
such would cause unpack_trees() to not copy the modification times from
the original index to the current one. Now that we are using the original
index, we can dispense with this hack.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In commit aacb82de3f ("merge-recursive: Split was_tracked() out of
would_lose_untracked()", 2011-08-11), was_tracked() was split out of
would_lose_untracked() with the intent to provide a function that could
answer whether a path was tracked in the index before the merge. Sadly,
it instead returned whether the path was in the working tree due to having
been tracked in the index before the merge OR having been written there by
unpack_trees(). The distinction is important when renames are involved,
e.g. for a merge where:
HEAD: modifies path b
other: renames b->c
In this case, c was not tracked in the index before the merge, but would
have been added to the index at stage 0 and written to the working tree by
unpack_trees(). would_lose_untracked() is more interested in the
in-working-copy-for-either-reason behavior, while all other uses of
was_tracked() want just was-it-tracked-in-index-before-merge behavior.
Unsplit would_lose_untracked() and write a new was_tracked() function
which answers whether a path was tracked in the index before the merge
started.
This will also affect was_dirty(), helping it to return better results
since it can base answers off the original index rather than an index that
possibly only copied over some of the stat information. However,
was_dirty() will need an additional change that will be made in a
subsequent patch.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add several tests checking whether updates can be skipped in a merge.
Also add several similar testcases for where updates cannot be skipped in
a merge to make sure that we skip if and only if we should.
In particular:
* Testcase 1a (particularly 1a-check-L) would have pointed out the
problem Linus has been dealing with for year with his merges[1].
* Testcase 2a (particularly 2a-check-L) would have pointed out the
problem with my directory-rename-series before it broke master[2].
* Testcases 3[ab] (particularly 3a-check-L) provide a simpler testcase
than 12b of t6043 making that one easier to understand.
* There are several complementary testcases to make sure we're not just
fixing those particular issues while regressing in the opposite
direction.
* There are also a pair of tests for the special case when a merge
results in a skippable update AND the user has dirty modifications to
the path.
[1] https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fXJRkvU1m_RHBG54NOoaZPA@mail.gmail.com/
[2] https://public-inbox.org/git/xmqqmuya43cs.fsf@gitster-ct.c.googlers.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a cherry-pick or merge with a rename results in a skippable update
(due to the merged content matching what HEAD already had), but the
working directory is dirty, avoid trying to refresh the index as that
will fail.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
conflict_rename_normal() was doing some handling for dirty files that
more naturally belonged in merge_content. Move it, and rename a
parameter for clarity while at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Four closely related changes all with the purpose of fixing error handling
in this function:
- fix reported function name in add_cacheinfo error messages
- differentiate between the two error messages
- abort early when we hit the error (stop ignoring return code)
- mark a test which was hitting this error as failing until we get the
right fix
In more detail...
In commit 0424138d57 ("Fix bogus error message from merge-recursive
error path", 2007-04-01), it was noted that the name of the function which
the error message claimed it was reported from did not match the actual
function name. This was changed to something closer to the real function
name, but it still didn't match the actual function name. Fix the
reported name to match.
Second, the two errors in this function had identical messages, preventing
us from knowing which error had been triggered. Add a couple words to the
second error message to differentiate the two.
Next, make sure callers do not ignore the return code so that it will stop
processing further entries (processing further entries could result in
more output which could cause the error to scroll off the screen, or at
least be missed by the user) and make it clear the error is the cause of
the early abort. These errors should never be triggered in production; if
either one is, it represents a bug in the calling path somewhere and is
likely to have resulted in mis-merged content. The combination of
ignoring of the return code and continuing to print other standard
messages after hitting the error resulted in the following bug report from
Junio: "...the command pretends that everything went well and merged
cleanly in that path...[Behaving] in a buggy and unexplainable way is bad
enough, doing so silently is unexcusable." Fix this.
Finally, there was one test in the testsuite that did hit this error path,
but was passing anyway. This would have been easy to miss since it had a
test_must_fail and thus could have failed for the wrong reason, but in a
separate testing step I added an intentional NULL-dereference to the
codepath where these error messages are printed in order to flush out such
cases. I could modify that test to explicitly check for this error and
fail the test if it is hit, but since this test operates in a bit of a
gray area and needed other changes, I went for a different fix. The gray
area this test operates in is the following: If the merge of a certain
file results in the same version of the file that existed in HEAD, but
there are dirty modifications to the file, is that an error with a
"Refusing to overwrite existing file" expected, or a case where the merge
should succeed since we shouldn't have to touch the dirty file anyway?
Recent discussion on the list leaned towards saying it should be a
success. Therefore, change the expected behavior of this test to match.
As a side effect, this makes the failed-due-to-hitting-add_cacheinfo-error
very clear, and we can mark the test as test_expect_failure. A subsequent
commit will implement the necessary changes to get this test to pass
again.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a file on one side of history was renamed, and merely modified on the
other side, then applying a directory rename to the modified side gives us
a rename/rename(1to2) conflict. We should only apply directory renames to
pairs representing either adds or renames.
Making this change means that a directory rename testcase that was
previously reported as a rename/delete conflict will now be reported as a
modify/delete conflict.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a testcase showing spurious rename/rename(1to2) conflicts occurring
due to directory rename detection.
Also add a pair of testcases dealing with moving directory hierarchies
around that were suggested by Stefan Beller as "food for thought" during
his review of an earlier patch series, but which actually uncovered a
bug. Round things out with a test that is a cross between the two
testcases that showed existing bugs in order to make sure we aren't
merely addressing problems in isolation but in general.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes an issue that existed before my directory rename detection
patches that affects both normal renames and renames implied by
directory rename detection. Additional codepaths that only affect
overwriting of dirty files that are involved in directory rename
detection will be added in a subsequent commit.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit hooks together all the directory rename logic by making the
necessary changes to the rename struct, it's dst_entry, and the
diff_filepair under consideration.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_renames() would look up stage data that already existed (populated
in get_unmerged(), taken from whatever unpack_trees() created), and if
it didn't exist, would call insert_stage_data() to create the necessary
entry for the given file. The insert_stage_data() fallback becomes
much more important for directory rename detection, because that creates
a mechanism to have a file in the resulting merge that didn't exist on
either side of history. However, insert_stage_data(), due to calling
get_tree_entry() loaded up trees as readily as files. We aren't
interested in comparing trees to files; the D/F conflict handling is
done elsewhere. This code is just concerned with what entries existed
for a given path on the different sides of the merge, so create a
get_tree_entry_if_blob() helper function and use it.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
file level either. If there aren't any, then get the new name from
any directory renames.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
directory renaming and merging can cause one or more files to be moved to
where an existing file is, or to cause several files to all be moved to
the same (otherwise vacant) location. Add checking and reporting for such
cases, falling back to no-directory-rename handling for such paths.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
directory level. There will be additional checks at the individual
file level too, which will be added later.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This populates a set of directory renames for us. The set of directory
renames is not yet used, but will be in subsequent commits.
Note that the use of a string_list for possible_new_dirs in the new
dir_rename_entry struct implies an O(n^2) algorithm; however, in practice
I expect the number of distinct directories that files were renamed into
from a single original directory to be O(1). My guess is that n has a
mode of 1 and a mean of less than 2, so, for now, string_list seems good
enough for possible_new_dirs.
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git http-fetch" (deprecated) had an optional and experimental
"feature" to fetch only commits and/or trees, which nobody used.
This has been removed.
* ma/http-walker-no-partial:
walker: drop fields of `struct walker` which are always 1
http-fetch: make `-a` standard behaviour
* js/runtime-prefix:
Avoid multiple PREFIX definitions
git_setup_gettext: plug memory leak
gettext: avoid initialization if the locale dir is not present
Error messages from "git push" can be painted for more visibility.
* js/colored-push-errors:
config: document the settings to colorize push errors/hints
push: test to verify that push errors are colored
push: colorize errors
color: introduce support for colorizing stderr
"git gc --prune=nonsense" spent long time repacking and then
silently failed when underlying "git prune --expire=nonsense"
failed to parse its command line. This has been corrected.
* jc/parseopt-expiry-errors:
parseopt: handle malformed --expire arguments more nicely
gc: do not upcase error message shown with die()
"git fast-export" had a regression in v2.15.0 era where it skipped
some merge commits in certain cases, which has been corrected.
* ma/fast-export-skip-merge-fix:
fast-export: fix regression skipping some merge-commits
The command line completion (in contrib/) has been taught that "git
stash save" has been deprecated ("git stash push" is the preferred
spelling in the new world) and does not offer it as a possible
completion candidate when "git stash push" can be.
* tg/demote-stash-save-in-completion:
completion: make stash -p and alias for stash push -p
completion: stop showing 'save' for stash by default
When fed input that already has In-Reply-To: and/or References:
headers and told to add the same information, "git send-email"
added these headers separately, instead of appending to an existing
one, which is a violation of the RFC. This has been corrected.
* sa/send-email-dedup-some-headers:
send-email: avoid duplicate In-Reply-To/References
"git submodule status" did not check the symbolic revision name it
computed for the submodule HEAD is not the NULL, and threw it at
printf routines, which has been corrected.
* nd/submodule-status-fix:
submodule--helper: don't print null in 'submodule status'
During a "rebase -i" session, the code could give older timestamp
to commits created by later "pick" than an earlier "reword", which
has been corrected.
* js/ident-date-fix:
sequencer: reset the committer date before commits
What is queued here is only the obviously correct and
uncontroversial code clean-up part, which is an earlier 7 patches,
of a larger series.
The remainder that is not queued introduces a few configuration
variables to deal with e-signature backends with different
signature format.
* bt/gpg-interface:
gpg-interface: find the last gpg signature line
gpg-interface: extract gpg line matching helper
gpg-interface: fix const-correctness of "eol" pointer
gpg-interface: use size_t for signature buffer size
gpg-interface: modernize function declarations
gpg-interface: handle bool user.signingkey
t7004: fix mistaken tag name
"git ls-remote" learned an option to allow sorting its output based
on the refnames being shown.
* hn/sort-ls-remote:
ls-remote: create '--sort' option
"git config --get" learned the "--default" option, to help the
calling script. Building on top of the tb/config-type topic, the
"git config" learns "--type=color" type. Taken together, you can
do things like "git config --get foo.color --default blue" and get
the ANSI color sequence for the color given to foo.color variable,
or "blue" if the variable does not exist.
* tb/config-default:
builtin/config: introduce `color` type specifier
config.c: introduce 'git_config_color' to parse ANSI colors
builtin/config: introduce `--default`
The "git config" command uses separate options e.g. "--int",
"--bool", etc. to specify what type the caller wants the value to
be interpreted as. A new "--type=<typename>" option has been
introduced, which would make it cleaner to define new types.
* tb/config-type:
builtin/config.c: support `--type=<type>` as preferred alias for `--<type>`
builtin/config.c: treat type specifiers singularly
The completion script (in contrib/) learned to clear cached list of
command line options upon dot-sourcing it again in a more efficient
way.
* sg/completion-clear-cached:
completion: reduce overhead of clearing cached --options
"git worktree remove" learned that "-f" is a shorthand for
"--force" option, just like for "git worktree add".
* sb/worktree-remove-opt-force:
worktree: accept -f as short for --force for removal
The new "checkout-encoding" attribute can ask Git to convert the
contents to the specified encoding when checking out to the working
tree (and the other way around when checking in).
* ls/checkout-encoding:
convert: add round trip check based on 'core.checkRoundtripEncoding'
convert: add tracing for 'working-tree-encoding' attribute
convert: check for detectable errors in UTF encodings
convert: add 'working-tree-encoding' attribute
utf8: add function to detect a missing UTF-16/32 BOM
utf8: add function to detect prohibited UTF-16/32 BOM
utf8: teach same_encoding() alternative UTF encoding names
strbuf: add a case insensitive starts_with()
strbuf: add xstrdup_toupper()
strbuf: remove unnecessary NUL assignment in xstrdup_tolower()
The scripts in contrib/emacs/ have outlived their usefulness and
have been replaced with a stub that errors out and tells the user
there are replacements.
* ab/nuke-emacs-contrib:
git{,-blame}.el: remove old bitrotting Emacs code
The build procedure "make DEVELOPER=YesPlease" learned to enable a
bit more warning options depending on the compiler used to help
developers more. There also is "make DEVOPTS=tokens" knob
available now, for those who want to help fixing warnings we
usually ignore, for example.
* nd/warn-more-for-devs:
Makefile: add a DEVOPTS to get all of -Wextra
Makefile: add a DEVOPTS to suppress -Werror under DEVELOPER
Makefile: detect compiler and enable more warnings in DEVELOPER=1
connect.c: mark die_initial_contact() NORETURN
The effort to pass the repository in-core structure throughout the
API continues. This round deals with the code that implements the
refs/replace/ mechanism.
* sb/object-store-replace:
replace-object: allow lookup_replace_object to handle arbitrary repositories
replace-object: allow do_lookup_replace_object to handle arbitrary repositories
replace-object: allow prepare_replace_object to handle arbitrary repositories
refs: allow for_each_replace_ref to handle arbitrary repositories
refs: store the main ref store inside the repository struct
replace-object: add repository argument to lookup_replace_object
replace-object: add repository argument to do_lookup_replace_object
replace-object: add repository argument to prepare_replace_object
refs: add repository argument to for_each_replace_ref
refs: add repository argument to get_main_ref_store
replace-object: check_replace_refs is safe in multi repo environment
replace-object: eliminate replace objects prepared flag
object-store: move lookup_replace_object to replace-object.h
replace-object: move replace_map to object store
replace_object: use oidmap
Precompute and store information necessary for ancestry traversal
in a separate file to optimize graph walking.
* ds/commit-graph:
commit-graph: implement "--append" option
commit-graph: build graph from starting commits
commit-graph: read only from specific pack-indexes
commit: integrate commit graph with commit parsing
commit-graph: close under reachability
commit-graph: add core.commitGraph setting
commit-graph: implement git commit-graph read
commit-graph: implement git-commit-graph write
commit-graph: implement write_commit_graph()
commit-graph: create git-commit-graph builtin
graph: add commit graph design document
commit-graph: add format document
csum-file: refactor finalize_hashfile() method
csum-file: rename hashclose() to finalize_hashfile()
"git config --unset a.b", when "a.b" is the last variable in an
otherwise empty section "a", left an empty section "a" behind, and
worse yet, a subsequent "git config a.c value" did not reuse that
empty shell and instead created a new one. These have been
(partially) corrected.
* js/empty-config-section-fix:
git_config_set: reuse empty sections
git config --unset: remove empty sections (in the common case)
git_config_set: make use of the config parser's event stream
git_config_set: do not use a state machine
config_set_store: rename some fields for consistency
config: avoid using the global variable `store`
config: introduce an optional event stream while parsing
t1300: `--unset-all` can leave an empty section behind (bug)
t1300: add a few more hairy examples of sections becoming empty
t1300: remove unreasonable expectation from TODO
t1300: avoid relying on a bug
config --replace-all: avoid extra line breaks
t1300: demonstrate that --replace-all can "invent" newlines
t1300: rename it to reflect that `repo-config` was deprecated
git_config_set: fix off-by-two
Code restructuring, in preparation for further work.
* ot/libify-get-ref-atom-value:
ref-filter: libify get_ref_atom_value()
ref-filter: add return value to parsers
ref-filter: change parsing function error handling
ref-filter: add return value && strbuf to handlers
ref-filter: start adding strbufs with errors
ref-filter: add shortcut to work with strbufs