git-commit-vandalism

Author	SHA1	Message	Date
Elijah Newren	2b86c10084	merge-ort: fix bug with dir rename vs change dir to symlink When changing a directory to a symlink on one side of history, and renaming the parent of that directory to a different directory name on the other side, e.g. with this kind of setup: Base commit: Has a file named dir/subdir/file Side1: Rename dir/ -> renamed-dir/ Side2: delete dir/subdir/file, add dir/subdir as symlink Then merge-ort was running into an assertion failure: git: merge-ort.c:2622: apply_directory_rename_modifications: Assertion `ci->dirmask == 0' failed merge-recursive did not have as obvious an issue handling this case, likely because we never fixed it to handle the case from commit `902c521a35` ("t6423: more involved directory rename test", 2020-10-15) where we need to be careful about nested renames when a directory rename occurs (dir/ -> renamed-dir/ implies dir/subdir/ -> renamed-dir/subdir/). However, merge-recursive does have multiple problems with this testcase: * Incorrect stages for the file: merge-recursive omits the stage in the index corresponding to the base stage, making `git status` report "added by us" for renamed-dir/subdir/file instead of the expected "deleted by them". * Poor directory/file conflict handling: For the renamed-dir/subdir symlink, instead of reporting a file/directory conflict as expected, it reports "Error: Refusing to lose untracked file at renamed-dir/subdir". This is a lie because there is no untracked file at that location. It then does the normal suboptimal merge-recursive thing of having the symlink be tracked in the index at a location where it can't be written due to D/F conflicts (namely, renamed-dir/subdir), but writes it to the working tree at a different location as a new untracked file (namely, renamed-dir/subdir~B^0) Technically, these problems don't prevent the user from resolving the merge if they can figure out to ignore the confusion, but because both pieces of output are quite confusing I don't want to modify the test to claim the recursive also passes it even if it doesn't have the bug that ort did. So, fix the bug in ort by splitting the conflict_info for "dir/subdir" into two, one for the directory part, one for the file (i.e. symlink) part, since the symlink is being renamed by directory rename detection. The directory part is needed for proper nesting, since there are still conflict_info fields for files underneath it (though those are marked as is_null, they are still present until the entries are processed, and the entry processing wants every non-toplevel entry to have a parent directory). Reported-by: Stefano Rivera <stefano@rivera.za.net> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-22 16:10:33 -07:00
Elijah Newren	6693fb3f01	t64xx: convert 'test_create_repo' to 'git init' Convert the merge-specific tests (those in the t64xx range) over to using 'git init' instead of 'test_create_repo'. Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-26 09:23:03 -07:00
Elijah Newren	751e165424	merge-ort: fix issue with dual rename and add/add conflict There is code in both merge-recursive and merge-ort for avoiding doubly transitive renames (i.e. one side renames directory A/ -> B/, and the other side renames directory B/ -> C/), because this combination would otherwise make a mess for new files added to A/ on the first side and wondering which directory they end up in -- especially if there were even more renames such as the first side renaming C/ -> D/. In such cases, it just turns "off" directory rename detection for the higher order transitive cases. The testcases added in t6423 a couple commits ago are slightly different but similar in principle. They involve a similar case of paired renaming but instead of A/ -> B/ and B/ -> C/, the second side renames a leading directory of B/ to C/. And both sides add a new file somewhere under the directory that the other side will rename. While the new files added start within different directories and thus could logically end up within different directories, it is weird for a file on one side to end up where the other one started and not move along with it. So, let's just turn off directory rename detection in this case as well. Another way to look at this is that if the source name involved in a directory rename on one side is the target name of a directory rename operation for a file from the other side, then we avoid the doubly transitive rename. (More concretely, if a directory rename on side D wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D already had a file named NEW_NAME, and a directory rename on side E wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the directory rename detection for NEW_NAME to prevent the NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add conflict on NEW_NAME.) Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-06 09:39:46 -07:00
Elijah Newren	0565cee5e4	t6423: add tests of dual directory rename plus add/add conflict This is an attempt at minimalizing a testcase reported by Glen Choo with tensorflow where merge-ort would report an assertion failure: Assertion failed: (ci->filemask == 2 \|\| ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410 reversing the direction of the merge provides a different error: error: cache entry has null sha1: ... fatal: unable to write .git/index so we add testcases for both. With these new testcases, the recursive strategy differs in that it returns the latter error for both merge directions. These testcases are somehow a little different than Glen's original tensorflow testcase in that these ones trigger a bug with the recursive algorithm whereas his testcase didn't. I figure that means these testcases somehow manage to be more comprehensive. Reported-by: Glen Choo <chooglen@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-06 09:39:46 -07:00
Elia Pinto	c614beb933	t6423-merge-rename-directories.sh: use the $(...) construct The Git CodingGuidelines prefer the $(...) construct for command substitution instead of using the backquotes `...`. The backquoted form is the traditional method for command substitution, and is supported by POSIX. However, all but the simplest uses become complicated quickly. In particular, embedded command substitutions and/or the use of double quotes require careful escaping with the backslash character. The patch was generated by: for _f in $(find . -name "*.sh") do shellcheck -i SC2006 -f diff ${_f} \| ifne git apply -p2 done and then carefully proof-read. Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-13 19:03:32 +00:00
Elijah Newren	8b09a900a1	merge-ort: restart merge with cached renames to reduce process entry cost The merge algorithm mostly consists of the following three functions: collect_merge_info() detect_and_process_renames() process_entries() Prior to the trivial directory resolution optimization of the last half dozen commits, process_entries() was consistently the slowest, followed by collect_merge_info(), then detect_and_process_renames(). When the trivial directory resolution applies, it often dramatically decreases the amount of time spent in the two slower functions. Looking at the performance results in the previous commit, the trivial directory resolution optimization helps amazingly well when there are no relevant renames. It also helps really well when reapplying a long series of linear commits (such as in a rebase or cherry-pick), since the relevant renames may well be cached from the first reapplied commit. But when there are any relevant renames that are not cached (represented by the just-one-mega testcase), then the optimization does not help at all. Often, I noticed that when the optimization does not apply, it is because there are a handful of relevant sources -- maybe even only one. It felt frustrating to need to recurse into potentially hundreds or even thousands of directories just for a single rename, but it was needed for correctness. However, staring at this list of functions and noticing that process_entries() is the most expensive and knowing I could avoid it if I had cached renames suggested a simple idea: change collect_merge_info() detect_and_process_renames() process_entries() into collect_merge_info() detect_and_process_renames() <cache all the renames, and restart> collect_merge_info() detect_and_process_renames() process_entries() This may seem odd and look like more work. However, note that although we run collect_merge_info() twice, the second time we get to employ trivial directory resolves, which makes it much faster, so the increased time in collect_merge_info() is small. While we run detect_and_process_renames() again, all renames are cached so it's nearly a no-op (we don't call into diffcore_rename_extended() but we do have a little bit of data structure checking and fixing up). And the big payoff comes from the fact that process_entries(), will be much faster due to having far fewer entries to process. This restarting only makes sense if we can save recursing into enough directories to make it worth our while. Introduce a simple heuristic to guide this. Note that this heuristic uses a "wanted_factor" that I have virtually no actual real world data for, just some back-of-the-envelope quasi-scientific calculations that I included in some comments and then plucked a simple round number out of thin air. It could be that tweaking this number to make it either higher or lower improves the optimization. (There's slightly more here; when I first introduced this optimization, I used a factor of 10, because I was completely confident it was big enough to not cause slowdowns in special cases. I was certain it was higher than needed. Several months later, I added the rough calculations which make me think the optimal number is close to 2; but instead of pushing to the limit, I just bumped it to 3 to reduce the risk that there are special cases where this optimization can result in slowing down the code a little. If the ratio of path counts is below 3, we probably will only see minor performance improvements at best anyway.) Also, note that while the diffstat looks kind of long (nearly 100 lines), more than half of it is in two comments explaining how things work. For the testcases mentioned in commit `557ac0350d` ("merge-ort: begin performance work; instrument with trace2_region_* calls", 2020-10-28), this change improves the performance as follows: Before After no-renames: 205.1 ms ± 3.8 ms 204.2 ms ± 3.0 ms mega-renames: 1.564 s ± 0.010 s 1.076 s ± 0.015 s just-one-mega: 479.5 ms ± 3.9 ms 364.1 ms ± 7.0 ms Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-20 14:47:40 -07:00
Junio C Hamano	d3b88be1b4	Merge branch 'en/merge-dir-rename-corner-case-fix' The merge code had funny interactions between content based rename detection and directory rename detection. * en/merge-dir-rename-corner-case-fix: merge-recursive: handle rename-to-self case merge-ort: ensure we consult df_conflict and path_conflicts t6423: test directory renames causing rename-to-self	2021-07-16 17:42:45 -07:00
Junio C Hamano	89efac81c7	Merge branch 'en/ort-perf-batch-12' More fix-ups and optimization to "merge -sort". * en/ort-perf-batch-12: merge-ort: miscellaneous touch-ups Fix various issues found in comments diffcore-rename: avoid unnecessary strdup'ing in break_idx merge-ort: replace string_list_df_name_compare with faster alternative	2021-07-16 17:42:45 -07:00
Elijah Newren	3585d0ea23	merge-recursive: handle rename-to-self case Directory rename detection can cause transitive renames, e.g. if the two different sides of history each do one half of: A/file -> B/file B/ -> C/ then directory rename detection transitively renames to give us A/file -> C/file However, when C/ == A/, note that this gives us A/file -> A/file. merge-recursive assumed that any rename D -> E would have D != E. While that is almost always true, the above is a special case where it is not. So we cannot do things like delete the rename source, we cannot assume that a file existing at path E implies a rename/add conflict and we have to be careful about what stages end up in the output. This change feels a bit hackish. It took me surprisingly many hours to find, and given merge-recursive's design causing it to attempt to enumerate all combinations of edge and corner cases with special code for each combination, I'm worried there are other similar fixes needed elsewhere if we can just come up with the right special testcase. Perhaps an audit would rule it out, but I have not the energy. merge-recursive deserves to die, and since it is on its way out anyway, fixing this particular bug narrowly will have to be good enough. Reported-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:40:10 -07:00
Elijah Newren	a492d5331c	merge-ort: ensure we consult df_conflict and path_conflicts Path conflicts (typically rename path conflicts, e.g. rename/rename(1to2) or rename/add/delete), and directory/file conflicts should obviously result in files not being marked as clean in the merge. We had a codepath where we missed consulting the path_conflict and df_conflict flags, based on match_mask. Granted, it requires an unusual setup to trigger this codepath (directory rename causing rename-to-self is the only case I can think of), but we still need to handle it. To make it clear that we have audited the other codepaths that do not explicitly mention these flags, add some assertions that the flags are not set. Reported-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:40:10 -07:00
Elijah Newren	806f83287f	t6423: test directory renames causing rename-to-self Directory rename detection can cause transitive renames, e.g. if the two different sides of history each do one half of: A/file -> B/file B/ -> C/ then directory rename detection transitively renames to give us C/file. Since the default for merge.directoryRenames is conflict, this results in an error message saying it is unclear whether the file should be placed at B/file or C/file. What if C/ is A/, though? In such a case, the transitive rename would give us A/file, the original name we started with. Logically, having an error message with B/file vs. A/file should be fine, as should leaving the file where it started. But the logic in both merge-recursive and merge-ort did not handle a case of a filename being renamed to itself correctly; merge-recursive had two bugs, and merge-ort had one. Add some testcases covering such a scenario. Based-on-testcase-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-30 14:40:09 -07:00
Junio C Hamano	169914ede2	Merge branch 'en/ort-perf-batch-11' Optimize out repeated rename detection in a sequence of mergy operations. * en/ort-perf-batch-11: merge-ort, diffcore-rename: employ cached renames when possible merge-ort: handle interactions of caching and rename/rename(1to1) cases merge-ort: add helper functions for using cached renames merge-ort: preserve cached renames for the appropriate side merge-ort: avoid accidental API mis-use merge-ort: add code to check for whether cached renames can be reused merge-ort: populate caches of rename detection results merge-ort: add data structures for in-memory caching of rename detection t6429: testcases for remembering renames fast-rebase: write conflict state to working tree, index, and HEAD fast-rebase: change assert() to BUG() Documentation/technical: describe remembering renames optimization t6423: rename file within directory that other side renamed	2021-06-14 13:33:27 +09:00
Elijah Newren	356da0f98b	Fix various issues found in comments A random hodge-podge of incorrect or out-of-date comments that I found: * t6423 had a comment that has referred to the wrong test for years; fix it to refer to the right one. * diffcore-rename had a FIXME comment meant to remind myself to investigate if I could make another code change. I later investigated and removed the FIXME, but while cherry-picking the patch to submit upstream I missed the later update. Remove the comment now. * merge-ort had the early part of a comment for a function; I had meant to include the more involved description when I updated the function. Update the comment now. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-09 11:40:04 +09:00
Elijah Newren	15f3e1e056	t6423: rename file within directory that other side renamed Add a new testcase where one side of history renames: olddir/ -> newdir/ and the other side of history renames: olddir/a -> olddir/alpha When using merge.directoryRenames=true, it seems logical to expect the file to end up at newdir/alpha. Unfortunately, both merge-recursive and merge-ort currently see this as a rename/rename conflict: olddir/a -> newdir/a vs. olddir/a -> newdir/alpha Suggesting that there's some extra logic we probably want to add somewhere to allow this case to run without triggering a conflict. For now simply document this known issue. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 12:53:48 +09:00
Junio C Hamano	7bec8e7fa6	Merge branch 'en/ort-readiness' Plug the ort merge backend throughout the rest of the system, and start testing it as a replacement for the recursive backend. * en/ort-readiness: Add testing with merge-ort merge strategy t6423: mark remaining expected failure under merge-ort as such Revert "merge-ort: ignore the directory rename split conflict for now" merge-recursive: add a bunch of FIXME comments documenting known bugs merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict t: mark several submodule merging tests as fixed under merge-ort merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries t6428: new test for SKIP_WORKTREE handling and conflicts merge-ort: support subtree shifting merge-ort: let renormalization change modify/delete into clean delete merge-ort: have ll_merge() use a special attr_index for renormalization merge-ort: add a special minimal index just for renormalization merge-ort: use STABLE_QSORT instead of QSORT where required	2021-04-16 13:53:34 -07:00
Junio C Hamano	1b31224e59	Merge branch 'en/ort-perf-batch-9' The ort merge backend has been optimized by skipping irrelevant renames. * en/ort-perf-batch-9: diffcore-rename: avoid doing basename comparisons for irrelevant sources merge-ort: skip rename detection entirely if possible merge-ort: use relevant_sources to filter possible rename sources merge-ort: precompute whether directory rename detection is needed merge-ort: introduce wrappers for alternate tree traversal merge-ort: add data structures for an alternate tree traversal merge-ort: precompute subset of sources for which we need rename detection diffcore-rename: enable filtering possible rename sources	2021-04-08 13:23:26 -07:00
Elijah Newren	259490e572	t6423: mark remaining expected failure under merge-ort as such When we started on merge-ort, thousands of tests failed when run with the GIT_TEST_MERGE_ALGORITHM=ort flag; with so many, it didn't make sense to flip all their test expectations. The ones in t6409, t6418, and the submodule tests are being handled by an independent in-flight topic ("Complete merge-ort implemenation...almost"). The ones in t6423 were left out of the other series because other ongoing series that this commit depends upon were addressing those. Now that we only have one remaining test failure in t6423, let's mark it as such. This remaining test will be fixed by a future optimization series, but since merge-recursive doesn't pass this test either, passing it is not necessary for declaring merge-ort ready for general use. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Elijah Newren	32a56dfb99	merge-ort: precompute subset of sources for which we need rename detection rename detection works by trying to pair all file deletions (or "sources") with all file additions (or "destinations"), checking similarity, and then marking the sufficiently similar ones as renames. This can be expensive if there are many sources and destinations on a given side of history as it results in an N x M comparison matrix. However, there are many cases where we can compute in advance that detecting renames for some of the sources provides no useful information and thus that we can exclude those sources from the matrix. To see why, first note that the merge machinery uses detected renames in two ways: * directory rename detection: when one side of history renames a directory, and the other side of history adds new files to that directory, we want to be able to warn the user about the need to chose whether those new files stay in the old directory or move to the new one. * three-way content merging: in order to do three-way content merging of files, we need three different file versions. If one side of history renamed a file, then some of the content for the file is found under a different path than in the merge base or on the other side of history. Add a simple testcase showing the two kinds of reasons renames are relevant; it's a testcase that will only pass if we detect both kinds of needed renames. Other than the testcase added above, this commit concentrates just on the three-way content merging; it will punt and mark all sources as needed for directory rename detection, and leave it to future commits to narrow that down more. The point of three-way content merging is to reconcile changes made on both sides of history. What if the file wasn't modified on both sides? There are two possibilities: * If it wasn't modified on the renamed side: -> then we get to do exact rename detection, which is cheap. * If it wasn't modified on the unrenamed side: -> then detection of a rename for that source file is irrelevant That latter claim might be surprising at first, so let's walk through a case to show why rename detection for that source file is irrelevant. Let's use two filenames, old.c & new.c, with the following abbreviated object ids (and where the value '000000' is used to denote that the file is missing in that commit): old.c new.c MERGE_BASE: 01d01d 000000 MERGE_SIDE1: 01d01d 000000 MERGE_SIDE2: 000000 5e1ec7 If the rename isn't detected: then old.c looks like it was unmodified on one side and deleted on the other and should thus be removed. new.c looks like a new file we should keep as-is. If the rename is detected: then a three-way content merge is done. Since the version of the file in MERGE_BASE and MERGE_SIDE1 are identical, the three-way merge will produce exactly the version of the file whose abbreviated object id is 5e1ec7. It will record that file at the path new.c, while removing old.c from the directory. Note that these two results are identical -- a single file named 'new.c' with object id 5e1ec7. In other words, it doesn't matter if the rename is detected in the case where the file is unmodified on the unrenamed side. Use this information to compute whether we need rename detection for each source created in add_pair(). It's probably worth noting that there used to be a few other edge or corner cases besides three-way content merges and directory rename detection where lack of rename detection could have affected the result, but those cases actually highlighted where conflict resolution methods were not consistent with each other. Fixing those inconsistencies were thus critically important to enabling this optimization. That work involved the following: * bringing consistency to add/add, rename/add, and rename/rename conflict types, as done back in the topic merged at commit `ac193e0e0a` ("Merge branch 'en/merge-path-collision'", 2019-01-04), and further extended in commits `2a7c16c980` ("t6422, t6426: be more flexible for add/add conflicts involving renames", 2020-08-10) and `e8eb99d4a6` ("t642[23]: be more flexible for add/add conflicts involving pair renames", 2020-08-10) * making rename/delete more consistent with modify/delete as done in commits `1f3c9ba707` ("t6425: be more flexible with rename/delete conflict messages", 2020-08-10) and `727c75b23f` ("t6404, t6423: expect improved rename/delete handling in ort backend", 2020-10-26) Since the set of relevant_sources we compute has not yet been narrowed down for directory rename detection, we do not pass it to diffcore_rename_extended() yet. That will be done after subsequent commits narrow down the list of relevant_sources needed for directory rename detection reasons. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-10 22:18:04 -08:00
Ævar Arnfjörð Bjarmason	a926c4b904	tests: remove most uses of C_LOCALE_OUTPUT As a follow-up to `d162b25f95` (tests: remove support for GIT_TEST_GETTEXT_POISON, 2021-01-20) remove those uses of the now always true C_LOCALE_OUTPUT prerequisite from those tests which declare it as an argument to test_expect_{success,failure}. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:48:26 -08:00
Elijah Newren	848a856b13	t6423: add more details about direct resolution of directories Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 12:31:24 -07:00
Elijah Newren	fd15863ec8	t6423: note improved ort handling with untracked files Similar to the previous commit, since the "recursive" backend relies on unpack_trees() to check if unstaged or untracked files would be overwritten by a merge, and unpack_trees() does not understand renames -- it has false positives and false negatives. Once it has run, since it updates as it goes, merge-recursive then has to handle completing the merge as best it can despite extra changes in the working copy. However, this is not just an issue for dirty files, but also for untracked files because directory renames can cause file contents to need to be written to a location that was not tracked on either side of history. Since the "ort" backend does the complete merge inmemory, and only updates the index and working copy as a post-processing step, if there are untracked files in the way it can simply abort the merge much like checkout does. Update t6423 to reflect the better merge abilities and expectations for ort, while still leaving the best-case-as-good-as-recursive-can-do expectations there for the recursive backend so we retain its stability until we are ready to deprecate and remove it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 12:31:24 -07:00
Elijah Newren	23bef2e33c	t6423, t6436: note improved ort handling with dirty files The "recursive" backend relies on unpack_trees() to check if unstaged changes would be overwritten by a merge, but unpack_trees() does not understand renames -- and once it returns, it has already written many updates to the working tree and index. As such, "recursive" had to do a special 4-way merge where it would need to also treat the working copy as an extra source of differences that we had to carefully avoid overwriting and resulting in moving files to new locations to avoid conflicts. The "ort" backend, by contrast, does the complete merge inmemory, and only updates the index and working copy as a post-processing step. If there are dirty files in the way, it can simply abort the merge. Update t6423 and t6436 to reflect the better merge abilities and expectations we have for ort, while still leaving the best-case-as-good-as-recursive-can-do expectations there for the recursive backend so we retain its stability until we are ready to deprecate and remove it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 12:31:24 -07:00
Elijah Newren	c12d1f2ac2	t6423: expect improved conflict markers labels in the ort backend Conflict markers carry an extra annotation of the form REF-OR-COMMIT:FILENAME to help distinguish where the content is coming from, with the :FILENAME piece being left off if it is the same for both sides of history (thus only renames with content conflicts carry that part of the annotation). However, there were cases where the :FILENAME annotation was accidentally left off, due to merge-recursive's every-codepath-needs-a-copy-of-all-special-case-code format. Update a few tests to have the correct :FILENAME extension on relevant paths with the ort backend, while leaving the expectation for merge-recursive the same to avoid destabilizing it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 12:31:24 -07:00
Elijah Newren	727c75b23f	t6404, t6423: expect improved rename/delete handling in ort backend When a file is renamed and has content conflicts, merge-recursive does not have some stages for the old filename and some stages for the new filename in the index; instead it copies all the stages corresponding to the old filename over to the corresponding locations for the new filename, so that there are three higher order stages all corresponding to the new filename. Doing things this way makes it easier for the user to access the different versions and to resolve the conflict (no need to manually 'git rm' the old version as well as 'git add' the new one). rename/deletes should be handled similarly -- there should be two stages for the renamed file rather than just one. We do not want to destabilize merge-recursive right now, so instead update relevant tests to have different expectations depending on whether the "recursive" or "ort" merge strategies are in use. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 12:31:24 -07:00
Elijah Newren	ef52778708	merge tests: expect improved directory/file conflict handling in ort merge-recursive.c is built on the idea of running unpack_trees() and then "doing minor touch-ups" to get the result. Unfortunately, unpack_trees() was run in an update-as-it-goes mode, leading merge-recursive.c to follow suit and end up with an immediate evaluation and fix-it-up-as-you-go design. Some things like directory/file conflicts are not well representable in the index data structure, and required special extra code to handle. But then when it was discovered that rename/delete conflicts could also be involved in directory/file conflicts, the special directory/file conflict handling code had to be copied to the rename/delete codepath. ...and then it had to be copied for modify/delete, and for rename/rename(1to2) conflicts, ...and yet it still missed some. Further, when it was discovered that there were also file/submodule conflicts and submodule/directory conflicts, we needed to copy the special submodule handling code to all the special cases throughout the codebase. And then it was discovered that our handling of directory/file conflicts was suboptimal because it would create untracked files to store the contents of the conflicting file, which would not be cleaned up if someone were to run a 'git merge --abort' or 'git rebase --abort'. It was also difficult or scary to try to add or remove the index entries corresponding to these files given the directory/file conflict in the index. But changing merge-recursive.c to handle these correctly was a royal pain because there were so many sites in the code with similar but not identical code for handling directory/file/submodule conflicts that would all need to be updated. I have worked hard to push all directory/file/submodule conflict handling in merge-ort through a single codepath, and avoid creating untracked files for storing tracked content (it does record things at alternate paths, but makes sure they have higher-order stages in the index). Since updating merge-recursive is too much work and we don't want to destabilize it, instead update the testsuite to have different expectations for relevant directory/file/submodule conflict tests. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 12:31:23 -07:00
Elijah Newren	f06481f127	t/: new helper for tests that pass with ort but fail with recursive There are a number of tests that the "recursive" backend does not handle correctly but which the redesign in "ort" will. Add a new helper in lib-merge.sh for selecting a different test expectation based on the setting of GIT_TEST_MERGE_ALGORITHM, and use it in various testcases to document which ones we expect to fail under recursive but pass under ort. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 12:31:23 -07:00
Elijah Newren	c64432aacd	t6423: more involved rules for renaming directories into each other Testcases 12b and 12c were both slightly weird; they were marked as having a weird resolution, but with the note that even straightforward simple rules can give weird results when the input is bizarre. However, during optimization work for merge-ort, I discovered a significant speedup that is possible if we add one more fairly straightforward rule: we don't bother doing directory rename detection if there are no new files added to the directory on the other side of the history to be affected by the directory rename. This seems like an obvious and straightforward rule, but there was one funny corner case where directory rename detection could affect only existing files: the funny corner case where two directories are renamed into each other on opposite sides of history. In other words, it only results in a different output for testcases 12b and 12c. Since we already thought testcases 12b and 12c were weird anyway, and because the optimization often has a significant effect on common cases (but is entirely prevented if we can't change how 12b and 12c function), let's add the additional rule and tweak how 12b and 12c work. Split both testcases into two (one where we add no new files, and one where the side that doesn't rename a given directory will add files to it), and mark them with the new expectation. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:29:28 -07:00
Elijah Newren	8536821d05	t6423: update directory rename detection tests with new rule While investigating the issues highlighted by the testcase in the previous patch, I also found a shortcoming in the directory rename detection rules. Split testcase 6b into two to explain this issue and update directory-rename-detection.txt to remove one of the previous rules that I know believe to be detrimental. Also, update the wording around testcase 8e; while we are not modifying the results of that testcase, we were previously unsure of the appropriate resolution of that test and the new rule makes the previously chosen resolution for that testcase a bit more solid. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:29:28 -07:00
Elijah Newren	902c521a35	t6423: more involved directory rename test Add a new testcase modelled on a real world repository example that served multiple purposes: * it uncovered a bug in the current directory rename detection implementation. * it is a good test of needing to do directory rename detection for a series of commits instead of just one (and uses rebase instead of just merge like all the other tests in this testfile). * it is an excellent stress test for some of the optimizations in my new merge-ort engine I can expand on the final item later when I have submitted more of merge-ort, but the bug is the main immediate concern. It arises as follows: * dir/subdir/ has several files * almost all files in dir/subdir/ are renamed to folder/subdir/ * one of the files in dir/subdir/ is renamed to folder/subdir/newsubdir/ * If the other side of history (that doesn't do the renames) adds a new file to dir/subdir/, where should it be placed after the merge? The most obvious two choices are: (1) leave the new file in dir/subdir/, don't make it follow the rename, and (2) move the new file to folder/subdir/, following the rename of most the files. However, there's a possible third choice here: (3) move the new file to folder/subdir/newsubdir/. The choice reinforce the fact that merge.directoryRenames=conflict is a good default, but when the merge machinery needs to stick it somewhere and notify the user of the possibility that they might want to place it elsewhere. Surprisingly, the current code would always choose (3), while the real world repository was clearly expecting (2) -- move the file along with where the herd of files was going, not with the special exception. The problem here is that for the majority of the file renames, dir/subdir/ -> folder/subdir/ is actually represented as dir/ -> folder/ This directory rename would have a big weight associated with it since most the files followed that rename. However, we always consult the most immediate directory first, and there is only one rename rule for it: dir/subdir/ -> folder/subdir/newsubdir/ Since this rule is the only one for mapping from dir/subdir/, it automatically wins and that directory rename was followed instead of the desired dir/subdir/ -> folder/subdir/. Unfortunately, the fix is a bit involved so for now just add the testcase documenting the issue. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-16 12:29:27 -07:00
Elijah Newren	e8eb99d4a6	t642[23]: be more flexible for add/add conflicts involving pair renames Much like the last commit accepted 'add/add' and 'rename/add' interchangably, we also want to do the same for 'add/add' and 'rename/rename'. This also allows us to avoid the ambiguity in meaning with 'rename/rename' (is it two separate files renamed to the same location, or one file renamed on both sides but differently)? Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-10 15:59:01 -07:00
Elijah Newren	1cb588775f	t6423: add an explanation about why one of the tests does not pass I had long since forgotten the idea behind this test and why it failed, and took a little while to figure it out. To prevent others from having to spend a similar time on it, add an explanation in the comments. However, the reasoning in the explanation makes me question why I considered it a failure at all. I'm not sure if I had a better reason when I originally wrote it, but for now just add commentary about the possible expectations and why it behaves the way it does right now. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-10 15:59:01 -07:00
Elijah Newren	6c74948f20	t6416, t6423: clarify some comments and fix some typos Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-10 15:59:01 -07:00
Elijah Newren	a0601b2eb3	t6423: fix test setup for a couple tests Commit `da1e295e00` ("t604[236]: do not run setup in separate tests", 2019-10-22) removed approximately half the tests (which were setup-only tests) in t6043 by turning them into functions that the subsequent test would call as their first step. This ensured that any test from this file could be run entirely independently of all the other tests in the file. Unfortunately, the call to the new setup function was missed in two of the test_expect_failure cases. Add them in. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-10 15:59:00 -07:00
Elijah Newren	919df31955	Collect merge-related tests to t64xx The tests for the merge machinery are spread over several places. Collect them into t64xx for simplicity. Some notes: t60[234].sh: Merge tests started in t602, overgrew bisect and remote tracking tests in t6030, t6040, and t6041, and nearly overtook replace tests in t6050. This made picking out relevant tests that I wanted to run in a tighter loop slightly more annoying for years. t303.sh: These started out as tests for the 'merge-recursive' toplevel command, but did not restrict to that and had lots of overlap with the underlying merge machinery. t7405, t7613: submodule-specific merge logic started out in submodule.c but was moved to merge-recursive.c in commit `18cfc08866` ("submodule.c: move submodule merging to merge-recursive.c", 2018-05-15). Since these tests are about the logic found in the merge machinery, moving these tests to be with the merge tests makes sense. t7607, t7609: Having tests spread all over the place makes it more likely that additional tests related to a certain piece of logic grow in all those other places. Much like t303.sh, these two tests were about the underlying merge machinery rather than outer levels. Tests that were NOT moved: t76[01].sh: Other than the four tests mentioned above, the remaining tests in t76[01].sh are related to non-recursive merge strategies, parameter parsing, and other stuff associated with the highlevel builtin/merge.c rather than the recursive merge machinery. t3[45].sh: The rebase testcases in t34.sh also test the merge logic pretty heavily; sometimes changes I make only trigger failures in the rebase tests. The rebase tests are already nicely coupled together, though, and I didn't want to mess that up. Similar comments apply for the cherry-pick tests in t35*.sh. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-10 15:59:00 -07:00

34 Commits