Forbids rename detection logic from matching two empty files as renames
during merge-recursive to prevent mismerges.
By Jeff King
* jk/diff-no-rename-empty:
merge-recursive: don't detect renames of empty files
teach diffcore-rename to optionally ignore empty content
make is_empty_blob_sha1 available everywhere
drop casts from users EMPTY_TREE_SHA1_BIN
Resurrects the preparatory clean-up patches from another topic that was
discarded, as this would give a saner foundation to build on diff.algo
configuration option series.
* jc/diff-algo-cleanup:
xdiff: PATIENCE/HISTOGRAM are not independent option bits
xdiff: remove XDL_PATCH_* macros
Merge-recursive detects renames so that if one side modifies
"foo" and the other side moves it to "bar", the modification
is applied to "bar". However, our rename detection is based
on content analysis, it can be wrong (i.e., two files were
not intended as a rename, but just happen to have the same
or similar content).
This is quite rare if the files actually contain content,
since two unrelated files are unlikely to have exactly the
same content. However, empty files present a problem, in
that there is nothing to analyze. An uninteresting
placeholder file with zero bytes may or may not be related
to a placeholder file with another name.
The result is that adding content to an empty file may cause
confusion if the other side of a merge removed it; your
content may end up in another random placeholder file that
was added.
Let's err on the side of caution and not consider empty
files as renames. This will cause a modify/delete conflict
on the merge, which will let the user sort it out
themselves.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This macro already evaluates to the correct type, as it
casts the string literal to "unsigned char *" itself
(and callers who want the literal can use the _LITERAL
form).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because the default Myers, patience and histogram algorithms cannot be in
effect at the same time, XDL_PATIENCE_DIFF and XDL_HISTOGRAM_DIFF are not
independent bits. Instead of wasting one bit per algorithm, define a few
macros to access the few bits they occupy and update the code that access
them.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tr/cache-tree:
reset: update cache-tree data when appropriate
commit: write cache-tree data when writing index anyway
Refactor cache_tree_update idiom from commit
Test the current state of the cache-tree optimization
Add test-scrap-cache-tree
We'll need to safely create or update the cache-tree data of the_index
from other places. While at it, give it an argument that lets us
silence the messages produced by unmerged entries (which prevent it
from working).
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The merge-recursive code uses the commit->util field directly to annotate
the commit objects given from the command line, i.e. the remote heads to
be merged, with a single string to be used to describe it in its trace
messages and conflict markers.
Correct this short-signtedness by redefining the field to be a pointer to
a structure "struct merge_remote_desc" that later enhancements can add
more information. Store the original objects we were told to merge in a
field "obj" in this struct, so that we can recover the tag we were told to
merge.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The submodule merge search is not useful during virtual merges because
the results cannot be used automatically. Furthermore any suggestions
made by the search may apply to commits different than HEAD:sub and
MERGE_HEAD:sub, thus confusing the user. Skip searching for submodule
merges during a virtual merge such as that between B and C while merging
the heads of:
B---BC
/ \ /
A X
\ / \
C---CB
Run the search only when the recursion level is zero (!o->call_depth).
This fixes known breakage tested in t7405-submodule-merge.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix another instance of a recursive merge incorrectly paying attention to
the working tree file during a virtual ancestor merge, that resulted in
spurious and useless "addinfo_cache failed" error message.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git term is 'working tree', so replace the most public references
to 'working copy'.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* en/merge-recursive-2: (57 commits)
merge-recursive: Don't re-sort a list whose order we depend upon
merge-recursive: Fix virtual merge base for rename/rename(1to2)/add-dest
t6036: criss-cross + rename/rename(1to2)/add-dest + simple modify
merge-recursive: Avoid unnecessary file rewrites
t6022: Additional tests checking for unnecessary updates of files
merge-recursive: Fix spurious 'refusing to lose untracked file...' messages
t6022: Add testcase for spurious "refusing to lose untracked" messages
t3030: fix accidental success in symlink rename
merge-recursive: Fix working copy handling for rename/rename/add/add
merge-recursive: add handling for rename/rename/add-dest/add-dest
merge-recursive: Have conflict_rename_delete reuse modify/delete code
merge-recursive: Make modify/delete handling code reusable
merge-recursive: Consider modifications in rename/rename(2to1) conflicts
merge-recursive: Create function for merging with branchname:file markers
merge-recursive: Record more data needed for merging with dual renames
merge-recursive: Defer rename/rename(2to1) handling until process_entry
merge-recursive: Small cleanups for conflict_rename_rename_1to2
merge-recursive: Fix rename/rename(1to2) resolution for virtual merge base
merge-recursive: Introduce a merge_file convenience function
merge-recursive: Fix modify/delete resolution in the recursive case
...
When this code was first written (v1.4.3-rc1~174^2~4, merge-recur: if
there is no common ancestor, fake empty one, 2006-08-09), everyone
needing a fake empty tree had to make her own, but ever since
v1.5.5-rc0~180^2~1 (2008-02-13), the object lookup machinery provides
a ready-made one. Use it.
This is just a simplification, though it also fixes a small leak
(since the tree in the virtual common ancestor commit is never freed).
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In record_df_conflict_files() we would resort the entries list using
df_name_compare to get a convenient ordering. Unfortunately, this broke
assumptions of the get_renames() code (via string_list_lookup() calls)
which needed the list to be in the standard ordering. When those lookups
would fail, duplicate stage_data entries could be inserted, causing the
process_renames and process_entry code to fail (in particular, a path that
that process_renames had marked as processed would still be processed
anyway in process_entry due to the duplicate entry).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier in this series, the patch "merge-recursive: add handling for
rename/rename/add-dest/add-dest" added code to handle the rename on each
side of history also being involved in a rename/add conflict, but only
did so in the non-recursive case. Add code for the recursive case,
ensuring that the "added" files are not simply deleted.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Often times, a potential conflict at a path is resolved by merge-recursive
by using the content that was already present at that location. In such
cases, we do not want to overwrite the content that is already present, as
that could trigger unnecessary recompilations. One of the patches earlier
in this series ("merge-recursive: When we detect we can skip an update,
actually skip it") fixed the cases that involved content merges, but there
were a few other cases as well.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Calling update_stages() before update_file() can sometimes result in git
thinking the file being updated is untracked (whenever update_stages
moves it to stage 3). Reverse the call order, and add a big comment to
update_stages to hopefully prevent others from making the same mistake.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If either side of a rename/rename(1to2) conflict is itself also involved
in a rename/add-dest conflict, then we need to make sure both the rename
and the added file appear in the working copy.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each side of the rename in rename/rename(1to2) could potentially also be
involved in a rename/add conflict. Ensure stages for such conflicts are
also recorded.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
modify/delete and rename/delete share a lot of similarities; we'd like all
the criss-cross and D/F conflict handling specializations to be shared
between the two.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our previous conflict resolution for renaming two different files to the
same name ignored the fact that each of those files may have modifications
from both sides of history to consider. We need to do a three-way merge
for each of those files, and then handle the conflict of both sets of
merged contents trying to be recorded with the same name.
It is important to note that this changes our strategy in the recursive
case. After doing a three-way content merge of each of the files
involved, we still are faced with the fact that we are trying to put both
of the results (including conflict markers) into the same path. We could
do another two-way merge, but I think that becomes confusing. Also,
taking a hint from the modify/delete and rename/delete cases we handled
earlier, a more useful "common ground" would be to keep the three-way
content merge but record it with the original filename. The renames can
still be detected, we just allow it to be done in the o->call_depth=0
case. This seems to result in simpler & easier to understand merge
conflicts as well, as evidenced by some of the changes needed in our
testsuite in t6036. (However, it should be noted that this change will
cause problems those renames also occur along with a file being added
whose name matches the source of the rename. Since git currently cannot
detect rename/add-source situations, though, this codepath is not
currently used for those cases anyway.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We want to be able to reuse the code to do a three-way file content merge
and have the conflict markers use both branchname and filename. Split it
out into a separate function.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When two different files are renamed to one, we need to be able to do
three-way merges for both of those files. To do that, we need to record
the sha1sum of the (possibly modified) file on the unrenamed side. Modify
setup_rename_conflict_info() to take this extra information and record it
when the rename_type is RENAME_TWO_FILES_TO_ONE.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This puts the code for the different types of double rename conflicts
closer together (fewer lines of other code separating the two paths) and
increases similarity between how they are handled.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When renaming one file to two files, we really should be doing a content
merge. Also, in the recursive case, undoing the renames and recording the
merged file in the index with the source of the rename (while deleting
both destinations) allows the renames to be re-detected in the
non-recursive merge and will result in fewer spurious conflicts.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge_file previously required diff_filespec arguments, but all callers
only had sha1s and modes. Rename merge_file to merge_file_1 and introduce
a new merge_file convenience function which takes the sha1s and modes and
creates the temporary diff_filespec variables needed to call merge_file_1.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When o->call_depth>0 and we have conflicts, we try to find "middle ground"
when creating the virtual merge base. In the case of content conflicts,
this can be done by doing a three-way content merge and using the result.
In all parts where the three-way content merge is clean, it is the correct
middle ground, and in parts where it conflicts there is no middle ground
but the conflict markers provide a good compromise since they are unlikely
to accidentally match any further changes.
In the case of a modify/delete conflict, we cannot do the same thing.
Accepting either endpoint as the resolution for the virtual merge base
runs the risk that when handling the non-recursive case we will silently
accept one person's resolution over another without flagging a conflict.
In this case, the closest "middle ground" we have is actually the merge
base of the candidate merge bases. (We could alternatively attempt a
three way content merge using an empty file in place of the deleted file,
but that seems to be more work than necessary.)
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 882fd11 (merge-recursive: Delay content merging for renames 2010-09-20),
there was code that checked for whether we could skip updating a file in
the working directory, based on whether the merged version matched the
current working copy. Due to the desire to handle directory/file conflicts
that were resolvable, that commit deferred content merging by first
updating the index with the unmerged entries and then moving the actual
merging (along with the skip-the-content-update check) to another function
that ran later in the merge process. As part moving the content merging
code, a bug was introduced such that although the message about skipping
the update would be printed (whenever GIT_MERGE_VERBOSITY was sufficiently
high), the file would be unconditionally updated in the working copy
anyway.
When we detect that the file does not need to be updated in the working
copy, update the index appropriately and then return early before updating
the working copy.
Note that there was a similar change in b2c8c0a (merge-recursive: When we
detect we can skip an update, actually skip it 2011-02-28), but it was
reverted by 6db4105 (Revert "Merge branch 'en/merge-recursive'"
2011-05-19) since it did not fix both of the relevant types of unnecessary
update breakages and, worse, it made use of some band-aids that caused
other problems. The reason this change works is due to the changes earlier
in this series to (a) record_df_conflict_files instead of just unlinking
them early, (b) allowing make_room_for_path() to remove D/F entries,
(c) the splitting of update_stages_and_entry() to have its functionality
called at different points, and (d) making the pathnames of the files
involved in the merge available to merge_content().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever there are merge conflicts in file contents, we would mark the
different sides of the conflict with the two branches being merged.
However, when there is a rename involved as well, the branchname is not
sufficient to specify where the conflicting content came from. In such
cases, mark the two sides of the conflict with branchname:filename rather
than just branchname.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The consolidation of process_entry() and process_df_entry() allows us to
consolidate more code paths concerning rename conflicts, and to do
a few additional related cleanups. It also means we are using
rename_df_conflict_info in some cases where there is no D/F conflict;
rename it to rename_conflict_info.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The whole point of adding process_df_entry() was to ensure that files of
D/F conflicts were processed after paths under the corresponding
directory. However, given that the entries are in sorted order, all we
need to do is iterate through them in reverse order to achieve the same
effect. That lets us remove some duplicated code, and lets us keep
track of one less thing as we read the code ("do we need to make sure
this is processed before process_df_entry() or do we need to defer it
until then?").
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When dealing with file merging and renames and D/F conflicts and possible
criss-cross merges (how's that for a corner case?), we did not do a
thorough job ensuring the index and working directory had the correct
contents. Fix the logic in merge_content() to handle this. Also,
correct some erroneous tests in t6022 that were expecting the wrong number
of unmerged index entries. These changes fix one of the tests in t6042
(and almost fix another one from t6042 as well).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a couple of places where changes are needed to for situations
involving rename/add-source issues. Add comments about the needed changes
(and existing bugs) until git has been enabled to detect such cases.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code for rename_rename_2to1 conflicts (two files both being renamed to
the same filename) was dead since the rename/add path was always being
independently triggered for each of the renames instead. Further,
reviving the dead code showed that it was inherently buggy and would
always segfault -- among a few other bugs.
Move the else-if branch for the rename/rename block before the rename/add
block to make sure it is checked first, and fix up the rename/rename(2to1)
code segments to make it handle most cases. Work is still needed to
handle higher dimensional corner cases such as rename/rename/modify/modify
issues.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the recursive case (o->call_depth > 0), we do not modify the working
directory. However, when o->call_depth==0, file renames can mean we need
to delete the old filename from the working copy. Since there have been
lots of changes and mistakes here, let's go through the details. Let's
start with a simple explanation of what we are trying to achieve:
Original goal: If a file is renamed on the side of history being merged
into head, the filename serving as the source of that rename needs to be
removed from the working directory.
The path to getting the above statement implemented in merge-recursive took
several steps. The relevant bits of code may be instructive to keep in
mind for the explanation, especially since an English-only description
involves double negatives that are hard to follow. These bits of code are:
int remove_file(..., const char *path, int no_wd)
{
...
int update_working_directory = !o->call_depth && !no_wd;
and
remove_file(o, 1, ren1_src, <expression>);
Where the choice for <expression> has morphed over time:
65ac6e9 (merge-recursive: adjust to loosened "working file clobbered"
check 2006-10-27), introduced the "no_wd" parameter to remove_file() and
used "1" for <expression>. This meant ren1_src was never deleted, leaving
it around in the working copy.
In 8371234 (Remove uncontested renamed files during merge. 2006-12-13),
<expression> was changed to "index_only" (where index_only ==
!!o->call_depth; see b7fa51da). This was equivalent to using "0" for
<expression> (due to the early logic in remove_file), and is orthogonal to
the condition we actually want to check at this point; it resulted in the
source file being removed except when index_only was false. This was
problematic because the file could have been renamed on the side of history
including head, in which case ren1_src could correspond to an untracked
file that should not be deleted.
In 183d797 (Keep untracked files not involved in a merge. 2007-02-04),
<expression> was changed to "index_only || stage == 3". While this gives
correct behavior, the "index_only ||" portion of <expression> is
unnecessary and makes the code slightly harder to follow.
There were also two further changes to this expression, though without
any change in behavior. First in b7fa51d (merge-recursive: get rid of the
index_only global variable 2008-09-02), it was changed to "o->call_depth
|| stage == 3". (index_only == !!o->call_depth). Later, in 41d70bd6
(merge-recursive: Small code clarification -- variable name and comments),
this was changed to "o->call_depth || renamed_stage == 2" (where stage was
renamed to other_stage and renamed_stage == other_stage ^ 1).
So we ended with <expression> being "o->call_depth || renamed_stage == 2".
But the "o->call_depth ||" piece was unnecessary. We can remove it,
leaving us with <expression> being "renamed_stage == 2". This doesn't
change behavior at all, but it makes the code clearer. Which is good,
because it's about to get uglier.
Corrected goal: If a file is renamed on the side of history being merged
into head, the filename serving as the source of that rename needs to be
removed from the working directory *IF* that file is tracked in head AND
the file tracked in head is related to the original file.
Note that the only difference between the original goal and the corrected
goal is the two extra conditions added at the end. The first condition is
relevant in a rename/delete conflict. If the file was deleted on the
HEAD side of the merge and an untracked file of the same name was added to
the working copy, then without that extra condition the untracked file
will be erroneously deleted. This changes <expression> to "renamed_stage
== 2 || !was_tracked(ren1_src)".
The second additional condition is relevant in two cases.
The first case the second condition can occur is when a file is deleted
and a completely different file is added with the same name. To my
knowledge, merge-recursive has no mechanism for detecting deleted-and-
replaced-by-different-file cases, so I am simply punting on this
possibility.
The second case for the second condition to occur is when there is a
rename/rename/add-source conflict. That is, when the original file was
renamed on both sides of history AND the original filename is being
re-used by some unrelated (but tracked) content. This case also presents
some additional difficulties for us since we cannot currently detect these
rename/rename/add-source conflicts; as long as the rename detection logic
"optimizes" by ignoring filenames that are present at both ends of the
diff, these conflicts will go unnoticed. However, rename/rename conflicts
are handled by an entirely separate codepath not being discussed here, so
this case is not relevant for the line of code under consideration.
In summary:
Change <expression> from "o->call_depth || renamed_stage == 2" to
"renamed_stage == 2 || !was_tracked(ren1_src)", in order to remove
unnecessary code and avoid deleting untracked files.
96 lines of explanation in the changelog to describe a one-line fix...
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of having the process_renames logic update the stages in the index
for the rename destination, have the index updated after process_entry or
process_df_entry. This will also allow us to have process_entry determine
whether a file was tracked and existed in the working copy before the
merge started.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If there were several files conflicting below a directory corresponding
to a D/F conflict, and the file of that D/F conflict is in the way, we
want it to be removed. Since files of D/F conflicts are handled last,
they can be reinstated later and possibly with a new unique name.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Checking whether a filename was part of stage 0 or stage 2 is code that we
would like to be able to call from a few other places without also
lstat()-ing the file to see if it exists in the working copy.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename make_room_for_directories_of_df_conflicts() to
record_df_conflict_files() to reflect the change in functionality.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, we were using lstat() to determine if a directory was still
present after a merge (and thus in the way of adding a file). We should
have been using lstat() only to determine if untracked directories were in
the way (and then only when necessary to check for untracked directories);
we should instead using the index to determine if there is a tracked
directory in the way. Create a new function to do this and use it to
replace the existing checks for directories being in the way.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We cannot assume that directory/file conflicts will appear in sorted
order; for example, 'letters.txt' comes between 'letters' and
'letters/file'.
Thanks to Johannes for a pointer about qsort stability issues with
Windows and suggested code change.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a D/F conflict is introduced via an add/add conflict, when
o->call_depth > 0 we need to ensure that the higher stage entry from the
base stage is removed.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
make_room_for_directories_of_df_conflicts() is about making sure necessary
working directory changes can succeed. When o->call_depth > 0 (i.e. the
recursive case), we do not want to make any working directory changes so
this function should be skipped.
Note that make_room_for_directories_of_df_conflicts() is broken as has
been pointed out by Junio; it should NOT be unlinking files. What it
should do is keep track of files that could be unlinked if a directory
later needs to be written in their place. However, that work also is only
relevant in the non-recursive case, so this change is helpful either way.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are only calling update_stages_options() one way really, so we can
consolidate the slightly different variants into one and remove some
parameters whose values are always the same.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hopefully no one ever hits this error except when making large changes to
merge-recursive.c and debugging...
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Port JGit's HistogramDiff algorithm over to C. Rough numbers (TODO) show
that it is faster than its --patience cousin, as well as the default
Meyers algorithm.
The implementation has been reworked to use structs and pointers,
instead of bitmasks, thus doing away with JGit's 2^28 line limit.
We also use xdiff's default hash table implementation (xdl_hash_bits()
with XDL_HASHLONG()) for convenience.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/rename-degrade-cc-to-c:
diffcore-rename: fall back to -C when -C -C busts the rename limit
diffcore-rename: record filepair for rename src
diffcore-rename: refactor "too many candidates" logic
builtin/diff.c: remove duplicated call to diff_result_code()
As the band-aid to merge-recursive seems to regress complex merges in an
unpleasant way. The merge-recursive implementation needs to be rewritten
in such a way that it resolves renames and D/F conflicts entirely in-core
and not to touch working tree at all while doing so. But in the meantime,
this reverts commit ac9666f84 that merged the topic in its entirety.
The config machinery already makes section and variable names
lowercase when parsing them, so using strcasecmp for comparison just
feels wasteful. No noticeable change intended.
Noticed-by: Jay Soffian <jaysoffian@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/struct-pathspec:
pathspec: rename per-item field has_wildcard to use_wildcard
Improve tree_entry_interesting() handling code
Convert read_tree{,_recursive} to support struct pathspec
Reimplement read_tree_recursive() using tree_entry_interesting()
* jc/rename-degrade-cc-to-c:
diffcore-rename: fall back to -C when -C -C busts the rename limit
diffcore-rename: record filepair for rename src
diffcore-rename: refactor "too many candidates" logic
builtin/diff.c: remove duplicated call to diff_result_code()
* en/merge-recursive:
merge-recursive: tweak magic band-aid
merge-recursive: When we detect we can skip an update, actually skip it
t6022: New test checking for unnecessary updates of files in D/F conflicts
t6022: New test checking for unnecessary updates of renamed+modified files
This patch changes behavior of the two functions. Previously it does
prefix matching only. Now it can also do wildcard matching.
All callers are updated. Some gain wildcard matching (archive,
checkout), others reset pathspec_item.has_wildcard to retain old
behavior (ls-files, ls-tree as they are plumbing).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When there are too many paths in the project, the number of rename source
candidates "git diff -C -C" finds will exceed the rename detection limit,
and no inexact rename detection is performed. We however could fall back
to "git diff -C" if the number of modified paths is sufficiently small.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running checks against working tree (e.g. lstat()) and causing
changes to working tree (e.g. unlink()) while building a virtual
ancestor merge does not make any sense. Avoid doing so.
This is not a real fix; it is another magic band-aid on top of
another band-aid we placed earlier.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
Prepare draft release notes to 1.7.4.2
gitweb: highlight: replace tabs with spaces
make_absolute_path: return the input path if it points to our buffer
valgrind: ignore SSE-based strlen invalid reads
diff --submodule: split into bite-sized pieces
cherry: split off function to print output lines
branch: split off function that writes tracking info and commit subject
standardize brace placement in struct definitions
compat: make gcc bswap an inline function
enums: omit trailing comma for portability
Conflicts:
RelNotes
In a struct definitions, unlike functions, the prevailing style is for
the opening brace to go on the same line as the struct name, like so:
struct foo {
int bar;
char *baz;
};
Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
matches as 'struct [a-z_]*$'.
Linus sayeth:
Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 882fd11 (merge-recursive: Delay content merging for renames 2010-09-20),
there was code that checked for whether we could skip updating a file in
the working directory, based on whether the merged version matched the
current working copy. Due to the desire to handle directory/file conflicts
that were resolvable, that commit deferred content merging by first
updating the index with the unmerged entries and then moving the actual
merging (along with the skip-the-content-update check) to another function
that ran later in the merge process. As part moving the content merging
code, a bug was introduced such that although the message about skipping
the update would be printed (whenever GIT_MERGE_VERBOSITY was sufficiently
high), the file would be unconditionally updated in the working copy
anyway.
When we detect that the file does not need to be updated in the working
copy, update the index appropriately and then return early before updating
the working copy.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a variable-args function, the code for writing into a strbuf is
non-trivial. We ended up cutting and pasting it in several places
because there was no vprintf-style function for strbufs (which in turn
was held up by a lack of va_copy).
Now that we have a fallback va_copy, we can add strbuf_vaddf, the
strbuf equivalent of vsprintf. And we can clean up the cut and paste
mess.
Signed-off-by: Jeff King <peff@peff.net>
Improved-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The user can enable or disable it explicitly with the new
--progress, but it defaults to checking isatty(2).
This works only with merge-recursive and subtree. In theory
we could pass a progress flag to other strategies, but none
of them support progress at this point, so let's wait until
they grow such a feature before worrying about propagating
it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We did this once before in 5070591 (bump rename limit
defaults, 2008-04-30). Back then, we were shooting for about
1 second for a diff/log calculation, and 5 seconds for a
merge.
There are a few new things to consider, though:
1. Average processors are faster now.
2. We've seen on the mailing list some ugly merges where
not using inexact rename detection leads to many more
conflicts. Merges of this size take a long time
anyway, so users are probably happy to spend a little
bit of time computing the renames.
Let's bump the diff/merge default limits from 200/500 to
400/1000. Those are 2 seconds and 10 seconds respectively on
my modern hardware.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The warning is generated deep in the diffcore code, which
means that it will come first, followed possibly by a spew
of conflicts, making it hard to see.
Instead, let's have diffcore pass back the information about
how big the rename limit would needed to have been, and then
the caller can provide a more appropriate message (and at a
more appropriate time).
No refactoring of other non-merge callers is necessary,
because nobody else was even using the warn_on_rename_limit
feature.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* en/merge-recursive: (41 commits)
t6022: Use -eq not = to test output of wc -l
merge-recursive:make_room_for_directories - work around dumb compilers
merge-recursive: Remove redundant path clearing for D/F conflicts
merge-recursive: Make room for directories in D/F conflicts
handle_delete_modify(): Check whether D/F conflicts are still present
merge_content(): Check whether D/F conflicts are still present
conflict_rename_rename_1to2(): Fix checks for presence of D/F conflicts
conflict_rename_delete(): Check whether D/F conflicts are still present
merge-recursive: Delay modify/delete conflicts if D/F conflict present
merge-recursive: Delay content merging for renames
merge-recursive: Delay handling of rename/delete conflicts
merge-recursive: Move handling of double rename of one file to other file
merge-recursive: Move handling of double rename of one file to two
merge-recursive: Avoid doubly merging rename/add conflict contents
merge-recursive: Update merge_content() call signature
merge-recursive: Update conflict_rename_rename_1to2() call signature
merge-recursive: Structure process_df_entry() to handle more cases
merge-recursive: Have process_entry() skip D/F or rename entries
merge-recursive: New function to assist resolving renames in-core only
merge-recursive: New data structures for deferring of D/F conflicts
...
Conflicts:
t/t6020-merge-df.sh
t/t6036-recursive-corner-cases.sh
Some vintage of gcc does not seem to notice last_len is only used when
last_file is already set to non-NULL at which point last_len is also
set.
Noticed on FreeBSD 8
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code had several places where individual checks were done to remove
files that could be in the way of directories in D/F conflicts. Not all
D/F conflicts could have a path cleared for them in such a manner, however,
leading to the need to create make_room_for_directories_of_df_conflicts()
as done in the previous patch. That new function could not have been
incorporated into the code sooner, since not all relevant code paths had
been deferred to process_df_entry() yet, leading to the creation of even
more of these now-redundant path removals.
Clean out all of these extra D/F path clearing cases.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When there are unmerged entries present, make sure to check for D/F
conflicts first and remove any files present in HEAD that would be in the
way of creating files below the correspondingly named directory. Such
files will be processed again at the end of the merge in
process_df_entry(); at that time we will be able to tell if we need to
and can reinstate the file, whether we need to place its contents in a
different file due to the directory still being present, etc.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If all the paths below some directory involved in a D/F conflict were not
removed during the rest of the merge, then the contents of the file whose
path conflicted needs to be recorded in file with an alternative filename.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If all the paths below some directory involved in a D/F conflict were not
removed during the rest of the merge, then the contents of the file whose
path conflicted needs to be recorded in file with an alternative filename.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is called from process_df_entry(), near the end of the merge.
Rather than just checking whether one of the sides of the merge had a
directory at the same path as one of our files, check whether that
directory is still present by this point of our merge.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If all the paths below some directory involved in a D/F conflict were not
removed during the rest of the merge, then the contents of the file whose
path conflicted needs to be recorded in file with an alternative filename.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When handling merges with modify/delete conflicts, if the modified path is
involved in a D/F conflict, handle the issue in process_df_entry() rather
than process_entry().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the handling of content merging for renames from process_renames() to
process_df_entry().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the handling of rename/delete conflicts from process_renames() to
process_df_entry().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the handling of rename/rename conflicts where one file is renamed on
both sides to the same file, from process_renames() to process_entry().
Here we avoid the three way merge logic by just using
update_stages_and_entry() to move the higher stage entries in the index
from the rename source to the rename destination, and then allow
process_entry() to do its magic.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the handling of rename/rename conflicts where one file is renamed to
two different files, from process_renames() to process_df_entry().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a commit moves A to B while another commit created B (or moved C to
B), and these two different commits serve as different merge-bases for a
later merge, c94736a (merge-recursive: don't segfault while handling
rename clashes 2009-07-30) added some special code to avoid segfaults.
Since that commit, the two versions of B are merged in place (which could
be potentially conflicting) and the intermediate result is used as the
virtual ancestor.
However, right before this special merge, try_merge was turned on, meaning
that process_renames() would try an alternative merge that ignores the
'add' part of the conflict, and, if the merge is clean, store that as the
new virtual ancestor. This could cause incorrect merging of criss-cross
merges; it would typically result in just recording a slightly confusing
merge base, but in some cases it could cause silent acceptance of one side
of a merge as the final resolution when a conflict should have been
flagged.
When we do a special merge for such a rename/add conflict between
merge-bases, turn try_merge off to avoid an inappropriate second merge.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable calling merge_content() and providing more information about renames
and D/F conflicts (which we will want to do from process_df_entry()).
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To facilitate having this function called later using information stored
in a rename_df_conflict_info struct, accept a diff_filepair instead of a
rename.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Modify process_df_entry() (mostly just indentation level changes) to
get it ready for handling more D/F conflict type cases.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If an entry has an associated rename_df_conflict_info, skip it and allow
it to be processed by process_df_entry().
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
process_renames() and process_entry() have nearly identical code for
doing three-way file merging to resolve content changes. Since we are
already deferring some of the current rename handling in order to better
handle D/F conflicts, it seems to make sense to defer content merging as
well and remove the (nearly) duplicated code sections for handling this
merging.
To facilitate this process, add a new update_stages_and_entry() function
which will map the higher stage index entries from two files involved in a
rename into the resulting rename destination's index entries, and update
the associated stage_data structure.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we need to resolve paths (including renames) in-core first and defer
checking of D/F conflicts (namely waiting to see if directories are still
in the way after all paths are resolved) before updating files involved in
D/F conflicts, we will need to first process_renames, then record some
information about the rename needed at D/F resolution time, and then make
use of that information when resolving D/F conflicts at the end.
This commit adds some relevant data structures for storing the necessary
information.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This move is in preparation for merge_content growing and being called from
multiple places in order to handle D/F conflicts.
I also snuck in a small change to the output in the case that the merged
content for the file matches the current file contents, to make it better
match (and thus more able to take over) how other merge_file() calls in
process_renames() are handled.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This move is in preparation for the function being called from multiple
places in order to handle D/F conflicts.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This move is in preparation for the function growing and being called from
multiple places in order to handle D/F conflicts.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we want to resolve merges in-core and then detect at the end whether
D/F conflicts remain in the way, we should just apply renames in-core and
let logic elsewhere check for D/F conflicts.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The names conflict_rename_rename and conflict_rename_rename_2 did not make
it clear what they were handling. Since the first of these handles one
file being renamed in both branches to different files, while the latter
handles two different files being renamed to the same thing, add a little
'1to2' and '2to1' suffix on these and an explanatory comment to make their
intent clearer.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>