Commit Graph

608 Commits

Author SHA1 Message Date
Junio C Hamano
0b179f3175 Merge branch 'nd/sha1-name-c-wo-the-repository'
Further code clean-up to allow the lowest level of name-to-object
mapping layer to work with a passed-in repository other than the
default one.

* nd/sha1-name-c-wo-the-repository: (34 commits)
  sha1-name.c: remove the_repo from get_oid_mb()
  sha1-name.c: remove the_repo from other get_oid_*
  sha1-name.c: remove the_repo from maybe_die_on_misspelt_object_name
  submodule-config.c: use repo_get_oid for reading .gitmodules
  sha1-name.c: add repo_get_oid()
  sha1-name.c: remove the_repo from get_oid_with_context_1()
  sha1-name.c: remove the_repo from resolve_relative_path()
  sha1-name.c: remove the_repo from diagnose_invalid_index_path()
  sha1-name.c: remove the_repo from handle_one_ref()
  sha1-name.c: remove the_repo from get_oid_1()
  sha1-name.c: remove the_repo from get_oid_basic()
  sha1-name.c: remove the_repo from get_describe_name()
  sha1-name.c: remove the_repo from get_oid_oneline()
  sha1-name.c: add repo_interpret_branch_name()
  sha1-name.c: remove the_repo from interpret_branch_mark()
  sha1-name.c: remove the_repo from interpret_nth_prior_checkout()
  sha1-name.c: remove the_repo from get_short_oid()
  sha1-name.c: add repo_for_each_abbrev()
  sha1-name.c: store and use repo in struct disambiguate_state
  sha1-name.c: add repo_find_unique_abbrev_r()
  ...
2019-05-09 00:37:25 +09:00
Junio C Hamano
96379f043f Merge branch 'en/merge-directory-renames'
"git merge-recursive" backend recently learned a new heuristics to
infer file movement based on how other files in the same directory
moved.  As this is inherently less robust heuristics than the one
based on the content similarity of the file itself (rather than
based on what its neighbours are doing), it sometimes gives an
outcome unexpected by the end users.  This has been toned down to
leave the renamed paths in higher/conflicted stages in the index so
that the user can examine and confirm the result.

* en/merge-directory-renames:
  merge-recursive: switch directory rename detection default
  merge-recursive: give callers of handle_content_merge() access to contents
  merge-recursive: track information associated with directory renames
  t6043: fix copied test description to match its purpose
  merge-recursive: switch from (oid,mode) pairs to a diff_filespec
  merge-recursive: cleanup handle_rename_* function signatures
  merge-recursive: track branch where rename occurred in rename struct
  merge-recursive: remove ren[12]_other fields from rename_conflict_info
  merge-recursive: shrink rename_conflict_info
  merge-recursive: move some struct declarations together
  merge-recursive: use 'ci' for rename_conflict_info variable name
  merge-recursive: rename locals 'o' and 'a' to 'obuf' and 'abuf'
  merge-recursive: rename diff_filespec 'one' to 'o'
  merge-recursive: rename merge_options argument from 'o' to 'opt'
  Use 'unsigned short' for mode, like diff_filespec does
2019-05-09 00:37:22 +09:00
Nguyễn Thái Ngọc Duy
a133c40b23 commit.cocci: refactor code, avoid double rewrite
"maybe" pointer in 'struct commit' is tricky because it can be lazily
initialized to take advantage of commit-graph if available. This makes
it not safe to access directly.

This leads to a rule in commit.cocci to rewrite 'x->maybe_tree' to
'get_commit_tree(x)'. But that rule alone could lead to incorrectly
rewrite assignments, e.g. from

    x->maybe_tree = yes

to

    get_commit_tree(x) = yes

Because of this we have a second rule to revert this effect. Szeder
found out that we could do better by performing the assignment rewrite
rule first, then the remaining is read-only access and handled by the
current first rule.

For this to work, we need to transform "x->maybe_tree = y" to something
that does NOT contain "x->maybe_tree" to avoid the original first
rule. This is where set_commit_tree() comes in.

Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-16 18:56:51 +09:00
Elijah Newren
8c8e5bd6eb merge-recursive: switch directory rename detection default
When all of x/a, x/b, and x/c have moved to z/a, z/b, and z/c on one
branch, there is a question about whether x/d added on a different
branch should remain at x/d or appear at z/d when the two branches are
merged.  There are different possible viewpoints here:

  A) The file was placed at x/d; it's unrelated to the other files in
     x/ so it doesn't matter that all the files from x/ moved to z/ on
     one branch; x/d should still remain at x/d.

  B) x/d is related to the other files in x/, and x/ was renamed to z/;
     therefore x/d should be moved to z/d.

Since there was no ability to detect directory renames prior to
git-2.18, users experienced (A) regardless of context.  Choice (B) was
implemented in git-2.18, with no option to go back to (A), and has been
in use since.  However, one user reported that the merge results did not
match their expectations, making the change of default problematic,
especially since there was no notice printed when directory rename
detection moved files.

Note that there is also a third possibility here:

  C) There are different answers depending on the context and content
     that cannot be determined by git, so this is a conflict.  Use a
     higher stage in the index to record the conflict and notify the
     user of the potential issue instead of silently selecting a
     resolution for them.

Add an option for users to specify their preference for whether to use
directory rename detection, and default to (C).  Even when directory
rename detection is on, add notice messages about files moved into new
directories.

As a sidenote, x/d did not have to be a new file here; it could have
already existed at some other path and been renamed to x/d, with
directory rename detection just renaming it again to z/d.  Thus, it's
not just new files, but also a modification to all rename types (normal
renames, rename/add, rename/delete, rename/rename(1to1),
rename/rename(1to2), and rename/rename(2to1)).

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:08 +09:00
Elijah Newren
e62d11239c merge-recursive: give callers of handle_content_merge() access to contents
Pass a merge_file_info struct to handle_content_merge() so that the
callers can access the oid and mode of the result afterward.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:08 +09:00
Elijah Newren
6d169fd321 merge-recursive: track information associated with directory renames
Directory rename detection previously silently applied.  In order to
allow printing information about paths that changed or printing a
conflict notification (and only doing so near other potential conflict
messages associated with the paths), save this information inside the
rename struct for later use.  A subsequent patch will make use of the
additional information.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:08 +09:00
Elijah Newren
8daec1df03 merge-recursive: switch from (oid,mode) pairs to a diff_filespec
There was a significant inconsistency in the various parts of the API
used in merge-recursive; many places used a pair of (oid, mode) to track
file version/contents, while other parts used a diff_filespec (which
have an oid and mode embedded in it).  This inconsistency caused lots of
places to need to pack and unpack data to call into other functions.
This has been the subject of some past cleanups (see e.g. commit
0270a07ad0 ("merge-recursive: remove final remaining caller of
merge_file_one()", 2018-09-19)), but let's just remove the underlying
mess altogether by switching to use diff_filespec.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
e2d563dfa9 merge-recursive: cleanup handle_rename_* function signatures
Instead of passing various bits and pieces of 'ci', just pass it
directly.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
c336ab8593 merge-recursive: track branch where rename occurred in rename struct
We previously tracked the branch associated with a rename in a separate
field in rename_conflict_info, but since it is directly associated with
the rename it makes more sense to move it into the rename struct.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
3f9c92ec99 merge-recursive: remove ren[12]_other fields from rename_conflict_info
The ren1_other and ren2_other fields were synthesized from information
in ren1->src_entry and ren2->src_entry.  Since we already have the
necessary information in ren1 and ren2, just use those.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
e9cd1b5ca4 merge-recursive: shrink rename_conflict_info
The rename_conflict_info struct used both a pair and a stage_data which
were taken from a rename struct.  Just use the original rename struct.
This will also allow us to start making other simplifications to the
code.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
967d6be725 merge-recursive: move some struct declarations together
These structs are related and reference each other, so move them
together to make it easier for folks to determine what they hold and
what their purpose is.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
043622b2e9 merge-recursive: use 'ci' for rename_conflict_info variable name
We used a couple different names, but used 'ci' the most.  Use the same
variable name throughout for a little extra consistency.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
93a02c5553 merge-recursive: rename locals 'o' and 'a' to 'obuf' and 'abuf'
Since we want to replace oid,mode pairs with a single diff_filespec,
we will soon want to be able to use the names 'o', 'a', and 'b' for
the three different file versions.  Rename some local variables in
blob_unchanged() that would otherwise conflict.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
e3de888ca0 merge-recursive: rename diff_filespec 'one' to 'o'
In the previous commit, we noted that several places throughout merge
recursive both had a reason to use 'o'; some for a merge_options struct,
and others for a diff_filespec struct.  Some places had both, forcing
one of the two to be renamed, though the choice was inconsistent.  Now
that the merge_options struct has been renamed to 'opt' everywhere, we
can replace the few places that used 'one' for the diff_filespec to 'o'.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
259ccb6cc3 merge-recursive: rename merge_options argument from 'o' to 'opt'
The name 'o' was used for the merge_options struct pointer taken by many
functions, but in a few places it was named 'opt'.  Several functions
that didn't need merge_options instead used 'o' for a diff_filespec
argument or local.  Some functions needed both an inconsistently either
renamed the merge_options to 'opt' or the diff_filespec to 'one'.  I
want to remove the weird split in the codebase between using a
diff_filespec and a pair of (oid,mode) values in favor of using a
diff_filespec everywhere, but that dramatically increases the number of
cases where we want to use 'o' as a diff_filespec.  Rename the
merge_options argument to 'opt' to make room.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
Elijah Newren
5ec1e72823 Use 'unsigned short' for mode, like diff_filespec does
struct diff_filespec defines mode to be an 'unsigned short'.  Several
other places in the API which we'd like to interact with using a
diff_filespec used a plain unsigned (or unsigned int).  This caused
problems when taking addresses, so switch to unsigned short.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
brian m. carlson
db1ba2a230 submodule: avoid hard-coded constants
Instead of using hard-coded 40-based constants, express these values in
terms of the_hash_algo and GIT_MAX_HEXSZ.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-01 11:57:37 +09:00
Junio C Hamano
b0e7fb2e5c Merge branch 'nd/completion-more-parameters'
The command line completion (in contrib/) has been taught to
complete more subcommand parameters.

* nd/completion-more-parameters:
  completion: add more parameter value completion
2019-03-07 09:59:58 +09:00
Nguyễn Thái Ngọc Duy
5a59a2301f completion: add more parameter value completion
This adds value completion for a couple more paramters. To make it
easier to maintain these hard coded lists, add a comment at the original
list/code to remind people to update git-completion.bash too.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-20 12:31:56 -08:00
Jeff King
b53c502807 merge-recursive: drop several unused parameters
There are a few functions related to directory renames that have unused
parameters. After consulting with the author in [1], these seem to be
leftover cruft from the development process, and not signs of any bug.
Let's drop them.

[1] https://public-inbox.org/git/CABPp-BHobf8wbBsXF97scNQCzkxQukziODRXq6JOOWq61cAd9g@mail.gmail.com/

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-14 15:26:15 -08:00
Junio C Hamano
7589e63648 Merge branch 'nd/the-index-final'
The assumption to work on the single "in-core index" instance has
been reduced from the library-ish part of the codebase.

* nd/the-index-final:
  cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch
  read-cache.c: remove the_* from index_has_changes()
  merge-recursive.c: remove implicit dependency on the_repository
  merge-recursive.c: remove implicit dependency on the_index
  sha1-name.c: remove implicit dependency on the_index
  read-cache.c: replace update_index_if_able with repo_&
  read-cache.c: kill read_index()
  checkout: avoid the_index when possible
  repository.c: replace hold_locked_index() with repo_hold_locked_index()
  notes-utils.c: remove the_repository references
  grep: use grep_opt->repo instead of explict repo argument
2019-02-06 22:05:23 -08:00
Junio C Hamano
d6f05a435f Merge branch 'nd/attr-pathspec-in-tree-walk'
The traversal over tree objects has learned to honor
":(attr:label)" pathspec match, which has been implemented only for
enumerating paths on the filesystem.

* nd/attr-pathspec-in-tree-walk:
  tree-walk: support :(attr) matching
  dir.c: move, rename and export match_attrs()
  pathspec.h: clean up "extern" in function declarations
  tree-walk.c: make tree_entry_interesting() take an index
  tree.c: make read_tree*() take 'struct repository *'
2019-01-14 15:29:28 -08:00
Nguyễn Thái Ngọc Duy
150fe065f7 read-cache.c: remove the_* from index_has_changes()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14 12:13:05 -08:00
Nguyễn Thái Ngọc Duy
d7cf3a96e9 merge-recursive.c: remove implicit dependency on the_repository
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14 12:13:04 -08:00
Nguyễn Thái Ngọc Duy
0d6caa2d08 merge-recursive.c: remove implicit dependency on the_index
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14 12:13:04 -08:00
Nguyễn Thái Ngọc Duy
e1ff0a32e4 read-cache.c: kill read_index()
read_index() shares the same problem as hold_locked_index(): it
assumes $GIT_DIR/index. Move all call sites to repo_read_index()
instead. read_index_preload() and read_index_unmerged() are also
killed as a consequence.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14 12:13:04 -08:00
Nguyễn Thái Ngọc Duy
3a95f31d1c repository.c: replace hold_locked_index() with repo_hold_locked_index()
hold_locked_index() assumes the index path at $GIT_DIR/index. This is
not good for places that take an arbitrary index_state instead of
the_index, which is basically everywhere except builtin/.

Replace it with repo_hold_locked_index(). hold_locked_index() remains
as a wrapper around repo_hold_locked_index() to reduce changes in builtin/

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14 12:13:04 -08:00
Junio C Hamano
ac193e0e0a Merge branch 'en/merge-path-collision'
Updates for corner cases in merge-recursive.

* en/merge-path-collision:
  t6036: avoid non-portable "cp -a"
  merge-recursive: combine error handling
  t6036, t6043: increase code coverage for file collision handling
  merge-recursive: improve rename/rename(1to2)/add[/add] handling
  merge-recursive: use handle_file_collision for add/add conflicts
  merge-recursive: improve handling for rename/rename(2to1) conflicts
  merge-recursive: fix rename/add conflict handling
  merge-recursive: new function for better colliding conflict resolutions
  merge-recursive: increase marker length with depth of recursion
  t6036, t6042: testcases for rename collision of already conflicting files
  t6042: add tests for consistency in file collision conflict handling
2019-01-04 13:33:32 -08:00
Nguyễn Thái Ngọc Duy
e092073d64 tree.c: make read_tree*() take 'struct repository *'
These functions call tree_entry_interesting() which will soon require
a 'struct index_state *' to be passed in. Instead of just changing the
function signature to take an index, update to take a repo instead
because these functions do need object database access.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-19 10:50:33 +09:00
Derrick Stolee
80cee6e321 merge-recursive: combine error handling
In handle_rename_rename_1to2(), we have duplicated error handling
around colliding paths. Specifically, when we want to write out
the file and there is a directory or untracked file in the way,
we need to create a temporary file to hold the contents. This has
some special output to alert the user, and this output is
duplicated for each side of the conflict.

Simplify the call by generating this new path in a helper
function.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-08 14:23:54 +09:00
Elijah Newren
48c9cb9d6d merge-recursive: improve rename/rename(1to2)/add[/add] handling
When we have a rename/rename(1to2) conflict, each of the renames can
collide with a file addition.  Each of these rename/add conflicts suffered
from the same kinds of problems that normal rename/add suffered from.
Make the code use handle_file_conflicts() as well so that we get all the
same fixes and consistent behavior between the different conflict types.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-08 14:23:53 +09:00
Elijah Newren
dcf2815098 merge-recursive: use handle_file_collision for add/add conflicts
This results in no-net change of behavior, it simply ensures that all
file-collision conflict handling types are being handled the same by
calling the same function.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-08 14:23:53 +09:00
Elijah Newren
bbafc9c44a merge-recursive: improve handling for rename/rename(2to1) conflicts
This makes the rename/rename(2to1) conflicts use the new
handle_file_collision() function.  Since that function was based
originally on the rename/rename(2to1) handling code, the main
differences here are in what was added.  In particular:

  * Instead of storing files at collide_path~HEAD and collide_path~MERGE,
    the files are two-way merged and recorded at collide_path.

  * Instead of recording the version of the renamed file that existed
    on the renamed side in the index (thus ignoring any changes that
    were made to the file on the side of history without the rename),
    we do a three-way content merge on the renamed path, then store
    that at either stage 2 or stage 3.

  * Note that since the content merge for each rename may have conflicts,
    and then we have to merge the two renamed files, we can end up with
    nested conflict markers.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-08 14:23:53 +09:00
Elijah Newren
7f8671656f merge-recursive: fix rename/add conflict handling
This makes the rename/add conflict handling make use of the new
handle_file_collision() function, which fixes several bugs and improves
things for the rename/add case significantly.  Previously, rename/add
would:

  * Not leave any higher order stage entries in the index, making it
    appear as if there were no conflict.
  * Would place the rename file at the colliding path, and move the
    added file elsewhere, which combined with the lack of higher order
    stage entries felt really odd.  It's not clear to me why the
    rename should take precedence over the add; if one should be moved
    out of the way, they both probably should.
  * In the recursive case, it would do a two way merge of the added
    file and the version of the renamed file on the renamed side,
    completely excluding modifications to the renamed file on the
    unrenamed side of history.

Use the new handle_file_collision() to fix all of these issues.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-08 14:23:53 +09:00
Elijah Newren
37b65ce36b merge-recursive: new function for better colliding conflict resolutions
There are three conflict types that represent two (possibly entirely
unrelated) files colliding at the same location:
  * add/add
  * rename/add
  * rename/rename(2to1)

These three conflict types already share more similarity than might be
immediately apparent from their description: (1) the handling of the
rename variants already involves removing any entries from the index
corresponding to the original file names[*], thus only leaving entries
in the index for the colliding path; (2) likewise, any trace of the
original file name in the working tree is also removed.  So, in all
three cases we're left with how to represent two colliding files in both
the index and the working copy.

[*] Technically, this isn't quite true because rename/rename(2to1)
conflicts in the recursive (o->call_depth > 0) case do an "unrename"
since about seven years ago.  But even in that case, Junio felt
compelled to explain that my decision to "unrename" wasn't necessarily
the only or right answer -- search for "Comment from Junio" in t6036 for
details.

My initial motivation for looking at these three conflict types was that
if the handling of these three conflict types is the same, at least in
the limited set of cases where a renamed file is unmodified on the side
of history where the file is not renamed, then a significant performance
improvement for rename detection during merges is possible.  However,
while that served as motivation to look at these three types of
conflicts, the actual goal of this new function is to try to improve the
handling for all three cases, not to merely make them the same as each
other in that special circumstance.

=== Handling the working tree ===

The previous behavior for these conflict types in regards to the
working tree (assuming the file collision occurs at 'foo') was:
  * add/add does a two-way merge of the two files and records it as 'foo'.
  * rename/rename(2to1) records the two different files into two new
    uniquely named files (foo~HEAD and foo~$MERGE), while removing 'foo'
    from the working tree.
  * rename/add records the two different files into two different
    locations, recording the add at foo~$SIDE and, oddly, recording
    the rename at foo (why is the rename more important than the add?)

So, the question for what to write to the working tree boils down to
whether the two colliding files should be two-way merged and recorded in
place, or recorded into separate files.  As per discussion on the git
mailing lit, two-way merging was deemed to always be preferred, as that
makes these cases all more like content conflicts that users can handle
from within their favorite editor, IDE, or merge tool.  Note that since
renames already involve a content merge, rename/add and
rename/rename(2to1) conflicts could result in nested conflict markers.

=== Handling of the index ===

For a typical rename, unpack_trees() would set up the index in the
following fashion:
           old_path  new_path
   stage1: 5ca1ab1e  00000000
   stage2: f005ba11  00000000
   stage3: 00000000  b0a710ad
And merge-recursive would rewrite this to
           new_path
   stage1: 5ca1ab1e
   stage2: f005ba11
   stage3: b0a710ad
Removing old_path from the index means the user won't have to `git rm
old_path` manually every time a renamed path has a content conflict.
It also means they can use `git checkout [--ours|--theirs|--conflict|-m]
new_path`, `git diff [--ours|--theirs]` and various other commands that
would be difficult otherwise.

This strategy becomes a problem when we have a rename/add or
rename/rename(2to1) conflict, however, because then we have only three
slots to store blob sha1s and we need either four or six.  Previously,
this was handled by continuing to delete old_path from the index, and
just outright ignoring any blob shas from old_path.  That had the
downside of deleting any trace of changes made to old_path on the other
side of history.  This function instead does a three-way content merge of
the renamed file, and stores the blob sha1 for that at either stage2 or
stage3 for new_path (depending on which side the rename came from).  That
has the advantage of bringing information about changes on both sides and
still allows for easy resolution (no need to git rm old_path, etc.), but
does have the downside that if the content merge had conflict markers,
then what we store in the index is the sha1 of a blob with conflict
markers.  While that is a downside, it seems less problematic than the
downsides of any obvious alternatives, and certainly makes more sense
than the previous handling.  Further, it has a precedent in that when we
do recursive merges, we may accept a file with conflict markers as the
resolution for the merge of the merge-bases, which will then show up in
the index of the outer merge at stage 1 if a conflict exists at the outer
level.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-08 14:23:53 +09:00
Elijah Newren
b2a7942b8b merge-recursive: increase marker length with depth of recursion
Later patches in this series will modify file collision conflict
handling (e.g. from rename/add and rename/rename(2to1) conflicts) so
that multiply nested conflict markers can arise even before considering
conflicts in the virtual merge base.  Including the virtual merge base
will provide a way to get triply (or higher) nested conflict markers.
This new way to get nested conflict markers will force the need for a
more general mechanism to extend the length of conflict markers in order
to differentiate between different nestings.

Along with this change to conflict marker length handling, we want to
make sure that we don't regress handling for other types of conflicts
with nested conflict markers.  Add a more involved testcase using
merge.conflictstyle=diff3, where not only does the virtual merge base
contain conflicts, but its virtual merge base does as well (i.e. a case
with triply nested conflict markers).  While there are multiple
reasonable ways to handle nested conflict markers in the virtual merge
base for this type of situation, the easiest approach that dovetails
well with the new needs for the file collision conflict handling is to
require that the length of the conflict markers increase with each
subsequent nesting.

Subsequent patches which change the rename/add and rename/rename(2to1)
conflict handling will modify the extra_marker_size flag appropriately
for their new needs.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-08 14:23:53 +09:00
Junio C Hamano
ff8e25e99e Merge branch 'en/merge-cleanup-more'
Further clean-up of merge-recursive machinery.

* en/merge-cleanup-more:
  merge-recursive: avoid showing conflicts with merge branch before HEAD
  merge-recursive: improve auto-merging messages with path collisions
2018-11-03 00:53:57 +09:00
Junio C Hamano
11877b9ebe Merge branch 'nd/the-index'
Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".

* nd/the-index: (23 commits)
  revision.c: reduce implicit dependency the_repository
  revision.c: remove implicit dependency on the_index
  ws.c: remove implicit dependency on the_index
  tree-diff.c: remove implicit dependency on the_index
  submodule.c: remove implicit dependency on the_index
  line-range.c: remove implicit dependency on the_index
  userdiff.c: remove implicit dependency on the_index
  rerere.c: remove implicit dependency on the_index
  sha1-file.c: remove implicit dependency on the_index
  patch-ids.c: remove implicit dependency on the_index
  merge.c: remove implicit dependency on the_index
  merge-blobs.c: remove implicit dependency on the_index
  ll-merge.c: remove implicit dependency on the_index
  diff-lib.c: remove implicit dependency on the_index
  read-cache.c: remove implicit dependency on the_index
  diff.c: remove implicit dependency on the_index
  grep.c: remove implicit dependency on the_index
  diff.c: remove the_index dependency in textconv() functions
  blame.c: rename "repo" argument to "r"
  combine-diff.c: remove implicit dependency on the_index
  ...
2018-10-19 13:34:02 +09:00
Elijah Newren
4f44545313 merge-recursive: avoid showing conflicts with merge branch before HEAD
We want to load unmerged entries from HEAD into the index at stage 2 and
from MERGE_HEAD into stage 3.  Similarly, folks expect merge conflicts
to look like

    <<<<<<<< HEAD
    content from our side
    ========
    content from their side
    >>>>>>>> MERGE_HEAD

not

    <<<<<<<< MERGE_HEAD
    content from their side
    ========
    content from our side
    >>>>>>>> HEAD

The correct order usually comes naturally and for free, but with renames
we often have data in the form {rename_branch, other_branch}, and
working relative to the rename first (e.g. for rename/add) is more
convenient elsewhere in the code.  Address the slight impedance
mismatch by having some functions re-call themselves with flipped
arguments when the branch order is reversed.

Note that setup_rename_conflict_info() has one asymmetry in it, in
setting dst_entry1->processed=0 but not doing similarly for
dst_entry2->processed.  When dealing with rename/rename and similar
conflicts, we do not want the processing to happen twice, so the
desire to only set one of the entries to unprocessed is intentional.
So, while this change modifies which branch's entry will be marked as
unprocessed, that dovetails nicely with putting HEAD first so that we
get the index stage entries and conflict markers in the right order.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-18 14:45:39 +09:00
Elijah Newren
2b168ef3ff merge-recursive: improve auto-merging messages with path collisions
Each individual file involved in a rename could have also been modified
on both sides of history, meaning it may need to have content merges.
If two such files are renamed into the same location, then on top of the
two natural auto-merging messages we also have to two-way merge the
result, giving us messages that look like

  Auto-merging somefile.c (was somecase.c)
  Auto-merging somefile.c (was somefolder.c)
  Auto-merging somefile.c

However, despite the fact that I was the one who put the "(was %s)"
portions into the messages (and just a few months ago), I was still
initially confused when running into a rename/rename(2to1) case and
wondered if somefile.c had been merged three times.  Update this to
instead be:

  Auto-merging version of somefile.c from somecase.c
  Auto-merging version of somefile.c from someportfolio.c
  Auto-merging somefile.c

This is an admittedly long set of messages for a single path, but you
only get all three messages when dealing with the rare case of a
rename/rename(2to1) conflict where both sides of both original files
were also modified, in conflicting ways.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-18 13:54:36 +09:00
Junio C Hamano
74bb46354f Merge branch 'en/merge-cleanup'
Code clean-up.

* en/merge-cleanup:
  merge-recursive: rename merge_file_1() and merge_content()
  merge-recursive: remove final remaining caller of merge_file_one()
  merge-recursive: avoid wrapper function when unnecessary and wasteful
  merge-recursive: set paths correctly when three-way merging content
2018-10-16 16:16:00 +09:00
Junio C Hamano
ae109a9789 Merge branch 'en/double-semicolon-fix'
Code clean-up.

* en/double-semicolon-fix:
  Remove superfluous trailing semicolons
2018-09-24 10:30:47 -07:00
Nguyễn Thái Ngọc Duy
2abf350385 revision.c: remove implicit dependency on the_index
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21 09:51:19 -07:00
Nguyễn Thái Ngọc Duy
32eaa46883 ll-merge.c: remove implicit dependency on the_index
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21 09:48:10 -07:00
Nguyễn Thái Ngọc Duy
e675765235 diff.c: remove implicit dependency on the_index
A new variant repo_diff_setup() is added that takes 'struct repository *'
and diff_setup() becomes a thin macro around it that is protected by
NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to NO_THE_INDEX_....
The plan is these macros will always be defined for all library files
and the macros are only accessible in builtin/

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21 09:48:10 -07:00
Elijah Newren
d957355675 merge-recursive: rename merge_file_1() and merge_content()
Summary:
  merge_file_1()  -> merge_mode_and_contents()
  merge_content() -> handle_content_merge()

merge_file_1() is a very unhelpful name.  Rename it to
merge_mode_and_contents() to reflect what it does.

merge_content() calls merge_mode_and_contents() to do the main part of
its work, but most of this function was about higher level stuff, e.g.
printing out conflict messages, updating skip_worktree bits, checking
for ability to avoid updating the working tree or for D/F conflicts
being in the way, etc.  Since there are several handle_*() functions for
similar levels of checking and handling in merge-recursive.c (e.g.
handle_change_delete(), handle_rename_rename_2to1()), let's rename this
function to handle_content_merge().

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-20 12:33:57 -07:00
Elijah Newren
0270a07ad0 merge-recursive: remove final remaining caller of merge_file_one()
The function names merge_file_one() and merge_file_1() aren't
particularly intuitive function names, especially since there is no
associated merge_file() function that these are related to.  The
previous commit showed that merge_file_one() was prone to be called when
merge_file_1() should be, and since it is just a thin wrapper around
merge_file_1() anyway and only has one caller left, let's just remove
merge_file_one() entirely.

(It also turns out that the one remaining caller of merge_file_one()
has very broken code that needs to be completely rewritten, but that's
the subject of a future patch series; for now, we're just translating
it into a merge_file_1() call.)

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-20 12:33:57 -07:00
Elijah Newren
75f3fa79c3 merge-recursive: avoid wrapper function when unnecessary and wasteful
merge_file_one() is a convenience function taking a bunch of oids and
modes, combining each pair into a diff_filespec, and then calling
merge_file_1().  When we already start with diff_filespec's, we can
just call merge_file_1() directly instead of splitting out the oids
and modes for the wrapper to recombine into what we already had.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-20 12:33:57 -07:00
Elijah Newren
52396e1d3d merge-recursive: set paths correctly when three-way merging content
merge_3way() has code to mark different sides of the conflict with info
about where the content comes from.  If the names of the files involved
match, it simply uses the branch name.  If the names of the files do not
match, it uses branchname:filename.  Unfortunately, merge_content()
previously always called it with one.path = a.path = b.path.  Granted,
it didn't have other path information available to it for years, but
that was corrected by passing rename_conflict_info in commit
3c217c077a ("merge-recursive: Provide more info in conflict markers
with file renames", 2011-08-11).  In that commit, instead of just fixing
the bug with the pathnames, it created fake branch names incorporating
both the branch name and file name.

This "fake branch" workaround was extended further when I pulled that
logic out into a special function in commit dac4741554
("merge-recursive: Create function for merging with branchname:file
markers", 2011-08-11), and a number of other sites outside of
merge_content() have been added which call into that.  However, this
Rube-Goldberg-esque setup is not merely duplicate code and unnecessary
work, it also risked having other callsites invoke it in a way that
would result in markers of the form branchname:filename:filename (i.e.
with the filename repeated).

Fix this whole mess by:
  - setting one.path, a.path, and b.path appropriately
  - calling merge_file_1() directly
  - deleting the merge_file_special_markers() workaround wrapper

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-20 12:33:57 -07:00