How to keep a project's canonical history correct.
During the mail thread about "Pull is mostly evil" a user asked how the first parent could become reversed. This howto explains how the first parent can get reversed when viewed by the project and then explains a method to keep the history correct. Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
parent
b4f86a4ce8
commit
0678b649a1
@ -59,6 +59,7 @@ SP_ARTICLES += howto/recover-corrupted-blob-object
|
||||
SP_ARTICLES += howto/recover-corrupted-object-harder
|
||||
SP_ARTICLES += howto/rebuild-from-update-hook
|
||||
SP_ARTICLES += howto/rebase-from-internal-branch
|
||||
SP_ARTICLES += howto/keep-canonical-history-correct
|
||||
SP_ARTICLES += howto/maintain-git
|
||||
API_DOCS = $(patsubst %.txt,%,$(filter-out technical/api-index-skel.txt technical/api-index.txt, $(wildcard technical/api-*.txt)))
|
||||
SP_ARTICLES += $(API_DOCS)
|
||||
|
216
Documentation/howto/keep-canonical-history-correct.txt
Normal file
216
Documentation/howto/keep-canonical-history-correct.txt
Normal file
@ -0,0 +1,216 @@
|
||||
From: Junio C Hamano <gitster@pobox.com>
|
||||
Date: Wed, 07 May 2014 13:15:39 -0700
|
||||
Subject: Beginner question on "Pull is mostly evil"
|
||||
Abstract: This how-to explains a method for keeping a
|
||||
project's history correct when using git pull.
|
||||
Content-type: text/asciidoc
|
||||
|
||||
Keep authoritative canonical history correct with git pull
|
||||
==========================================================
|
||||
|
||||
Sometimes a new project integrator will end up with project history
|
||||
that appears to be "backwards" from what other project developers
|
||||
expect. This howto presents a suggested integration workflow for
|
||||
maintaining a central repository.
|
||||
|
||||
Suppose that that central repository has this history:
|
||||
|
||||
------------
|
||||
---o---o---A
|
||||
------------
|
||||
|
||||
which ends at commit `A` (time flows from left to right and each node
|
||||
in the graph is a commit, lines between them indicating parent-child
|
||||
relationship).
|
||||
|
||||
Then you clone it and work on your own commits, which leads you to
|
||||
have this history in *your* repository:
|
||||
|
||||
------------
|
||||
---o---o---A---B---C
|
||||
------------
|
||||
|
||||
Imagine your coworker did the same and built on top of `A` in *his*
|
||||
repository in the meantime, and then pushed it to the
|
||||
central repository:
|
||||
|
||||
------------
|
||||
---o---o---A---X---Y---Z
|
||||
------------
|
||||
|
||||
Now, if you `git push` at this point, because your history that leads
|
||||
to `C` lacks `X`, `Y` and `Z`, it will fail. You need to somehow make
|
||||
the tip of your history a descendant of `Z`.
|
||||
|
||||
One suggested way to solve the problem is "fetch and then merge", aka
|
||||
`git pull`. When you fetch, your repository will have a history like
|
||||
this:
|
||||
|
||||
------------
|
||||
---o---o---A---B---C
|
||||
\
|
||||
X---Y---Z
|
||||
------------
|
||||
|
||||
Once you run merge after that, while still on *your* branch, i.e. `C`,
|
||||
you will create a merge `M` and make the history look like this:
|
||||
|
||||
------------
|
||||
---o---o---A---B---C---M
|
||||
\ /
|
||||
X---Y---Z
|
||||
------------
|
||||
|
||||
`M` is a descendant of `Z`, so you can push to update the central
|
||||
repository. Such a merge `M` does not lose any commit in both
|
||||
histories, so in that sense it may not be wrong, but when people want
|
||||
to talk about "the authoritative canonical history that is shared
|
||||
among the project participants", i.e. "the trunk", they often view
|
||||
it as "commits you see by following the first-parent chain", and use
|
||||
this command to view it:
|
||||
|
||||
------------
|
||||
$ git log --first-parent
|
||||
------------
|
||||
|
||||
For all other people who observed the central repository after your
|
||||
coworker pushed `Z` but before you pushed `M`, the commit on the trunk
|
||||
used to be `o-o-A-X-Y-Z`. But because you made `M` while you were on
|
||||
`C`, `M`'s first parent is `C`, so by pushing `M` to advance the
|
||||
central repository, you made `X-Y-Z` a side branch, not on the trunk.
|
||||
|
||||
You would rather want to have a history of this shape:
|
||||
|
||||
------------
|
||||
---o---o---A---X---Y---Z---M'
|
||||
\ /
|
||||
B-----------C
|
||||
------------
|
||||
|
||||
so that in the first-parent chain, it is clear that the project first
|
||||
did `X` and then `Y` and then `Z` and merged a change that consists of
|
||||
two commits `B` and `C` that achieves a single goal. You may have
|
||||
worked on fixing the bug #12345 with these two patches, and the merge
|
||||
`M'` with swapped parents can say in its log message "Merge
|
||||
fix-bug-12345". Having a way to tell `git pull` to create a merge
|
||||
but record the parents in reverse order may be a way to do so.
|
||||
|
||||
Note that I said "achieves a single goal" above, because this is
|
||||
important. "Swapping the merge order" only covers a special case
|
||||
where the project does not care too much about having unrelated
|
||||
things done on a single merge but cares a lot about first-parent
|
||||
chain.
|
||||
|
||||
There are multiple schools of thought about the "trunk" management.
|
||||
|
||||
1. Some projects want to keep a completely linear history without any
|
||||
merges. Obviously, swapping the merge order would not match their
|
||||
taste. You would need to flatten your history on top of the
|
||||
updated upstream to result in a history of this shape instead:
|
||||
+
|
||||
------------
|
||||
---o---o---A---X---Y---Z---B---C
|
||||
------------
|
||||
+
|
||||
with `git pull --rebase` or something.
|
||||
|
||||
2. Some projects tolerate merges in their history, but do not worry
|
||||
too much about the first-parent order, and allow fast-forward
|
||||
merges. To them, swapping the merge order does not hurt, but
|
||||
it is unnecessary.
|
||||
|
||||
3. Some projects want each commit on the "trunk" to do one single
|
||||
thing. The output of `git log --first-parent` in such a project
|
||||
would show either a merge of a side branch that completes a single
|
||||
theme, or a single commit that completes a single theme by itself.
|
||||
If your two commits `B` and `C` (or they may even be two groups of
|
||||
commits) were solving two independent issues, then the merge `M'`
|
||||
we made in the earlier example by swapping the merge order is
|
||||
still not up to the project standard. It merges two unrelated
|
||||
efforts `B` and `C` at the same time.
|
||||
|
||||
For projects in the last category (Git itself is one of them),
|
||||
individual developers would want to prepare a history more like
|
||||
this:
|
||||
|
||||
------------
|
||||
C0--C1--C2 topic-c
|
||||
/
|
||||
---o---o---A master
|
||||
\
|
||||
B0--B1--B2 topic-b
|
||||
------------
|
||||
|
||||
That is, keeping separate topics on separate branches, perhaps like
|
||||
so:
|
||||
|
||||
------------
|
||||
$ git clone $URL work && cd work
|
||||
$ git checkout -b topic-b master
|
||||
$ ... work to create B0, B1 and B2 to complete one theme
|
||||
$ git checkout -b topic-c master
|
||||
$ ... same for the theme of topic-c
|
||||
------------
|
||||
|
||||
And then
|
||||
|
||||
------------
|
||||
$ git checkout master
|
||||
$ git pull --ff-only
|
||||
------------
|
||||
|
||||
would grab `X`, `Y` and `Z` from the upstream and advance your master
|
||||
branch:
|
||||
|
||||
------------
|
||||
C0--C1--C2 topic-c
|
||||
/
|
||||
---o---o---A---X---Y---Z master
|
||||
\
|
||||
B0--B1--B2 topic-b
|
||||
------------
|
||||
|
||||
And then you would merge these two branches separately:
|
||||
|
||||
------------
|
||||
$ git merge topic-b
|
||||
$ git merge topic-c
|
||||
------------
|
||||
|
||||
to result in
|
||||
|
||||
------------
|
||||
C0--C1---------C2
|
||||
/ \
|
||||
---o---o---A---X---Y---Z---M---N
|
||||
\ /
|
||||
B0--B1-----B2
|
||||
------------
|
||||
|
||||
and push it back to the central repository.
|
||||
|
||||
It is very much possible that while you are merging topic-b and
|
||||
topic-c, somebody again advanced the history in the central repository
|
||||
to put `W` on top of `Z`, and make your `git push` fail.
|
||||
|
||||
In such a case, you would rewind to discard `M` and `N`, update the
|
||||
tip of your 'master' again and redo the two merges:
|
||||
|
||||
------------
|
||||
$ git reset --hard origin/master
|
||||
$ git pull --ff-only
|
||||
$ git merge topic-b
|
||||
$ git merge topic-c
|
||||
------------
|
||||
|
||||
The procedure will result in a history that looks like this:
|
||||
|
||||
------------
|
||||
C0--C1--------------C2
|
||||
/ \
|
||||
---o---o---A---X---Y---Z---W---M'--N'
|
||||
\ /
|
||||
B0--B1---------B2
|
||||
------------
|
||||
|
||||
See also http://git-blame.blogspot.com/2013/09/fun-with-first-parent-history.html
|
Loading…
Reference in New Issue
Block a user