[PATCH] Documentation: pull, push, packing repository and working with others.
Describe where you can pull from with a bit more detail.
Clarify description of pushing.
Add a section on packing repositories.
Add a section on recommended workflow for the project lead,
subsystem maintainers and individual developers.
Move "Tag" section around to make the flow of example simpler to
follow.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent e7c1ca4273
commit 3eb5128a10
@@ -453,6 +453,55 @@ With that, you should now be having some inkling of what git does, and
|
|||||||
can explore on your own.
|
can explore on your own.
|
||||||
|
|
||||||
|
|
||||||
|
[ Side note: most likely, you are not directly using the core
|
||||||
|
git Plumbing commands, but using Porcelain like Cogito on top
|
||||||
|
of it. Cogito works a bit differently and you usually do not
|
||||||
|
have to run "git-update-cache" yourself for changed files (you
|
||||||
|
do tell underlying git about additions and removals via
|
||||||
|
"cg-add" and "cg-rm" commands). Just before you make a commit
|
||||||
|
with "cg-commit", Cogito figures out which files you modified,
|
||||||
|
and runs "git-update-cache" on them for you. ]
|
||||||
|
|
||||||
|
|
||||||
|
Tagging a version
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
In git, there's two kinds of tags, a "light" one, and a "signed tag".
|
||||||
|
|
||||||
|
A "light" tag is technically nothing more than a branch, except we put
|
||||||
|
it in the ".git/refs/tags/" subdirectory instead of calling it a "head".
|
||||||
|
So the simplest form of tag involves nothing more than
|
||||||
|
|
||||||
|
cat .git/HEAD > .git/refs/tags/my-first-tag
|
||||||
|
|
||||||
|
after which point you can use this symbolic name for that particular
|
||||||
|
state. You can, for example, do
|
||||||
|
|
||||||
|
git diff my-first-tag
|
||||||
|
|
||||||
|
to diff your current state against that tag (which at this point will
|
||||||
|
obviously be an empty diff, but if you continue to develop and commit
|
||||||
|
stuff, you can use your tag as an "anchor-point" to see what has changed
|
||||||
|
since you tagged it.
|
||||||
|
|
||||||
|
A "signed tag" is actually a real git object, and contains not only a
|
||||||
|
pointer to the state you want to tag, but also a small tag name and
|
||||||
|
message, along with a PGP signature that says that yes, you really did
|
||||||
|
that tag. You create these signed tags with
|
||||||
|
|
||||||
|
git tag <tagname>
|
||||||
|
|
||||||
|
which will sign the current HEAD (but you can also give it another
|
||||||
|
argument that specifies the thing to tag, i.e. you could have tagged the
|
||||||
|
current "mybranch" point by using "git tag <tagname> mybranch").
|
||||||
|
|
||||||
|
You normally only do signed tags for major releases or things
|
||||||
|
like that, while the light-weight tags are useful for any marking you
|
||||||
|
want to do - any time you decide that you want to remember a certain
|
||||||
|
point, just create a private tag for it, and you have a nice symbolic
|
||||||
|
name for the state at that point.
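
The light tag described above can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses a present-day git with the default "files" ref storage, where .git/HEAD is usually a symbolic ref, so "git rev-parse HEAD" stands in for "cat .git/HEAD", and "git init", "git add" and "git commit" stand in for the older plumbing. The temporary directory, the identity and the file name are made up for the example.

```shell
set -e
# Throwaway identity and repository, purely for illustration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "Hello World" > hello
git add hello
git commit -q -m "Initial commit"

# A light tag is just a file under .git/refs/tags/ holding a commit name.
# (Assumes the "files" ref storage, where that directory exists.)
git rev-parse HEAD > .git/refs/tags/my-first-tag

# Nothing has changed since the tag, so this prints nothing.
git --no-pager diff my-first-tag

# After further edits the tag serves as an anchor point.
echo "It's a new day for git" >> hello
git --no-pager diff --stat my-first-tag
```

The tag file is written by hand here only to mirror the text; it shows that a light tag carries no data beyond the commit it names.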
|
||||||
|
|
||||||
|
|
||||||
Copying archives
|
Copying archives
|
||||||
-----------------
|
-----------------
|
||||||
|
|
||||||
@@ -729,117 +778,277 @@ simply do
|
|||||||
and optionally give a branch-name for the remote end as a second
|
and optionally give a branch-name for the remote end as a second
|
||||||
argument.
|
argument.
|
||||||
|
|
||||||
[ Todo: fill in real examples ]
|
The "remote" repository can even be on the same machine. One of
|
||||||
|
the following notations can be used to name the repository to
|
||||||
|
pull from:
|
||||||
|
|
||||||
|
Rsync URL
|
||||||
|
rsync://remote.machine/path/to/repo.git/
|
||||||
|
|
||||||
Tagging a version
|
HTTP(s) URL
|
||||||
-----------------
|
http://remote.machine/path/to/repo.git/
|
||||||
|
|
||||||
In git, there's two kinds of tags, a "light" one, and a "signed tag".
|
GIT URL
|
||||||
|
git://remote.machine/path/to/repo.git/
|
||||||
|
remote.machine:/path/to/repo.git/
|
||||||
|
|
||||||
A "light" tag is technically nothing more than a branch, except we put
|
Local directory
|
||||||
it in the ".git/refs/tags/" subdirectory instead of calling it a "head".
|
/path/to/repo.git/
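
The notations listed above can be told apart by their prefix. The following is only an illustrative sketch, not how git itself parses its argument: the function name classify_repo and the labels it prints are made up here, and the bare host:path form (listed under "GIT URL" above) is labelled "host-path" for want of a better name.

```shell
# classify_repo NAME
# Print a label for the notation used by a repository name.
# Hypothetical helper for illustration; the labels are not git terminology.
classify_repo() {
    case "$1" in
        rsync://*)          echo "rsync" ;;
        http://*|https://*) echo "http" ;;
        git://*)            echo "git" ;;
        /*)                 echo "local" ;;      # absolute local directory
        *:*)                echo "host-path" ;;  # remote.machine:/path form
        *)                  echo "unknown" ;;
    esac
}

classify_repo rsync://remote.machine/path/to/repo.git/
classify_repo http://remote.machine/path/to/repo.git/
classify_repo git://remote.machine/path/to/repo.git/
classify_repo remote.machine:/path/to/repo.git/
classify_repo /path/to/repo.git/
```

The local-directory pattern is tested before the host:path pattern so that an absolute path containing a colon is still treated as local.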
|
||||||
So the simplest form of tag involves nothing more than
|
|
||||||
|
|
||||||
cat .git/HEAD > .git/refs/tags/my-first-tag
|
[ Side Note: currently, HTTP transport is slightly broken in
|
||||||
|
that when the remote repository is "packed" it does not always
|
||||||
|
work. But we have not talked about packing repositories yet, so
|
||||||
|
let's not worry too much about it for now. ]
|
||||||
|
|
||||||
after which point you can use this symbolic name for that particular
|
[ Digression: you could do without using any branches at all, by
|
||||||
state. You can, for example, do
|
keeping as many local repositories as you would like to have
|
||||||
|
branches, and merging between them with "git pull", just like
|
||||||
git diff my-first-tag
|
you merge between branches. The advantage of this approach is
|
||||||
|
that it lets you keep a set of files for each "branch" checked
|
||||||
to diff your current state against that tag (which at this point will
|
out and you may find it easier to switch back and forth if you
|
||||||
obviously be an empty diff, but if you continue to develop and commit
|
juggle multiple lines of development simultaneously. Of
|
||||||
stuff, you can use your tag as a "anchor-point" to see what has changed
|
course, you will pay the price of more disk usage to hold
|
||||||
since you tagged it.
|
multiple working trees, but disk space is cheap these days. ]
|
||||||
|
|
||||||
A "signed tag" is actually a real git object, and contains not only a
|
|
||||||
pointer to the state you want to tag, but also a small tag name and
|
|
||||||
message, along with a PGP signature that says that yes, you really did
|
|
||||||
that tag. You create these signed tags with
|
|
||||||
|
|
||||||
git tag <tagname>
|
|
||||||
|
|
||||||
which will sign the current HEAD (but you can also give it another
|
|
||||||
argument that specifies the thing to tag, ie you could have tagged the
|
|
||||||
current "mybranch" point by using "git tag <tagname> mybranch").
|
|
||||||
|
|
||||||
You normally only do signed tags for major releases or things
|
|
||||||
like that, while the light-weight tags are useful for any marking you
|
|
||||||
want to do - any time you decide that you want to remember a certain
|
|
||||||
point, just create a private tag for it, and you have a nice symbolic
|
|
||||||
name for the state at that point.
|
|
||||||
|
|
||||||
|
|
||||||
Publishing your work
|
Publishing your work
|
||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
We already talked about using somebody else's work from a remote
|
So we can use somebody else's work from a remote repository; but
|
||||||
repository, in the "merging external work" section. It involved
|
how can _you_ prepare a repository to let other people pull from
|
||||||
fetching the work from a remote repository; but how would _you_
|
it?
|
||||||
prepare a repository so that other people can fetch from it?
|
|
||||||
|
|
||||||
Your real work happens in your working directory with your
|
You do your real work in your working directory that has your
|
||||||
primary repository hanging under it as its ".git" subdirectory.
|
primary repository hanging under it as its ".git" subdirectory.
|
||||||
You _could_ make it accessible remotely and ask people to pull
|
You _could_ make that repository accessible remotely and ask
|
||||||
from it, but in practice that is not the way things are usually
|
people to pull from it, but in practice that is not the way
|
||||||
done. A recommended way is to have a public repository, make it
|
things are usually done. A recommended way is to have a public
|
||||||
reachable by other people, and when the changes you made in your
|
repository, make it reachable by other people, and when the
|
||||||
primary working directory are in good shape, update the public
|
changes you made in your primary working directory are in good
|
||||||
repository with it.
|
shape, update the public repository from it. This is often
|
||||||
|
called "pushing".
|
||||||
|
|
||||||
[ Side note: this public repository could further be mirrored,
|
[ Side note: this public repository could further be mirrored,
|
||||||
and that is how kernel.org git repositories are done. ]
|
and that is how kernel.org git repositories are done. ]
|
||||||
|
|
||||||
Publishing the changes from your private repository to your
|
Publishing the changes from your local (private) repository to
|
||||||
public repository requires you to have write privilege on the
|
your remote (public) repository requires a write privilege on
|
||||||
machine that hosts your public repository, and it is internally
|
the remote machine. You need to have an SSH account there to
|
||||||
done via an SSH connection.
|
run a single command, "git-receive-pack".
|
||||||
|
|
||||||
First, you need to create an empty repository to push to on the
|
First, you need to create an empty repository on the remote
|
||||||
machine that houses your public repository. This needs to be
|
machine that will house your public repository. This empty
|
||||||
|
repository will be populated and be kept up-to-date by pushing
|
||||||
|
into it later. Obviously, this repository creation needs to be
|
||||||
done only once.
|
done only once.
|
||||||
|
|
||||||
|
[ Digression: "git push" uses a pair of programs,
|
||||||
|
"git-send-pack" on your local machine, and "git-receive-pack"
|
||||||
|
on the remote machine. The communication between the two over
|
||||||
|
the network internally uses an SSH connection. ]
|
||||||
|
|
||||||
Your private repository's GIT directory is usually .git, but
|
Your private repository's GIT directory is usually .git, but
|
||||||
often your public repository is named "<projectname>.git".
|
your public repository is often named after the project name,
|
||||||
Let's create such a public repository for project "my-git".
|
i.e. "<project>.git". Let's create such a public repository for
|
||||||
After logging into the remote machine, create an empty
|
project "my-git". After logging into the remote machine, create
|
||||||
directory:
|
an empty directory:
|
||||||
|
|
||||||
mkdir my-git.git
|
mkdir my-git.git
|
||||||
|
|
||||||
Then, initialize that directory with git-init-db, but this time,
|
Then, make that directory into a GIT repository by running
|
||||||
since it's name is not usual ".git", we do things a bit
|
git-init-db, but this time, since its name is not the usual
|
||||||
differently:
|
".git", we do things slightly differently:
|
||||||
|
|
||||||
GIT_DIR=my-git.git git-init-db
|
GIT_DIR=my-git.git git-init-db
|
||||||
|
|
||||||
Make sure this directory is available for others you want your
|
Make sure this directory is available for others you want your
|
||||||
changes to be pulled by. Also make sure that you have the
|
changes to be pulled by, via the transport of your choice. Also
|
||||||
'git-receive-pack' program on the $PATH.
|
you need to make sure that you have the "git-receive-pack"
|
||||||
|
program on the $PATH.
|
||||||
|
|
||||||
[ Side note: many installations of sshd does not invoke your
|
[ Side note: many installations of sshd do not invoke your shell
|
||||||
shell as the login shell when you directly run programs; what
|
as the login shell when you directly run programs; what this
|
||||||
this means is that if your login shell is bash, only .bashrc
|
means is that if your login shell is bash, only .bashrc is
|
||||||
is read bypassing .bash_profile. As a workaround, make sure
|
read and not .bash_profile. As a workaround, make sure
|
||||||
.bashrc sets up $PATH so that 'git-receive-pack' program can
|
.bashrc sets up $PATH so that you can run 'git-receive-pack'
|
||||||
be run. ]
|
program. ]
|
||||||
|
|
||||||
Your 'public repository' is ready to accept your changes. Now,
|
Your "public repository" is now ready to accept your changes.
|
||||||
come back to the machine you have your private repository. From
|
Come back to the machine where you have your private repository. From
|
||||||
there, run this command:
|
there, run this command:
|
||||||
|
|
||||||
git push <public-host>:/path/to/my-git.git master
|
git push <public-host>:/path/to/my-git.git master
|
||||||
|
|
||||||
This synchronizes your public repository to match the named
|
This synchronizes your public repository to match the named
|
||||||
branch head (i.e. refs/heads/master in this case) and objects
|
branch head (i.e. "master" in this case) and objects reachable
|
||||||
reachable from them in your current repository.
|
from them in your current repository.
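
The create-then-push sequence above can be tried without a remote machine. The sketch below makes several assumptions that are not in the text: a present-day git, where "git init --bare" plays the role of "mkdir" plus "GIT_DIR=my-git.git git-init-db"; a local directory standing in for <public-host>:/path/to/my-git.git, so no SSH or "git-receive-pack" setup is exercised; and a made-up identity and file name.

```shell
set -e
# Throwaway identity, purely for illustration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
top=$(mktemp -d)

# "Remote" side: an empty public repository named after the project.
# A local directory stands in for <public-host>:/path/to/my-git.git.
git init -q --bare "$top/my-git.git"

# Local side: a private repository with one commit on "master".
git init -q "$top/work"
cd "$top/work"
git symbolic-ref HEAD refs/heads/master
echo "Hello World" > hello
git add hello
git commit -q -m "Initial commit"

# Publish: bring the public repository up to date with local "master".
git push -q "$top/my-git.git" master

# Both repositories should now name the same commit for "master".
git rev-parse refs/heads/master
git --git-dir="$top/my-git.git" rev-parse refs/heads/master
```

Repeating the "git push" after further commits updates the same public repository; the creation step is needed only once.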
|
||||||
|
|
||||||
As a real example, this is how I update my public git
|
As a real example, this is how I update my public git
|
||||||
repository. Kernel.org mirror network takes care of the
|
repository. Kernel.org mirror network takes care of the
|
||||||
propagation to other publically visible machines:
|
propagation to other publicly visible machines:
|
||||||
|
|
||||||
git push master.kernel.org:/pub/scm/git/git.git/
|
git push master.kernel.org:/pub/scm/git/git.git/
|
||||||
|
|
||||||
|
|
||||||
[ to be continued.. cvsimports, pushing and pulling ]
|
[ Digression: your GIT "public" repository people can pull from
|
||||||
|
is different from a public CVS repository that gives read-write
|
||||||
|
access to multiple developers. It is a copy of _your_ primary
|
||||||
|
repository published for others to use, and you should not
|
||||||
|
push into it from more than one repository (this means, not
|
||||||
|
just disallowing other developers to push into it, but also
|
||||||
|
you should push into it from a single repository of yours).
|
||||||
|
Sharing the result of work done by multiple people is always
|
||||||
|
done by pulling (i.e. fetching and merging) from public
|
||||||
|
repositories of those people. Typically this is done by the
|
||||||
|
"project lead" person, and the resulting repository is
|
||||||
|
published as the public repository of the "project lead" for
|
||||||
|
everybody to base further changes on. ]
|
||||||
|
|
||||||
|
|
||||||
|
Packing your repository
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
Earlier, we saw that one file under .git/objects/??/ directory
|
||||||
|
is stored for each git object you create. This representation
|
||||||
|
is convenient and efficient to create atomically and safely, but
|
||||||
|
not so to transport over the network. Since git objects are
|
||||||
|
immutable once they are created, there is a way to optimize the
|
||||||
|
storage by "packing them together". The command
|
||||||
|
|
||||||
|
git repack
|
||||||
|
|
||||||
|
will do it for you. If you followed the tutorial examples, you
|
||||||
|
would have accumulated about 17 objects in .git/objects/??/
|
||||||
|
directories by now. "git repack" tells you how many objects it
|
||||||
|
packed, and stores the packed file in .git/objects/pack
|
||||||
|
directory.
|
||||||
|
|
||||||
|
[ Side Note: you will see two files, pack-*.pack and pack-*.idx,
|
||||||
|
in .git/objects/pack directory. They are closely related to
|
||||||
|
each other, and if you ever copy them by hand to a different
|
||||||
|
repository for whatever reason, you should make sure you copy
|
||||||
|
them together. The former holds all the data from the objects
|
||||||
|
in the pack, and the latter holds the index for random
|
||||||
|
access. ]
|
||||||
|
|
||||||
|
If you are paranoid, running "git-verify-pack" command would
|
||||||
|
detect if you have a corrupt pack, but do not worry too much.
|
||||||
|
Our programs are always perfect ;-).
|
||||||
|
|
||||||
|
Once you have packed objects, you do not need to leave the
|
||||||
|
unpacked objects that are contained in the pack file anymore.
|
||||||
|
|
||||||
|
git prune-packed
|
||||||
|
|
||||||
|
would remove them for you.
|
||||||
|
|
||||||
|
You can try running "find .git/objects -type f" before and after
|
||||||
|
you run "git prune-packed" if you are curious.
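
The repack, verify and prune-packed steps above can be watched in a scratch repository. This is a sketch under assumptions: it uses a present-day git ("git init", "git add", "git commit", "git verify-pack"), a throwaway directory and identity, and a hypothetical helper count_loose that counts files under .git/objects/??/ in the spirit of the "find" command suggested above. The object count will differ from the tutorial's, since the commits here are made up.

```shell
set -e
# Throwaway identity and repository, purely for illustration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
for n in 1 2 3; do
    echo "line $n" >> hello
    git add hello
    git commit -q -m "commit $n"
done

# Hypothetical helper: count loose objects, i.e. files under .git/objects/??/
count_loose() {
    find .git/objects -type f -path '.git/objects/??/*' | wc -l | tr -d ' '
}
echo "loose objects before repack: $(count_loose)"

# Pack the loose objects into .git/objects/pack/.
git repack -q

# Check the resulting pack via its index file; silent on success.
git verify-pack .git/objects/pack/pack-*.idx

# Remove the loose copies of objects that are now in a pack.
git prune-packed -q
echo "loose objects after prune-packed: $(count_loose)"
```

Running "git repack" alone leaves the loose files in place; it is "git prune-packed" that removes the redundant copies.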
|
||||||
|
|
||||||
|
[ Side Note: as we already mentioned, "git pull" is broken for
|
||||||
|
some transports dealing with packed repositories right now, so
|
||||||
|
do not run "git prune-packed" if you plan to give "git pull"
|
||||||
|
access via HTTP transport for now. ]
|
||||||
|
|
||||||
|
If you run "git repack" again at this point, it will say
|
||||||
|
"Nothing to pack". Once you continue your development and
|
||||||
|
accumulate the changes, running "git repack" again will create a
|
||||||
|
new pack, that contains objects created since you packed your
|
||||||
|
archive the last time. We recommend that you pack your project
|
||||||
|
soon after the initial import (unless you are starting your
|
||||||
|
project from scratch), and then run "git repack" every once in a
|
||||||
|
while, depending on how active your project is.
|
||||||
|
|
||||||
|
When a repository is synchronized via "git push" and "git pull",
|
||||||
|
objects packed in the source repository are usually stored
|
||||||
|
unpacked in the destination, unless rsync transport is used.
|
||||||
|
|
||||||
|
|
||||||
|
Working with Others
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
A recommended work cycle for a "project lead" is like this:
|
||||||
|
|
||||||
|
(1) Prepare your primary repository on your local machine. Your
|
||||||
|
work is done there.
|
||||||
|
|
||||||
|
(2) Prepare a public repository accessible to others.
|
||||||
|
|
||||||
|
(3) Push into the public repository from your primary
|
||||||
|
repository.
|
||||||
|
|
||||||
|
(4) "git repack" the public repository. This establishes a big
|
||||||
|
pack that contains the initial set of objects.
|
||||||
|
|
||||||
|
(5) Keep working in your primary repository, and push your
|
||||||
|
changes to the public repository. Your changes include
|
||||||
|
your own, patches you receive via e-mail, and merges resulting
|
||||||
|
from pulling the "public" repositories of your "subsystem
|
||||||
|
maintainers".
|
||||||
|
|
||||||
|
You can repack this private repository whenever you feel
|
||||||
|
like it.
|
||||||
|
|
||||||
|
(6) Every once in a while, "git repack" the public repository.
|
||||||
|
Go back to step (5) and continue working.
|
||||||
|
|
||||||
|
A recommended work cycle for a "subsystem maintainer" that
|
||||||
|
works on that project and has their own "public repository" is like
|
||||||
|
this:
|
||||||
|
|
||||||
|
(1) Prepare your work repository by running "git clone" on the public
|
||||||
|
repository of the "project lead".
|
||||||
|
|
||||||
|
(2) Prepare a public repository accessible to others.
|
||||||
|
|
||||||
|
(3) Copy over the packed files from "project lead" public
|
||||||
|
repository to your public repository by hand; this part is
|
||||||
|
currently not automated.
|
||||||
|
|
||||||
|
(4) Push into the public repository from your primary
|
||||||
|
repository.
|
||||||
|
|
||||||
|
(5) Keep working in your primary repository, and push your
|
||||||
|
changes to your public repository, and ask your "project
|
||||||
|
lead" to pull from it. Your changes include your own,
|
||||||
|
patches you receive via e-mail, and merges resulting from
|
||||||
|
pulling the "public" repositories of your "project lead"
|
||||||
|
and possibly your "sub-subsystem maintainers".
|
||||||
|
|
||||||
|
You can repack this private repository whenever you feel
|
||||||
|
like it.
|
||||||
|
|
||||||
|
(6) Every once in a while, "git repack" the public repository.
|
||||||
|
Go back to step (5) and continue working.
|
||||||
|
|
||||||
|
A recommended work cycle for an "individual developer" who does
|
||||||
|
not have a "public" repository is somewhat different. It goes
|
||||||
|
like this:
|
||||||
|
|
||||||
|
(1) Prepare your work repositories by running "git clone" on the public
|
||||||
|
repository of the "project lead" (or "subsystem
|
||||||
|
maintainer", if you work on a subsystem).
|
||||||
|
|
||||||
|
(2) Copy .git/refs/heads/master to .git/refs/heads/upstream.
|
||||||
|
|
||||||
|
(3) Do your work there. Make commits.
|
||||||
|
|
||||||
|
(4) Run "git fetch" from the public repository of your upstream
|
||||||
|
every once in a while. This does only the first half of
|
||||||
|
"git pull" but does not merge. The head of the public
|
||||||
|
repository is stored in .git/FETCH_HEAD. Copy it to
|
||||||
|
.git/refs/heads/upstream.
|
||||||
|
|
||||||
|
(5) Use "git cherry" to see which ones of your patches were
|
||||||
|
accepted, and/or use "git rebase" to port your unmerged
|
||||||
|
changes forward to the updated upstream.
|
||||||
|
|
||||||
|
(6) Use "git format-patch upstream" to prepare patches for
|
||||||
|
e-mail submission to your upstream and send them out.
|
||||||
|
Go back to step (3) and continue.
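
The individual developer cycle above can be rehearsed locally. The following is a sketch under assumptions, not the project's prescribed procedure: it uses a present-day git; a local directory "lead" stands in for the public repository of the "project lead"; "git branch" and "git update-ref" replace copying ref files by hand; and the directory names, file names and identity are invented for the example.

```shell
set -e
# Throwaway identity, purely for illustration.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com
top=$(mktemp -d)

# Stand-in for the project lead's public repository.
git init -q "$top/lead"
git -C "$top/lead" symbolic-ref HEAD refs/heads/master
echo one > "$top/lead/file"
git -C "$top/lead" add file
git -C "$top/lead" commit -q -m "lead: first"

# (1) Clone it, and (2) remember where upstream was.
git clone -q "$top/lead" "$top/dev"
cd "$top/dev"
git branch upstream master

# (3) Do some work and commit it.
echo mine > mine.txt
git add mine.txt
git commit -q -m "dev: add mine.txt"

# Meanwhile, upstream moves on.
echo two >> "$top/lead/file"
git -C "$top/lead" commit -q -a -m "lead: second"

# (4) Fetch without merging, then record the fetched head as "upstream".
git fetch -q "$top/lead" master
git update-ref refs/heads/upstream FETCH_HEAD

# (5) List commits not yet upstream, then port them forward.
git cherry upstream
git rebase -q upstream

# (6) Prepare patches for e-mail submission.
git format-patch -o "$top/outgoing" upstream
```

After the rebase, "upstream" is an ancestor of "master", so "git format-patch upstream" emits only the developer's own commits.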
|
||||||
|
|
||||||
|
[Side Note: I think Cogito calls this upstream "origin".
|
||||||
|
Somebody care to confirm or deny? ]
|
||||||
|
|
||||||
|
|
||||||
|
[ to be continued.. cvsimports ]
|
||||||
|