user-manual: git-fsck, dangling objects

Initial import of fsck and dangling objects discussion, mostly lifted from
an email from Linus.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
J. Bruce Fields 2007-01-28 23:29:19 -05:00
parent b181d57ff4
commit 21dcb3b7ab


@@ -1373,12 +1373,37 @@ Ensuring reliability
Checking the repository for corruption
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TODO:
git-fsck
"dangling objects" explanation
Brief explanation here,
include forward reference to longer explanation from
Linus, to be added to later chapter
The gitlink:git-fsck-objects[1] command runs a number of self-consistency
checks on the repository, and reports on any problems. This may take some
time. The most common warning by far is about "dangling" objects:
-------------------------------------------------
$ git fsck-objects
dangling commit 7281251ddd2a61e38657c827739c57015671a6b3
dangling commit 2706a059f258c6b245f298dc4ff2ccd30ec21a63
dangling commit 13472b7c4b80851a1bc551779171dcb03655e9b5
dangling blob 218761f9d90712d37a9c5e36f406f92202db07eb
dangling commit bf093535a34a4d35731aa2bd90fe6b176302f14f
dangling commit 8e4bec7f2ddaa268bef999853c25755452100f8e
dangling tree d50bb86186bf27b681d25af89d3b5b68382e4085
dangling tree b24c2473f1fd3d91352a624795be026d64c8841f
...
-------------------------------------------------
Dangling objects are harmless, but also unnecessary; you can remove them at
any time with gitlink:git-prune[1] or the --prune option to
gitlink:git-gc[1]:
-------------------------------------------------
$ git gc --prune
-------------------------------------------------
Pruning may be time-consuming. Unlike most other git operations (including
git-gc when run without any options), it is not safe to prune while other git
operations are in progress in the same repository.
For more about dangling objects, see <<dangling-objects>>.
Recovering lost changes
~~~~~~~~~~~~~~~~~~~~~~~
@@ -2693,6 +2718,93 @@ objects will work exactly as they did before.
The gitlink:git-gc[1] command performs packing, pruning, and more for
you, so is normally the only high-level command you need.
[[dangling-objects]]
Dangling objects
^^^^^^^^^^^^^^^^
The gitlink:git-fsck-objects[1] command will sometimes complain about dangling
objects. They are not a problem.
The most common cause of dangling objects is that you've rebased a branch, or
you have pulled from somebody else who rebased a branch--see
<<cleaning-up-history>>. In that case, the old head of the original branch
still exists, as does everything it pointed to. Only the branch pointer
itself is gone, since you replaced it with another one.
Other situations cause dangling objects too. For example, a "dangling blob"
may arise because you did a "git add" of a file, but then, before you
actually committed it and made it part of the bigger picture, you changed
something else in that file and committed that *updated* thing. The old
state that you added originally ends up not being pointed to by any commit
or tree, so it's now a dangling blob object.
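The sequence described above can be reproduced in a scratch repository. This
is only a sketch: the file name, the `Example` committer identity, and the use
of `git fsck` in place of `git fsck-objects` are assumptions made for
illustration.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Stage a first version of the file; its blob enters the object database.
echo "first version" > file.txt
git add file.txt
old_blob=$(git hash-object file.txt)   # name of the blob just staged

# Change the file, stage it again, and commit only the updated state.
echo "second version" > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "commit the updated file"

# The first blob is no longer referenced by the index or by any commit.
dangling=$(git fsck 2>/dev/null | grep '^dangling blob')
echo "$dangling"
```

The object named in the report should be the one recorded in `old_blob`, that
is, the state that was added but never committed.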
Similarly, when the "recursive" merge strategy runs and finds criss-cross
merges, and thus more than one merge base (which is fairly unusual, but it
does happen), it generates a temporary midway tree to use as an internal
merge base (or possibly several, if you had lots of criss-crossing merges
and more than two merge bases). Again, those are real objects, but the end
result does not point to them, so they end up "dangling" in your repository.
Generally, dangling objects aren't anything to worry about. They can even
be very useful: if you screw something up, the dangling objects can be how
you recover your old tree (say, you did a rebase, and realized that you
really didn't want to - you can look at what dangling objects you have,
and decide to reset your head to some old dangling state).
For commits, the most useful thing to do with dangling objects tends to be
to do a simple
------------------------------------------------
$ gitk <dangling-commit-sha-goes-here> --not --all
------------------------------------------------
which means exactly what it sounds like: it says that you want to see the
commit history that is described by the dangling commit(s), but you do NOT
want to see the history that is described by all your branches and tags
(which are the things you normally reach). That basically shows you in a
nice way what the dangling commit was (and notice that it might not be
just one commit: we only report the "tip of the line" as being dangling,
but there might be a whole deep and complex commit history that has gotten
dropped - rebasing will do that).
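Where gitk is not at hand, the same revision arguments can be passed to
git-log. The sketch below assumes a scratch repository, a hypothetical `topic`
branch, and empty commits made under an `Example` identity. It also passes
`--no-reflogs` to fsck, on the assumption that reflogs are enabled and would
otherwise keep the dropped commits reachable.

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Helper (hypothetical) that records an empty commit with a given message.
commit() {
    git -c user.name=Example -c user.email=example@example.com \
        commit -q --allow-empty -m "$1"
}

commit "base"
git checkout -q -b topic
commit "work 1"
commit "work 2"
lost=$(git rev-parse HEAD)

# Drop the branch pointer; the two "work" commits stay in the object database.
git checkout -q -
git branch -q -D topic

# Only the tip of the dropped line is reported as dangling.
tip=$(git fsck --no-reflogs 2>/dev/null | awk '/^dangling commit/ {print $3}')

# History reachable from the dangling tip but not from any branch or tag.
history=$(git log --format=%s "$tip" --not --all)
echo "$history"
```

Both dropped commits are listed even though fsck named only one of them,
which is the "tip of the line" behaviour described above.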
For blobs and trees, you can't do the same, but you can examine them. You
can just do
------------------------------------------------
$ git show <dangling-blob/tree-sha-goes-here>
------------------------------------------------
to show what the contents of the blob were (or, for a tree, basically what
the "ls" for that directory was), and that may give you some idea of what
the operation was that left that dangling object.
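If it is not clear whether a reported name is a blob or a tree, `git cat-file
-t` prints the object type before you look at the contents. A minimal sketch,
assuming a scratch repository and an unreferenced blob created directly with
`git hash-object -w` to stand in for a real dangling object:

```shell
# Work in a throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Write a blob that no tree or commit refers to (stand-in for a dangling blob).
blob=$(echo "scratch contents" | git hash-object -w --stdin)

kind=$(git cat-file -t "$blob")     # object type: blob, tree, commit or tag
contents=$(git show "$blob")        # for a blob, the file contents

echo "$kind: $contents"
```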
Usually, dangling blobs and trees aren't very interesting. They're almost
always either half-way merge bases (the blob will often even have the
conflict markers from a merge in it, if you have had conflicting merges
that you fixed up by hand), or leftovers from a "git fetch" that you
interrupted with ^C or something like that, which left _some_ of the new
objects in the object database, dangling and useless.
Anyway, once you are sure that you're not interested in any dangling
state, you can just prune all unreachable objects:
------------------------------------------------
$ git prune
------------------------------------------------
and they'll be gone. But you should only run "git prune" on a quiescent
repository - it's kind of like doing a filesystem fsck recovery: you don't
want to do that while the filesystem is mounted.
(The same is true of "git-fsck-objects" itself, btw. But since
git-fsck-objects never actually *changes* the repository and only reports
on what it found, it is never "dangerous" to run. Running it while somebody
is actually changing the repository can cause confusing and scary messages,
but it won't actually do anything bad. In contrast, running "git prune"
while somebody is actively changing the repository is a *BAD* idea.)
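Before pruning for real, `git prune -n` lists what would be removed without
deleting anything. The following is a sketch only, run in a scratch repository
where an unreferenced blob (created here with `git hash-object -w` as a
stand-in) is the sole candidate:

```shell
# Work in a throwaway, quiescent repository.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Write a blob that nothing refers to.
blob=$(echo "unwanted contents" | git hash-object -w --stdin)

# Dry run: report the objects that would be pruned, one per line.
preview=$(git prune -n)
echo "$preview"

# Actually remove the unreachable objects.
git prune

# cat-file -e exits non-zero once the object no longer exists.
if git cat-file -e "$blob" 2>/dev/null; then gone=no; else gone=yes; fi
echo "pruned: $gone"
```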
Glossary of git terms
=====================