user-manual: recovering from corruption

Some instructions on dealing with corruption of the object database.

Most of this text is from an example by Linus, identified by Nicolas Pitre <nico@cam.org> with a little further editing by me.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>

parent 7cb192eab0
commit 1cdade2c4c

@@ -1560,6 +1560,11 @@ This may be time-consuming. Unlike most other git operations (including
git-gc when run without any options), it is not safe to prune while
other git operations are in progress in the same repository.

If gitlink:git-fsck[1] complains about sha1 mismatches or missing
objects, you may have a much more serious problem; your best option is
probably restoring from backups. See
<<recovering-from-repository-corruption>> for a detailed discussion.

[[recovering-lost-changes]]
Recovering lost changes
~~~~~~~~~~~~~~~~~~~~~~~

@@ -3220,6 +3225,127 @@ confusing and scary messages, but it won't actually do anything bad. In
contrast, running "git prune" while somebody is actively changing the
repository is a *BAD* idea).

[[recovering-from-repository-corruption]]
Recovering from repository corruption
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By design, git treats data entrusted to it with caution. However, even in
the absence of bugs in git itself, it is still possible that hardware or
operating system errors could corrupt data.

The first defense against such problems is backups. You can back up a
git directory using clone, or just using cp, tar, or any other backup
mechanism.

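As a rough sketch of the two approaches just mentioned, the following defines
one helper per method. The function names and all paths are hypothetical;
`git clone --mirror` assumes a reasonably recent git, and the tar variant
assumes nothing is modifying the repository while the archive is written.

```shell
#!/bin/sh
# Hypothetical helpers; the names and arguments are illustrative only.

# Back up with git itself: a mirror clone copies all refs, and the
# objects they need, into a bare repository at "$2".
backup_with_clone() {
    git clone --mirror "$1" "$2"
}

# Back up with tar: archive the whole repository directory "$1" into the
# compressed archive "$2". Unlike a clone, this also keeps hooks and config.
backup_with_tar() {
    repo_parent=$(dirname "$1")
    repo_name=$(basename "$1")
    tar -czf "$2" -C "$repo_parent" "$repo_name"
}
```

For example, `backup_with_tar /path/to/project /backups/project.tar.gz` would
archive the working tree together with its `.git` directory. Whichever method
you use, it is worth running git-fsck on the copy now and then, since a backup
of an already corrupted repository is of limited use.
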
As a last resort, you can search for the corrupted objects and attempt
to replace them by hand. Back up your repository before attempting this
in case you corrupt things even more in the process.

We'll assume that the problem is a single missing or corrupted blob,
which is sometimes a solvable problem. (Recovering missing trees and
especially commits is *much* harder.)

Before starting, verify that there is corruption, and figure out where
it is with gitlink:git-fsck[1]; this may be time-consuming.

Assume the output looks like this:

------------------------------------------------
$ git-fsck --full
broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
to blob 4b9458b3786228369c63936db65827de3cc06200
missing blob 4b9458b3786228369c63936db65827de3cc06200
------------------------------------------------

(Typically there will be some "dangling object" messages too, but they
aren't interesting.)

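On a large repository the interesting lines can get buried among the dangling
ones. A minimal sketch for picking them out, assuming the "missing <type> <id>"
line format shown above (the helper name `missing_objects` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: read "git fsck --full" output on stdin and print
# the type and id of every object reported as missing, one per line.
missing_objects() {
    awk '$1 == "missing" { print $2, $3 }'
}
```

You would use it as `git fsck --full | missing_objects`. The "broken link"
lines are deliberately skipped here; they tell you which object refers to the
missing one, which matters in the next step.
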
Now you know that blob 4b9458b3 is missing, and that the tree 2d9263c6
points to it. If you could find just one copy of that missing blob
object, possibly in some other repository, you could move it into
.git/objects/4b/9458b3... and be done. Suppose you can't. You can
still examine the tree that pointed to it with gitlink:git-ls-tree[1],
which might output something like:

------------------------------------------------
$ git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
...
100644 blob 4b9458b3786228369c63936db65827de3cc06200 myfile
...
------------------------------------------------

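If the tree is large, you may prefer to let a small filter find the entry
rather than reading the listing by eye. This is only a sketch: the helper name
`path_of_object` is hypothetical, and it assumes the usual ls-tree layout of
"<mode> <type> <id>", a tab, and then the path.

```shell
#!/bin/sh
# Hypothetical helper: read "git ls-tree <tree>" output on stdin and
# print the path of each entry whose object id equals "$1".
# Splitting on the tab first keeps paths that contain spaces intact.
path_of_object() {
    awk -F '\t' -v id="$1" '{
        split($1, meta, " ")          # meta[1]=mode, meta[2]=type, meta[3]=id
        if (meta[3] == id) print $2   # $2 is everything after the tab
    }'
}
```

For the example above you would run
`git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8 | path_of_object 4b9458b3786228369c63936db65827de3cc06200`.
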
So now you know that the missing blob was the data for a file named
"myfile". And chances are you can also identify the directory--let's
say it's in "somedirectory". If you're lucky the missing copy might be
the same as the copy you have checked out in your working tree at
"somedirectory/myfile"; you can test whether that's right with
gitlink:git-hash-object[1]:

------------------------------------------------
$ git hash-object -w somedirectory/myfile
------------------------------------------------

which will create and store a blob object with the contents of
somedirectory/myfile, and output the sha1 of that object. If you're
extremely lucky it might be 4b9458b3786228369c63936db65827de3cc06200, in
which case you've guessed right, and the corruption is fixed!

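If you would rather test a candidate file without writing anything into the
damaged repository, you can compute the id yourself. The sketch below relies on
the fact that a blob's id is the SHA-1 of the header "blob <size>", a NUL byte,
and then the file contents. The helper name `blob_id` is hypothetical, `sha1sum`
(GNU coreutils) is assumed to be installed, and the result only matches git's
when no end-of-line conversion or other content filter applies to the file.

```shell
#!/bin/sh
# Hypothetical helper: print the blob id git would assign to file "$1",
# without touching any repository.
blob_id() {
    # Size in bytes; tr strips the padding some wc implementations add.
    size=$(wc -c < "$1" | tr -d ' ')
    # Hash the object header followed by the raw file contents.
    { printf 'blob %s\000' "$size"; cat "$1"; } | sha1sum | cut -d ' ' -f 1
}
```

Running `git hash-object somedirectory/myfile` without `-w` is the more direct
way to get the same answer; the helper is mainly useful for seeing what is
being hashed, or for checking candidate files on a machine without git.
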
Otherwise, you need more information. How do you tell which version of
the file has been lost?

The easiest way to do this is with:

------------------------------------------------
$ git log --raw --all --full-history -- somedirectory/myfile
------------------------------------------------

Because you're asking for raw output, you'll now get something like:

------------------------------------------------
commit abc
Author:
Date:
...
:100644 100644 4b9458b... newsha... M somedirectory/myfile


commit xyz
Author:
Date:

...
:100644 100644 oldsha... 4b9458b... M somedirectory/myfile
------------------------------------------------

This tells you that the immediately following version of the file was
"newsha", and that the immediately preceding version was "oldsha".
You also know the commit messages that went with the change from oldsha
to 4b9458b and with the change from 4b9458b to newsha.

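For a file with a long history, a small filter can pull those two neighbours
out of the raw log. This is a sketch under stated assumptions: the helper name
`neighbours_of` is hypothetical, and it expects raw lines of the form
":<oldmode> <newmode> <oldid> <newid> <status>" followed by the path.

```shell
#!/bin/sh
# Hypothetical helper: read "git log --raw" output on stdin and report the
# versions on either side of the blob whose id starts with "$1".
neighbours_of() {
    awk -v id="$1" '
        # Missing blob is the old side, so the new side came after it.
        /^:/ && index($3, id) == 1 { print "following version:", $4 }
        # Missing blob is the new side, so the old side came before it.
        /^:/ && index($4, id) == 1 { print "preceding version:", $3 }
    '
}
```

You might run it as
`git log --raw --all --full-history -- somedirectory/myfile | neighbours_of 4b9458b`.
Because the match is a prefix match against whatever the log prints, pass an
abbreviated id, or add `--no-abbrev` to the log command and pass the full one.
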
If you've been committing small enough changes, you may now have a good
shot at reconstructing the contents of the in-between state 4b9458b.

If you can do that, you can now recreate the missing object with

------------------------------------------------
$ git hash-object -w <recreated-file>
------------------------------------------------

and your repository is good again!

(Btw, you could have ignored the fsck, and started with doing a

------------------------------------------------
$ git log --raw --all
------------------------------------------------

and just looked for the sha of the missing object (4b9458b..) in that
whole thing. It's up to you - git does *have* a lot of information, it is
just missing one particular blob version.)

[[the-index]]
The index
-----------

@@ -4429,4 +4555,7 @@ Write a chapter on using plumbing and writing scripts.

Alternates, clone -reference, etc.

git unpack-objects -r for recovery
More on recovery from repository corruption. See:
http://marc.theaimsgroup.com/?l=git&m=117263864820799&w=2
http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2