user-manual: recovering from corruption
Some instructions on dealing with corruption of the object database.

Most of this text is from an example by Linus, identified by Nicolas Pitre <nico@cam.org> with a little further editing by me.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
parent 7cb192eab0
commit 1cdade2c4c
@@ -1560,6 +1560,11 @@ This may be time-consuming. Unlike most other git operations (including
git-gc when run without any options), it is not safe to prune while
other git operations are in progress in the same repository.

If gitlink:git-fsck[1] complains about sha1 mismatches or missing
objects, you may have a much more serious problem; your best option is
probably restoring from backups. See
<<recovering-from-repository-corruption>> for a detailed discussion.

[[recovering-lost-changes]]
Recovering lost changes
~~~~~~~~~~~~~~~~~~~~~~~
@@ -3220,6 +3225,127 @@ confusing and scary messages, but it won't actually do anything bad. In
contrast, running "git prune" while somebody is actively changing the
repository is a *BAD* idea).

[[recovering-from-repository-corruption]]
Recovering from repository corruption
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By design, git treats data trusted to it with caution. However, even in
the absence of bugs in git itself, it is still possible that hardware or
operating system errors could corrupt data.

The first defense against such problems is backups. You can back up a
git directory using clone, or just using cp, tar, or any other backup
mechanism.

As a last resort, you can search for the corrupted objects and attempt
to replace them by hand. Back up your repository before attempting this
in case you corrupt things even more in the process.

We'll assume that the problem is a single missing or corrupted blob,
which is sometimes a solvable problem. (Recovering missing trees and
especially commits is *much* harder.)

Before starting, verify that there is corruption, and figure out where
it is with gitlink:git-fsck[1]; this may be time-consuming.

Assume the output looks like this:

------------------------------------------------
$ git fsck --full
broken link from tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
to blob 4b9458b3786228369c63936db65827de3cc06200
missing blob 4b9458b3786228369c63936db65827de3cc06200
------------------------------------------------

(Typically there will be some "dangling object" messages too, but they
aren't interesting.)

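If fsck prints a lot of output, it can help to pull out just the ids of the missing blobs. The following is a minimal sketch and not part of git: `missing_blobs` is a hypothetical helper name, and it assumes the "missing blob <id>" line format shown in the example output above.

```shell
# Print the ids of blobs that fsck reports as missing, one per line.
# Reads fsck output on standard input; all other messages
# (broken links, dangling objects) are ignored.
missing_blobs() {
    awk '$1 == "missing" && $2 == "blob" { print $3 }'
}

# Typical use inside a repository (not run here):
#   git fsck --full 2>&1 | missing_blobs
```

Filtering on the first two words keeps the "broken link" and "dangling" lines out of the result, so each id printed is one you would need to recover.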
Now you know that blob 4b9458b3 is missing, and that the tree 2d9263c6
points to it. If you could find just one copy of that missing blob
object, possibly in some other repository, you could move it into
.git/objects/4b/9458b3... and be done. Suppose you can't. You can
still examine the tree that pointed to it with gitlink:git-ls-tree[1],
which might output something like:

------------------------------------------------
$ git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
100644 blob 8d14531846b95bfa3564b58ccfb7913a034323b8 .gitignore
100644 blob ebf9bf84da0aab5ed944264a5db2a65fe3a3e883 .mailmap
100644 blob ca442d313d86dc67e0a2e5d584b465bd382cbf5c COPYING
...
100644 blob 4b9458b3786228369c63936db65827de3cc06200 myfile
...
------------------------------------------------

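The two lookups described above can be sketched as small shell helpers. Both names (`loose_object_path` and `entry_for_object`) are hypothetical. The sketch assumes a non-bare repository with the default .git layout, a full 40-character object id, and file names that ls-tree prints unquoted.

```shell
# Path of the loose object file for a full object id, relative to the
# top of the working tree: the first two hex digits name the directory,
# the remaining digits name the file.
loose_object_path() {
    dir=$(printf '%s' "$1" | cut -c1-2)
    rest=$(printf '%s' "$1" | cut -c3-)
    printf '.git/objects/%s/%s\n' "$dir" "$rest"
}

# Name of the tree entry that refers to a given object id.
# Reads "git ls-tree" output on standard input, where each line is
# "<mode> <type> <id><TAB><name>".
entry_for_object() {
    awk -F '\t' -v id="$1" '{
        split($1, meta, " ")
        if (meta[3] == id) print $2
    }'
}

# Typical use inside a repository (not run here):
#   git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8 |
#       entry_for_object 4b9458b3786228369c63936db65827de3cc06200
```

Note that a copy in another repository can only be moved over this way if it exists there as a loose object; an object stored inside a pack has no file of its own under .git/objects.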
So now you know that the missing blob was the data for a file named
"myfile". And chances are you can also identify the directory--let's
say it's in "somedirectory". If you're lucky the missing copy might be
the same as the copy you have checked out in your working tree at
"somedirectory/myfile"; you can test whether that's right with
gitlink:git-hash-object[1]:

------------------------------------------------
$ git hash-object -w somedirectory/myfile
------------------------------------------------

which will create and store a blob object with the contents of
somedirectory/myfile, and output the sha1 of that object. If you're
extremely lucky it might be 4b9458b3786228369c63936db65827de3cc06200, in
which case you've guessed right, and the corruption is fixed!

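To check a candidate file without writing anything to the object database, you can compute the id yourself. In a repository that uses SHA-1, a blob's id is the SHA-1 of the header "blob <size>" plus a NUL byte, followed by the file contents. The sketch below assumes the `sha1sum` utility is available and that no content filters or line-ending conversions apply to the file; `blob_id` is a hypothetical name, not a git command.

```shell
# Print the id git would assign to the raw bytes of a file:
# SHA-1 over "blob <size>\0" followed by the content.
blob_id() {
    # Arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$1") ))
    { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d ' ' -f 1
}

# Typical use (not run here):
#   [ "$(blob_id somedirectory/myfile)" = 4b9458b3786228369c63936db65827de3cc06200 ] &&
#       git hash-object -w somedirectory/myfile
```

For such files, running `git hash-object` without `-w` should print the same id; the sketch only makes visible how that id is derived.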
Otherwise, you need more information. How do you tell which version of
the file has been lost?

The easiest way to do this is with:

------------------------------------------------
$ git log --raw --all --full-history -- somedirectory/myfile
------------------------------------------------

Because you're asking for raw output, you'll now get something like

------------------------------------------------
commit abc
Author:
Date:
...
:100644 100644 4b9458b... newsha... M somedirectory/myfile


commit xyz
Author:
Date:

...
:100644 100644 oldsha... 4b9458b... M somedirectory/myfile
------------------------------------------------

This tells you that the immediately following version of the file was
"newsha", and that the immediately preceding version was "oldsha".
You also know the commit messages that went with the change from oldsha
to 4b9458b and with the change from 4b9458b to newsha.

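Reading the raw lines by eye works for a short history; for a longer one, the same reasoning can be sketched as a filter. In each raw line the third field is the blob id before the commit and the fourth is the id after it, so the missing id appearing in the fourth field identifies the preceding version, and in the third field the following one. `neighbours_of` is a hypothetical helper name, and the sketch assumes the raw line layout shown above.

```shell
# For a (possibly abbreviated) blob id, list the versions that the raw
# log shows immediately before and after it.
# Reads "git log --raw" output on standard input.
neighbours_of() {
    awk -v id="$1" '/^:/ {
        # $3 is the id before the commit, $4 the id after it.
        if (index($4, id) == 1) print "preceding", $3
        if (index($3, id) == 1) print "following", $4
    }'
}

# Typical use inside a repository (not run here):
#   git log --raw --all --full-history -- somedirectory/myfile |
#       neighbours_of 4b9458b
```

Matching on a prefix rather than the whole field lets the helper accept the abbreviated ids that the raw output prints.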
If you've been committing small enough changes, you may now have a good
shot at reconstructing the contents of the in-between state 4b9458b.

If you can do that, you can now recreate the missing object with

------------------------------------------------
$ git hash-object -w <recreated-file>
------------------------------------------------

and your repository is good again!

(Btw, you could have ignored the fsck, and started with doing a

------------------------------------------------
$ git log --raw --all
------------------------------------------------

and just looked for the sha of the missing object (4b9458b..) in that
whole thing. It's up to you - git does *have* a lot of information, it is
just missing one particular blob version.)

[[the-index]]
The index
-----------
@@ -4429,4 +4555,7 @@ Write a chapter on using plumbing and writing scripts.

Alternates, clone --reference, etc.

git unpack-objects -r for recovery
More on recovery from repository corruption. See:
http://marc.theaimsgroup.com/?l=git&m=117263864820799&w=2
http://marc.theaimsgroup.com/?l=git&m=117147855503798&w=2