Documentation: describe pack idx v2
Lifted from the log message of c553ca25bd
(pack-objects: learn about pack index version 2).
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
parent 29ab27f4b5
commit 71362bd552
@ -1,9 +1,9 @@
|
|||||||
GIT pack format
|
GIT pack format
|
||||||
===============
|
===============
|
||||||
|
|
||||||
= pack-*.pack file has the following format:
|
= pack-*.pack files have the following format:
|
||||||
|
|
||||||
- The header appears at the beginning and consists of the following:
|
- A header appears at the beginning and consists of the following:
|
||||||
|
|
||||||
4-byte signature:
|
4-byte signature:
|
||||||
The signature is: {'P', 'A', 'C', 'K'}
|
The signature is: {'P', 'A', 'C', 'K'}
|
||||||
@ -34,18 +34,14 @@ GIT pack format
|
|||||||
|
|
||||||
- The trailer records 20-byte SHA1 checksum of all of the above.
|
- The trailer records 20-byte SHA1 checksum of all of the above.
|
||||||
|
|
||||||
= pack-*.idx file has the following format:
|
= Original (version 1) pack-*.idx files have the following format:
|
||||||
|
|
||||||
- The header consists of 256 4-byte network byte order
|
- The header consists of 256 4-byte network byte order
|
||||||
integers. N-th entry of this table records the number of
|
integers. N-th entry of this table records the number of
|
||||||
objects in the corresponding pack, the first byte of whose
|
objects in the corresponding pack, the first byte of whose
|
||||||
object name are smaller than N. This is called the
|
object name is less than or equal to N. This is called the
|
||||||
'first-level fan-out' table.
|
'first-level fan-out' table.
|
||||||
|
|
||||||
Observation: we would need to extend this to an array of
|
|
||||||
8-byte integers to go beyond 4G objects per pack, but it is
|
|
||||||
not strictly necessary.
|
|
||||||
|
|
||||||
- The header is followed by sorted 24-byte entries, one entry
|
- The header is followed by sorted 24-byte entries, one entry
|
||||||
per object in the pack. Each entry is:
|
per object in the pack. Each entry is:
|
||||||
|
|
||||||
@ -55,10 +51,6 @@ GIT pack format
|
|||||||
|
|
||||||
20-byte object name.
|
20-byte object name.
|
||||||
|
|
||||||
Observation: we would definitely need to extend this to
|
|
||||||
8-byte integer plus 20-byte object name to handle a packfile
|
|
||||||
that is larger than 4GB.
|
|
||||||
|
|
||||||
- The file is concluded with a trailer:
|
- The file is concluded with a trailer:
|
||||||
|
|
||||||
A copy of the 20-byte SHA1 checksum at the end of
|
A copy of the 20-byte SHA1 checksum at the end of
|
||||||
@ -68,31 +60,30 @@ GIT pack format
|
|||||||
|
|
||||||
Pack Idx file:
|
Pack Idx file:
|
||||||
|
|
||||||
idx
|
-- +--------------------------------+
|
||||||
+--------------------------------+
|
fanout | fanout[0] = 2 (for example) |-.
|
||||||
| fanout[0] = 2 |-.
|
table +--------------------------------+ |
|
||||||
+--------------------------------+ |
|
|
||||||
| fanout[1] | |
|
| fanout[1] | |
|
||||||
+--------------------------------+ |
|
+--------------------------------+ |
|
||||||
| fanout[2] | |
|
| fanout[2] | |
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|
||||||
| fanout[255] | |
|
| fanout[255] = total objects |---.
|
||||||
+--------------------------------+ |
|
-- +--------------------------------+ | |
|
||||||
main | offset | |
|
main | offset | | |
|
||||||
index | object name 00XXXXXXXXXXXXXXXX | |
|
index | object name 00XXXXXXXXXXXXXXXX | | |
|
||||||
table +--------------------------------+ |
|
table +--------------------------------+ | |
|
||||||
| offset | |
|
| offset | | |
|
||||||
| object name 00XXXXXXXXXXXXXXXX | |
|
| object name 00XXXXXXXXXXXXXXXX | | |
|
||||||
+--------------------------------+ |
|
+--------------------------------+<+ |
|
||||||
.-| offset |<+
|
.-| offset | |
|
||||||
| | object name 01XXXXXXXXXXXXXXXX |
|
| | object name 01XXXXXXXXXXXXXXXX | |
|
||||||
| +--------------------------------+
|
| +--------------------------------+ |
|
||||||
| | offset |
|
| | offset | |
|
||||||
| | object name 01XXXXXXXXXXXXXXXX |
|
| | object name 01XXXXXXXXXXXXXXXX | |
|
||||||
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|
||||||
| | offset |
|
| | offset | |
|
||||||
| | object name FFXXXXXXXXXXXXXXXX |
|
| | object name FFXXXXXXXXXXXXXXXX | |
|
||||||
| +--------------------------------+
|
--| +--------------------------------+<--+
|
||||||
trailer | | packfile checksum |
|
trailer | | packfile checksum |
|
||||||
| +--------------------------------+
|
| +--------------------------------+
|
||||||
| | idxfile checksum |
|
| | idxfile checksum |
|
||||||
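The fan-out diagram above can be read as a lookup recipe: the first byte of
an object name selects a slice of the main index table, and only that slice
needs a binary search. The following Python is a minimal sketch of that idea
for a version 1 index, not git's own implementation. The function name
`v1_lookup` is hypothetical, and the sketch assumes the whole .idx file is
already in memory and that each 24-byte entry is a 4-byte network byte order
offset followed by the 20-byte object name, as drawn in the diagram.

```python
import struct

FANOUT_BYTES = 256 * 4   # 256 4-byte network byte order integers
ENTRY_BYTES = 24         # 4-byte offset + 20-byte object name


def v1_lookup(idx: bytes, name: bytes):
    """Return the pack offset recorded for a 20-byte object name, or None.

    Hypothetical helper: `idx` is the raw content of a version 1
    pack-*.idx file.
    """
    fanout = struct.unpack(">256I", idx[:FANOUT_BYTES])
    first = name[0]
    # fanout[N] counts objects whose first byte is <= N, so the entries
    # starting with `first` occupy positions [fanout[first - 1], fanout[first]).
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    while lo < hi:
        mid = (lo + hi) // 2
        pos = FANOUT_BYTES + ENTRY_BYTES * mid
        entry_name = idx[pos + 4:pos + ENTRY_BYTES]
        if entry_name < name:
            lo = mid + 1
        elif entry_name > name:
            hi = mid
        else:
            return struct.unpack(">I", idx[pos:pos + 4])[0]
    return None
```

Because the entries are sorted by object name, plain byte-string comparison
is enough to drive the search; the trailer is never consulted.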
@ -116,3 +107,40 @@ Pack file entry: <+
|
|||||||
20-byte base object name SHA1 (the size above is the
|
20-byte base object name SHA1 (the size above is the
|
||||||
size of the delta data that follows).
|
size of the delta data that follows).
|
||||||
delta data, deflated.
|
delta data, deflated.
|
||||||
|
|
||||||
|
|
||||||
|
= Version 2 pack-*.idx files support packs larger than 4 GiB, and
|
||||||
|
have some other reorganizations. They have the format:
|
||||||
|
|
||||||
|
- A 4-byte magic number '\377tOc' which is an unreasonable
|
||||||
|
fanout[0] value.
|
||||||
|
|
||||||
|
- A 4-byte version number (= 2)
|
||||||
|
|
||||||
|
- A 256-entry fan-out table just like v1.
|
||||||
|
|
||||||
|
- A table of sorted 20-byte SHA1 object names. These are
|
||||||
|
packed together without offset values to reduce the cache
|
||||||
|
footprint of the binary search for a specific object name.
|
||||||
|
|
||||||
|
- A table of 4-byte CRC32 values of the packed object data.
|
||||||
|
This is new in v2 so compressed data can be copied directly
|
||||||
|
from pack to pack during repacking without undetected
|
||||||
|
data corruption.
|
||||||
|
|
||||||
|
- A table of 4-byte offset values (in network byte order).
|
||||||
|
These are usually 31-bit pack file offsets, but large
|
||||||
|
offsets are encoded as an index into the next table with
|
||||||
|
the msbit set.
|
||||||
|
|
||||||
|
- A table of 8-byte offset entries (empty for pack files less
|
||||||
|
than 2 GiB). Pack files are organized with heavily used
|
||||||
|
objects toward the front, so most object references should
|
||||||
|
not need to refer to this table.
|
||||||
|
|
||||||
|
- The same trailer as a v1 pack-*.idx file:
|
||||||
|
|
||||||
|
A copy of the 20-byte SHA1 checksum at the end of
|
||||||
|
corresponding packfile.
|
||||||
|
|
||||||
|
20-byte SHA1-checksum of all of the above.
|
||||||
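The version 2 layout described above determines where each table starts once
the object count is known from fanout[255]. The following Python is a rough
sketch, under stated assumptions, of how a reader might locate the offset of
the n-th object (in sorted name order). The name `v2_offset` is hypothetical,
the whole .idx file is assumed to be in memory, and the 8-byte entries are
assumed to be in network byte order like the 4-byte ones.

```python
import struct

IDX_V2_MAGIC = b"\377tOc"   # an unreasonable fanout[0] value for a v1 index
HEADER_BYTES = 8            # 4-byte magic + 4-byte version
FANOUT_BYTES = 256 * 4


def v2_offset(idx: bytes, n: int) -> int:
    """Return the pack file offset of the n-th object in a v2 index.

    Hypothetical helper: `idx` is the raw content of a version 2
    pack-*.idx file and `n` counts objects in sorted name order.
    """
    if idx[:4] != IDX_V2_MAGIC:
        raise ValueError("not a version 2 pack index")
    (version,) = struct.unpack(">I", idx[4:8])
    if version != 2:
        raise ValueError("unsupported index version %d" % version)

    fanout = struct.unpack(">256I", idx[HEADER_BYTES:HEADER_BYTES + FANOUT_BYTES])
    total = fanout[255]
    if not 0 <= n < total:
        raise IndexError("object position out of range")

    # The tables follow one another in the order given in the description.
    names_at = HEADER_BYTES + FANOUT_BYTES
    crc_at = names_at + 20 * total      # 20-byte object names
    ofs_at = crc_at + 4 * total         # 4-byte CRC32 values
    big_at = ofs_at + 4 * total         # 4-byte offsets

    (raw,) = struct.unpack(">I", idx[ofs_at + 4 * n:ofs_at + 4 * n + 4])
    if raw & 0x80000000:
        # msbit set: the low 31 bits index the table of 8-byte offsets.
        k = raw & 0x7FFFFFFF
        (raw,) = struct.unpack(">Q", idx[big_at + 8 * k:big_at + 8 * k + 8])
    return raw
```

Keeping the names, CRC32 values and offsets in separate tables means a name
search touches only the 20-byte name table, and the 8-byte table is read only
for the few objects whose offset does not fit in 31 bits.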
|