git-commit-vandalism/Documentation/technical/bundle-format.txt
brian m. carlson c5aecfc866 bundle: add new version for use with SHA-256
Currently we detect the hash algorithm in use by the length of the
object ID.  This is inelegant and prevents us from using a different
hash algorithm that is also 256 bits in length.

Since we cannot extend the v2 format in a backward-compatible way, let's
add a v3 format, which is identical, except for the addition of
capabilities, which are prefixed by an at sign.  We add "object-format"
as the only capability and reject unknown capabilities, since we do not
have a network connection and therefore cannot negotiate with the other
side.

For compatibility, default to the v2 format for SHA-1 and require v3
for SHA-256.

In t5510, always use format v3 so we can be sure we produce consistent
results across hash algorithms.  Since head -n N lists the top N lines
instead of the Nth line, let's run our output through sed to normalize
it and compare it against a fixed value, which will make sure we get
exactly what we're expecting.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-30 09:16:48 -07:00

77 lines
2.6 KiB
Plaintext

= Git bundle v2 format
The Git bundle format is a format that represents both refs and Git objects.
== Format
We will use ABNF notation to define the Git bundle format. See
protocol-common.txt for the details.
A v2 bundle looks like this:
----
bundle = signature *prerequisite *reference LF pack
signature = "# v2 git bundle" LF
prerequisite = "-" obj-id SP comment LF
comment = *CHAR
reference = obj-id SP refname LF
pack = ... ; packfile
----
A v3 bundle looks like this:
----
bundle = signature *capability *prerequisite *reference LF pack
signature = "# v3 git bundle" LF
capability = "@" key ["=" value] LF
prerequisite = "-" obj-id SP comment LF
comment = *CHAR
reference = obj-id SP refname LF
key = 1*(ALPHA / DIGIT / "-")
value = *(%01-09 / %0b-FF)
pack = ... ; packfile
----
== Semantics
A Git bundle consists of several parts.
* "Capabilities", which are only in the v3 format, indicate functionality that
the bundle requires to be read properly.
* "Prerequisites" lists the objects that are NOT included in the bundle and the
reader of the bundle MUST already have, in order to use the data in the
bundle. The objects stored in the bundle may refer to prerequisite objects and
anything reachable from them (e.g. a tree object in the bundle can reference
a blob that is reachable from a prerequisite) and/or expressed as a delta
against prerequisite objects.
* "References" record the tips of the history graph, iow, what the reader of the
bundle CAN "git fetch" from it.
* "Pack" is the pack data stream "git fetch" would send, if you fetch from a
repository that has the references recorded in the "References" above into a
repository that has references pointing at the objects listed in
"Prerequisites" above.
In the bundle format, there can be a comment following a prerequisite obj-id.
This is a comment and it has no specific meaning. The writer of the bundle MAY
put any string here. The reader of the bundle MUST ignore the comment.
=== Note on the shallow clone and a Git bundle
Note that the prerequisites does not represent a shallow-clone boundary. The
semantics of the prerequisites and the shallow-clone boundaries are different,
and the Git bundle v2 format cannot represent a shallow clone repository.
== Capabilities
Because there is no opportunity for negotiation, unknown capabilities cause 'git
bundle' to abort. The only known capability is `object-format`, which specifies
the hash algorithm in use, and can take the same values as the
`extensions.objectFormat` configuration value.