git-commit-vandalism/Documentation/technical/protocol-common.txt

97 lines
2.7 KiB
Plaintext
Raw Normal View History

Documentation Common to Pack and Http Protocols
===============================================
ABNF Notation
-------------
ABNF notation as described by RFC 5234 is used within the protocol documents,
except the following replacement core rules are used:
----
HEXDIG = DIGIT / "a" / "b" / "c" / "d" / "e" / "f"
----
We also define the following common rules:
----
NUL = %x00
zero-id = 40*"0"
obj-id = 40*(HEXDIGIT)
refname = "HEAD"
refname /= "refs/" <see discussion below>
----
A refname is a hierarchical octet string beginning with "refs/" and
not violating the 'git-check-ref-format' command's validation rules.
More specifically, they:
. They can include slash `/` for hierarchical (directory)
grouping, but no slash-separated component can begin with a
dot `.`.
. They must contain at least one `/`. This enforces the presence of a
category like `heads/`, `tags/` etc. but the actual names are not
restricted.
. They cannot have two consecutive dots `..` anywhere.
. They cannot have ASCII control characters (i.e. bytes whose
values are lower than \040, or \177 `DEL`), space, tilde `~`,
docs: stop using asciidoc no-inline-literal In asciidoc 7, backticks like `foo` produced a typographic effect, but did not otherwise affect the syntax. In asciidoc 8, backticks introduce an "inline literal" inside which markup is not interpreted. To keep compatibility with existing documents, asciidoc 8 has a "no-inline-literal" attribute to keep the old behavior. We enabled this so that the documentation could be built on either version. It has been several years now, and asciidoc 7 is no longer in wide use. We can now decide whether or not we want inline literals on their own merits, which are: 1. The source is much easier to read when the literal contains punctuation. You can use `master~1` instead of `master{tilde}1`. 2. They are less error-prone. Because of point (1), we tend to make mistakes and forget the extra layer of quoting. This patch removes the no-inline-literal attribute from the Makefile and converts every use of backticks in the documentation to an inline literal (they must be cleaned up, or the example above would literally show "{tilde}" in the output). Problematic sites were found by grepping for '`.*[{\\]' and examined and fixed manually. The results were then verified by comparing the output of "html2text" on the set of generated html pages. Doing so revealed that in addition to making the source more readable, this patch fixes several formatting bugs: - HTML rendering used the ellipsis character instead of literal "..." in code examples (like "git log A...B") - some code examples used the right-arrow character instead of '->' because they failed to quote - api-config.txt did not quote tilde, and the resulting HTML contained a bogus snippet like: <tt><sub></tt> foo <tt></sub>bar</tt> which caused some parsers to choke and omit whole sections of the page. - git-commit.txt confused ``foo`` (backticks inside a literal) with ``foo'' (matched double-quotes) - mentions of `A U Thor <author@example.com>` used to erroneously auto-generate a mailto footnote for author@example.com - the description of --word-diff=plain incorrectly showed the output as "[-removed-] and {added}", not "{+added+}". - using "prime" notation like: commit `C` and its replacement `C'` confused asciidoc into thinking that everything between the first backtick and the final apostrophe were meant to be inside matched quotes - asciidoc got confused by the escaping of some of our asterisks. In particular, `credential.\*` and `credential.<url>.\*` properly escaped the asterisk in the first case, but literally passed through the backslash in the second case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-26 10:51:57 +02:00
caret `^`, colon `:`, question-mark `?`, asterisk `*`,
or open bracket `[` anywhere.
. They cannot end with a slash `/` or a dot `.`.
. They cannot end with the sequence `.lock`.
. They cannot contain a sequence `@{`.
. They cannot contain a `\\`.
pkt-line Format
---------------
Much (but not all) of the payload is described around pkt-lines.
A pkt-line is a variable length binary string. The first four bytes
of the line, the pkt-len, indicates the total length of the line,
in hexadecimal. The pkt-len includes the 4 bytes used to contain
the length's hexadecimal representation.
A pkt-line MAY contain binary data, so implementors MUST ensure
pkt-line parsing/formatting routines are 8-bit clean.
A non-binary line SHOULD BE terminated by an LF, which if present
MUST be included in the total length.
The maximum length of a pkt-line's data component is 65520 bytes.
Implementations MUST NOT send pkt-line whose length exceeds 65524
(65520 bytes of payload + 4 bytes of length data).
Implementations SHOULD NOT send an empty pkt-line ("0004").
A pkt-line with a length field of 0 ("0000"), called a flush-pkt,
is a special case and MUST be handled differently than an empty
pkt-line ("0004").
----
pkt-line = data-pkt / flush-pkt
data-pkt = pkt-len pkt-payload
pkt-len = 4*(HEXDIG)
pkt-payload = (pkt-len - 4)*(OCTET)
flush-pkt = "0000"
----
Examples (as C-style strings):
----
pkt-line actual value
---------------------------------
"0006a\n" "a\n"
"0005a" "a"
"000bfoobar\n" "foobar\n"
"0004" ""
----