Tolerate zlib deflation with window size < 32KB
Git currently reports loose objects as 'corrupt' if they've been deflated using a window size less than 32KB, because the experimental_loose_object() function doesn't recognise the header byte as a zlib header. This patch makes the function tolerant of all valid window sizes (15-bit to 8-bit) - but doesn't sacrifice its accuracy in distinguishing the standard loose-object format from the experimental (now abandoned) format.

On memory-constrained systems zlib may use a much smaller window size - working on Agit, I found that Android uses a 4KB window, giving a header byte of 0x48, not 0x78. Consequently all loose objects generated appear 'corrupt', which is why Agit is a read-only Git client at this time - I don't want my client to generate Git repos that other clients treat as broken :(

This patch makes Git tolerant of different deflate settings - it might appear that it changes experimental_loose_object() to the point where it could incorrectly identify the experimental format as the standard one, but the two criteria (bitmask & checksum) can only give a false result for an experimental object where both of the following are true:

1) object size is exactly 8 bytes when uncompressed (bitmask)
2) ([single-byte in-pack git type&size header] * 256 + [1st byte of the following zlib header]) % 31 = 0 (checksum)

As it happens, for all possible combinations of valid object type (1-4) and window bits (0-7), the only time when the checksum will be divisible by 31 is for 0x1838 - i.e. object type *1*, a Commit - which, due to the fields all Commit objects must contain, could never be as small as 8 bytes in size.

Given this, the combination of the two criteria (bitmask & checksum) always correctly determines the buffer format, and is more tolerant than the previous version.

The alternative to this patch is simply removing support for the experimental format, which I am also totally cool with.

References:

Android uses a 4KB window for deflation:
http://android.git.kernel.org/?p=platform/libcore.git;a=blob;f=luni/src/main/native/java_util_zip_Deflater.cpp;h=c0b2feff196e63a7b85d97cf9ae5bb2583409c28;hb=refs/heads/gingerbread#l53

Code snippet searching for false positives with the zlib checksum:
https://gist.github.com/1118177

Signed-off-by: Roberto Tyley <roberto.tyley@guardian.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
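
The claim that 0x1838 is the only colliding header can be checked by brute force. The following is a minimal sketch of such a search, written for illustration only; it is not the snippet from the gist linked above, and the helper name count_false_positives() is hypothetical. It assumes an experimental-format object of exactly 8 bytes (first byte 0ttt1000) followed by a zlib stream whose first byte is 0www1000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of Git): counts the (type, window)
 * pairs for which an 8-byte experimental-format object would also
 * pass the zlib header check, and stores the last matching word.
 */
static int count_false_positives(unsigned int *last_word)
{
	int count = 0;
	unsigned int type, www;

	assert(last_word != NULL);
	for (type = 1; type <= 4; type++) {
		/* In-pack header: 0ttt1000 (no continuation bit, size 8). */
		unsigned int byte0 = (type << 4) | 0x08;

		for (www = 0; www <= 7; www++) {
			/* First byte of the zlib stream: 0www1000. */
			unsigned int byte1 = (www << 4) | 0x08;
			unsigned int word = (byte0 << 8) + byte1;

			if (word % 31 == 0) {
				printf("type=%u www=%u word=0x%04X\n",
				       type, www, word);
				*last_word = word;
				count++;
			}
		}
	}
	return count;
}
```

Calling count_false_positives() from a small main() should report a single match, for type 1 (Commit) with www = 3.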
parent e9e0643fe6
commit 7f684a2aff
32
sha1_file.c
@@ -1217,14 +1217,34 @@ static int experimental_loose_object(unsigned char *map)
 	unsigned int word;
 
 	/*
-	 * Is it a zlib-compressed buffer? If so, the first byte
-	 * must be 0x78 (15-bit window size, deflated), and the
-	 * first 16-bit word is evenly divisible by 31. If so,
-	 * we are looking at the official format, not the experimental
-	 * one.
+	 * We must determine if the buffer contains the standard
+	 * zlib-deflated stream or the experimental format based
+	 * on the in-pack object format. Compare the header byte
+	 * for each format:
+	 *
+	 * RFC1950 zlib w/ deflate : 0www1000 : 0 <= www <= 7
+	 * Experimental pack-based : Stttssss : ttt = 1,2,3,4
+	 *
+	 * If bit 7 is clear and bits 0-3 equal 8, the buffer MUST be
+	 * in standard loose-object format, UNLESS it is a Git-pack
+	 * format object *exactly* 8 bytes in size when inflated.
+	 *
+	 * However, RFC1950 also specifies that the 1st 16-bit word
+	 * must be divisible by 31 - this checksum tells us our buffer
+	 * is in the standard format, giving a false positive only if
+	 * the 1st word of the Git-pack format object happens to be
+	 * divisible by 31, ie:
+	 *      ((byte0 * 256) + byte1) % 31 = 0
+	 *   =>        0ttt10000www1000 % 31 = 0
+	 *
+	 * As it happens, this case can only arise for www=3 & ttt=1
+	 * - ie, a Commit object, which would have to be 8 bytes in
+	 * size. As no Commit can be that small, we find that the
+	 * combination of these two criteria (bitmask & checksum)
+	 * can always correctly determine the buffer format.
 	 */
 	word = (map[0] << 8) + map[1];
-	if (map[0] == 0x78 && !(word % 31))
+	if ((map[0] & 0x8F) == 0x08 && !(word % 31))
 		return 0;
 	else
 		return 1;
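
To see the effect of the new condition on concrete header bytes, here is a small standalone sketch that copies the check out of experimental_loose_object(). The function name looks_like_experimental() and the sample byte pairs are illustrative assumptions, not taken from the test vectors in this commit; 0x78 0x9C and 0x48 0x89 are simply header pairs that satisfy the RFC 1950 divisibility rule.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative copy of the patched check; the name is hypothetical.
 * Returns 0 for the standard zlib-deflated loose-object format and
 * 1 for the experimental (in-pack style) header.
 */
static int looks_like_experimental(const unsigned char *map)
{
	unsigned int word;

	assert(map != NULL);
	word = (map[0] << 8) + map[1];
	/* Bit 7 clear, low nibble 8 (deflate), and word divisible by 31. */
	if ((map[0] & 0x8F) == 0x08 && !(word % 31))
		return 0;
	return 1;
}
```

With this condition a 32KB-window header (0x78 0x9C) and a 4KB-window header (0x48 0x89) are both classified as standard, whereas the previous test `map[0] == 0x78` accepted only the former. An experimental blob header such as 0x38 followed by 0x78 (type 3, size 8) is still classified as experimental, because 0x3878 is not divisible by 31.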
68
t/t1013-loose-object-format.sh
Executable file
@@ -0,0 +1,68 @@
+#!/bin/sh
+#
+# Copyright (c) 2011 Roberto Tyley
+#
+
+test_description='Correctly identify and parse loose object headers
+
+There are two file formats for loose objects - the original standard
+format, and the experimental format introduced with Git v1.4.3, later
+deprecated with v1.5.3. Although Git no longer writes the
+experimental format, objects in both formats must be read, with the
+format for a given file being determined by the header.
+
+Detecting file format based on header is not entirely trivial, not
+least because the first byte of a zlib-deflated stream will vary
+depending on how much memory was allocated for the deflation window
+buffer when the object was written out (for example 4KB on Android,
+rather than 32KB on a normal PC).
+
+The loose objects used as test vectors have been generated with the
+following Git versions:
+
+standard format: Git v1.7.4.1
+experimental format: Git v1.4.3 (legacyheaders=false)
+standard format, deflated with 4KB window size: Agit/JGit on Android
+'
+
+. ./test-lib.sh
+LF='
+'
+
+assert_blob_equals() {
+	printf "%s" "$2" >expected &&
+	git cat-file -p "$1" >actual &&
+	test_cmp expected actual
+}
+
+test_expect_success setup '
+	cp -R "$TEST_DIRECTORY/t1013/objects" .git/ &&
+	git --version
+'
+
+test_expect_success 'read standard-format loose objects' '
+	git cat-file tag 8d4e360d6c70fbd72411991c02a09c442cf7a9fa &&
+	git cat-file commit 6baee0540ea990d9761a3eb9ab183003a71c3696 &&
+	git ls-tree 7a37b887a73791d12d26c0d3e39568a8fb0fa6e8 &&
+	assert_blob_equals "257cc5642cb1a054f08cc83f2d943e56fd3ebe99" "foo$LF"
+'
+
+test_expect_success 'read experimental-format loose objects' '
+	git cat-file tag 76e7fa9941f4d5f97f64fea65a2cba436bc79cbb &&
+	git cat-file commit 7875c6237d3fcdd0ac2f0decc7d3fa6a50b66c09 &&
+	git ls-tree 95b1625de3ba8b2214d1e0d0591138aea733f64f &&
+	assert_blob_equals "2e65efe2a145dda7ee51d1741299f848e5bf752e" "a" &&
+	assert_blob_equals "9ae9e86b7bd6cb1472d9373702d8249973da0832" "ab" &&
+	assert_blob_equals "85df50785d62d3b05ab03d9cbf7e4a0b49449730" "abcd" &&
+	assert_blob_equals "1656f9233d999f61ef23ef390b9c71d75399f435" "abcdefgh" &&
+	assert_blob_equals "1e72a6b2c4a577ab0338860fa9fe87f761fc9bbd" "abcdefghi" &&
+	assert_blob_equals "70e6a83d8dcb26fc8bc0cf702e2ddeb6adca18fd" "abcdefghijklmnop" &&
+	assert_blob_equals "bd15045f6ce8ff75747562173640456a394412c8" "abcdefghijklmnopqrstuvwx"
+'
+
+test_expect_success 'read standard-format objects deflated with smaller window buffer' '
+	git cat-file tag f816d5255855ac160652ee5253b06cd8ee14165a &&
+	git cat-file tag 149cedb5c46929d18e0f118e9fa31927487af3b6
+'
+
+test_done
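
The test description notes that the first byte of a zlib stream depends on the deflation window size. As a rough illustration of why a 4KB window yields 0x48 and a 32KB window yields 0x78, the sketch below derives the two header bytes from the RFC 1950 field layout. Both helper names are hypothetical, the preset-dictionary flag is assumed to be clear, and the code follows the RFC's description rather than zlib's actual source.

```c
#include <assert.h>

/*
 * Hypothetical helper: first zlib header byte (CMF) for a window of
 * 2^window_bits bytes. Per RFC 1950, the high nibble (CINFO) is
 * log2(window size) - 8 and the low nibble (CM) is 8 for deflate.
 */
static unsigned int zlib_cmf_for_window_bits(int window_bits)
{
	assert(window_bits >= 8 && window_bits <= 15);
	return ((unsigned int)(window_bits - 8) << 4) | 0x08;
}

/*
 * Hypothetical helper: full 16-bit header word, with FLEVEL in the
 * top two bits of the second byte, FDICT assumed to be 0, and FCHECK
 * chosen so that the whole word is a multiple of 31.
 */
static unsigned int zlib_header_word(int window_bits, unsigned int flevel)
{
	unsigned int word;

	assert(flevel <= 3);
	word = (zlib_cmf_for_window_bits(window_bits) << 8) | (flevel << 6);
	/* Smallest FCHECK (0..30) that makes word % 31 == 0. */
	word += (31 - word % 31) % 31;
	return word;
}
```

For example, window_bits = 15 (32KB) gives a first byte of 0x78 and window_bits = 12 (4KB) gives 0x48, matching the two values discussed in the commit message.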
BIN
t/t1013/objects/14/9cedb5c46929d18e0f118e9fa31927487af3b6
Normal file
Binary file not shown.
BIN
t/t1013/objects/16/56f9233d999f61ef23ef390b9c71d75399f435
Normal file
Binary file not shown.
BIN
t/t1013/objects/1e/72a6b2c4a577ab0338860fa9fe87f761fc9bbd
Normal file
Binary file not shown.
BIN
t/t1013/objects/25/7cc5642cb1a054f08cc83f2d943e56fd3ebe99
Normal file
Binary file not shown.
BIN
t/t1013/objects/2e/65efe2a145dda7ee51d1741299f848e5bf752e
Normal file
Binary file not shown.
BIN
t/t1013/objects/6b/aee0540ea990d9761a3eb9ab183003a71c3696
Normal file
Binary file not shown.
BIN
t/t1013/objects/70/e6a83d8dcb26fc8bc0cf702e2ddeb6adca18fd
Normal file
Binary file not shown.
@@ -0,0 +1,2 @@
Binary file not shown.
BIN
t/t1013/objects/78/75c6237d3fcdd0ac2f0decc7d3fa6a50b66c09
Normal file
Binary file not shown.
BIN
t/t1013/objects/7a/37b887a73791d12d26c0d3e39568a8fb0fa6e8
Normal file
Binary file not shown.
BIN
t/t1013/objects/85/df50785d62d3b05ab03d9cbf7e4a0b49449730
Normal file
Binary file not shown.
BIN
t/t1013/objects/8d/4e360d6c70fbd72411991c02a09c442cf7a9fa
Normal file
Binary file not shown.
BIN
t/t1013/objects/95/b1625de3ba8b2214d1e0d0591138aea733f64f
Normal file
Binary file not shown.
BIN
t/t1013/objects/9a/e9e86b7bd6cb1472d9373702d8249973da0832
Normal file
Binary file not shown.
BIN
t/t1013/objects/bd/15045f6ce8ff75747562173640456a394412c8
Normal file
Binary file not shown.
BIN
t/t1013/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
Normal file
Binary file not shown.
@@ -0,0 +1 @@
Binary file not shown.