Merge branch 'ta/hash-function-transition-doc'

Update formatting and grammar of the hash transition plan documentation, plus some updates. * ta/hash-function-transition-doc: doc: use https links doc hash-function-transition: move rationale upwards doc hash-function-transition: fix incomplete sentence doc hash-function-transition: use upper case consistently doc hash-function-transition: use SHA-1 and SHA-256 consistently doc hash-function-transition: fix asciidoc output
2021-02-22 16:12:43 -08:00 · 2021-02-22 16:12:43 -08:00 · dc24948be9
commit dc24948be9
parent 15af6e6fee 6eda9ac9e5
2 changed files with 150 additions and 147 deletions
--- a/Documentation/technical/hash-function-transition.txt
+++ b/Documentation/technical/hash-function-transition.txt
@ -33,16 +33,9 @@ researchers. On 23 February 2017 the SHAttered attack

 Git v2.13.0 and later subsequently moved to a hardened SHA-1
 implementation by default, which isn't vulnerable to the SHAttered
-attack.
+attack, but SHA-1 is still weak.

-Thus Git has in effect already migrated to a new hash that isn't SHA-1
-and doesn't share its vulnerabilities, its new hash function just
-happens to produce exactly the same output for all known inputs,
-except two PDFs published by the SHAttered researchers, and the new
-implementation (written by those researchers) claims to detect future
-cryptanalytic collision attacks.
-
-Regardless, it's considered prudent to move past any variant of SHA-1
+Thus it's considered prudent to move past any variant of SHA-1
 to a new hash. There's no guarantee that future attacks on SHA-1 won't
 be published in the future, and those attacks may not have viable
 mitigations.
@ -57,6 +50,38 @@ SHA-1 still possesses the other properties such as fast object lookup
 and safe error checking, but other hash functions are equally suitable
 that are believed to be cryptographically secure.

+Choice of Hash
+--------------
+The hash to replace the hardened SHA-1 should be stronger than SHA-1
+was: we would like it to be trustworthy and useful in practice for at
+least 10 years.
+
+Some other relevant properties:
+
+1. A 256-bit hash (long enough to match common security practice; not
+   excessively long to hurt performance and disk usage).
+
+2. High quality implementations should be widely available (e.g., in
+   OpenSSL and Apple CommonCrypto).
+
+3. The hash function's properties should match Git's needs (e.g. Git
+   requires collision and 2nd preimage resistance and does not require
+   length extension resistance).
+
+4. As a tiebreaker, the hash should be fast to compute (fortunately
+   many contenders are faster than SHA-1).
+
+There were several contenders for a successor hash to SHA-1, including
+SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256.
+
+In late 2018 the project picked SHA-256 as its successor hash.
+
+See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as
+NewHash, 2018-08-04) and numerous mailing list threads at the time,
+particularly the one starting at
+https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/
+for more information.
+
 Goals
 -----
 1. The transition to SHA-256 can be done one local repository at a time.
@ -94,7 +119,7 @@ Overview
 --------
 We introduce a new repository format extension. Repositories with this
 extension enabled use SHA-256 instead of SHA-1 to name their objects.
-This affects both object names and object content --- both the names
+This affects both object names and object content -- both the names
 of objects and all references to other objects within an object are
 switched to the new hash function.

@ -107,7 +132,7 @@ mapping to allow naming objects using either their SHA-1 and SHA-256 names
 interchangeably.

 "git cat-file" and "git hash-object" gain options to display an object
-in its sha1 form and write an object given its sha1 form. This
+in its SHA-1 form and write an object given its SHA-1 form. This
 requires all objects referenced by that object to be present in the
 object database so that they can be named using the appropriate name
 (using the bidirectional hash mapping).
@ -115,7 +140,7 @@ object database so that they can be named using the appropriate name
 Fetches from a SHA-1 based server convert the fetched objects into
 SHA-256 form and record the mapping in the bidirectional mapping table
 (see below for details). Pushes to a SHA-1 based server convert the
-objects being pushed into sha1 form so the server does not have to be
+objects being pushed into SHA-1 form so the server does not have to be
 aware of the hash function the client is using.

 Detailed Design
@ -151,38 +176,38 @@ repository extensions.

 Object names
 ~~~~~~~~~~~~
-Objects can be named by their 40 hexadecimal digit sha1-name or 64
-hexadecimal digit sha256-name, plus names derived from those (see
+Objects can be named by their 40 hexadecimal digit SHA-1 name or 64
+hexadecimal digit SHA-256 name, plus names derived from those (see
 gitrevisions(7)).

-The sha1-name of an object is the SHA-1 of the concatenation of its
-type, length, a nul byte, and the object's sha1-content. This is the
+The SHA-1 name of an object is the SHA-1 of the concatenation of its
+type, length, a nul byte, and the object's SHA-1 content. This is the
 traditional <sha1> used in Git to name objects.

-The sha256-name of an object is the SHA-256 of the concatenation of its
-type, length, a nul byte, and the object's sha256-content.
+The SHA-256 name of an object is the SHA-256 of the concatenation of its
+type, length, a nul byte, and the object's SHA-256 content.

 Object format
 ~~~~~~~~~~~~~
 The content as a byte sequence of a tag, commit, or tree object named
-by sha1 and sha256 differ because an object named by sha256-name refers to
-other objects by their sha256-names and an object named by sha1-name
-refers to other objects by their sha1-names.
+by SHA-1 and SHA-256 differ because an object named by SHA-256 name refers to
+other objects by their SHA-256 names and an object named by SHA-1 name
+refers to other objects by their SHA-1 names.

-The sha256-content of an object is the same as its sha1-content, except
-that objects referenced by the object are named using their sha256-names
-instead of sha1-names. Because a blob object does not refer to any
-other object, its sha1-content and sha256-content are the same.
+The SHA-256 content of an object is the same as its SHA-1 content, except
+that objects referenced by the object are named using their SHA-256 names
+instead of SHA-1 names. Because a blob object does not refer to any
+other object, its SHA-1 content and SHA-256 content are the same.

-The format allows round-trip conversion between sha256-content and
-sha1-content.
+The format allows round-trip conversion between SHA-256 content and
+SHA-1 content.

 Object storage
 ~~~~~~~~~~~~~~
 Loose objects use zlib compression and packed objects use the packed
 format described in Documentation/technical/pack-format.txt, just like
-today. The content that is compressed and stored uses sha256-content
-instead of sha1-content.
+today. The content that is compressed and stored uses SHA-256 content
+instead of SHA-1 content.

 Pack index
 ~~~~~~~~~~
@ -191,21 +216,21 @@ hash functions. They have the following format (all integers are in
 network byte order):

 - A header appears at the beginning and consists of the following:
-  - The 4-byte pack index signature: '\377t0c'
-  - 4-byte version number: 3
-  - 4-byte length of the header section, including the signature and
+  * The 4-byte pack index signature: '\377t0c'
+  * 4-byte version number: 3
+  * 4-byte length of the header section, including the signature and
    version number
-  - 4-byte number of objects contained in the pack
-  - 4-byte number of object formats in this pack index: 2
-  - For each object format:
-    - 4-byte format identifier (e.g., 'sha1' for SHA-1)
-    - 4-byte length in bytes of shortened object names. This is the
+  * 4-byte number of objects contained in the pack
+  * 4-byte number of object formats in this pack index: 2
+  * For each object format:
+    ** 4-byte format identifier (e.g., 'sha1' for SHA-1)
+    ** 4-byte length in bytes of shortened object names. This is the
      shortest possible length needed to make names in the shortened
      object name table unambiguous.
-    - 4-byte integer, recording where tables relating to this format
+    ** 4-byte integer, recording where tables relating to this format
      are stored in this index file, as an offset from the beginning.
-  - 4-byte offset to the trailer from the beginning of this file.
-  - Zero or more additional key/value pairs (4-byte key, 4-byte
+  * 4-byte offset to the trailer from the beginning of this file.
+  * Zero or more additional key/value pairs (4-byte key, 4-byte
    value). Only one key is supported: 'PSRC'. See the "Loose objects
    and unreachable objects" section for supported values and how this
    is used.  All other keys are reserved. Readers must ignore
@ -213,37 +238,36 @@ network byte order):
 - Zero or more NUL bytes. This can optionally be used to improve the
  alignment of the full object name table below.
 - Tables for the first object format:
-  - A sorted table of shortened object names.  These are prefixes of
+  * A sorted table of shortened object names.  These are prefixes of
    the names of all objects in this pack file, packed together
    without offset values to reduce the cache footprint of the binary
    search for a specific object name.

-  - A table of full object names in pack order. This allows resolving
+  * A table of full object names in pack order. This allows resolving
    a reference to "the nth object in the pack file" (from a
    reachability bitmap or from the next table of another object
    format) to its object name.

-  - A table of 4-byte values mapping object name order to pack order.
+  * A table of 4-byte values mapping object name order to pack order.
    For an object in the table of sorted shortened object names, the
    value at the corresponding index in this table is the index in the
    previous table for that same object.
-
    This can be used to look up the object in reachability bitmaps or
    to look up its name in another object format.

-  - A table of 4-byte CRC32 values of the packed object data, in the
+  * A table of 4-byte CRC32 values of the packed object data, in the
    order that the objects appear in the pack file. This is to allow
    compressed data to be copied directly from pack to pack during
    repacking without undetected data corruption.

-  - A table of 4-byte offset values. For an object in the table of
+  * A table of 4-byte offset values. For an object in the table of
    sorted shortened object names, the value at the corresponding
    index in this table indicates where that object can be found in
    the pack file. These are usually 31-bit pack file offsets, but
    large offsets are encoded as an index into the next table with the
    most significant bit set.

-  - A table of 8-byte offset entries (empty for pack files less than
+  * A table of 8-byte offset entries (empty for pack files less than
    2 GiB). Pack files are organized with heavily used objects toward
    the front, so most object references should not need to refer to
    this table.
@ -252,10 +276,10 @@ network byte order):
  up to and not including the table of CRC32 values.
 - Zero or more NUL bytes.
 - The trailer consists of the following:
-  - A copy of the 20-byte SHA-256 checksum at the end of the
+  * A copy of the 20-byte SHA-256 checksum at the end of the
    corresponding packfile.

-  - 20-byte SHA-256 checksum of all of the above.
+  * 20-byte SHA-256 checksum of all of the above.

 Loose object index
 ~~~~~~~~~~~~~~~~~~
@ -288,18 +312,18 @@ To remove entries (e.g. in "git pack-refs" or "git-prune"):

 Translation table
 ~~~~~~~~~~~~~~~~~
-The index files support a bidirectional mapping between sha1-names
-and sha256-names. The lookup proceeds similarly to ordinary object
-lookups. For example, to convert a sha1-name to a sha256-name:
+The index files support a bidirectional mapping between SHA-1 names
+and SHA-256 names. The lookup proceeds similarly to ordinary object
+lookups. For example, to convert a SHA-1 name to a SHA-256 name:

 1. Look for the object in idx files. If a match is present in the
-    idx's sorted list of truncated sha1-names, then:
-    a. Read the corresponding entry in the sha1-name order to pack
+    idx's sorted list of truncated SHA-1 names, then:
+    a. Read the corresponding entry in the SHA-1 name order to pack
       name order mapping.
-    b. Read the corresponding entry in the full sha1-name table to
+    b. Read the corresponding entry in the full SHA-1 name table to
       verify we found the right object. If it is, then
-    c. Read the corresponding entry in the full sha256-name table.
-       That is the object's sha256-name.
+    c. Read the corresponding entry in the full SHA-256 name table.
+       That is the object's SHA-256 name.
 2. Check for a loose object. Read lines from loose-object-idx until
    we find a match.

@ -313,10 +337,10 @@ Since all operations that make new objects (e.g., "git commit") add
 the new objects to the corresponding index, this mapping is possible
 for all objects in the object store.

-Reading an object's sha1-content
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The sha1-content of an object can be read by converting all sha256-names
-its sha256-content references to sha1-names using the translation table.
+Reading an object's SHA-1 content
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The SHA-1 content of an object can be read by converting all SHA-256 names
+of its SHA-256 content references to SHA-1 names using the translation table.

 Fetch
 ~~~~~
@ -339,7 +363,7 @@ the following steps:
 1. index-pack: inflate each object in the packfile and compute its
   SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against
   objects the client has locally. These objects can be looked up
-   using the translation table and their sha1-content read as
+   using the translation table and their SHA-1 content read as
   described above to resolve the deltas.
 2. topological sort: starting at the "want"s from the negotiation
   phase, walk through objects in the pack and emit a list of them,
@ -348,12 +372,12 @@ the following steps:
   (This list only contains objects reachable from the "wants". If the
   pack from the server contained additional extraneous objects, then
   they will be discarded.)
-3. convert to sha256: open a new (sha256) packfile. Read the topologically
+3. convert to SHA-256: open a new SHA-256 packfile. Read the topologically
   sorted list just generated. For each object, inflate its
-   sha1-content, convert to sha256-content, and write it to the sha256
-   pack. Record the new sha1<->sha256 mapping entry for use in the idx.
+   SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
+   pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
 4. sort: reorder entries in the new pack to match the order of objects
-   in the pack the server generated and include blobs. Write a sha256 idx
+   in the pack the server generated and include blobs. Write a SHA-256 idx
   file
 5. clean up: remove the SHA-1 based pack file, index, and
   topologically sorted list obtained from the server in steps 1
@ -378,19 +402,20 @@ experimenting to get this to perform well.
 Push
 ~~~~
 Push is simpler than fetch because the objects referenced by the
-pushed objects are already in the translation table. The sha1-content
+pushed objects are already in the translation table. The SHA-1 content
 of each object being pushed can be read as described in the "Reading
-an object's sha1-content" section to generate the pack written by git
+an object's SHA-1 content" section to generate the pack written by git
 send-pack.

 Signed Commits
 ~~~~~~~~~~~~~~
 We add a new field "gpgsig-sha256" to the commit object format to allow
 signing commits without relying on SHA-1. It is similar to the
-existing "gpgsig" field. Its signed payload is the sha256-content of the
+existing "gpgsig" field. Its signed payload is the SHA-256 content of the
 commit object with any "gpgsig" and "gpgsig-sha256" fields removed.

 This means commits can be signed
+
 1. using SHA-1 only, as in existing signed commit objects
 2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
   fields.
@ -404,10 +429,11 @@ Signed Tags
 ~~~~~~~~~~~
 We add a new field "gpgsig-sha256" to the tag object format to allow
 signing tags without relying on SHA-1. Its signed payload is the
-sha256-content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
+SHA-256 content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
 SIGNATURE-----" delimited in-body signature removed.

 This means tags can be signed
+
 1. using SHA-1 only, as in existing signed tag objects
 2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
   signature.
@ -415,11 +441,11 @@ This means tags can be signed

 Mergetag embedding
 ~~~~~~~~~~~~~~~~~~
-The mergetag field in the sha1-content of a commit contains the
-sha1-content of a tag that was merged by that commit.
+The mergetag field in the SHA-1 content of a commit contains the
+SHA-1 content of a tag that was merged by that commit.

-The mergetag field in the sha256-content of the same commit contains the
-sha256-content of the same tag.
+The mergetag field in the SHA-256 content of the same commit contains the
+SHA-256 content of the same tag.

 Submodules
 ~~~~~~~~~~
@ -494,7 +520,7 @@ Caveats
 -------
 Invalid objects
 ~~~~~~~~~~~~~~~
-The conversion from sha1-content to sha256-content retains any
+The conversion from SHA-1 content to SHA-256 content retains any
 brokenness in the original object (e.g., tree entry modes encoded with
 leading 0, tree objects whose paths are not sorted correctly, and
 commit objects without an author or committer). This is a deliberate
@ -513,15 +539,15 @@ allow lifting this restriction.

 Alternates
 ~~~~~~~~~~
-For the same reason, a sha256 repository cannot borrow objects from a
-sha1 repository using objects/info/alternates or
+For the same reason, a SHA-256 repository cannot borrow objects from a
+SHA-1 repository using objects/info/alternates or
 $GIT_ALTERNATE_OBJECT_REPOSITORIES.

 git notes
 ~~~~~~~~~
-The "git notes" tool annotates objects using their sha1-name as key.
+The "git notes" tool annotates objects using their SHA-1 name as key.
 This design does not describe a way to migrate notes trees to use
-sha256-names. That migration is expected to happen separately (for
+SHA-256 names. That migration is expected to happen separately (for
 example using a file at the root of the notes tree to describe which
 hash it uses).

@ -555,7 +581,7 @@ unclear:

 	Git 2.12

-Does this mean Git v2.12.0 is the commit with sha1-name
+Does this mean Git v2.12.0 is the commit with SHA-1 name
 e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with
 new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7?

@ -600,42 +626,10 @@ example:

    git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256}

-Choice of Hash
--------------
-In early 2005, around the time that Git was written, Xiaoyun Wang,
-Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1
-collisions in 2^69 operations. In August they published details.
-Luckily, no practical demonstrations of a collision in full SHA-1 were
-published until 10 years later, in 2017.
-
-Git v2.13.0 and later subsequently moved to a hardened SHA-1
-implementation by default that mitigates the SHAttered attack, but
-SHA-1 is still believed to be weak.
-
-The hash to replace this hardened SHA-1 should be stronger than SHA-1
-was: we would like it to be trustworthy and useful in practice for at
-least 10 years.
-
-Some other relevant properties:
-
-1. A 256-bit hash (long enough to match common security practice; not
-   excessively long to hurt performance and disk usage).
-
-2. High quality implementations should be widely available (e.g., in
-   OpenSSL and Apple CommonCrypto).
-
-3. The hash function's properties should match Git's needs (e.g. Git
-   requires collision and 2nd preimage resistance and does not require
-   length extension resistance).
-
-4. As a tiebreaker, the hash should be fast to compute (fortunately
-   many contenders are faster than SHA-1).
-
-We choose SHA-256.
-
 Transition plan
 ---------------
 Some initial steps can be implemented independently of one another:
+
 - adding a hash function API (vtable)
 - teaching fsck to tolerate the gpgsig-sha256 field
 - excluding gpgsig-* from the fields copied by "git commit --amend"
@ -647,9 +641,9 @@ Some initial steps can be implemented independently of one another:
 - introducing index v3
 - adding support for the PSRC field and safer object pruning

-
 The first user-visible change is the introduction of the objectFormat
 extension (without compatObjectFormat). This requires:
+
 - teaching fsck about this mode of operation
 - using the hash function API (vtable) when computing object names
 - signing objects and verifying signatures
@ -657,6 +651,7 @@ extension (without compatObjectFormat). This requires:
  repository

 Next comes introduction of compatObjectFormat:
+
 - implementing the loose-object-idx
 - translating object names between object formats
 - translating object content between object formats
@ -669,10 +664,11 @@ Next comes introduction of compatObjectFormat:
  "Object names on the command line" above)

 The next step is supporting fetches and pushes to SHA-1 repositories:
+
 - allow pushes to a repository using the compat format
 - generate a topologically sorted list of the SHA-1 names of fetched
  objects
- convert the fetched packfile to sha256 format and generate an idx
+- convert the fetched packfile to SHA-256 format and generate an idx
  file
 - re-sort to match the order of objects in the fetched packfile

@ -734,6 +730,7 @@ Using hash functions in parallel
 Objects newly created would be addressed by the new hash, but inside
 such an object (e.g. commit) it is still possible to address objects
 using the old hash function.
+
 * You cannot trust its history (needed for bisectability) in the
  future without further work
 * Maintenance burden as the number of supported hash functions grows
@ -743,36 +740,38 @@ using the old hash function.
 Signed objects with multiple hashes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Instead of introducing the gpgsig-sha256 field in commit and tag objects
-for sha256-content based signatures, an earlier version of this design
-added "hash sha256 <sha256-name>" fields to strengthen the existing
-sha1-content based signatures.
+for SHA-256 content based signatures, an earlier version of this design
+added "hash sha256 <SHA-256 name>" fields to strengthen the existing
+SHA-1 content based signatures.

 In other words, a single signature was used to attest to the object
 content using both hash functions. This had some advantages:
+
 * Using one signature instead of two speeds up the signing process.
 * Having one signed payload with both hashes allows the signer to
-  attest to the sha1-name and sha256-name referring to the same object.
+  attest to the SHA-1 name and SHA-256 name referring to the same object.
 * All users consume the same signature. Broken signatures are likely
  to be detected quickly using current versions of git.

 However, it also came with disadvantages:
-* Verifying a signed object requires access to the sha1-names of all
+
+* Verifying a signed object requires access to the SHA-1 names of all
  objects it references, even after the transition is complete and
  translation table is no longer needed for anything else. To support
-  this, the design added fields such as "hash sha1 tree <sha1-name>"
-  and "hash sha1 parent <sha1-name>" to the sha256-content of a signed
+  this, the design added fields such as "hash sha1 tree <SHA-1 name>"
+  and "hash sha1 parent <SHA-1 name>" to the SHA-256 content of a signed
  commit, complicating the conversion process.
-* Allowing signed objects without a sha1 (for after the transition is
+* Allowing signed objects without a SHA-1 (for after the transition is
  complete) complicated the design further, requiring a "nohash sha1"
-  field to suppress including "hash sha1" fields in the sha256-content
+  field to suppress including "hash sha1" fields in the SHA-256 content
  and signed payload.

 Lazily populated translation table
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Some of the work of building the translation table could be deferred to
 push time, but that would significantly complicate and slow down pushes.
-Calculating the sha1-name at object creation time at the same time it is
-being streamed to disk and having its sha256-name calculated should be
+Calculating the SHA-1 name at object creation time at the same time it is
+being streamed to disk and having its SHA-256 name calculated should be
 an acceptable cost.

 Document History
@ -782,18 +781,19 @@ Document History
 bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com,
 sbeller@google.com

-Initial version sent to
-http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com
+* Initial version sent to https://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com

 2017-03-03 jrnieder@gmail.com
 Incorporated suggestions from jonathantanmy and sbeller:
-* describe purpose of signed objects with each hash type
-* redefine signed object verification using object content under the
+
+* Describe purpose of signed objects with each hash type
+* Redefine signed object verification using object content under the
  first hash function

 2017-03-06 jrnieder@gmail.com
+
 * Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
-* Make sha3-based signatures a separate field, avoiding the need for
+* Make SHA3-based signatures a separate field, avoiding the need for
  "hash" and "nohash" fields (thanks to peff[3]).
 * Add a sorting phase to fetch (thanks to Junio for noticing the need
  for this).
@ -805,23 +805,26 @@ Incorporated suggestions from jonathantanmy and sbeller:
  especially Junio).

 2017-09-27 jrnieder@gmail.com, sbeller@google.com
-* use placeholder NewHash instead of SHA3-256
-* describe criteria for picking a hash function.
-* include a transition plan (thanks especially to Brandon Williams
+
+* Use placeholder NewHash instead of SHA3-256
+* Describe criteria for picking a hash function.
+* Include a transition plan (thanks especially to Brandon Williams
  for fleshing these ideas out)
-* define the translation table (thanks, Shawn Pearce[5], Jonathan
+* Define the translation table (thanks, Shawn Pearce[5], Jonathan
  Tan, and Masaya Suzuki)
-* avoid loose object overhead by packing more aggressively in
+* Avoid loose object overhead by packing more aggressively in
  "git gc --auto"

 Later history:

- See the history of this file in git.git for the history of subsequent
+* See the history of this file in git.git for the history of subsequent
  edits. This document history is no longer being maintained as it
  would now be superfluous to the commit log

-[1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
-[2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
-[3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
-[4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
+References:
+
+ [1] https://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/
+ [2] https://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/
+ [3] https://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/
+ [4] https://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net
 [5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@ -34,7 +34,7 @@ filter_git () {
 # Compare two files and ensure that `clean` and `smudge` respectively are
 # called at least once if specified in the `expect` file. The actual
 # invocation count is not relevant because their number can vary.
-# c.f. http://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
+# c.f. https://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
 test_cmp_count () {
 	expect=$1
 	actual=$2
@ -49,7 +49,7 @@ test_cmp_count () {

 # Compare two files but exclude all `clean` invocations because Git can
 # call `clean` zero or more times.
-# c.f. http://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
+# c.f. https://lore.kernel.org/git/xmqqshv18i8i.fsf@gitster.mtv.corp.google.com/
 test_cmp_exclude_clean () {
 	expect=$1
 	actual=$2