Commit Graph

25 Commits

Author SHA1 Message Date
Ben Peart
3255089ada ieot: add Index Entry Offset Table (IEOT) extension
This patch enables addressing the CPU cost of loading the index by adding
additional data to the index that will allow us to efficiently multi-
thread the loading and conversion of cache entries.

It accomplishes this by adding an (optional) index extension that is a
table of offsets to blocks of cache entries in the index file.  To make
this work for V4 indexes, when writing the cache entries, it periodically
"resets" the prefix-compression by encoding the current entry as if the
path name for the previous entry is completely different and saves the
offset of that entry in the IEOT.  Basically, with V4 indexes, it
generates offsets into blocks of prefix-compressed entries.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-11 15:32:48 +09:00
Ben Peart
3b1d9e045e eoie: add End of Index Entry (EOIE) extension
The End of Index Entry (EOIE) is used to locate the end of the variable
length index entries and the beginning of the extensions. Code can take
advantage of this to quickly locate the index extensions without having
to parse through all of the index entries.

The EOIE extension is always written out to the index file including to
the shared index when using the split index feature. Because it is always
written out, the SHA checksums in t/t1700-split-index.sh were updated
to reflect its inclusion.

It is written as an optional extension to ensure compatibility with other
git implementations that do not yet support it.  It is always written out
to ensure it is available as often as possible to speed up index operations.

Because it must be able to be loaded before the variable length cache
entries and other index extensions, this extension must be written last.
The signature for this extension is { 'E', 'O', 'I', 'E' }.

The extension consists of:

- 32-bit offset to the end of the index entries

- 160-bit SHA-1 over the extension types and their sizes (but not
their contents).  E.g. if we have "TREE" extension that is N-bytes
long, "REUC" extension that is M-bytes long, followed by "EOIE",
then the hash would be:

SHA-1("TREE" + <binary representation of N> +
    "REUC" + <binary representation of M>)

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-11 15:32:48 +09:00
Ben Peart
780494b1f5 fsmonitor: add documentation for the fsmonitor extension.
This includes the core.fsmonitor setting, the fsmonitor integration hook,
and the fsmonitor index extension.

Also add documentation for the new fsmonitor options to ls-files and
update-index.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:05 +09:00
Junio C Hamano
73167677c0 Merge branch 'jc/em-dash-in-doc'
AsciiDoc markup fixes.

* jc/em-dash-in-doc:
  Documentation: AsciiDoc spells em-dash as double-dashes, not triple
2015-10-29 13:59:16 -07:00
Junio C Hamano
3b19dba703 Documentation: AsciiDoc spells em-dash as double-dashes, not triple
Again, we do not usually process release notes with AsciiDoc, but it
is better to be consistent.

This incidentally reveals breakages left by an ancient 5e00439f
(Documentation: build html for all files in technical and howto,
2012-10-23).  The index-format documentation was originally written
to be read as straight text without formatting and when the commit
forced everything in Documentation/ to go through AsciiDoc, it did
not do any adjustment--hence the double-dashes will be seen in the
resulting text that is rendered as preformatted fixed-width without
converted into em-dashes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-22 13:02:33 -07:00
Junio C Hamano
2e8ef44ee5 Merge branch 'ta/docfix-index-format-tech'
* ta/docfix-index-format-tech:
  typofix for index-format.txt
2015-08-17 15:07:52 -07:00
Thomas Ackermann
da4c5adae9 typofix for index-format.txt
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-28 11:05:41 -07:00
Nguyễn Thái Ngọc Duy
1e8fef609e untracked cache: guard and disable on system changes
If the user enables untracked cache, then

 - move worktree to an unsupported filesystem
 - or simply upgrade OS
 - or move the whole (portable) disk from one machine to another
 - or access a shared fs from another machine

there's no guarantee that untracked cache can still function properly.
Record the worktree location and OS footprint in the cache. If it
changes, err on the safe side and disable the cache. The user can
'update-index --untracked-cache' again to make sure all conditions are
met.

This adds a new requirement that setup_git_directory* must be called
before read_cache() because we need worktree location by then, or the
cache is dropped.

This change does not cover all bases, you can fool it if you try
hard. The point is to stop accidents.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: brian m. carlson <sandals@crustytoothpaste.net>
Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:18 -07:00
Nguyễn Thái Ngọc Duy
83c094ad0d untracked cache: save to an index extension
Helped-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12 13:45:16 -07:00
Junio C Hamano
63903d0e4e Merge branch 'nd/split-index'
A typofix to the documentation of a feature already in the release.

* nd/split-index:
  index-format.txt: add a missing closing quote
2014-12-22 12:28:11 -08:00
Nguyễn Thái Ngọc Duy
f2667a8330 index-format.txt: add a missing closing quote
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-11 14:24:37 -08:00
Thomas Ackermann
f745acb028 Documentation: typofixes
In addition to fixing trivial and obvious typos, be careful about
the following points:

 - Spell ASCII, URL and CRC in ALL CAPS;
 - Spell Linux as Capitalized;
 - Do not omit periods in "i.e." and "e.g.".

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-04 13:14:44 -08:00
Nguyễn Thái Ngọc Duy
5fc2fc8fa2 read-cache: split-index mode
This split-index mode is designed to keep write cost proportional to
the number of changes the user has made, not the size of the work
tree. (Read cost is another matter, to be dealt separately.)

This mode stores index info in a pair of $GIT_DIR/index and
$GIT_DIR/sharedindex.<SHA-1>. sharedindex is large and unchanged over
time while "index" is smaller and updated often. Format details are in
index-format.txt, although not everything is implemented in this
patch.

Shared indexes are not automatically removed, because it's unclear if
the shared index is needed by any (even temporary) indexes by just
looking at it. After a while you'll collect stale shared indexes. The
good news is one shared index is useable for long, until
$GIT_DIR/index becomes too big and sluggish that the new shared index
must be created.

The safest way to clean shared indexes is to turn off split index
mode, so shared files are all garbage, delete them all, then turn on
split index mode again.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13 11:49:39 -07:00
Ondřej Bílka
17b83d71d5 typofix: documentation
Signed-off-by: Ondřej Bílka <neleai@seznam.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-22 16:06:48 -07:00
Junio C Hamano
865e99b5fd Merge branch 'nd/doc-index-format'
Update the index format documentation to mention the v4 format.

* nd/doc-index-format:
  update-index: list supported idx versions and their features
  read-cache.c: use INDEX_FORMAT_{LB,UB} in verify_hdr()
  index-format.txt: mention of v4 is missing in some places
2013-03-19 12:15:14 -07:00
Nguyễn Thái Ngọc Duy
300e39f6aa index-format.txt: mention of v4 is missing in some places
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-22 12:47:14 -08:00
Thomas Ackermann
2de9b71138 Documentation: the name of the system is 'Git', not 'git'
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-01 13:53:33 -08:00
Thomas Ackermann
48a8c26c62 Documentation: avoid poor-man's small caps GIT
In the earlier days, we used to spell the name of the system as GIT,
to simulate as if it were typeset with capital G and IT in small
caps.  Later we stopped doing so at around 1.6.5 days.

Let's stop doing so throughout the documentation.  The name to refer
to the whole system (and the concept it embodies) is "Git"; the
command end-users type is "git".  And document this in the coding
guideline.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-01 13:53:25 -08:00
Junio C Hamano
d34ccd6df7 Merge branch 'nd/index-format-doc'
* nd/index-format-doc:
  index-format.txt: clarify what is "invalid"
2012-12-21 15:18:32 -08:00
Nguyễn Thái Ngọc Duy
4a6385fe55 index-format.txt: clarify what is "invalid"
A cache-tree entry with a negative entry count is considered invalid
by the current Git; it records that we do not know the object name
of a tree that would result by writing the directory covered by the
cache-tree as a tree object.

Clarify that any entry with a negative entry count is invalid, but
the implementations must write -1 there. This way, we can later
decide to allow writers to use negative values other than -1 to
encode optional information on such invalidated entries without
harming interoperability; we do not know what will be encoded and
how, so we keep these other negative values as reserved for now.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-13 10:12:25 -08:00
Thomas Ackermann
5316c8e939 Documentation/technical: convert plain text files to asciidoc
These were not originally meant for asciidoc, but they are already
so close.  Mark them up in asciidoc.

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16 16:09:09 -07:00
Junio C Hamano
afd7bd2220 index-v4: document the entry format
Document the format so that others can learn from and build on top of
the series.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-27 16:03:31 -07:00
Carlos Martín Nieto
e44b6df90c Documentation: clarify the invalidated tree entry format
When the entry_count is -1, the tree is invalidated and therefore has
not associated hash (or object name). Explicitly state that the next
entry starts after the newline.

Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-31 18:23:49 -07:00
Junio C Hamano
23fcc98f7f doc: technical details about the index file format
* Clarify "string of unsigned bytes";

 * Blob has two variants (regular file vs symlink), not (blob vs symlink);

 * Clarify permission mode bits;

 * Clarify ce_namelen() "too long to fit in the length field" case;

 * Clarify "." etc are forbidden as path components;

 * Match the description with the internal wording "cache-tree";

 * All types of extension begin with signature and length as explained in
   the first part. Don't repeat the "length" part in the description of
   each extension (can be mistaken as if there is a separate 32-bit size
   field inside the extension), but state what the signature for each
   extension is.

 * Don't say "Extension tag", as we have said "Extension signature" in the
   first part---be consistent;

 * Clarify the invalidation of cache-tree entries;

 * Correct description on subtree_nr field in the cache-tree;

 * Clarify the order of entries in cache-tree;

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-23 15:57:10 -07:00
Nguyễn Thái Ngọc Duy
8c7d05171e doc: technical details about the index file format
This bases on the original work by Robin Rosenberg.

Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-27 01:21:27 -08:00