Git with broken hash generation to generate collisions between object IDs. Don't use this! https://undefinedbehavior.de/posts/commit-vandalism/
Jeff King 6d9617f4f7 delta_base_cache: drop special treatment of blobs
When the delta base cache runs out of allowed memory, it has
to drop entries. It does so by walking an LRU list, dropping
objects until we are under the memory limit. But we actually
walk the list twice: once to drop blobs, and then again to
drop other objects (which are generally trees). This comes
from 18bdec1 (Limit the size of the new delta_base_cache,
2007-03-19).

This performs poorly as the number of entries grows, because
any time dropping blobs does not satisfy the limit, we have
to walk the _entire_ list, trees included, looking for blobs
to drop, before starting to drop any trees.
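
To make that cost concrete, here is a minimal C sketch of the
two-pass walk. It is a simplified model, not the code in
sha1_file.c: the struct layout and the names cache_add,
evict_two_pass, and demo_tree_only_visits are hypothetical,
and the "visited" counter exists only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum obj_type { OBJ_TREE, OBJ_BLOB };	/* simplified stand-ins */

struct entry {
	struct entry *prev, *next;	/* LRU links; head.next is the oldest */
	enum obj_type type;
	size_t size;
};

struct cache {
	struct entry head;		/* sentinel of a circular list */
	size_t used;			/* bytes held by live entries */
};

static void cache_init(struct cache *c)
{
	c->head.prev = c->head.next = &c->head;
	c->used = 0;
}

/* Append a new entry at the tail (most recently used). */
static void cache_add(struct cache *c, enum obj_type type, size_t size)
{
	struct entry *e = malloc(sizeof(*e));
	assert(e);
	e->type = type;
	e->size = size;
	e->prev = c->head.prev;
	e->next = &c->head;
	c->head.prev->next = e;
	c->head.prev = e;
	c->used += size;
}

static void cache_drop(struct cache *c, struct entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	c->used -= e->size;
	free(e);
}

/* Evict until "used" is within "limit"; return how many nodes were visited. */
static size_t evict_two_pass(struct cache *c, size_t limit)
{
	size_t visited = 0;
	struct entry *e, *next;

	/* Pass 1: drop blobs only, stepping over everything else. */
	for (e = c->head.next; c->used > limit && e != &c->head; e = next) {
		next = e->next;
		visited++;
		if (e->type == OBJ_BLOB)
			cache_drop(c, e);
	}
	/* Pass 2: drop whatever is left, oldest first. */
	for (e = c->head.next; c->used > limit && e != &c->head; e = next) {
		next = e->next;
		visited++;
		cache_drop(c, e);
	}
	return visited;
}

/* Tree-only cache of n one-byte entries (n >= 1), one byte over the limit. */
static size_t demo_tree_only_visits(size_t n)
{
	struct cache c;
	size_t i, visited;

	cache_init(&c);
	for (i = 0; i < n; i++)
		cache_add(&c, OBJ_TREE, 1);
	visited = evict_two_pass(&c, n - 1);
	evict_two_pass(&c, 0);		/* free the remaining entries */
	return visited;
}
```

In this model, demo_tree_only_visits(1000) reports 1001
visited nodes: 1000 from the fruitless blob pass, plus one
for the single tree that actually gets dropped.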

It's not generally a problem now, as the cache is limited to
only 256 entries. But as we could benefit from increasing
that in a future patch, it's worth looking at how it
performs as the cache size grows. And the answer is "not
well".

The table below shows times for various operations with
different values of MAX_DELTA_CACHE (which is not a run-time
knob; I recompiled with -DMAX_DELTA_CACHE=$n for each).

I chose "git log --raw" ("log-raw" in the table) because it
will access all of the trees, but no blobs at all (so in a
sense it is a worst case for this problem, because we will
always walk over the entire list of trees once before
realizing there are no blobs to drop). This is also
representative of other tree-only operations like "rev-list
--objects" and "git log -- <path>".

I also timed "git log -Sfoo --raw" ("log-S" in the table).
It similarly accesses all of the trees, but also the blobs
for each commit. It's representative of "git log -p", though
it emphasizes the cost of blob access more, as "-S" is
cheaper than computing an actual blob diff.

All timings are best-of-3 wall-clock times (though they all
were CPU bound, so the user CPU times are similar). The
repositories were fully packed with --depth=50, and the
default core.deltaBaseCacheLimit of 96M was in effect.  The
current value of MAX_DELTA_CACHE is 256, so I started there
and worked up by factors of 2.

First, here are values for git.git (the asterisk signals the
fastest run for each operation):

    MAX_DELTA_CACHE    log-raw       log-S
    ---------------   ---------    ---------
                256   0m02.212s    0m12.634s
                512   0m02.136s*   0m10.614s
               1024   0m02.156s    0m08.614s
               2048   0m02.208s    0m07.062s
               4096   0m02.190s    0m06.484s*
               8192   0m02.176s    0m07.635s
              16384   0m02.913s    0m19.845s
              32768   0m03.617s    1m05.507s
              65536   0m04.031s    1m18.488s

You can see that for the tree-only log-raw case, we don't
actually benefit that much as the cache grows (all the
differences up through 8192 are basically just noise; this
is probably because we don't actually have that many
distinct trees in git.git). For log-S, we get a definite
speed improvement as the cache grows at first, but the
improvement is lost at larger sizes, where the linear LRU
management starts to dominate.

Here's the same thing run against linux.git:

    MAX_DELTA_CACHE    log-raw       log-S
    ---------------   ---------    ----------
                256   0m40.987s     5m13.216s
                512   0m37.949s     5m03.243s
               1024   0m35.977s     4m50.580s
               2048   0m33.855s     4m39.818s*
               4096   0m32.913s     4m47.299s
               8192   0m32.176s*    5m14.650s
              16384   0m32.185s     6m31.625s
              32768   0m38.056s     9m31.136s
              65536   1m30.518s    17m38.549s

The pattern is similar, though the effect in log-raw is more
pronounced here. The times dip down in the middle, and then
go back up as we keep growing.

So we know there's a problem. What's the solution?

The obvious one is to improve the data structure to avoid
walking over tree entries during the looking-for-blobs
traversal. We can do this by keeping _two_ LRU lists: one
for blobs, and one for other objects. We drop items from the
blob LRU first, and then from the tree LRU (if necessary).
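
A rough C sketch of the two-list idea follows. It is an
illustration under simplifying assumptions, not the code
that was benchmarked: the names lru_shrink, evict_two_lists,
and demo_tree_only_visits are hypothetical, and entries are
reduced to a size and a pair of list links.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum obj_type { OBJ_TREE, OBJ_BLOB };	/* simplified stand-ins */

struct entry {
	struct entry *prev, *next;	/* LRU links; head.next is the oldest */
	size_t size;
};

struct lru {
	struct entry head;		/* sentinel of a circular list */
};

struct cache {
	struct lru blobs;		/* evicted first */
	struct lru others;		/* trees etc.; evicted only if needed */
	size_t used;			/* bytes held across both lists */
};

static void lru_init(struct lru *l)
{
	l->head.prev = l->head.next = &l->head;
}

static void cache_init(struct cache *c)
{
	lru_init(&c->blobs);
	lru_init(&c->others);
	c->used = 0;
}

/* Append a new entry at the tail of the list matching its type. */
static void cache_add(struct cache *c, enum obj_type type, size_t size)
{
	struct lru *l = (type == OBJ_BLOB) ? &c->blobs : &c->others;
	struct entry *e = malloc(sizeof(*e));
	assert(e);
	e->size = size;
	e->prev = l->head.prev;
	e->next = &l->head;
	l->head.prev->next = e;
	l->head.prev = e;
	c->used += size;
}

/* Drop from the head of one list until within "limit"; return nodes visited. */
static size_t lru_shrink(struct cache *c, struct lru *l, size_t limit)
{
	size_t visited = 0;

	while (c->used > limit && l->head.next != &l->head) {
		struct entry *e = l->head.next;
		e->prev->next = e->next;
		e->next->prev = e->prev;
		c->used -= e->size;
		free(e);
		visited++;
	}
	return visited;
}

/* Blobs go first; other objects are touched only if that was not enough. */
static size_t evict_two_lists(struct cache *c, size_t limit)
{
	size_t visited = lru_shrink(c, &c->blobs, limit);
	return visited + lru_shrink(c, &c->others, limit);
}

/* Tree-only cache of n one-byte entries (n >= 1), one byte over the limit. */
static size_t demo_tree_only_visits(size_t n)
{
	struct cache c;
	size_t i, visited;

	cache_init(&c);
	for (i = 0; i < n; i++)
		cache_add(&c, OBJ_TREE, 1);
	visited = evict_two_lists(&c, n - 1);
	lru_shrink(&c, &c.others, 0);	/* free the remaining entries */
	return visited;
}
```

Every node visited here is a node dropped, so in this model
demo_tree_only_visits(1000) reports a single visit no matter
how many trees are cached.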

Here's git.git using that strategy:

    MAX_DELTA_CACHE    log-raw      log-S
    ---------------   ---------   ----------
                256   0m02.264s   0m12.830s
                512   0m02.201s   0m10.771s
               1024   0m02.181s   0m08.593s
               2048   0m02.205s   0m07.116s
               4096   0m02.158s   0m06.537s*
               8192   0m02.213s   0m07.246s
              16384   0m02.155s*  0m10.975s
              32768   0m02.159s   0m16.047s
              65536   0m02.181s   0m16.992s

The upswing on log-raw is gone completely. But log-S still
has it (albeit much better than without this strategy).
Let's see what linux.git shows:

    MAX_DELTA_CACHE    log-raw       log-S
    ---------------   ---------    ---------
                256   0m42.519s    5m14.654s
                512   0m39.106s    5m04.708s
               1024   0m36.802s    4m51.454s
               2048   0m34.685s    4m39.378s*
               4096   0m33.663s    4m44.047s
               8192   0m33.157s    4m50.644s
              16384   0m33.090s*   4m49.648s
              32768   0m33.458s    4m53.371s
              65536   0m33.563s    5m04.580s

The results are similar. The tree-only case again performs
well (not surprising; we're literally just dropping the one
useless walk, and not otherwise changing the cache eviction
strategy at all). But the log-S case again does a bit worse
as the cache grows (though possibly that's within the noise,
which is much larger for this case).

Perhaps this is an indication that the "remove blobs first"
strategy is not actually optimal. The intent of it is to
avoid blowing out the tree cache when we see large blobs,
but it also means we'll throw away useful, recent blobs in
favor of older trees.

Let's run the same numbers without caring about object type
at all (i.e., one LRU list, and always evicting whatever is
at the head, regardless of type).
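
As a minimal C sketch of that type-blind policy (an
illustration only; cache_touch, evict_lru, and demo_survivor
are hypothetical names, and the entry layout is simplified):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum obj_type { OBJ_TREE, OBJ_BLOB };	/* simplified stand-ins */

struct entry {
	struct entry *prev, *next;	/* LRU links; head.next is the oldest */
	enum obj_type type;
	size_t size;
};

struct cache {
	struct entry head;		/* sentinel of a circular list */
	size_t used;			/* bytes held by live entries */
};

static void cache_init(struct cache *c)
{
	c->head.prev = c->head.next = &c->head;
	c->used = 0;
}

static void unlink_entry(struct entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void link_at_tail(struct cache *c, struct entry *e)
{
	e->prev = c->head.prev;
	e->next = &c->head;
	c->head.prev->next = e;
	c->head.prev = e;
}

static struct entry *cache_add(struct cache *c, enum obj_type type, size_t size)
{
	struct entry *e = malloc(sizeof(*e));
	assert(e);
	e->type = type;
	e->size = size;
	link_at_tail(c, e);
	c->used += size;
	return e;
}

/* A cache hit moves the entry to the most-recently-used end. */
static void cache_touch(struct cache *c, struct entry *e)
{
	unlink_entry(e);
	link_at_tail(c, e);
}

/* Drop from the head, whatever its type, until within "limit". */
static size_t evict_lru(struct cache *c, size_t limit)
{
	size_t dropped = 0;

	while (c->used > limit && c->head.next != &c->head) {
		struct entry *e = c->head.next;
		unlink_entry(e);
		c->used -= e->size;
		free(e);
		dropped++;
	}
	return dropped;
}

/* An old tree and a recently used blob compete for one slot. */
static enum obj_type demo_survivor(void)
{
	struct cache c;
	struct entry *blob;
	enum obj_type survivor;

	cache_init(&c);
	blob = cache_add(&c, OBJ_BLOB, 10);
	cache_add(&c, OBJ_TREE, 10);
	cache_touch(&c, blob);		/* list order is now: tree, blob */
	evict_lru(&c, 10);		/* drops only the older tree */
	survivor = c.head.next->type;
	evict_lru(&c, 0);		/* free what is left */
	return survivor;
}
```

In this model demo_survivor() returns OBJ_BLOB: the recently
used blob stays and the older tree goes, which is the
opposite of what a "remove blobs first" policy would do with
the same access pattern.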

Here's git.git:

    MAX_DELTA_CACHE    log-raw      log-S
    ---------------   ---------   ---------
                256   0m02.227s   0m12.821s
                512   0m02.143s   0m10.602s
               1024   0m02.127s*  0m08.642s
               2048   0m02.148s   0m07.123s
               4096   0m02.194s   0m06.448s*
               8192   0m02.239s   0m06.504s
              16384   0m02.144s   0m06.502s
              32768   0m02.202s   0m06.622s
              65536   0m02.230s   0m06.677s

Much smoother; there's no dramatic upswing as we increase
the cache size (some remains, though it's small enough that
it's mostly run-to-run noise. E.g., in the log-raw case,
note how 8192 is 50-100ms higher than its neighbors). Note
also that we stop getting any real benefit for log-S after
about 4096 entries; that number will depend on the size of
the repository, the size of the blob entries, and the memory
limit of the cache.

Let's see what linux.git shows for the same strategy:

    MAX_DELTA_CACHE    log-raw      log-S
    ---------------   ---------   ---------
                256   0m41.661s   5m12.410s
                512   0m39.547s   5m07.920s
               1024   0m37.054s   4m54.666s
               2048   0m35.871s   4m41.194s*
               4096   0m34.646s   4m51.648s
               8192   0m33.881s   4m55.342s
              16384   0m35.190s   5m00.122s
              32768   0m35.060s   4m58.851s
              65536   0m33.311s*  4m51.420s

It's similarly good. As with the "separate blob LRU"
strategy, there's a lot of noise on the log-S run here. But
it's certainly not any worse, is possibly a bit better, and
the improvement over "separate blob LRU" on the git.git case
is dramatic.

So it seems like a clear winner, and that's what this patch
implements.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-23 14:57:44 -07:00
block-sha1
builtin Merge branch 'sb/checkout-explit-detach-no-advice' 2016-08-19 15:34:15 -07:00
ci
compat Merge branch 'js/nedmalloc-gcc6-warnings' into maint 2016-08-10 11:55:31 -07:00
contrib git-multimail: update to release 1.4.0 2016-08-17 11:36:08 -07:00
Documentation Git 2.10-rc1 2016-08-19 15:39:33 -07:00
ewah
git-gui
gitk-git
gitweb gitweb: escape link body in format_ref_marker 2016-08-01 12:55:40 -07:00
mergetools
perl
po
ppc sha1: provide another level of indirection for the SHA-1 functions 2015-11-05 10:35:11 -08:00
refs Merge branch 'mh/ref-iterators' 2016-07-25 14:13:33 -07:00
t Merge branch 'ab/hooks' 2016-08-19 15:34:16 -07:00
templates push options: {pre,post}-receive hook learns about push options 2016-07-14 15:50:17 -07:00
vcs-svn
xdiff Merge branch 'js/ignore-space-at-eol' into maint 2016-08-08 14:21:35 -07:00
.gitattributes
.gitignore
.mailmap .mailmap: use Christian Couder's Tuxfamily address 2016-08-08 15:17:24 -07:00
.travis.yml Merge branch 'ls/travis-enable-httpd-tests' 2016-07-25 14:13:40 -07:00
abspath.c
aclocal.m4
advice.c
advice.h
alias.c
alloc.c alloc: factor out commit index 2014-07-13 18:59:05 -07:00
archive-tar.c Merge branch 'jk/big-and-future-archive-tar' 2016-08-12 09:47:37 -07:00
archive-zip.c
archive.c i18n: archive: mark errors for translation 2016-08-09 12:44:59 -07:00
archive.h
argv-array.c
argv-array.h
attr.c
attr.h
base85.c
bisect.c
bisect.h
blob.c
blob.h
branch.c
branch.h
builtin.h Merge branch 'sb/submodule-helper' 2015-10-05 12:30:19 -07:00
bulk-checkin.c
bulk-checkin.h cleanups: ensure that git-compat-util.h is included first 2014-09-15 12:05:14 -07:00
bundle.c
bundle.h
cache-tree.c
cache-tree.h
cache.h Merge branch 'jk/trace-fixup' 2016-08-12 09:47:36 -07:00
check_bindir check_bindir: avoid "test <cond> -a/-o <cond>" 2014-06-09 14:47:06 -07:00
check-builtins.sh
check-racy.c
color.c
color.h Merge branch 'js/color-on-windows-comment' into maint 2016-07-28 11:25:55 -07:00
column.c
column.h
combine-diff.c
command-list.txt
commit-slab.h Merge branch 'vs/typofix' 2016-08-12 09:47:37 -07:00
commit.c Merge branch 'rs/pull-signed-tag' 2016-08-19 15:34:14 -07:00
commit.h Merge branch 'rs/pull-signed-tag' 2016-08-19 15:34:14 -07:00
common-main.c
config.c i18n: config: unfold error messages marked for translation 2016-07-28 09:11:09 -07:00
config.mak.in Merge branch 'jc/remove-export-from-config-mak-in' 2013-04-01 09:00:02 -07:00
config.mak.uname Merge branch 'ew/build-time-pager-tweaks' 2016-08-08 14:48:44 -07:00
configure.ac Merge branch 'ew/autoconf-pthread' into maint 2016-08-10 11:55:21 -07:00
connect.c
connect.h
connected.c check_connected: add progress flag 2016-07-20 12:11:09 -07:00
connected.h check_connected: add progress flag 2016-07-20 12:11:09 -07:00
convert.c convert: Correct NNO tests and missing LF will be replaced by CRLF 2016-08-14 13:45:52 -07:00
convert.h
copy.c copy.c: use error_errno() 2016-05-09 12:29:08 -07:00
COPYING
credential-cache--daemon.c
credential-cache.c
credential-store.c
credential.c
credential.h
csum-file.c
csum-file.h
ctype.c
daemon.c Merge branch 'ew/daemon-socket-keepalive' 2016-07-28 10:34:43 -07:00
date.c date: add "unix" format 2016-07-27 14:15:51 -07:00
decorate.c
decorate.h
delta.h comments: fix misuses of "nor" 2014-03-31 15:29:27 -07:00
diff-delta.c
diff-lib.c
diff-no-index.c
diff.c Merge branch 'kw/patch-ids-optim' 2016-08-12 09:47:39 -07:00
diff.h patch-ids: add flag to create the diff patch id using header only data 2016-07-29 14:10:01 -07:00
diffcore-break.c
diffcore-delta.c
diffcore-order.c
diffcore-pickaxe.c diffcore-pickaxe: support case insensitive match on non-ascii 2016-07-01 12:44:57 -07:00
diffcore-rename.c pass constants as first argument to st_mult() 2016-08-01 14:01:03 -07:00
diffcore.h
dir-iterator.c
dir-iterator.h
dir.c Merge branch 'rs/use-strbuf-addbuf' into maint 2016-08-08 14:21:42 -07:00
dir.h Merge branch 'mh/split-under-lock' 2016-07-25 14:13:32 -07:00
editor.c
entry.c
environment.c
exec_cmd.c
exec_cmd.h prepare_{git,shell}_cmd: use argv_array 2016-02-22 14:51:09 -08:00
fast-import.c Merge branch 'jk/common-main' 2016-07-19 13:22:19 -07:00
fetch-pack.c fetch-pack: grow stateless RPC windows exponentially 2016-07-19 13:27:22 -07:00
fetch-pack.h
fmt-merge-msg.h
fsck.c
fsck.h
generate-cmdlist.sh
gettext.c
gettext.h
git-add--interactive.perl
git-archimport.perl git-archimport: use a lowercase "usage:" string 2013-02-24 13:31:06 -08:00
git-bisect.sh
git-compat-util.h Merge branch 'jk/tighten-alloc' 2016-08-17 14:07:46 -07:00
git-cvsexportcommit.perl
git-cvsimport.perl Merge branch 'cn/cvsimport-perl-update' 2015-06-25 11:08:08 -07:00
git-cvsserver.perl
git-difftool--helper.sh difftool: always honor fatal error exit codes 2016-08-15 15:24:05 -07:00
git-difftool.perl difftool: use Git::* functions instead of passing around state 2016-07-28 14:01:55 -07:00
git-filter-branch.sh
git-instaweb.sh
git-merge-octopus.sh
git-merge-one-file.sh
git-merge-resolve.sh
git-mergetool--lib.sh
git-mergetool.sh Merge branch 'nf/mergetool-prompt' 2016-05-03 14:08:17 -07:00
git-p4.py Spelling fixes 2016-08-11 14:35:42 -07:00
git-parse-remote.sh
git-quiltimport.sh
git-rebase--am.sh
git-rebase--interactive.sh Merge branch 'js/rebase-i-progress-tidy' 2016-08-08 14:48:38 -07:00
git-rebase--merge.sh
git-rebase.sh
git-relink.perl
git-remote-testgit.sh
git-request-pull.sh
git-send-email.perl
git-sh-i18n.sh
git-sh-setup.sh Merge branch 'ew/build-time-pager-tweaks' 2016-08-08 14:48:44 -07:00
git-stash.sh i18n: git-stash: mark messages for translation 2016-08-10 10:50:18 -07:00
git-submodule.sh Merge branch 'sb/submodule-update-dot-branch' 2016-08-10 12:33:20 -07:00
git-svn.perl git-svn: allow --version to work anywhere 2016-07-22 20:38:11 +00:00
GIT-VERSION-GEN Git 2.10-rc1 2016-08-19 15:39:33 -07:00
git-web--browse.sh
git.c
git.rc
gpg-interface.c Merge branch 'lt/gpg-show-long-key-in-signature-verification-maint' into lt/gpg-show-long-key-in-signature-verification 2016-08-16 15:04:13 -07:00
gpg-interface.h
graph.c Merge branch 'js/log-to-diffopt-file' 2016-07-19 13:22:15 -07:00
graph.h
grep.c Merge branch 'js/am-3-merge-recursive-direct' 2016-08-10 12:33:20 -07:00
grep.h Merge branch 'jc/grep-commandline-vs-configuration' into maint 2016-08-10 11:55:29 -07:00
hashmap.c
hashmap.h hashmap: add string interning API 2014-07-07 13:56:38 -07:00
help.c
help.h
hex.c
http-backend.c Merge branch 'ew/http-backend-batch-headers' 2016-08-12 09:47:38 -07:00
http-fetch.c
http-push.c Merge branch 'rs/use-strbuf-addstr' 2016-08-08 14:48:41 -07:00
http-walker.c
http.c Merge branch 'rs/use-strbuf-addstr' 2016-08-08 14:48:41 -07:00
http.h
ident.c Merge branch 'jk/reset-ident-time-per-commit' into maint 2016-08-12 09:16:56 -07:00
imap-send.c die("bug"): report bugs consistently 2016-07-26 11:13:44 -07:00
INSTALL
iterator.h
khash.h
kwset.c
kwset.h
levenshtein.c
levenshtein.h
LGPL-2.1 provide a copy of the LGPLv2.1 2011-05-19 18:23:17 -07:00
line-log.c
line-log.h line-log.c: make line_log_data_init() static 2015-01-15 11:05:47 -08:00
line-range.c
line-range.h
list-objects.c
list-objects.h
list.h
ll-merge.c
ll-merge.h
lockfile.c lockfile: improve error message when lockfile exists 2016-03-01 10:16:46 -08:00
lockfile.h
log-tree.c Merge branch 'nd/log-decorate-color-head-arrow' 2016-08-08 14:48:42 -07:00
log-tree.h
mailinfo.c Merge branch 'rs/mailinfo-lib' 2016-08-17 14:07:47 -07:00
mailinfo.h
mailmap.c
mailmap.h
Makefile Merge branch 'ew/build-time-pager-tweaks' 2016-08-08 14:48:44 -07:00
match-trees.c
merge-blobs.c
merge-blobs.h
merge-recursive.c Merge branch 'rs/pull-signed-tag' 2016-08-19 15:34:14 -07:00
merge-recursive.h merge-recursive: offer an option to retain the output in 'obuf' 2016-08-01 11:45:30 -07:00
merge.c
mergesort.c
mergesort.h mergesort: rename it to llist_mergesort() 2012-04-17 11:07:01 -07:00
mru.c add generic most-recently-used list 2016-07-29 11:05:07 -07:00
mru.h add generic most-recently-used list 2016-07-29 11:05:07 -07:00
name-hash.c
notes-cache.c
notes-cache.h
notes-merge.c Merge branch 'rs/notes-merge-no-toctou' 2016-07-28 10:34:41 -07:00
notes-merge.h
notes-utils.c notes: allow treeish expressions as notes ref 2016-01-12 15:10:01 -08:00
notes-utils.h
notes.c
notes.h Merge branch 'jk/notes-merge-from-anywhere' 2016-02-03 14:15:59 -08:00
object.c
object.h
pack-bitmap-write.c
pack-bitmap.c
pack-bitmap.h pack-bitmap.c: make pack_bitmap_filename() static 2015-01-15 11:04:10 -08:00
pack-check.c
pack-objects.c
pack-objects.h
pack-revindex.c
pack-revindex.h
pack-write.c sha1_file: drop free_pack_by_name 2016-07-29 11:05:06 -07:00
pack.h
pager.c pager: move pager-specific setup into the build 2016-08-04 13:51:02 -07:00
parse-options-cb.c Merge branch 'jk/parse-options-concat' 2016-08-03 15:10:25 -07:00
parse-options.c
parse-options.h
patch-delta.c
patch-ids.c rebase: avoid computing unnecessary patch IDs 2016-08-11 14:39:16 -07:00
patch-ids.h rebase: avoid computing unnecessary patch IDs 2016-08-11 14:39:16 -07:00
path.c Merge branch 'ab/hooks' 2016-08-19 15:34:16 -07:00
pathspec.c
pathspec.h
pkt-line.c
pkt-line.h
preload-index.c
pretty.c Merge branch 'rs/use-strbuf-add-unique-abbrev' 2016-08-12 09:47:37 -07:00
prio-queue.c
prio-queue.h
progress.c
progress.h
prompt.c
prompt.h
quote.c Merge branch 'nd/icase' into maint 2016-07-28 11:26:03 -07:00
quote.h Merge branch 'nd/icase' into maint 2016-07-28 11:26:03 -07:00
reachable.c
reachable.h
read-cache.c Merge branch 'jc/renormalize-merge-kill-safer-crlf' 2016-07-25 14:13:39 -07:00
README.md
ref-filter.c
ref-filter.h
reflog-walk.c
reflog-walk.h
refs.c pass constants as first argument to st_mult() 2016-08-01 14:01:03 -07:00
refs.h Merge branch 'mh/ref-iterators' 2016-07-25 14:13:33 -07:00
RelNotes Some fixes for 2.9.3 2016-07-28 11:28:32 -07:00
remote-curl.c
remote-testsvn.c common-main: call git_extract_argv0_path() 2016-07-01 15:09:10 -07:00
remote.c Merge branch 'jk/push-force-with-lease-creation' 2016-08-10 12:33:18 -07:00
remote.h Merge branch 'jk/push-force-with-lease-creation' 2016-08-10 12:33:18 -07:00
replace_object.c
rerere.c Merge branch 'jc/rerere-multi' 2016-05-23 14:54:38 -07:00
rerere.h
resolve-undo.c
resolve-undo.h
revision.c Merge branch 'kw/patch-ids-optim' 2016-08-12 09:47:39 -07:00
revision.h
run-command.c Merge branch 'ab/hooks' 2016-08-19 15:34:16 -07:00
run-command.h
send-pack.c Merge branch 'rs/use-strbuf-addstr' into maint 2016-08-10 11:55:34 -07:00
send-pack.h
sequencer.c Merge branch 'js/am-3-merge-recursive-direct' 2016-08-10 12:33:20 -07:00
sequencer.h
server-info.c
setup.c i18n: setup: mark error messages for translation 2016-08-09 12:44:59 -07:00
sh-i18n--envsubst.c add an extra level of indirection to main() 2016-07-01 15:09:10 -07:00
sha1_file.c delta_base_cache: drop special treatment of blobs 2016-08-23 14:57:44 -07:00
sha1_name.c
sha1-array.c
sha1-array.h
sha1-lookup.c
sha1-lookup.h
shallow.c pass constants as first argument to st_mult() 2016-08-01 14:01:03 -07:00
shell.c
shortlog.h
show-index.c
sideband.c Merge branch 'lf/recv-sideband-cleanup' into maint 2016-08-08 14:21:41 -07:00
sideband.h
sigchain.c
sigchain.h
split-index.c
split-index.h
strbuf.c Merge branch 'rs/use-strbuf-addbuf' into maint 2016-08-08 14:21:42 -07:00
strbuf.h Merge branch 'rs/use-strbuf-addbuf' into maint 2016-08-08 14:21:42 -07:00
streaming.c
streaming.h
string-list.c
string-list.h Merge branch 'sb/string-list' 2014-12-22 12:27:30 -08:00
submodule-config.c Merge branch 'sb/submodule-update-dot-branch' 2016-08-10 12:33:20 -07:00
submodule-config.h submodule-config: keep configured branch around 2016-08-01 14:42:07 -07:00
submodule.c Merge branch 'bc/cocci' 2016-07-19 13:22:16 -07:00
submodule.h
symlinks.c symlinks: remove PATH_MAX limitation 2014-07-07 11:22:42 -07:00
tag.c
tag.h
tar.h
tempfile.c
tempfile.h
thread-utils.c
thread-utils.h
trace.c trace: do not fall back to stderr 2016-08-05 09:28:17 -07:00
trace.h
trailer.c die("bug"): report bugs consistently 2016-07-26 11:13:44 -07:00
trailer.h
transport-helper.c Spelling fixes 2016-08-11 14:35:42 -07:00
transport.c Merge branch 'rs/use-strbuf-add-unique-abbrev' 2016-08-12 09:47:37 -07:00
transport.h
tree-diff.c
tree-walk.c
tree-walk.h
tree.c
tree.h
unicode_width.h
unimplemented.sh unimplemented.sh: use the $( ... ) construct for command substitution 2015-12-27 15:33:13 -08:00
unix-socket.c Merge branch 'rs/strbuf-getcwd' 2014-09-02 13:28:44 -07:00
unix-socket.h credentials: add "cache" helper 2011-12-11 23:16:25 -08:00
unpack-trees.c
unpack-trees.h
update_unicode.sh
upload-pack.c Spelling fixes 2016-08-11 14:35:42 -07:00
url.c
url.h
urlmatch.c
urlmatch.h
usage.c
userdiff.c userdiff: add built-in pattern for CSS 2016-06-03 14:45:56 -07:00
userdiff.h
utf8.c
utf8.h
varint.c
varint.h
version.c
version.h
versioncmp.c
walker.c
walker.h walker: let walker_say take arbitrary formats 2016-07-08 10:11:23 -07:00
wildmatch.c
wildmatch.h
worktree.c Merge branch 'nd/worktree-lock' 2016-07-28 10:34:42 -07:00
worktree.h
wrap-for-bin.sh
wrapper.c Merge branch 'sb/submodule-parallel-fetch' into maint 2016-07-28 11:26:02 -07:00
write_or_die.c write_or_die: drop write_or_whine_pipe() 2016-08-05 09:28:17 -07:00
ws.c
wt-status.c Merge branch 'js/am-3-merge-recursive-direct' 2016-08-10 12:33:20 -07:00
wt-status.h
xdiff-interface.c
xdiff-interface.h
zlib.c

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from http://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org. The mailing list archives are available at http://news.gmane.org/gmane.comp.version-control.git/, http://marc.info/?l=git and other archival sites.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks