Git with broken hash generation to generate collisions between object IDs. Don't use this! https://undefinedbehavior.de/posts/commit-vandalism/
Taylor Blau 59f0d5073f bloom: encode out-of-bounds filters as non-empty
When a changed-path Bloom filter has either zero, or more than a
certain number (commonly 512) of entries, the commit-graph machinery
encodes it as "missing". More specifically, it sets the adjacent indices
in the BIDX chunk equal to each other to indicate a "length 0"
filter; that is, that the filter occupies zero bytes on disk.
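
As a rough illustration of the "length 0" encoding described above, the
sketch below derives a filter's on-disk length from two adjacent BIDX
entries. It assumes the entries are cumulative end offsets that have
already been decoded into host byte order; the helper name filter_len()
and its parameters are hypothetical and are not Git's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not Git's actual function): given BIDX entries
 * already decoded into host byte order, return the on-disk length in
 * bytes of the filter for the commit at position `pos`.
 */
static uint32_t filter_len(const uint32_t *bidx, size_t nr, size_t pos)
{
	uint32_t start, end;

	assert(pos < nr);
	/* The entry "before the first" is treated as offset 0. */
	start = pos ? bidx[pos - 1] : 0;
	end = bidx[pos];

	/* Equal adjacent entries mean the filter occupies zero bytes. */
	return end - start;
}
```

With entries { 4, 4, 9 }, the filter at position 1 has length 0, which
is the representation that the old encoding treats as "missing".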

This has heretofore been fine, since the commit-graph machinery has no
need to care about these filters with too few or too many changed paths.
Both cases act like no filter has been generated at all, and so there is
no need to store them.

In a subsequent commit, however, the commit-graph machinery will learn
to only compute Bloom filters for some commits in the current
commit-graph layer. This is a change from the current implementation
which computes Bloom filters for all commits that are in the layer being
written. Critically for this patch, only computing some of the Bloom
filters means adding a third state for length 0 Bloom filters: zero
entries, too many entries, or "hasn't been computed".

It will be important for that future patch to distinguish between "not
representable" (i.e., zero or too-many changed paths), and "hasn't been
computed". In particular, we don't want to waste time recomputing
filters that have already been computed.

To that end, change how we store Bloom filters in the "computed but not
representable" category:

  - Bloom filters with no entries are stored as a single byte with all
    bits low (i.e., all queries to that Bloom filter will return
    "definitely not")

  - Bloom filters with too many entries are stored as a single byte with
    all bits set high (i.e., all queries to that Bloom filter will
    return "maybe").
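
The two rules above can be sketched as follows. This is a minimal
illustration under stated assumptions, not the code in bloom.c: the
names classify(), sentinel_byte() and filter_maybe_contains() are
hypothetical, the limit of 512 is the "commonly 512" value mentioned
earlier, and the query takes precomputed hash values rather than
hashing paths itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit; the text above only says "commonly 512". */
#define MAX_CHANGED_PATHS 512

enum filter_kind { FILTER_NORMAL, FILTER_EMPTY, FILTER_TOO_LARGE };

/* Decide which category a commit falls into by its changed-path count. */
static enum filter_kind classify(size_t nr_changed_paths)
{
	if (!nr_changed_paths)
		return FILTER_EMPTY;
	if (nr_changed_paths > MAX_CHANGED_PATHS)
		return FILTER_TOO_LARGE;
	return FILTER_NORMAL;
}

/* The single byte stored for a "computed but not representable" filter. */
static uint8_t sentinel_byte(enum filter_kind kind)
{
	assert(kind != FILTER_NORMAL);
	return kind == FILTER_EMPTY ? 0x00 : 0xff;
}

/*
 * Generic Bloom query: returns 1 for "maybe" and 0 for "definitely
 * not". Every hash must map to a bit that is set.
 */
static int filter_maybe_contains(const uint8_t *data, size_t len,
				 const uint32_t *hashes, size_t nr_hashes)
{
	size_t i;

	assert(len > 0);
	for (i = 0; i < nr_hashes; i++) {
		uint64_t bit = hashes[i] % (len * 8);

		if (!(data[bit / 8] & (1u << (bit % 8))))
			return 0;
	}
	return 1;
}
```

Because the query only tests bits, a 0x00 byte answers "definitely not"
and a 0xff byte answers "maybe" for any path, which matches how readers
treated these commits before.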

These rules are sufficient to not incur a behavior change by changing
the on-disk representation of these two classes. Likewise, no
specification changes are necessary for the commit-graph format, either:

  - Filters that were previously empty will be recomputed and stored
    according to the new rules, and

  - old clients reading filters generated by new clients will interpret
    the filters correctly and be none the wiser to how they were
    generated.

Clients will invoke the Bloom machinery in more cases than before, but
this can be addressed in a future patch by returning a NULL filter when
all bits are set high.

Note that this does increase the size of on-disk commit-graphs, but far
less than other proposals. In particular, this is generally more
efficient than storing a bitmap for which commits haven't computed their
Bloom filters. Storing a bitmap incurs a penalty of one bit per commit,
whereas storing explicit filters as above incurs a penalty of one byte
per too-large or empty commit.

In practice, these boundary commits likely occupy a small proportion of
the overall number of commits, and so the size penalty is likely smaller
than storing a bitmap for all commits.
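
The trade-off above comes down to simple arithmetic, sketched here with
hypothetical helper names. Using only the costs stated above (one bit
per commit for a bitmap, one byte per empty or too-large commit for
explicit filters), explicit filters cost less whenever fewer than one
commit in eight is such a boundary commit.

```c
#include <assert.h>
#include <stdint.h>

/* Bitmap cost: one bit per commit, rounded up to whole bytes. */
static uint64_t bitmap_cost_bytes(uint64_t nr_commits)
{
	return (nr_commits + 7) / 8;
}

/* Explicit cost: one byte per empty or too-large commit. */
static uint64_t explicit_cost_bytes(uint64_t nr_boundary_commits)
{
	return nr_boundary_commits;
}

/* Returns 1 when storing explicit one-byte filters is the cheaper option. */
static int explicit_is_smaller(uint64_t nr_commits,
			       uint64_t nr_boundary_commits)
{
	assert(nr_boundary_commits <= nr_commits);
	return explicit_cost_bytes(nr_boundary_commits) <
	       bitmap_cost_bytes(nr_commits);
}
```

For a made-up repository of 100,000 commits with 500 boundary commits,
the bitmap would cost 12,500 bytes and the explicit filters 500 bytes.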

See, for example, these relative proportions of such boundary commits
(collected by SZEDER Gábor):

                  |     Percentage of     |    commit-graph   |           |
                  |   commits modifying   |     file size     |           |
                  ├────────┬──────────────┼───────────────────┤    pct.   |
                  | 0 path | >= 512 paths | before  |  after  |   change  |
 ┌────────────────┼────────┼──────────────┼─────────┼─────────┼───────────┤
 | android-base   | 13.20% |        0.13% | 37.468M | 37.534M | +0.1741 % |
 | cmssw          |  0.15% |        0.23% | 17.118M | 17.119M | +0.0091 % |
 | cpython        |  3.07% |        0.01% |  7.967M |  7.971M | +0.0423 % |
 | elasticsearch  |  0.70% |        1.00% |  8.833M |  8.835M | +0.0128 % |
 | gcc            |  0.00% |        0.08% | 16.073M | 16.074M | +0.0030 % |
 | gecko-dev      |  0.14% |        0.64% | 59.868M | 59.874M | +0.0105 % |
 | git            |  0.11% |        0.02% |  3.895M |  3.895M | +0.0020 % |
 | glibc          |  0.02% |        0.10% |  3.555M |  3.555M | +0.0021 % |
 | go             |  0.00% |        0.07% |  3.186M |  3.186M | +0.0018 % |
 | homebrew-cask  |  0.40% |        0.02% |  7.035M |  7.035M | +0.0065 % |
 | homebrew-core  |  0.01% |        0.01% | 11.611M | 11.611M | +0.0002 % |
 | jdk            |  0.26% |        5.64% |  5.537M |  5.540M | +0.0590 % |
 | linux          |  0.01% |        0.51% | 63.735M | 63.740M | +0.0073 % |
 | llvm-project   |  0.12% |        0.03% | 25.515M | 25.516M | +0.0050 % |
 | rails          |  0.10% |        0.10% |  6.252M |  6.252M | +0.0027 % |
 | rust           |  0.07% |        0.17% |  9.364M |  9.364M | +0.0033 % |
 | tensorflow     |  0.09% |        1.02% |  7.009M |  7.010M | +0.0158 % |
 | webkit         |  0.05% |        0.31% | 17.405M | 17.406M | +0.0047 % |

(where the above increase is determined by computing a non-split
commit-graph before and after this patch).

Given that these projects are all "large" by commit count, the storage
cost of writing these filters explicitly is negligible. In the most
extreme example, android-base (which has 494,848 commits at the time of
writing) would have its commit-graph increase by a modest 68.4 KB.

Finally, a test to exercise filters which contain too many changed-path
entries will be introduced in a subsequent patch.

Suggested-by: SZEDER Gábor <szeder.dev@gmail.com>
Suggested-by: Jakub Narębski <jnareb@gmail.com>
Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17 21:55:50 -07:00
.github Merge branch 'es/advertise-contribution-doc' 2020-06-17 21:54:06 -07:00
block-sha1
builtin commit-graph: pass a 'struct repository *' in more places 2020-09-09 12:51:48 -07:00
ci ci: use absolute PYTHON_PATH in the Linux jobs 2020-07-23 15:32:06 -07:00
compat Merge branch 'js/msvc-build-fix' 2020-06-17 21:54:03 -07:00
contrib Merge branch 'mp/complete-show-color-moved' 2020-08-04 13:53:56 -07:00
Documentation bloom: encode out-of-bounds filters as non-empty 2020-09-17 21:55:50 -07:00
ewah Merge branch 'jk/object-filter-with-bitmap' 2020-03-02 15:07:18 -08:00
git-gui Merge https://github.com/prati0100/git-gui into master 2020-07-20 12:04:06 -07:00
gitk-git Merge gitk to pick up emergency build fix 2019-09-17 14:59:18 -07:00
gitweb Merge branch 'eb/gitweb-more-trailers' 2020-05-01 13:39:56 -07:00
mergetools
negotiator
perl perl: make SVN code hash independent 2020-06-22 11:21:07 -07:00
po Merge branch 'master' of github.com:Softcatala/git-po 2020-07-27 00:05:41 +08:00
ppc
refs refs: move the logic to add \t to reflog to the files backend 2020-07-31 10:21:51 -07:00
sha1collisiondetection@855827c583 sha1dc: update from upstream 2019-05-14 16:45:01 +09:00
sha1dc Merge branch 'jk/lore-is-the-archive' 2019-12-06 15:09:23 -08:00
sha256 hash: implement and use a context cloning function 2020-02-24 09:33:21 -08:00
t bloom: encode out-of-bounds filters as non-empty 2020-09-17 21:55:50 -07:00
templates Merge branch 'kw/fsmonitor-watchman-racefix' 2020-02-14 12:54:20 -08:00
trace2 trace2: teach Git to log environment variables 2020-03-23 13:14:53 -07:00
vcs-svn
xdiff Merge branch 'rs/xdiff-ignore-ws-w-func-context' 2019-12-16 13:08:32 -08:00
.cirrus.yml CI: add FreeBSD CI support via Cirrus-CI 2019-12-20 12:09:12 -08:00
.clang-format clang-format: use git grep to generate the ForEachMacros list 2019-06-04 14:50:40 -07:00
.editorconfig editorconfig: indent text files with tabs 2020-01-06 08:46:32 -08:00
.gitattributes Fix build with core.autocrlf=true 2019-09-24 19:48:27 +05:30
.gitignore Merge branch 'es/bugreport' 2020-05-01 13:39:59 -07:00
.gitmodules
.mailmap Merge branch 'bc/wildcard-credential' 2020-03-05 10:43:02 -08:00
.travis.yml ci: fix the jobname of the GETTEXT_POISON job 2020-04-07 22:17:10 -07:00
.tsan-suppressions replace-object: make replace operations thread-safe 2020-01-17 13:52:14 -08:00
abspath.c real_path_if_valid(): remove unsafe API 2020-03-10 11:41:40 -07:00
aclocal.m4
add-interactive.c interactive: refactor code asking the user for interactive input 2020-04-10 10:26:31 -07:00
add-interactive.h built-in add -p: respect the interactive.singlekey config setting 2020-01-15 12:06:17 -08:00
add-patch.c comment: fix spelling mistakes inside comments 2020-07-29 11:39:40 -07:00
advice.c Merge branch 'hw/advise-ng' 2020-03-25 13:57:41 -07:00
advice.h Merge branch 'hw/advise-ng' 2020-03-25 13:57:41 -07:00
alias.c
alias.h
alloc.c commit: move members graph_pos, generation to a slab 2020-06-17 14:37:30 -07:00
alloc.h object: drop parsed_object_pool->commit_count 2020-06-17 14:37:14 -07:00
apply.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
apply.h apply.h: include missing header 2019-09-28 14:04:16 +09:00
archive-tar.c parse_config_key(): return subsection len as size_t 2020-04-10 14:44:29 -07:00
archive-zip.c streaming: allow open_istream() to handle any repo 2020-01-31 10:45:39 -08:00
archive.c convert: provide additional metadata to filters 2020-03-16 11:37:02 -07:00
archive.h convert: provide additional metadata to filters 2020-03-16 11:37:02 -07:00
argv-array.c argv-array: add space after while 2019-11-20 13:29:02 +09:00
argv-array.h argv-array: move doc to argv-array.h 2019-11-18 15:21:29 +09:00
attr.c attr: move doc to attr.h 2019-11-18 15:21:28 +09:00
attr.h attr: move doc to attr.h 2019-11-18 15:21:28 +09:00
banned.h banned.h: fix vsprintf()'s ban message 2019-08-26 10:32:45 -07:00
base85.c
bisect.c bisect: stop referring to sha1_array 2020-03-30 10:59:08 -07:00
bisect.h bisect: libify bisect_next_all 2020-02-19 09:37:15 -08:00
blame.c bloom: split 'get_bloom_filter()' in two 2020-09-17 09:31:25 -07:00
blame.h blame: use changed-path Bloom filters 2020-04-16 15:38:06 -07:00
blob.c object: drop parsed_object_pool->commit_count 2020-06-17 14:37:14 -07:00
blob.h
bloom.c bloom: encode out-of-bounds filters as non-empty 2020-09-17 21:55:50 -07:00
bloom.h bloom: encode out-of-bounds filters as non-empty 2020-09-17 21:55:50 -07:00
branch.c Merge branch 'es/get-worktrees-unsort' 2020-07-06 22:09:15 -07:00
branch.h Merge branch 'nd/switch-and-restore' 2019-07-09 15:25:44 -07:00
bugreport.c Merge branch 'rs/retire-strbuf-write-fd' 2020-06-29 14:17:26 -07:00
builtin.h Lib-ify prune-packed 2020-03-24 15:04:44 -07:00
bulk-checkin.c bulk-checkin: zero-initialize hashfile_checkpoint 2019-09-06 11:03:39 -07:00
bulk-checkin.h
bundle.c bundle: detect hash algorithm when reading refs 2020-06-19 14:04:09 -07:00
bundle.h bundle: detect hash algorithm when reading refs 2020-06-19 14:04:09 -07:00
cache-tree.c sha1-file: pass git_hash_algo to hash_object_file() 2020-01-31 10:45:39 -08:00
cache-tree.h cache-tree: share code between functions writing an index as a tree 2019-08-19 10:08:03 -07:00
cache.h Merge branch 'jk/reject-newer-extensions-in-v0' into master 2020-07-30 13:20:32 -07:00
chdir-notify.c
chdir-notify.h
check_bindir
check-builtins.sh
checkout.c
checkout.h
CODE_OF_CONDUCT.md CODE_OF_CONDUCT: mention individual project-leader emails 2019-10-10 10:41:46 +09:00
color.c color.c: alias RGB colors 8-15 to aixterm colors 2020-02-11 11:19:00 -08:00
color.h
column.c comment: fix spelling mistakes inside comments 2020-07-29 11:39:40 -07:00
column.h
combine-diff.c oid_array: rename source file from sha1-array 2020-03-30 10:59:08 -07:00
command-list.txt bash-completion: add git-prune into bash completion 2020-06-22 11:29:38 -07:00
commit-graph.c bloom: encode out-of-bounds filters as non-empty 2020-09-17 21:55:50 -07:00
commit-graph.h commit-graph: pass a 'struct repository *' in more places 2020-09-09 12:51:48 -07:00
commit-reach.c Merge branch 'cb/is-descendant-of' 2020-07-06 22:09:16 -07:00
commit-reach.h commit-reach: avoid is_descendant_of() shim 2020-06-23 16:36:53 -07:00
commit-slab-decl.h Merge branch 'sg/commit-graph-cleanups' into master 2020-07-30 13:20:30 -07:00
commit-slab-impl.h commit-slab: add a function to deep free entries on the slab 2020-06-08 12:28:49 -07:00
commit-slab.h commit-slab: add a function to deep free entries on the slab 2020-06-08 12:28:49 -07:00
commit.c Merge branch 'tb/fix-persistent-shallow' into master 2020-07-09 14:00:44 -07:00
commit.h commit: move members graph_pos, generation to a slab 2020-06-17 14:37:30 -07:00
common-main.c common-main: delay trace2 initialization 2019-08-06 13:09:01 -07:00
config.c config: reject parsing of files over INT_MAX 2020-04-10 14:58:21 -07:00
config.h git_config_parse_key(): return baselen as size_t 2020-04-10 14:52:22 -07:00
config.mak.dev Merge branch 'bc/sha-256-part-1-of-4' 2020-03-26 17:11:20 -07:00
config.mak.in
config.mak.uname Merge branch 'cb/no-more-gmtime' 2020-05-20 08:33:27 -07:00
configure.ac Merge branch 'dd/sequencer-utf8' 2019-12-01 09:04:36 -08:00
connect.c Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
connect.h Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
connected.c fetch-pack: support more than one pack lockfile 2020-06-10 18:06:34 -07:00
connected.h connected: always use partial clone optimization 2020-03-29 10:37:44 -07:00
convert.c parse_config_key(): return subsection len as size_t 2020-04-10 14:44:29 -07:00
convert.h convert: provide additional metadata to filters 2020-03-16 11:37:02 -07:00
copy.c
COPYING
credential-cache--daemon.c
credential-cache.c
credential-store.c Merge branch 'cb/credential-store-ignore-bogus-lines' 2020-05-08 14:25:01 -07:00
credential.c Merge branch 'js/partial-urlmatch' 2020-05-05 14:54:30 -07:00
credential.h credential: correct order of parameters for credential_match 2020-05-04 22:56:33 -07:00
csum-file.c hash: implement and use a context cloning function 2020-02-24 09:33:21 -08:00
csum-file.h csum-file: introduce hashfile_total() 2020-01-23 10:51:50 -08:00
ctype.c
daemon.c Fix spelling errors in code comments 2019-11-10 16:00:54 +09:00
date.c date.c: allow compact version of ISO-8601 datetime 2020-04-24 14:06:09 -07:00
decorate.c hashmap: convert sha1hash() to oidhash() 2019-06-20 10:44:22 -07:00
decorate.h
delta-islands.c oid_array: rename source file from sha1-array 2020-03-30 10:59:08 -07:00
delta-islands.h delta-islands: respect progress flag 2019-06-20 13:29:49 -07:00
delta.h
detect-compiler
diff-delta.c diff-delta: set size out-parameter to 0 for NULL delta 2019-09-06 11:03:39 -07:00
diff-lib.c diff-files --raw: show correct post-image of intent-to-add files 2020-07-01 16:15:43 -07:00
diff-no-index.c
diff.c Merge branch 'jk/diff-memuse-optim-with-stat-unmatch' 2020-06-17 21:54:00 -07:00
diff.h bloom/diff: properly short-circuit on max_changes 2020-09-17 09:31:25 -07:00
diffcore-break.c diff: restrict when prefetching occurs 2020-04-07 16:09:29 -07:00
diffcore-delta.c
diffcore-order.c
diffcore-pickaxe.c
diffcore-rename.c diff: restrict when prefetching occurs 2020-04-07 16:09:29 -07:00
diffcore.h diff: restrict when prefetching occurs 2020-04-07 16:09:29 -07:00
dir-iterator.c dir-iterator: add flags parameter to dir_iterator_begin 2019-07-11 13:52:15 -07:00
dir-iterator.h dir-iterator: add flags parameter to dir_iterator_begin 2019-07-11 13:52:15 -07:00
dir.c Merge branch 'en/fill-directory-exponential' into master 2020-07-30 13:20:36 -07:00
dir.h Merge branch 'ds/sparse-cone' 2019-12-25 11:21:58 -08:00
editor.c real_path: remove unsafe API 2020-03-10 11:41:40 -07:00
entry.c Merge branch 'mt/entry-fstat-fallback-fix' into master 2020-07-09 14:00:45 -07:00
environment.c Merge branch 'tb/shallow-cleanup' 2020-05-13 12:19:18 -07:00
exec-cmd.c trace2: create new combined trace facility 2019-02-22 15:27:59 -08:00
exec-cmd.h
fast-import.c Merge branch 'en/fast-import-looser-date' 2020-06-02 13:35:05 -07:00
fetch-negotiator.c repo-settings: create feature.experimental setting 2019-08-13 13:33:55 -07:00
fetch-negotiator.h repo-settings: create feature.experimental setting 2019-08-13 13:33:55 -07:00
fetch-pack.c Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
fetch-pack.h fetch-pack: support more than one pack lockfile 2020-06-10 18:06:34 -07:00
fmt-merge-msg.c fmt-merge-msg: allow merge destination to be omitted again 2020-07-30 12:43:10 -07:00
fmt-merge-msg.h Lib-ify fmt-merge-msg 2020-03-24 15:04:43 -07:00
fsck.c Merge branch 'rs/fsck-duplicate-names-in-trees' 2020-06-08 18:06:29 -07:00
fsck.h fsck: only provide oid/type in fsck_error callback 2019-10-28 14:05:18 +09:00
fsmonitor.c Remove doubled words in various comments 2020-07-28 14:28:14 -07:00
fsmonitor.h mark_fsmonitor_valid(): mark the index as changed if needed 2019-05-28 12:43:43 -07:00
fuzz-commit-graph.c commit-graph: pass a 'struct repository *' in more places 2020-09-09 12:51:48 -07:00
fuzz-pack-headers.c
fuzz-pack-idx.c
generate-cmdlist.sh help: move list_config_help to builtin/help 2020-04-16 15:22:16 -07:00
generate-configlist.sh help: move list_config_help to builtin/help 2020-04-16 15:22:16 -07:00
gettext.c Merge branch 'ab/test-env' 2019-07-25 13:59:20 -07:00
gettext.h
git-add--interactive.perl checkout -p: handle new files correctly 2020-05-27 14:50:20 -07:00
git-archimport.perl
git-bisect.sh bisect: treat BISECT_HEAD as a pseudo ref 2020-07-10 13:53:37 -07:00
git-compat-util.h Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
git-cvsexportcommit.perl git-cvsexportcommit: port to SHA-256 2020-06-22 11:21:07 -07:00
git-cvsimport.perl git-cvsimport: port to SHA-256 2020-06-22 11:21:07 -07:00
git-cvsserver.perl git-cvsserver: port to SHA-256 2020-06-22 11:21:07 -07:00
git-difftool--helper.sh mergetool: use get_merge_tool function 2019-05-13 23:11:59 +09:00
git-filter-branch.sh Recommend git-filter-repo instead of git-filter-branch 2019-09-05 13:01:48 -07:00
git-instaweb.sh
git-merge-octopus.sh
git-merge-one-file.sh
git-merge-resolve.sh
git-mergetool--lib.sh Merge branch 'dl/difftool-mergetool' 2019-05-19 16:45:30 +09:00
git-mergetool.sh mergetool: use shell variable magic instead of awk 2019-06-12 13:20:56 -07:00
git-p4.py Merge branch 'bk/p4-prepare-p4-only-fix' 2020-06-02 13:35:01 -07:00
git-parse-remote.sh
git-quiltimport.sh
git-rebase--preserve-merges.sh rebase: fold git-rebase--common into the -p backend 2019-07-31 12:24:06 -07:00
git-request-pull.sh request-pull: warn if the remote object is not the same as the local one 2019-05-28 13:06:25 -07:00
git-send-email.perl send-email: restore --in-reply-to superseding behavior 2020-07-01 16:12:21 -07:00
git-sh-i18n.sh tests: make GIT_TEST_GETTEXT_POISON a boolean 2019-06-21 09:42:49 -07:00
git-sh-setup.sh stash: optionally use the scripted version again 2019-03-07 09:41:40 +09:00
git-submodule.sh submodule: port subcommand 'set-branch' from shell to C 2020-06-02 10:51:54 -07:00
git-svn.perl git-svn: set the OID length based on hash algorithm 2020-06-22 11:21:07 -07:00
GIT-VERSION-GEN Git 2.28 2020-07-26 18:01:43 -07:00
git-web--browse.sh
git.c Merge branch 'ta/wait-on-aliased-commands-upon-signal' into master 2020-07-15 16:29:43 -07:00
git.rc mingw: embed a manifest to trick UAC into Doing The Right Thing 2019-06-27 12:55:45 -07:00
gpg-interface.c gpg-interface: prefer check_signature() for GPG verification 2020-03-15 09:46:28 -07:00
gpg-interface.h gpg-interface: prefer check_signature() for GPG verification 2020-03-15 09:46:28 -07:00
graph.c graph.c: limit linkage of internal variable 2020-04-27 11:21:25 -07:00
graph.h graph: move doc to graph.h and graph.c 2019-11-18 15:21:28 +09:00
grep.c comment: fix spelling mistakes inside comments 2020-07-29 11:39:40 -07:00
grep.h grep: replace grep_read_mutex by internal obj read lock 2020-01-17 13:52:14 -08:00
hash.h hash: implement and use a context cloning function 2020-02-24 09:33:21 -08:00
hashmap.c Fix spelling errors in code comments 2019-11-10 16:00:54 +09:00
hashmap.h hashmap: fix typo in usage docs 2020-07-28 14:28:15 -07:00
help.c help: add shell-path to --build-options 2020-05-12 22:02:17 -07:00
help.h bugreport: gather git version and build info 2020-04-16 15:23:42 -07:00
hex.c hex: add functions to parse hex object IDs in any algorithm 2020-02-24 09:33:21 -08:00
http-backend.c
http-fetch.c http-fetch: support fetching packfiles by URL 2020-06-10 18:06:34 -07:00
http-push.c Merge branch 'bc/http-push-flagsfix' 2020-07-06 22:09:17 -07:00
http-walker.c http: refactor finish_http_pack_request() 2020-06-10 18:06:34 -07:00
http.c Merge branch 'jt/cdn-offload' 2020-06-25 12:27:47 -07:00
http.h Merge branch 'jt/cdn-offload' 2020-06-25 12:27:47 -07:00
ident.c
imap-send.c http, imap-send: stop using CURLOPT_VERBOSE 2020-05-11 11:18:01 -07:00
INSTALL INSTALL: drop support for docbook-xsl before 1.74 2020-03-29 09:25:38 -07:00
interdiff.c
interdiff.h
iterator.h
json-writer.c
json-writer.h
khash.h hashmap: convert sha1hash() to oidhash() 2019-06-20 10:44:22 -07:00
kwset.c Merge branch 'rs/copy-array' into maint 2019-07-29 12:38:15 -07:00
kwset.h kset.h, tar.h: add missing header guard to prevent multiple inclusion 2019-11-07 20:12:04 +09:00
levenshtein.c
levenshtein.h
LGPL-2.1
line-log.c bloom: split 'get_bloom_filter()' in two 2020-09-17 09:31:25 -07:00
line-log.h line-log: more responsive, incremental 'git log -L' 2020-05-11 09:33:56 -07:00
line-range.c
line-range.h
linear-assignment.c linear-assignment: fix potential out of bounds memory access 2018-09-14 09:10:26 -07:00
linear-assignment.h
list-objects-filter-options.c repository: add a helper function to perform repository format upgrade 2020-06-05 10:13:30 -07:00
list-objects-filter-options.h Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
list-objects-filter.c list-objects-filter: treat NULL filter_options as "disabled" 2020-05-04 21:57:58 -07:00
list-objects-filter.h list-objects-filter: implement composite filters 2019-06-28 08:41:53 -07:00
list-objects.c Merge branch 'jk/list-objects-optim-wo-trees' 2019-10-07 11:32:56 +09:00
list-objects.h
list.h
ll-merge.c parse_config_key(): return subsection len as size_t 2020-04-10 14:44:29 -07:00
ll-merge.h merge: move doc to ll-merge.h 2019-11-18 15:21:28 +09:00
lockfile.c lockfile.c: introduce 'hold_lock_file_for_update_mode' 2020-04-27 11:27:36 -07:00
lockfile.h lockfile.c: introduce 'hold_lock_file_for_update_mode' 2020-04-27 11:27:36 -07:00
log-tree.c Merge branch 'ds/log-exclude-decoration-config' 2020-04-28 15:50:08 -07:00
log-tree.h log: add log.excludeDecoration config option 2020-04-16 11:05:48 -07:00
ls-refs.c upload-pack: handle unexpected delim packets 2020-03-27 12:18:48 -07:00
ls-refs.h
mailinfo.c mailinfo: disallow NUL character in mail's header 2020-04-22 14:01:03 -07:00
mailinfo.h *.[ch]: remove extern from function declarations using spatch 2019-05-05 15:20:06 +09:00
mailmap.c
mailmap.h
Makefile Merge branch 'lo/sparse-universal-zero-init' 2020-06-02 13:35:04 -07:00
match-trees.c match-trees.c: remove the_repo from shift_tree*() 2019-06-27 12:45:17 -07:00
mem-pool.c
mem-pool.h
merge-blobs.c merge-blobs.c: remove implicit dependency on the_index 2018-09-21 09:48:10 -07:00
merge-blobs.h
merge-recursive.c merge-recursive: fix rename/rename(1to2) for working tree with a binary 2020-05-14 12:14:19 -07:00
merge-recursive.h hashmap_entry: remove first member requirement from docs 2019-10-07 10:20:12 +09:00
merge.c builtin/checkout: compute checkout metadata for checkouts 2020-03-16 11:37:02 -07:00
mergesort.c
mergesort.h
midx.c multi-pack-index: respect repack.packKeptObjects=false 2020-05-10 09:50:55 -07:00
midx.h Merge branch 'ds/multi-pack-index' 2020-05-01 13:39:55 -07:00
name-hash.c Merge branch 'en/doc-typofix' 2019-12-01 09:04:35 -08:00
notes-cache.c
notes-cache.h notes-cache.c: remove the_repository references 2018-11-12 14:50:06 +09:00
notes-merge.c
notes-merge.h
notes-utils.c strbuf: add and use strbuf_insertstr() 2020-02-10 09:04:45 -08:00
notes-utils.h
notes.c Merge branch 'jh/notes-fanout-fix' into maint 2020-03-17 15:02:22 -07:00
notes.h Merge branch 'dl/format-patch-notes-config-fixup' 2019-12-25 11:21:58 -08:00
object-store.h packfile: compute and use the index CRC offset 2020-05-27 10:07:07 -07:00
object.c object: drop parsed_object_pool->commit_count 2020-06-17 14:37:14 -07:00
object.h Merge branch 'tb/fix-persistent-shallow' into master 2020-07-09 14:00:44 -07:00
oid-array.c oid_array: rename source file from sha1-array 2020-03-30 10:59:08 -07:00
oid-array.h oid_array: rename source file from sha1-array 2020-03-30 10:59:08 -07:00
oidmap.c hashmap: introduce hashmap_free_entries 2019-10-07 10:20:11 +09:00
oidmap.h hashmap: use *_entry APIs for iteration 2019-10-07 10:20:11 +09:00
oidset.c oidset: introduce 'oidset_size' 2020-04-15 09:20:29 -07:00
oidset.h Merge branch 'tb/commit-graph-split-strategy' 2020-05-01 13:39:52 -07:00
pack-bitmap-write.c pack-objects: drop packlist index_pos optimization 2019-09-06 11:03:42 -07:00
pack-bitmap.c pack-bitmap: pass object filter to fill-in traversal 2020-05-04 21:57:58 -07:00
pack-bitmap.h Merge branch 'jk/object-filter-with-bitmap' 2020-03-02 15:07:18 -08:00
pack-check.c pack-check: push oid lookup into loop 2020-02-24 12:55:53 -08:00
pack-objects.c pack-objects: convert oe_set_delta_ext() to use object_id 2020-02-24 12:55:52 -08:00
pack-objects.h pack-objects: convert oe_set_delta_ext() to use object_id 2020-02-24 12:55:52 -08:00
pack-revindex.c
pack-revindex.h
pack-write.c Merge branch 'jb/doc-packfile-name' into master 2020-07-30 21:34:32 -07:00
pack.h
packfile.c packfile: compute and use the index CRC offset 2020-05-27 10:07:07 -07:00
packfile.h packfile: drop nth_packed_object_sha1() 2020-02-24 12:55:53 -08:00
pager.c pager: add a helper function to clear the last line in the terminal 2019-06-24 13:38:46 -07:00
parse-options-cb.c oid_array: rename source file from sha1-array 2020-03-30 10:59:08 -07:00
parse-options.c parse-options: teach "git cmd -h" to show alias as alias 2020-03-16 14:27:07 -07:00
parse-options.h merge: teach --autostash option 2020-04-10 09:28:02 -07:00
patch-delta.c
patch-ids.c hashmap: remove type arg from hashmap_{get,put,remove}_entry 2019-10-07 10:20:12 +09:00
patch-ids.h
path.c Merge branch 'dl/merge-autostash' 2020-04-29 16:15:27 -07:00
path.h merge: teach --autostash option 2020-04-10 09:28:02 -07:00
pathspec.c prefix_path: show gitdir if worktree unavailable 2020-03-15 09:35:46 -07:00
pathspec.h Merge branch 'hw/doc-in-header' 2019-12-16 13:08:39 -08:00
pkt-line.c Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
pkt-line.h Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
preload-index.c mark_fsmonitor_valid(): mark the index as changed if needed 2019-05-28 12:43:43 -07:00
pretty.c format-patch: teach --no-encode-email-headers 2020-04-07 22:37:18 -07:00
pretty.h format-patch: teach --no-encode-email-headers 2020-04-07 22:37:18 -07:00
prio-queue.c
prio-queue.h
progress.c progress: call trace2_region_leave() only after calling _enter() 2020-05-15 09:41:30 -07:00
progress.h progress.c: silence cgcc suggestion about internal linkage 2020-04-27 11:21:28 -07:00
promisor-remote.c Merge branch 'jt/avoid-prefetch-when-able-in-diff' 2020-04-28 15:50:04 -07:00
promisor-remote.h promisor-remote: accept 0 as oid_nr in function 2020-04-02 12:42:32 -07:00
prompt.c interactive: explicitly fflush stdout before expecting input 2020-04-10 10:27:16 -07:00
prompt.h interactive: refactor code asking the user for interactive input 2020-04-10 10:26:31 -07:00
protocol.c config: let feature.experimental imply protocol.version=2 2020-05-21 09:31:42 -07:00
protocol.h *.[ch]: remove extern from function declarations using spatch 2019-05-05 15:20:06 +09:00
prune-packed.c Lib-ify prune-packed 2020-03-24 15:04:44 -07:00
prune-packed.h Lib-ify prune-packed 2020-03-24 15:04:44 -07:00
quote.c quote: use isalnum() to check for alphanumeric characters 2020-02-24 09:30:29 -08:00
quote.h quote: add sq_append_quote_argv_pretty() 2019-08-09 10:48:02 -07:00
range-diff.c range-diff: avoid negative string precision 2020-04-15 18:32:48 -07:00
range-diff.h Merge branch 'dl/range-diff-with-notes' 2019-12-16 13:08:46 -08:00
reachable.c pack-bitmap: basic noop bitmap filter infrastructure 2020-02-14 10:46:22 -08:00
reachable.h
read-cache.c read-cache: remove bogus shortcut 2020-07-16 10:42:52 -07:00
README.md ci: retire the Azure Pipelines definition 2020-04-10 10:30:40 -07:00
rebase-interactive.c Merge branch 'rt/format-zero-length-fix' 2020-03-09 11:21:21 -07:00
rebase-interactive.h Merge branch 'en/rebase-backend' 2020-03-02 15:07:19 -08:00
rebase.c pull --rebase/remote rename: document and honor single-letter abbreviations rebase types 2020-02-10 10:52:10 -08:00
rebase.h pull --rebase/remote rename: document and honor single-letter abbreviations rebase types 2020-02-10 10:52:10 -08:00
ref-filter.c Merge branch 'sk/typofixes' into master 2020-07-30 21:34:29 -07:00
ref-filter.h Merge branch 'jk/for-each-ref-multi-key-sort-fix' 2020-05-08 14:25:04 -07:00
reflog-walk.c Merge branch 'nd/i18n' 2018-08-15 15:08:23 -07:00
reflog-walk.h
refs.c Merge branch 'hn/reftable' into master 2020-08-01 13:49:13 -07:00
refs.h Merge branch 'js/default-branch-name' 2020-07-06 22:09:17 -07:00
refspec.c
refspec.h remote: move doc to remote.h and refspec.h 2019-11-18 15:21:28 +09:00
RelNotes First batch post 2.28 2020-07-30 13:20:36 -07:00
remote-curl.c Merge branch 'bc/push-cas-cquoted-refname' into master 2020-07-30 13:20:34 -07:00
remote-testsvn.c testsvn: respect init.defaultBranch 2020-06-24 09:14:21 -07:00
remote.c remote: use the configured default branch name when appropriate 2020-06-24 09:14:21 -07:00
remote.h stateless-connect: send response end packet 2020-05-24 16:26:00 -07:00
replace-object.c replace-object: make replace operations thread-safe 2020-01-17 13:52:14 -08:00
replace-object.h replace-object: make replace operations thread-safe 2020-01-17 13:52:14 -08:00
repo-settings.c commit-graph: respect 'commitGraph.readChangedPaths' 2020-09-09 12:51:48 -07:00
repository.c repository: require a build flag to use SHA-256 2020-02-24 09:33:21 -08:00
repository.h commit-graph: respect 'commitGraph.readChangedPaths' 2020-09-09 12:51:48 -07:00
rerere.c Fix spelling errors in code comments 2019-11-10 16:00:54 +09:00
rerere.h
reset.c Merge branch 'dl/merge-autostash' 2020-04-29 16:15:27 -07:00
reset.h reset: extract reset_head() from rebase 2020-04-10 09:28:02 -07:00
resolve-undo.c
resolve-undo.h
revision.c bloom: split 'get_bloom_filter()' in two 2020-09-17 09:31:25 -07:00
revision.h Merge branch 'ds/commit-graph-bloom-updates' into master 2020-07-30 13:20:31 -07:00
run-command.c Merge branch 'ta/wait-on-aliased-commands-upon-signal' into master 2020-07-15 16:29:43 -07:00
run-command.h Merge branch 'ta/wait-on-aliased-commands-upon-signal' into master 2020-07-15 16:29:43 -07:00
send-pack.c Merge branch 'js/default-branch-name' 2020-07-06 22:09:17 -07:00
send-pack.h
sequencer.c Merge branch 'js/rebase-autosquash-double-fixup-fix' 2020-05-14 14:39:43 -07:00
sequencer.h Merge branch 'dl/merge-autostash' 2020-04-29 16:15:27 -07:00
serve.c Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
serve.h
server-info.c Fix spelling errors in code comments 2019-11-10 16:00:54 +09:00
setup.c Merge branch 'jk/reject-newer-extensions-in-v0' into master 2020-07-30 13:20:32 -07:00
sh-i18n--envsubst.c cleanup: fix possible overflow errors in binary search, part 2 2019-06-13 11:28:53 -07:00
sha1-file.c Merge branch 'jt/pretend-object-never-come-from-elsewhere' 2020-08-04 13:53:58 -07:00
sha1-lookup.c Merge branch 'js/azure-pipelines-msvc' 2019-10-15 13:48:00 +09:00
sha1-lookup.h *.[ch]: manually align parameter lists 2019-05-05 15:20:10 +09:00
sha1-name.c Merge branch 'jc/missing-ref-store-fix' 2020-04-22 13:42:55 -07:00
sha1dc_git.c hex: drop sha1_to_hex() 2019-11-13 10:09:10 +09:00
sha1dc_git.h sha1dc_git.h: re-arrange an ifdef chain for a subsequent change 2017-12-08 15:01:01 -08:00
shallow.c Merge branch 'sg/commit-graph-cleanups' into master 2020-07-30 13:20:30 -07:00
shallow.h shallow: use struct 'shallow_lock' for additional safety 2020-04-30 14:19:13 -07:00
shell.c interactive: refactor code asking the user for interactive input 2020-04-10 10:26:31 -07:00
shortlog.h
sideband.c
sideband.h
sigchain.c
sigchain.h sigchain: move doc to sigchain.h 2019-11-18 15:21:29 +09:00
split-index.c
split-index.h
stable-qsort.c Move git_sort(), a stable sort, into into libgit.a 2019-10-02 14:44:51 +09:00
strbuf.c Merge branch 'rs/retire-strbuf-write-fd' 2020-06-29 14:17:26 -07:00
strbuf.h Merge branch 'rs/retire-strbuf-write-fd' 2020-06-29 14:17:26 -07:00
streaming.c streaming: allow open_istream() to handle any repo 2020-01-31 10:45:39 -08:00
streaming.h streaming: allow open_istream() to handle any repo 2020-01-31 10:45:39 -08:00
string-list.c
string-list.h Merge branch 'en/string-list-can-be-custom-sorted' into maint 2020-02-14 12:42:27 -08:00
sub-process.c hashmap: remove type arg from hashmap_{get,put,remove}_entry 2019-10-07 10:20:12 +09:00
sub-process.h hashmap_entry: remove first member requirement from docs 2019-10-07 10:20:12 +09:00
submodule-config.c parse_config_key(): return subsection len as size_t 2020-04-10 14:44:29 -07:00
submodule-config.h submodule-config: add skip_if_read option to repo_read_gitmodules() 2020-01-17 13:52:14 -08:00
submodule.c Merge branch 'jk/oid-array-cleanups' 2020-04-22 13:42:49 -07:00
submodule.h get_superproject_working_tree(): return strbuf 2020-03-10 11:41:40 -07:00
symlinks.c
tag.c object: drop parsed_object_pool->commit_count 2020-06-17 14:37:14 -07:00
tag.h tag: factor out get_tagged_oid() 2019-09-05 14:10:18 -07:00
tar.h kset.h, tar.h: add missing header guard to prevent multiple inclusion 2019-11-07 20:12:04 +09:00
tempfile.c tempfile.c: introduce 'create_tempfile_mode' 2020-04-27 11:27:35 -07:00
tempfile.h tempfile.c: introduce 'create_tempfile_mode' 2020-04-27 11:27:35 -07:00
thread-utils.c
thread-utils.h
tmp-objdir.c Replace all die("BUG: ...") calls by BUG() ones 2018-05-06 19:06:13 +09:00
tmp-objdir.h
trace2.c trace2: teach Git to log environment variables 2020-03-23 13:14:53 -07:00
trace2.h trace2: teach Git to log environment variables 2020-03-23 13:14:53 -07:00
trace.c http, imap-send: stop using CURLOPT_VERBOSE 2020-05-11 11:18:01 -07:00
trace.h http, imap-send: stop using CURLOPT_VERBOSE 2020-05-11 11:18:01 -07:00
trailer.c
trailer.h
transport-helper.c Merge branch 'js/default-branch-name' 2020-07-06 22:09:17 -07:00
transport-internal.h transport: teach all vtables to allow fetch first 2019-08-22 14:20:39 -07:00
transport.c Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
transport.h Merge branch 'bc/sha-256-part-2' 2020-07-06 22:09:13 -07:00
tree-diff.c bloom/diff: properly short-circuit on max_changes 2020-09-17 09:31:25 -07:00
tree-walk.c tree-walk.c: don't match submodule entries for 'submod/anything' 2020-06-08 12:28:48 -07:00
tree-walk.h tree-walk.c: break circular dependency with unpack-trees 2020-02-04 10:32:15 -08:00
tree.c object: drop parsed_object_pool->commit_count 2020-06-17 14:37:14 -07:00
tree.h
unicode-width.h unicode: update the width tables to Unicode 13.0 2020-03-17 15:06:37 -07:00
unimplemented.sh
unix-socket.c
unix-socket.h
unpack-trees.c Merge branch 'en/sparse-checkout' 2020-05-20 08:33:29 -07:00
unpack-trees.h Merge branch 'en/sparse-checkout' 2020-04-29 16:15:30 -07:00
upload-pack.c upload-pack: do not lazy-fetch "have" objects 2020-07-16 14:07:19 -07:00
upload-pack.h
url.c Fix spelling errors in code comments 2019-11-10 16:00:54 +09:00
url.h list-objects-filter: implement composite filters 2019-06-28 08:41:53 -07:00
urlmatch.c credential: handle credential.<partial-URL>.<key> again 2020-04-24 15:53:46 -07:00
urlmatch.h credential: handle credential.<partial-URL>.<key> again 2020-04-24 15:53:46 -07:00
usage.c vreportf(): avoid relying on stdio buffering 2019-11-02 15:20:21 +09:00
userdiff.c Merge branch 'ah/userdiff-markdown' 2020-05-08 14:25:01 -07:00
userdiff.h
utf8.c utf8: use skip_iprefix() in same_utf_encoding() 2019-11-10 16:04:36 +09:00
utf8.h
varint.c cleanups: ensure that git-compat-util.h is included first 2014-09-15 12:05:14 -07:00
varint.h
version.c
version.h
versioncmp.c
walker.c Merge branch 'rs/show-progress-in-dumb-http-fetch' 2020-03-09 11:21:21 -07:00
walker.h remote-curl: show progress for fetches over dumb HTTP 2020-03-03 13:15:40 -08:00
wildmatch.c
wildmatch.h
worktree.c Merge branch 'es/worktree-code-cleanup' 2020-07-06 22:09:19 -07:00
worktree.h worktree: drop get_worktrees() unused 'flags' argument 2020-06-22 10:31:15 -07:00
wrap-for-bin.sh
wrapper.c wrapper: add function to compare strings with different NUL termination 2020-05-27 10:07:06 -07:00
write-or-die.c
ws.c
wt-status.c Remove doubled words in various comments 2020-07-28 14:28:14 -07:00
wt-status.h wt-status: show sparse checkout status as well 2020-06-18 14:12:28 -07:00
xdiff-interface.c xdiff: avoid computing non-zero offset from NULL pointer 2020-01-28 23:13:25 -08:00
xdiff-interface.h Fix spelling errors in code comments 2019-11-10 16:00:54 +09:00
zlib.c

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/, including full documentation and Git-related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org. The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Security-relevant issues should be disclosed privately to the Git Security mailing list at git-security@googlegroups.com.

The maintainer frequently sends "What's cooking" reports to the mailing list; these list the current status of various development topics. The discussions that follow them give a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks.