git-commit-vandalism

Author	SHA1	Message	Date
Ben Peart	3255089ada	ieot: add Index Entry Offset Table (IEOT) extension This patch enables addressing the CPU cost of loading the index by adding additional data to the index that will allow us to efficiently multi-thread the loading and conversion of cache entries. It accomplishes this by adding an (optional) index extension that is a table of offsets to blocks of cache entries in the index file. To make this work for V4 indexes, when writing the cache entries, it periodically "resets" the prefix-compression by encoding the current entry as if the path name for the previous entry is completely different and saves the offset of that entry in the IEOT. Basically, with V4 indexes, it generates offsets into blocks of prefix-compressed entries. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-11 15:32:48 +09:00
Ben Peart	abb4bb8384	read-cache: load cache extensions on a worker thread This patch helps address the CPU cost of loading the index by loading the cache extensions on a worker thread in parallel with loading the cache entries. In some cases, loading the extensions takes longer than loading the cache entries so this patch utilizes the new EOIE to start the thread to load the extensions before loading all the cache entries in parallel. This is possible because the current extensions don't access the cache entries in the index_state structure so are OK that they don't all exist yet. The CACHE_EXT_TREE, CACHE_EXT_RESOLVE_UNDO, and CACHE_EXT_UNTRACKED extensions don't even get a pointer to the index so don't have access to the cache entries. CACHE_EXT_LINK only uses the index_state to initialize the split index. CACHE_EXT_FSMONITOR only uses the index_state to save the fsmonitor last update and dirty flags. I used p0002-read-cache.sh to generate some performance data: Test w/100,000 files reduced the time by 0.53% Test w/1,000,000 files reduced the time by 27.78% Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-11 15:32:48 +09:00
Ben Peart	c780b9cfe8	config: add new index.threads config setting Add support for a new index.threads config setting which will be used to control the threading code in do_read_index(). A value of 0 will tell the index code to automatically determine the correct number of threads to use. A value of 1 will make the code single threaded. A value greater than 1 will set the maximum number of threads to use. For testing purposes, this setting can be overwritten by setting the GIT_TEST_INDEX_THREADS=<n> environment variable to a value greater than 0. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-11 15:32:48 +09:00
Ben Peart	3b1d9e045e	eoie: add End of Index Entry (EOIE) extension The End of Index Entry (EOIE) is used to locate the end of the variable length index entries and the beginning of the extensions. Code can take advantage of this to quickly locate the index extensions without having to parse through all of the index entries. The EOIE extension is always written out to the index file including to the shared index when using the split index feature. Because it is always written out, the SHA checksums in t/t1700-split-index.sh were updated to reflect its inclusion. It is written as an optional extension to ensure compatibility with other git implementations that do not yet support it. It is always written out to ensure it is available as often as possible to speed up index operations. Because it must be able to be loaded before the variable length cache entries and other index extensions, this extension must be written last. The signature for this extension is { 'E', 'O', 'I', 'E' }. The extension consists of: - 32-bit offset to the end of the index entries - 160-bit SHA-1 over the extension types and their sizes (but not their contents). E.g. if we have "TREE" extension that is N-bytes long, "REUC" extension that is M-bytes long, followed by "EOIE", then the hash would be: SHA-1("TREE" + <binary representation of N> + "REUC" + <binary representation of M>) Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-11 15:32:48 +09:00
Ben Peart	371ed0defa	read-cache: clean up casting and byte decoding This patch does a clean up pass to minimize the casting required to work with the memory mapped index (mmap). It also makes the decoding of network byte order more consistent by using get_be32() where possible. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-11 15:32:48 +09:00
Junio C Hamano	5a0cc8aca7	Third batch for 2.20 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-10 12:38:03 +09:00
Junio C Hamano	66ec2373fe	Merge branch 'ab/fsck-skiplist' Update fsck.skipList implementation and documentation. * ab/fsck-skiplist: fsck: support comments & empty lines in skipList fsck: use oidset instead of oid_array for skipList fsck: use strbuf_getline() to read skiplist file fsck: add a performance test for skipList fsck: add a performance test fsck: document that skipList input must be unabbreviated fsck: document and test commented & empty line skipList input fsck: document and test sorted skipList input fsck tests: add a test for no skipList input fsck tests: setup of bogus commit object	2018-10-10 12:37:16 +09:00
Junio C Hamano	468b322137	Merge branch 'ds/multi-pack-verify' "git multi-pack-index" learned to detect corruption in the .midx file it uses, and this feature has been integrated into "git fsck". * ds/multi-pack-verify: fsck: verify multi-pack-index multi-pack-index: report progress during 'verify' multi-pack-index: verify object offsets multi-pack-index: fix 32-bit vs 64-bit size check multi-pack-index: verify oid lookup order multi-pack-index: verify oid fanout order multi-pack-index: verify missing pack multi-pack-index: verify packname order multi-pack-index: verify corrupt chunk lookup table multi-pack-index: verify bad header multi-pack-index: add 'verify' verb	2018-10-10 12:37:16 +09:00
Junio C Hamano	d555663f16	Merge branch 'bc/hash-independent-tests' Various tests have been updated to make it easier to swap the hash function used for object identification. * bc/hash-independent-tests: t5318: use test_oid for HASH_LEN t1407: make hash size independent t1406: make hash-size independent t1405: make hash size independent t1400: switch hard-coded object ID to variable t1006: make hash size independent t0064: make hash size independent t0002: abstract away SHA-1 specific constants t0000: update tests for SHA-256 t0000: use hash translation table t: add test functions to translate hash-related values	2018-10-10 12:37:16 +09:00
Junio C Hamano	77b5046ae3	Merge branch 'nd/test-tool' Test helper binaries clean-up. * nd/test-tool: Makefile: add a hint about TEST_BUILTINS_OBJS t/helper: merge test-dump-fsmonitor into test-tool t/helper: merge test-parse-options into test-tool t/helper: merge test-pkt-line into test-tool t/helper: merge test-dump-untracked-cache into test-tool t/helper: keep test-tool command list sorted	2018-10-10 12:37:16 +09:00
Junio C Hamano	3ba371f9df	Merge branch 'nd/config-split' Split Documentation/config.txt for easier maintenance. * nd/config-split: config.txt: move submodule part out to a separate file config.txt: move sequence.editor out of "core" part config.txt: move sendemail part out to a separate file config.txt: move receive part out to a separate file config.txt: move push part out to a separate file config.txt: move pull part out to a separate file config.txt: move gui part out to a separate file config.txt: move gitcvs part out to a separate file config.txt: move format part out to a separate file config.txt: move fetch part out to a separate file config.txt: follow camelCase naming	2018-10-10 12:37:15 +09:00
Jonathan Tan	2f215ff10b	cache-tree: skip some blob checks in partial clone In a partial clone, whenever a sparse checkout occurs, the existence of all blobs in the index is verified, whether they are included or excluded by the .git/info/sparse-checkout specification. This significantly degrades performance because a lazy fetch occurs whenever the existence of a missing blob is checked. This is because cache_tree_update() checks the existence of all objects in the index, whether or not CE_SKIP_WORKTREE is set on them. Teach cache_tree_update() to skip checking CE_SKIP_WORKTREE objects when the repository is a partial clone. This improves performance for sparse checkout and also other operations that use cache_tree_update(). Instead of completely removing the check, an argument could be made that the check should instead be replaced by a check that the blob is promised, but for performance reasons, I decided not to do this. If the user needs to verify the repository, it can be done using fsck (which will notify if a tree points to a missing and non-promised blob, whether the blob is included or excluded by the sparse-checkout specification). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-10 10:20:43 +09:00
Junio C Hamano	2efbb7f521	Declare that the next one will be named 2.20 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-10 09:20:03 +09:00
Taylor Blau	40f327faf5	transport.c: introduce core.alternateRefsPrefixes The recently-introduced "core.alternateRefsCommand" allows callers to specify with high flexibility the tips that they wish to advertise from alternates. This flexibility comes at the cost of some inconvenience when the caller only wishes to limit the advertisement to one or more prefixes. For example, to advertise only tags, a caller using 'core.alternateRefsCommand' would have to do: $ git config core.alternateRefsCommand ' \ f() { git -C "$1" for-each-ref \ refs/tags --format="%(objectname)" }; f "$@"' The above is cumbersome to write, so let's introduce a "core.alternateRefsPrefixes" to address this common case. Instead, the caller can run: $ git config core.alternateRefsPrefixes 'refs/tags' Which will behave identically to the longer example using "core.alternateRefsCommand". Since the value of "core.alternateRefsPrefixes" is appended to 'git for-each-ref' and then executed, include a "--" before taking the configured value to avoid misinterpreting arguments as flags to 'git for-each-ref'. In the case that the caller wishes to specify multiple prefixes, they may separate them by whitespace. If "core.alternateRefsCommand" is set, it will take precedence over "core.alternateRefsPrefixes". Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-09 14:30:03 +09:00
Taylor Blau	89284c1d6c	transport.c: introduce core.alternateRefsCommand When in a repository containing one or more alternates, Git would sometimes like to list references from those alternates. For example, 'git receive-pack' lists the "tips" pointed to by references in those alternates as special ".have" references. Listing ".have" references is designed to make pushing changes from upstream to a fork a lightweight operation, by advertising to the pusher that the fork already has the objects (via its alternate). Thus, the client can avoid sending them. However, when the alternate (upstream, in the previous example) has a pathologically large number of references, the initial advertisement is too expensive. In fact, it can dominate any such optimization where the pusher avoids sending certain objects. Introduce "core.alternateRefsCommand" in order to provide a facility to limit or filter alternate references. This can be used, for example, to filter out references the alternate does not wish to send (for space concerns, or otherwise) during the initial advertisement. Let the repository that has alternates configure this command to avoid trusting the alternate to provide us a safe command to run in the shell. To find the alternate, pass its absolute path as the first argument. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-09 14:30:03 +09:00
Taylor Blau	1e5f31d444	transport.c: extract 'fill_alternate_refs_command' To list alternate references, 'read_alternate_refs' creates a child process running 'git for-each-ref' in the alternate's Git directory. Prepare to run other commands besides 'git for-each-ref' by introducing and moving the relevant code from 'read_alternate_refs' to 'fill_alternate_refs_command'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-09 14:30:03 +09:00
Jeff King	bdf4276c91	transport: drop refnames from for_each_alternate_ref None of the current callers use the refname parameter we pass to their callbacks. In theory somebody _could_ do so, but it's actually quite weird if you think about it: it's a ref in somebody else's repository. So the name has no meaning locally, and in fact there may be duplicates if there are multiple alternates. The users of this interface really only care about seeing some ref tips, since that promises that the alternate has the full commit graph reachable from there. So let's keep the information we pass back to the bare minimum. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-09 14:30:02 +09:00
Michael Witten	ad0b8f9575	docs: typo: s/isimilar/similar/ Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 10:11:02 +09:00
Michael Witten	634dbd0ad8	docs: graph: remove unnecessary `graph_update()' call The sample code calls `get_revision()' followed by `graph_update()', but the documentation and source code indicate that `get_revision()' already calls `graph_update()' for you. Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 10:10:49 +09:00
Michael Witten	42ce44e00a	docs: typo: s/go/to/ Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 10:10:42 +09:00
Steven Fernandez	705f5f122c	git-completion.bash: add completion for stash list Since stash list accepts git-log options, add the following useful options that make sense in the context of the `git stash list` command: --name-status --oneline --patch-with-stat Signed-off-by: Steven Fernandez <steve@lonetwin.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 10:05:49 +09:00
Jonathan Tan	e70a3030e7	fetch: do not list refs if fetching only hashes If only hash literals are given on a "git fetch" command-line, tag following is not requested, and the fetch is done using protocol v2, a list of refs is not required from the remote. Therefore, optimize by invoking transport_get_remote_refs() only if we need the refs. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 09:53:21 +09:00
Jonathan Tan	6ab4055775	transport: list refs before fetch if necessary The built-in bundle transport and the transport helper interface do not work when transport_fetch_refs() is called immediately after transport creation. This will be needed in a subsequent patch, so fix this. Evidence: fetch_refs_from_bundle() relies on data->header being initialized in get_refs_from_bundle(), and fetch() in transport-helper.c relies on either data->fetch or data->import being set by get_helper(), but neither transport_helper_init() nor fetch() calls get_helper(). Up until the introduction of the partial clone feature, this has not been a problem, because transport_fetch_refs() is always called after transport_get_remote_refs(). With the introduction of the partial clone feature, which involves calling transport_fetch_refs() (to fetch objects by their OIDs) without transport_get_remote_refs(), this is still not a problem, but only coincidentally - we do not support partially cloning a bundle, and as for cloning using a transport-helper-using protocol, it so happens that before transport_fetch_refs() is called, fetch_refs() in fetch-object.c calls transport_set_option(), which means that the aforementioned get_helper() is invoked through set_helper_option() in transport-helper.c. This could be fixed by fixing the transports themselves, but it doesn't seem like a good idea to me to open up previously untested code paths; also, there may be transport helpers in the wild that assume that "list" is always called before "fetch". Instead, fix this by having transport_fetch_refs() call transport_get_remote_refs() to ensure that the latter is always called at least once, unless the transport explicitly states that it supports fetching without listing refs. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 09:53:19 +09:00
Jonathan Tan	0177565148	transport: do not list refs if possible When all refs to be fetched are exact OIDs, it is possible to perform a fetch without requiring the remote to list refs if protocol v2 is used. Teach Git to do this. This currently has an effect only for lazy fetches done from partial clones. The change necessary to likewise optimize "git fetch <remote> <sha-1>" will be done in a subsequent patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 09:53:15 +09:00
Jonathan Tan	99bcb883cb	transport: allow skipping of ref listing The get_refs_via_connect() function both performs the handshake (including determining the protocol version) and obtaining the list of remote refs. However, the fetch protocol v2 supports fetching objects without the listing of refs, so make it possible for the user to skip the listing by creating a new handshake() function. This will be used in a subsequent commit. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 09:35:41 +09:00
Matthew DeVore	8d6ba49563	tests: order arguments to git-rev-list properly It is a common mistake to put positional arguments before flags when invoking git-rev-list. Order the positional arguments last. This patch skips git-rev-list invocations which include the --not flag, since the ordering of flags and positional arguments affects the behavior. This patch also skips invocations of git-rev-list that occur in command substitution in which the exit code is discarded, since fixing those properly will require a more involved cleanup. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:51:18 +09:00
Matthew DeVore	b00b6ace5c	t9109: don't swallow Git errors upstream of pipes 'git ... | foo' will mask any errors or crashes in git, so split up such pipes in this file. One testcase uses several separate pipe sequences in a row which are awkward to split up. Wrap the split-up pipe in a function so the awkwardness is not repeated. Also change that testcase's surrounding quotes from double to single to avoid premature string interpolation. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:51:18 +09:00
Matthew DeVore	61de0ff695	tests: don't swallow Git errors upstream of pipes Some pipes in tests lose the exit code of git processes, which can mask unexpected behavior like crashes. Split these pipes up so that git commands are only at the end of pipes rather than the beginning or middle. The violations fixed in this patch were found in the process of fixing pipe placement in a prior patch. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:51:18 +09:00
Matthew DeVore	dcbaa0b361	t/*: fix ordering of expected/observed arguments Fix various places where the ordering was obviously wrong, meaning it was easy to find with grep. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:51:18 +09:00
Matthew DeVore	bdbc17e86a	tests: standardize pipe placement Instead of using a line-continuation and pipe on the second line, take advantage of the shell's implicit line continuation after a pipe character. So for example, instead of some long line \ | next line use some long line | next line And add a blank line before and after the pipe where it aids readability (it usually does). This better matches the coding style documented in Documentation/CodingGuidelines and used in shell scripts elsewhere in the tree. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:51:18 +09:00
Matthew DeVore	a378fee5b0	Documentation: add shell guidelines Add the following guideline to Documentation/CodingGuidelines: Break overlong lines after "&&", "||", and "|", not before them; that way the command can continue to subsequent lines without backslash at the end. And the following to t/README (since it is specific to writing tests): Pipes and $(git ...) should be avoided when they swallow exit codes of Git processes Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:51:17 +09:00
Matthew DeVore	441ee35d83	t/README: reformat Do, Don't, Keep in mind lists The list of Don'ts for test writing has grown large such that it is hard to see at a glance which section an item is in. In other words, if I ignore a little bit of surrounding context, the "don'ts" look like "do's." To make the list more readable, prefix "Don't" in front of every first sentence in the items. Also, the "Keep in mind" list is out of place and awkward, because it was a very short "list" beneath two very long ones, and it seemed easy to miss under the list of "don'ts," and it only had one item. So move this item to the list of "do's" and phrase as "Remember..." Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:51:17 +09:00
Derrick Stolee	53c36670e7	commit-graph: reduce initial oid allocation While writing a commit-graph file, we store the full list of commits in a flat list. We use this list for sorting and ensuring we are closed under reachability. The initial allocation assumed that (at most) one in four objects is a commit. This is a dramatic over-count for many repos, especially large ones. Since we grow the repo dynamically, reduce this count by a factor of eight. We still set it to a minimum of 1024 before allocating. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:25:05 +09:00
Martin Ågren	0bfb48e672	builtin/commit-graph.c: UNLEAK variables `graph_verify()`, `graph_read()` and `graph_write()` do the hard work of `cmd_commit_graph()`. As soon as these return, so does `cmd_commit_graph()`. `strbuf_getline()` may allocate memory in the strbuf, yet return EOF. We need to release the strbuf or UNLEAK it. Go for the latter since we are close to returning from `graph_write()`. `graph_write()` also fails to free the strings in the string list. They have been added to the list with `strdup_strings` set to 0. We could flip `strdup_strings` before clearing the list, which is our usual hack in situations like this. But since we are about to exit, let's just UNLEAK the whole string list instead. UNLEAK `graph` in `graph_verify`. While at it, and for consistency, UNLEAK in `graph_read()` as well, and remove an unnecessary UNLEAK just before dying. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:25:05 +09:00
Derrick Stolee	f4dbdfc4d5	commit-graph: clean up leaked memory during write The write_commit_graph() method in commit-graph.c leaks some lists and strings during execution. In addition, a list of strings is leaked in write_commit_graph_reachable(). Clean these up so our memory checking is cleaner. Further, if we use a list of pack-files to find the commits, we can leak the packed_git structs after scanning them for commits. Running the following commands demonstrates the leak before and the fix after: * valgrind --leak-check=full ./git commit-graph write --reachable * valgrind --leak-check=full ./git commit-graph write --stdin-packs Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-07 08:25:05 +09:00
Phillip Wood	47cb16a264	diff --color-moved: fix a memory leak Free the hashmap items as well as the hashmap itself. This was found with asan. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 22:48:21 -07:00
Phillip Wood	9c1a6c2bf8	diff --color-moved-ws: fix another memory leak This is obvious in retrospect, it was found with asan. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 22:48:17 -07:00
Phillip Wood	fe4516d103	diff --color-moved-ws: fix a memory leak Don't duplicate the indentation string if we're not going to use it. This was found with asan. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 22:48:12 -07:00
Phillip Wood	cf074a9b0e	diff --color-moved-ws: fix out of bounds string access When adjusting the start of the string to take account of the change in indentation the code was not checking that the string being adjusted was in fact longer than the indentation change. This was detected by asan. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 22:48:07 -07:00
Phillip Wood	74d156f4a1	diff --color-moved-ws: fix double free crash Running git diff --color-moved-ws=allow-indentation-change v2.18.0 v2.19.0 results in a crash due to a double free. This happens when two potential moved blocks start with consecutive lines. As pmb_advance_or_null_multi_match() advances it copies the ws_delta from the last matching line to the next. When the first of our consecutive lines is advanced its ws_delta will be copied to the second, overwriting the ws_delta of the block containing the second line. Then when the second line is advanced it will copy the new ws_delta to the line below it and so on. Eventually one of these blocks will stop matching and the ws_delta will be freed. From then on the other block is in a use-after-free state and when it stops matching it will try to free the ws_delta that has already been freed by the other block. The solution is to store the ws_delta in the array of potential moved blocks rather than with the lines. This means that it no longer needs to be copied around and one block cannot overwrite the ws_delta of another. Additionally it saves some malloc/free calls as we don't keep allocating and freeing ws_deltas. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 22:47:26 -07:00
René Scharfe	8c84ae659e	oidset: uninline oidset_init() There is no need to inline oidset_init(), as it's typically only called twice in the lifetime of an oidset (once at the beginning and at the end by oidset_clear()) and kh_resize_* is quite big, so move its definition to oidset.c. Document it while we're at it. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 11:12:14 -07:00
René Scharfe	8b2f8cbcb1	oidset: use khash Reimplement oidset using khash.h in order to reduce its memory footprint and make it faster. Performance of a command that mainly checks for duplicate objects using an oidset, with master and Clang 6.0.1: $ cmd="./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'" $ /usr/bin/time $cmd >/dev/null 0.22user 0.03system 0:00.25elapsed 99%CPU (0avgtext+0avgdata 48484maxresident)k 0inputs+0outputs (0major+11204minor)pagefaults 0swaps $ hyperfine "$cmd" Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)' Time (mean ± σ): 250.0 ms ± 6.0 ms [User: 225.9 ms, System: 23.6 ms] Range (min … max): 242.0 ms … 261.1 ms And with this patch: $ /usr/bin/time $cmd >/dev/null 0.14user 0.00system 0:00.15elapsed 100%CPU (0avgtext+0avgdata 41396maxresident)k 0inputs+0outputs (0major+8318minor)pagefaults 0swaps $ hyperfine "$cmd" Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)' Time (mean ± σ): 151.9 ms ± 4.9 ms [User: 130.5 ms, System: 21.2 ms] Range (min … max): 148.2 ms … 170.4 ms Initial-patch-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 11:12:13 -07:00
René Scharfe	9249ca26ac	khash: factor out kh_release_* Add a function for releasing the khash-internal allocations, but not the khash structure itself. It can be used with on-stack khash structs. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 11:12:13 -07:00
René Scharfe	22a1646511	fetch-pack: load tip_oids eagerly iff needed tip_oids_contain() lazily loads refs into an oidset at its first call. It abuses the internal (sub)member .map.tablesize of that oidset to check if it has done that already. Determine if the oidset needs to be populated upfront and then do that instead. This duplicates a loop, but simplifies the existing one by separating concerns between the two. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 11:12:13 -07:00
René Scharfe	bf73282c0b	fetch-pack: factor out is_unmatched_ref() Move the code to determine if a request is unmatched to its own little helper. This allows us to reuse it in a subsequent patch. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 11:12:13 -07:00
Jonathan Tan	4c7f9567ea	fetch-pack: exclude blobs when lazy-fetching trees A partial clone with missing trees can be obtained using "git clone --filter=tree:none <repo>". In such a repository, when a tree needs to be lazily fetched, any tree or blob it directly or indirectly references is fetched as well, regardless of whether the original command required those objects, or if the local repository already had some of them. This is because the fetch protocol, which the lazy fetch uses, does not allow clients to request that only the wanted objects be sent, which would be the ideal solution. This patch implements a partial solution: specify the "blob:none" filter, somewhat reducing the fetch payload. This change has no effect when lazily fetching blobs (due to how filters work). And if lazily fetching a commit (such repositories are difficult to construct and is not a use case we support very well, but it is possible), referenced commits and trees are still fetched - only the blobs are not fetched. The necessary code change is done in fetch_pack() instead of somewhere closer to where the "filter" instruction is written to the wire so that only one part of the code needs to be changed in order for users of all protocol versions to benefit from this optimization. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 06:03:49 -07:00
Jonathan Tan	12f19a9825	fetch-pack: avoid object flags if no_dependents When fetch_pack() is invoked as part of another Git command (due to a lazy fetch from a partial clone, for example), it uses object flags that may already be used by the outer Git command. The commit that introduced the lazy fetch feature (`88e2f9ed8e` ("introduce fetch-object: fetch one promisor object", 2017-12-05)) tried to avoid this overlap, but it did not avoid it totally. It was successful in avoiding writing COMPLETE, but did not avoid reading COMPLETE, and did not avoid writing and reading ALTERNATE. Ensure that no flags are written or read by fetch_pack() in the case where it is used to perform a lazy fetch. To do this, it is sufficient to avoid checking completeness of wanted refs (unnecessary in the case of lazy fetches), and to avoid negotiation-related work (in the current implementation, already, no negotiation is performed). After that was done, the lack of overlap was verified by checking all direct and indirect usages of COMPLETE and ALTERNATE - that they are read or written only if no_dependents is false. There are other possible solutions to this issue: (1) Split fetch-pack.{c,h} into a flag-using part and a non-flag-using part, and whenever no_dependents is set, only use the non-flag-using part. (2) Make fetch_pack() be able to be used with arbitrary repository objects. fetch_pack() should then create its own repository object based on the given repository object, with its own object hashtable, so that the flags do not conflict. (1) is possible but invasive - some functions would need to be split; and such invasiveness would potentially be unnecessary if we ever were to need (2) anyway. (2) would be useful if we were to support, say, submodules that were partial clones themselves, but I don't know when or if the Git project plans to support those. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 06:00:53 -07:00
Johannes Schindelin	2939a1f703	mingw: bump the minimum Windows version to Vista Quite some time ago, a last plea to the XP users out there who want to see Windows XP support in Git for Windows, asking them to get engaged and help, vanished into the depths of the universe. We tried for a long time to play nice with the last remaining XP users who somehow manage to build Git from source, but a recent update of mingw-w64 (7.0.0.5233.e0c09544 -> 7.0.0.5245.edf66197) finally dropped the last sign of XP support, and Git for Windows' SDK is no longer able to build core Git's `master` branch as a consequence. (Git for Windows' `master` branch already bumped the minimum Windows version to Vista a while ago, so it is fine.) It is time to require Windows Vista or later to build Git from source. This, incidentally, lets us use quite a few nice new APIs. It also means that we no longer need the inet_pton() and inet_ntop() emulation, which is nice. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 05:39:56 -07:00
Johannes Schindelin	3571e78aa4	mingw: set _WIN32_WINNT explicitly for Git for Windows Previously, we only ever declared a target Windows version if compiling with Visual C. Which meant that we were relying on the MinGW headers to guess which Windows version we want to target... Let's be explicit about it, in particular because we actually want to bump the target Windows version to Vista (which we will do in the next commit). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 05:39:56 -07:00
Johannes Schindelin	d7e357fb9c	compat/poll: prepare for targeting Windows Vista Windows Vista (and later) actually have a working poll(), but we still cannot use it because it only works on sockets. So let's detect when we are targeting Windows Vista and undefine those constants, and define `pollfd` so that we can declare our own pollfd struct. We also need to make sure that we override those constants after `winsock2.h` has been `#include`d (otherwise we would not really override those constants). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 05:39:56 -07:00

53334 Commits