git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	b99a579f8e	Merge branch 'sb/more-repo-in-api' The in-core repository instances are passed through more codepaths. * sb/more-repo-in-api: (23 commits) t/helper/test-repository: celebrate independence from the_repository path.h: make REPO_GIT_PATH_FUNC repository agnostic commit: prepare free_commit_buffer and release_commit_memory for any repo commit-graph: convert remaining functions to handle any repo submodule: don't add submodule as odb for push submodule: use submodule repos for object lookup pretty: prepare format_commit_message to handle arbitrary repositories commit: prepare logmsg_reencode to handle arbitrary repositories commit: prepare repo_unuse_commit_buffer to handle any repo commit: prepare get_commit_buffer to handle any repo commit-reach: prepare in_merge_bases[_many] to handle any repo commit-reach: prepare get_merge_bases to handle any repo commit-reach.c: allow get_merge_bases_many_0 to handle any repo commit-reach.c: allow remove_redundant to handle any repo commit-reach.c: allow merge_bases_many to handle any repo commit-reach.c: allow paint_down_to_common to handle any repo commit: allow parse_commit* to handle any repo object: parse_object to honor its repository argument object-store: prepare has_{sha1, object}_file to handle any repo object-store: prepare read_object_file to deal with any repo ...	2019-02-05 14:26:09 -08:00
Junio C Hamano	371820d5f1	Merge branch 'bc/tree-walk-oid' The code to walk tree objects has been taught that we may be working with object names that are not computed with SHA-1. * bc/tree-walk-oid: cache: make oidcpy always copy GIT_MAX_RAWSZ bytes tree-walk: store object_id in a separate member match-trees: use hashcpy to splice trees match-trees: compute buffer offset correctly when splicing tree-walk: copy object ID before use	2019-01-29 12:47:56 -08:00
brian m. carlson	ea82b2a085	tree-walk: store object_id in a separate member When parsing a tree, we read the object ID directly out of the tree buffer. This is normally fine, but such an object ID cannot be used with oidcpy, which copies GIT_MAX_RAWSZ bytes, because if we are using SHA-1, there may not be that many bytes to copy. Instead, store the object ID in a separate struct member. Since we can no longer efficiently compute the path length, store that information as well in struct name_entry. Ensure we only copy the object ID into the new buffer if the path length is nonzero, as some callers will pass us an empty path with no object ID following it, and we will not want to read past the end of the buffer. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-15 09:57:41 -08:00
René Scharfe	d4e19e5163	object-store: factor out odb_clear_loose_cache() Add and use a function for emptying the loose object cache, so callers don't have to know any of its implementation details. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Junio C Hamano	3b2f8a02fa	Merge branch 'jk/loose-object-cache' Code clean-up with optimization for the codepath that checks (non-)existence of loose objects. * jk/loose-object-cache: odb_load_loose_cache: fix strbuf leak fetch-pack: drop custom loose object cache sha1-file: use loose object cache for quick existence check object-store: provide helpers for loose_objects_cache sha1-file: use an object_directory for the main object dir handle alternates paths the same as the main object dir sha1_file_name(): overwrite buffer instead of appending rename "alternate_object_database" to "object_directory" submodule--helper: prefer strip_suffix() to ends_with() fsck: do not reuse child_process structs	2019-01-04 13:33:32 -08:00
Junio C Hamano	5fb9263295	Merge branch 'ds/test-multi-pack-index' Tests for the recently introduced multi-pack index machinery. * ds/test-multi-pack-index: packfile: close multi-pack-index in close_all_packs multi-pack-index: define GIT_TEST_MULTI_PACK_INDEX midx: close multi-pack-index on repack midx: fix broken free() in close_midx()	2018-11-13 22:37:19 +09:00
Jeff King	3a2e08245c	object-store: provide helpers for loose_objects_cache Our object_directory struct has a loose objects cache that all users of the struct can see. But the only one that knows how to load the cache is find_short_object_filename(). Let's extract that logic into a reusable function. While we're at it, let's also reset the cache when we re-read the object directories. This shouldn't have an impact on performance, as re-reads are meant to be rare (and are already expensive, so we avoid them with things like OBJECT_INFO_QUICK). Since the cache is already meant to be an approximation, it's tempting to skip even this bit of safety. But it's necessary to allow more code to use it. For instance, fetch-pack explicitly re-reads the object directory after performing its fetch, and would be confused if we didn't clear the cache. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-13 14:22:03 +09:00
Jeff King	f0eaf63819	sha1-file: use an object_directory for the main object dir Our handling of alternate object directories is needlessly different from the main object directory. As a result, many places in the code basically look like this: do_something(r->objects->objdir); for (odb = r->objects->alt_odb_list; odb; odb = odb->next) do_something(odb->path); That gets annoying when do_something() is non-trivial, and we've resorted to gross hacks like creating fake alternates (see find_short_object_filename()). Instead, let's give each raw_object_store a unified list of object_directory structs. The first will be the main store, and everything after is an alternate. Very few callers even care about the distinction, and can just loop over the whole list (and those who care can just treat the first element differently). A few observations: - we don't need r->objects->objectdir anymore, and can just mechanically convert that to r->objects->odb->path - object_directory's path field needs to become a real pointer rather than a FLEX_ARRAY, in order to fill it with expand_base_dir() - we'll call prepare_alt_odb() earlier in many functions (i.e., outside of the loop). This may result in us calling it even when our function would be satisfied looking only at the main odb. But this doesn't matter in practice. It's not a very expensive operation in the first place, and in the majority of cases it will be a noop. We call it already (and cache its results) in prepare_packed_git(), and we'll generally check packs before loose objects. So essentially every program is going to call it immediately once per program. Arguably we should just prepare_alt_odb() immediately upon setting up the repository's object directory, which would save us sprinkling calls throughout the code base (and forgetting to do so has been a source of subtle bugs in the past). But I've stopped short of that here, since there are already a lot of other moving parts in this patch. - Most call sites just get shorter. The check_and_freshen() functions are an exception, because they have entry points to handle local and nonlocal directories separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-13 14:22:03 +09:00
Jeff King	263db403fa	rename "alternate_object_database" to "object_directory" In preparation for unifying the handling of alt odb's and the normal repo object directory, let's use a more neutral name. This patch is purely mechanical, swapping the type name, and converting any variables named "alt" to "odb". There should be no functional change, but it will reduce the noise in subsequent diffs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-13 14:22:02 +09:00
Junio C Hamano	d829d491ee	Merge branch 'bc/hash-transition-part-15' More codepaths are moving away from hardcoded hash sizes. * bc/hash-transition-part-15: rerere: convert to use the_hash_algo submodule: make zero-oid comparison hash function agnostic apply: rename new_sha1_prefix and old_sha1_prefix apply: replace hard-coded constants tag: express constant in terms of the_hash_algo transport: use parse_oid_hex instead of a constant upload-pack: express constants in terms of the_hash_algo refs/packed-backend: express constants using the_hash_algo packfile: express constants in terms of the_hash_algo pack-revindex: express constants in terms of the_hash_algo builtin/fetch-pack: remove constants with parse_oid_hex builtin/mktree: remove hard-coded constant builtin/repack: replace hard-coded constants pack-bitmap-write: use GIT_MAX_RAWSZ for allocation object_id.cocci: match only expressions of type 'struct object_id'	2018-10-30 15:43:42 +09:00
Derrick Stolee	dc7d664335	packfile: close multi-pack-index in close_all_packs Whenever we delete pack-files from the object directory, we need to also delete the multi-pack-index that may refer to those objects. Sometimes, this connection is obvious, like during a repack. Other times, this is less obvious, like when gc calls a repack command and then does other actions on the objects, like write a commit-graph file. The pattern we use to avoid out-of-date in-memory packed_git structs is to call close_all_packs(). This should also call close_midx(). Since we already pass an object store to close_all_packs(), this is a nicely scoped operation. This fixes a test failure when running t6500-gc.sh with GIT_TEST_MULTI_PACK_INDEX=1. Reported-by: Szeder Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-26 11:49:06 +09:00
Stefan Beller	33b94066f2	packfile: allow has_packed_and_bad to handle arbitrary repositories has_packed_and_bad is not widely used, so just migrate it all at once. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-19 16:21:10 +09:00
Josh Steadmon	1127a98cce	fuzz: add fuzz testing for packfile indices. Breaks the majority of check_packed_git_idx() into a separate function, load_idx(). The latter function operates on arbitrary buffers, which makes it suitable as a fuzzing test target. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-15 14:29:03 +09:00
brian m. carlson	268babd6fb	packfile: express constants in terms of the_hash_algo Replace uses of GIT_SHA1_RAWSZ with references to the_hash_algo to avoid dependence on a particular hash length. It's likely that in the future, we'll update the pack format to indicate what hash algorithm it uses, and then this code will change. However, at least on an interim basis, make it easier to develop on a pure SHA-256 Git by using the_hash_algo here. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-15 12:53:15 +09:00
Junio C Hamano	769af0fd9e	Merge branch 'jk/cocci' spatch transformation to replace boolean uses of !hashcmp() to newly introduced oideq() is added, and applied, to regain performance lost due to support of multiple hash algorithms. * jk/cocci: show_dirstat: simplify same-content check read-cache: use oideq() in ce_compare functions convert hashmap comparison functions to oideq() convert "hashcmp() != 0" to "!hasheq()" convert "oidcmp() != 0" to "!oideq()" convert "hashcmp() == 0" to hasheq() convert "oidcmp() == 0" to oideq() introduce hasheq() and oideq() coccinelle: use <...> for function exclusion	2018-09-17 13:53:57 -07:00
Jeff King	67947c34ae	convert "hashcmp() != 0" to "!hasheq()" This rounds out the previous three patches, covering the inequality logic for the "hash" variant of the functions. As with the previous three, the accompanying code changes are the mechanical result of applying the coccinelle patch; see those patches for more discussion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Jeff King	e3ff0683e2	convert "hashcmp() == 0" to hasheq() This is the partner patch to the previous one, but covering the "hash" variants instead of "oid". Note that our coccinelle rule is slightly more complex to avoid triggering the call in hasheq(). I didn't bother to add a new rule to convert: - hasheq(E1->hash, E2->hash) + oideq(E1, E2) Since these are new functions, there won't be any such existing callers. And since most of the code is already using oideq, we're not likely to introduce new ones. We might still see "!hashcmp(E1->hash, E2->hash)" from topics in flight. But because our new rule comes after the existing ones, that should first get converted to "!oidcmp(E1, E2)" and then to "oideq(E1, E2)". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Derrick Stolee	454ea2e4d7	treewide: use get_all_packs There are many places in the codebase that want to iterate over all packfiles known to Git. The purposes are wide-ranging, and those that can take advantage of the multi-pack-index already do. So, use get_all_packs() instead of get_packed_git() to be sure we are iterating over all packfiles. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-20 15:31:40 -07:00
Derrick Stolee	0bff5269d3	packfile: add all_packs list If a repo contains a multi-pack-index, then the packed_git list does not contain the packfiles that are covered by the multi-pack-index. This is important for doing object lookups, abbreviations, and approximating object count. However, there are many operations that really want to iterate over all packfiles. Create a new 'all_packs' linked list that contains this list, starting with the packfiles in the multi-pack-index and then continuing along the packed_git linked list. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-20 15:31:40 -07:00
Derrick Stolee	fe86c3beb5	midx: stop reporting garbage When prepare_packed_git is called with the report_garbage method initialized, we report unexpected files in the objects directory as garbage. Stop reporting the multi-pack-index and the pack-files it covers as garbage. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-20 15:31:39 -07:00
Derrick Stolee	2cf489a3bf	multi-pack-index: store local property A pack-file is 'local' if it is stored within the usual object directory. If it is stored in an alternate, it is non-local. Pack-files are stored using a 'pack_local' member in the packed_git struct. Add a similar 'local' member to the multi_pack_index struct and 'local' parameters to the methods that load and prepare multi-pack-indexes. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-20 15:31:39 -07:00
Junio C Hamano	c00ba2233e	Sync 'ds/multi-pack-index' to v2.19.0-rc0 * ds/multi-pack-index: (23 commits) midx: clear midx on repack packfile: skip loading index if in multi-pack-index midx: prevent duplicate packfile loads midx: use midx in approximate_object_count midx: use existing midx when writing new one midx: use midx in abbreviation calculations midx: read objects from multi-pack-index config: create core.multiPackIndex setting midx: write object offsets midx: write object id fanout chunk midx: write object ids in a chunk midx: sort and deduplicate objects from packfiles midx: read pack names into array multi-pack-index: write pack names in chunk multi-pack-index: read packfile list packfile: generalize pack directory list t5319: expand test data multi-pack-index: load into memory midx: write header information to lockfile multi-pack-index: add 'write' verb ...	2018-08-20 15:29:54 -07:00
Jeff King	736eb88fdc	for_each_packed_object: support iterating in pack-order We currently iterate over objects within a pack in .idx order, which uses the object hashes. That means that it is effectively random with respect to the location of the object within the pack. If you're going to access the actual object data, there are two reasons to move linearly through the pack itself: 1. It improves the locality of access in the packfile. In the cold-cache case, this may mean fewer disk seeks, or better usage of disk cache. 2. We store related deltas together in the packfile. Which means that the delta base cache can operate much more efficiently if we visit all of those related deltas in sequence, as the earlier items are likely to still be in the cache. Whereas if we visit the objects in random order, our cache entries are much more likely to have been evicted by unrelated deltas in the meantime. So in general, if you're going to access the object contents, pack order is going to end up more efficient. But if you're simply generating a list of object names, or if you're going to end up sorting the result anyway, you're better off just using the .idx order, as finding the pack order means generating the in-memory pack-revindex. According to the numbers in `8b8dfd5132` (pack-revindex: radix-sort the revindex, 2013-07-11), that takes about 200ms for linux.git, and 20ms for git.git (those numbers are a few years old but are still a good ballpark). That makes it a good optimization for some cases (we can save tens of seconds in git.git by having good locality of delta access, for a 20ms cost), but a bad one for others (e.g., right now "cat-file --batch-all-objects --batch-check="%(objectname)" is 170ms in git.git, so adding 20ms to that is noticeable). Hence this patch makes it an optional flag. You can't actually do any interesting timings yet, as it's not plumbed through to any user-facing tools like cat-file. That will come in a later patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:48:28 -07:00
Jeff King	a7ff6f5a0f	for_each_*_object: take flag arguments as enum It's not wrong to pass our flags in an "unsigned", as we know it will be at least as large as the enum. However, using the enum in the declaration makes it more obvious where to find the list of flags. While we're here, let's also drop the "extern" noise-words from the declarations, per our modern coding style. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:48:25 -07:00
Derrick Stolee	17c35c8969	packfile: skip loading index if in multi-pack-index Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 11:27:29 -07:00
Derrick Stolee	f3a002bd84	midx: prevent duplicate packfile loads The multi-pack-index, when present, tracks the existence of objects and their offsets within a list of packfiles. This allows us to use the multi-pack-index for object lookups, abbreviations, and object counts. When the multi-pack-index tracks a packfile, then we do not need to add that packfile to the packed_git linked list or the MRU list. We still need to load the packfiles that are not tracked by the multi-pack-index. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 11:27:29 -07:00
Derrick Stolee	b8990fbfed	midx: use midx in approximate_object_count Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 11:27:29 -07:00
Derrick Stolee	8aac67a174	midx: use midx in abbreviation calculations Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 11:27:28 -07:00
Derrick Stolee	3715a6335c	midx: read objects from multi-pack-index Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 11:27:28 -07:00
Derrick Stolee	c4d25228eb	config: create core.multiPackIndex setting The core.multiPackIndex config setting controls the multi-pack-index (MIDX) feature. If false, the setting will disable all reads from the multi-pack-index file. Read this config setting in the new prepare_multi_pack_index_one() which is called during prepare_packed_git(). This check is run once per repository. Add comparison commands in t5319-multi-pack-index.sh to check typical Git behavior remains the same as the config setting is turned on and off. This currently includes 'git rev-list' and 'git log' commands to trigger several object database reads. Currently, these would only catch an error in the prepare_multi_pack_index_one(), but with later commits will catch errors in object lookups, abbreviations, and approximate object counts. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 11:27:28 -07:00
Derrick Stolee	fe1ed56f5e	midx: sort and deduplicate objects from packfiles Before writing a list of objects and their offsets to a multi-pack-index, we need to collect the list of objects contained in the packfiles. There may be multiple copies of some objects, so this list must be deduplicated. It is possible to artificially get into a state where there are many duplicate copies of objects. That can create high memory pressure if we are to create a list of all objects before de-duplication. To reduce this memory pressure without a significant performance drop, automatically group objects by the first byte of their object id. Use the IDX fanout tables to group the data, copy to a local array, then sort. Copy only the de-duplicated entries. Select the duplicate based on the most-recent modified time of a packfile containing the object. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 11:27:28 -07:00
Derrick Stolee	9208e318f5	packfile: generalize pack directory list In anticipation of sharing the pack directory listing with the multi-pack-index, generalize prepare_packed_git_one() into for_each_file_in_pack_dir(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 11:27:28 -07:00
Stefan Beller	109cd76dd3	object: add repository argument to parse_object Add a repository argument to allow the callers of parse_object to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Junio C Hamano	faff81287b	Merge branch 'jl/zlib-restore-nul-termination' Make zlib inflate codepath more robust against versions of zlib that clobber unused portion of outbuf. * jl/zlib-restore-nul-termination: packfile: correct zlib buffer handling	2018-06-18 10:18:43 -07:00
Jeremy Linton	b611396e97	packfile: correct zlib buffer handling The buffer being passed to zlib includes a NUL terminator that git needs to keep in place. unpack_compressed_entry() attempts to detect the case that the source buffer hasn't been fully consumed by checking to see if the destination buffer has been over consumed. This causes a problem: more recent zlib patches have been poisoning the unconsumed portions of the buffer, which overwrites the NUL byte, while correctly returning length and status. Let's place the NUL at the end of the buffer after inflate returns to ensure that it doesn't result in problems for git even if it's been overwritten by zlib. Signed-off-by: Jeremy Linton <lintonrjeremy@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-13 11:34:27 -07:00
Junio C Hamano	42c8ce1c49	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (42 commits) merge-one-file: compute empty blob object ID add--interactive: compute the empty tree value Update shell scripts to compute empty tree object ID sha1_file: only expose empty object constants through git_hash_algo dir: use the_hash_algo for empty blob object ID sequencer: use the_hash_algo for empty tree object ID cache-tree: use is_empty_tree_oid sha1_file: convert cached object code to struct object_id builtin/reset: convert use of EMPTY_TREE_SHA1_BIN builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX wt-status: convert two uses of EMPTY_TREE_SHA1_HEX submodule: convert several uses of EMPTY_TREE_SHA1_HEX sequencer: convert one use of EMPTY_TREE_SHA1_HEX merge: convert empty tree constant to the_hash_algo builtin/merge: switch tree functions to use object_id builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo sha1-file: add functions for hex empty tree and blob OIDs builtin/receive-pack: avoid hard-coded constants for push certs diff: specify abbreviation size in terms of the_hash_algo upload-pack: replace use of several hard-coded constants ...	2018-05-30 14:04:10 +09:00
Junio C Hamano	50f08db594	Merge branch 'js/use-bug-macro' Developer support update, by using BUG() macro instead of die() to mark codepaths that should not happen more clearly. * js/use-bug-macro: BUG_exit_code: fix sparse "symbol not declared" warning Convert remaining die*(BUG) messages Replace all die("BUG: ...") calls by BUG() ones run-command: use BUG() to report bugs, not die() test-tool: help verifying BUG() code paths	2018-05-30 14:04:07 +09:00
Junio C Hamano	fcb6df3254	Merge branch 'sb/oid-object-info' The codepath around object-info API has been taught to take the repository object (which in turn tells the API which object store the objects are to be located). * sb/oid-object-info: cache.h: allow oid_object_info to handle arbitrary repositories packfile: add repository argument to cache_or_unpack_entry packfile: add repository argument to unpack_entry packfile: add repository argument to read_object packfile: add repository argument to packed_object_info packfile: add repository argument to packed_to_object_type packfile: add repository argument to retry_bad_packed_offset cache.h: add repository argument to oid_object_info cache.h: add repository argument to oid_object_info_extended	2018-05-23 14:38:16 +09:00
Junio C Hamano	c89b6e136e	Merge branch 'ds/lazy-load-trees' The code has been taught to use the duplicated information stored in the commit-graph file to learn the tree object name for a commit to avoid opening and parsing the commit object when it makes sense to do so. * ds/lazy-load-trees: coccinelle: avoid wrong transformation suggestions from commit.cocci commit-graph: lazy-load trees for commits treewide: replace maybe_tree with accessor methods commit: create get_commit_tree() method treewide: rename tree to maybe_tree	2018-05-23 14:38:13 +09:00
Junio C Hamano	b10edb2df5	Merge branch 'ds/commit-graph' Precompute and store information necessary for ancestry traversal in a separate file to optimize graph walking. * ds/commit-graph: commit-graph: implement "--append" option commit-graph: build graph from starting commits commit-graph: read only from specific pack-indexes commit: integrate commit graph with commit parsing commit-graph: close under reachability commit-graph: add core.commitGraph setting commit-graph: implement git commit-graph read commit-graph: implement git-commit-graph write commit-graph: implement write_commit_graph() commit-graph: create git-commit-graph builtin graph: add commit graph design document commit-graph: add format document csum-file: refactor finalize_hashfile() method csum-file: rename hashclose() to finalize_hashfile()	2018-05-08 15:59:20 +09:00
Johannes Schindelin	033abf97fc	Replace all die("BUG: ...") calls by BUG() ones In `d8193743e0` (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in `588a538ae5` (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-06 19:06:13 +09:00
brian m. carlson	37fec86a83	packfile: abstract away hash constant values There are several instances of the constant 20 and 20-based values in the packfile code. Abstract away dependence on SHA-1 by using the values from the_hash_algo instead. Use unsigned values for temporary constants to provide the compiler with more information about what kinds of values it should expect. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:50 +09:00
brian m. carlson	544443cb3c	packfile: convert find_pack_entry to object_id Convert find_pack_entry and the static function fill_pack_entry to take pointers to struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:49 +09:00
brian m. carlson	14c3c80c81	packfile: convert has_sha1_pack to object_id Convert this function to take a pointer to struct object_id and rename it has_object_pack for consistency with has_object_file. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:49 +09:00
brian m. carlson	c51c39418b	packfile: remove unused member from struct pack_entry The sha1 member in struct pack_entry is unused except for one instance in which we store a value in it. Since nobody ever reads this value, don't bother to compute it and remove the member from struct pack_entry. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:49 +09:00
Stefan Beller	9d98354f48	cache.h: allow oid_object_info to handle arbitrary repositories This also involves adapting oid_object_info_extended and some internal functions that are used to implement these. It all has to happen in one patch, because a single recursive chain of calls visits all these functions. oid_object_info_extended is also used in partial clones, which allow fetching missing objects. As this series will not add the repository struct to the transport code and fetch_object(), add a TODO note and omit fetching if a user tries to use a partial clone in a repository other than the_repository. Among the functions modified to handle arbitrary repositories, unpack_entry() is one of them. Note that it still references the globals "delta_base_cache" and "delta_base_cached", but those are safe to be referenced (the former is indexed partly by "struct packed_git *", which is repo-specific, and the latter is only used to limit the size of the former as an optimization). Helped-by: Brandon Williams <bmwill@google.com> Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Stefan Beller	589de91185	packfile: add repository argument to cache_or_unpack_entry Add a repository argument to allow the callers of cache_or_unpack_entry to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Stefan Beller	57a6a500be	packfile: add repository argument to unpack_entry Add a repository argument to allow the callers of unpack_entry to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Stefan Beller	5da6534dd6	packfile: add repository argument to read_object Add a repository argument to allow the callers of read_object to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Jonathan Nieder	720aaa1a74	packfile: add repository argument to packed_object_info Add a repository argument to allow callers of packed_object_info to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Stefan Beller	144f4948a1	packfile: add repository argument to packed_to_object_type Add a repository argument to allow the callers of packed_to_object_type to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Stefan Beller	0df23781fe	packfile: add repository argument to retry_bad_packed_offset Add a repository argument to allow the callers of retry_bad_packed_offset to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Stefan Beller	0df8e96566	cache.h: add repository argument to oid_object_info Add a repository argument to allow the callers of oid_object_info to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Stefan Beller	7ecd869060	cache.h: add repository argument to oid_object_info_extended Add a repository argument to allow oid_object_info_extended callers to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Junio C Hamano	3a1ec60c43	Merge branch 'sb/packfiles-in-repository' Refactoring of the internal global data structure continues. * sb/packfiles-in-repository: packfile: keep prepare_packed_git() private packfile: allow find_pack_entry to handle arbitrary repositories packfile: add repository argument to find_pack_entry packfile: allow reprepare_packed_git to handle arbitrary repositories packfile: allow prepare_packed_git to handle arbitrary repositories packfile: allow prepare_packed_git_one to handle arbitrary repositories packfile: add repository argument to reprepare_packed_git packfile: add repository argument to prepare_packed_git packfile: add repository argument to prepare_packed_git_one packfile: allow install_packed_git to handle arbitrary repositories packfile: allow rearrange_packed_git to handle arbitrary repositories packfile: allow prepare_packed_git_mru to handle arbitrary repositories	2018-04-11 13:09:55 +09:00
Junio C Hamano	cf0b1793ea	Merge branch 'sb/object-store' Refactoring the internal global data structure to make it possible to open multiple repositories, work with and then close them. Rerolled by Duy on top of a separate preliminary clean-up topic. The resulting structure of the topics looked very sensible. * sb/object-store: (27 commits) sha1_file: allow sha1_loose_object_info to handle arbitrary repositories sha1_file: allow map_sha1_file to handle arbitrary repositories sha1_file: allow map_sha1_file_1 to handle arbitrary repositories sha1_file: allow open_sha1_file to handle arbitrary repositories sha1_file: allow stat_sha1_file to handle arbitrary repositories sha1_file: allow sha1_file_name to handle arbitrary repositories sha1_file: add repository argument to sha1_loose_object_info sha1_file: add repository argument to map_sha1_file sha1_file: add repository argument to map_sha1_file_1 sha1_file: add repository argument to open_sha1_file sha1_file: add repository argument to stat_sha1_file sha1_file: add repository argument to sha1_file_name sha1_file: allow prepare_alt_odb to handle arbitrary repositories sha1_file: allow link_alt_odb_entries to handle arbitrary repositories sha1_file: add repository argument to prepare_alt_odb sha1_file: add repository argument to link_alt_odb_entries sha1_file: add repository argument to read_info_alternates sha1_file: add repository argument to link_alt_odb_entry sha1_file: add raw_object_store argument to alt_odb_usable pack: move approximate object count to object store ...	2018-04-11 13:09:55 +09:00
Derrick Stolee	2e27bd7731	treewide: replace maybe_tree with accessor methods In anticipation of making trees load lazily, create a Coccinelle script (contrib/coccinelle/commit.cocci) to ensure that all references to the 'maybe_tree' member of struct commit are either mutations or accesses through get_commit_tree() or get_commit_tree_oid(). Apply the Coccinelle script to create the rest of the patch. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-11 10:47:16 +09:00
Derrick Stolee	891435d55d	treewide: rename tree to maybe_tree Using the commit-graph file to walk commit history removes the large cost of parsing commits during the walk. This exposes a performance issue: lookup_tree() takes a large portion of the computation time, even when Git never uses those trees. In anticipation of lazy-loading these trees, rename the 'tree' member of struct commit to 'maybe_tree'. This serves two purposes: it hints at the future role of possibly being NULL even if the commit has a valid tree, and it allows for unambiguous transformation from simple member access (i.e. commit->maybe_tree) to method access. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-11 10:47:16 +09:00
Junio C Hamano	2d5792f071	Merge branch 'bw/c-plus-plus' into ds/lazy-load-trees * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...	2018-04-11 10:46:32 +09:00
Derrick Stolee	049d51a2bb	commit-graph: read only from specific pack-indexes Teach git-commit-graph to inspect the objects only in a certain list of pack-indexes within the given pack directory. This allows updating the commit graph iteratively. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-11 10:43:02 +09:00
Nguyễn Thái Ngọc Duy	464416a2ea	packfile: keep prepare_packed_git() private The reason callers have to call this is to make sure either packed_git or packed_git_mru pointers are initialized since we don't do that by default. Sometimes it's hard to see this connection between where the function is called and where packed_git pointer is used (sometimes in separate functions). Keep this dependency internal because now all access to packed_git and packed_git_mru must go through get_xxx() wrappers. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	0a0dd632aa	packfile: allow find_pack_entry to handle arbitrary repositories Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	613b42f283	packfile: add repository argument to find_pack_entry While at it move the documentation to the header and mention which pack files are searched. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	4c2a13b4e2	packfile: allow reprepare_packed_git to handle arbitrary repositories Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	0f90a9f27e	packfile: allow prepare_packed_git to handle arbitrary repositories Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	935cdd6922	packfile: allow prepare_packed_git_one to handle arbitrary repositories Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	a49d283435	packfile: add repository argument to reprepare_packed_git See previous patch for explanation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	6fdb4e9f5a	packfile: add repository argument to prepare_packed_git Add a repository argument to allow prepare_packed_git callers to be more specific about which repository to handle. See commit "sha1_file: add repository argument to link_alt_odb_entry" for an explanation of the #define trick. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	072a109356	packfile: add repository argument to prepare_packed_git_one Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	5babff16d9	packfile: allow install_packed_git to handle arbitrary repositories This conversion was done without the #define trick used in the earlier series refactoring to have better repository access, because this function is easy to review, as it only has one caller and all lines but the first two are converted. We must not convert 'pack_open_fds' to be a repository specific variable, as it is used to monitor resource usage of the machine that Git executes on. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	c235beac4e	packfile: allow rearrange_packed_git to handle arbitrary repositories Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	804be79690	packfile: allow prepare_packed_git_mru to handle arbitrary repositories This conversion was done without the #define trick used in the earlier series refactoring to have better repository access, because this function is easy to review, as all lines are converted and it has only one caller. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:07:43 -07:00
Stefan Beller	0b20903405	sha1_file: add repository argument to prepare_alt_odb See previous patch for explanation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:55 -07:00
Stefan Beller	9a00580d03	pack: move approximate object count to object store The approximate_object_count() function maintains a rough count of objects in a repository to estimate how long object name abbreviations should be. Object names are scoped to a repository and the appropriate length may differ by repository, so the object count should not be global. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:55 -07:00
Stefan Beller	5508f69348	pack: move prepare_packed_git_run_once to object store Each repository's object store can be initialized independently, so they must not share a run_once variable. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:55 -07:00
Stefan Beller	d0b5986622	object-store: close all packs upon clearing the object store Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:55 -07:00
Stefan Beller	a80d72db2a	object-store: move packed_git and packed_git_mru to object store In a process with multiple repositories open, packfile accessors should be associated to a single repository and not shared globally. Move packed_git and packed_git_mru into the_repository and adjust callers to reflect this. [nd: while at it, wrap access to these two fields in get_packed_git() and get_packed_git_mru(). This allows us to lazily initialize these fields without callers doing that explicitly] Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:46 -07:00
Stefan Beller	031dc927f4	object-store: move alt_odb_list and alt_odb_tail to object store In a process with multiple repositories open, alternates should be associated to a single repository and not shared globally. Move alt_odb_list and alt_odb_tail into the_repository and adjust callers to reflect this. Now that the alternative object data base is per repository, we're leaking its memory upon freeing a repository. The next patch plugs this hole. No functional change intended. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-23 11:06:01 -07:00
Stefan Beller	0d4a132144	object-store: migrate alternates struct and functions from cache.h Migrate the struct alternate_object_database and all its related functions to the object store, as these functions are easier to find in that header. The migration is just a verbatim copy; there is no need to include the object store header in any C file, because cache.h includes repository.h, which in turn includes object-store.h. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-23 11:06:01 -07:00
Derrick Stolee	3d475f46a8	packfile: define and use bsearch_pack() The method bsearch_hash() generalizes binary searches using a fanout table. The only consumer is currently find_pack_entry_one(). It requires a bit of pointer arithmetic to align the fanout table and the lookup table depending on the pack-index version. Extract the pack-index pointer arithmetic to a new method, bsearch_pack(), so this can be re-used in other code paths. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-22 11:00:07 -07:00
brian m. carlson	abef9020e3	sha1_file: convert sha1_object_info* to object_id Convert sha1_object_info and sha1_object_info_extended to take pointers to struct object_id and rename them to use "oid" instead of "sha1" in their names. Update the declaration and definition and apply the following semantic patch, plus the standard object_id transforms: @@ expression E1, E2; @@ - sha1_object_info(E1.hash, E2) + oid_object_info(&E1, E2) @@ expression E1, E2; @@ - sha1_object_info(E1->hash, E2) + oid_object_info(E1, E2) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1.hash, E2, E3) + oid_object_info_extended(&E1, E2, E3) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1->hash, E2, E3) + oid_object_info_extended(E1, E2, E3) Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:49 -07:00
brian m. carlson	4310b0c441	packfile: convert unpack_entry to struct object_id Convert unpack_entry and read_object to use struct object_id. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:49 -07:00
brian m. carlson	d169d6644c	sha1_file: convert retry_bad_packed_offset to struct object_id Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:49 -07:00
Junio C Hamano	7d4bebfe93	Merge branch 'jt/binsearch-with-fanout' into HEAD * jt/binsearch-with-fanout: packfile: refactor hash search with fanout table packfile: remove GIT_DEBUG_LOOKUP log statements	2018-03-13 13:34:04 -07:00
Junio C Hamano	169c9c0169	Merge branch 'bw/c-plus-plus' Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that target C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...	2018-03-06 14:54:07 -08:00
Junio C Hamano	f2fcbeb3bf	Merge branch 'jt/binsearch-with-fanout' Refactor the code to binary search starting from a fan-out table (which is how the packfile is indexed with object names) into a reusable helper. * jt/binsearch-with-fanout: packfile: refactor hash search with fanout table packfile: remove GIT_DEBUG_LOOKUP log statements	2018-02-27 10:34:03 -08:00
Jonathan Tan	b4e00f7306	packfile: refactor hash search with fanout table Subsequent patches will introduce file formats that make use of a fanout array and a sorted table containing hashes, just like packfiles. Refactor the hash search in packfile.c into its own function, so that those patches can make use of it as well. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-15 13:08:55 -08:00
Jonathan Tan	4669e7d68e	packfile: remove GIT_DEBUG_LOOKUP log statements In commit `628522ec14` ("sha1-lookup: more memory efficient search in sorted list of SHA-1", 2008-04-09), a different algorithm for searching a sorted list was introduced, together with a set of log statements guarded by GIT_DEBUG_LOOKUP that are invoked both when using that algorithm and when using the existing binary search. Those log statements were meant for experiments and debugging, but with the removal of the aforementioned different algorithm in commit `f1068efefe` ("sha1_file: drop experimental GIT_USE_LOOKUP search", 2017-08-09), those log statements are probably no longer necessary. Remove those statements. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-15 13:08:53 -08:00
Brandon Williams	debca9d2fe	object: rename function 'typename' to 'type_name' Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-14 13:10:05 -08:00
Brandon Williams	6ca32f4714	object_info: change member name from 'typename' to 'type_name' Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-14 13:10:05 -08:00
Junio C Hamano	2dc69eef1b	Merge branch 'ds/use-get-be64' Code clean-up. * ds/use-get-be64: packfile: use get_be64() for large offsets	2018-02-13 13:39:11 -08:00
Junio C Hamano	867622398f	Merge branch 'gs/retire-mru' Retire mru API as it does not give enough abstraction over underlying list API to be worth it. * gs/retire-mru: mru: Replace mru.[ch] with list.h implementation	2018-02-13 13:39:06 -08:00
Junio C Hamano	afc8aa3fbf	Merge branch 'ot/mru-on-list' The first step to getting rid of mru API and using the doubly-linked list API directly instead. * ot/mru-on-list: mru: use double-linked list from list.h	2018-02-13 13:39:05 -08:00
Gargi Sharma	ec2dd32c70	mru: Replace mru.[ch] with list.h implementation Replace the custom calls to mru.[ch] with calls to list.h. This patch is the final step in removing the mru API completely and inlining the logic. This patch leads to significant code reduction and the mru API hence, is not a useful abstraction anymore. Signed-off-by: Gargi Sharma <gs051095@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-24 09:52:16 -08:00
Derrick Stolee	ad622a256f	packfile: use get_be64() for large offsets The pack-index version 2 format uses two 4-byte integers in network-byte order to represent one 8-byte value. The current implementation has several code clones for stitching these integers together. Use get_be64() to create an 8-byte integer from two 4-byte integers represented this way. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-19 11:04:56 -08:00
Jonathan Tan	498f1f61f1	fsck: introduce partialclone extension Currently, Git does not support repos with very large numbers of objects or repos that wish to minimize manipulation of certain blobs (for example, because they are very large) very well, even if the user operates mostly on part of the repo, because Git is designed on the assumption that every referenced object is available somewhere in the repo storage. In such an arrangement, the full set of objects is usually available in remote storage, ready to be lazily downloaded. Teach fsck about the new state of affairs. In this commit, teach fsck that missing promisor objects referenced from the reflog are not an error case; in future commits, fsck will be taught about other cases. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-05 09:46:05 -08:00
Derrick Stolee	19716b21a4	cleanup: fix possible overflow errors in binary search A common mistake when writing binary search is to allow possible integer overflow by using the simple average: mid = (min + max) / 2; Instead, use the overflow-safe version: mid = min + (max - min) / 2; This translation is safe since the operation occurs inside a loop conditioned on "min < max". The included changes were found using the following git grep: git grep '/ *2;' '*.c' Making this cleanup will prevent future review friction when a new binary search is constructed based on existing code. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-10 08:57:24 +09:00
Junio C Hamano	cb1083ca23	Merge branch 'jk/read-in-full' Code clean-up to prevent future mistakes by copying and pasting code that checks the result of read_in_full() function. * jk/read-in-full: worktree: check the result of read_in_full() worktree: use xsize_t to access file size distinguish error versus short read from read_in_full() avoid looking at errno for short read_in_full() returns prefer "!=" when checking read_in_full() result notes-merge: drop dead zero-write code files-backend: prefer "0" for write_in_full() error check	2017-10-03 15:42:49 +09:00
Olga Telezhnaya	8865859dfc	mru: use double-linked list from list.h Simplify mru.[ch] and related code by reusing the double-linked list implementation from list.h instead of a custom one. This commit is an intermediate step. Our final goal is to get rid of mru.[ch] at all and inline all logic. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored by: Jeff King <peff@peff.net> Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-01 17:30:26 +09:00
Jeff King	41dcc4dccc	distinguish error versus short read from read_in_full() Many callers of read_in_full() expect to see the exact number of bytes requested, but their error handling lumps together true read errors and short reads due to unexpected EOF. We can give more specific error messages by separating these cases (showing errno when appropriate, and otherwise describing the short read). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 15:45:24 +09:00
Jonathan Nieder	607bd8315c	pack: make packed_git_mru global a value instead of a pointer The MRU cache that keeps track of recently used packs is represented using two global variables: struct mru packed_git_mru_storage; struct mru *packed_git_mru = &packed_git_mru_storage; Callers never assign to the packed_git_mru pointer, though, so we can simplify by eliminating it and using &packed_git_mru_storage (renamed to &packed_git_mru) directly. This variable is only used by the packfile subsystem, making this a relatively uninvasive change (and any new unadapted callers would trigger a compile error). Noticed while moving these globals to the object_store struct. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-14 15:05:48 +09:00
Jonathan Tan	7709f468fd	pack: move for_each_packed_object() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	f9a8672a81	pack: move has_pack_index() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	150e3001d0	pack: move has_sha1_pack() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	1a1e5d4f47	pack: move find_pack_entry() and make it global This function needs to be global as it is used by sha1_file.c and will be used by packfile.c. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	d6fe0036fd	pack: move find_sha1_pack() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	a2551953b9	pack: move find_pack_entry_one(), is_pack_valid() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	9e0f45f5a6	pack: move check_pack_index_ptr(), nth_packed_object_offset() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	d5a1676182	pack: move nth_packed_object_{sha1,oid} Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	f1d8130be0	pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() Both sha1_file.c and packfile.c now need read_object(), so a copy of read_object() was created in packfile.c. This patch makes both mark_bad_packed_object() and has_packed_and_bad() global. Unlike most of the other patches in this series, these 2 functions need to remain global. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	3588dd6e99	pack: move unpack_object_header() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	7b3aa75df7	pack: move get_size_from_delta() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	32b42e152f	pack: move unpack_object_header_buffer() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	0abe14f6a5	pack: move {,re}prepare_packed_git and approximate_object_count Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	e65f186242	pack: move install_packed_git() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	9a42865374	pack: move add_packed_git() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	97de1803f8	pack: move unuse_pack() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	84f80ad5e1	pack: move use_pack() The function open_packed_git() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:06 -07:00
Jonathan Tan	3836d88ae5	pack: move pack-closing functions The function close_pack_fd() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:06 -07:00
Jonathan Tan	f0e17e86e1	pack: move release_pack_memory() The function unuse_one_window() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:06 -07:00
Jonathan Tan	0317f45576	pack: move open_pack_index(), parse_pack_index() alloc_packed_git() in packfile.c is duplicated from sha1_file.c. In a subsequent commit, alloc_packed_git() will be removed from sha1_file.c. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:06 -07:00
Jonathan Tan	8e21176c3c	pack: move pack_report() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:06 -07:00
Jonathan Tan	6d6a80e068	pack: move static state variables sha1_file.c declares some static variables that store packfile-related state. Move them to packfile.c. They are temporarily made global, but subsequent commits will restore their scope back to static. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:06 -07:00
Jonathan Tan	4f39cd821d	pack: move pack name-related functions Currently, sha1_file.c and cache.h contain many functions, both related to and unrelated to packfiles. This makes both files very large and causes an unclear separation of concerns. Create a new file, packfile.c, to hold all packfile-related functions currently in sha1_file.c. It has a corresponding header packfile.h. In this commit, the pack name-related functions are moved. Subsequent commits will move the other functions. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:06 -07:00

224 Commits