git-commit-vandalism

Author	SHA1	Message	Date
Han Xin	aaf81223f4	unpack-objects: use stream_loose_object() to unpack large objects Make use of the stream_loose_object() function introduced in the preceding commit to unpack large objects. Before this we'd need to malloc() the size of the blob before unpacking it, which could cause OOM with very large blobs. We could use the new streaming interface to unpack all blobs, but doing so would be much slower, as demonstrated e.g. with this benchmark using git-hyperfine[0]: rm -rf /tmp/scalar.git && git clone --bare https://github.com/Microsoft/scalar.git /tmp/scalar.git && mv /tmp/scalar.git/objects/pack/.pack /tmp/scalar.git/my.pack && git hyperfine \ -r 2 --warmup 1 \ -L rev origin/master,HEAD -L v "10,512,1k,1m" \ -s 'make' \ -p 'git init --bare dest.git' \ -c 'rm -rf dest.git' \ './git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/scalar.git/my.pack' Here we'll perform worse with lower core.bigFileThreshold settings with this change in terms of speed, but we're getting lower memory use in return: Summary './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' ran 1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' 1.01 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' 1.01 ± 0.02 times faster than './git -C dest.git -c core.bigFileThreshold=1m unpack-objects </tmp/scalar.git/my.pack' in 'HEAD' 1.02 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'origin/master' 1.09 ± 0.01 times faster than './git -C dest.git -c core.bigFileThreshold=1k unpack-objects </tmp/scalar.git/my.pack' in 'HEAD' 1.10 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD' 1.11 ± 0.00 times faster than './git -C dest.git -c core.bigFileThreshold=10 unpack-objects </tmp/scalar.git/my.pack' in 'HEAD' A better benchmark to demonstrate the benefits of that this one, which creates an artificial repo with a 1, 25, 50, 75 and 100MB blob: rm -rf /tmp/repo && git init /tmp/repo && ( cd /tmp/repo && for i in 1 25 50 75 100 do dd if=/dev/urandom of=blob.$i count=$(($i1024)) bs=1024 done && git add blob.* && git commit -mblobs && git gc && PACK=$(echo .git/objects/pack/pack-*.pack) && cp "$PACK" my.pack ) && git hyperfine \ --show-output \ -L rev origin/master,HEAD -L v "512,50m,100m" \ -s 'make' \ -p 'git init --bare dest.git' \ -c 'rm -rf dest.git' \ '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold={v} unpack-objects </tmp/repo/my.pack 2>&1 \| grep Maximum' Using this test we'll always use >100MB of memory on origin/master (around ~105MB), but max out at e.g. ~55MB if we set core.bigFileThreshold=50m. The relevant "Maximum resident set size" lines were manually added below the relevant benchmark: '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 \| grep Maximum' in 'origin/master' ran Maximum resident set size (kbytes): 107080 1.02 ± 0.78 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 \| grep Maximum' in 'origin/master' Maximum resident set size (kbytes): 106968 1.09 ± 0.79 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 \| grep Maximum' in 'origin/master' Maximum resident set size (kbytes): 107032 1.42 ± 1.07 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=100m unpack-objects </tmp/repo/my.pack 2>&1 \| grep Maximum' in 'HEAD' Maximum resident set size (kbytes): 107072 1.83 ± 1.02 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=50m unpack-objects </tmp/repo/my.pack 2>&1 \| grep Maximum' in 'HEAD' Maximum resident set size (kbytes): 55704 2.16 ± 1.19 times faster than '/usr/bin/time -v ./git -C dest.git -c core.bigFileThreshold=512 unpack-objects </tmp/repo/my.pack 2>&1 \| grep Maximum' in 'HEAD' Maximum resident set size (kbytes): 4564 This shows that if you have enough memory this new streaming method is slower the lower you set the streaming threshold, but the benefit is more bounded memory use. An earlier version of this patch introduced a new "core.bigFileStreamingThreshold" instead of re-using the existing "core.bigFileThreshold" variable[1]. As noted in a detailed overview of its users in [2] using it has several different meanings. Still, we consider it good enough to simply re-use it. While it's possible that someone might want to e.g. consider objects "small" for the purposes of diffing but "big" for the purposes of writing them such use-cases are probably too obscure to worry about. We can always split up "core.bigFileThreshold" in the future if there's a need for that. 0. https://github.com/avar/git-hyperfine/ 1. https://lore.kernel.org/git/20211210103435.83656-1-chiyutianyi@gmail.com/ 2. https://lore.kernel.org/git/20220120112114.47618-5-chiyutianyi@gmail.com/ Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Derrick Stolee <stolee@gmail.com> Helped-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Han Xin <chiyutianyi@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-06-13 10:22:36 -07:00
Han Xin	a1bf5ca29f	unpack-objects: low memory footprint for get_data() in dry_run mode As the name implies, "get_data(size)" will allocate and return a given amount of memory. Allocating memory for a large blob object may cause the system to run out of memory. Before preparing to replace calling of "get_data()" to unpack large blob objects in latter commits, refactor "get_data()" to reduce memory footprint for dry_run mode. Because in dry_run mode, "get_data()" is only used to check the integrity of data, and the returned buffer is not used at all, we can allocate a smaller buffer and use it as zstream output. Make the function return NULL in the dry-run mode, as no callers use the returned buffer. The "find [...]objects/?? -type f \| wc -l" test idiom being used here is adapted from the same "find" use added to another test in `d9545c7f46` (fast-import: implement unpack limit, 2016-04-25). Suggested-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Han Xin <chiyutianyi@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-06-13 10:22:35 -07:00
Junio C Hamano	83937e9592	Merge branch 'ns/batch-fsync' Introduce a filesystem-dependent mechanism to optimize the way the bits for many loose object files are ensured to hit the disk platter. * ns/batch-fsync: core.fsyncmethod: performance tests for batch mode t/perf: add iteration setup mechanism to perf-lib core.fsyncmethod: tests for batch mode test-lib-functions: add parsing helpers for ls-files and ls-tree core.fsync: use batch mode and sync loose objects by default on Windows unpack-objects: use the bulk-checkin infrastructure update-index: use the bulk-checkin infrastructure builtin/add: add ODB transaction around add_files_to_cache cache-tree: use ODB transaction around writing a tree core.fsyncmethod: batched disk flushes for loose-objects bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' bulk-checkin: rename 'state' variable and separate 'plugged' boolean	2022-06-03 14:30:34 -07:00
Neeraj Singh	425d290ce5	unpack-objects: use the bulk-checkin infrastructure The unpack-objects functionality is used by fetch, push, and fast-import to turn the transfered data into object database entries when there are fewer objects than the 'unpacklimit' setting. By enabling an odb-transaction when unpacking objects, we can take advantage of batched fsyncs. Here are some performance numbers to justify batch mode for unpack-objects, collected on a WSL2 Ubuntu VM. Fsync Mode \| Time for 90 objects (ms) ------------------------------------- Off \| 170 On,fsync \| 760 On,batch \| 230 Note that the default unpackLimit is 100 objects, so there's a 3x benefit in the worst case. The non-batch mode fsync scales linearly with the number of objects, so there are significant benefits even with smaller numbers of objects. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-06 13:13:26 -07:00
Ævar Arnfjörð Bjarmason	44439c1c58	object-file API: have hash_object_file() take "enum object_type" Change the hash_object_file() function to take an "enum object_type". Since a preceding commit all of its callers are passing either "{commit,tree,blob,tag}_type", or the result of a call to type_name(), the parse_object() caller that would pass NULL is now using stream_object_signature(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:32 -08:00
Ævar Arnfjörð Bjarmason	c80d226a04	object-file API: have write_object_file() take "enum object_type" Change the write_object_file() function to take an "enum object_type" instead of a "const char type". Its callers either passed {commit,tree,blob,tag}_type and can pass the corresponding OBJ_ type instead, or were hardcoding strings like "blob". This avoids the back & forth fragility where the callers of write_object_file() would have the enum type, and convert it themselves via type_name(). We do have to now do that conversion ourselves before calling write_object_file_prepare(), but those codepaths will be similarly adjusted in subsequent commits. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
brian m. carlson	14228447c9	hash: provide per-algorithm null OIDs Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
brian m. carlson	5951bf467e	Use the final_oid_fn to finalize hashing of object IDs When we're hashing a value which is going to be an object ID, we want to zero-pad that value if necessary. To do so, use the final_oid_fn instead of the final_fn anytime we're going to create an object ID to ensure we perform this operation. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:38 +09:00
brian m. carlson	92e2cab96b	Always use oidread to read into struct object_id In the future, we'll want oidread to automatically set the hash algorithm member for an object ID we read into it, so ensure we use oidread instead of hashcpy everywhere we're copying a hash value into a struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:38 +09:00
Junio C Hamano	5644419d04	Merge branch 'ab/fsck-api-cleanup' Fsck API clean-up. * ab/fsck-api-cleanup: fetch-pack: use new fsck API to printing dangling submodules fetch-pack: use file-scope static struct for fsck_options fetch-pack: don't needlessly copy fsck_options fsck.c: move gitmodules_{found,done} into fsck_options fsck.c: add an fsck_set_msg_type() API that takes enums fsck.c: pass along the fsck_msg_id in the fsck_error callback fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from .c to .h fsck.c: give "FOREACH_MSG_ID" a more specific name fsck.c: undefine temporary STR macro after use fsck.c: call parse_msg_type() early in fsck_set_msg_type() fsck.h: re-order and re-assign "enum fsck_msg_type" fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" fsck.c: rename remaining fsck_msg_id "id" to "msg_id" fsck.c: remove (mostly) redundant append_msg_id() function fsck.c: rename variables in fsck_set_msg_type() for less confusion fsck.h: use "enum object_type" instead of "int" fsck.h: use designed initializers for FSCK_OPTIONS_{DEFAULT,STRICT} fsck.c: refactor and rename common config callback	2021-04-07 16:54:09 -07:00
Ævar Arnfjörð Bjarmason	a1aad71601	fsck.h: use "enum object_type" instead of "int" Change the fsck_walk_func to use an "enum object_type" instead of an "int" type. The types are compatible, and ever since this was added in `355885d531` (add generic, type aware object chain walker, 2008-02-25) we've used entries from object_type (OBJ_BLOB etc.). So this doesn't really change anything as far as the generated code is concerned, it just gives the compiler more information and makes this easier to read. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
René Scharfe	ca56dadb4b	use CALLOC_ARRAY Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 16:00:09 -08:00
Matheus Tavares	2dcde20e1c	sha1-file: pass git_hash_algo to hash_object_file() Allow hash_object_file() to work on arbitrary repos by introducing a git_hash_algo parameter. Change callers which have a struct repository pointer in their scope to pass on the git_hash_algo from the said repo. For all other callers, pass on the_hash_algo, which was already being used internally at hash_object_file(). This functionality will be used in the following patch to make check_object_signature() be able to work on arbitrary repos (which, in turn, will be used to fix an inconsistency at object.c:parse_object()). Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-31 10:45:39 -08:00
SZEDER Gábor	bae60ba7e9	builtin/unpack-objects.c: show throughput progress 'git unpack-objects' shows a progress line only counting the number of unpacked objects, so if some of the received objects are unusually large, then that progress might appear to be frozen while processing such a larger object. I just stared at a seemingly stuck progress line for over half a minute, while 'git fetch' was busy receiving a pack with only a couple of objects (i.e. fewer than 'fetch.unpackLimit'), with one of them being over 80MB. Display throughput in 'git unpack-objects' progress line, so we show that something is going on even when receiving and processing a large object. Counting the consumed bytes is far away from the place that counts objects and displays progress, and to pass around the 'struct progress' instance we would have to modify the signature of five functions and 14 of their callsites: this is just too much churn, so let's rather make it file-scope static. 'git index-pack', i.e. the non-unpacking cousin of 'git unpack-objects' already includes throughput in its progress line, and it uses a file-scope static 'struct progress' instance as well. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-20 10:30:18 +09:00
Jeff King	d0229abd93	object: convert lookup_object() to use object_id There are no callers left of lookup_object() that aren't just passing us the "hash" member of a "struct object_id". Let's take the whole struct, which gets us closer to removing all raw sha1 variables. It also matches the existing conversions of lookup_blob(), etc. The conversions of callers were done by hand, but they're all mechanical one-liners. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-06-20 10:18:09 -07:00
Jeff King	67947c34ae	convert "hashcmp() != 0" to "!hasheq()" This rounds out the previous three patches, covering the inequality logic for the "hash" variant of the functions. As with the previous three, the accompanying code changes are the mechanical result of applying the coccinelle patch; see those patches for more discussion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Jeff King	4a7e27e957	convert "oidcmp() == 0" to oideq() Using the more restrictive oideq() should, in the long run, give the compiler more opportunities to optimize these callsites. For now, this conversion should be a complete noop with respect to the generated code. The result is also perhaps a little more readable, as it avoids the "zero is equal" idiom. Since it's so prevalent in C, I think seasoned programmers tend not to even notice it anymore, but it can sometimes make for awkward double negations (e.g., we can drop a few !!oidcmp() instances here). This patch was generated almost entirely by the included coccinelle patch. This mechanical conversion should be completely safe, because we check explicitly for cases where oidcmp() is compared to 0, which is what oideq() is doing under the hood. Note that we don't have to catch "!oidcmp()" separately; coccinelle's standard isomorphisms make sure the two are treated equivalently. I say "almost" because I did hand-edit the coccinelle output to fix up a few style violations (it mostly keeps the original formatting, but sometimes unwraps long lines). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Junio C Hamano	1689c22c1c	Merge branch 'jk/core-use-replace-refs' A new configuration variable core.usereplacerefs has been added, primarily to help server installations that want to ignore the replace mechanism altogether. * jk/core-use-replace-refs: add core.usereplacerefs config option check_replace_refs: rename to read_replace_refs check_replace_refs: fix outdated comment	2018-08-15 15:08:23 -07:00
Jeff King	6ebd1cafe2	check_replace_refs: rename to read_replace_refs This was added as a NEEDSWORK in `c3c36d7de2` (replace-object: check_replace_refs is safe in multi repo environment, 2018-04-11), waiting for a calmer period. Since doing so now doesn't conflict with anything in 'pu', it seems as good a time as any. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-18 15:45:14 -07:00
Junio C Hamano	00624d608c	Merge branch 'sb/object-store-grafts' The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h	2018-07-18 12:20:28 -07:00
Stefan Beller	da14a7ff99	blob: add repository argument to lookup_blob Add a repository argument to allow the callers of lookup_blob to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Stefan Beller	1ec5bfd24e	object: add repository argument to parse_object_buffer Add a repository argument to allow the callers of parse_object_buffer to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Stefan Beller	5abddd1eb7	object: add repository argument to lookup_object Add a repository argument to allow callers of lookup_object to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-06-29 10:43:38 -07:00
Junio C Hamano	b16b60f71b	Merge branch 'sb/object-store-grafts' into sb/object-store-lookup * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h	2018-06-29 10:43:28 -07:00
Junio C Hamano	7913f53b56	Sync with Git 2.17.1 * maint: (25 commits) Git 2.17.1 Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 fsck: complain when .gitmodules is a symlink index-pack: check .gitmodules files with --strict unpack-objects: call fsck_finish() after fscking objects fsck: call fsck_finish() after fscking objects fsck: check .gitmodules content fsck: handle promisor objects in .gitmodules check fsck: detect gitmodules files fsck: actually fsck blob data fsck: simplify ".git" check index-pack: make fsck error message more specific verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant ...	2018-05-29 17:10:05 +09:00
Jeff King	6e328d6cae	unpack-objects: call fsck_finish() after fscking objects As with the previous commit, we must call fsck's "finish" function in order to catch any queued objects for .gitmodules checks. This second pass will be able to access any incoming objects, because we will have exploded them to loose objects by now. This isn't quite ideal, because it means that bad objects may have been written to the object database (and a subsequent operation could then reference them, even if the other side doesn't send the objects again). However, this is sufficient when used with receive.fsckObjects, since those loose objects will all be placed in a temporary quarantine area that will get wiped if we find any problems. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	db5a58c1bd	index-pack: make fsck error message more specific If fsck reports an error, we say only "Error in object". This isn't quite as bad as it might seem, since the fsck code would have dumped some errors to stderr already. But it might help to give a little more context. The earlier output would not have even mentioned "fsck", and that may be a clue that the "fsck." or ".fsckObjects" config may be relevant. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Stefan Beller	cbd53a2193	object-store: move object access functions to object-store.h This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-16 11:42:03 +09:00
Stefan Beller	0df8e96566	cache.h: add repository argument to oid_object_info Add a repository argument to allow the callers of oid_object_info to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Junio C Hamano	a5bbc29994	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (36 commits) convert: convert to struct object_id sha1_file: introduce a constant for max header length Convert lookup_replace_object to struct object_id sha1_file: convert read_sha1_file to struct object_id sha1_file: convert read_object_with_reference to object_id tree-walk: convert tree entry functions to object_id streaming: convert istream internals to struct object_id tree-walk: convert get_tree_entry_follow_symlinks internals to object_id builtin/notes: convert static functions to object_id builtin/fmt-merge-msg: convert remaining code to object_id sha1_file: convert sha1_object_info* to object_id Convert remaining callers of sha1_object_info_extended to object_id packfile: convert unpack_entry to struct object_id sha1_file: convert retry_bad_packed_offset to struct object_id sha1_file: convert assert_sha1_type to object_id builtin/mktree: convert to struct object_id streaming: convert open_istream to use struct object_id sha1_file: convert check_sha1_signature to struct object_id sha1_file: convert read_loose_object to use struct object_id builtin/index-pack: convert struct ref_delta_entry to object_id ...	2018-04-10 08:25:45 +09:00
Junio C Hamano	99321e327b	Merge branch 'nd/object-allocation-comments' Code doc update. * nd/object-allocation-comments: object.h: realign object flag allocation comment object.h: update flag allocation comment	2018-03-14 12:01:06 -07:00
brian m. carlson	b4f5aca40e	sha1_file: convert read_sha1_file to struct object_id Convert read_sha1_file to take a pointer to struct object_id and rename it read_object_file. Do the same for read_sha1_file_extended. Convert one use in grep.c to use the new function without any other code change, since the pointer being passed is a void pointer that is already initialized with a pointer to struct object_id. Update the declaration and definitions of the modified functions, and apply the following semantic patch to convert the remaining callers: @@ expression E1, E2, E3; @@ - read_sha1_file(E1.hash, E2, E3) + read_object_file(&E1, E2, E3) @@ expression E1, E2, E3; @@ - read_sha1_file(E1->hash, E2, E3) + read_object_file(E1, E2, E3) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1.hash, E2, E3, E4) + read_object_file_extended(&E1, E2, E3, E4) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1->hash, E2, E3, E4) + read_object_file_extended(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:50 -07:00
brian m. carlson	abef9020e3	sha1_file: convert sha1_object_info* to object_id Convert sha1_object_info and sha1_object_info_extended to take pointers to struct object_id and rename them to use "oid" instead of "sha1" in their names. Update the declaration and definition and apply the following semantic patch, plus the standard object_id transforms: @@ expression E1, E2; @@ - sha1_object_info(E1.hash, E2) + oid_object_info(&E1, E2) @@ expression E1, E2; @@ - sha1_object_info(E1->hash, E2) + oid_object_info(E1, E2) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1.hash, E2, E3) + oid_object_info_extended(&E1, E2, E3) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1->hash, E2, E3) + oid_object_info_extended(E1, E2, E3) Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:49 -07:00
Junio C Hamano	169c9c0169	Merge branch 'bw/c-plus-plus' Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that targets C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...	2018-03-06 14:54:07 -08:00
Nguyễn Thái Ngọc Duy	95308d64ce	object.h: update flag allocation comment Since the "flags" is shared, it's a good idea to keep track of who uses what bit. When we need to use more flags in library code, we can be sure it won't be re-used for another purpose by some caller. While at there, fix the location of "5" (should be in a different column than "4" two lines down) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-06 11:41:21 -08:00
Junio C Hamano	0fd90daba8	Merge branch 'bc/hash-algo' More abstraction of hash function from the codepath. * bc/hash-algo: hash: update obsolete reference to SHA1_HEADER bulk-checkin: abstract SHA-1 usage csum-file: abstract uses of SHA-1 csum-file: rename sha1file to hashfile read-cache: abstract away uses of SHA-1 pack-write: switch various SHA-1 values to abstract forms pack-check: convert various uses of SHA-1 to abstract forms fast-import: switch various uses of SHA-1 to the_hash_algo sha1_file: switch uses of SHA-1 to the_hash_algo builtin/unpack-objects: switch uses of SHA-1 to the_hash_algo builtin/index-pack: improve hash function abstraction hash: create union for hash context allocation hash: move SHA-1 macros to hash.h	2018-02-15 14:55:47 -08:00
Brandon Williams	debca9d2fe	object: rename function 'typename' to 'type_name' Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-14 13:10:05 -08:00
brian m. carlson	3206b6bdf6	builtin/unpack-objects: switch uses of SHA-1 to the_hash_algo Switch various uses of explicit calls to SHA-1 into references to the_hash_algo to better abstract away the various uses of it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-02 11:28:41 -08:00
Patryk Obara	a09c985eae	sha1_file: convert write_sha1_file to object_id Convert the definition and declaration of write_sha1_file to struct object_id and adjust usage of this function. This commit also converts static function write_sha1_file_prepare, as it is closely related. Rename these functions to write_object_file and write_object_file_prepare respectively. Replace sha1_to_hex, hashcpy and hashclr with their oid equivalents wherever possible. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Patryk Obara	f070faccc1	sha1_file: convert hash_sha1_file to object_id Convert the declaration and definition of hash_sha1_file to use struct object_id and adjust all function calls. Rename this function to hash_object_file. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Derrick Stolee	19716b21a4	cleanup: fix possible overflow errors in binary search A common mistake when writing binary search is to allow possible integer overflow by using the simple average: mid = (min + max) / 2; Instead, use the overflow-safe version: mid = min + (max - min) / 2; This translation is safe since the operation occurs inside a loop conditioned on "min < max". The included changes were found using the following git grep: git grep '/ 2;' '.c' Making this cleanup will prevent future review friction when a new binary search is contructed based on existing code. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-10 08:57:24 +09:00
Junio C Hamano	50f03c6676	Merge branch 'ab/free-and-null' A common pattern to free a piece of memory and assign NULL to the pointer that used to point at it has been replaced with a new FREE_AND_NULL() macro. * ab/free-and-null: *.[ch] refactoring: make use of the FREE_AND_NULL() macro coccinelle: make use of the "expression" FREE_AND_NULL() rule coccinelle: add a rule to make "expression" code use FREE_AND_NULL() coccinelle: make use of the "type" FREE_AND_NULL() rule coccinelle: add a rule to make "type" code use FREE_AND_NULL() git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL	2017-06-24 14:28:41 -07:00
Junio C Hamano	f31d23a399	Merge branch 'bw/config-h' Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h	2017-06-24 14:28:41 -07:00
Ævar Arnfjörð Bjarmason	6a83d90207	coccinelle: make use of the "type" FREE_AND_NULL() rule Apply the result of the just-added coccinelle rule. This manually excludes a few occurrences, mostly things that resulted in many FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-16 12:44:03 -07:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
brian m. carlson	c251c83df2	object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	3aca1fc6c9	Convert lookup_blob to struct object_id Convert lookup_blob to take a pointer to struct object_id. The commit was created with manual changes to blob.c and blob.h, plus the following semantic patch: @@ expression E1; @@ - lookup_blob(E1.hash) + lookup_blob(&E1) @@ expression E1; @@ - lookup_blob(E1->hash) + lookup_blob(E1) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	834bc47b42	builtin/unpack-objects: convert to struct object_id Convert struct delta_info and struct object_info, as well as the various functions, to use struct object_id. Convert several hard-coded 20 values to GIT_SHA1_RAWSZ. Among the functions converted is a caller of lookup_blob, which we will convert shortly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
Christian Couder	5ad2186733	unpack-objects: add --max-input-size=<size> option When receiving a pack-file, it can be useful to abort the `git unpack-objects`, if the pack-file is too big. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-08-24 12:31:05 -07:00
brian m. carlson	f449198e58	coccinelle: convert hashcpy() with null_sha1 to hashclr() hashcpy with null_sha1 as the source is equivalent to hashclr. In addition to being simpler, using hashclr may give the compiler a chance to optimize better. Convert instances of hashcpy with the source argument of null_sha1 to hashclr. This transformation was implemented using the following semantic patch: @@ expression E1; @@ -hashcpy(E1, null_sha1); +hashclr(E1); Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00

1 2

66 Commits