git-commit-vandalism

Author	SHA1	Message	Date
Ævar Arnfjörð Bjarmason	f69a6e4f07	.h: move some _INIT to designated initializers Move various *_INIT macros to use designated initializers. This helps readability. I've only picked those leftover macros that were not touched by another in-flight series of mine which changed others, but also how initialization was done. In the case of SUBMODULE_ALTERNATE_SETUP_INIT I've left an explicit initialization of "error_mode", even though SUBMODULE_ALTERNATE_ERROR_IGNORE itself is defined as "0". Let's not peek under the hood and assume that enum fields we know the value of will stay at "0". The change to "TESTSUITE_INIT" in "t/helper/test-run-command.c" was part of an earlier on-list version[1] of `c90be786da` (test-tool run-command: fix flip-flop init pattern, 2021-09-11). 1. https://lore.kernel.org/git/patch-1.1-0aa4523ab6e-20210909T130849Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 14:48:00 -07:00
Ævar Arnfjörð Bjarmason	608cfd31cf	*.h _INIT macros: don't specify fields equal to 0 Change the initialization of "struct strbuf" changed in `cbc0f81d96` (strbuf: use designated initializers in STRBUF_INIT, 2017-07-10) to omit specifying "alloc" and "len", as we do with other "alloc" and "len" (or "nr") in similar structs. Let's likewise omit the explicit initialization of all fields in the "struct ipc_client_connect_option" struct added in `59c7b88198` (simple-ipc: add win32 implementation, 2021-03-15). Do the same for a few other initializers, e.g. STRVEC_INIT and CACHE_DEF_INIT. Finally, start incrementally changing the same pattern in "t/helper/test-run-command.c". This change was part of an earlier on-list version[1] of `c90be786da` (test-tool run-command: fix flip-flop init pattern, 2021-09-11). 1. https://lore.kernel.org/git/patch-1.1-0aa4523ab6e-20210909T130849Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 14:47:59 -07:00
Junio C Hamano	0e35107e7d	Merge branch 'ab/retire-option-argument' An oddball OPTION_ARGUMENT feature has been removed from the parse-options API. * ab/retire-option-argument: parse-options API: remove OPTION_ARGUMENT feature difftool: use run_command() API in run_file_diff() difftool: prepare "diff" cmdline in cmd_difftool() difftool: prepare "struct child_process" in cmd_difftool()	2021-09-23 13:44:48 -07:00
Junio C Hamano	ffb0387608	Merge branch 'ab/test-tool-run-command-cleanup' Code clean-up. * ab/test-tool-run-command-cleanup: test-tool run-command: fix flip-flop init pattern	2021-09-23 13:44:46 -07:00
Ævar Arnfjörð Bjarmason	c6b4888b3f	environment.c: remove test-specific "ignore_untracked..." variable Instead of the global ignore_untracked_cache_config variable added in `dae6c322fa` (test-dump-untracked-cache: don't modify the untracked cache, 2016-01-27) we can make use of the new facility to set config via environment variables added in `d8d77153ea` (config: allow specifying config entries via envvar pairs, 2021-01-12). It's arguably a bit hacky to use setenv() and getenv() to pass messages between the same program, but since the test helpers are not the main intended audience of repo-settings.c I think it's better than hardcoding the test-only special-case in prepare_repo_settings(). This uses the xsetenv() wrapper added in the preceding commit, if we don't set these in the environment we'll fail in t7063-status-untracked-cache.sh, but let's fail earlier anyway if that were to happen. This breaks any parent process that's potentially using the GIT_CONFIG_* and GIT_CONFIG_PARAMETERS mechanism to pass one-shot config setting down to a git subprocess, but in this case we don't care about the general case of such potential parents. This process neither spawns other "git" processes, nor is it interested in other configuration. We might want to pick up other test modes here, but those will be passed via GIT_TEST_* environment variables. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-22 13:15:00 -07:00
Junio C Hamano	5331af2352	Merge branch 'ab/serve-cleanup' Code clean-up around "git serve". * ab/serve-cleanup: upload-pack: document and rename --advertise-refs serve.[ch]: remove "serve_options", split up --advertise-refs code {upload,receive}-pack tests: add --advertise-refs tests serve.c: move version line to advertise_capabilities() serve: move transfer.advertiseSID check into session_id_advertise() serve.[ch]: don't pass "struct strvec *keys" to commands serve: use designated initializers transport: use designated initializers transport: rename "fetch" in transport_vtable to "fetch_refs" serve: mark has_capability() as static	2021-09-20 15:20:43 -07:00
Jeff Hostetler	05881a6fc9	t/helper/simple-ipc: convert test-simple-ipc to use start_bg_command Convert test helper to use `start_bg_command()` when spawning a server daemon in the background rather than blocks of platform-specific code. Also, while here, remove _() translation around error messages since this is a test helper and not Git code. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-20 08:57:58 -07:00
Jeff Hostetler	a3e2033e04	simple-ipc: preparations for supporting binary messages. Add `command_len` argument to the Simple IPC API. In my original Simple IPC API, I assumed that the request would always be a null-terminated string of text characters. The `command` argument was just a `const char `. I found a caller that would like to pass a binary command to the daemon, so I am amending the Simple IPC API to receive `const char command, size_t command_len` arguments. I considered changing the `command` argument to be a `void `, but the IPC layer simply passes it to the pkt-line layer which takes a `const char `, so to avoid confusion I left it as is. Note, the response side has always been a `struct strbuf` which includes the buffer and length, so we already support returning a binary answer. (Yes, it feels a little weird returning a binary buffer in a `strbuf`, but it works.) Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-20 08:57:58 -07:00
Taylor Blau	a05f02b1d9	t/helper/test-bitmap.c: add 'dump-hashes' mode The pack-bitmap writer code is about to learn how to propagate values from an existing hash-cache. To prepare, teach the test-bitmap helper to dump the values from a bitmap's hash-cache extension in order to test those changes. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-14 16:34:17 -07:00
Ævar Arnfjörð Bjarmason	4c25356e0e	parse-options API: remove OPTION_ARGUMENT feature As was noted in `1a85b49b87` (parse-options: make OPT_ARGUMENT() more useful, 2019-03-14) there's only ever been one user of the OPT_ARGUMENT(), that user was added in `20de316e33` (difftool: allow running outside Git worktrees with --no-index, 2019-03-14). The OPT_ARGUMENT() feature itself was added way back in `580d5bffde` (parse-options: new option type to treat an option-like parameter as an argument., 2008-03-02), but as discussed in `1a85b49b87` wasn't used until `20de316e33` in 2019. Now that the preceding commit has migrated this code over to using "struct strvec" to manage the "args" member of a "struct child_process", we can just use that directly instead of relying on OPT_ARGUMENT. This has a minor change in behavior in that if we'll pass --no-index we'll now always pass it as the first argument, before we'd pass it in whatever position the caller did. Preserving this was the real value of OPT_ARGUMENT(), but as it turns out we didn't need that either. We can always inject it as the first argument, the other end will parse it just the same. Note that we cannot remove the "out" and "cpidx" members of "struct parse_opt_ctx_t" added in `580d5bffde`, while they were introduced with OPT_ARGUMENT() we since used them for other things. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-12 23:27:38 -07:00
Ævar Arnfjörð Bjarmason	c90be786da	test-tool run-command: fix flip-flop init pattern In `be5d88e112` (test-tool run-command: learn to run (parts of) the testsuite, 2019-10-04) an init pattern was added that would use TESTSUITE_INIT, but then promptly memset() everything back to 0. We'd then set the "dup" on the two string lists. Our setting of "next" to "-1" thus did nothing, we'd reset it to "0" before using it. Let's set it to "0" instead, and trust the "STRING_LIST_INIT_DUP" to set "strdup_strings" appropriately for us. Note that while we compile this code, there's no in-tree user for the "testsuite" target being modified here anymore, see the discussion at and around <nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet>[1]. 1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2109091323150.59@tvgsbejvaqbjf.bet/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-12 16:46:54 -07:00
Jonathan Tan	8eb8dcf946	repository: support unabsorbed in repo_submodule_init In preparation for a subsequent commit that migrates code using add_submodule_odb() to repo_submodule_init(), teach repo_submodule_init() to support submodules with unabsorbed gitdirs. (See the documentation for "git submodule absorbgitdirs" for more information about absorbed and unabsorbed gitdirs.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-09 14:09:30 -07:00
Taylor Blau	b1b82d1c30	t/helper/test-read-midx.c: add --checksum mode Subsequent tests will want to check for the existence of a multi-pack bitmap which matches the multi-pack-index stored in the pack directory. The multi-pack bitmap includes the hex checksum of the MIDX it corresponds to in its filename (for example, '$packdir/multi-pack-index-<checksum>.bitmap'). As a result, some tests want a way to learn what '<checksum>' is. This helper addresses that need by printing the checksum of the repository's multi-pack-index. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-01 13:56:43 -07:00
Ævar Arnfjörð Bjarmason	f234da8019	serve.[ch]: remove "serve_options", split up --advertise-refs code The "advertise capabilities" mode of serve.c added in `ed10cb952d` (serve: introduce git-serve, 2018-03-15) is only used by the http-backend.c to call {upload,receive}-pack with the --advertise-refs parameter. See `42526b478e` (Add stateless RPC options to upload-pack, receive-pack, 2009-10-30). Let's just make cmd_upload_pack() take the two (v2) or three (v2) parameters the the v2/v1 servicing functions need directly, and pass those in via the function signature. The logic of whether daemon mode is implied by the timeout belongs in the v1 function (only used there). Once we split up the "advertise v2 refs" from "serve v2 request" it becomes clear that v2 never cared about those in combination. The only time it mattered was for v1 to emit its ref advertisement, in that case we wanted to emit the smart-http-only "no-done" capability. Since we only do that in the --advertise-refs codepath let's just have it set "do_done" itself in v1's upload_pack() just before send_ref(), at that point --advertise-refs and --stateless-rpc in combination are redundant (the only user is get_info_refs() in http-backend.c), so we can just pass in --advertise-refs only. Since we need to touch all the serve() and advertise_capabilities() codepaths let's rename them to less clever and obvious names, it's been suggested numerous times, the latest of which is [1]'s suggestion for protocol_v2_serve_loop(). Let's go with that. 1. https://lore.kernel.org/git/CAFQ2z_NyGb8rju5CKzmo6KhZXD0Dp21u-BbyCb2aNxLEoSPRJw@mail.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-05 08:59:37 -07:00
Junio C Hamano	fea3738ac5	Merge branch 'ab/getcwd-test' Portability test update. * ab/getcwd-test: t0001: fix broken not-quite getcwd(3) test in `bed67874e2`	2021-08-04 13:28:55 -07:00
Ævar Arnfjörð Bjarmason	482e1488a9	t0001: fix broken not-quite getcwd(3) test in `bed67874e2` With `a54e938e5b` (strbuf: support long paths w/o read rights in strbuf_getcwd() on FreeBSD, 2017-03-26) we had t0001 break on systems like OpenBSD and AIX whose getcwd(3) has standard (but not like glibc et al) behavior. This was partially fixed in `bed67874e2` (t0001: skip test with restrictive permissions if getpwd(3) respects them, 2017-08-07). The problem with that fix is that while its analysis of the problem is correct, it doesn't actually call getcwd(3), instead it invokes "pwd -P". There is no guarantee that "pwd -P" is going to call getcwd(3), as opposed to e.g. being a shell built-in. On AIX under both bash and ksh this test breaks because "pwd -P" will happily display the current working directory, but getcwd(3) called by the "git init" we're testing here will fail to get it. I checked whether clobbering the $PWD environment variable would affect it, and it didn't. Presumably these shells keep track of their working directory internally. There's possible follow-up work here in teaching strbuf_getcwd() to get the working directory with whatever method "pwd" uses on these platforms. See [1] for a discussion of that, but let's take the easy way out here and just skip these tests by fixing the GETCWD_IGNORES_PERMS prerequisite to match the limitations of strbuf_getcwd(). 1. https://lore.kernel.org/git/b650bef5-d739-d98d-e9f1-fa292b6ce982@web.de/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-30 10:18:27 -07:00
Junio C Hamano	5bae927222	Merge branch 'ab/pkt-line-tests' Tests that cover protocol bits have been updated and helpers used there have been consolidated. * ab/pkt-line-tests: test-lib-functions: use test-tool for [de]packetize()	2021-07-28 13:18:00 -07:00
Junio C Hamano	dd6d3c90ee	Merge branch 'ab/attribute-format' Many "printf"-like helper functions we have have been annotated with __attribute__() to catch placeholder/parameter mismatches. * ab/attribute-format: advice.h: add missing __attribute__((format)) & fix usage .h: add a few missing __attribute__((format)) .c static functions: add missing __attribute__((format)) sequencer.c: move static function to avoid forward decl *.c static functions: don't forward-declare __attribute__	2021-07-28 13:17:59 -07:00
Junio C Hamano	e5cc59c77c	Merge branch 'ew/many-alternate-optim' Optimization for repositories with many alternate object store. * ew/many-alternate-optim: oidtree: a crit-bit tree for odb_loose_cache oidcpy_with_padding: constify `src' arg make object_directory.loose_objects_subdir_seen a bitmap avoid strlen via strbuf_addstr in link_alt_odb_entry speed up alt_odb_usable() with many alternates	2021-07-28 13:17:57 -07:00
Ævar Arnfjörð Bjarmason	64f0109f17	test-lib-functions: use test-tool for [de]packetize() The shell+perl "[de]packetize()" helper functions were added in `4414a15002` (t/lib-git-daemon: add network-protocol helpers, 2018-01-24), and around the same time we added the "pkt-line" helper in `74e7002961` (test-pkt-line: introduce a packet-line test helper, 2018-03-14). For some reason it seems we've mostly used the shell+perl version instead of the helper since then. There were discussions around `88124ab263` (test-lib-functions: make packetize() more efficient, 2020-03-27) and `cacae4329f` (test-lib-functions: simplify packetize() stdin code, 2020-03-29) to improve them and make them more efficient. There was one good reason to do so, we needed an equivalent of "test-tool pkt-line pack", but that command wasn't capable of handling input with "\n" (a feature) or "\0" (just because it happens to be printf-based under the hood). Let's add a "pkt-line-raw" helper for that, and expose is at a packetize_raw() to go with the existing packetize() on the shell level, this gives us the smallest amount of change to the tests themselves. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-19 11:53:50 -07:00
Junio C Hamano	8721e2eaed	Merge branch 'jt/partial-clone-submodule-1' Prepare the internals for lazily fetching objects in submodules from their promisor remotes. * jt/partial-clone-submodule-1: promisor-remote: teach lazy-fetch in any repo run-command: refactor subprocess env preparation submodule: refrain from filtering GIT_CONFIG_COUNT promisor-remote: support per-repository config repository: move global r_f_p_c to repo struct	2021-07-16 17:42:53 -07:00
Junio C Hamano	c9780bb2ca	Merge branch 'hn/prep-tests-for-reftable' Preliminary clean-up of tests before the main reftable changes hits the codebase. * hn/prep-tests-for-reftable: (22 commits) t1415: set REFFILES for test specific to storage format t4202: mark bogus head hash test with REFFILES t7003: check reflog existence only for REFFILES t7900: stop checking for loose refs t1404: mark tests that muck with .git directly as REFFILES. t2017: mark --orphan/logAllRefUpdates=false test as REFFILES t1414: mark corruption test with REFFILES t1407: require REFFILES for for_each_reflog test test-lib: provide test prereq REFFILES t5304: use "reflog expire --all" to clear the reflog t5304: restyle: trim empty lines, drop ':' before > t7003: use rev-parse rather than FS inspection t5000: inspect HEAD using git-rev-parse t5000: reformat indentation to the latest fashion t1301: fix typo in error message t1413: use tar to save and restore entire .git directory t1401-symbolic-ref: avoid direct filesystem access t1401: use tar to snapshot and restore repo state t5601: read HEAD using rev-parse t9300: check ref existence using test-helper rather than a file system check ...	2021-07-13 16:52:50 -07:00
Ævar Arnfjörð Bjarmason	927dc33070	advice.h: add missing __attribute__((format)) & fix usage Add the missing __attribute__((format)) checking to advise_if_enabled(). This revealed a trivial issue introduced in `b3b18d1621` (advice: revamp advise API, 2020-03-02). We treated the argv[1] as a format string, but did not intend to do so. Let's use "%s" and pass argv[1] as an argument instead. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-13 15:20:20 -07:00
Junio C Hamano	e867110340	Merge branch 'ab/cmd-foo-should-return' Code clean-up. * ab/cmd-foo-should-return: builtins + test helpers: use return instead of exit() in cmd_*	2021-07-08 13:15:04 -07:00
Eric Wong	92d8ed8ac1	oidtree: a crit-bit tree for odb_loose_cache This saves 8K per `struct object_directory', meaning it saves around 800MB in my case involving 100K alternates (half or more of those alternates are unlikely to hold loose objects). This is implemented in two parts: a generic, allocation-free `cbtree' and the `oidtree' wrapper on top of it. The latter provides allocation using alloc_state as a memory pool to improve locality and reduce free(3) overhead. Unlike oid-array, the crit-bit tree does not require sorting. Performance is bound by the key length, for oidtree that is fixed at sizeof(struct object_id). There's no need to have 256 oidtrees to mitigate the O(n log n) overhead like we did with oid-array. Being a prefix trie, it is natively suited for expanding short object IDs via prefix-limited iteration in `find_short_object_filename'. On my busy workstation, p4205 performance seems to be roughly unchanged (+/-8%). Startup with 100K total alternates with no loose objects seems around 10-20% faster on a hot cache. (800MB in memory savings means more memory for the kernel FS cache). The generic cbtree implementation does impose some extra overhead for oidtree in that it uses memcmp(3) on "struct object_id" so it wastes cycles comparing 12 extra bytes on SHA-1 repositories. I've not yet explored reducing this overhead, but I expect there are many places in our code base where we'd want to investigate this. More information on crit-bit trees: https://cr.yp.to/critbit.html Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-07 21:28:04 -07:00
Jonathan Tan	ef830cc434	promisor-remote: teach lazy-fetch in any repo This is one step towards supporting partial clone submodules. Even after this patch, we will still lack partial clone submodules support, primarily because a lot of Git code that accesses submodule objects does so by adding their object stores as alternates, meaning that any lazy fetches that would occur in the submodule would be done based on the config of the superproject, not of the submodule. This also prevents testing of the functionality in this patch by user-facing commands. So for now, test this mechanism using a test helper. Besides that, there is some code that uses the wrapper functions like has_promisor_remote(). Those will need to be checked to see if they could support the non-wrapper functions instead (and thus support any repository, not just the_repository). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-28 09:58:01 -07:00
Junio C Hamano	169914ede2	Merge branch 'en/ort-perf-batch-11' Optimize out repeated rename detection in a sequence of mergy operations. * en/ort-perf-batch-11: merge-ort, diffcore-rename: employ cached renames when possible merge-ort: handle interactions of caching and rename/rename(1to1) cases merge-ort: add helper functions for using cached renames merge-ort: preserve cached renames for the appropriate side merge-ort: avoid accidental API mis-use merge-ort: add code to check for whether cached renames can be reused merge-ort: populate caches of rename detection results merge-ort: add data structures for in-memory caching of rename detection t6429: testcases for remembering renames fast-rebase: write conflict state to working tree, index, and HEAD fast-rebase: change assert() to BUG() Documentation/technical: describe remembering renames optimization t6423: rename file within directory that other side renamed	2021-06-14 13:33:27 +09:00
Ævar Arnfjörð Bjarmason	338abb0f04	builtins + test helpers: use return instead of exit() in cmd_* Change various cmd_* functions that claim to return an "int" to use "return" instead of exit() to indicate an exit code. These were not marked with NORETURN, and by directly exit()-ing we'll skip the cleanup git.c would otherwise do (e.g. closing fd's, erroring if we can't). See run_builtin() in git.c. In the case of shell.c and sh-i18n--envsubst.c this was the result of an incomplete migration to using a cmd_main() in `3f2e2297b9` (add an extra level of indirection to main(), 2016-07-01). This was spotted by SunCC 12.5 on Solaris 10 (gcc210 on the gccfarm). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-09 09:15:58 +09:00
Han-Wen Nienhuys	0221eb8678	t/helper/ref-store: initialize oid in resolve-ref This will print $ZERO_OID when asking for a non-existent ref from the test-helper. Since resolve-ref provides direct access to refs_resolve_ref_unsafe(), it provides a reliable mechanism for accessing REFNAME, while avoiding the implicit resolution to refs/heads/REFNAME. Reviewed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-02 10:01:54 +09:00
Elijah Newren	f9500261e0	fast-rebase: write conflict state to working tree, index, and HEAD Previously, when fast-rebase hit a conflict, it simply aborted and left HEAD, the index, and the working tree where they were before the operation started. While fast-rebase does not support restarting from a conflicted state, write the conflicted state out anyway as it gives us a way to see what the conflicts are and write tests that check for them. This will be important in the upcoming commits, because sequencer.c is only superficially integrated with merge-ort.c; in particular, it calls merge_switch_to_result() after EACH merge instead of only calling it at the end of all the sequence of merges (or when a conflict is hit). This not only causes needless updates to the working copy and index, but also causes all intermediate data to be freed and tossed, preventing caching information from one merge to the next. However, integrating sequencer.c more deeply with merge-ort.c is a big task, and making this small extension to fast-rebase.c provides us with a simple way to test the edge and corner cases that we want to make sure continue working. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-20 15:40:39 +09:00
Elijah Newren	caba91c373	fast-rebase: change assert() to BUG() assert() can succinctly document expectations for the code, and do so in a way that may be useful to future folks trying to refactor the code and change basic assumptions; it allows them to more quickly find some places where their violations of previous assumptions trips things up. Unfortunately, assert() can surround a function call with important side-effects, which is a huge mistake since some users will compile with assertions disabled. I've had to debug such mistakes before in other codebases, so I should know better. Luckily, this was only in test code, but it's still very embarrassing. Change an assert() to an if (...) BUG (...). Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-20 15:40:39 +09:00
Junio C Hamano	416449eaba	Merge branch 'jk/symlinked-dotgitx-cleanup' Various test and documentation updates about .gitsomething paths that are symlinks. * jk/symlinked-dotgitx-cleanup: docs: document symlink restrictions for dot-files fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW t0060: test ntfs/hfs-obscured dotfiles t7450: test .gitmodules symlink matching against obscured names t7450: test verify_path() handling of gitmodules t7415: rename to expand scope fsck_tree(): wrap some long lines fsck_tree(): fix shadowed variable t7415: remove out-dated comment about translation	2021-05-11 15:27:23 +09:00
Junio C Hamano	aaa3c8065d	Merge branch 'bc/hash-transition-interop-part-1' SHA-256 transition. * bc/hash-transition-interop-part-1: hex: print objects using the hash algorithm member hex: default to the_hash_algo on zero algorithm value builtin/pack-objects: avoid using struct object_id for pack hash commit-graph: don't store file hashes as struct object_id builtin/show-index: set the algorithm for object IDs hash: provide per-algorithm null OIDs hash: set, copy, and use algo field in struct object_id builtin/pack-redundant: avoid casting buffers to struct object_id Use the final_oid_fn to finalize hashing of object IDs hash: add a function to finalize object IDs http-push: set algorithm when reading object ID Always use oidread to read into struct object_id hash: add an algo member to struct object_id	2021-05-10 16:59:46 +09:00
Jeff King	801ed010bf	t0060: test ntfs/hfs-obscured dotfiles We have tests that cover various filesystem-specific spellings of ".gitmodules", because we need to reliably identify that path for some security checks. These are from `dc2d9ba318` (is_{hfs,ntfs}_dotgitmodules: add tests, 2018-05-12), with the actual code coming from `e7cb0b4455` (is_ntfs_dotgit: match other .git files, 2018-05-11) and `0fc333ba20` (is_hfs_dotgit: match other .git files, 2018-05-02). Those latter two commits also added similar matching functions for .gitattributes and .gitignore. These ended up not being used in the final series, and are currently dead code. But in preparation for them being used in some fsck checks, let's make sure they actually work by throwing a few basic tests at them. Likewise, let's cover .mailmap (which does need matching code added). I didn't bother with the whole battery of tests that we cover for .gitmodules. These functions are all based on the same generic matcher, so it's sufficient to test most of the corner cases just once. Note that the ntfs magic prefix names in the tests come from the algorithm described in `e7cb0b4455` (and are different for each file). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 11:52:02 +09:00
Junio C Hamano	8e97852919	Merge branch 'ds/sparse-index-protections' Builds on top of the sparse-index infrastructure to mark operations that are not ready to mark with the sparse index, causing them to fall back on fully-populated index that they always have worked with. * ds/sparse-index-protections: (47 commits) name-hash: use expand_to_path() sparse-index: expand_to_path() name-hash: don't add directories to name_hash revision: ensure full index resolve-undo: ensure full index read-cache: ensure full index pathspec: ensure full index merge-recursive: ensure full index entry: ensure full index dir: ensure full index update-index: ensure full index stash: ensure full index rm: ensure full index merge-index: ensure full index ls-files: ensure full index grep: ensure full index fsck: ensure full index difftool: ensure full index commit: ensure full index checkout: ensure full index ...	2021-04-30 13:50:26 +09:00
Junio C Hamano	13158b9910	Merge branch 'jk/promisor-optim' Handling of "promisor packs" that allows certain objects to be missing and lazily retrievable has been optimized (a bit). * jk/promisor-optim: revision: avoid parsing with --exclude-promisor-objects lookup_unknown_object(): take a repository argument is_promisor_object(): free tree buffer after parsing	2021-04-30 13:50:24 +09:00
brian m. carlson	14228447c9	hash: provide per-algorithm null OIDs Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
Junio C Hamano	ab99efc817	Merge branch 'ab/userdiff-tests' A bit of code clean-up and a lot of test clean-up around userdiff area. * ab/userdiff-tests: blame tests: simplify userdiff driver test blame tests: don't rely on t/t4018/ directory userdiff: remove support for "broken" tests userdiff tests: list builtin drivers via test-tool userdiff tests: explicitly test "default" pattern userdiff: add and use for_each_userdiff_driver() userdiff style: normalize pascal regex declaration userdiff style: declare patterns with consistent style userdiff style: re-order drivers in alphabetical order	2021-04-20 17:23:34 -07:00
Junio C Hamano	8446b388b1	Merge branch 'cc/test-helper-bloom-usage-fix' Usage message fix for a test helper. * cc/test-helper-bloom-usage-fix: test-bloom: fix missing 'bloom' from usage string	2021-04-13 15:28:52 -07:00
Junio C Hamano	0623669fc6	Merge branch 'tb/pack-preferred-tips-to-give-bitmap' A configuration variable has been added to force tips of certain refs to be given a reachability bitmap. * tb/pack-preferred-tips-to-give-bitmap: builtin/pack-objects.c: respect 'pack.preferBitmapTips' t/helper/test-bitmap.c: initial commit pack-bitmap: add 'test_bitmap_commits()' helper	2021-04-13 15:28:50 -07:00
Jeff King	45a187cc34	lookup_unknown_object(): take a repository argument All of the other lookup_foo() functions take a repository argument, but lookup_unknown_object() was never converted, and it uses the_repository internally. Let's fix that. We could leave a wrapper that uses the_repository, but there aren't that many calls, so we'll just convert them all. I looked briefly at each site to see if we had a repository struct (besides the_repository) we could pass, but none of them do (so this conversion to pass the_repository is a pure noop in each case, though it does take us one step closer to eventually getting rid of the_repository). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 13:18:46 -07:00
Junio C Hamano	e6b971fcf5	Merge branch 'tb/reverse-midx' An on-disk reverse-index to map the in-pack location of an object back to its object name across multiple packfiles is introduced. * tb/reverse-midx: midx.c: improve cache locality in midx_pack_order_cmp() pack-revindex: write multi-pack reverse indexes pack-write.c: extract 'write_rev_file_order' pack-revindex: read multi-pack reverse indexes Documentation/technical: describe multi-pack reverse indexes midx: make some functions non-static midx: keep track of the checksum midx: don't free midx_name early midx: allow marking a pack as preferred t/helper/test-read-midx.c: add '--show-objects' builtin/multi-pack-index.c: display usage on unrecognized command builtin/multi-pack-index.c: don't enter bogus cmd_mode builtin/multi-pack-index.c: split sub-commands builtin/multi-pack-index.c: define common usage with a macro builtin/multi-pack-index.c: don't handle 'progress' separately builtin/multi-pack-index.c: inline 'flags' with options	2021-04-08 13:23:25 -07:00
Ævar Arnfjörð Bjarmason	28e8f0d5e5	userdiff tests: list builtin drivers via test-tool Change the userdiff test to list the builtin drivers via the test-tool, using the new for_each_userdiff_driver() API function. This gets rid of the need to modify this part of the test every time a new pattern is added, see `2ff6c34612` (userdiff: support Bash, 2020-10-22) and `09dad9256a` (userdiff: support Markdown, 2020-05-02) for two recent examples. I only need the "list-builtin-drivers "argument here, but let's add "list-custom-drivers" and "list-drivers" too, just because it's easy. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-08 12:19:10 -07:00
Christian Couder	dba94e3a85	test-bloom: fix missing 'bloom' from usage string Like 'get_murmur3' and 'generate_filter', 'get_filter_for_commit' is a subcommand of `test-tool bloom` not of `test-tool` itself. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-05 22:54:34 -07:00
Junio C Hamano	861794b60d	Merge branch 'jh/simple-ipc' A simple IPC interface gets introduced to build services like fsmonitor on top. * jh/simple-ipc: t0052: add simple-ipc tests and t/helper/test-simple-ipc tool simple-ipc: add Unix domain socket implementation unix-stream-server: create unix domain socket under lock unix-socket: disallow chdir() when creating unix domain sockets unix-socket: add backlog size option to unix_stream_listen() unix-socket: eliminate static unix_stream_socket() helper function simple-ipc: add win32 implementation simple-ipc: design documentation for new IPC mechanism pkt-line: add options argument to read_packetized_to_strbuf() pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option pkt-line: do not issue flush packets in write_packetized_*() pkt-line: eliminate the need for static buffer in packet_write_gently()	2021-04-02 14:43:14 -07:00
Taylor Blau	483fa7f42d	t/helper/test-bitmap.c: initial commit Add a new 'bitmap' test-tool which can be used to list the commits that have received bitmaps. In theory, a determined tester could run 'git rev-list --test-bitmap <commit>' to check if '<commit>' received a bitmap or not, since '--test-bitmap' exits with a non-zero code when it can't find the requested commit. But this is a dubious behavior to rely on, since arguably 'git rev-list' could continue its object walk outside of which commits are covered by bitmaps. This will be used to test the behavior of 'pack.preferBitmapTips', which will be added in the following patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-31 23:14:03 -07:00
Derrick Stolee	2782db3eed	test-tool: don't force full index We will use 'test-tool read-cache --table' to check that a sparse index is written as part of init_repos. Since we will no longer always expand a sparse index into a full index, add an '--expand' parameter that adds a call to ensure_full_index() so we can compare a sparse index directly against a full index, or at least what the in-memory index looks like when expanded in this way. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:46 -07:00
Derrick Stolee	e2df6c3972	test-read-cache: print cache entries with --table This table is helpful for discovering data in the index to ensure it is being written correctly, especially as we build and test the sparse-index. This table includes an output format similar to 'git ls-tree', but should not be compared to that directly. The biggest reasons are that 'git ls-tree' includes a tree entry for every subdirectory, even those that would not appear as a sparse directory in a sparse-index. Further, 'git ls-tree' does not use a trailing directory separator for its tree rows. This does not print the stat() information for the blobs. That will be added in a future change with another option. The tests that are added in the next few changes care only about the object types and IDs. However, this future need for full index information justifies the need for this test helper over extending a user-facing feature, such as 'git ls-files'. To make the option parsing slightly more robust, wrap the string comparisons in a loop adapted from test-dir-iterator.c. Care must be taken with the final check for the 'cnt' variable. We continue the expectation that the numerical value is the final argument. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:46 -07:00
Taylor Blau	86d174b724	t/helper/test-read-midx.c: add '--show-objects' The 'read-midx' helper is used in places like t5319 to display basic information about a multi-pack-index. In the next patch, the MIDX writing machinery will learn a new way to choose from which pack an object is selected when multiple copies of that object exist. To disambiguate which pack introduces an object so that this feature can be tested, add a '--show-objects' option which displays additional information about each object in the MIDX. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:16:56 -07:00
Junio C Hamano	858119f6d7	Merge branch 'nk/diff-index-fsmonitor' "git diff-index" codepath has been taught to trust fsmonitor status to reduce number of lstat() calls. * nk/diff-index-fsmonitor: fsmonitor: add perf test for git diff HEAD fsmonitor: add assertion that fsmonitor is valid to check_removed fsmonitor: skip lstat deletion check during git diff-index	2021-03-24 14:36:27 -07:00
Jeff Hostetler	36a7eb6876	t0052: add simple-ipc tests and t/helper/test-simple-ipc tool Create t0052-simple-ipc.sh with unit tests for the "simple-ipc" mechanism. Create t/helper/test-simple-ipc test tool to exercise the "simple-ipc" functions. When the tool is invoked with "run-daemon", it runs a server to listen for "simple-ipc" connections on a test socket or named pipe and responds to a set of commands to exercise/stress the communication setup. When the tool is invoked with "start-daemon", it spawns a "run-daemon" command in the background and waits for the server to become ready before exiting. (This helps make unit tests in t0052 more predictable and avoids the need for arbitrary sleeps in the test script.) The tool also has a series of client "send" commands to send commands and data to a server instance. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-22 11:52:54 -07:00
Junio C Hamano	eabacfd9cb	Merge branch 'jc/calloc-fix' Code clean-up. * jc/calloc-fix: xcalloc: use CALLOC_ARRAY() when applicable	2021-03-19 15:25:37 -07:00
Nipunn Koorapati	7e5aa13d2c	fsmonitor: add perf test for git diff HEAD Update the xargs call so that if your large repo contains symlinks, test-tool chmtime failure does not end the script. On Linux Test this tree upstream/master --------------------------------------------------------------------------------------------------------- 7519.4: status (fsmonitor=fsmonitor-watchman) 0.52(0.43+0.10) 0.53(0.49+0.05) +1.9% 7519.5: status -uno (fsmonitor=fsmonitor-watchman) 0.21(0.15+0.07) 0.22(0.13+0.09) +4.8% 7519.6: status -uall (fsmonitor=fsmonitor-watchman) 1.65(0.93+0.71) 1.69(1.03+0.65) +2.4% 7519.7: status (dirty) (fsmonitor=fsmonitor-watchman) 11.99(11.34+1.58) 11.95(11.02+1.79) -0.3% 7519.8: diff (fsmonitor=fsmonitor-watchman) 0.25(0.17+0.26) 0.25(0.18+0.26) +0.0% 7519.9: diff HEAD (fsmonitor=fsmonitor-watchman) 0.39(0.25+0.34) 0.89(0.35+0.74) +128.2% 7519.10: diff -- 0_files (fsmonitor=fsmonitor-watchman) 0.16(0.13+0.04) 0.16(0.12+0.05) +0.0% 7519.11: diff -- 10_files (fsmonitor=fsmonitor-watchman) 0.16(0.12+0.05) 0.16(0.12+0.05) +0.0% 7519.12: diff -- 100_files (fsmonitor=fsmonitor-watchman) 0.16(0.12+0.05) 0.16(0.12+0.05) +0.0% 7519.13: diff -- 1000_files (fsmonitor=fsmonitor-watchman) 0.16(0.11+0.06) 0.16(0.12+0.05) +0.0% 7519.14: diff -- 10000_files (fsmonitor=fsmonitor-watchman) 0.18(0.13+0.06) 0.17(0.10+0.08) -5.6% 7519.15: add (fsmonitor=fsmonitor-watchman) 2.25(1.53+0.68) 2.25(1.47+0.74) +0.0% 7519.18: status (fsmonitor=disabled) 0.88(0.73+1.03) 0.89(0.67+1.08) +1.1% 7519.19: status -uno (fsmonitor=disabled) 0.45(0.43+0.89) 0.45(0.34+0.98) +0.0% 7519.20: status -uall (fsmonitor=disabled) 1.88(1.16+1.58) 1.88(1.22+1.51) +0.0% 7519.21: status (dirty) (fsmonitor=disabled) 7.53(7.05+2.11) 7.53(6.98+2.04) +0.0% 7519.22: diff (fsmonitor=disabled) 0.42(0.37+0.92) 0.42(0.38+0.91) +0.0% 7519.23: diff HEAD (fsmonitor=disabled) 0.44(0.41+0.90) 0.44(0.40+0.91) +0.0% 7519.24: diff -- 0_files (fsmonitor=disabled) 0.13(0.09+0.05) 0.13(0.09+0.05) +0.0% 7519.25: diff -- 10_files (fsmonitor=disabled) 0.13(0.10+0.04) 0.13(0.10+0.04) +0.0% 7519.26: diff -- 100_files (fsmonitor=disabled) 0.13(0.09+0.05) 0.13(0.10+0.04) +0.0% 7519.27: diff -- 1000_files (fsmonitor=disabled) 0.13(0.09+0.06) 0.13(0.09+0.05) +0.0% 7519.28: diff -- 10000_files (fsmonitor=disabled) 0.14(0.11+0.05) 0.14(0.10+0.05) +0.0% 7519.29: add (fsmonitor=disabled) 2.43(1.61+1.64) 2.43(1.69+1.57) +0.0% On linux (2.29.2 vs w/ this patch): nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff 2>&1 \| grep lstat 0.04 0.000063 3 20 6 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff HEAD 2>&1 \| grep lstat 94.98 5.242262 10 523783 13 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff 2>&1 \| grep lstat 0.38 0.000032 5 7 3 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff HEAD 2>&1 \| grep lstat 99.44 0.741892 9 81634 10 lstat On mac (2.29.2 vs w/ this patch): nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff 2>&1 \| grep "^lstat64 " lstat64 8 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff HEAD 2>&1 \| grep "^lstat64 " lstat64 120242 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff 2>&1 \| grep "^lstat64 " lstat64 4 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff HEAD 2>&1 \| grep "^lstat64 " lstat64 4497 There are still a bunch of lstats - on directories, but not every file. Progress! Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 13:31:14 -07:00
Junio C Hamano	486f4bd183	xcalloc: use CALLOC_ARRAY() when applicable These are for codebase before Git 2.31 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 17:51:10 -07:00
Junio C Hamano	aa2d3dbdf5	Merge branch 'jt/trace2-BUG' Even though invocations of "die()" were logged to the trace2 system, "BUG()"s were not, which has been corrected. * jt/trace2-BUG: usage: trace2 BUG() invocations	2021-02-17 17:21:41 -08:00
Junio C Hamano	8b4701ae4f	Merge branch 'ak/corrected-commit-date' The commit-graph learned to use corrected commit dates instead of the generation number to help topological revision traversal. * ak/corrected-commit-date: doc: add corrected commit date info commit-reach: use corrected commit dates in paint_down_to_common() commit-graph: use generation v2 only if entire chain does commit-graph: implement generation data chunk commit-graph: implement corrected commit date commit-graph: return 64-bit generation number commit-graph: add a slab to store topological levels t6600-test-reach: generalize *_three_modes commit-graph: consolidate fill_commit_graph_info revision: parse parent in indegree_walk_step() commit-graph: fix regression when computing Bloom filters	2021-02-17 17:21:40 -08:00
Junio C Hamano	59ace284f3	Merge branch 'ab/grep-pcre-invalid-utf8' Update support for invalid UTF-8 in PCRE2. * ab/grep-pcre-invalid-utf8: grep/pcre2: better support invalid UTF-8 haystacks grep/pcre2 tests: don't rely on invalid UTF-8 data test	2021-02-10 14:48:33 -08:00
Jonathan Tan	0a9dde4a04	usage: trace2 BUG() invocations die() messages are traced in trace2, but BUG() messages are not. Anyone tracking die() messages would have even more reason to track BUG(). Therefore, write to trace2 when BUG() is invoked. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-09 14:14:34 -08:00
Ævar Arnfjörð Bjarmason	95ca1f987e	grep/pcre2: better support invalid UTF-8 haystacks Improve the support for invalid UTF-8 haystacks given a non-ASCII needle when using the PCREv2 backend. This is a more complete fix for a bug I started to fix in `870eea8166` (grep: do not enter PCRE2_UTF mode on fixed matching, 2019-07-26), now that PCREv2 has the PCRE2_MATCH_INVALID_UTF mode we can make use of it. This fixes the sort of case described in `8a5999838e` (grep: stess test PCRE v2 on invalid UTF-8 data, 2019-07-26), i.e.: - The subject string is non-ASCII (e.g. "ævar") - We're under a is_utf8_locale(), e.g. "en_US.UTF-8", not "C" - We are using --ignore-case, or we're a non-fixed pattern If those conditions were satisfied and we matched found non-valid UTF-8 data PCREv2 might bark on it, in practice this only happened under the JIT backend (turned on by default on most platforms). Ultimately this fixes a "regression" in `b65abcafc7` ("grep: use PCRE v2 for optimized fixed-string search", 2019-07-01), I'm putting that in scare-quotes because before then we wouldn't properly support these complex case-folding, locale etc. cases either, it just broke in different ways. There was a bug related to this the PCRE2_NO_START_OPTIMIZE flag fixed in PCREv2 10.36. It can be worked around by setting the PCRE2_NO_START_OPTIMIZE flag. Let's do that in those cases, and add tests for the bug. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-24 16:09:17 -08:00
Jeff King	36a317929b	refs: switch peel_ref() to peel_iterated_oid() The peel_ref() interface is confusing and error-prone: - it's typically used by ref iteration callbacks that have both a refname and oid. But since they pass only the refname, we may load the ref value from the filesystem again. This is inefficient, but also means we are open to a race if somebody simultaneously updates the ref. E.g., this: int some_ref_cb(const char refname, const struct object_id oid, ...) { if (!peel_ref(refname, &peeled)) printf("%s peels to %s", oid_to_hex(oid), oid_to_hex(&peeled); } could print nonsense. It is correct to say "refname peels to..." (you may see the "before" value or the "after" value, either of which is consistent), but mentioning both oids may be mixing before/after values. Worse, whether this is possible depends on whether the optimization to read from the current iterator value kicks in. So it is actually not possible with: for_each_ref(some_ref_cb); but it _is_ possible with: head_ref(some_ref_cb); which does not use the iterator mechanism (though in practice, HEAD should never peel to anything, so this may not be triggerable). - it must take a fully-qualified refname for the read_ref_full() code path to work. Yet we routinely pass it partial refnames from callbacks to for_each_tag_ref(), etc. This happens to work when iterating because there we do not call read_ref_full() at all, and only use the passed refname to check if it is the same as the iterator. But the requirements for the function parameters are quite unclear. Instead of taking a refname, let's instead take an oid. That fixes both problems. It's a little funny for a "ref" function not to involve refs at all. The key thing is that it's optimizing under the hood based on having access to the ref iterator. So let's change the name to make it clear why you'd want this function versus just peel_object(). There are two other directions I considered but rejected: - we could pass the peel information into the each_ref_fn callback. However, we don't know if the caller actually wants it or not. For packed-refs, providing it is essentially free. But for loose refs, we actually have to peel the object, which would be wasteful in most cases. We could likewise pass in a flag to the callback indicating whether the peeled information is known, but that complicates those callbacks, as they then have to decide whether to manually peel themselves. Plus it requires changing the interface of every callback, whether they care about peeling or not, and there are many of them. - we could make a function to return the peeled value of the current iterated ref (computing it if necessary), and BUG() otherwise. I.e.: int peel_current_iterated_ref(struct object_id *out); Each of the current callers is an each_ref_fn callback, so they'd mostly be happy. But: - we use those callbacks with functions like head_ref(), which do not use the iteration code. So we'd need to handle the fallback case there, anyway. - it's possible that a caller would want to call into generic code that sometimes is used during iteration and sometimes not. This encapsulates the logic to do the fast thing when possible, and fallback when necessary. The implementation is mostly obvious, but I want to call out a few things in the patch: - the test-tool coverage for peel_ref() is now meaningless, as it all collapses to a single peel_object() call (arguably they were pretty uninteresting before; the tricky part of that function is the fast-path we see during iteration, but these calls didn't trigger that). I've just dropped it entirely, though note that some other tests relied on the tags we created; I've moved that creation to the tests where it matters. - we no longer need to take a ref_store parameter, since we'd never look up a ref now. We do still rely on a global "current iterator" variable which _could_ be kept per-ref-store. But in practice this is only useful if there are multiple recursive iterations, at which point the more appropriate solution is probably a stack of iterators. No caller used the actual ref-store parameter anyway (they all call the wrapper that passes the_repository). - the original only kicked in the optimization when the "refname" pointer matched (i.e., not string comparison). We do likewise with the "oid" parameter here, but fall back to doing an actual oideq() call. This in theory lets us kick in the optimization more often, though in practice no current caller cares. It should never be wrong, though (peeling is a property of an object, so two refs pointing to the same object would peel identically). - the original took care not to touch the peeled out-parameter unless we found something to put in it. But no caller cares about this, and anyway, it is enforced by peel_object() itself (and even in the optimized iterator case, that's where we eventually end up). We can shorten the code and avoid an extra copy by just passing the out-parameter through the stack. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-21 15:51:31 -08:00
Abhishek Kumar	e8b63005c4	commit-graph: implement generation data chunk As discovered by Ævar, we cannot increment graph version to distinguish between generation numbers v1 and v2 [1]. Thus, one of pre-requistes before implementing generation number v2 was to distinguish between graph versions in a backwards compatible manner. We are going to introduce a new chunk called Generation DATa chunk (or GDAT). GDAT will store corrected committer date offsets whereas CDAT will still store topological level. Old Git does not understand GDAT chunk and would ignore it, reading topological levels from CDAT. New Git can parse GDAT and take advantage of newer generation numbers, falling back to topological levels when GDAT chunk is missing (as it would happen with a commit-graph written by old Git). We introduce a test environment variable 'GIT_TEST_COMMIT_GRAPH_NO_GDAT' which forces commit-graph file to be written without generation data chunk to emulate a commit-graph file written by old Git. To minimize the space required to store corrrected commit date, Git stores corrected commit date offsets into the commit-graph file, instea of corrected commit dates. This saves us 4 bytes per commit, decreasing the GDAT chunk size by half, but it's possible for the offset to overflow the 4-bytes allocated for storage. As such overflows are and should be exceedingly rare, we use the following overflow management scheme: We introduce a new commit-graph chunk, Generation Data OVerflow ('GDOV') to store corrected commit dates for commits with offsets greater than GENERATION_NUMBER_V2_OFFSET_MAX. If the offset is greater than GENERATION_NUMBER_V2_OFFSET_MAX, we set the MSB of the offset and the other bits store the position of corrected commit date in GDOV chunk, similar to how Extra Edge List is maintained. We test the overflow-related code with the following repo history: F - N - U / \ U - N - U N \ / N - F - N Where the commits denoted by U have committer date of zero seconds since Unix epoch, the commits denoted by N have committer date of 1112354055 (default committer date for the test suite) seconds since Unix epoch and the commits denoted by F have committer date of (2 ^ 31 - 2) seconds since Unix epoch. The largest offset observed is 2 ^ 31, just large enough to overflow. [1]: https://lore.kernel.org/git/87a7gdspo4.fsf@evledraar.gmail.com/ Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Junio C Hamano	8f8f10ac09	Merge branch 'jx/t5411-flake-fix' The exchange between receive-pack and proc-receive hook did not carefully check for errors. * jx/t5411-flake-fix: receive-pack: use default version 0 for proc-receive receive-pack: gently write messages to proc-receive t5411: new helper filter_out_user_friendly_and_stable_output	2020-11-25 15:24:52 -08:00
Junio C Hamano	bf0a430f70	Merge branch 'en/strmap' A specialization of hashmap that uses a string as key has been introduced. Hopefully it will see wider use over time. * en/strmap: shortlog: use strset from strmap.h Use new HASHMAP_INIT macro to simplify hashmap initialization strmap: take advantage of FLEXPTR_ALLOC_STR when relevant strmap: enable allocations to come from a mem_pool strmap: add a strset sub-type strmap: split create_entry() out of strmap_put() strmap: add functions facilitating use as a string->int map strmap: enable faster clearing and reusing of strmaps strmap: add more utility functions strmap: new utility functions hashmap: provide deallocation function names hashmap: introduce a new hashmap_partial_clear() hashmap: allow re-use after hashmap_free() hashmap: adjust spacing to fix argument alignment hashmap: add usage documentation explaining hashmap_free[_entries]()	2020-11-21 15:14:38 -08:00
Junio C Hamano	a1f95951ef	Merge branch 'en/merge-ort-api-null-impl' Preparation for a new merge strategy. * en/merge-ort-api-null-impl: merge,rebase,revert: select ort or recursive by config or environment fast-rebase: demonstrate merge-ort's API via new test-tool command merge-ort-wrappers: new convience wrappers to mimic the old merge API merge-ort: barebones API of new merge strategy with empty implementation	2020-11-18 13:32:53 -08:00
Junio C Hamano	7660da1618	Merge branch 'ds/maintenance-part-3' Parts of "git maintenance" to ease writing crontab entries (and other scheduling system configuration) for it. * ds/maintenance-part-3: maintenance: add troubleshooting guide to docs maintenance: use 'incremental' strategy by default maintenance: create maintenance.strategy config maintenance: add start/stop subcommands maintenance: add [un]register subcommands for-each-repo: run subcommands on configured repos maintenance: add --schedule option and config maintenance: optionally skip --auto process	2020-11-18 13:32:53 -08:00
Elijah Newren	b19315d8ab	Use new HASHMAP_INIT macro to simplify hashmap initialization Now that hashamp has lazy initialization and a HASHMAP_INIT macro, hashmaps allocated on the stack can be initialized without a call to hashmap_init() and in some cases makes the code a bit shorter. Convert some callsites over to take advantage of this. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 12:55:27 -08:00
Jiang Xin	80ffeb94f4	receive-pack: use default version 0 for proc-receive In the verison negotiation phase between "receive-pack" and "proc-receive", "proc-receive" can send an empty flush-pkt to end the negotiation and use default version 0. Capabilities (such as "push-options") are not supported in version 0. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 12:46:56 -08:00
Jiang Xin	f65003b4c4	receive-pack: gently write messages to proc-receive Johannes found a flaky hang in `t5411/test-0013-bad-protocol.sh` in the osx-clang job of the CI/PR builds, and ran into an issue when using the `--stress` option with the following error messages: fatal: unable to write flush packet: Broken pipe send-pack: unexpected disconnect while reading sideband packet fatal: the remote end hung up unexpectedly In this test case, the "proc-receive" hook sends an error message and dies earlier. While "receive-pack" on the other side of the pipe should forward the error message of the "proc-receive" hook to the client side, but it fails to do so. This is because "receive-pack" uses `packet_write_fmt()` and `packet_flush()` to write pkt-line message to "proc-receive" hook, and these functions die immediately when pipe is broken. Using "gently" forms for these functions will get more predicable output. Add more "--die-*" options to test helper to test different stages of the protocol between "receive-pack" and "proc-receive" hook. Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 12:46:56 -08:00
Junio C Hamano	6b9f5096eb	Merge branch 'js/avoid-split-sideband-message' The side-band status report can be sent at the same time as the primary payload multiplexed, but the demultiplexer on the receiving end incorrectly split a single status report into two, which has been corrected. * js/avoid-split-sideband-message: test-pkt-line: drop colon from sideband identity sideband: report unhandled incomplete sideband messages as bugs sideband: avoid reporting incomplete sideband messages	2020-11-02 13:17:37 -08:00
Elijah Newren	6da1a25814	hashmap: provide deallocation function names hashmap_free(), hashmap_free_entries(), and hashmap_free_() have existed for a while, but aren't necessarily the clearest names, especially with hashmap_partial_clear() being added to the mix and lazy-initialization now being supported. Peff suggested we adopt the following names[1]: - hashmap_clear() - remove all entries and de-allocate any hashmap-specific data, but be ready for reuse - hashmap_clear_and_free() - ditto, but free the entries themselves - hashmap_partial_clear() - remove all entries but don't deallocate table - hashmap_partial_clear_and_free() - ditto, but free the entries This patch provides the new names and converts all existing callers over to the new naming scheme. [1] https://lore.kernel.org/git/20201030125059.GA3277724@coredump.intra.peff.net/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-02 12:15:50 -08:00
Elijah Newren	fe1a21d526	fast-rebase: demonstrate merge-ort's API via new test-tool command Add a new test-tool command named 'fast-rebase', which is a super-slimmed down and nowhere near as capable version of 'git rebase'. 'test-tool fast-rebase' is not currently planned for usage in the testsuite, but is here for two purposes: 1) Demonstrate the desired API of merge-ort. In particular, fast-rebase takes advantage of the separation of the merging operation from the updating of the index and working tree, to allow it to pick N commits, but only update the index and working tree once at the end. Look for the calls to merge_incore_nonrecursive() and merge_switch_to_result(). 2) Provide a convenient benchmark that isn't polluted by the heavy disk writing and forking of unnecessary processes that comes from sequencer.c and merge-recursive.c. fast-rebase is not meant to replace sequencer.c, just give ideas on how sequencer.c can be changed. Updating sequencer.c with these goals is probably a large amount of work; writing a simple targeted command with no documentation, less-than-useful help messages, numerous limitations in terms of flags it can accept and situations it can handle, and which is flagged off from users is a much easier interim step. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-29 14:05:48 -07:00
Jeff King	712b0377db	test-pkt-line: drop colon from sideband identity We pass "sideband: " as our identity for errors to recv_sideband(). But it already adds the trailing colon and space. This doesn't invalidate any tests, but it looks funny when you examine the test output. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-27 11:57:51 -07:00
Johannes Schindelin	17e7dbbcbc	sideband: avoid reporting incomplete sideband messages In `2b695ecd74` (t5500: count objects through stderr, not trace, 2020-05-06) we tried to ensure that the "Total 3" message could be grepped in Git's output, even if it sometimes got chopped up into multiple lines in the trace machinery. However, the first instance where this mattered now goes through the sideband machinery, where it is _still_ possible for messages to get chopped up: it is possible for the standard error stream to be sent byte-for-byte and hence it can be easily interrupted. Meaning: it is possible for the single line that we're looking for to be chopped up into multiple sideband packets, with a primary packet being delivered between them. This seems to happen occasionally in the `vs-test` part of our CI builds, i.e. with binaries built using Visual C, but not when building with GCC or clang; The symptom is that t5500.43 fails to find a line matching `remote: Total 3` in the `log` file, which ends in something along these lines: remote: Tota remote: l 3 (delta 0), reused 0 (delta 0), pack-reused 0 This should not happen, though: we have code in `demultiplex_sideband()` _specifically_ to stitch back together lines that were delivered in separate sideband packets. However, this stitching was broken in a subtle way in `fbd76cd450` (sideband: reverse its dependency on pkt-line, 2019-01-16): before that change, incomplete sideband lines would not be flushed upon receiving a primary packet, but after that patch, they would be. The subtleness of this bug comes from the fact that it is easy to get confused by the ambiguous meaning of the `break` keyword: after writing the primary packet contents, the `break;` in the original version of `recv_sideband()` does _not_ break out of the `while` loop, but instead only ends the `switch` case: while (!retval) { [...] switch (band) { [...] case 1: /* Write the contents of the primary packet / write_or_die(out, buf + 1, len); / Here, we do not break out of the loop, `retval` is unchanged / break; [...] } if (outbuf.len) { / Write any remaining sideband messages lacking a trailing LF / strbuf_addch(&outbuf, '\n'); xwrite(2, outbuf.buf, outbuf.len); } In contrast, after `fbd76cd450` (sideband: reverse its dependency on pkt-line, 2019-01-16), the body of the `while` loop was extracted into `demultiplex_sideband()`, crucially _including_ the logic to write incomplete sideband messages: switch (band) { [...] case 1: sideband_type = SIDEBAND_PRIMARY; /* This does not break out of the loop: the loop is in the caller / break; [...] } cleanup: [...] / This logic is now no longer _outside_ the loop but _inside_ */ if (scratch->len) { strbuf_addch(scratch, '\n'); xwrite(2, scratch->buf, scratch->len); } The correct way to fix this is to return from `demultiplex_sideband()` early. The caller will then write out the contents of the primary packet and continue looping. The `scratch` buffer for incomplete sideband messages is owned by that caller, and will continue to accumulate the remainder(s) of those messages. The loop will only end once `demultiplex_sideband()` returned non-zero _and_ did not indicate a primary packet, which is the case only when we hit the `cleanup:` path, in which we take care of flushing any unfinished sideband messages and release the `scratch` buffer. To ensure that this does not get broken again, we introduce a pair of subcommands of the `pkt-line` test helper that specifically chop up the sideband message and squeeze a primary packet into the middle. Final note: The other test case touched by `2b695ecd74` (t5500: count objects through stderr, not trace, 2020-05-06) is not affected by this issue because the sideband machinery is not involved there. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 13:31:00 -07:00
Junio C Hamano	19dd352d03	Merge branch 'jk/unused' Code cleanup. * jk/unused: dir.c: drop unused "untracked" from treat_path_fast() sequencer: handle ignore_footer when parsing trailers test-advise: check argument count with argc instead of argv sparse-checkout: fill in some options boilerplate sequencer: drop repository argument from run_git_commit() push: drop unused repo argument to do_push() assert PARSE_OPT_NONEG in parse-options callbacks env--helper: write to opt->value in parseopt helper drop unused argc parameters convert: drop unused crlf_action from check_global_conv_flags_eol()	2020-10-05 14:01:52 -07:00
Junio C Hamano	c01b041ef0	Merge branch 'ds/in-merge-bases-many-optim-bug' in_merge_bases_many(), a way to see if a commit is reachable from any commit in a set of commits, was totally broken when the commit-graph feature was in use, which has been corrected. * ds/in-merge-bases-many-optim-bug: commit-reach: fix in_merge_bases_many bug	2020-10-05 14:01:50 -07:00
Derrick Stolee	8791bf1841	commit-reach: fix in_merge_bases_many bug Way back in `f9b8908b` (commit.c: use generation numbers for in_merge_bases(), 2018-05-01), a heuristic was used to short-circuit the in_merge_bases() walk. This works just fine as long as the caller is checking only two commits, but when there are multiple, there is a possibility that this heuristic is _very wrong_. Some code moves since then has changed this method to repo_in_merge_bases_many() inside commit-reach.c. The heuristic computes the minimum generation number of the "reference" list, then compares this number to the generation number of the "commit". In a recent topic, a test was added that used in_merge_bases_many() to test if a commit was reachable from a number of commits pulled from a reflog. However, this highlighted the problem: if any of the reference commits have a smaller generation number than the given commit, then the walk is skipped _even if there exist some with higher generation number_. This heuristic is wrong! It must check the MAXIMUM generation number of the reference commits, not the MINIMUM. This highlights a testing gap. t6600-test-reach.sh covers many methods in commit-reach.c, including in_merge_bases() and get_merge_bases_many(), but since these methods either restrict to two input commits or actually look for the full list of merge bases, they don't check this heuristic! Add a possible input to "test-tool reach" that tests in_merge_bases_many() and add tests to t6600-test-reach.sh that cover this heuristic. This includes cases for the reference commits having generation above and below the generation of the input commit, but also having maximum generation below the generation of the input commit. The fix itself is to swap min_generation with a max_generation in repo_in_merge_bases_many(). Reported-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-02 10:26:31 -07:00
Jeff King	26e28fe7bb	test-advise: check argument count with argc instead of argv We complain if "test-tool advise" is not given an argument, but we quietly ignore any additional arguments it receives. Let's instead check that we got the expected number. As a bonus, this silences -Wunused-parameter, which notes that we don't ever look at argc. While we're here, we can also fix the indentation in the conditional. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 12:53:48 -07:00
Jeff King	e885a84f1b	drop unused argc parameters Many functions take an argv/argc pair, but never actually look at argc. This makes it useless at best (we use the NULL sentinel in argv to find the end of the array), and misleading at worst (what happens if the argc count does not match the argv NULL?). In each of these instances, the argv NULL does match the argc count, so there are no bugs here. But let's tighten the interfaces to make it harder to get wrong (and to reduce some -Wunused-parameter complaints). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-30 12:53:47 -07:00
Junio C Hamano	288ed98bf7	Merge branch 'tb/bloom-improvements' "git commit-graph write" learned to limit the number of bloom filters that are computed from scratch with the --max-new-filters option. * tb/bloom-improvements: commit-graph: introduce 'commitGraph.maxNewFilters' builtin/commit-graph.c: introduce '--max-new-filters=<n>' commit-graph: rename 'split_commit_graph_opts' bloom: encode out-of-bounds filters as non-empty bloom/diff: properly short-circuit on max_changes bloom: use provided 'struct bloom_filter_settings' bloom: split 'get_bloom_filter()' in two commit-graph.c: store maximum changed paths commit-graph: respect 'commitGraph.readChangedPaths' t/helper/test-read-graph.c: prepare repo settings commit-graph: pass a 'struct repository *' in more places t4216: use an '&&'-chain commit-graph: introduce 'get_bloom_filter_settings()'	2020-09-29 14:01:20 -07:00
Junio C Hamano	6c430a647c	Merge branch 'jx/proc-receive-hook' "git receive-pack" that accepts requests by "git push" learned to outsource most of the ref updates to the new "proc-receive" hook. * jx/proc-receive-hook: doc: add documentation for the proc-receive hook transport: parse report options for tracking refs t5411: test updates of remote-tracking branches receive-pack: new config receive.procReceiveRefs doc: add document for capability report-status-v2 New capability "report-status-v2" for git-push receive-pack: feed report options to post-receive receive-pack: add new proc-receive hook t5411: add basic test cases for proc-receive hook transport: not report a non-head push as a branch	2020-09-25 15:25:39 -07:00
Derrick Stolee	2fec604f8d	maintenance: add start/stop subcommands Add new subcommands to 'git maintenance' that start or stop background maintenance using 'cron', when available. This integration is as simple as I could make it, barring some implementation complications. The schedule is laid out as follows: 0 1-23 * * * $cmd maintenance run --schedule=hourly 0 0 * * 1-6 $cmd maintenance run --schedule=daily 0 0 * * 0 $cmd maintenance run --schedule=weekly where $cmd is a properly-qualified 'git for-each-repo' execution: $cmd=$path/git --exec-path=$path for-each-repo --config=maintenance.repo where $path points to the location of the Git executable running 'git maintenance start'. This is critical for systems with multiple versions of Git. Specifically, macOS has a system version at '/usr/bin/git' while the version that users can install resides at '/usr/local/bin/git' (symlinked to '/usr/local/libexec/git-core/git'). This will also use your locally-built version if you build and run this in your development environment without installing first. This conditional schedule avoids having cron launch multiple 'git for-each-repo' commands in parallel. Such parallel commands would likely lead to the 'hourly' and 'daily' tasks competing over the object database lock. This could lead to to some tasks never being run! Since the --schedule=<frequency> argument will run all tasks with _at least_ the given frequency, the daily runs will also run the hourly tasks. Similarly, the weekly runs will also run the daily and hourly tasks. The GIT_TEST_CRONTAB environment variable is not intended for users to edit, but instead as a way to mock the 'crontab [-l]' command. This variable is set in test-lib.sh to avoid a future test from accidentally running anything with the cron integration from modifying the user's schedule. We use GIT_TEST_CRONTAB='test-tool crontab <file>' in our tests to check how the schedule is modified in 'git maintenance (start\|stop)' commands. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-25 10:59:44 -07:00
Taylor Blau	9a7a9ed10d	bloom: use provided 'struct bloom_filter_settings' When 'get_or_compute_bloom_filter()' needs to compute a Bloom filter from scratch, it looks to the default 'struct bloom_filter_settings' in order to determine the maximum number of changed paths, number of bits per entry, and so on. All of these values have so far been constant, and so there was no need to pass in a pointer from the caller (eg., the one that is stored in the 'struct write_commit_graph_context'). Start passing in a 'struct bloom_filter_settings *' instead of using the default values to respect graph-specific settings (eg., in the case of setting 'GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS'). In order to have an initialized value for these settings, move its initialization to earlier in the commit-graph write. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 09:31:25 -07:00
Taylor Blau	312cff5207	bloom: split 'get_bloom_filter()' in two 'get_bloom_filter' takes a flag to control whether it will compute a Bloom filter if the requested one is missing. In the next patch, we'll add yet another parameter to this method, which would force all but one caller to specify an extra 'NULL' parameter at the end. Instead of doing this, split 'get_bloom_filter' into two functions: 'get_bloom_filter' and 'get_or_compute_bloom_filter'. The former only looks up a Bloom filter (and does not compute one if it's missing, thus dropping the 'compute_if_not_present' flag). The latter does compute missing Bloom filters, with an additional parameter to store whether or not it needed to do so. This simplifies many call-sites, since the majority of existing callers to 'get_bloom_filter' do not want missing Bloom filters to be computed (so they can drop the parameter entirely and use the simpler version of the function). While we're at it, instrument the new 'get_or_compute_bloom_filter()' with counters in the 'write_commit_graph_context' struct which store the number of filters that we did and didn't compute, as well as filters that were truncated. It would be nice to drop the 'compute_if_not_present' flag entirely, since all remaining callers of 'get_or_compute_bloom_filter' pass it as '1', but this will change in a future patch and hence cannot be removed. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 09:31:25 -07:00
Taylor Blau	24f951a492	t/helper/test-read-graph.c: prepare repo settings The read-graph test-tool is used by a number of the commit-graph test to assert various properties about a commit-graph. Previously, this program never ran 'prepare_repo_settings()'. There was no need to do so, since none of the commit-graph machinery is affected by the repo settings. In the next patch, the commit-graph machinery's behavior will become dependent on the repo settings, and so loading them before running the rest of the test tool is critical. As such, teach the test tool to call 'prepare_repo_settings()'. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-09 12:51:48 -07:00
Junio C Hamano	afd49c39dd	Merge branch 'jk/slimmed-down' Trim an unused binary and turn a bunch of commands into built-in. * jk/slimmed-down: drop vcs-svn experiment make git-fast-import a builtin make git-bugreport a builtin make credential helpers builtins Makefile: drop builtins from MSVC pdb list	2020-09-03 12:37:02 -07:00
Junio C Hamano	0d9a8e33f9	Merge branch 'jk/leakfix' Code clean-up. * jk/leakfix: submodule--helper: fix leak of core.worktree value config: fix leak in git_config_get_expiry_in_days() config: drop git_config_get_string_const() config: fix leaks from git_config_get_string_const() checkout: fix leak of non-existent branch names submodule--helper: use strbuf_release() to free strbufs clear_pattern_list(): clear embedded hashmaps	2020-08-27 14:04:49 -07:00
Jiang Xin	15d3af5e22	receive-pack: add new proc-receive hook Git calls an internal `execute_commands` function to handle commands sent from client to `git-receive-pack`. Regardless of what references the user pushes, git creates or updates the corresponding references if the user has write-permission. A contributor who has no write-permission, cannot push to the repository directly. So, the contributor has to write commits to an alternate location, and sends pull request by emails or by other ways. We call this workflow as a distributed workflow. It would be more convenient to work in a centralized workflow like what Gerrit provided for some cases. For example, a read-only user who cannot push to a branch directly can run the following `git push` command to push commits to a pseudo reference (has a prefix "refs/for/", not "refs/heads/") to create a code review. git push origin \ HEAD:refs/for/<branch-name>/<session> The `<branch-name>` in the above example can be as simple as "master", or a more complicated branch name like "foo/bar". The `<session>` in the above example command can be the local branch name of the client side, such as "my/topic". We cannot implement a centralized workflow elegantly by using "pre-receive" + "post-receive", because Git will call the internal function "execute_commands" to create references (even the special pseudo reference) between these two hooks. Even though we can delete the temporarily created pseudo reference via the "post-receive" hook, having a temporary reference is not safe for concurrent pushes. So, add a filter and a new handler to support this kind of workflow. The filter will check the prefix of the reference name, and if the command has a special reference name, the filter will turn a specific field (`run_proc_receive`) on for the command. Commands with this filed turned on will be executed by a new handler (a hook named "proc-receive") instead of the internal `execute_commands` function. We can use this "proc-receive" command to create pull requests or send emails for code review. Suggested by Junio, this "proc-receive" hook reads the commands, push-options (optional), and send result using a protocol in pkt-line format. In the following example, the letter "S" stands for "receive-pack" and letter "H" stands for the hook. # Version and features negotiation. S: PKT-LINE(version=1\0push-options atomic...) S: flush-pkt H: PKT-LINE(version=1\0push-options...) H: flush-pkt # Send commands from server to the hook. S: PKT-LINE(<old-oid> <new-oid> <ref>) S: ... ... S: flush-pkt # Send push-options only if the 'push-options' feature is enabled. S: PKT-LINE(push-option) S: ... ... S: flush-pkt # Receive result from the hook. # OK, run this command successfully. H: PKT-LINE(ok <ref>) # NO, I reject it. H: PKT-LINE(ng <ref> <reason>) # Fall through, let 'receive-pack' to execute it. H: PKT-LINE(ok <ref>) H: PKT-LINE(option fall-through) # OK, but has an alternate reference. The alternate reference name # and other status can be given in options H: PKT-LINE(ok <ref>) H: PKT-LINE(option refname <refname>) H: PKT-LINE(option old-oid <old-oid>) H: PKT-LINE(option new-oid <new-oid>) H: PKT-LINE(option forced-update) H: ... ... H: flush-pkt After receiving a command, the hook will execute the command, and may create/update different reference. For example, a command for a pseudo reference "refs/for/master/topic" may create/update different reference such as "refs/pull/123/head". The alternate reference name and other status are given in option lines. The list of commands returned from "proc-receive" will replace the relevant commands that are sent from user to "receive-pack", and "receive-pack" will continue to run the "execute_commands" function and other routines. Finally, the result of the execution of these commands will be reported to end user. The reporting function from "receive-pack" to "send-pack" will be extended in latter commit just like what the "proc-receive" hook reports to "receive-pack". Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-27 12:47:47 -07:00
Derrick Stolee	d96075428a	multi-pack-index: use hash version byte Similar to the commit-graph format, the multi-pack-index format has a byte in the header intended to track the hash version used to write the file. This allows one to interpret the hash length without having the context of the repository config specifying the hash length. This was not modified as part of the SHA-256 work because the hash length was automatically up-shifted due to that config. Since we have this byte available, we can make the file formats more obviously incompatible instead of relying on other context from the repository. Add a new oid_version() method in midx.c similar to the one in commit-graph.c. This is specifically made separate from that implementation to avoid artificially linking the formats. The test impact requires a few more things than the corresponding change in the commit-graph format. Specifically, 'test-tool read-midx' was not writing anything about this header value to output. Since the value available in 'struct multi_pack_index' is hash_len instead of a version value, we output "20" or "32" instead of "1" or "2". Since we want a user to not have their Git commands fail if their multi-pack-index has the incorrect hash version compared to the repository's hash version, we relax the die() to an error() in load_multi_pack_index(). This has some effect on 'git multi-pack-index verify' as we need to check that a failed parse of a file that exists is actually a verify error. For that test that checks the hash version matches, we change the corrupted byte from "2" to "3" to ensure the test fails for both hash algorithms. Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-17 16:45:20 -07:00
Jeff King	f1de981e8b	config: fix leaks from git_config_get_string_const() There are two functions to get a single config string: - git_config_get_string() - git_config_get_string_const() One might naively think that the first one allocates a new string and the second one just points us to the internal configset storage. But in fact they both allocate a new copy; the second one exists only to avoid having to cast when using it with a const global which we never intend to free. The documentation for the function explains that clearly, but it seems I'm not alone in being surprised by this. Of 17 calls to the function, 13 of them leak the resulting value. We could obviously fix these by adding the appropriate free(). But it would be simpler still if we actually had a non-allocating way to get the string. There's git_config_get_value() but that doesn't quite do what we want. If the config key is present but is a boolean with no value (e.g., "[foo]bar" in the file), then we'll get NULL (whereas the string versions will print an error and die). So let's introduce a new variant, git_config_get_string_tmp(), that behaves as these callers expect. We need a new name because we have new semantics but the same function signature (so even if we converted the four remaining callers, topics in flight might be surprised). The "tmp" is because this value should only be held onto for a short time. In practice it's rare for us to clear and refresh the configset, invalidating the pointer, but hopefully the "tmp" makes callers think about the lifetime. In each of the converted cases here the value only needs to last within the local function or its immediate caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-14 10:52:04 -07:00
Jeff King	fc47391e24	drop vcs-svn experiment The code in vcs-svn was started in 2010 as an attempt to build a remote-helper for interacting with svn repositories (as opposed to git-svn). However, we never got as far as shipping a mature remote helper, and the last substantive commit was `e99d012a6b` in 2012. We do have a git-remote-testsvn, and it is even installed as part of "make install". But given the name, it seems unlikely to be used by anybody (you'd have to explicitly "git clone testsvn::$url", and there have been zero mentions of that on the mailing list since 2013, and even that includes the phrase "you might need to hack a bit to get it working properly"[1]). We also ship contrib/svn-fe, which builds on the vcs-svn work. However, it does not seem to build out of the box for me, as the link step misses some required libraries for using libgit.a. Curiously, the original build breakage bisects for me to `eff80a9fd9` (Allow custom "comment char", 2013-01-16), which seems unrelated. There was an attempt to fix it in `da011cb0e7` (contrib/svn-fe: fix Makefile, 2014-08-28), but on my system that only switches the error message. So it seems like the result is not really usable by anybody in practice. It would be wonderful if somebody wanted to pick up the topic again, and potentially it's worth carrying around for that reason. But the flip side is that people doing tree-wide operations have to deal with this code. And you can see the list with (replace "HEAD" with this commit as appropriate): { echo "--" git diff-tree --diff-filter=D -r --name-only HEAD^ HEAD } \| git log --no-merges --oneline e99d012a6bc.. --stdin which shows 58 times somebody had to deal with the code, generally due to a compile or test failure, or a tree-wide style fix or API change. Let's drop it and let anybody who wants to pick it up do so by resurrecting it from the git history. As a bonus, this also reduces the size of a stripped installation of Git from 21MB to 19MB. [1] https://lore.kernel.org/git/CALkWK0mPHzKfzFKKpZkfAus3YVC9NFYDbFnt+5JQYVKipk3bQQ@mail.gmail.com/ Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-13 11:02:15 -07:00
Junio C Hamano	e0ad9574dd	Merge branch 'bc/sha-256-part-3' The final leg of SHA-256 transition. * bc/sha-256-part-3: (39 commits) t: remove test_oid_init in tests docs: add documentation for extensions.objectFormat ci: run tests with SHA-256 t: make SHA1 prerequisite depend on default hash t: allow testing different hash algorithms via environment t: add test_oid option to select hash algorithm repository: enable SHA-256 support by default setup: add support for reading extensions.objectformat bundle: add new version for use with SHA-256 builtin/verify-pack: implement an --object-format option http-fetch: set up git directory before parsing pack hashes t0410: mark test with SHA1 prerequisite t5308: make test work with SHA-256 t9700: make hash size independent t9500: ensure that algorithm info is preserved in config t9350: make hash size independent t9301: make hash size independent t9300: use $ZERO_OID instead of hard-coded object ID t9300: abstract away SHA-1-specific constants t8011: make hash size independent ...	2020-08-11 18:04:11 -07:00
Jeff King	d70a9eb611	strvec: rename struct fields The "argc" and "argv" names made sense when the struct was argv_array, but now they're just confusing. Let's rename them to "nr" (which we use for counts elsewhere) and "v" (which is rather terse, but reads well when combined with typical variable names like "args.v"). Note that we have to update all of the callers immediately. Playing tricks with the preprocessor is hard here, because we wouldn't want to rewrite unrelated tokens. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-30 19:18:06 -07:00
brian m. carlson	094a685cd7	t: make test-bloom initialize repository The bloom filter code relies on reading object IDs using parse_oid_hex. In order to make that work with an appropriate size, we need to have initialized the repository's hash algorithm. Since the values we're processing depend on the repository in use, let's set up the repository when we run the test helper. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-30 09:16:45 -07:00
Jeff King	f6d8942b1f	strvec: fix indentation in renamed calls Code which split an argv_array call across multiple lines, like: argv_array_pushl(&args, "one argument", "another argument", "and more", NULL); was recently mechanically renamed to use strvec, which results in mis-matched indentation like: strvec_pushl(&args, "one argument", "another argument", "and more", NULL); Let's fix these up to align the arguments with the opening paren. I did this manually by sifting through the results of: git jump grep 'strvec_.*,$' and liberally applying my editor's auto-format. Most of the changes are of the form shown above, though I also normalized a few that had originally used a single-tab indentation (rather than our usual style of aligning with the open paren). I also rewrapped a couple of obvious cases (e.g., where previously too-long lines became short enough to fit on one), but I wasn't aggressive about it. In cases broken to three or more lines, the grouping of arguments is sometimes meaningful, and it wasn't worth my time or reviewer time to ponder each case individually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	c972bf4cf5	strvec: convert remaining callers away from argv_array name We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts all of the remaining files, as the resulting diff is reasonably sized. The conversion was done purely mechanically with: git ls-files '.c' '.h' \| xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	dbbcd44fb4	strvec: rename files from argv-array to strvec This requires updating #include lines across the code-base, but that's all fairly mechanical, and was done with: git ls-files '.c' '.h' \| xargs perl -i -pe 's/argv-array.h/strvec.h/' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:17 -07:00
Junio C Hamano	0258ed1e08	Merge branch 'cb/is-descendant-of' Code clean-up. * cb/is-descendant-of: commit-reach: avoid is_descendant_of() shim	2020-07-06 22:09:16 -07:00
Junio C Hamano	645f63111b	Merge branch 'es/get-worktrees-unsort' API cleanup for get_worktrees() * es/get-worktrees-unsort: worktree: drop get_worktrees() unused 'flags' argument worktree: drop get_worktrees() special-purpose sorting option	2020-07-06 22:09:15 -07:00
Junio C Hamano	d80bea479d	Merge branch 'ak/commit-graph-to-slab' A few fields in "struct commit" that do not have to always be present have been moved to commit slabs. * ak/commit-graph-to-slab: commit-graph: minimize commit_graph_data_slab access commit: move members graph_pos, generation to a slab commit-graph: introduce commit_graph_data_slab object: drop parsed_object_pool->commit_count	2020-07-06 22:09:14 -07:00
Junio C Hamano	12210859da	Merge branch 'bc/sha-256-part-2' SHA-256 migration work continues. * bc/sha-256-part-2: (44 commits) remote-testgit: adapt for object-format bundle: detect hash algorithm when reading refs t5300: pass --object-format to git index-pack t5704: send object-format capability with SHA-256 t5703: use object-format serve option t5702: offer an object-format capability in the test t/helper: initialize the repository for test-sha1-array remote-curl: avoid truncating refs with ls-remote t1050: pass algorithm to index-pack when outside repo builtin/index-pack: add option to specify hash algorithm remote-curl: detect algorithm for dumb HTTP by size builtin/ls-remote: initialize repository based on fetch t5500: make hash independent serve: advertise object-format capability for protocol v2 connect: parse v2 refs with correct hash algorithm connect: pass full packet reader when parsing v2 refs Documentation/technical: document object-format for protocol v2 t1302: expect repo format version 1 for SHA-256 builtin/show-index: provide options to determine hash algo t5302: modernize test formatting ...	2020-07-06 22:09:13 -07:00
Carlo Marcelo Arenas Belón	c1ea625f72	commit-reach: avoid is_descendant_of() shim `d91d6fbf26` (commit-reach: create repo_is_descendant_of(), 2020-06-17) adds a repository aware version of is_descendant_of() and a backward compatibility shim that is barely used. Update all callers to directly use the new repo_is_descendant_of() function instead; making the codebase simpler and pushing more the_repository references higher up the stack. Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 16:36:53 -07:00
Eric Sunshine	03f2465bb1	worktree: drop get_worktrees() unused 'flags' argument get_worktrees() accepts a 'flags' argument, however, there are no existing flags (the lone flag GWT_SORT_LINKED was recently retired) and no behavior which can be tweaked. Therefore, drop the 'flags' argument. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-22 10:31:15 -07:00
brian m. carlson	54cbbe4c6e	t/helper: initialize the repository for test-sha1-array test-sha1-array uses the_hash_algo under the hood. Since t0064 wants to use the value that is correct for the hash algorithm that we're testing, make sure the test helper initializes the repository to set the_hash_algo correctly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 14:04:08 -07:00
Abhishek Kumar	6da43d937c	object: drop parsed_object_pool->commit_count `14ba97f8` (alloc: allow arbitrary repositories for alloc functions, 2018-05-15) introduced parsed_object_pool->commit_count to keep count of commits per repository and was used to assign commit->index. However, commit-slab code requires commit->index values to be unique and a global count would be correct, rather than a per-repo count. Let's introduce a static counter variable, `parsed_commits_count` to keep track of parsed commits so far. As commit_count has no use anymore, let's also drop it from the struct. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-17 14:37:14 -07:00
Junio C Hamano	b37fd14beb	Merge branch 'dl/remote-curl-deadlock-fix' On-the-wire protocol v2 easily falls into a deadlock between the remote-curl helper and the fetch-pack process when the server side prematurely throws an error and disconnects. The communication has been updated to make it more robust. * dl/remote-curl-deadlock-fix: stateless-connect: send response end packet pkt-line: define PACKET_READ_RESPONSE_END remote-curl: error on incomplete packet pkt-line: extern packet_length() transport: extract common fetch_pack() call remote-curl: remove label indentation remote-curl: fix typo	2020-06-08 18:06:30 -07:00
Junio C Hamano	f4cec40dbd	Merge branch 'cb/t4210-illseq-auto-detect' As FreeBSD is not the only platform whose regexp library reports a REG_ILLSEQ error when fed invalid UTF-8, add logic to detect that automatically and skip the affected tests. * cb/t4210-illseq-auto-detect: t4210: detect REG_ILLSEQ dynamically and skip affected tests t/helper: teach test-regex to report pattern errors (like REG_ILLSEQ)	2020-06-08 18:06:27 -07:00
Denton Liu	0181b600a6	pkt-line: define PACKET_READ_RESPONSE_END In a future commit, we will use PACKET_READ_RESPONSE_END to separate messages proxied by remote-curl. To prepare for this, add the PACKET_READ_RESPONSE_END enum value. In switch statements that need a case added, die() or BUG() when a PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how PACKET_READ_DELIM is implemented (especially in cases where packets are being forwarded). Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:26:00 -07:00
Carlo Marcelo Arenas Belón	aba8187e4d	t/helper: teach test-regex to report pattern errors (like REG_ILLSEQ) `7187c7bbb8` (t4210: skip i18n tests that don't work on FreeBSD, 2019-11-27) adds a REG_ILLSEQ prerequisite to avoid failures from the tests added in `4e2443b181` (log tests: test regex backends in "--encode=<enc>" tests, 2019-06-28), but hardcodes it to be only enabled in FreeBSD. Instead of hardcoding the affected platform, teach the test-regex helper, how to validate a pattern and report back, so it can be used to detect the same issue in other affected systems (like DragonFlyBSD or macOS). While at it, refactor the tool so it can report back the source of the errors it founds, and can be invoked also in a --silent mode, when needed, for backward compatibility. A missing flag has been added and the code reformatted, as well as updates to the way the parameters are handled, for consistency. To minimize changes, it is assumed the regcomp error is of the right type since we control the only caller, and is also assumed to affect both basic and extended syntax (only basic is tested, but both behave the same in all three affected platforms since they use the same function). Based-on-patch-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-18 13:03:35 -07:00
Junio C Hamano	4b1e5e5d8c	Merge branch 'ds/bloom-cleanup' Code cleanup and typofixes * ds/bloom-cleanup: completion: offer '--(no-)patch' among 'git log' options bloom: use num_changes not nr for limit detection bloom: de-duplicate directory entries Documentation: changed-path Bloom filters use byte words bloom: parse commit before computing filters test-bloom: fix usage typo bloom: fix whitespace around tab length	2020-05-14 14:39:44 -07:00
Đoàn Trần Công Danh	066b70ae97	bloom: fix `make sparse` warning * We need a `final_new_line` to make our source code as text file, per POSIX and C specification. * `bloom_filters` should be limited to interal linkage only Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-07 17:08:21 -07:00
Junio C Hamano	bf04590ecd	Merge branch 'dd/sparse-fixes' Compilation fix. * dd/sparse-fixes: progress.c: silence cgcc suggestion about internal linkage graph.c: limit linkage of internal variable compat/regex: move stdlib.h up in inclusion chain test-parse-pathspec-file.c: s/0/NULL/ for pointer type	2020-05-01 13:39:56 -07:00
Junio C Hamano	6d56d4c7dc	Merge branch 'ds/blame-on-bloom' "git blame" learns to take advantage of the "changed-paths" Bloom filter stored in the commit-graph file. * ds/blame-on-bloom: test-bloom: check that we have expected arguments test-bloom: fix some whitespace issues blame: drop unused parameter from maybe_changed_path blame: use changed-path Bloom filters tests: write commit-graph with Bloom filters revision: complicated pathspecs disable filters	2020-05-01 13:39:54 -07:00
Junio C Hamano	9b6606f43d	Merge branch 'gs/commit-graph-path-filter' Introduce an extension to the commit-graph to make it efficient to check for the paths that were modified at each commit using Bloom filters. * gs/commit-graph-path-filter: bloom: ignore renames when computing changed paths commit-graph: add GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS test flag t4216: add end to end tests for git log with Bloom filters revision.c: add trace2 stats around Bloom filter usage revision.c: use Bloom filters to speed up path based revision walks commit-graph: add --changed-paths option to write subcommand commit-graph: reuse existing Bloom filters during write commit-graph: write Bloom filters to commit graph file commit-graph: examine commits by generation number commit-graph: examine changed-path objects in pack order commit-graph: compute Bloom filters for changed paths diff: halt tree-diff early after max_changes bloom.c: core Bloom filter implementation for changed paths. bloom.c: introduce core Bloom filter constructs bloom.c: add the murmur3 hash implementation commit-graph: define and use MAX_NUM_CHUNKS	2020-05-01 13:39:53 -07:00
Junio C Hamano	6a1c17d05b	Merge branch 'tb/commit-graph-split-strategy' "git commit-graph write" learned different ways to write out split files. * tb/commit-graph-split-strategy: Revert "commit-graph.c: introduce '--[no-]check-oids'" commit-graph.c: introduce '--[no-]check-oids' commit-graph.h: replace 'commit_hex' with 'commits' oidset: introduce 'oidset_size' builtin/commit-graph.c: introduce split strategy 'replace' builtin/commit-graph.c: introduce split strategy 'no-merge' builtin/commit-graph.c: support for '--split[=<strategy>]' t/helper/test-read-graph.c: support commit-graph chains	2020-05-01 13:39:52 -07:00
Derrick Stolee	54c337be9c	test-bloom: fix usage typo Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-01 11:41:21 -07:00
Junio C Hamano	28ba5a7b27	Merge branch 'dd/test-with-busybox' Various tests have been updated to work around issues found with shell utilities that come with busybox etc. * dd/test-with-busybox: t5703: feed raw data into test-tool unpack-sideband t4124: tweak test so that non-compliant diff(1) can also be used t7063: drop non-POSIX argument "-ls" from find(1) t5616: use rev-parse instead to get HEAD's object_id t5003: skip conversion test if unzip -a is unavailable t5003: drop the subshell in test_lazy_prereq test-lib-functions: test_cmp: eval $GIT_TEST_CMP t4061: use POSIX compliant regex(7)	2020-04-28 15:49:55 -07:00
Đoàn Trần Công Danh	3cacb9aaf4	progress.c: silence cgcc suggestion about internal linkage Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Reviewed-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-27 11:21:28 -07:00
Đoàn Trần Công Danh	1437ebf74a	test-parse-pathspec-file.c: s/0/NULL/ for pointer type Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Reviewed-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-27 11:21:12 -07:00
Jeff King	1b4c57fa87	test-bloom: check that we have expected arguments If "test-tool bloom" is not fed a command, or if arguments are missing for some commands, it will just segfault. Let's check argc and write a friendlier usage message. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-23 16:05:29 -07:00
Jeff King	24b7d1e7b0	test-bloom: fix some whitespace issues Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-23 16:05:29 -07:00
Junio C Hamano	a768f866e9	Merge branch 'jk/oid-array-cleanups' Code cleanup. * jk/oid-array-cleanups: oidset: stop referring to sha1-array ref-filter: stop referring to "sha1 array" bisect: stop referring to sha1_array test-tool: rename sha1-array to oid-array oid_array: rename source file from sha1-array oid_array: use size_t for iteration oid_array: use size_t for count and allocation	2020-04-22 13:42:49 -07:00
Junio C Hamano	810dc6481a	Merge branch 'js/trace2-env-vars' Trace2 enhancement to allow logging of the environment variables. * js/trace2-env-vars: trace2: teach Git to log environment variables	2020-04-22 13:42:44 -07:00
Taylor Blau	2fa05f31bd	t/helper/test-read-graph.c: support commit-graph chains In `61df89c8e5` (commit-graph: don't early exit(1) on e.g. "git status", 2019-03-25), the former 'load_commit_graph_one' was refactored into 'open_commit_graph' and 'load_commit_graph_one_fd_st' as a means of avoiding an early-exit from non-library code. However, 'load_commit_graph_one' does not support commit-graph chains, and hence the 'read-graph' test tool does not work with them. Replace 'load_commit_graph_one' with 'read_commit_graph_one' in order to support commit-graph chains. In the spirit of `61df89c8e5`, 'read_commit_graph_one' does not ever 'die()', making it a suitable replacement here. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 09:20:24 -07:00
Garima Singh	a759bfa9ee	t4216: add end to end tests for git log with Bloom filters These tests exercises writing commit graph with Bloom filters and exercises 'git log -- path' with all the applicable options. They check that the output is the same with and without Bloom filters, confirm Bloom filters were used by checking if trace2 statistics were logged correctly. Also confirms cases where Bloom filters are not used: 1. Multiple path specs, 2. --walk-reflogs (see patch titled 'revision.c: use Bloom filters...' for details, 3. If the latest commit graph does not have Bloom filters Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 11:08:37 -07:00
Garima Singh	1217c03e7b	commit-graph: reuse existing Bloom filters during write Add logic to a) parse Bloom filter information from the commit graph file and, b) re-use existing Bloom filters. See Documentation/technical/commit-graph-format for the format in which the Bloom filter information is written to the commit graph file. To read Bloom filter for a given commit with lexicographic position 'i' we need to: 1. Read BIDX[i] which essentially gives us the starting index in BDAT for filter of commit i+1. It is essentially the index past the end of the filter of commit i. It is called end_index in the code. 2. For i>0, read BIDX[i-1] which will give us the starting index in BDAT for filter of commit i. It is called the start_index in the code. For the first commit, where i = 0, Bloom filter data starts at the beginning, just past the header in the BDAT chunk. Hence, start_index will be 0. 3. The length of the filter will be end_index - start_index, because BIDX[i] gives the cumulative 8-byte words including the ith commit's filter. We toggle whether Bloom filters should be recomputed based on the compute_if_not_present flag. Helped-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 11:08:37 -07:00
Jeff King	ed4b804e46	test-tool: rename sha1-array to oid-array This matches the actual data structure name, as well as the source file that contains the code we're testing. The test scripts need updating to use the new name, as well. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 10:59:08 -07:00
Jeff King	fe299ec5ae	oid_array: rename source file from sha1-array We renamed the actual data structure in `910650d2f8` (Rename sha1_array to oid_array, 2017-03-31), but the file is still called sha1-array. Besides being slightly confusing, it makes it more annoying to grep for leftover occurrences of "sha1" in various files, because the header is included in so many places. Let's complete the transition by renaming the source and header files (and fixing up a few comment references). I kept the "-" in the name, as that seems to be our style; cf. `fc1395f4a4` (sha1_file.c: rename to use dash in file name, 2018-04-10). We also have oidmap.h and oidset.h without any punctuation, but those are "struct oidmap" and "struct oidset" in the code. We _could_ make this "oidarray" to match, but somehow it looks uglier to me because of the length of "array" (plus it would be a very invasive patch for little gain). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 10:59:08 -07:00
Garima Singh	ed591febb4	bloom.c: core Bloom filter implementation for changed paths. Add the core implementation for computing Bloom filters for the paths changed between a commit and it's first parent. We fill the Bloom filters as (const char *data, int len) pairs as `struct bloom_filters" within a commit slab. Filters for commits with no changes and more than 512 changes, is represented with a filter of length zero. There is no gain in distinguishing between a computed filter of length zero for a commit with no changes, and an uncomputed filter for new commits or for commits with more than 512 changes. The effect on `git log -- path` is the same in both cases. We will fall back to the normal diffing algorithm when we can't benefit from the existence of Bloom filters. Helped-by: Jeff King <peff@peff.net> Helped-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 09:59:53 -07:00
Garima Singh	f1294eaf7f	bloom.c: introduce core Bloom filter constructs Introduce the constructs for Bloom filters, Bloom filter keys and Bloom filter settings. For details on what Bloom filters are and how they work, refer to Dr. Derrick Stolee's blog post [1]. It provides a concise explanation of the adoption of Bloom filters as described in [2] and [3]. Implementation specifics: 1. We currently use 7 and 10 for the number of hashes and the size of each entry respectively. They served as great starting values, the mathematical details behind this choice are described in [1] and [4]. The implementation, while not completely open to it at the moment, is flexible enough to allow for tweaking these settings in the future. Note: The performance gains we have observed with these values are significant enough that we did not need to tweak these settings. The performance numbers are included in the cover letter of this series and in the commit message of the subsequent commit where we use Bloom filters to speed up `git log -- path`. 2. As described in [1] and [3], we do not need 7 independent hashing functions. We use the Murmur3 hashing scheme, seed it twice and then combine those to procure an arbitrary number of hash values. 3. The filters will be sized according to the number of changes in each commit, in multiples of 8 bit words. [1] Derrick Stolee "Supercharging the Git Commit Graph IV: Bloom Filters" https://devblogs.microsoft.com/devops/super-charging-the-git-commit-graph-iv-Bloom-filters/ [2] Flavio Bonomi, Michael Mitzenmacher, Rina Panigrahy, Sushil Singh, George Varghese "An Improved Construction for Counting Bloom Filters" http://theory.stanford.edu/~rinap/papers/esa2006b.pdf https://doi.org/10.1007/11841036_61 [3] Peter C. Dillinger and Panagiotis Manolios "Bloom Filters in Probabilistic Verification" http://www.ccs.neu.edu/home/pete/pub/Bloom-filters-verification.pdf https://doi.org/10.1007/978-3-540-30494-4_26 [4] Thomas Mueller Graf, Daniel Lemire "Xor Filters: Faster and Smaller Than Bloom and Cuckoo Filters" https://arxiv.org/abs/1912.08258 Helped-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 09:59:53 -07:00
Garima Singh	f52207a45c	bloom.c: add the murmur3 hash implementation In preparation for computing changed paths Bloom filters, implement the Murmur3 hash algorithm as described in [1]. It hashes the given data using the given seed and produces a uniformly distributed hash value. [1] https://en.wikipedia.org/wiki/MurmurHash#Algorithm Helped-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Szeder Gábor <szeder.dev@gmail.com> Reviewed-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 09:59:53 -07:00
Đoàn Trần Công Danh	84370e36bb	t5703: feed raw data into test-tool unpack-sideband busybox's sed isn't binary clean. Thus, triggers false-negative on this test. We could replace sed with perl on this usecase. But, we could slightly modify the helper to discard unwanted data in the beginning. Fix the false negative by updating this helper. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-26 17:30:48 -07:00
Junio C Hamano	f8cb64e3d4	Merge branch 'bc/sha-256-part-1-of-4' SHA-256 transition continues. * bc/sha-256-part-1-of-4: (22 commits) fast-import: add options for rewriting submodules fast-import: add a generic function to iterate over marks fast-import: make find_marks work on any mark set fast-import: add helper function for inserting mark object entries fast-import: permit reading multiple marks files commit: use expected signature header for SHA-256 worktree: allow repository version 1 init-db: move writing repo version into a function builtin/init-db: add environment variable for new repo hash builtin/init-db: allow specifying hash algorithm on command line setup: allow check_repository_format to read repository format t/helper: make repository tests hash independent t/helper: initialize repository if necessary t/helper/test-dump-split-index: initialize git repository t6300: make hash algorithm independent t6300: abstract away SHA-1-specific constants t: use hash-specific lookup tables to define test constants repository: require a build flag to use SHA-256 hex: add functions to parse hex object IDs in any algorithm hex: introduce parsing variants taking hash algorithms ...	2020-03-26 17:11:20 -07:00
Junio C Hamano	4d0e8996ec	Merge branch 'am/real-path-fix' The real_path() convenience function can easily be misused; with a bit of code refactoring in the callers' side, its use has been eliminated. * am/real-path-fix: get_superproject_working_tree(): return strbuf real_path_if_valid(): remove unsafe API real_path: remove unsafe API set_git_dir: fix crash when used with real_path()	2020-03-25 13:57:43 -07:00
Junio C Hamano	c4a09cc9cc	Merge branch 'hw/advise-ng' Revamping of the advise API to allow more systematic enumeration of advice knobs in the future. * hw/advise-ng: tag: use new advice API to check visibility advice: revamp advise API advice: change "setupStreamFailure" to "setUpstreamFailure" advice: extract vadvise() from advise()	2020-03-25 13:57:41 -07:00
Josh Steadmon	3d3adaad91	trace2: teach Git to log environment variables Via trace2, Git can already log interesting config parameters (see the trace2_cmd_list_config() function). However, this can grant an incomplete picture because many config parameters also allow overrides via environment variables. To allow for more complete logs, we add a new trace2_cmd_list_env_vars() function and supporting implementation, modeled after the pre-existing config param logging implementation. Signed-off-by: Josh Steadmon <steadmon@google.com> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-23 13:14:53 -07:00
Alexandr Miloslavskiy	3d7747e318	real_path: remove unsafe API Returning a shared buffer invites very subtle bugs due to reentrancy or multi-threading, as demonstrated by the previous patch. There was an unfinished effort to abolish this [1]. Let's finally rid of `real_path()`, using `strbuf_realpath()` instead. This patch uses a local `strbuf` for most places where `real_path()` was previously called. However, two places return the value of `real_path()` to the caller. For them, a `static` local `strbuf` was added, effectively pushing the problem one level higher: read_gitfile_gently() get_superproject_working_tree() [1] https://lore.kernel.org/git/1480964316-99305-1-git-send-email-bmwill@google.com/ Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-10 11:41:40 -07:00
Junio C Hamano	0e0d717537	Merge branch 'pb/am-show-current-patch' "git am --short-current-patch" is a way to show the piece of e-mail for the stopped step, which is not suitable to directly feed "git apply" (it is designed to be a good "git am" input). It learned a new option to show only the patch part. * pb/am-show-current-patch: am: support --show-current-patch=diff to retrieve .git/rebase-apply/patch am: support --show-current-patch=raw as a synonym for--show-current-patch am: convert "resume" variable to a struct parse-options: convert "command mode" to a flag parse-options: add testcases for OPT_CMDMODE()	2020-03-09 11:21:19 -07:00
Heba Waly	b3b18d1621	advice: revamp advise API Currently it's very easy for the advice library's callers to miss checking the visibility step before printing an advice. Also, it makes more sense for this step to be handled by the advice library. Add a new advise_if_enabled function that checks the visibility of advice messages before printing. Add a new helper advise_enabled to check the visibility of the advice if the caller needs to carry out complicated processing based on that value. A list of advice_settings is added to cache the config variables names and values, it's intended to replace advice_config[] and the global variables once we migrate all the callers to use the new APIs. Signed-off-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-05 06:15:02 -08:00
Junio C Hamano	ff41848e99	Merge branch 'rs/micro-cleanups' Code cleanup. * rs/micro-cleanups: use strpbrk(3) to search for characters from a given set quote: use isalnum() to check for alphanumeric characters	2020-03-02 15:07:20 -08:00
Junio C Hamano	d0038f4b31	Merge branch 'bw/remote-rename-update-config' "git remote rename X Y" needs to adjust configuration variables (e.g. branch.<name>.remote) whose value used to be X to Y. branch.<name>.pushRemote is now also updated. * bw/remote-rename-update-config: remote rename/remove: gently handle remote.pushDefault config config: provide access to the current line number remote rename/remove: handle branch.<name>.pushRemote config values remote: clean-up config callback remote: clean-up by returning early to avoid one indentation pull --rebase/remote rename: document and honor single-letter abbreviations rebase types	2020-02-25 11:18:32 -08:00
brian m. carlson	bf154a8782	t/helper: make repository tests hash independent This test currently hard-codes the hash algorithm as SHA-1 when calling repo_set_hash_algo so that the_hash_algo is properly initialized. However, this does not work with SHA-256 repositories. Read the repository value that repo_init has read into the local repository variable and set the algorithm based on that value. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-24 09:33:27 -08:00
brian m. carlson	8dca7f30e4	t/helper: initialize repository if necessary The repository helper is used in t5318 to read commit graphs whether we're in a repository or not. However, without a repository, we have no way to properly initialize the hash algorithm, meaning that the file is misread. Fix this by calling setup_git_directory_gently, which will read the environment variable the testsuite sets to ensure that the correct hash algorithm is set. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-24 09:33:27 -08:00
brian m. carlson	6946e525ae	t/helper/test-dump-split-index: initialize git repository In this test helper, we read the index. In order to have the proper hash algorithm set up, we must call setup_git_directory. Do so, so that the test works when extensions.objectFormat is set. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-24 09:33:24 -08:00
René Scharfe	2ce6d075fa	use strpbrk(3) to search for characters from a given set We can check if certain characters are present in a string by calling strchr(3) on each of them, or we can pass them all to a single strpbrk(3) call. The latter is shorter, less repetitive and slightly more efficient, so let's do that instead. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-24 09:30:31 -08:00
Paolo Bonzini	62e7a6f7a1	parse-options: add testcases for OPT_CMDMODE() Before modifying the implementation, ensure that general operation of OPT_CMDMODE() and detection of incompatible options are covered. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-20 13:20:40 -08:00
Junio C Hamano	5d55554b1d	Merge branch 'mr/show-config-scope' "git config" learned to show in which "scope", in addition to in which file, each config setting comes from. * mr/show-config-scope: config: add '--show-scope' to print the scope of a config value submodule-config: add subomdule config scope config: teach git_config_source to remember its scope config: preserve scope in do_git_config_sequence config: clarify meaning of command line scoping config: split repo scope to local and worktree config: make scope_name non-static and rename it t1300: create custom config file without special characters t1300: fix over-indented HERE-DOCs config: fix typo in variable name	2020-02-17 13:22:17 -08:00
Junio C Hamano	53c3be2c29	Merge branch 'tb/commit-graph-object-dir' The code to compute the commit-graph has been taught to use a more robust way to tell if two object directories refer to the same thing. * tb/commit-graph-object-dir: commit-graph.h: use odb in 'load_commit_graph_one_fd_st' commit-graph.c: remove path normalization, comparison commit-graph.h: store object directory in 'struct commit_graph' commit-graph.h: store an odb in 'struct write_commit_graph_context' t5318: don't pass non-object directory to '--object-dir'	2020-02-14 12:54:24 -08:00
Junio C Hamano	c9a33e5e5d	Merge branch 'kw/fsmonitor-watchman-racefix' A new version of fsmonitor-watchman hook has been introduced, to avoid races. * kw/fsmonitor-watchman-racefix: fsmonitor: update documentation for hook version and watchman hooks fsmonitor: add fsmonitor hook scripts for version 2 fsmonitor: handle version 2 of the hooks that will use opaque token fsmonitor: change last update timestamp on the index_state to opaque token	2020-02-14 12:54:20 -08:00
Bert Wesarg	f2a2327a4a	config: provide access to the current line number Users are nowadays trained to see message from CLI tools in the form <file>:<lno>: … To be able to give such messages when notifying the user about configurations in any config file, it is currently only possible to get the file name (if the value originates from a file to begin with) via `current_config_name()`. Now it is also possible to query the current line number for the configuration. Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-10 10:52:10 -08:00
Matthew Rogers	a5cb4204b6	config: make scope_name non-static and rename it To prepare for the upcoming --show-scope option, we require the ability to convert a config_scope enum to a string. As this was originally implemented as a static function 'scope_name()' in t/helper/test-config.c, we expose it via config.h and give it a less ambiguous name 'config_scope_name()' Signed-off-by: Matthew Rogers <mattr94@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-10 10:32:18 -08:00
Taylor Blau	a7df60cac8	commit-graph.h: use odb in 'load_commit_graph_one_fd_st' Apply a similar treatment as in the previous patch to pass a 'struct object_directory *' through the 'load_commit_graph_one_fd_st' initializer, too. This prevents a potential bug where a pointer comparison is made to a NULL 'g->odb', which would cause the commit-graph machinery to think that a pair of commit-graphs belonged to different alternates when in fact they do not (i.e., in the case of no '--object-dir'). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 11:36:51 -08:00
Taylor Blau	ad2dd5bb63	commit-graph.c: remove path normalization, comparison As of the previous patch, all calls to 'commit-graph.c' functions which perform path normalization (for e.g., 'get_commit_graph_filename()') are of the form 'ctx->odb->path', which is always in normalized form. Now that there are no callers passing non-normalized paths to these functions, ensure that future callers are bound by the same restrictions by making these functions take a 'struct object_directory ' instead of a 'const char '. To match, replace all calls with arguments of the form 'ctx->odb->path' with 'ctx->odb' To recover the path, functions that perform path manipulation simply use 'odb->path'. Further, avoid string comparisons with arguments of the form 'odb->path', and instead prefer raw pointer comparisons, which accomplish the same effect, but are far less brittle. This has a pleasant side-effect of making these functions much more robust to paths that cannot be normalized by 'normalize_path_copy()', i.e., because they are outside of the current working directory. For example, prior to this patch, Valgrind reports that the following uninitialized memory read [1]: $ ( cd t && GIT_DIR=../.git valgrind git rev-parse HEAD^ ) because 'normalize_path_copy()' can't normalize '../.git' (since it's relative to but above of the current working directory) [2]. By using a 'struct object_directory *' directly, 'get_commit_graph_filename()' does not need to normalize, because all paths are relative to the current working directory since they are always read from the '->path' of an object directory. [1]: https://lore.kernel.org/git/20191027042116.GA5801@sigill.intra.peff.net. [2]: The bug here is that 'get_commit_graph_filename()' returns the result of 'normalize_path_copy()' without checking the return value. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-04 11:36:51 -08:00
Alexandr Miloslavskiy	d0d0a357a1	t: directly test parse_pathspec_file() Previously, `parse_pathspec_file()` was tested indirectly by invoking git commands with properly crafted inputs. As demonstrated by the previous bugfix, testing complicated black boxes indirectly can lead to tests that silently test the wrong thing. Introduce direct tests for `parse_pathspec_file()`. Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-15 12:14:20 -08:00
Kevin Willford	56c6910028	fsmonitor: change last update timestamp on the index_state to opaque token Some file system monitors might not use or take a timestamp for processing and in the case of watchman could have race conditions with using a timestamp. Watchman uses something called a clockid that is used for race free queries to it. The clockid for watchman is simply a string. Change the fsmonitor_last_update from being a uint64_t to a char pointer so that any arbitrary data can be stored in it and passed back to the fsmonitor. Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-01-13 14:58:43 -08:00
Junio C Hamano	55d607d85b	Merge branch 'js/mingw-inherit-only-std-handles' Work around a issue where a FD that is left open when spawning a child process and is kept open in the child can interfere with the operation in the parent process on Windows. * js/mingw-inherit-only-std-handles: mingw: forbid translating ERROR_SUCCESS to an errno value mingw: do set `errno` correctly when trying to restrict handle inheritance mingw: restrict file handle inheritance only on Windows 7 and later mingw: spawned processes need to inherit only standard handles mingw: work around incorrect standard handles mingw: demonstrate that all file handles are inherited by child processes	2019-12-10 13:11:42 -08:00
Junio C Hamano	7034cd094b	Sync with Git 2.24.1	2019-12-09 22:17:55 -08:00
Johannes Schindelin	67af91c47a	Sync with 2.23.1 * maint-2.23: (44 commits) Git 2.23.1 Git 2.22.2 Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters ...	2019-12-06 16:31:39 +01:00
Johannes Schindelin	7fd9fd94fb	Sync with 2.22.2 * maint-2.22: (43 commits) Git 2.22.2 Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors ...	2019-12-06 16:31:30 +01:00
Johannes Schindelin	5421ddd8d0	Sync with 2.21.1 * maint-2.21: (42 commits) Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh ...	2019-12-06 16:31:23 +01:00
Johannes Schindelin	fc346cb292	Sync with 2.20.2 * maint-2.20: (36 commits) Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories ...	2019-12-06 16:31:12 +01:00
Johannes Schindelin	d851d94151	Sync with 2.19.3 * maint-2.19: (34 commits) Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams ...	2019-12-06 16:30:49 +01:00
Johannes Schindelin	7c9fbda6e2	Sync with 2.18.2 * maint-2.18: (33 commits) Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up ...	2019-12-06 16:30:38 +01:00
Johannes Schindelin	14af7ed5a9	Sync with 2.17.3 * maint-2.17: (32 commits) Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names ...	2019-12-06 16:29:15 +01:00
Johannes Schindelin	bdfef0492c	Sync with 2.16.6 * maint-2.16: (31 commits) Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses ...	2019-12-06 16:27:36 +01:00
Johannes Schindelin	68440496c7	test-drop-caches: use `has_dos_drive_prefix()` This is a companion patch to 'mingw: handle `subst`-ed "DOS drives"': use the DOS drive prefix handling that is already provided by `compat/mingw.c` (and which just learned to handle non-alphabetical "drive letters"). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-06 16:27:20 +01:00
Johannes Schindelin	9ac92fed5b	Sync with 2.15.4 * maint-2.15: (29 commits) Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment ...	2019-12-06 16:27:18 +01:00
Johannes Schindelin	d3ac8c3f27	Sync with 2.14.6 * maint-2.14: (28 commits) Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment test-path-utils: offer to run a protectNTFS/protectHFS benchmark ...	2019-12-06 16:26:55 +01:00
Johannes Schindelin	65d30a19de	Merge branch 'win32-filenames-cannot-have-trailing-spaces-or-periods' On Windows, filenames cannot have trailing spaces or periods, when opening such paths, they are stripped automatically. Read: you can open the file `README` via the file name `README . . .`. This ambiguity can be used in combination with other security bugs to cause e.g. remote code execution during recursive clones. This patch series fixes that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-05 15:37:09 +01:00
Johannes Schindelin	d2c84dad1c	mingw: refuse to access paths with trailing spaces or periods When creating a directory on Windows whose path ends in a space or a period (or chains thereof), the Win32 API "helpfully" trims those. For example, `mkdir("abc ");` will return success, but actually create a directory called `abc` instead. This stems back to the DOS days, when all file names had exactly 8 characters plus exactly 3 characters for the file extension, and the only way to have shorter names was by padding with spaces. Sadly, this "helpful" behavior is a bit inconsistent: after a successful `mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because the directory `abc ` does not actually exist). Even if it would work, we now have a serious problem because a Git repository could contain directories `abc` and `abc `, and on Windows, they would be "merged" unintentionally. As these paths are illegal on Windows, anyway, let's disallow any accesses to such paths on that Operating System. For practical reasons, this behavior is still guarded by the config setting `core.protectNTFS`: it is possible (and at least two regression tests make use of it) to create commits without involving the worktree. In such a scenario, it is of course possible -- even on Windows -- to create such file names. Among other consequences, this patch disallows submodules' paths to end in spaces on Windows (which would formerly have confused Git enough to try to write into incorrect paths, anyway). While this patch does not fix a vulnerability on its own, it prevents an attack vector that was exploited in demonstrations of a number of recently-fixed security bugs. The regression test added to `t/t7417-submodule-path-url.sh` reflects that attack vector. Note that we have to adjust the test case "prevent git~1 squatting on Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue. It tries to clone two submodules whose names differ only in a trailing period character, and as a consequence their git directories differ in the same way. Previously, when Git tried to clone the second submodule, it thought that the git directory already existed (because on Windows, when you create a directory with the name `b.` it actually creates `b`), but with this patch, the first submodule's clone will fail because of the illegal name of the git directory. Therefore, when cloning the second submodule, Git will take a different code path: a fresh clone (without an existing git directory). Both code paths fail to clone the second submodule, both because the the corresponding worktree directory exists and is not empty, but the error messages are worded differently. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-05 15:37:06 +01:00
Johannes Schindelin	379e51d1ae	quote-stress-test: offer to test quoting arguments for MSYS2 sh It is unfortunate that we need to quote arguments differently on Windows, depending whether we build a command-line for MSYS2's `sh` or for other Windows executables. We already have a test helper to verify the latter, with this patch we can also verify the former. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-05 15:37:06 +01:00
Johannes Schindelin	7530a6287e	quote-stress-test: allow skipping some trials When the, say, 93rd trial run fails, it is a good idea to have a way to skip the first 92 trials and dig directly into the 93rd in a debugger. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-05 15:37:06 +01:00
Johannes Schindelin	55953c77c0	quote-stress-test: accept arguments to test via the command-line When the stress test reported a problem with quoting certain arguments, it is helpful to have a facility to play with those arguments in order to find out whether variations of those arguments are affected, too. Let's allow `test-run-command quote-stress-test -- <args>` to be used for that purpose. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-05 15:36:53 +01:00
Garima Singh	ad15592529	tests: add a helper to stress test argument quoting On Windows, we have to do all the command-line argument quoting ourselves. Worse: we have to have two versions of said quoting, one for MSYS2 programs (which have their own dequoting rules) and the rest. We care mostly about the rest, and to make sure that that works, let's have a stress test that comes up with all kinds of awkward arguments, verifying that a spawned sub-process receives those unharmed. Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-05 15:36:52 +01:00
Garima Singh	a62f9d1ace	test-path-utils: offer to run a protectNTFS/protectHFS benchmark In preparation to flipping the default on `core.protectNTFS`, let's have some way to measure the speed impact of this config setting reliably (and for comparison, the `core.protectHFS` config setting). For now, this is a manual performance benchmark: ./t/helper/test-path-utils protect_ntfs_hfs [arguments...] where the arguments are an optional number of file names to test with, optionally followed by minimum and maximum length of the random file names. The default values are one million, 3 and 20, respectively. Just like `sqrti()` in `bisect.c`, we introduce a very simple function to approximation the square root of a given value, in order to avoid having to introduce the first user of `<math.h>` in Git's source code. Note: this is _not_ implemented as a Unix shell script in t/perf/ because we really care about _very_ precise timings here, and Unix shell scripts are simply unsuited for precise and consistent benchmarking. Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-05 15:36:40 +01:00
Johannes Schindelin	eea4a7f4b3	mingw: demonstrate that all file handles are inherited by child processes When spawning child processes, we really should be careful which file handles we let them inherit. This is doubly important on Windows, where we cannot rename, delete, or modify files if there is still a file handle open. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-23 11:17:01 +09:00
Derrick Stolee	4bd0593e0f	test-tool: use 'read-graph' helper The 'git commit-graph read' subcommand is used in test scripts to check that the commit-graph contents match the expected data. Mostly, this helps check the header information and the list of chunks. Users do not need this information, so move the functionality to a test helper. Reported-by: Bryan Turner <bturner@atlassian.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-13 11:14:16 +09:00
Junio C Hamano	e3cf08361a	Merge branch 'sg/progress-fix' Byte-order fix the recent update to progress display code. * sg/progress-fix: test-progress: fix test failures on big-endian systems	2019-10-23 14:43:09 +09:00
SZEDER Gábor	2b6f6ea1bd	test-progress: fix test failures on big-endian systems In 't0500-progress-display.sh' all tests running 'test-tool progress --total=<N>' fail on big-endian systems, e.g. like this: + test-tool progress --total=3 Working hard [...] + test_i18ncmp expect out --- expect 2019-10-18 23:07:54.765523916 +0000 +++ out 2019-10-18 23:07:54.773523916 +0000 @@ -1,4 +1,2 @@ -Working hard: 33% (1/3)<CR> -Working hard: 66% (2/3)<CR> -Working hard: 100% (3/3)<CR> -Working hard: 100% (3/3), done. +Working hard: 0% (1/12884901888)<CR> +Working hard: 0% (3/12884901888), done. The reason for that bogus value is that '--total's parameter is parsed via parse-options's OPT_INTEGER into a uint64_t variable [1], so the two bits of 3 end up in the "wrong" bytes on big-endian systems (12884901888 = 0x300000000). Change the type of that variable from uint64_t to int, to match what parse-options expects; in the tests of the progress output we won't use values that don't fit into an int anyway. [1] start_progress() expects the total number as an uint64_t, that's why I chose the same type when declaring the variable holding the value given on the command line. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> [jpag: Debian unstable/ppc64 (big-endian)] Tested-By: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> [tz: Fedora s390x (big-endian)] Tested-By: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-21 09:53:49 +09:00
Junio C Hamano	5efabc7ed9	Merge branch 'ew/hashmap' Code clean-up of the hashmap API, both users and implementation. * ew/hashmap: hashmap_entry: remove first member requirement from docs hashmap: remove type arg from hashmap_{get,put,remove}_entry OFFSETOF_VAR macro to simplify hashmap iterators hashmap: introduce hashmap_free_entries hashmap: hashmap_{put,remove} return hashmap_entry * hashmap: use _entry APIs for iteration hashmap_cmp_fn takes hashmap_entry params hashmap_get{,_from_hash} return "struct hashmap_entry " hashmap: use _entry APIs to wrap container_of hashmap_get_next returns "struct hashmap_entry " introduce container_of macro hashmap_put takes "struct hashmap_entry " hashmap_remove takes "const struct hashmap_entry " hashmap_get takes "const struct hashmap_entry " hashmap_add takes "struct hashmap_entry " hashmap_get_next takes "const struct hashmap_entry " hashmap_entry_init takes "struct hashmap_entry " packfile: use hashmap_entry in delta_base_cache_entry coccicheck: detect hashmap_entry.hash assignment diff: use hashmap_entry_init on moved_entry.ent	2019-10-15 13:48:02 +09:00
Junio C Hamano	6d5291be45	Merge branch 'js/azure-pipelines-msvc' CI updates. * js/azure-pipelines-msvc: ci: also build and test with MS Visual Studio on Azure Pipelines ci: really use shallow clones on Azure Pipelines tests: let --immediate and --write-junit-xml play well together test-tool run-command: learn to run (parts of) the testsuite vcxproj: include more generated files vcxproj: only copy `git-remote-http.exe` once it was built msvc: work around a bug in GetEnvironmentVariable() msvc: handle DEVELOPER=1 msvc: ignore some libraries when linking compat/win32/path-utils.h: add #include guards winansi: use FLEX_ARRAY to avoid compiler warning msvc: avoid using minus operator on unsigned types push: do not pretend to return `int` from `die_push_simple()`	2019-10-15 13:48:00 +09:00
Junio C Hamano	56c7ab0f4e	Merge branch 'sg/t-helper-gitignore' Update the way build artifacts in t/helper/ directory are ignored. * sg/t-helper-gitignore: t/helper: ignore only executable files	2019-10-07 11:33:01 +09:00
Junio C Hamano	ef93bfbd45	Merge branch 'sg/progress-fix' Regression fix for progress output. * sg/progress-fix: Test the progress display Revert "progress: use term_clear_line()"	2019-10-07 11:32:59 +09:00
Junio C Hamano	36d2fca82b	Merge branch 'ss/get-time-cleanup' Code simplification. * ss/get-time-cleanup: test_date.c: remove reference to GIT_TEST_DATE_NOW Quit passing 'now' to date code	2019-10-07 11:32:54 +09:00
Eric Wong	e2b5038d87	hashmap_entry: remove first member requirement from docs Comments stating that "struct hashmap_entry" must be the first member in a struct are no longer valid. Suggested-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:12 +09:00
Eric Wong	404ab78e39	hashmap: remove type arg from hashmap_{get,put,remove}_entry Since these macros already take a `keyvar' pointer of a known type, we can rely on OFFSETOF_VAR to get the correct offset without relying on non-portable `__typeof__' and `offsetof'. Argument order is also rearranged, so `keyvar' and `member' are sequential as they are used as: `keyvar->member' Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:12 +09:00
Eric Wong	23dee69f53	OFFSETOF_VAR macro to simplify hashmap iterators While we cannot rely on a `__typeof__' operator being portable to use with `offsetof'; we can calculate the pointer offset using an existing pointer and the address of a member using pointer arithmetic for compilers without `__typeof__'. This allows us to simplify usage of hashmap iterator macros by not having to specify a type when a pointer of that type is already given. In the future, list iterator macros (e.g. list_for_each_entry) may also be implemented using OFFSETOF_VAR to save hackers the trouble of using container_of/list_entry macros and without relying on non-portable `__typeof__'. v3: use `__typeof__' to avoid clang warnings Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:11 +09:00
Eric Wong	c8e424c9c9	hashmap: introduce hashmap_free_entries `hashmap_free_entries' behaves like `container_of' and passes the offset of the hashmap_entry struct to the internal `hashmap_free_' function, allowing the function to free any struct pointer regardless of where the hashmap_entry field is located. `hashmap_free' no longer takes any arguments aside from the hashmap itself. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:11 +09:00
Eric Wong	8a973d0bb3	hashmap: hashmap_{put,remove} return hashmap_entry * And add *_entry variants to perform container_of as necessary to simplify most callers. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:11 +09:00
Eric Wong	87571c3f71	hashmap: use *_entry APIs for iteration Inspired by list_for_each_entry in the Linux kernel. Once again, these are somewhat compromised usability-wise by compilers lacking __typeof__ support. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:11 +09:00
Eric Wong	939af16eac	hashmap_cmp_fn takes hashmap_entry params Another step in eliminating the requirement of hashmap_entry being the first member of a struct. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:11 +09:00
Eric Wong	f0e63c4113	hashmap: use _entry APIs to wrap container_of Using `container_of' can be verbose and choosing names for intermediate "struct hashmap_entry" pointers is a hard problem. So introduce "_entry" APIs inspired by similar linked-list APIs in the Linux kernel. Unfortunately, `__typeof__' is not portable C, so we need an extra parameter to specify the type. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:10 +09:00
Eric Wong	6bcbdfb277	hashmap_get_next returns "struct hashmap_entry *" This is a step towards removing the requirement for hashmap_entry being the first field of a struct. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:10 +09:00
Eric Wong	26b455f21e	hashmap_put takes "struct hashmap_entry " This is less error-prone than "void " as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:10 +09:00
Eric Wong	b94e5c1df6	hashmap_add takes "struct hashmap_entry " This is less error-prone than "void " as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:10 +09:00
Eric Wong	f6eb6bdcf2	hashmap_get_next takes "const struct hashmap_entry " This is less error-prone than "const void " as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:10 +09:00
Eric Wong	d22245a2e3	hashmap_entry_init takes "struct hashmap_entry " C compilers do type checking to make life easier for us. So rely on that and update all hashmap_entry_init callers to take "struct hashmap_entry " to avoid future bugs while improving safety and readability. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-07 10:20:09 +09:00
Johannes Schindelin	be5d88e112	test-tool run-command: learn to run (parts of) the testsuite Git for Windows jumps through hoops to provide a development environment that allows to build Git and to run its test suite. To that end, an entire MSYS2 system, including GNU make and GCC is offered as "the Git for Windows SDK". It does come at a price: an initial download of said SDK weighs in with several hundreds of megabytes, and the unpacked SDK occupies ~2GB of disk space. A much more native development environment on Windows is Visual Studio. To help contributors use that environment, we already have a Makefile target `vcxproj` that generates a commit with project files (and other generated files), and Git for Windows' `vs/master` branch is continuously re-generated using that target. The idea is to allow building Git in Visual Studio, and to run individual tests using a Portable Git. The one missing thing is a way to run the entire test suite: neither `make` nor `prove` are required to run Git, therefore Git for Windows does not support those commands in the Portable Git. To help with that, add a simple test helper that exercises the `run_processes_parallel()` function to allow for running test scripts in parallel (which is really necessary, especially on Windows, as Git's test suite takes such a long time to run). This will also come in handy for the upcoming change to our Azure Pipeline: we will use this helper in a Portable Git to test the Visual Studio build of Git. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-06 09:07:44 +09:00
SZEDER Gábor	27ea41c0b4	t/helper: ignore only executable files This patch conceptually reverts `44103f4197` (t/helper: ignore everything but sources, 2017-12-12). Back in those days we did have a lot of separate test helper executables under 't/helper', and its '.gitignore' did get out of sync every once in a while. Since then, however, most of those separate executables were integrated into a single 'test-tool' command [1], and new test helpers are added as new subcommands, so the chances of that '.gitignore' getting out of sync again are much lower. And even if a contributor were not careful enough and submits a patch that adds a new executable under 't/helper' but forgets to update '.gitignore' accordingly, our CI builds would catch it in a timely manner [2]. Ignoring everything but sources has the drawback that building an older version of Git (e.g. during bisecting) creates all those executables, and after going back to e.g. current 'master' the usual cleanup commands like 'make clean' or 'git clean -fd' don't remove them (the former doesn't know about them, and the latter doesn't remove ignored files). So let's ignore only the executable files under 't/helper/, i.e. 'test-tool' and the three other remaining executables that could not be integrated into 'test-tool' (no need to ignore object files, as they are already ignored by our toplevel '.gitignore'). [1] The topic starting with `efd71f8913` (t/helper: add an empty test-tool program, 2018-03-24), and leading up to the merge commit `27f25845cf` (Merge branch 'nd/combined-test-helper', 2018-04-11). [2] `b92cb86ea1` (travis-ci: check that all build artifacts are .gitignore-d, 2017-12-31) Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-20 11:13:13 -07:00
Stephen P. Smith	47b27c96fa	test_date.c: remove reference to GIT_TEST_DATE_NOW Remove the reference to the GIT_TEST_DATE_NOW which is done in date.c. We can't get rid of the "x" variable, since it serves as a generic scratch variable for parsing later in the function. Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-18 14:15:01 -07:00
SZEDER Gábor	2bb74b53a4	Test the progress display 'progress.c' has seen a few fixes recently [1], and, unfortunately, some of those fixes required further fixes [2]. It seems it's time to have a few tests focusing on the subtleties of the progress display. Add the 'test-tool progress' subcommand to help testing the progress display, reading instructions from standard input and turning them into calls to the display_progress() and display_throughput() functions with the given parameters. The progress display is, however, critically dependent on timing, because it's only updated once every second or, if the toal is known in advance, every 1%, and there is the throughput rate as well. These make the progress display far too undeterministic for testing as-is. To address this, add a few testing-specific variables and functions to 'progress.c', allowing the the new test helper to: - Disable the triggered-every-second SIGALRM and set the 'progress_update' flag explicitly based in the input instructions. This way the progress line will be updated deterministically when the test wants it to be updated. - Specify the time elapsed since start_progress() to make the throughput rate calculations deterministic. Add the new test script 't0500-progress-display.sh' to check a few simple cases with and without throughput, and that a shorter progress line properly covers up the previously displayed line in different situations. [1] See commits `545dc345eb` (progress: break too long progress bar lines, 2019-04-12) and `9f1fd84e15` (progress: clear previous progress update dynamically, 2019-04-12). [2] `1aed1a5f25` (progress: avoid empty line when breaking the progress line, 2019-05-19) Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-09-17 09:39:16 -07:00

... 2 3 4 5 6 ...

652 Commits