git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	f00809500f	Merge branch 'jc/all-negative-pathspec' A git subcommand like "git add -p" spawns a separate git process while relaying its command line arguments. A pathspec with only negative elements was mistakenly passed with an empty string, which has been corrected. * jc/all-negative-pathspec: pathspec: correct an empty string used as a pathspec element	2022-06-07 14:10:59 -07:00
Junio C Hamano	08baf19fa3	Merge branch 'js/scalar-diagnose' Implementation of "scalar diagnose" subcommand. * js/scalar-diagnose: scalar: teach `diagnose` to gather loose objects information scalar: teach `diagnose` to gather packfile info scalar diagnose: include disk space information scalar: implement `scalar diagnose` scalar: validate the optional enlistment argument archive --add-virtual-file: allow paths containing colons archive: optionally add "virtual" files	2022-06-07 14:10:58 -07:00
Junio C Hamano	006fd83e03	Merge branch 'rs/document-archive-prefix' The documentation on the interaction between "--add-file" and "--prefix" options of "git archive" has been improved. * rs/document-archive-prefix: archive: improve documentation of --prefix	2022-06-07 14:10:57 -07:00
Junio C Hamano	07a454027b	Merge branch 'fh/transport-push-leakfix' Leakfix. * fh/transport-push-leakfix: transport: free local and remote refs in transport_push() transport: unify return values and exit point from transport_push() transport: remove unnecessary indenting in transport_push()	2022-06-07 14:10:57 -07:00
Junio C Hamano	fc5a070f59	Merge branch 'js/ci-github-workflow-markup' Update the GitHub workflow support to make it quicker to get to the failing test. * js/ci-github-workflow-markup: ci: call `finalize_test_case_output` a little later ci(github): mention where the full logs can be found ci: use `--github-workflow-markup` in the GitHub workflow ci(github): avoid printing test case preamble twice ci(github): skip the logs of the successful test cases ci: optionally mark up output in the GitHub workflow ci/run-build-and-tests: add some structure to the GitHub workflow output ci: make it easier to find failed tests' logs in the GitHub workflow ci/run-build-and-tests: take a more high-level view test(junit): avoid line feeds in XML attributes tests: refactor --write-junit-xml code ci: fix code style	2022-06-07 14:10:57 -07:00
Junio C Hamano	2da81d1efb	Merge branch 'ab/plug-leak-in-revisions' Plug the memory leaks from the trickiest API of all, the revision walker. * ab/plug-leak-in-revisions: (27 commits) revisions API: add a TODO for diff_free(&revs->diffopt) revisions API: have release_revisions() release "topo_walk_info" revisions API: have release_revisions() release "date_mode" revisions API: call diff_free(&revs->pruning) in revisions_release() revisions API: release "reflog_info" in release revisions() revisions API: clear "boundary_commits" in release_revisions() revisions API: have release_revisions() release "prune_data" revisions API: have release_revisions() release "grep_filter" revisions API: have release_revisions() release "filter" revisions API: have release_revisions() release "cmdline" revisions API: have release_revisions() release "mailmap" revisions API: have release_revisions() release "commits" revisions API users: use release_revisions() for "prune_data" users revisions API users: use release_revisions() with UNLEAK() revisions API users: use release_revisions() in builtin/log.c revisions API users: use release_revisions() in http-push.c revisions API users: add "goto cleanup" for release_revisions() stash: always have the owner of "stash_info" free it revisions API users: use release_revisions() needing REV_INFO_INIT revision.[ch]: document and move code declared around "init" ...	2022-06-07 14:10:56 -07:00
Junio C Hamano	f31b624495	Merge branch 'yw/cmake-updates' CMake updates. * yw/cmake-updates: cmake: remove (_)UNICODE def on Windows in CMakeLists.txt cmake: add pcre2 support cmake: fix CMakeLists.txt on Linux	2022-06-07 14:10:56 -07:00
Junio C Hamano	ab336e8f1c	Seventh batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-06-03 14:30:45 -07:00
Junio C Hamano	a50036da1a	Merge branch 'tb/cruft-packs' A mechanism to pack unreachable objects into a "cruft pack", instead of ejecting them into loose form to be reclaimed later, has been introduced. * tb/cruft-packs: sha1-file.c: don't freshen cruft packs builtin/gc.c: conditionally avoid pruning objects via loose builtin/repack.c: add cruft packs to MIDX during geometric repack builtin/repack.c: use named flags for existing_packs builtin/repack.c: allow configuring cruft pack generation builtin/repack.c: support generating a cruft pack builtin/pack-objects.c: --cruft with expiration reachable: report precise timestamps from objects in cruft packs reachable: add options to add_unseen_recent_objects_to_traversal builtin/pack-objects.c: --cruft without expiration builtin/pack-objects.c: return from create_object_entry() t/helper: add 'pack-mtimes' test-tool pack-mtimes: support writing pack .mtimes files chunk-format.h: extract oid_version() pack-write: pass 'struct packing_data' to 'stage_tmp_packfiles' pack-mtimes: support reading .mtimes files Documentation/technical: add cruft-packs.txt	2022-06-03 14:30:37 -07:00
Junio C Hamano	37d4ae58ef	Merge branch 'kl/setup-in-unreadable-worktree' Disable the "do not remove the directory the user started Git in" logic when Git cannot tell where that directory is. Earlier we refused to run in such a case. * kl/setup-in-unreadable-worktree: setup: don't die if realpath(3) fails on getcwd(3)	2022-06-03 14:30:36 -07:00
Junio C Hamano	28db3b7b71	Merge branch 'jx/l10n-workflow-change' A workflow change for translators are being proposed. * jx/l10n-workflow-change: l10n: Document the new l10n workflow Makefile: add "po-init" rule to initialize po/XX.po Makefile: add "po-update" rule to update po/XX.po po/git.pot: don't check in result of "make pot" po/git.pot: this is now a generated file Makefile: remove duplicate and unwanted files in FOUND_SOURCE_FILES i18n CI: stop allowing non-ASCII source messages in po/git.pot Makefile: have "make pot" not "reset --hard" Makefile: generate "po/git.pot" from stable LOCALIZED_C Makefile: sort source files before feeding to xgettext	2022-06-03 14:30:36 -07:00
Junio C Hamano	16a0e92ddc	Merge branch 'tb/geom-repack-with-keep-and-max' Teach "git repack --geometric" work better with "--keep-pack" and avoid corrupting the repository when packsize limit is used. * tb/geom-repack-with-keep-and-max: builtin/repack.c: ensure that `names` is sorted t7703: demonstrate object corruption with pack.packSizeLimit repack: respect --keep-pack with geometric repack	2022-06-03 14:30:36 -07:00
Junio C Hamano	c276c21da6	Merge branch 'ds/sparse-sparse-checkout' "sparse-checkout" learns to work well with the sparse-index feature. * ds/sparse-sparse-checkout: sparse-checkout: integrate with sparse index p2000: add test for 'git sparse-checkout [add\|set]' sparse-index: complete partial expansion sparse-index: partially expand directories sparse-checkout: --no-sparse-index needs a full index cache-tree: implement cache_tree_find_path() sparse-index: introduce partially-sparse indexes sparse-index: create expand_index() t1092: stress test 'git sparse-checkout set' t1092: refactor 'sparse-index contents' test	2022-06-03 14:30:35 -07:00
Junio C Hamano	091680472d	Merge branch 'tb/midx-race-in-pack-objects' The multi-pack-index code did not protect the packfile it is going to depend on from getting removed while in use, which has been corrected. * tb/midx-race-in-pack-objects: builtin/pack-objects.c: ensure pack validity from MIDX bitmap objects builtin/pack-objects.c: ensure included `--stdin-packs` exist builtin/pack-objects.c: avoid redundant NULL check pack-bitmap.c: check preferred pack validity when opening MIDX bitmap	2022-06-03 14:30:35 -07:00
Junio C Hamano	d8c8dccbaa	Merge branch 'ds/object-file-unpack-loose-header-fix' Coding style fix. * ds/object-file-unpack-loose-header-fix: object-file: convert 'switch' back to 'if'	2022-06-03 14:30:35 -07:00
Junio C Hamano	a9e7c3a6ef	Merge branch 'pb/use-freebsd-12.3-in-cirrus-ci' Update the version of FreeBSD image used in Cirrus CI. * pb/use-freebsd-12.3-in-cirrus-ci: ci: update Cirrus-CI image to FreeBSD 12.3	2022-06-03 14:30:34 -07:00
Junio C Hamano	b3b2ddced2	Merge branch 'ds/bundle-uri' Preliminary code refactoring around transport and bundle code. * ds/bundle-uri: bundle.h: make "fd" version of read_bundle_header() public remote: allow relative_url() to return an absolute url remote: move relative_url() http: make http_get_file() external fetch-pack: move --keep=* option filling to a function fetch-pack: add a deref_without_lazy_fetch_extended() dir API: add a generalized path_match_flags() function connect.c: refactor sending of agent & object-format	2022-06-03 14:30:34 -07:00
Junio C Hamano	83937e9592	Merge branch 'ns/batch-fsync' Introduce a filesystem-dependent mechanism to optimize the way the bits for many loose object files are ensured to hit the disk platter. * ns/batch-fsync: core.fsyncmethod: performance tests for batch mode t/perf: add iteration setup mechanism to perf-lib core.fsyncmethod: tests for batch mode test-lib-functions: add parsing helpers for ls-files and ls-tree core.fsync: use batch mode and sync loose objects by default on Windows unpack-objects: use the bulk-checkin infrastructure update-index: use the bulk-checkin infrastructure builtin/add: add ODB transaction around add_files_to_cache cache-tree: use ODB transaction around writing a tree core.fsyncmethod: batched disk flushes for loose-objects bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' bulk-checkin: rename 'state' variable and separate 'plugged' boolean	2022-06-03 14:30:34 -07:00
Junio C Hamano	377d347eb3	Merge branch 'en/sparse-cone-becomes-default' Deprecate non-cone mode of the sparse-checkout feature. * en/sparse-cone-becomes-default: Documentation: some sparsity wording clarifications git-sparse-checkout.txt: mark non-cone mode as deprecated git-sparse-checkout.txt: flesh out pattern set sections a bit git-sparse-checkout.txt: add a new EXAMPLES section git-sparse-checkout.txt: shuffle some sections and mark as internal git-sparse-checkout.txt: update docs for deprecation of 'init' git-sparse-checkout.txt: wording updates for the cone mode default sparse-checkout: make --cone the default tests: stop assuming --no-cone is the default mode for sparse-checkout	2022-06-03 14:30:33 -07:00
Junio C Hamano	2668e3608e	Sixth batch Fast-tracking GitHub CI Windows build fixes. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-31 19:10:35 -07:00
Junio C Hamano	4c9b052377	Merge branch 'jc/http-clear-finished-pointer' Meant to go with js/ci-gcc-12-fixes. * jc/http-clear-finished-pointer: http.c: clear the 'finished' member once we are done with it	2022-05-31 19:10:35 -07:00
Junio C Hamano	db5b7c3e46	Merge branch 'js/ci-gcc-12-fixes' Fixes real problems noticed by gcc 12 and works around false positives. * js/ci-gcc-12-fixes: dir.c: avoid "exceeds maximum object size" error with GCC v12.x nedmalloc: avoid new compile error compat/win32/syslog: fix use-after-realloc	2022-05-31 19:10:35 -07:00
Junio C Hamano	1bcf4f6271	Fifth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-30 23:24:12 -07:00
Junio C Hamano	1fc1879839	Merge branch 'js/use-builtin-add-i' "git add -i" was rewritten in C some time ago and has been in testing; the reimplementation is now exposed to general public by default. * js/use-builtin-add-i: add -i: default to the built-in implementation t2016: require the PERL prereq only when necessary	2022-05-30 23:24:03 -07:00
Junio C Hamano	5a10f4c3a1	Merge branch 'jc/t6424-failing-merge-preserve-local-changes' The tests that ensured merges stop when interfering local changes are present did not make sure that local changes are preserved; now they do. * jc/t6424-failing-merge-preserve-local-changes: t6424: make sure a failed merge preserves local changes	2022-05-30 23:24:03 -07:00
Junio C Hamano	60be29398a	Merge branch 'cc/http-curlopt-resolve' With the new http.curloptResolve configuration, the CURLOPT_RESOLVE mechanism that allows cURL based applications to use pre-resolved IP addresses for the requests is exposed to the scripts. * cc/http-curlopt-resolve: http: add custom hostname to IP address resolutions	2022-05-30 23:24:02 -07:00
Matthew John Cheetham	15d8adccab	scalar: teach `diagnose` to gather loose objects information When operating at the scale that Scalar wants to support, certain data shapes are more likely to cause undesirable performance issues, such as large numbers of loose objects. By including statistics about this, `scalar diagnose` now makes it easier to identify such scenarios. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-30 23:07:31 -07:00
Matthew John Cheetham	93e804b278	scalar: teach `diagnose` to gather packfile info It's helpful to see if there are other crud files in the pack directory. Let's teach the `scalar diagnose` command to gather file size information about pack files. While at it, also enumerate the pack files in the alternate object directories, if any are registered. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-30 23:07:31 -07:00
Johannes Schindelin	0ed5b13f24	scalar diagnose: include disk space information When analyzing problems with large worktrees/repositories, it is useful to know how close to a "full disk" situation Scalar/Git operates. Let's include this information. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-30 23:07:31 -07:00
Johannes Schindelin	aa5c79a331	scalar: implement `scalar diagnose` Over the course of Scalar's development, it became obvious that there is a need for a command that can gather all kinds of useful information that can help identify the most typical problems with large worktrees/repositories. The `diagnose` command is the culmination of this hard-won knowledge: it gathers the installed hooks, the config, a couple statistics describing the data shape, among other pieces of information, and then wraps everything up in a tidy, neat `.zip` archive. Note: originally, Scalar was implemented in C# using the .NET API, where we had the luxury of a comprehensive standard library that includes basic functionality such as writing a `.zip` file. In the C version, we lack such a commodity. Rather than introducing a dependency on, say, libzip, we slightly abuse Git's `archive` machinery: we write out a `.zip` of the empty try, augmented by a couple files that are added via the `--add-file*` options. We are careful trying not to modify the current repository in any way lest the very circumstances that required `scalar diagnose` to be run are changed by the `diagnose` run itself. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-30 23:07:31 -07:00
Johannes Schindelin	b44855743b	scalar: validate the optional enlistment argument The `scalar` command needs a Scalar enlistment for many subcommands, and looks in the current directory for such an enlistment (traversing the parent directories until it finds one). These is subcommands can also be called with an optional argument specifying the enlistment. Here, too, we traverse parent directories as needed, until we find an enlistment. However, if the specified directory does not even exist, or is not a directory, we should stop right there, with an error message. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-30 23:07:31 -07:00
Johannes Schindelin	de1f68a968	archive --add-virtual-file: allow paths containing colons By allowing the path to be enclosed in double-quotes, we can avoid the limitation that paths cannot contain colons. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-30 23:07:31 -07:00
Johannes Schindelin	237a1d138c	archive: optionally add "virtual" files With the `--add-virtual-file=<path>:<content>` option, `git archive` now supports use cases where relatively trivial files need to be added that do not exist on disk. This will allow us to generate `.zip` files with generated content, without having to add said content to the object database and without having to write it out to disk. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> [jc: tweaked <path> handling] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-30 23:07:22 -07:00
Junio C Hamano	b02fdbc80a	pathspec: correct an empty string used as a pathspec element Pathspecs with only negative elements did not work with some commands that pass the pathspec along to a subprocess. For instance, $ git add -p -- ':!*.txt' should add everything except for paths ending in ".txt", but it gets complaint from underlying "diff-index" and aborts. We used to error out when a pathspec with only negative elements in it, like the one in the above example. Later, `859b7f1d` (pathspec: don't error out on all-exclusionary pathspec patterns, 2017-02-07) updated the logic to add an empty string as an extra element. The intention was to let the extra element to match everything and let the negative ones given by the user to subtract from it. At around the same time, we were migrating from "an empty string is a valid pathspec element that matches everything" to "either a dot or ":/" is used to match all, and an empty string is rejected", between `d426430e` (pathspec: warn on empty strings as pathspec, 2016-06-22) and `9e4e8a64` (pathspec: die on empty strings as pathspec, 2017-06-06). I think `9e4e8a64`, which happened long after `859b7f1d` happened, was not careful enough to turn the empty string `859b7f1d` added to either a dot or ":/". A care should be taken as the definition of "everything" depends on subcommand. For the purpose of "add -p", adding a "." to add everything in the current directory is the right thing to do. But for some other commands, ":/" (i.e. really really everything, even things outside the current subdirectory) is the right choice. We would break commands in a big way if we get this wrong, so add a handful of test pieces to make sure the resulting code still excludes the paths that are expected and includes "everything" else. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-29 15:42:18 -07:00
Junio C Hamano	23f2356fd9	Merge branch 'rs/document-archive-prefix' into js/scalar-diagnose * rs/document-archive-prefix: archive: improve documentation of --prefix	2022-05-28 10:38:06 -07:00
René Scharfe	a75910602a	archive: improve documentation of --prefix Document the interaction between --add-file and --prefix by giving an example. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-28 10:29:40 -07:00
Junio C Hamano	05e280c0a6	http.c: clear the 'finished' member once we are done with it In http.c, the run_active_slot() function allows the given "slot" to make progress by calling step_active_slots() in a loop repeatedly, and the loop is not left until the request held in the slot completes. Ages ago, we used to use the slot->in_use member to get out of the loop, which misbehaved when the request in "slot" completes (at which time, the result of the request is copied away from the slot, and the in_use member is cleared, making the slot ready to be reused), and the "slot" gets reused to service a different request (at which time, the "slot" becomes in_use again, even though it is for a different request). The loop terminating condition mistakenly thought that the original request has yet to be completed. Today's code, after `baa7b67d` (HTTP slot reuse fixes, 2006-03-10) fixed this issue, uses a separate "slot->finished" member that is set in run_active_slot() to point to an on-stack variable, and the code that completes the request in finish_active_slot() clears the on-stack variable via the pointer to signal that the particular request held by the slot has completed. It also clears the in_use member (as before that fix), so that the slot itself can safely be reused for an unrelated request. One thing that is not quite clean in this arrangement is that, unless the slot gets reused, at which point the finished member is reset to NULL, the member keeps the value of &finished, which becomes a dangling pointer into the stack when run_active_slot() returns. Clear the finished member before the control leaves the function, which has a side effect of unconfusing compilers like recent GCC 12 that is over-eager to warn against such an assignment. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-27 15:58:31 -07:00
Frantisek Hrbata	8c49d704ef	transport: free local and remote refs in transport_push() Fix memory leaks in transport_push(), where remote_refs and local_refs are never freed. 116 bytes in 1 blocks are definitely lost in loss record 56 of 103 at 0x484486F: malloc (vg_replace_malloc.c:381) by 0x4938D7E: strdup (strdup.c:42) by 0x628418: xstrdup (wrapper.c:39) by 0x4FD454: process_capabilities (connect.c:232) by 0x4FD454: get_remote_heads (connect.c:354) by 0x610A38: handshake (transport.c:333) by 0x612B02: transport_push (transport.c:1302) by 0x4803D6: push_with_options (push.c:357) by 0x4811D6: do_push (push.c:414) by 0x4811D6: cmd_push (push.c:650) by 0x405210: run_builtin (git.c:465) by 0x405210: handle_builtin (git.c:719) by 0x406363: run_argv (git.c:786) by 0x406363: cmd_main (git.c:917) by 0x404F17: main (common-main.c:56) 5,912 (388 direct, 5,524 indirect) bytes in 2 blocks are definitely lost in loss record 98 of 103 at 0x4849464: calloc (vg_replace_malloc.c:1328) by 0x628705: xcalloc (wrapper.c:150) by 0x5C216D: alloc_ref_with_prefix (remote.c:975) by 0x5C232A: alloc_ref (remote.c:983) by 0x5C232A: one_local_ref (remote.c:2299) by 0x5C232A: one_local_ref (remote.c:2289) by 0x5BDB03: do_for_each_repo_ref_iterator (iterator.c:418) by 0x5B4C4F: do_for_each_ref (refs.c:1486) by 0x5B4C4F: refs_for_each_ref (refs.c:1492) by 0x5B4C4F: for_each_ref (refs.c:1497) by 0x5C6ADF: get_local_heads (remote.c:2310) by 0x612A85: transport_push (transport.c:1286) by 0x4803D6: push_with_options (push.c:357) by 0x4811D6: do_push (push.c:414) by 0x4811D6: cmd_push (push.c:650) by 0x405210: run_builtin (git.c:465) by 0x405210: handle_builtin (git.c:719) by 0x406363: run_argv (git.c:786) by 0x406363: cmd_main (git.c:917) Signed-off-by: Frantisek Hrbata <frantisek@hrbata.com> Reviewed-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-27 14:51:17 -07:00
Frantisek Hrbata	35919bf1ab	transport: unify return values and exit point from transport_push() It seems there is no reason to return 1 instead of -1 when push_refs() is not set in transport vtable. Let's unify the error return values and use the done label as a single exit point from transport_push(). Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Frantisek Hrbata <frantisek@hrbata.com> Reviewed-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-27 14:51:16 -07:00
Frantisek Hrbata	6448182a83	transport: remove unnecessary indenting in transport_push() Remove the big indented block for transport_push() check in transport vtable and let's just return error immediately. Hopefully this makes the code more readable. Signed-off-by: Frantisek Hrbata <frantisek@hrbata.com> Reviewed-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-27 14:51:16 -07:00
Taylor Blau	a613164257	sha1-file.c: don't freshen cruft packs We don't bother to freshen objects stored in a cruft pack individually by updating the `.mtimes` file. This is because we can't portably `mmap` and write into the middle of a file (i.e., to update the mtime of just one object). Instead, we would have to rewrite the entire `.mtimes` file which may incur some wasted effort especially if there a lot of cruft objects and they are freshened infrequently. Instead, force the freshening code to avoid an optimizing write by writing out the object loose and letting it pick up a current mtime. This works because we prefer the mtime of the loose copy of an object when both a loose and packed one exist (whether or not the packed copy comes from a cruft pack or not). This could certainly do with a test and/or be included earlier in this series/PR, but I want to wait until after I have a chance to clean up the overly-repetitive nature of the cruft pack tests in general. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00
Taylor Blau	5b92477f89	builtin/gc.c: conditionally avoid pruning objects via loose Expose the new `git repack --cruft` mode from `git gc` via a new opt-in flag. When invoked like `git gc --cruft`, `git gc` will avoid exploding unreachable objects as loose ones, and instead create a cruft pack and `.mtimes` file. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00
Taylor Blau	ddee3703b3	builtin/repack.c: add cruft packs to MIDX during geometric repack When using cruft packs, the following race can occur when a geometric repack that writes a MIDX bitmap takes place afterwords: - First, create an unreachable object and do an all-into-one cruft repack which stores that object in the repository's cruft pack. - Then make that object reachable. - Finally, do a geometric repack and write a MIDX bitmap. Assuming that we are sufficiently unlucky as to select a commit from the MIDX which reaches that object for bitmapping, then the `git multi-pack-index` process will complain that that object is missing. The reason is because we don't include cruft packs in the MIDX when doing a geometric repack. Since the "make that object reachable" doesn't necessarily mean that we'll create a new copy of that object in one of the packs that will get rolled up as part of a geometric repack, it's possible that the MIDX won't see any copies of that now-reachable object. Of course, it's desirable to avoid including cruft packs in the MIDX because it causes the MIDX to store a bunch of objects which are likely to get thrown away. But excluding that pack does open us up to the above race. This patch demonstrates the bug, and resolves it by including cruft packs in the MIDX even when doing a geometric repack. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00
Taylor Blau	72263ffc32	builtin/repack.c: use named flags for existing_packs We use the `util` pointer for items in the `existing_packs` string list to indicate which packs are going to be deleted. Since that has so far been the only use of that `util` pointer, we just set it to 0 or 1. But we're going to add an additional state to this field in the next patch, so prepare for that by adding a #define for the first bit so we can more expressively inspect the flags state. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00
Taylor Blau	4571324b99	builtin/repack.c: allow configuring cruft pack generation In servers which set the pack.window configuration to a large value, we can wind up spending quite a lot of time finding new bases when breaking delta chains between reachable and unreachable objects while generating a cruft pack. Introduce a handful of `repack.cruft*` configuration variables to control the parameters used by pack-objects when generating a cruft pack. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00
Taylor Blau	f9825d1cf7	builtin/repack.c: support generating a cruft pack Expose a way to split the contents of a repository into a main and cruft pack when doing an all-into-one repack with `git repack --cruft -d`, and a complementary configuration variable. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00
Taylor Blau	a7d493833f	builtin/pack-objects.c: --cruft with expiration In a previous patch, pack-objects learned how to generate a cruft pack so long as no objects are dropped. This patch teaches pack-objects to handle the case where a non-never `--cruft-expiration` value is passed. This case is slightly more complicated than before, because we want pack-objects to save unreachable objects which would have been pruned when there is another recent (i.e., non-prunable) unreachable object which reaches the other. We'll call these objects "unreachable but reachable-from-recent". Here is how pack-objects handles `--cruft-expiration`: - Instead of adding all objects outside of the kept pack(s) into the packing list, only handle the ones whose mtime is within the grace period. - Construct a reachability traversal whose tips are the unreachable-but-recent objects. - Then, walk along that traversal, stopping if we reach an object in the kept pack. At each step along the traversal, we add the object we are visiting to the packing list. In the majority of these cases, any object we visit in this traversal will already be in our packing list. But we will sometimes encounter reachable-from-recent cruft objects, which we want to retain even if they aged out of the grace period. The most subtle point of this process is that we actually don't need to bother to update the rescued object's mtime. Even though we will write an .mtimes file with a value that is older than the expiration window, it will continue to survive cruft repacks so long as any objects which reach it haven't aged out. That is, a future repack will also exclude that object from the initial packing list, only to discover it later on when doing the reachability traversal. Finally, stopping early once an object is found in a kept pack is safe to do because the kept packs ordinarily represent which packs will survive after repacking. Assuming that it _isn't_ safe to halt a traversal early would mean that there is some ancestor object which is missing, which implies repository corruption (i.e., the complete set of reachable objects isn't present). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00
Taylor Blau	fb546d6e43	reachable: report precise timestamps from objects in cruft packs When generating a cruft pack, the caller within pack-objects will want to know the precise timestamps of cruft objects (i.e., their corresponding values in the .mtimes table) rather than the mtime of the cruft pack itself. Teach add_recent_packed() to lookup each object's precise mtime from the .mtimes file if one exists (indicated by the is_cruft bit on the packed_git structure). A couple of small things worth noting here: - load_pack_mtimes() needs to be called before asking for nth_packed_mtime(), and that call is done lazily here. That function exits early if the .mtimes file has already been opened and parsed, so only the first call is slow. - Checking the is_cruft bit can be done without any extra work on the caller's behalf, since it is set up for us automatically as a side-effect of calling add_packed_git() (just like the 'pack_keep' and 'pack_promisor' bits). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00
Taylor Blau	2fb90409b8	reachable: add options to add_unseen_recent_objects_to_traversal This function behaves very similarly to what we will need in pack-objects in order to implement cruft packs with expiration. But it is lacking a couple of things. Namely, it needs: - a mechanism to communicate the timestamps of individual recent objects to some external caller - and, in the case of packed objects, our future caller will also want to know the originating pack, as well as the offset within that pack at which the object can be found - finally, it needs a way to skip over packs which are marked as kept in-core. To address the first two, add a callback interface in this patch which reports the time of each recent object, as well as a (packed_git, off_t) pair for packed objects. Likewise, add a new option to the packed object iterators to skip over packs which are marked as kept in core. This option will become implicitly tested in a future patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00
Taylor Blau	b757353676	builtin/pack-objects.c: --cruft without expiration Teach `pack-objects` how to generate a cruft pack when no objects are dropped (i.e., `--cruft-expiration=never`). Later patches will teach `pack-objects` how to generate a cruft pack that prunes objects. When generating a cruft pack which does not prune objects, we want to collect all unreachable objects into a single pack (noting and updating their mtimes as we accumulate them). Ordinary use will pass the result of a `git repack -A` as a kept pack, so when this patch says "kept pack", readers should think "reachable objects". Generating a non-expiring cruft packs works as follows: - Callers provide a list of every pack they know about, and indicate which packs are about to be removed. - All packs which are going to be removed (we'll call these the redundant ones) are marked as kept in-core. Any packs the caller did not mention (but are known to the `pack-objects` process) are also marked as kept in-core. Packs not mentioned by the caller are assumed to be unknown to them, i.e., they entered the repository after the caller decided which packs should be kept and which should be discarded. Since we do not want to include objects in these "unknown" packs (because we don't know which of their objects are or aren't reachable), these are also marked as kept in-core. - Then, we enumerate all objects in the repository, and add them to our packing list if they do not appear in an in-core kept pack. This results in a new cruft pack which contains all known objects that aren't included in the kept packs. When the kept pack is the result of `git repack -A`, the resulting pack contains all unreachable objects. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:48:26 -07:00

1 2 3 4 5 ...

66982 Commits