git-commit-vandalism

Author	SHA1	Message	Date
Han-Wen Nienhuys	fa48de62ac	Documentation: object_id_len goes up to 31 The value is stored in a 5-bit field, so we can't support more without a format version upgrade. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-23 13:36:26 -08:00
Taylor Blau	95e8383bac	midx.c: make changing the preferred pack safe The previous patch demonstrates a bug where a MIDX's auxiliary object order can become out of sync with a MIDX bitmap. This is because of two confounding factors: - First, the object order is stored in a file which is named according to the multi-pack index's checksum, and the MIDX does not store the object order. This means that the object order can change without altering the checksum. - But the .rev file is moved into place with finalize_object_file(), which link(2)'s the file into place instead of renaming it. For us, that means that a modified .rev file will not be moved into place if MIDX's checksum was unchanged. This fix is to force the MIDX's checksum to change when the preferred pack changes but the set of packs contained in the MIDX does not. In other words, when the object order changes, the MIDX's checksum needs to change with it (regardless of whether the MIDX is tracking the same or different packs). This prevents a race whereby changing the object order (but not the packs themselves) enables a reader to see the new .rev file with the old MIDX, or similarly seeing the new bitmap with the old object order. But why can't we just stop hardlinking the .rev into place instead adding additional data to the MIDX? Suppose that's what we did. Then when we go to generate the new bitmap, we'll load the old MIDX bitmap, along with the MIDX that it references. That's fine, since the new MIDX isn't moved into place until after the new bitmap is generated. But the new object order has been moved into place. So we'll read the old bitmaps in the new order when generating the new bitmap file, meaning that without this secondary change, bitmap generation itself would become a victim of the race described here. This can all be prevented by forcing the MIDX's checksum to change when the object order does. By embedding the entire object order into the MIDX, we do just that. That is, the MIDX's checksum will change in response to any perturbation of the underlying object order. In t5326, this will cause the MIDX's checksum to update (even without changing the set of packs in the MIDX), preventing the stale read problem. Note that this makes it safe to continue to link(2) the MIDX .rev file into place, since it is now impossible to have a .rev file that is out-of-sync with the MIDX whose checksum it references. (But we will do away with MIDX .rev files later in this series anyway, so this is somewhat of a moot point). In theory, it is possible to store a "fingerprint" of the full object order here, so long as that fingerprint changes at least as often as the full object order does. Some possibilities here include storing the identity of the preferred pack, along with the mtimes of the non-preferred packs in a consistent order. But storing a limited part of the information makes it difficult to reason about whether or not there are gaps between the two that would cause us to get bitten by this bug again. Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-01-27 12:07:52 -08:00
Junio C Hamano	4ce498baa3	Merge branch 'en/zdiff3' "Zealous diff3" style of merge conflict presentation has been added. * en/zdiff3: update documentation for new zdiff3 conflictStyle xdiff: implement a zealous diff3, or "zdiff3"	2021-12-15 09:39:47 -08:00
Junio C Hamano	cdac0caddd	Merge branch 'jt/midx-doc-fix' Docfix. * jt/midx-doc-fix: Doc: no midx and partial clone relation	2021-12-10 14:35:13 -08:00
Junio C Hamano	4ee5cacc16	Merge branch 'tl/midx-docfix' Doc mark-up fix. * tl/midx-docfix: midx: fix a formatting issue in "multi-pack-index.txt"	2021-12-10 14:35:11 -08:00
Junio C Hamano	83113c4268	Merge branch 'cw/protocol-v2-doc-fix' Doc update. * cw/protocol-v2-doc-fix: protocol-v2.txt: align delim-pkt spec with usage	2021-12-10 14:35:00 -08:00
Elijah Newren	ddfc44a898	update documentation for new zdiff3 conflictStyle Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-01 14:45:59 -08:00
Jonathan Tan	7d3fc7df70	Doc: no midx and partial clone relation The multi-pack index treats promisor packfiles (that is, packfiles that have an accompanying .promisor file) the same as other packfiles. Remove a section in the documentation that seems to indicate otherwise. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-11-22 12:46:33 -08:00
Teng Long	ad506e6780	midx: fix a formatting issue in "multi-pack-index.txt" There is a formatting issue in "multi-pack-index.html", corresponding to the nesting bulleted list of a wrong usage in "multi-pack-index.txt" and this commit fix the problem. In ASCIIDOC, it doesn't treat an indented character as the beginning of a sub-list. If we want to write a nested bulleted list, we could just use ASTERISK without any DASH like: " * Level 1 list item Level 2 list item * Level 3 list item ** Level 2 list item * Level 1 list item ** Level 2 list item * Level 1 list item " The DASH can be used for bulleted list too, But the DASH is suggested only to be used as the marker for the first level because the DASH doesn’t work well or a best practice for nested lists, like (dash is as level 2 below): " * Level 1 list item - Level 2 list item * Level 1 list item " ASTERISK is recommanded to use because it works intuitively and clearly ("marker length = nesting level") in nested lists, but the DASH can't. However, when you want to write a non-nested bulleted lists, DASH works too, like: " - Level 1 list item - Level 1 list item - Level 1 list item " Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-11-18 11:31:07 -08:00
Junio C Hamano	5a73c6bdc7	Merge branch 'js/trace2-raise-format-version' When we added a new event type to trace2 event stream, we forgot to raise the format version number, which has been corrected. * js/trace2-raise-format-version: trace2: increment event format version	2021-11-12 15:29:25 -08:00
Josh Steadmon	04480e67fe	trace2: increment event format version In `64bc752` (trace2: add trace2_child_ready() to report on background children, 2021-09-20), we added a new "child_ready" event. In Documentation/technical/api-trace2.txt, we promise that adding a new event type will result in incrementing the trace2 event format version number, but this was not done. Correct this in code & docs. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-11-11 15:01:04 -08:00
Calvin Wan	74db416c9c	protocol-v2.txt: align delim-pkt spec with usage The current protocol EBNF allows command-request to end with the capability list, if no command specific arguments follow, but the protocol requires that after the capability list, there must be a delim-pkt regardless of the number of command specific arguments. Fixed the EBNF to match. Both JGit and libgit2's implementation has the delim-pkt as mandatory. JGit's code is not publicly linkable, but libgit2 is linked below[1]. As for currently implemented commands on v2 (ls-ref and fetch), the delim packet is already being passed through [1]: https://github.com/libgit2/libgit2/blob/main/src/transports/git.c Reported-by: Ivan Frade <ifrade@google.com> Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-11-11 14:53:18 -08:00
Junio C Hamano	97ab03b12a	Merge branch 'jc/doc-commit-header-continuation-line' Doc update. * jc/doc-commit-header-continuation-line: signature-format.txt: explain and illustrate multi-line headers	2021-10-25 16:07:00 -07:00
Junio C Hamano	af303ee392	Merge branch 'jh/builtin-fsmonitor-part1' Built-in fsmonitor (part 1). * jh/builtin-fsmonitor-part1: t/helper/simple-ipc: convert test-simple-ipc to use start_bg_command run-command: create start_bg_command simple-ipc/ipc-win32: add Windows ACL to named pipe simple-ipc/ipc-win32: add trace2 debugging simple-ipc: move definition of ipc_active_state outside of ifdef simple-ipc: preparations for supporting binary messages. trace2: add trace2_child_ready() to report on background children	2021-10-13 15:15:58 -07:00
Junio C Hamano	f6c013dfa1	signature-format.txt: explain and illustrate multi-line headers A signature attached to a signed commit, and the contents of the commit that merged a signed tag, are both recorded as a value of an object header field as a multi-line value, and are subject to the formatting convention for multi-line values in the headers, with a leading SP signaling that the rest of the line is a continuation of the previous line. Most notably, an empty line in such a multi-line value would result in a line with a sole SP on it. Examples in the signature-format technical documentation include a few of these cases but we did not show these otherwise invisible SPs in the example. These trailing spaces cannot be seen on display or on paper, and forces the readers to look for them in their editors or pagers, even if we added them to the document. Extend the overview section to explain the multi-line value formatting and highlight these otherwise invisible SPs by inventing the "a dollar-sign at the end of line that appears after SP merely signals that there is a SP there, and the dollar-sign itself does not appear in the real file" notation, inspired by "cat -e" output, to help readers to learn exactly where such "a single SP that is originally an empty line" appears in the examples. Reported-by: Rob Browning <rlb@defaultvalue.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-12 19:06:24 -07:00
Junio C Hamano	b39b0e1a82	Merge branch 'ew/midx-doc-update' Doc tweak. * ew/midx-doc-update: doc/technical: update note about core.multiPackIndex	2021-10-06 13:40:12 -07:00
Junio C Hamano	bb1677fc29	Merge branch 'jk/reduce-malloc-in-v2-servers' Code cleanup to limit memory consumption and tighten protocol message parsing. * jk/reduce-malloc-in-v2-servers: ls-refs: reject unknown arguments serve: reject commands used as capabilities serve: reject bogus v2 "command=ls-refs=foo" docs/protocol-v2: clarify some ls-refs ref-prefix details ls-refs: ignore very long ref-prefix counts serve: drop "keys" strvec serve: provide "receive" function for session-id capability serve: provide "receive" function for object-format capability serve: add "receive" method for v2 capabilities table serve: return capability "value" from get_capability() serve: rename is_command() to parse_command()	2021-09-28 13:06:53 -07:00
Eric Wong	0d0d8d8a11	doc/technical: update note about core.multiPackIndex MIDX files are used by default since commit `18e449f86b` (midx: enable core.multiPackIndex by default, 2020-09-25) Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-24 08:39:53 -07:00
Junio C Hamano	0e35107e7d	Merge branch 'ab/retire-option-argument' An oddball OPTION_ARGUMENT feature has been removed from the parse-options API. * ab/retire-option-argument: parse-options API: remove OPTION_ARGUMENT feature difftool: use run_command() API in run_file_diff() difftool: prepare "diff" cmdline in cmd_difftool() difftool: prepare "struct child_process" in cmd_difftool()	2021-09-23 13:44:48 -07:00
Junio C Hamano	cabb41d0f6	Merge branch 'jk/http-server-protocol-versions' Taking advantage of the CGI interface, http-backend has been updated to enable protocol v2 automatically when the other side asks for it. * jk/http-server-protocol-versions: docs/protocol-v2: point readers transport config discussion docs/git: discuss server-side config for GIT_PROTOCOL docs/http-backend: mention v2 protocol http-backend: handle HTTP_GIT_PROTOCOL CGI variable t5551: test v2-to-v0 http protocol fallback	2021-09-23 13:44:47 -07:00
Junio C Hamano	5331af2352	Merge branch 'ab/serve-cleanup' Code clean-up around "git serve". * ab/serve-cleanup: upload-pack: document and rename --advertise-refs serve.[ch]: remove "serve_options", split up --advertise-refs code {upload,receive}-pack tests: add --advertise-refs tests serve.c: move version line to advertise_capabilities() serve: move transfer.advertiseSID check into session_id_advertise() serve.[ch]: don't pass "struct strvec *keys" to commands serve: use designated initializers transport: use designated initializers transport: rename "fetch" in transport_vtable to "fetch_refs" serve: mark has_capability() as static	2021-09-20 15:20:43 -07:00
Junio C Hamano	0649303820	Merge branch 'tb/multi-pack-bitmaps' The reachability bitmap file used to be generated only for a single pack, but now we've learned to generate bitmaps for history that span across multiple packfiles. * tb/multi-pack-bitmaps: (29 commits) pack-bitmap: drop bitmap_index argument from try_partial_reuse() pack-bitmap: drop repository argument from prepare_midx_bitmap_git() p5326: perf tests for MIDX bitmaps p5310: extract full and partial bitmap tests midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' t7700: update to work with MIDX bitmap test knob t5319: don't write MIDX bitmaps in t5319 t5310: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP t0410: disable GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP t5326: test multi-pack bitmap behavior t/helper/test-read-midx.c: add --checksum mode t5310: move some tests to lib-bitmap.sh pack-bitmap: write multi-pack bitmaps pack-bitmap: read multi-pack bitmaps pack-bitmap.c: avoid redundant calls to try_partial_reuse pack-bitmap.c: introduce 'bitmap_is_preferred_refname()' pack-bitmap.c: introduce 'nth_bitmap_object_oid()' pack-bitmap.c: introduce 'bitmap_num_objects()' midx: avoid opening multiple MIDXs when writing midx: close linked MIDXs, avoid leaking memory ...	2021-09-20 15:20:39 -07:00
Jeff Hostetler	64bc75244b	trace2: add trace2_child_ready() to report on background children Create "child_ready" event to capture the state of a child process created in the background. When a child command is started a "child_start" event is generated in the Trace2 log. For normal synchronous children, a "child_exit" event is later generated when the child exits or is terminated. The two events include information, such as the "child_id" and "pid", to allow post analysis to match-up the command line and exit status. When a child is started in the background (and may outlive the parent process), it is not possible for the parent to emit a "child_exit" event. Create a new "child_ready" event to indicate whether the child was successfully started. Also include the "child_id" and "pid" to allow similar post processing. This will be used in a later commit with the new "start_bg_command()". Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-20 08:57:58 -07:00
Jeff King	9db5fb4fb3	docs/protocol-v2: clarify some ls-refs ref-prefix details We've never documented the fact that a client can provide multiple ref-prefix capabilities. Let's describe the behavior. We also never discussed the "best effort" nature of the prefixes. The client side of git.git has always treated them this way, filtering the result with local patterns. And indeed any client must do this, because the prefix patterns are not sufficient to express the usual refspecs (and so for "foo" we ask for "refs/heads/foo", "refs/tags/foo", and so on). So this may be considered a change in the spec with respect to client expectations / requirements, but it's mostly codifying existing behavior. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-15 12:25:19 -07:00
Junio C Hamano	0057847208	Merge branch 'ab/serve-cleanup' into jk/reduce-malloc-in-v2-servers * ab/serve-cleanup: upload-pack: document and rename --advertise-refs serve.[ch]: remove "serve_options", split up --advertise-refs code {upload,receive}-pack tests: add --advertise-refs tests serve.c: move version line to advertise_capabilities() serve: move transfer.advertiseSID check into session_id_advertise() serve.[ch]: don't pass "struct strvec *keys" to commands serve: use designated initializers transport: use designated initializers transport: rename "fetch" in transport_vtable to "fetch_refs" serve: mark has_capability() as static	2021-09-14 10:56:05 -07:00
Ævar Arnfjörð Bjarmason	4c25356e0e	parse-options API: remove OPTION_ARGUMENT feature As was noted in `1a85b49b87` (parse-options: make OPT_ARGUMENT() more useful, 2019-03-14) there's only ever been one user of the OPT_ARGUMENT(), that user was added in `20de316e33` (difftool: allow running outside Git worktrees with --no-index, 2019-03-14). The OPT_ARGUMENT() feature itself was added way back in `580d5bffde` (parse-options: new option type to treat an option-like parameter as an argument., 2008-03-02), but as discussed in `1a85b49b87` wasn't used until `20de316e33` in 2019. Now that the preceding commit has migrated this code over to using "struct strvec" to manage the "args" member of a "struct child_process", we can just use that directly instead of relying on OPT_ARGUMENT. This has a minor change in behavior in that if we'll pass --no-index we'll now always pass it as the first argument, before we'd pass it in whatever position the caller did. Preserving this was the real value of OPT_ARGUMENT(), but as it turns out we didn't need that either. We can always inject it as the first argument, the other end will parse it just the same. Note that we cannot remove the "out" and "cpidx" members of "struct parse_opt_ctx_t" added in `580d5bffde`, while they were introduced with OPT_ARGUMENT() we since used them for other things. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-12 23:27:38 -07:00
Jeff King	1b421e7a5a	docs/protocol-v2: point readers transport config discussion We recently added tips for server admins to configure various transports to support v2's GIT_PROTOCOL variable. While the protocol-v2 document is pretty technical and not of interest to most admins, it may be a starting point for them to figure out how to turn on v2. Let's put some pointers from there to the other documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-10 15:35:00 -07:00
Junio C Hamano	aca13c2355	Merge branch 'en/merge-strategy-docs' Documentation updates. * en/merge-strategy-docs: Update error message and code comment merge-strategies.txt: add coverage of the `ort` merge strategy git-rebase.txt: correct out-of-date and misleading text about renames merge-strategies.txt: fix simple capitalization error merge-strategies.txt: avoid giving special preference to patience algorithm merge-strategies.txt: do not imply using copy detection is desired merge-strategies.txt: update wording for the resolve strategy Documentation: edit awkward references to `git merge-recursive` directory-rename-detection.txt: small updates due to merge-ort optimizations git-rebase.txt: correct antiquated claims about --rebase-merges	2021-08-30 16:06:01 -07:00
Junio C Hamano	6f64eeab60	Merge branch 'es/trace2-log-parent-process-name' trace2 logs learned to show parent process name to see in what context Git was invoked. * es/trace2-log-parent-process-name: tr2: log parent process name tr2: make process info collection platform-generic	2021-08-24 15:32:40 -07:00
Taylor Blau	917a54c017	Documentation: describe MIDX-based bitmaps Update the technical documentation to describe the multi-pack bitmap format. This patch merely introduces the new format, and describes its high-level ideas. Git does not yet know how to read nor write these multi-pack variants, and so the subsequent patches will: - Introduce code to interpret multi-pack bitmaps, according to this document. - Then, introduce code to write multi-pack bitmaps from the 'git multi-pack-index write' sub-command. Finally, the implementation will gain tests in subsequent patches (as opposed to inline with the patch teaching Git how to write multi-pack bitmaps) to avoid a cyclic dependency. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-24 13:21:13 -07:00
Ævar Arnfjörð Bjarmason	98e2d9d6f7	upload-pack: document and rename --advertise-refs The --advertise-refs documentation in git-upload-pack added in `9812f2136b` (upload-pack.c: use parse-options API, 2016-05-31) hasn't been entirely true ever since v2 support was implemented in `e52449b672` (connect: request remote refs using v2, 2018-03-15). Under v2 we don't advertise the refs at all, but rather dump the capabilities header. This option has always been an obscure internal implementation detail, it wasn't even documented for git-receive-pack. Since it has exactly one user let's rename it to --http-backend-info-refs, which is more accurate and points the reader in the right direction. Let's also cross-link this from the protocol v1 and v2 documentation. I'm retaining a hidden --advertise-refs alias in case there's any external users of this, and making both options hidden to the bash completion (as with most other internal-only options). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-05 08:59:37 -07:00
Elijah Newren	b378df72ed	directory-rename-detection.txt: small updates due to merge-ort optimizations In commit `0c4fd732f0` ("Move computation of dir_rename_count from merge-ort to diffcore-rename", 2021-02-27), much of the logic for computing directory renames moved into diffcore-rename. directory-rename-detection.txt had claims that all of that logic was found in merge-recursive. Update the documentation. Acked-by: Derrick Stolee <dstolee@microsoft.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-05 08:57:39 -07:00
Emily Shaffer	2f732bf15e	tr2: log parent process name It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool - we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case, that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool. However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about. Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance. Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. However, the process ID should be sufficient to look up the process name on most platforms, so that code may be shareable. Git for Windows gathers similar information and logs it as a "data_json" event. However, since "data_json" has a variable format, it is difficult to parse effectively in some languages; instead, let's pursue a dedicated "cmd_ancestry" event to record information about the ancestry of the current process and a consistent, parseable way. Git for Windows also gathers information about more than one generation of parent. In Linux further ancestry info can be gathered with procfs, but it's unwieldy to do so. In the interest of later moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the interest of later adding more ancestry to the Linux implementation - or of adding this functionality to other platforms which have an easier time walking the process tree - let's make 'cmd_ancestry' accept an array of parentage. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-22 13:35:20 -07:00
Junio C Hamano	a515f26eac	Merge branch 'ar/typofix' Typofixes. * ar/typofix: *: fix typos which duplicate a word	2021-07-08 13:14:59 -07:00
Junio C Hamano	9c7a1fc9b6	Merge branch 'js/trace2-discard-event-docfix' Docfix. * js/trace2-discard-event-docfix: docs: fix api-trace2 doc for "too_many_files" event	2021-07-08 13:14:57 -07:00
Junio C Hamano	40098093c6	Merge branch 'tk/partial-clone-repack-doc' Docfix. * tk/partial-clone-repack-doc: Remove warning that repack only works on non-promisor packfiles	2021-07-08 13:14:56 -07:00
Junio C Hamano	169914ede2	Merge branch 'en/ort-perf-batch-11' Optimize out repeated rename detection in a sequence of mergy operations. * en/ort-perf-batch-11: merge-ort, diffcore-rename: employ cached renames when possible merge-ort: handle interactions of caching and rename/rename(1to1) cases merge-ort: add helper functions for using cached renames merge-ort: preserve cached renames for the appropriate side merge-ort: avoid accidental API mis-use merge-ort: add code to check for whether cached renames can be reused merge-ort: populate caches of rename detection results merge-ort: add data structures for in-memory caching of rename detection t6429: testcases for remembering renames fast-rebase: write conflict state to working tree, index, and HEAD fast-rebase: change assert() to BUG() Documentation/technical: describe remembering renames optimization t6423: rename file within directory that other side renamed	2021-06-14 13:33:27 +09:00
Andrei Rybak	abcb66c614	*: fix typos which duplicate a word Fix typos in documentation, code comments, and RelNotes which repeat various words. In trivial cases, just delete the duplicated word and rewrap text, if needed. Reword the affected sentence in Documentation/RelNotes/1.8.4.txt for it to make sense. Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-14 10:16:06 +09:00
Junio C Hamano	8e1d2fc0cc	Merge branch 'tl/fix-packfile-uri-doc' Doc fix. * tl/fix-packfile-uri-doc: packfile-uri.txt: fix blobPackfileUri description	2021-06-10 12:04:26 +09:00
Josh Steadmon	7ba68e0cf1	docs: fix api-trace2 doc for "too_many_files" event In `87db61a` (trace2: write discard message to sentinel files, 2019-10-04), we added a new "too_many_files" event for when trace2 logging would create too many files in an output directory. Unfortunately, the api-trace2 doc described a "discard" event instead. Fix the doc to use the correct event name. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-04 12:11:16 +09:00
Tao Klerks	ace6d8e3d6	Remove warning that repack only works on non-promisor packfiles The git-repack doc clearly states that it does operate on promisor packfiles (in a separate partition), with "-a" specified. Presumably the statements here are outdated, as they feature from the first doc in 2017 (and the repack support was added in 2018) Signed-off-by: Tao Klerks <tao@klerks.biz> Reviewed-by: Taylor Blau <me@ttaylorr.com> Acked-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-06-04 09:45:47 +09:00
Teng Long	3127ff90ea	packfile-uri.txt: fix blobPackfileUri description Fix the 'uploadpack.blobPackfileUri' description in packfile-uri.txt and the correct format also can be seen in t5702. Signed-off-by: Teng Long <dyroneteng@gmail.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-25 09:31:06 +09:00
Elijah Newren	a22099f552	t6429: testcases for remembering renames We will soon be adding an optimization that caches (in memory only, never written to disk) upstream renames during a sequence of merges such as occurs during a cherry-pick or rebase operation. Add several tests meant to stress such an implementation to ensure it does the right thing, and include a test whose outcome we will later change due to this optimization as well. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-20 15:40:39 +09:00
Elijah Newren	bb80333c08	Documentation/technical: describe remembering renames optimization Remembering renames on the upstream side of history in an early merge of a rebase or cherry-pick for re-use in a latter merge of the same operation makes pretty good intuitive sense. However, trying to show that it doesn't cause some subtle behavioral difference or some funny edge or corner case is much more involved. And, in fact, it does introduce a subtle behavioral change. Document all the assumptions, special cases, and logic involved in such an optimization, and describe why this optimization is safe under the current optimizations/features/etc. -- even when the subtle behavioral change is triggered. Part of the point of adding this document that goes over the optimization in such laborious detail, is that it is possible that significant future changes (optimizations or feature changes) could interact with this optimization in interesting ways; this document is here to help folks making big changes sanity check that the assumptions and arguments underlying this optimization are still valid. (As a side note, creating this document forced me to review things in sufficient detail that I found I was not properly caching directory-rename-induced renames, resulting in the code not being aware of those renames and causing unnecessary diffcore_rename_extended() calls in subsequent merges.) A subsequent commit will add several testcases based on this document meant to stress-test the implementation and also document the case with the subtle behavioral change. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-20 15:40:39 +09:00
Junio C Hamano	644f4a2046	Merge branch 'jt/push-negotiation' "git push" learns to discover common ancestor with the receiving end over protocol v2. * jt/push-negotiation: send-pack: support push negotiation fetch: teach independent negotiation (no packfile) fetch-pack: refactor command and capability write fetch-pack: refactor add_haves() fetch-pack: refactor process_acks()	2021-05-16 21:05:22 +09:00
Junio C Hamano	eede71149e	Merge branch 'ba/object-info' Over-the-wire protocol learns a new request type to ask for object sizes given a list of object names. * ba/object-info: object-info: support for retrieving object info	2021-05-14 08:26:08 +09:00
Jonathan Tan	9c1e657a8f	fetch: teach independent negotiation (no packfile) Currently, the packfile negotiation step within a Git fetch cannot be done independent of sending the packfile, even though there is at least one application wherein this is useful. Therefore, make it possible for this negotiation step to be done independently. A subsequent commit will use this for one such application - push negotiation. This feature is for protocol v2 only. (An implementation for protocol v0 would require a separate implementation in the fetch, transport, and transport helper code.) In the protocol, the main hindrance towards independent negotiation is that the server can unilaterally decide to send the packfile. This is solved by a "wait-for-done" argument: the server will then wait for the client to say "done". In practice, the client will never say it; instead it will cease requests once it is satisfied. In the client, the main change lies in the transport and transport helper code. fetch_refs_via_pack() performs everything needed - protocol version and capability checks, and the negotiation itself. There are 2 code paths that do not go through fetch_refs_via_pack() that needed to be individually excluded: the bundle transport (excluded through requiring smart_options, which the bundle transport doesn't support) and transport helpers that do not support takeover. If or when we support independent negotiation for protocol v0, we will need to modify these 2 code paths to support it. But for now, report failure if independent negotiation is requested in these cases. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-05 10:41:29 +09:00
Junio C Hamano	a1cac26cc6	Merge branch 'mt/parallel-checkout-part-2' The checkout machinery has been taught to perform the actual write-out of the files in parallel when able. * mt/parallel-checkout-part-2: parallel-checkout: add design documentation parallel-checkout: support progress displaying parallel-checkout: add configuration options parallel-checkout: make it truly parallel unpack-trees: add basic support for parallel checkout	2021-04-30 13:50:26 +09:00
Junio C Hamano	8e97852919	Merge branch 'ds/sparse-index-protections' Builds on top of the sparse-index infrastructure to mark operations that are not ready to mark with the sparse index, causing them to fall back on fully-populated index that they always have worked with. * ds/sparse-index-protections: (47 commits) name-hash: use expand_to_path() sparse-index: expand_to_path() name-hash: don't add directories to name_hash revision: ensure full index resolve-undo: ensure full index read-cache: ensure full index pathspec: ensure full index merge-recursive: ensure full index entry: ensure full index dir: ensure full index update-index: ensure full index stash: ensure full index rm: ensure full index merge-index: ensure full index ls-files: ensure full index grep: ensure full index fsck: ensure full index difftool: ensure full index commit: ensure full index checkout: ensure full index ...	2021-04-30 13:50:26 +09:00
Bruno Albuquerque	a2ba162cda	object-info: support for retrieving object info Sometimes it is useful to get information of an object without having to download it completely. Add the "object-info" capability that lets the client ask for object-related information with their full hexadecimal object names. Only sizes are returned for now. Signed-off-by: Bruno Albuquerque <bga@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-20 17:41:13 -07:00
Junio C Hamano	fdef940afe	Merge branch 'ab/usage-error-docs' Documentation updates, with unrelated comment updates, too. * ab/usage-error-docs: api docs: document that BUG() emits a trace2 error event api docs: document BUG() in api-error-handling.txt usage.c: don't copy/paste the same comment three times	2021-04-20 17:23:36 -07:00
Junio C Hamano	196cc525e2	Merge branch 'hn/reftable-tables-doc-update' Doc updte. * hn/reftable-tables-doc-update: reftable: document an alternate cleanup method on Windows	2021-04-20 17:23:35 -07:00
Matheus Tavares	68e66f2987	parallel-checkout: add design documentation Co-authored-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-19 15:05:25 -07:00
Derrick Stolee	839a66349e	sparse-index: API protection strategy Edit and expand the sparse-index design document with the plan for guarding index operations with ensure_full_index(). Notably, the plan has changed to not have an expand_to_path() method in favor of checking for a sparse-directory hit inside of the index_path_pos() API. The changes that follow this one will incrementally add ensure_full_index() guards to iterations over all cache entries. Some iterations over the cache entries are not protected due to a few categories listed in the document. Since these are not being modified, here is a short list of the files and methods that will not receive these guards: Looking for non-zero stage: * builtin/add.c:chmod_pathspec() * builtin/merge.c:count_unmerged_entries() * merge-ort.c:record_conflicted_index_entries() * read-cache.c:unmerged_index() * rerere.c:check_one_conflict(), find_conflict(), rerere_remaining() * revision.c:prepare_show_merge() * sequencer.c:append_conflicts_hint() * wt-status.c:wt_status_collect_changes_initial() Looking for submodules: * builtin/submodule--helper.c:module_list_compute() * submodule.c: several methods * worktree.c:validate_no_submodules() Part of the index API: * name-hash.c: lazy init methods * preload-index.c:preload_thread(), preload_index() * read-cache.c: file format methods Checking for correct order of cache entries: * read-cache.c:check_ce_order() Ignores SKIP_WORKTREE entries or already aware: * unpack-trees.c:mark_new_skip_worktree() * wt-status.c:wt_status_check_sparse_checkout() Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-14 13:45:34 -07:00
Ævar Arnfjörð Bjarmason	f6d25d7878	api docs: document that BUG() emits a trace2 error event Correct documentation added in `e544221d97` (trace2: Documentation/technical/api-trace2.txt, 2019-02-22) to state that calling BUG() also emits an "error" event. See `ee4512ed48` (trace2: create new combined trace facility, 2019-02-22) for the initial implementation. The BUG() function did not emit an event then however, that was only changed later in `0a9dde4a04` (usage: trace2 BUG() invocations, 2021-02-05), that commit changed the code, but didn't update any of the docs. Let's also add a cross-reference from api-error-handling.txt. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 14:57:13 -07:00
Ævar Arnfjörð Bjarmason	4bf0c6f38f	api docs: document BUG() in api-error-handling.txt When the BUG() function was added in `d8193743e0` (usage.c: add BUG() function, 2017-05-12) these docs added in `1f23cfe0ef` (doc: document error handling functions and conventions, 2014-12-03) were not updated. Let's do that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 14:56:58 -07:00
Han-Wen Nienhuys	61a7660516	reftable: document an alternate cleanup method on Windows The new method uses the update_index counter, which isn't susceptible to clock inaccuracies. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-12 14:29:44 -07:00
Junio C Hamano	e6b971fcf5	Merge branch 'tb/reverse-midx' An on-disk reverse-index to map the in-pack location of an object back to its object name across multiple packfiles is introduced. * tb/reverse-midx: midx.c: improve cache locality in midx_pack_order_cmp() pack-revindex: write multi-pack reverse indexes pack-write.c: extract 'write_rev_file_order' pack-revindex: read multi-pack reverse indexes Documentation/technical: describe multi-pack reverse indexes midx: make some functions non-static midx: keep track of the checksum midx: don't free midx_name early midx: allow marking a pack as preferred t/helper/test-read-midx.c: add '--show-objects' builtin/multi-pack-index.c: display usage on unrecognized command builtin/multi-pack-index.c: don't enter bogus cmd_mode builtin/multi-pack-index.c: split sub-commands builtin/multi-pack-index.c: define common usage with a macro builtin/multi-pack-index.c: don't handle 'progress' separately builtin/multi-pack-index.c: inline 'flags' with options	2021-04-08 13:23:25 -07:00
Junio C Hamano	861794b60d	Merge branch 'jh/simple-ipc' A simple IPC interface gets introduced to build services like fsmonitor on top. * jh/simple-ipc: t0052: add simple-ipc tests and t/helper/test-simple-ipc tool simple-ipc: add Unix domain socket implementation unix-stream-server: create unix domain socket under lock unix-socket: disallow chdir() when creating unix domain sockets unix-socket: add backlog size option to unix_stream_listen() unix-socket: eliminate static unix_stream_socket() helper function simple-ipc: add win32 implementation simple-ipc: design documentation for new IPC mechanism pkt-line: add options argument to read_packetized_to_strbuf() pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option pkt-line: do not issue flush packets in write_packetized_*() pkt-line: eliminate the need for static buffer in packet_write_gently()	2021-04-02 14:43:14 -07:00
Taylor Blau	b25fd24c00	Documentation/technical: describe multi-pack reverse indexes As a prerequisite to implementing multi-pack bitmaps, motivate and describe the format and ordering of the multi-pack reverse index. The subsequent patch will implement reading this format, and the patch after that will implement writing it while producing a multi-pack index. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Taylor Blau	9218c6a40c	midx: allow marking a pack as preferred When multiple packs in the multi-pack index contain the same object, the MIDX machinery must make a choice about which pack it associates with that object. Prior to this patch, the lowest-ordered[1] pack was always selected. Pack selection for duplicate objects is relatively unimportant today, but it will become important for multi-pack bitmaps. This is because we can only invoke the pack-reuse mechanism when all of the bits for reused objects come from the reuse pack (in order to ensure that all reused deltas can find their base objects in the same pack). To encourage the pack selection process to prefer one pack over another (the pack to be preferred is the one a caller would like to later use as a reuse pack), introduce the concept of a "preferred pack". When provided, the MIDX code will always prefer an object found in a preferred pack over any other. No format changes are required to store the preferred pack, since it will be able to be inferred with a corresponding MIDX bitmap, by looking up the pack associated with the object in the first bit position (this ordering is described in detail in a subsequent commit). [1]: the ordering is specified by MIDX internals; for our purposes we can consider the "lowest ordered" pack to be "the one with the most-recent mtime. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-01 13:07:37 -07:00
Derrick Stolee	cd42415fb4	sparse-index: add 'sdir' index extension The index format does not currently allow for sparse directory entries. This violates some expectations that older versions of Git or third-party tools might not understand. We need an indicator inside the index file to warn these tools to not interact with a sparse index unless they are aware of sparse directory entries. Add a new _required_ index extension, 'sdir', that indicates that the index may contain sparse directory entries. This allows us to continue to use the differences in index formats 2, 3, and 4 before we create a new index version 5 in a later change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:46 -07:00
Derrick Stolee	0ad6090bdd	sparse-index: design doc and format update This begins a long effort to update the index format to allow sparse directory entries. This should result in a significant improvement to Git commands when HEAD contains millions of files, but the user has selected many fewer files to keep in their sparse-checkout definition. Currently, the index format is only updated in the presence of extensions.sparseIndex instead of increasing a file format version number. This is temporary, and index v5 is part of the plan for future work in this area. The design document details many of the reasons for embarking on this work, and also the plan for completing it safely. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:44 -07:00
Jeff Hostetler	066d5234d0	simple-ipc: design documentation for new IPC mechanism Brief design documentation for new IPC mechanism allowing foreground Git client to talk with an existing daemon process at a known location using a named pipe or unix domain socket. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-15 14:32:50 -07:00
Junio C Hamano	18aabfaee5	Merge branch 'hn/reftable-tables-doc-update' Documentation update. * hn/reftable-tables-doc-update: doc/reftable: document how to handle windows	2021-03-01 14:02:57 -08:00
Junio C Hamano	660dd97a62	Merge branch 'ds/chunked-file-api' The common code to deal with "chunked file format" that is shared by the multi-pack-index and commit-graph files have been factored out, to help codepaths for both filetypes to become more robust. * ds/chunked-file-api: commit-graph.c: display correct number of chunks when writing chunk-format: add technical docs chunk-format: restore duplicate chunk checks midx: use 64-bit multiplication for chunk sizes midx: use chunk-format read API commit-graph: use chunk-format read API chunk-format: create read chunk API midx: use chunk-format API in write_midx_internal() midx: drop chunk progress during write midx: return success/failure in chunk write methods midx: add num_large_offsets to write_midx_context midx: add pack_perm to write_midx_context midx: add entries to write_midx_context midx: use context in write_midx_pack_names() midx: rename pack_info to write_midx_context commit-graph: use chunk-format write API chunk-format: create chunk format write API commit-graph: anonymize data in chunk_write_fn	2021-03-01 14:02:57 -08:00
Junio C Hamano	09e72204f8	Merge branch 'dl/doc-config-camelcase' A handful of multi-word configuration variable names in documentation that are spelled in all lowercase have been corrected to use the more canonical camelCase. * dl/doc-config-camelcase: index-format doc: camelCase core.excludesFile blame-options.txt: camelcase blame.blankBoundary i18n.txt: camel case and monospace "i18n.commitEncoding"	2021-02-25 16:43:32 -08:00
Junio C Hamano	f47c3328ef	Merge branch 'js/doc-proto-v2-response-end' Docfix. * js/doc-proto-v2-response-end: doc: fix naming of response-end-pkt	2021-02-25 16:43:30 -08:00
Junio C Hamano	11875561bf	Merge branch 'ds/chunked-file-api' into tb/reverse-midx * ds/chunked-file-api: commit-graph.c: display correct number of chunks when writing chunk-format: add technical docs chunk-format: restore duplicate chunk checks midx: use 64-bit multiplication for chunk sizes midx: use chunk-format read API commit-graph: use chunk-format read API chunk-format: create read chunk API midx: use chunk-format API in write_midx_internal() midx: drop chunk progress during write midx: return success/failure in chunk write methods midx: add num_large_offsets to write_midx_context midx: add pack_perm to write_midx_context midx: add entries to write_midx_context midx: use context in write_midx_pack_names() midx: rename pack_info to write_midx_context commit-graph: use chunk-format write API chunk-format: create chunk format write API commit-graph: anonymize data in chunk_write_fn	2021-02-24 15:26:14 -08:00
Junio C Hamano	7dd0eaa39c	index-format doc: camelCase core.excludesFile Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-24 15:21:25 -08:00
Han-Wen Nienhuys	00f68732e5	doc/reftable: document how to handle windows On Windows we can't delete or overwrite files opened by other processes. Here we sketch how to handle this situation. We propose to use a random element in the filename. It's possible to design an alternate solution based on counters, but that would assign semantics to the filenames that complicates implementation. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-23 10:01:21 -08:00
Junio C Hamano	dc24948be9	Merge branch 'ta/hash-function-transition-doc' Update formatting and grammar of the hash transition plan documentation, plus some updates. * ta/hash-function-transition-doc: doc: use https links doc hash-function-transition: move rationale upwards doc hash-function-transition: fix incomplete sentence doc hash-function-transition: use upper case consistently doc hash-function-transition: use SHA-1 and SHA-256 consistently doc hash-function-transition: fix asciidoc output	2021-02-22 16:12:43 -08:00
Derrick Stolee	a43a2e6c2a	chunk-format: add technical docs The chunk-based file format is now an API in the code, but we should also take time to document it as a file format. Specifically, it matches the CHUNK LOOKUP sections of the commit-graph and multi-pack-index files, but there are some commonalities that should be grouped in this document. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-18 13:38:16 -08:00
Junio C Hamano	69571dfe21	Merge branch 'jt/clone-unborn-head' "git clone" tries to locally check out the branch pointed at by HEAD of the remote repository after it is done, but the protocol did not convey the information necessary to do so when copying an empty repository. The protocol v2 learned how to do so. * jt/clone-unborn-head: clone: respect remote unborn HEAD connect, transport: encapsulate arg in struct ls-refs: report unborn targets of symrefs	2021-02-17 17:21:40 -08:00
Junio C Hamano	8b4701ae4f	Merge branch 'ak/corrected-commit-date' The commit-graph learned to use corrected commit dates instead of the generation number to help topological revision traversal. * ak/corrected-commit-date: doc: add corrected commit date info commit-reach: use corrected commit dates in paint_down_to_common() commit-graph: use generation v2 only if entire chain does commit-graph: implement generation data chunk commit-graph: implement corrected commit date commit-graph: return 64-bit generation number commit-graph: add a slab to store topological levels t6600-test-reach: generalize *_three_modes commit-graph: consolidate fill_commit_graph_info revision: parse parent in indegree_walk_step() commit-graph: fix regression when computing Bloom filters	2021-02-17 17:21:40 -08:00
Joey Salazar	9d336655ba	doc: fix naming of response-end-pkt Git Protocol version 2[1] defines 0002 as a Message Packet that indicates the end of a response for stateless connections. Change the naming of the 0002 Packet to 'Response End' to match the parsing introduced in Wireshark's MR !1922 for consistency. A subsequent MR in Wireshark will address additional mismatches. [1] kernel.org/pub/software/scm/git/docs/technical/protocol-v2.html [2] gitlab.com/wireshark/wireshark/-/merge_requests/1922 Signed-off-by: Joey Salazar <jgsal@protonmail.com> Reviewed-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-17 16:30:43 -08:00
Junio C Hamano	3c12d0b885	Merge branch 'tb/pack-revindex-on-disk' Introduce an on-disk file to record revindex for packdata, which traditionally was always created on the fly and only in-core. * tb/pack-revindex-on-disk: t5325: check both on-disk and in-memory reverse index pack-revindex: ensure that on-disk reverse indexes are given precedence t: support GIT_TEST_WRITE_REV_INDEX t: prepare for GIT_TEST_WRITE_REV_INDEX Documentation/config/pack.txt: advertise 'pack.writeReverseIndex' builtin/pack-objects.c: respect 'pack.writeReverseIndex' builtin/index-pack.c: write reverse indexes builtin/index-pack.c: allow stripping arbitrary extensions pack-write.c: prepare to write 'pack-.rev' files packfile: prepare for the existence of '.rev' files	2021-02-12 14:21:04 -08:00
Junio C Hamano	71e83b2e7d	Merge branch 'ma/doc-pack-format-varint-for-sizes' into maint Doc update. * ma/doc-pack-format-varint-for-sizes: pack-format.txt: document sizes at start of delta data	2021-02-08 14:05:54 -08:00
Junio C Hamano	a0a2d75d3b	Merge branch 'ds/cache-tree-basics' Document, clean-up and optimize the code around the cache-tree extension in the index. * ds/cache-tree-basics: cache-tree: speed up consecutive path comparisons cache-tree: use ce_namelen() instead of strlen() index-format: discuss recursion of cache-tree better index-format: update preamble to cache tree extension index-format: use 'cache tree' over 'cached tree' cache-tree: trace regions for prime_cache_tree cache-tree: trace regions for I/O cache-tree: use trace2 in cache_tree_update() unpack-trees: add trace2 regions tree-walk: report recursion counts	2021-02-05 16:40:44 -08:00
Junio C Hamano	93da9662d7	Merge branch 'jt/packfile-as-uri-doc' into maint Doc fix for packfile URI feature. * jt/packfile-as-uri-doc: Doc: clarify contents of packfile sent as URI	2021-02-05 16:31:28 -08:00
Jonathan Tan	59e1205d16	ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 13:49:53 -08:00
Thomas Ackermann	6eda9ac9e5	doc: use https links Use only https links for lore.kernel.org. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:10 -08:00
Thomas Ackermann	1d18997007	doc hash-function-transition: move rationale upwards Move rationale for new hash function to beginning of document so that it appears before the concrete move to SHA-256 is described. Remove some of the details about SHA-1 weaknesses and add references to the details on how the new hash function was chosen instead. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:10 -08:00
Thomas Ackermann	cc9f0916bd	doc hash-function-transition: fix incomplete sentence Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:02 -08:00
Thomas Ackermann	810372f881	doc hash-function-transition: use upper case consistently Use upper case consistently in Document History. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:02 -08:00
Thomas Ackermann	af9b1e9aba	doc hash-function-transition: use SHA-1 and SHA-256 consistently Use SHA-1 and SHA-256 instead of sha1 and sha256 when referring to the hash type. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:02 -08:00
Thomas Ackermann	de82095a95	doc hash-function-transition: fix asciidoc output Asciidoc requires lists to start with an empty line and uses different characters for indentation levels ("-", "", "*", ...). For special symbols like a dash "--" has to be used and there is no double arrow "<->", so a left and right arrow "<-->" has to be combined for that. Lastly for verbatim output a newline followed by an indentation has to be used. Fix asciidoc output for lists, special characters and verbatim text while retaining the readabilty of the original text file. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 11:57:02 -08:00
Junio C Hamano	d03553ecd1	Merge branch 'jt/packfile-as-uri-doc' Doc fix for packfile URI feature. * jt/packfile-as-uri-doc: Doc: clarify contents of packfile sent as URI	2021-02-03 15:04:49 -08:00
Taylor Blau	2f4ba2a867	packfile: prepare for the existence of '.rev' files Specify the format of the on-disk reverse index 'pack-.rev' file, as well as prepare the code for the existence of such files. The reverse index maps from pack relative positions (i.e., an index into the array of object which is sorted by their offsets within the packfile) to their position within the 'pack-.idx' file. Today, this is done by building up a list of (off_t, uint32_t) tuples for each object (the off_t corresponding to that object's offset, and the uint32_t corresponding to its position in the index). To convert between pack and index position quickly, this array of tuples is radix sorted based on its offset. This has two major drawbacks: First, the in-memory cost scales linearly with the number of objects in a pack. Each 'struct revindex_entry' is sizeof(off_t) + sizeof(uint32_t) + padding bytes for a total of 16. To observe this, force Git to load the reverse index by, for e.g., running 'git cat-file --batch-check="%(objectsize:disk)"'. When asking for a single object in a fresh clone of the kernel, Git needs to allocate 120+ MB of memory in order to hold the reverse index in memory. Second, the cost to sort also scales with the size of the pack. Luckily, this is a linear function since 'load_pack_revindex()' uses a radix sort, but this cost still must be paid once per pack per process. As an example, it takes ~60x longer to print the _size_ of an object as it does to print that entire object's _contents_: Benchmark #1: git.compile cat-file --batch <obj Time (mean ± σ): 3.4 ms ± 0.1 ms [User: 3.3 ms, System: 2.1 ms] Range (min … max): 3.2 ms … 3.7 ms 726 runs Benchmark #2: git.compile cat-file --batch-check="%(objectsize:disk)" <obj Time (mean ± σ): 210.3 ms ± 8.9 ms [User: 188.2 ms, System: 23.2 ms] Range (min … max): 193.7 ms … 224.4 ms 13 runs Instead, avoid computing and sorting the revindex once per process by writing it to a file when the pack itself is generated. The format is relatively straightforward. It contains an array of uint32_t's, the length of which is equal to the number of objects in the pack. The ith entry in this table contains the index position of the ith object in the pack, where "ith object in the pack" is determined by pack offset. One thing that the on-disk format does _not_ contain is the full (up to) eight-byte offset corresponding to each object. This is something that the in-memory revindex contains (it stores an off_t in 'struct revindex_entry' along with the same uint32_t that the on-disk format has). Omit it in the on-disk format, since knowing the index position for some object is sufficient to get a constant-time lookup in the pack-.idx file to ask for an object's offset within the pack. This trades off between the on-disk size of the 'pack-.rev' file for runtime to chase down the offset for some object. Even though the lookup is constant time, the constant is heavier, since it can potentially involve two pointer walks in v2 indexes (one to access the 4-byte offset table, and potentially a second to access the double wide offset table). Consider trying to map an object's pack offset to a relative position within that pack. In a cold-cache scenario, more page faults occur while switching between binary searching through the reverse index and searching through the .idx file for an object's offset. Sure enough, with a cold cache (writing '3' into '/proc/sys/vm/drop_caches' after 'sync'ing), printing out the entire object's contents is still marginally faster than printing its size: Benchmark #1: git.compile cat-file --batch-check="%(objectsize:disk)" <obj >/dev/null Time (mean ± σ): 22.6 ms ± 0.5 ms [User: 2.4 ms, System: 7.9 ms] Range (min … max): 21.4 ms … 23.5 ms 41 runs Benchmark #2: git.compile cat-file --batch <obj >/dev/null Time (mean ± σ): 17.2 ms ± 0.7 ms [User: 2.8 ms, System: 5.5 ms] Range (min … max): 15.6 ms … 18.2 ms 45 runs (Numbers taken in the kernel after cheating and using the next patch to generate a reverse index). There are a couple of approaches to improve cold cache performance not pursued here: - We could include the object offsets in the reverse index format. Predictably, this does result in fewer page faults, but it triples the size of the file, while simultaneously duplicating a ton of data already available in the .idx file. (This was the original way I implemented the format, and it did show `--batch-check='%(objectsize:disk)'` winning out against `--batch`.) On the other hand, this increase in size also results in a large block-cache footprint, which could potentially hurt other workloads. - We could store the mapping from pack to index position in more cache-friendly way, like constructing a binary search tree from the table and writing the values in breadth-first order. This would result in much better locality, but the price you pay is trading O(1) lookup in 'pack_pos_to_index()' for an O(log n) one (since you can no longer directly index the table). So, neither of these approaches are taken here. (Thankfully, the format is versioned, so we are free to pursue these in the future.) But, cold cache performance likely isn't interesting outside of one-off cases like asking for the size of an object directly. In real-world usage, Git is often performing many operations in the revindex (i.e., asking about many objects rather than a single one). The trade-off is worth it, since we will avoid the vast majority of the cost of generating the revindex that the extra pointer chase will look like noise in the following patch's benchmarks. This patch describes the format and prepares callers (like in pack-revindex.c) to be able to read *.rev files once they exist. An implementation of the writer will appear in the next patch, and callers will gradually begin to start using the writer in the patches that follow after that. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-25 18:32:43 -08:00
Jonathan Tan	bfc2a36ff2	Doc: clarify contents of packfile sent as URI Clarify that, when the packfile-uri feature is used, the client should not assume that the extra packfiles downloaded would only contain a single blob, but support packfiles containing multiple objects of all types. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-20 19:06:50 -08:00
Abhishek Kumar	5a3b130cad	doc: add corrected commit date info With generation data chunk and corrected commit dates implemented, let's update the technical documentation for commit-graph. Signed-off-by: Abhishek Kumar <abhishekkumar8222@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-18 16:21:18 -08:00
Derrick Stolee	4bdde337f4	index-format: discuss recursion of cache-tree better The end of the cache tree index extension format trails off with ellipses ever since `23fcc98` (doc: technical details about the index file format, 2011-03-01). While an intuitive reader could gather what this means, it could be better to use "and so on" instead. Really, this is only justified because I also wanted to point out that the number of subtrees in the index format is used to determine when the recursive depth-first-search stack should be "popped." This should help to add clarity to the format. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:04:59 -08:00
Derrick Stolee	22ad8600c1	index-format: update preamble to cache tree extension I had difficulty in my efforts to learn about the cache tree extension based on the documentation and code because I had an incorrect assumption about how it behaved. This might be due to some ambiguity in the documentation, so this change modifies the beginning of the cache tree format by expanding the description of the feature. My hope is that this documentation clarifies a few things: 1. There is an in-memory recursive tree structure that is constructed from the extension data. This structure has a few differences, such as where the name is stored. 2. What does it mean for an entry to be invalid? 3. When exactly are "new" trees created? Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:04:46 -08:00
Derrick Stolee	845d15d4d0	index-format: use 'cache tree' over 'cached tree' The index has a "cache tree" extension. This corresponds to a significant API implemented in cache-tree.[ch]. However, there are a few places that refer to this erroneously as "cached tree". These are rare, but notably the index-format.txt file itself makes this error. The only other reference is in t7104-reset-hard.sh. Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-15 23:04:38 -08:00
Junio C Hamano	16a8055dae	Merge branch 'ma/doc-pack-format-varint-for-sizes' Doc update. * ma/doc-pack-format-varint-for-sizes: pack-format.txt: document sizes at start of delta data	2021-01-15 15:20:29 -08:00
Martin Ågren	7b77f5a13e	pack-format.txt: document sizes at start of delta data We document the delta data as a set of instructions, but forget to document the two sizes that precede those instructions: the size of the base object and the size of the object to be reconstructed. Fix this omission. Rather than cramming all the details about the encoding into the running text, introduce a separate section detailing our "size encoding" and refer to it. Reported-by: Ross Light <ross@zombiezen.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 13:00:28 -08:00
Thomas Ackermann	7efc378205	doc: fix some typos Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Acked-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-04 11:27:48 -08:00
Junio C Hamano	7bceb83bfe	Merge branch 'jh/index-v2-doc-on-fsmn' Doc update. * jh/index-v2-doc-on-fsmn: index-format.txt: document v2 format of file system monitor extension	2020-12-17 15:06:42 -08:00
Junio C Hamano	94dc98d1d2	Merge branch 'jb/midx-doc-update' Doc update. * jb/midx-doc-update: docs: multi-pack-index: remove note about future 'verify' work	2020-12-17 15:06:41 -08:00
Jeff Hostetler	5885367e8f	index-format.txt: document v2 format of file system monitor extension Update the documentation of the file system monitor extension to describe version 2. The format was extended to support opaque tokens in: `56c6910028` fsmonitor: change last update timestamp on the index_state to opaque token Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-14 08:42:23 -08:00

1 2 3 4 5 ...

922 Commits