git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	8c1bc7c244	Merge branch 'mk/describe-match-with-all' "git describe --match <pattern>" has been taught to play well with the "--all" option. * mk/describe-match-with-all: describe: teach --match to handle branches and remotes	2017-09-29 11:23:41 +09:00
Junio C Hamano	075bc9c798	Merge branch 'jm/status-ignored-directory-optim' "git status --ignored", when noticing that a directory without any tracked path is ignored, still enumerated all the ignored paths in the directory, which is unnecessary. The codepath has been optimized to avoid this overhead. * jm/status-ignored-directory-optim: Improve performance of git status --ignored	2017-09-29 11:23:40 +09:00
Adam Dinwoodie	5e633326e4	doc: correct command formatting Leaving spaces around the `-delimeters for commands means asciidoc fails to parse them as the start of a literal string. Remove an extraneous space that is causing a literal to not be formatted as such. Signed-off-by: Adam Dinwoodie <adam@dinwoodie.org> Acked-by: Andreas Heiduk <asheiduk@gmail.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-29 10:54:38 +09:00
Jonathan Nieder	752414ae43	technical doc: add a design doc for hash function transition This document describes what a transition to a new hash function for Git would look like. Add it to Documentation/technical/ as the plan of record so that future changes can be recorded as patches. Also-by: Brandon Williams <bmwill@google.com> Also-by: Jonathan Tan <jonathantanmy@google.com> Also-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-28 19:37:52 +09:00
Junio C Hamano	20fed7cad4	The tenth batch for 2.15 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-28 14:51:45 +09:00
Junio C Hamano	3b6e73a3b1	Merge branch 'js/win32-lazyload-dll' Add a helper in anticipation for its need in a future topic RSN. * js/win32-lazyload-dll: Win32: simplify loading of DLL functions	2017-09-28 14:47:57 +09:00
Junio C Hamano	4da3e234f5	Merge branch 'jc/merge-x-theirs-docfix' The documentation for '-X<option>' for merges was misleadingly written to suggest that "-s theirs" exists, which is not the case. * jc/merge-x-theirs-docfix: merge-strategies: avoid implying that "-s theirs" exists	2017-09-28 14:47:57 +09:00
Junio C Hamano	47d26f0a66	Merge branch 'ks/doc-use-camelcase-for-config-name' Doc update. * ks/doc-use-camelcase-for-config-name: doc: camelCase the config variables to improve readability	2017-09-28 14:47:56 +09:00
Junio C Hamano	fdbe2ac198	Merge branch 'mk/diff-delta-avoid-large-offset' The delta format used in the packfile cannot reference data at offset larger than what can be expressed in 4-byte, but the generator for the data failed to make sure the offset does not overflow. This has been corrected. * mk/diff-delta-avoid-large-offset: diff-delta: do not allow delta offset truncation	2017-09-28 14:47:56 +09:00
Junio C Hamano	3d09e79b27	Merge branch 'mk/diff-delta-uint-may-be-shorter-than-ulong' The machinery to create xdelta used in pack files received the sizes of the data in size_t, but lost the higher bits of them by storing them in "unsigned int" during the computation, which is fixed. * mk/diff-delta-uint-may-be-shorter-than-ulong: diff-delta: fix encoding size that would not fit in "unsigned int"	2017-09-28 14:47:56 +09:00
Junio C Hamano	73ecdc606e	Merge branch 'rs/resolve-ref-optional-result' Code clean-up. * rs/resolve-ref-optional-result: refs: pass NULL to resolve_ref_unsafe() if hash is not needed refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed refs: make sha1 output parameter of refs_resolve_ref_unsafe() optional	2017-09-28 14:47:56 +09:00
Junio C Hamano	2812ca7f0e	Merge branch 'rs/mailinfo-qp-decode-fix' "git mailinfo" was loose in decoding quoted printable and produced garbage when the two letters after the equal sign are not hexadecimal. This has been fixed. * rs/mailinfo-qp-decode-fix: mailinfo: don't decode invalid =XY quoted-printable sequences	2017-09-28 14:47:56 +09:00
Junio C Hamano	1ba75ffd01	Merge branch 'jk/doc-read-tree-table-asciidoctor-fix' A docfix. * jk/doc-read-tree-table-asciidoctor-fix: doc: put literal block delimiter around table	2017-09-28 14:47:55 +09:00
Junio C Hamano	376a1da839	Merge branch 'ik/userdiff-html-h-element-fix' The built-in pattern to detect the "function header" for HTML did not match <H1>..<H6> elements without any attributes, which has been fixed. * ik/userdiff-html-h-element-fix: userdiff: fix HTML hunk header regexp	2017-09-28 14:47:54 +09:00
Junio C Hamano	59373a4e03	Merge branch 'jk/fallthrough' Many codepaths have been updated to squelch -Wimplicit-fallthrough warnings from Gcc 7 (which is a good code hygiene). * jk/fallthrough: consistently use "fallthrough" comments in switches curl_trace(): eliminate switch fallthrough test-line-buffer: simplify command parsing	2017-09-28 14:47:53 +09:00
Junio C Hamano	bfbc2fccfd	Merge branch 'jk/diff-blob' "git cat-file --textconv" started segfaulting recently, which has been corrected. * jk/diff-blob: cat-file: handle NULL object_context.path	2017-09-28 14:47:53 +09:00
Junio C Hamano	8174645831	Merge branch 'hn/typofix' * hn/typofix: submodule.h: typofix	2017-09-28 14:47:52 +09:00
Junio C Hamano	386dd12b55	Merge branch 'ic/fix-filter-branch-to-handle-tag-without-tagger' "git filter-branch" cannot reproduce a history with a tag without the tagger field, which only ancient versions of Git allowed to be created. This has been corrected. * ic/fix-filter-branch-to-handle-tag-without-tagger: filter-branch: use hash-object instead of mktag filter-branch: stash away ref map in a branch filter-branch: preserve and restore $GIT_AUTHOR_* and $GIT_COMMITTER_* filter-branch: reset $GIT_* before cleaning up	2017-09-28 14:47:52 +09:00
Junio C Hamano	a515136c52	Merge branch 'jk/describe-omit-some-refs' "git describe --match" learned to take multiple patterns in v2.13 series, but the feature ignored the patterns after the first one and did not work at all. This has been fixed. * jk/describe-omit-some-refs: describe: fix matching to actually match all patterns	2017-09-28 14:47:52 +09:00
Stefan Beller	2d94dd2fc6	submodule: correct error message for missing commits When a submodule diff should be displayed we currently just add the submodule objects to the main object store and then e.g. walk the revision graph and create a summary for that submodule. It is possible that we are missing the submodule either completely or partially, which we currently differentiate with different error messages depending on whether (1) the whole submodule object store is missing or (2) just the needed for this particular diff. (1) is reported as "not initialized", and (2) is reported as "commits not present". If a submodule is deinit'ed its repository data is still around inside the superproject, such that the diff can still be produced. In that way the error message (1) is misleading as we can have a diff despite the submodule being not initialized. Downgrade the error message (1) to be the same as (2) and just say the commits are not present, as that is the true reason why the diff cannot be shown. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-28 14:15:20 +09:00
Stefan Beller	58aaced444	diff: correct newline in summary for renamed files In `146fdb0dfe` (diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY, 2017-06-29), the conversion from direct printing to the symbol emission dropped the new line character for renamed, copied and rewritten files. Add the emission of a newline, add a test for this case. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-28 13:15:59 +09:00
Jeff King	27344d6a6c	git: add --no-optional-locks option Some tools like IDEs or fancy editors may periodically run commands like "git status" in the background to keep track of the state of the repository. Some of these commands may refresh the index and write out the result in an opportunistic way: if they can get the index lock, then they update the on-disk index with any updates they find. And if not, then their in-core refresh is lost and just has to be recomputed by the next caller. But taking the index lock may conflict with other operations in the repository. Especially ones that the user is doing themselves, which _aren't_ opportunistic. In other words, "git status" knows how to back off when somebody else is holding the lock, but other commands don't know that status would be happy to drop the lock if somebody else wanted it. There are a couple possible solutions: 1. Have some kind of "pseudo-lock" that allows other commands to tell status that they want the lock. This is likely to be complicated and error-prone to implement (and maybe even impossible with just dotlocks to work from, as it requires some inter-process communication). 2. Avoid background runs of commands like "git status" that want to do opportunistic updates, preferring instead plumbing like diff-files, etc. This is awkward for a couple of reasons. One is that "status --porcelain" reports a lot more about the repository state than is available from individual plumbing commands. And two is that we actually _do_ want to see the refreshed index. We just don't want to take a lock or write out the result. Whereas commands like diff-files expect us to refresh the index separately and write it to disk so that they can depend on the result. But that write is exactly what we're trying to avoid. 3. Ask "status" not to lock or write the index. This is easy to implement. The big downside is that any work done in refreshing the index for such a call is lost when the process exits. So a background process may end up re-hashing a changed file multiple times until the user runs a command that does an index refresh themselves. This patch implements the option 3. The idea (and the test) is largely stolen from a Git for Windows patch by Johannes Schindelin, 67e5ce7f63 (status: offer not to lock the index and update it, 2016-08-12). The twist here is that instead of making this an option to "git status", it becomes a "git" option and matching environment variable. The reason there is two-fold: 1. An environment variable is carried through to sub-processes. And whether an invocation is a background process or not should apply to the whole process tree. So you could do "git --no-optional-locks foo", and if "foo" is a script or alias that calls "status", you'll still get the effect. 2. There may be other programs that want the same treatment. I've punted here on finding more callers to convert, since "status" is the obvious one to call as a repeated background job. But "git diff"'s opportunistic refresh of the index may be a good candidate. The test is taken from 67e5ce7f63, and it's worth repeating Johannes's explanation: Note that the regression test added in this commit does not really verify that no index.lock file was written; that test is not possible in a portable way. Instead, we verify that .git/index is rewritten only when `git status` is run without `--no-optional-locks`. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 16:11:01 +09:00
Jeff King	0bca165fdb	validate_headref: use get_oid_hex for detached HEADs If a candidate HEAD isn't a symref, we check that it contains a viable sha1. But in a post-sha1 world, we should be checking whether it has any plausible object-id. We can do that by switching to get_oid_hex(). Note that both before and after this patch, we only check for a plausible object id at the start of the file, and then call that good enough. We ignore any content _after_ the hex, so a string like: 0123456789012345678901234567890123456789 foo is accepted. Though we do put extra bytes like this into some pseudorefs (e.g., FETCH_HEAD), we don't typically do so with HEAD. We could tighten this up by using parse_oid_hex(), like: if (!parse_oid_hex(buffer, &oid, &end) && end++ == '\n' && end == '\0') return 0; But we're probably better to remain on the loose side. We're just checking here for a plausible-looking repository directory, so heuristics are acceptable (if we really want to be meticulous, we should use the actual ref code to parse HEAD). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 16:07:22 +09:00
Jeff King	7eb4b9d025	validate_headref: use skip_prefix for symref parsing Since the previous commit guarantees that our symref buffer is NUL-terminated, we can just use skip_prefix() and friends to parse it. This is shorter and saves us having to deal with magic numbers and keeping the "len" counter up to date. While we're at it, let's name the rather obscure "buf" to "refname", since that is the thing we are parsing with it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 16:06:31 +09:00
Jeff King	6e68c91410	validate_headref: NUL-terminate HEAD buffer When we are checking to see if we have a git repo, we peek into the HEAD file and see if it's a plausible symlink, symref, or detached HEAD. For the latter two, we read the contents with read_in_full(), which means they aren't NUL-terminated. The symref check is careful to respect the length we got, but the sha1 check will happily parse up to 40 bytes, even if we read fewer. E.g.,: echo 1234 >.git/HEAD git rev-parse will parse 36 uninitialized bytes from our stack buffer. This isn't a big deal in practice. Our buffer is 256 bytes, so we know we'll never read outside of it. The worst case is that the uninitialized bytes look like valid hex, and we claim a bogus HEAD file is valid. The chances of this happening randomly are quite slim, but let's be careful. One option would be to check that "len == 41" before feeding the buffer to get_sha1_hex(). But we'd like to eventually prepare for a world with variable-length hashes. Let's NUL-terminate as soon as we've read the buffer (we already even leave a spare byte to do so!). That fixes this problem without depending on the size of an object id. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 16:01:24 +09:00
Jeff King	8a1a8d2ad1	worktree: check the result of read_in_full() We try to read "len" bytes into a buffer and just assume that it happened correctly. In practice this should usually be the case, since we just stat'd the file to get the length. But we could be fooled by transient errors or by other processes racily truncating the file. Let's be more careful. There's a slim chance this could catch a real error, but it also prevents people and tools from getting worried while reading the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 15:46:05 +09:00
Jeff King	228740b67b	worktree: use xsize_t to access file size To read the "gitdir" file into memory, we stat the file and allocate a buffer. But we store the size in an "int", which may be truncated. We should use a size_t and xsize_t(), which will detect truncation. An overflow is unlikely for a "gitdir" file, but it's a good practice to model. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 15:45:57 +09:00
Jeff King	41dcc4dccc	distinguish error versus short read from read_in_full() Many callers of read_in_full() expect to see the exact number of bytes requested, but their error handling lumps together true read errors and short reads due to unexpected EOF. We can give more specific error messages by separating these cases (showing errno when appropriate, and otherwise describing the short read). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 15:45:24 +09:00
Jeff King	90dca6710e	avoid looking at errno for short read_in_full() returns When a caller tries to read a particular set of bytes via read_in_full(), there are three possible outcomes: 1. An error, in which case -1 is returned and errno is set. 2. A short read, in which fewer bytes are returned and errno is unspecified (we never saw a read error, so we may have some random value from whatever syscall failed last). 3. The full read completed successfully. Many callers handle cases 1 and 2 together by just checking the result against the requested size. If their combined error path looks at errno (e.g., by calling die_errno), they may report a nonsense value. Let's fix these sites by having them distinguish between the two error cases. That avoids the random errno confusion, and lets us give more detailed error messages. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 15:45:24 +09:00
Jeff King	61d36330b4	prefer "!=" when checking read_in_full() result Comparing the result of read_in_full() using less-than is potentially dangerous, as a negative return value may be converted to an unsigned type and be considered a success. This is discussed further in 561598cfcf (read_pack_header: handle signed/unsigned comparison in read result, 2017-09-13). Each of these instances is actually fine in practice: - in get-tar-commit-id, the HEADERSIZE macro expands to a signed integer. If it were switched to an unsigned type (e.g., a size_t), then it would be a bug. - the other two callers check for a short read only after handling a negative return separately. This is a fine practice, but we'd prefer to model "!=" as a general rule. So all of these cases can be considered cleanups and not actual bugfixes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 15:45:24 +09:00
Stefan Beller	83a17fa83b	t7406: submodule.<name>.update command must not be run from .gitmodules submodule.<name>.update can be assigned an arbitrary command via setting it to "!command". When this command is found in the regular config, Git ought to just run that command instead of other update mechanisms. However if that command is just found in the .gitmodules file, it is potentially untrusted, which is why we do not run it. Add a test confirming the behavior. Suggested-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 12:22:01 +09:00
Han-Wen Nienhuys	4f665f2cf3	string-list.h: move documentation from Documentation/api/ into header This mirrors commit 'bdfdaa497 ("strbuf.h: integrate api-strbuf.txt documentation, 2015-01-16") which did the same for strbuf.h: * API documentation uses /** / to set it apart from other comments. Function names were stripped from the comments. * Ordering of the header was adjusted to follow the one from the text file. * Edited some existing comments from string-list.h for consistency. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 09:14:34 +09:00
Han-Wen Nienhuys	ea1d87560c	read_gitfile_gently: clarify return value ownership. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 09:14:02 +09:00
Han-Wen Nienhuys	d83d846e84	real_path: clarify return value ownership Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 09:13:47 +09:00
Junio C Hamano	7451fcdc0d	Sync with 2.14.2 * maint: Git 2.14.2 Git 2.13.6 Git 2.12.5 Git 2.11.4 Git 2.10.5 cvsimport: shell-quote variable used in backticks archimport: use safe_pipe_capture for user input shell: drop git-cvsserver support by default cvsserver: use safe_pipe_capture for `constant commands` as well cvsserver: use safe_pipe_capture instead of backticks cvsserver: move safe_pipe_capture() to the main package	2017-09-26 14:15:55 +09:00
Han-Wen Nienhuys	3ce08548bb	submodule.c: describe submodule_to_gitdir() in a new comment Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-26 14:08:23 +09:00
Jeff King	a1f3515da7	notes-merge: drop dead zero-write code We call write_in_full() with a size that we know is greater than zero. The return value can never be zero, then, since write_in_full() converts such a failed write() into ENOSPC and returns -1. We can just drop this branch of the error handling entirely. Suggested-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-26 12:55:59 +09:00
Jeff King	88780c37b3	files-backend: prefer "0" for write_in_full() error check Commit `06f46f237a` (avoid "write_in_full(fd, buf, len) != len" pattern, 2017-09-13) converted this callsite from: write_in_full(...) != 1 to write_in_full(...) < 0 But during the conflict resolution in `c50424a6f0` (Merge branch 'jk/write-in-full-fix', 2017-09-25), this morphed into write_in_full(...) < 1 This behaves as we want, but we prefer to avoid modeling the "less than length" error-check which can be subtly buggy, as shown in `efacf609c8` (config: avoid "write_in_full(fd, buf, len) < len" pattern, 2017-09-13). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-26 12:54:43 +09:00
Johannes Schindelin	db2f7c48cb	Win32: simplify loading of DLL functions Dynamic loading of DLL functions is duplicated in several places in Git for Windows' source code. This patch adds a pair of macros to simplify the process: the DECLARE_PROC_ADDR(<dll>, <return-type>, <function-name>, ...<function-parameter-types>...) macro to be used at the beginning of a code block, and the INIT_PROC_ADDR(<function-name>) macro to call before using the declared function. The return value of the INIT_PROC_ADDR() call has to be checked; If it is NULL, the function was not found in the specified DLL. Example: DECLARE_PROC_ADDR(kernel32.dll, BOOL, CreateHardLinkW, LPCWSTR, LPCWSTR, LPSECURITY_ATTRIBUTES); if (!INIT_PROC_ADDR(CreateHardLinkW)) return error("Could not find CreateHardLinkW() function"; if (!CreateHardLinkW(source, target, NULL)) return error("could not create hardlink from %S to %S", source, target); return 0; Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-26 11:02:49 +09:00
Michael Haggerty	cff28ca94c	packed-backend.c: rename a bunch of things and update comments We've made huge changes to this file, and some of the old names and comments are no longer very fitting. So rename a bunch of things: * `struct packed_ref_cache` → `struct snapshot` * `acquire_packed_ref_cache()` → `acquire_snapshot()` * `release_packed_ref_buffer()` → `clear_snapshot_buffer()` * `release_packed_ref_cache()` → `release_snapshot()` * `clear_packed_ref_cache()` → `clear_snapshot()` * `struct packed_ref_entry` → `struct snapshot_record` * `cmp_packed_ref_entries()` → `cmp_packed_ref_records()` * `cmp_entry_to_refname()` → `cmp_record_to_refname()` * `sort_packed_refs()` → `sort_snapshot()` * `read_packed_refs()` → `create_snapshot()` * `validate_packed_ref_cache()` → `validate_snapshot()` * `get_packed_ref_cache()` → `get_snapshot()` * Renamed local variables and struct members accordingly. Also update a bunch of comments to reflect the renaming and the accumulated changes that the code has undergone. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:46 +09:00
Michael Haggerty	523ee2d785	mmapped_ref_iterator: inline into `packed_ref_iterator` Since `packed_ref_iterator` is now delegating to `mmapped_ref_iterator` rather than `cache_ref_iterator` to do the heavy lifting, there is no need to keep the two iterators separate. So "inline" `mmapped_ref_iterator` into `packed_ref_iterator`. This removes a bunch of boilerplate. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:46 +09:00
Michael Haggerty	a6e19bcdad	ref_cache: remove support for storing peeled values Now that the `packed-refs` backend doesn't use `ref_cache`, there is nobody left who might want to store peeled values of references in `ref_cache`. So remove that feature. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:46 +09:00
Michael Haggerty	9dd389f3d8	packed_ref_store: get rid of the `ref_cache` entirely Now that everything has been changed to read what it needs directly out of the `packed-refs` file, `packed_ref_store` doesn't need to maintain a `ref_cache` at all. So get rid of it. First of all, this will save a lot of memory and lots of little allocations. Instead of needing to store complicated parsed data structures in memory, we just mmap the file (potentially sharing memory with other processes) and parse only what we need. Moreover, since the mmapped access to the file reads only the parts of the file that it needs, this might save reading all of the data from disk at all (at least if the file starts out sorted). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:46 +09:00
Michael Haggerty	ba1c052fa6	ref_store: implement `refs_peel_ref()` generically We're about to stop storing packed refs in a `ref_cache`. That means that the only way we have left to optimize `peel_ref()` is by checking whether the reference being peeled is the one currently being iterated over (in `current_ref_iter`), and if so, using `ref_iterator_peel()`. But this can be done generically; it doesn't have to be implemented per-backend. So implement `refs_peel_ref()` in `refs.c` and remove the `peel_ref()` method from the refs API. This removes the last callers of a couple of functions, so delete them. More cleanup to come... Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:46 +09:00
Michael Haggerty	f3987ab36d	packed_read_raw_ref(): read the reference from the mmapped buffer Instead of reading the reference from the `ref_cache`, read it directly from the mmapped buffer. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:45 +09:00
Michael Haggerty	d1cf15516f	packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator` Now that we have an efficient way to iterate, in order, over the mmapped contents of the `packed-refs` file, we can use that directly to implement reference iteration for the `packed_ref_store`, rather than iterating over the `ref_cache`. This is the next step towards getting rid of the `ref_cache` entirely. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:45 +09:00
Michael Haggerty	02b920f3f7	read_packed_refs(): ensure that references are ordered when read It doesn't actually matter now, because the references are only iterated over to fill the associated `ref_cache`, which itself puts them in the correct order. But we want to get rid of the `ref_cache`, so we want to be able to iterate directly over the `packed-refs` buffer, and then the iteration will need to be ordered correctly. In fact, we already write the `packed-refs` file sorted, but it is possible that other Git clients don't get it right. So let's not assume that a `packed-refs` file is sorted unless it is explicitly declared to be so via a `sorted` trait in its header line. If it is not declared to be sorted, then scan quickly through the file to check. If it is found to be out of order, then sort the records into a new memory-only copy. This checking and sorting is done quickly, without parsing the full file contents. However, it needs a little bit of care to avoid reading past the end of the buffer even if the `packed-refs` file is corrupt. Since we always write the file correctly sorted, include that trait when we write or rewrite a `packed-refs` file. This means that the scan described in the previous paragraph should only have to be done for `packed-refs` files that were written by older versions of the Git command-line client, or by other clients that haven't yet learned to write the `sorted` trait. If `packed-refs` was already sorted, then (if the system allows it) we can use the mmapped file contents directly. But if the system doesn't allow a file that is currently mmapped to be replaced using `rename()`, then it would be bad for us to keep the file mmapped for any longer than necessary. So, on such systems, always make a copy of the file contents, either as part of the sorting process, or afterwards. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:45 +09:00
Michael Haggerty	5b633610ec	packed_ref_cache: keep the `packed-refs` file mmapped if possible Keep a copy of the `packed-refs` file contents in memory for as long as a `packed_ref_cache` object is in use: * If the system allows it, keep the `packed-refs` file mmapped. * If not (either because the system doesn't support `mmap()` at all, or because a file that is currently mmapped cannot be replaced via `rename()`), then make a copy of the file's contents in heap-allocated space, and keep that around instead. We base the choice of behavior on a new build-time switch, `MMAP_PREVENTS_DELETE`. By default, this switch is set for Windows variants. After this commit, `MMAP_NONE` and `MMAP_TEMPORARY` are still handled identically. But the next commit will introduce a difference. This whole change is still pointless, because we only read the `packed-refs` file contents immediately after instantiating the `packed_ref_cache`. But that will soon change. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:45 +09:00
Michael Haggerty	14b3c344ea	packed-backend.c: reorder some definitions No code has been changed. This will make subsequent patches more self-contained. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:45 +09:00
Michael Haggerty	81b9b5aea7	mmapped_ref_iterator_advance(): no peeled value for broken refs If a reference is broken, suppress its peeled value. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-25 18:02:45 +09:00

... 3 4 5 6 7 ...

49111 Commits