git-commit-vandalism

Author	SHA1	Message	Date
Johannes Schindelin	bdfef0492c	Sync with 2.16.6 * maint-2.16: (31 commits) Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses ...	2019-12-06 16:27:36 +01:00
Johannes Schindelin	9ac92fed5b	Sync with 2.15.4 * maint-2.15: (29 commits) Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment ...	2019-12-06 16:27:18 +01:00
Johannes Schindelin	d3ac8c3f27	Sync with 2.14.6 * maint-2.14: (28 commits) Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment test-path-utils: offer to run a protectNTFS/protectHFS benchmark ...	2019-12-06 16:26:55 +01:00
Johannes Schindelin	d2c84dad1c	mingw: refuse to access paths with trailing spaces or periods When creating a directory on Windows whose path ends in a space or a period (or chains thereof), the Win32 API "helpfully" trims those. For example, `mkdir("abc ");` will return success, but actually create a directory called `abc` instead. This stems back to the DOS days, when all file names had exactly 8 characters plus exactly 3 characters for the file extension, and the only way to have shorter names was by padding with spaces. Sadly, this "helpful" behavior is a bit inconsistent: after a successful `mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because the directory `abc ` does not actually exist). Even if it would work, we now have a serious problem because a Git repository could contain directories `abc` and `abc `, and on Windows, they would be "merged" unintentionally. As these paths are illegal on Windows, anyway, let's disallow any accesses to such paths on that Operating System. For practical reasons, this behavior is still guarded by the config setting `core.protectNTFS`: it is possible (and at least two regression tests make use of it) to create commits without involving the worktree. In such a scenario, it is of course possible -- even on Windows -- to create such file names. Among other consequences, this patch disallows submodules' paths to end in spaces on Windows (which would formerly have confused Git enough to try to write into incorrect paths, anyway). While this patch does not fix a vulnerability on its own, it prevents an attack vector that was exploited in demonstrations of a number of recently-fixed security bugs. The regression test added to `t/t7417-submodule-path-url.sh` reflects that attack vector. Note that we have to adjust the test case "prevent git~1 squatting on Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue. It tries to clone two submodules whose names differ only in a trailing period character, and as a consequence their git directories differ in the same way. Previously, when Git tried to clone the second submodule, it thought that the git directory already existed (because on Windows, when you create a directory with the name `b.` it actually creates `b`), but with this patch, the first submodule's clone will fail because of the illegal name of the git directory. Therefore, when cloning the second submodule, Git will take a different code path: a fresh clone (without an existing git directory). Both code paths fail to clone the second submodule, both because the the corresponding worktree directory exists and is not empty, but the error messages are worded differently. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-05 15:37:06 +01:00
Johannes Schindelin	288a74bcd2	is_ntfs_dotgit(): only verify the leading segment The config setting `core.protectNTFS` is specifically designed to work not only on Windows, but anywhere, to allow for repositories hosted on, say, Linux servers to be protected against NTFS-specific attack vectors. As a consequence, `is_ntfs_dotgit()` manually splits backslash-separated paths (but does not do the same for paths separated by forward slashes), under the assumption that the backslash might not be a valid directory separator on the _current_ Operating System. However, the two callers, `verify_path()` and `fsck_tree()`, are supposed to feed only individual path segments to the `is_ntfs_dotgit()` function. This causes a lot of duplicate scanning (and very inefficient scanning, too, as the inner loop of `is_ntfs_dotgit()` was optimized for readability rather than for speed. Let's simplify the design of `is_ntfs_dotgit()` by putting the burden of splitting the paths by backslashes as directory separators on the callers of said function. Consequently, the `verify_path()` function, which already splits the path by directory separators, now treats backslashes as directory separators _explicitly_ when `core.protectNTFS` is turned on, even on platforms where the backslash is _not_ a directory separator. Note that we have to repeat some code in `verify_path()`: if the backslash is not a directory separator on the current Operating System, we want to allow file names like `\`, but we _do_ want to disallow paths that are clearly intended to cause harm when the repository is cloned on Windows. The `fsck_tree()` function (the other caller of `is_ntfs_dotgit()`) now needs to look for backslashes in tree entries' names specifically when `core.protectNTFS` is turned on. While it would be tempting to completely disallow backslashes in that case (much like `fsck` reports names containing forward slashes as "full paths"), this would be overzealous: when `core.protectNTFS` is turned on in a non-Windows setup, backslashes are perfectly valid characters in file names while we _still_ want to disallow tree entries that are clearly designed to exploit NTFS-specific behavior. This simplification will make subsequent changes easier to implement, such as turning `core.protectNTFS` on by default (not only on Windows) or protecting against attack vectors involving NTFS Alternate Data Streams. Incidentally, this change allows for catching malicious repositories that contain tree entries of the form `dir\.gitmodules` already on the server side rather than only on the client side (and previously only on Windows): in contrast to `is_ntfs_dotgit()`, the `is_ntfs_dotgitmodules()` function already expects the caller to split the paths by directory separators. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2019-12-05 15:36:50 +01:00
Junio C Hamano	68f95b26e4	Sync with Git 2.16.4 * maint-2.16: Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:25:26 +09:00
Junio C Hamano	023020401d	Sync with Git 2.15.2 * maint-2.15: Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:18:06 +09:00
Junio C Hamano	9e0f06d55d	Sync with Git 2.14.4 * maint-2.14: Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:15:14 +09:00
Junio C Hamano	7b01c71b64	Sync with Git 2.13.7 * maint-2.13: Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:10:49 +09:00
Jeff King	10ecfa7649	verify_path: disallow symlinks in .gitmodules There are a few reasons it's not a good idea to make .gitmodules a symlink, including: 1. It won't be portable to systems without symlinks. 2. It may behave inconsistently, since Git may look at this file in the index or a tree without bothering to resolve any symbolic links. We don't do this _yet_, but the config infrastructure is there and it's planned for the future. With some clever code, we could make (2) work. And some people may not care about (1) if they only work on one platform. But there are a few security reasons to simply disallow it: a. A symlinked .gitmodules file may circumvent any fsck checks of the content. b. Git may read and write from the on-disk file without sanity checking the symlink target. So for example, if you link ".gitmodules" to "../oops" and run "git submodule add", we'll write to the file "oops" outside the repository. Again, both of those are problems that _could_ be solved with sufficient code, but given the complications in (1) and (2), we're better off just outlawing it explicitly. Note the slightly tricky call to verify_path() in update-index's update_one(). There we may not have a mode if we're not updating from the filesystem (e.g., we might just be removing the file). Passing "0" as the mode there works fine; since it's not a symlink, we'll just skip the extra checks. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Jeff King	641084b618	verify_dotfile: mention case-insensitivity in comment We're more restrictive than we need to be in matching ".GIT" on case-sensitive filesystems; let's make a note that this is intentional. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Jeff King	e19e5e66d6	verify_path: drop clever fallthrough We check ".git" and ".." in the same switch statement, and fall through the cases to share the end-of-component check. While this saves us a line or two, it makes modifying the function much harder. Let's just write it out. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Junio C Hamano	3112c3fa7f	Merge branch 'nd/shared-index-fix' into maint Code clean-up. * nd/shared-index-fix: read-cache: don't write index twice if we can't write shared index read-cache.c: move tempfile creation/cleanup out of write_shared_index read-cache.c: change type of "temp" in write_shared_index()	2018-03-22 14:24:16 -07:00
Junio C Hamano	b0e0fc267b	Merge branch 'tg/split-index-fixes' into maint The split-index mode had a few corner case bugs fixed. * tg/split-index-fixes: travis: run tests with GIT_TEST_SPLIT_INDEX split-index: don't write cache tree with null oid entries read-cache: fix reading the shared index for other repos	2018-03-22 14:24:10 -07:00
Junio C Hamano	d17811154b	Merge branch 'rj/warning-uninitialized-fix' Compilation fix. * rj/warning-uninitialized-fix: read-cache: fix an -Wmaybe-uninitialized warning -Wuninitialized: remove some 'init-self' workarounds	2018-03-21 11:30:15 -07:00
Junio C Hamano	fddf9a2d06	Merge branch 'bp/refresh-cache-ent-rehash-fix' The codepath to replace an existing entry in the index had a bug in updating the name hash structure, which has been fixed. * bp/refresh-cache-ent-rehash-fix: Fix bugs preventing adding updated cache entries to the name hash	2018-03-21 11:30:11 -07:00
Junio C Hamano	beb2cdf504	Merge branch 'ma/skip-writing-unchanged-index' Internal API clean-up to allow write_locked_index() optionally skip writing the in-core index when it is not modified. * ma/skip-writing-unchanged-index: write_locked_index(): add flag to avoid writing unchanged index	2018-03-21 11:30:10 -07:00
Ramsay Jones	00a4b03501	read-cache: fix an -Wmaybe-uninitialized warning The function ce_write_entry() uses a 'self-initialised' variable construct, for the symbol 'saved_namelen', to suppress a gcc '-Wmaybe-uninitialized' warning, given that the warning is a false positive. For the purposes of this discussion, the ce_write_entry() function has three code blocks of interest, that look like so: /* block #1 / if (ce->ce_flags & CE_STRIP_NAME) { saved_namelen = ce_namelen(ce); ce->ce_namelen = 0; } / block #2 / / * several code blocks that contain, among others, calls * to copy_cache_entry_to_ondisk(ondisk, ce); / / block #3 */ if (ce->ce_flags & CE_STRIP_NAME) { ce->ce_namelen = saved_namelen; ce->ce_flags &= ~CE_STRIP_NAME; } The warning implies that gcc thinks it is possible that the first block is not entered, the calls to copy_cache_entry_to_ondisk() could toggle the CE_STRIP_NAME flag on, thereby entering block #3 with saved_namelen unset. However, the copy_cache_entry_to_ondisk() function does not write to ce->ce_flags (it only reads). gcc could easily determine this, since that function is local to this file, but it obviously doesn't. In order to suppress this warning, we make it clear to the reader (human and compiler), that block #3 will only be entered when the first block has been entered, by introducing a new 'stripped_name' boolean variable. We also take the opportunity to change the type of 'saved_namelen' to 'unsigned int' to match ce->ce_namelen. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-20 09:59:21 -07:00
Ben Peart	0e267b7a24	Fix bugs preventing adding updated cache entries to the name hash Update replace_index_entry() to clear the CE_HASHED flag from the new cache entry so that it can add it to the name hash in set_index_entry() Fix refresh_cache_ent() to use the copy_cache_entry() macro instead of memcpy() so that it doesn't incorrectly copy the hash state from the old entry. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 10:58:30 -07:00
Junio C Hamano	169c9c0169	Merge branch 'bw/c-plus-plus' Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that targets C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...	2018-03-06 14:54:07 -08:00
Martin Ågren	610008146e	write_locked_index(): add flag to avoid writing unchanged index We have several callers like if (active_cache_changed && write_locked_index(...)) handle_error(); rollback_lock_file(...); where the final rollback is needed because "!active_cache_changed" shortcuts the if-expression. There are also a few variants of this, including some if-else constructs that make it more clear when the explicit rollback is really needed. Teach `write_locked_index()` to take a new flag SKIP_IF_UNCHANGED and simplify the callers. Leave the most complicated of the callers (in builtin/update-index.c) unchanged. Rewriting it to use this new flag would end up duplicating logic. We could have made the new flag behave the other way round ("FORCE_WRITE"), but that could break existing users behind their backs. Let's take the more conservative approach. We can still migrate existing callers to use our new flag. Later we might even be able to flip the default, possibly without entirely ignoring the risk to in-flight or out-of-tree topics. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-01 13:28:01 -08:00
Brandon Williams	285c2e259d	read-cache: rename 'new' variables Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-22 10:08:05 -08:00
Junio C Hamano	0fd90daba8	Merge branch 'bc/hash-algo' More abstraction of hash function from the codepath. * bc/hash-algo: hash: update obsolete reference to SHA1_HEADER bulk-checkin: abstract SHA-1 usage csum-file: abstract uses of SHA-1 csum-file: rename sha1file to hashfile read-cache: abstract away uses of SHA-1 pack-write: switch various SHA-1 values to abstract forms pack-check: convert various uses of SHA-1 to abstract forms fast-import: switch various uses of SHA-1 to the_hash_algo sha1_file: switch uses of SHA-1 to the_hash_algo builtin/unpack-objects: switch uses of SHA-1 to the_hash_algo builtin/index-pack: improve hash function abstraction hash: create union for hash context allocation hash: move SHA-1 macros to hash.h	2018-02-15 14:55:47 -08:00
Junio C Hamano	090dbea684	Merge branch 'nd/trace-index-ops' * nd/trace-index-ops: trace: measure where the time is spent in the index-heavy operations	2018-02-15 14:55:44 -08:00
Junio C Hamano	8be8342b4c	Merge branch 'po/object-id' Conversion from uchar[20] to struct object_id continues. * po/object-id: sha1_file: rename hash_sha1_file_literally sha1_file: convert write_loose_object to object_id sha1_file: convert force_object_loose to object_id sha1_file: convert write_sha1_file to object_id notes: convert write_notes_tree to object_id notes: convert combine_notes_* to object_id commit: convert commit_tree* to object_id match-trees: convert splice_tree to object_id cache: clear whole hash buffer with oidclr sha1_file: convert hash_sha1_file to object_id dir: convert struct sha1_stat to use object_id sha1_file: convert pretend_sha1_file to object_id	2018-02-15 14:55:43 -08:00
Junio C Hamano	dd0c256b67	Merge branch 'nd/shared-index-fix' Code clean-up. * nd/shared-index-fix: read-cache: don't write index twice if we can't write shared index read-cache.c: move tempfile creation/cleanup out of write_shared_index read-cache.c: change type of "temp" in write_shared_index()	2018-02-13 13:39:14 -08:00
Junio C Hamano	cbf0240f82	Merge branch 'sg/cocci-move-array' Code clean-up. * sg/cocci-move-array: Use MOVE_ARRAY	2018-02-13 13:39:13 -08:00
Junio C Hamano	e75c862125	Merge branch 'tg/split-index-fixes' The split-index mode had a few corner case bugs fixed. * tg/split-index-fixes: travis: run tests with GIT_TEST_SPLIT_INDEX split-index: don't write cache tree with null oid entries read-cache: fix reading the shared index for other repos	2018-02-13 13:39:13 -08:00
brian m. carlson	aab6135906	read-cache: abstract away uses of SHA-1 Convert various uses of direct calls to SHA-1 and 20- and 40-based constants to use the_hash_algo instead. Don't yet convert the on-disk data structures, which will be handled in a future commit. Adjust some comments so as not to refer explicitly to SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-02 11:28:41 -08:00
Nguyễn Thái Ngọc Duy	ca54d9baa4	trace: measure where the time is spent in the index-heavy operations All the known heavy code blocks are measured (except object database access). This should help identify if an optimization is effective or not. An unoptimized git-status would give something like below: 0.001791141 s: read cache ... 0.004011363 s: preload index 0.000516161 s: refresh index 0.003139257 s: git command: ... 'status' '--porcelain=2' 0.006788129 s: diff-files 0.002090267 s: diff-index 0.001885735 s: initialize name hash 0.032013138 s: read directory 0.051781209 s: git command: './git' 'status' Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-02 11:20:16 -08:00
Patryk Obara	a09c985eae	sha1_file: convert write_sha1_file to object_id Convert the definition and declaration of write_sha1_file to struct object_id and adjust usage of this function. This commit also converts static function write_sha1_file_prepare, as it is closely related. Rename these functions to write_object_file and write_object_file_prepare respectively. Replace sha1_to_hex, hashcpy and hashclr with their oid equivalents wherever possible. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Nguyễn Thái Ngọc Duy	ef5b3a6c5e	read-cache: don't write index twice if we can't write shared index In `a0a967568e` ("update-index --split-index: do not split if $GIT_DIR is read only", 2014-06-13), we tried to make sure we can still write an index, even if the shared index can not be written. We did so by just calling 'do_write_locked_index()' just before 'write_shared_index()'. 'do_write_locked_index()' always at least closes the tempfile nowadays, and used to close or commit the lockfile if COMMIT_LOCK or CLOSE_LOCK were given at the time this feature was introduced. COMMIT_LOCK or CLOSE_LOCK is passed in by most callers of 'write_locked_index()'. After calling 'write_shared_index()', we call 'write_split_index()', which calls 'do_write_locked_index()' again, which then tries to use the closed lockfile again, but in fact fails to do so as it's already closed. This eventually leads to a segfault. Make sure to write the main index only once. [nd: most of the commit message and investigation done by Thomas, I only tweaked the solution a bit] Helped-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-24 10:09:18 -08:00
SZEDER Gábor	f919ffebed	Use MOVE_ARRAY Use the helper macro MOVE_ARRAY to move arrays. This is shorter and safer, as it automatically infers the size of elements. Patch generated by Coccinelle and contrib/coccinelle/array.cocci in Travis CI's static analysis build job. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-22 11:32:51 -08:00
Thomas Gummerer	4bddd98311	split-index: don't write cache tree with null oid entries In `a96d3cc3f6` ("cache-tree: reject entries with null sha1", 2017-04-21) we made sure that broken cache entries do not get propagated to new trees. Part of that was making sure not to re-use an existing cache tree that includes a null oid. It did so by dropping the cache tree in 'do_write_index()' if one of the entries contains a null oid. In split index mode however, there are two invocations to 'do_write_index()', one for the shared index and one for the split index. The cache tree is only written once, to the split index. As we only loop through the elements that are effectively being written by the current invocation, that may not include the entry with a null oid in the split index (when it is already written to the shared index), where we write the cache tree. Therefore in split index mode we may still end up writing the cache tree, even though there is an entry with a null oid in the index. Fix this by checking for null oids in prepare_to_write_split_index, where we loop the entries of the shared index as well as the entries for the split index. This fixes t7009 with GIT_TEST_SPLIT_INDEX. Also add a new test that's more specifically showing the problem. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-19 10:36:39 -08:00
Thomas Gummerer	a125a22334	read-cache: fix reading the shared index for other repos read_index_from() takes a path argument for the location of the index file. For reading the shared index in split index mode however it just ignores that path argument, and reads it from the gitdir of the current repository. This works as long as an index in the_repository is read. Once that changes, such as when we read the index of a submodule, or of a different working tree than the current one, the gitdir of the_repository will no longer contain the appropriate shared index, and git will fail to read it. For example t3007-ls-files-recurse-submodules.sh was broken with GIT_TEST_SPLIT_INDEX set in `188dce131f` ("ls-files: use repository object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also broken in a similar manner, probably by introducing struct repository there, although I didn't track down the exact commit for that. `be489d02d2` ("revision.c: --indexed-objects add objects from all worktrees", 2017-08-23) breaks with split index mode in a similar manner, not erroring out when it can't read the index, but instead carrying on with pruning, without taking the index of the worktree into account. Fix this by passing an additional gitdir parameter to read_index_from, to indicate where it should look for and read the shared index from. read_cache_from() defaults to using the gitdir of the_repository. As it is mostly a convenience macro, having to pass get_git_dir() for every call seems overkill, and if necessary users can have more control by using read_index_from(). Helped-by: Brandon Williams <bmwill@google.com> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-19 10:36:34 -08:00
Nguyễn Thái Ngọc Duy	59f9d2dd60	read-cache.c: move tempfile creation/cleanup out of write_shared_index For one thing, we have more consistent cleanup procedure now and always keep errno intact. The real purpose is the ability to break out of write_locked_index() early when mks_tempfile() fails in the next patch. It's more awkward to do it if this mks_tempfile() is still inside write_shared_index(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-16 13:12:07 -08:00
Nguyễn Thái Ngọc Duy	7db2d08cdc	read-cache.c: change type of "temp" in write_shared_index() This local variable 'temp' will be passed in from the caller in the next patch. To reduce patch noise, let's change its type now while it's still a local variable and get all the trival conversion out of the next patch. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-16 13:12:02 -08:00
Junio C Hamano	af6e0fe3a5	Merge branch 'tb/add-renormalize' "git add --renormalize ." is a new and safer way to record the fact that you are correcting the end-of-line convention and other "convert_to_git()" glitches in the in-repository data. * tb/add-renormalize: add: introduce "--renormalize"	2017-11-27 11:06:37 +09:00
Junio C Hamano	c9fdbca92c	Merge branch 'av/fsmonitor' Various fixes to bp/fsmonitor topic. * av/fsmonitor: fsmonitor: simplify determining the git worktree under Windows fsmonitor: store fsmonitor bitmap before splitting index fsmonitor: read from getcwd(), not the PWD environment variable fsmonitor: delay updating state until after split index is merged fsmonitor: document GIT_TRACE_FSMONITOR fsmonitor: don't bother pretty-printing JSON from watchman fsmonitor: set the PWD to the top of the working tree	2017-11-21 14:07:51 +09:00
Junio C Hamano	e05336bdda	Merge branch 'bp/fsmonitor' We learned to talk to watchman to speed up "git status" and other operations that need to see which paths have been modified. * bp/fsmonitor: fsmonitor: preserve utf8 filenames in fsmonitor-watchman log fsmonitor: read entirety of watchman output fsmonitor: MINGW support for watchman integration fsmonitor: add a performance test fsmonitor: add a sample integration script for Watchman fsmonitor: add test cases for fsmonitor extension split-index: disable the fsmonitor extension when running the split index test fsmonitor: add a test tool to dump the index extension update-index: add fsmonitor support to update-index ls-files: Add support in ls-files to display the fsmonitor valid bit fsmonitor: add documentation for the fsmonitor extension. fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. update-index: add a new --force-write-index option preload-index: add override to enable testing preload-index bswap: add 64 bit endianness helper get_be64	2017-11-21 14:07:50 +09:00
Torsten Bögershausen	9472935d81	add: introduce "--renormalize" Make it safer to normalize the line endings in a repository. Files that had been commited with CRLF will be commited with LF. The old way to normalize a repo was like this: # Make sure that there are not untracked files $ echo "* text=auto" >.gitattributes $ git read-tree --empty $ git add . $ git commit -m "Introduce end-of-line normalization" The user must make sure that there are no untracked files, otherwise they would have been added and tracked from now on. The new "add --renormalize" does not add untracked files: $ echo "* text=auto" >.gitattributes $ git add --renormalize . $ git commit -m "Introduce end-of-line normalization" Note that "git add --renormalize <pathspec>" is the short form for "git add -u --renormalize <pathspec>". While at it, document that the same renormalization may be needed, whenever a clean filter is added or changed. Helped-By: Junio C Hamano <gitster@pobox.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-17 10:31:05 +09:00
Junio C Hamano	e539a83455	Merge branch 'bp/read-index-from-skip-verification' Drop (perhaps overly cautious) sanity check before using the index read from the filesystem at runtime. * bp/read-index-from-skip-verification: read_index_from(): speed index loading by skipping verification of the entry order	2017-11-15 12:14:37 +09:00
Alex Vandiver	3bd28eb299	fsmonitor: store fsmonitor bitmap before splitting index `ba1b9cac` ("fsmonitor: delay updating state until after split index is merged", 2017-10-27) resolved the problem of the fsmonitor data being applied to the non-base index when reading; however, a similar problem exists when writing the index. Specifically, writing of the fsmonitor extension happens only after the work to split the index has been applied -- as such, the information in the index is only for the non-"base" index, and thus the extension information contains only partial data. When saving, compute the ewah bitmap before the index is split, and store it in the fsmonitor_dirty field, mirroring the behavior that occurred during reading. fsmonitor_dirty is kept from being leaked by being freed when the extension data is written -- which always happens precisely once, no matter the split index configuration. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-10 14:05:01 +09:00
Ben Peart	00ec50e56d	read_index_from(): speed index loading by skipping verification of the entry order There is code in post_read_index_from() to catch out of order entries when reading an index file. This order verification is ~13% of the cost of every call to read_index_from(). Update check_ce_order() so that it skips this verification unless the "verify_ce_order" global variable is set. Teach fsck to force this verification. The effect can be seen using t/perf/p0002-read-cache.sh: Test HEAD HEAD~1 -------------------------------------------------------------------------------------- 0002.1: read_cache/discard_cache 1000 times 0.41(0.04+0.04) 0.50(0.00+0.10) +22.0% Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-08 10:39:41 +09:00
Junio C Hamano	e7e456f500	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (25 commits) refs/files-backend: convert static functions to object_id refs: convert read_raw_ref backends to struct object_id refs: convert peel_object to struct object_id refs: convert resolve_ref_unsafe to struct object_id worktree: convert struct worktree to object_id refs: convert resolve_gitlink_ref to struct object_id Convert remaining callers of resolve_gitlink_ref to object_id sha1_file: convert index_path and index_fd to struct object_id refs: convert reflog_expire parameter to struct object_id refs: convert read_ref_at to struct object_id refs: convert peel_ref to struct object_id builtin/pack-objects: convert to struct object_id pack-bitmap: convert traverse_bitmap_commit_list to object_id refs: convert dwim_log to struct object_id builtin/reflog: convert remaining unsigned char uses to object_id refs: convert dwim_ref and expand_ref to struct object_id refs: convert read_ref and read_ref_full to object_id refs: convert resolve_refdup and refs_resolve_refdup to struct object_id Convert check_connected to use struct object_id refs: update ref transactions to use struct object_id ...	2017-11-06 14:24:27 +09:00
brian m. carlson	a98e6101f0	refs: convert resolve_gitlink_ref to struct object_id Convert the declaration and definition of resolve_gitlink_ref to use struct object_id and apply the following semantic patch: @@ expression E1, E2, E3; @@ - resolve_gitlink_ref(E1, E2, E3.hash) + resolve_gitlink_ref(E1, E2, &E3) @@ expression E1, E2, E3; @@ - resolve_gitlink_ref(E1, E2, E3->hash) + resolve_gitlink_ref(E1, E2, E3) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-16 11:05:51 +09:00
brian m. carlson	1053fe829c	Convert remaining callers of resolve_gitlink_ref to object_id Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-16 11:05:51 +09:00
Martin Ågren	b74c90fb41	read_cache: roll back lock in `update_index_if_able()` `update_index_if_able()` used to always commit the lock or roll it back. Commit `03b866477` (read-cache: new API write_locked_index instead of write_index/write_cache, 2014-06-13) stopped rolling it back in case a write was not even attempted. This change in behavior is not motivated in the commit message and appears to be accidental: the `else`-path was removed, although that changed the behavior in case the `if` shortcuts. Reintroduce the rollback and document this behavior. While at it, move the documentation on this function from the function definition to the function declaration in cache.h. If `write_locked_index(..., COMMIT_LOCK)` fails, it will roll back the lock for us (see the previous commit). Noticed-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-07 10:20:56 +09:00
Martin Ågren	df60cf5789	read-cache: leave lock in right state in `write_locked_index()` If the original version of `write_locked_index()` returned with an error, it didn't roll back the lockfile unless the error occured at the very end, during closing/committing. See commit `03b866477` (read-cache: new API write_locked_index instead of write_index/write_cache, 2014-06-13). In commit `9f41c7a6b` (read-cache: close index.lock in do_write_index, 2017-04-26), we learned to close the lock slightly earlier in the callstack. That was mostly a side-effect of lockfiles being implemented using temporary files, but didn't cause any real harm. Recently, commit `076aa2cbd` (tempfile: auto-allocate tempfiles on heap, 2017-09-05) introduced a subtle bug. If the temporary file is deleted (i.e., the lockfile is rolled back), the tempfile-pointer in the `struct lock_file` will be left dangling. Thus, an attempt to reuse the lockfile, or even just to roll it back, will induce undefined behavior -- most likely a crash. Besides not crashing, we clearly want to make things consistent. The guarantees which the lockfile-machinery itself provides is A) if we ask to commit and it fails, roll back, and B) if we ask to close and it fails, do _not_ roll back. Let's do the same for consistency. Do not delete the temporary file in `do_write_index()`. One of its callers, `write_locked_index()` will thereby avoid rolling back the lock. The other caller, `write_shared_index()`, will delete its temporary file anyway. Both of these callers will avoid undefined behavior (crashing). Teach `write_locked_index(..., COMMIT_LOCK)` to roll back the lock before returning. If we have already succeeded and committed, it will be a noop. Simplify the existing callers where we now have a superfluous call to `rollback_lockfile()`. That should keep future readers from wondering why the callers are inconsistent. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-07 10:20:56 +09:00
Martin Ågren	812d6b0075	read-cache: drop explicit `CLOSE_LOCK`-flag `write_locked_index()` takes two flags: `COMMIT_LOCK` and `CLOSE_LOCK`. At most one is allowed. But it is also possible to use no flag, i.e., `0`. But when `write_locked_index()` calls `do_write_index()`, the temporary file, a.k.a. the lockfile, will be closed. So passing `0` is effectively the same as `CLOSE_LOCK`, which seems like a bug. We might feel tempted to restructure the code in order to close the file later, or conditionally. It also feels a bit unfortunate that we simply "happen" to close the lock by way of an implementation detail of lockfiles. But note that we need to close the temporary file before `stat`-ing it, at least on Windows. See `9f41c7a6b` (read-cache: close index.lock in do_write_index, 2017-04-26). Drop `CLOSE_LOCK` and make it explicit that `write_locked_index()` always closes the lock. Whether it is also committed is governed by the remaining flag, `COMMIT_LOCK`. This means we neither have nor suggest that we have a mode to write the index and leave the file open. Whatever extra contents we might eventually want to write, we should probably write it from within `write_locked_index()` itself anyway. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-07 10:20:56 +09:00

1 2 3 4 5 ...

518 Commits