git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	e3ab3e8a39	Merge branch 'ab/perl-python-attrs' We learned that our source files with ".pl" and ".py" extensions are Perl and Python files respectively and changes to them are better viewed as such with appropriate diff drivers. * ab/perl-python-attrs: .gitattributes: add a diff driver for Python .gitattributes: use the "perl" differ for Perl .gitattributes: add *.pl extension for Perl	2018-05-23 14:38:16 +09:00
Junio C Hamano	fcb6df3254	Merge branch 'sb/oid-object-info' The codepath around object-info API has been taught to take the repository object (which in turn tells the API which object store the objects are to be located). * sb/oid-object-info: cache.h: allow oid_object_info to handle arbitrary repositories packfile: add repository argument to cache_or_unpack_entry packfile: add repository argument to unpack_entry packfile: add repository argument to read_object packfile: add repository argument to packed_object_info packfile: add repository argument to packed_to_object_type packfile: add repository argument to retry_bad_packed_offset cache.h: add repository argument to oid_object_info cache.h: add repository argument to oid_object_info_extended	2018-05-23 14:38:16 +09:00
Junio C Hamano	d0f7b22b5b	Merge branch 'cc/perf-aggregate-unknown-option' Perf-test helper updates. * cc/perf-aggregate-unknown-option: perf/aggregate: use Getopt::Long for option parsing	2018-05-23 14:38:15 +09:00
Junio C Hamano	89753dc2b7	Merge branch 'en/git-debugger' Dev support. * en/git-debugger: Make running git under other debugger-like programs easy	2018-05-23 14:38:15 +09:00
Junio C Hamano	41267e9697	Merge branch 'bw/server-options' The transport protocol v2 is getting updated further. * bw/server-options: fetch: send server options when using protocol v2 ls-remote: send server options when using protocol v2 serve: introduce the server-option capability	2018-05-23 14:38:15 +09:00
Junio C Hamano	30b015bffe	Merge branch 'nd/repack-keep-pack' "git gc" in a large repository takes a lot of time as it considers to repack all objects into one pack by default. The command has been taught to pretend as if the largest existing packfile is marked with ".keep" so that it is left untouched while objects in other packs and loose ones are repacked. * nd/repack-keep-pack: pack-objects: show some progress when counting kept objects gc --auto: exclude base pack if not enough mem to "repack -ad" gc: handle a corner case in gc.bigPackThreshold gc: add gc.bigPackThreshold config gc: add --keep-largest-pack option repack: add --keep-pack option t7700: have closing quote of a test at the beginning of line	2018-05-23 14:38:14 +09:00
Junio C Hamano	c89b6e136e	Merge branch 'ds/lazy-load-trees' The code has been taught to use the duplicated information stored in the commit-graph file to learn the tree object name for a commit to avoid opening and parsing the commit object when it makes sense to do so. * ds/lazy-load-trees: coccinelle: avoid wrong transformation suggestions from commit.cocci commit-graph: lazy-load trees for commits treewide: replace maybe_tree with accessor methods commit: create get_commit_tree() method treewide: rename tree to maybe_tree	2018-05-23 14:38:13 +09:00
Junio C Hamano	05682ee270	Merge branch 'nd/term-columns' The code did not propagate the terminal width to subprocesses via COLUMNS environment variable, which it now does. This caused trouble to "git column" helper subprocess when "git tag --column=row" tried to list the existing tags on a display with non-default width. * nd/term-columns: column: fix off-by-one default width pager: set COLUMNS to term_columns()	2018-05-23 14:38:13 +09:00
Junio C Hamano	798b029da8	Merge branch 'sg/t7005-spaces-in-filenames-cleanup' Test update. * sg/t7005-spaces-in-filenames-cleanup: t7005-editor: get rid of the SPACES_IN_FILENAMES prereq	2018-05-23 14:38:12 +09:00
Junio C Hamano	c8311980f9	Merge branch 'sg/t5516-fixes' Test fixes. * sg/t5516-fixes: t5516-fetch-push: fix broken &&-chain t5516-fetch-push: fix 'push with dry-run' test	2018-05-23 14:38:12 +09:00
Junio C Hamano	c4e7220f08	Merge branch 'sg/t5310-jgit-bitmap-test' Test update. * sg/t5310-jgit-bitmap-test: t5310-pack-bitmaps: make JGit tests work with GIT_TEST_SPLIT_INDEX	2018-05-23 14:38:12 +09:00
Junio C Hamano	b577198526	Merge branch 'nd/pack-format-doc' Doc update. * nd/pack-format-doc: pack-format.txt: more details on pack file format	2018-05-23 14:38:11 +09:00
Junio C Hamano	ece48106e6	Merge branch 'jk/apply-p-doc' Doc update. * jk/apply-p-doc: apply: clarify "-p" documentation	2018-05-23 14:38:11 +09:00
Junio C Hamano	d9a0ddc217	Merge branch 'ao/config-api-doc' Doc update. * ao/config-api-doc: doc: fix config API documentation about config_with_options	2018-05-23 14:38:10 +09:00
Junio C Hamano	e1dd23a8c9	Merge branch 'bc/mailmap-self' * bc/mailmap-self: mailmap: update brian m. carlson's email address	2018-05-23 14:38:10 +09:00
Junio C Hamano	a2cec42213	Merge branch 'sb/object-store-replace' Hotfix. * sb/object-store-replace: get_main_ref_store: BUG() when outside a repository object.c: clear replace map before freeing it replace-object.c: remove the_repository from prepare_replace_object object.c: free replace map in raw_object_store_clear	2018-05-23 14:38:09 +09:00
Junio C Hamano	67b878eba1	Merge branch 'hn/sort-ls-remote' Hotfix. * hn/sort-ls-remote: t5512: run git fetch inside test	2018-05-23 14:38:09 +09:00
Brandon Williams	eaf6a1b6e9	remote-curl: accept compressed responses with protocol v2 Configure curl to accept compressed responses when using protocol v2 by setting `CURLOPT_ENCODING` to "", which indicates that curl should send an "Accept-Encoding" header with all supported compression encodings. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-23 10:24:13 +09:00
Brandon Williams	1a53e692af	remote-curl: accept all encodings supported by curl Configure curl to accept all encodings which curl supports instead of only accepting gzip responses. This fixes an issue when using an installation of curl which is built without the "zlib" feature. Since `aa90b9697` (Enable info/refs gzip decompression in HTTP client, 2012-09-19) we end up requesting "gzip" encoding anyway despite libcurl not being able to decode it. Worse, instead of getting a clear error message indicating so, we end up falling back to "dumb" http, producing a confusing and difficult to debug result. Since curl doesn't do any checking to verify that it supports the a requested encoding, instead set the curl option `CURLOPT_ENCODING` with an empty string indicating that curl should send an "Accept-Encoding" header containing only the encodings supported by curl. Reported-by: Anton Golubev <anton.golubev@gmail.com> Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-23 10:24:12 +09:00
Junio C Hamano	fc54c1af3e	Git 2.17.1 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-22 14:28:26 +09:00
Junio C Hamano	9e84a6d758	Merge branch 'jk/submodule-fsck-loose' into maint * jk/submodule-fsck-loose: fsck: complain when .gitmodules is a symlink index-pack: check .gitmodules files with --strict unpack-objects: call fsck_finish() after fscking objects fsck: call fsck_finish() after fscking objects fsck: check .gitmodules content fsck: handle promisor objects in .gitmodules check fsck: detect gitmodules files fsck: actually fsck blob data fsck: simplify ".git" check index-pack: make fsck error message more specific	2018-05-22 14:26:05 +09:00
Junio C Hamano	68f95b26e4	Sync with Git 2.16.4 * maint-2.16: Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:25:26 +09:00
Junio C Hamano	a42a58d7b6	Git 2.16.4 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-22 14:18:51 +09:00
Junio C Hamano	023020401d	Sync with Git 2.15.2 * maint-2.15: Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:18:06 +09:00
Junio C Hamano	d33c87517a	Git 2.15.2 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-22 14:15:59 +09:00
Junio C Hamano	9e0f06d55d	Sync with Git 2.14.4 * maint-2.14: Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:15:14 +09:00
Junio C Hamano	4dde7b8799	Git 2.14.4 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-22 14:12:02 +09:00
Junio C Hamano	7b01c71b64	Sync with Git 2.13.7 * maint-2.13: Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:10:49 +09:00
Junio C Hamano	0114f71344	Git 2.13.7 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-22 13:50:36 +09:00
Junio C Hamano	8528c31d98	Merge branch 'jk/submodule-fix-loose' into maint-2.13 * jk/submodule-fix-loose: verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 13:48:26 +09:00
Jeff King	b7b1fca175	fsck: complain when .gitmodules is a symlink We've recently forbidden .gitmodules to be a symlink in verify_path(). And it's an easy way to circumvent our fsck checks for .gitmodules content. So let's complain when we see it. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	73c3f0f704	index-pack: check .gitmodules files with --strict Now that the internal fsck code has all of the plumbing we need, we can start checking incoming .gitmodules files. Naively, it seems like we would just need to add a call to fsck_finish() after we've processed all of the objects. And that would be enough to cover the initial test included here. But there are two extra bits: 1. We currently don't bother calling fsck_object() at all for blobs, since it has traditionally been a noop. We'd actually catch these blobs in fsck_finish() at the end, but it's more efficient to check them when we already have the object loaded in memory. 2. The second pass done by fsck_finish() needs to access the objects, but we're actually indexing the pack in this process. In theory we could give the fsck code a special callback for accessing the in-pack data, but it's actually quite tricky: a. We don't have an internal efficient index mapping oids to packfile offsets. We only generate it on the fly as part of writing out the .idx file. b. We'd still have to reconstruct deltas, which means we'd basically have to replicate all of the reading logic in packfile.c. Instead, let's avoid running fsck_finish() until after we've written out the .idx file, and then just add it to our internal packed_git list. This does mean that the objects are "in the repository" before we finish our fsck checks. But unpack-objects already exhibits this same behavior, and it's an acceptable tradeoff here for the same reason: the quarantine mechanism means that pushes will be fully protected. In addition to a basic push test in t7415, we add a sneaky pack that reverses the usual object order in the pack, requiring that index-pack access the tree and blob during the "finish" step. This already works for unpack-objects (since it will have written out loose objects), but we'll check it with this sneaky pack for good measure. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	6e328d6cae	unpack-objects: call fsck_finish() after fscking objects As with the previous commit, we must call fsck's "finish" function in order to catch any queued objects for .gitmodules checks. This second pass will be able to access any incoming objects, because we will have exploded them to loose objects by now. This isn't quite ideal, because it means that bad objects may have been written to the object database (and a subsequent operation could then reference them, even if the other side doesn't send the objects again). However, this is sufficient when used with receive.fsckObjects, since those loose objects will all be placed in a temporary quarantine area that will get wiped if we find any problems. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	1995b5e03e	fsck: call fsck_finish() after fscking objects Now that the internal fsck code is capable of checking .gitmodules files, we just need to teach its callers to use the "finish" function to check any queued objects. With this, we can now catch the malicious case in t7415 with git-fsck. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	ed8b10f631	fsck: check .gitmodules content This patch detects and blocks submodule names which do not match the policy set forth in submodule-config. These should already be caught by the submodule code itself, but putting the check here means that newer versions of Git can protect older ones from malicious entries (e.g., a server with receive.fsckObjects will block the objects, protecting clients which fetch from it). As a side effect, this means fsck will also complain about .gitmodules files that cannot be parsed (or were larger than core.bigFileThreshold). Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	2738744426	fsck: handle promisor objects in .gitmodules check If we have a tree that points to a .gitmodules blob but don't have that blob, we can't check its contents. This produces an fsck error when we encounter it. But in the case of a promisor object, this absence is expected, and we must not complain. Note that this can technically circumvent our transfer.fsckObjects check. Imagine a client fetches a tree, but not the matching .gitmodules blob. An fsck of the incoming objects will show that we don't have enough information. Later, we do fetch the actual blob. But we have no idea that it's a .gitmodules file. The only ways to get around this would be to re-scan all of the existing trees whenever new ones enter (which is expensive), or to somehow persist the gitmodules_found set between fsck runs (which is complicated). In practice, it's probably OK to ignore the problem. Any repository which has all of the objects (including the one serving the promisor packs) can perform the checks. Since promisor packs are inherently about a hierarchical topology in which clients rely on upstream repositories, those upstream repositories can protect all of their downstream clients from broken objects. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	159e7b080b	fsck: detect gitmodules files In preparation for performing fsck checks on .gitmodules files, this commit plumbs in the actual detection of the files. Note that unlike most other fsck checks, this cannot be a property of a single object: we must know that the object is found at a ".gitmodules" path at the root tree of a commit. Since the fsck code only sees one object at a time, we have to mark the related objects to fit the puzzle together. When we see a commit we mark its tree as a root tree, and when we see a root tree with a .gitmodules file, we mark the corresponding blob to be checked. In an ideal world, we'd check the objects in topological order: commits followed by trees followed by blobs. In that case we can avoid ever loading an object twice, since all markings would be complete by the time we get to the marked objects. And indeed, if we are checking a single packfile, this is the order in which Git will generally write the objects. But we can't count on that: 1. git-fsck may show us the objects in arbitrary order (loose objects are fed in sha1 order, but we may also have multiple packs, and we process each pack fully in sequence). 2. The type ordering is just what git-pack-objects happens to write now. The pack format does not require a specific order, and it's possible that future versions of Git (or a custom version trying to fool official Git's fsck checks!) may order it differently. 3. We may not even be fscking all of the relevant objects at once. Consider pushing with transfer.fsckObjects, where one push adds a blob at path "foo", and then a second push adds the same blob at path ".gitmodules". The blob is not part of the second push at all, but we need to mark and check it. So in the general case, we need to make up to three passes over the objects: once to make sure we've seen all commits, then once to cover any trees we might have missed, and then a final pass to cover any .gitmodules blobs we found in the second pass. We can simplify things a bit by loosening the requirement that we find .gitmodules only at root trees. Technically a file like "subdir/.gitmodules" is not parsed by Git, but it's not unreasonable for us to declare that Git is aware of all ".gitmodules" files and make them eligible for checking. That lets us drop the root-tree requirement, which eliminates one pass entirely. And it makes our worst case much better: instead of potentially queueing every root tree to be re-examined, the worst case is that we queue each unique .gitmodules blob for a second look. This patch just adds the boilerplate to find .gitmodules files. The actual content checks will come in a subsequent commit. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	7ac4f3a007	fsck: actually fsck blob data Because fscking a blob has always been a noop, we didn't bother passing around the blob data. In preparation for content-level checks, let's fix up a few things: 1. The fsck_object() function just returns success for any blob. Let's a noop fsck_blob(), which we can fill in with actual logic later. 2. The fsck_loose() function in builtin/fsck.c just threw away blob content after loading it. Let's hold onto it until after we've called fsck_object(). The easiest way to do this is to just drop the parse_loose_object() helper entirely. Incidentally, this also fixes a memory leak: if we successfully loaded the object data but did not parse it, we would have left the function without freeing it. 3. When fsck_loose() loads the object data, it does so with a custom read_loose_object() helper. This function streams any blobs, regardless of size, under the assumption that we're only checking the sha1. Instead, let's actually load blobs smaller than big_file_threshold, as the normal object-reading code-paths would do. This lets us fsck small files, and a NULL return is an indication that the blob was so big that it needed to be streamed, and we can pass that information along to fsck_blob(). Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	ed9c322062	fsck: simplify ".git" check There's no need for us to manually check for ".git"; it's a subset of the other filesystem-specific tests. Dropping it makes our code slightly shorter. More importantly, the existing code may make a reader wonder why ".GIT" is not covered here, and whether that is a bug (it isn't, as it's also covered in the filesystem-specific tests). Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	db5a58c1bd	index-pack: make fsck error message more specific If fsck reports an error, we say only "Error in object". This isn't quite as bad as it might seem, since the fsck code would have dumped some errors to stderr already. But it might help to give a little more context. The earlier output would not have even mentioned "fsck", and that may be a clue that the "fsck." or ".fsckObjects" config may be relevant. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:55:12 -04:00
Jeff King	eedd5949f5	Merge branch 'jk/submodule-name-verify-fix' into jk/submodule-name-verify-fsck * jk/submodule-name-verify-fix: verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_path: drop clever fallthrough skip_prefix: add icase-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests path: match NTFS short names for more .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths Note that this includes two bits of evil-merge: - there's a new call to verify_path() that doesn't actually have a mode available. It should be OK to pass "0" here, since we're just manipulating the untracked cache, not an actual index entry. - the lstat() in builtin/update-index.c:update_one() needs to be updated to handle the fsmonitor case (without this it still behaves correctly, but does an unnecessary lstat).	2018-05-21 23:54:28 -04:00
Jeff King	10ecfa7649	verify_path: disallow symlinks in .gitmodules There are a few reasons it's not a good idea to make .gitmodules a symlink, including: 1. It won't be portable to systems without symlinks. 2. It may behave inconsistently, since Git may look at this file in the index or a tree without bothering to resolve any symbolic links. We don't do this _yet_, but the config infrastructure is there and it's planned for the future. With some clever code, we could make (2) work. And some people may not care about (1) if they only work on one platform. But there are a few security reasons to simply disallow it: a. A symlinked .gitmodules file may circumvent any fsck checks of the content. b. Git may read and write from the on-disk file without sanity checking the symlink target. So for example, if you link ".gitmodules" to "../oops" and run "git submodule add", we'll write to the file "oops" outside the repository. Again, both of those are problems that _could_ be solved with sufficient code, but given the complications in (1) and (2), we're better off just outlawing it explicitly. Note the slightly tricky call to verify_path() in update-index's update_one(). There we may not have a mode if we're not updating from the filesystem (e.g., we might just be removing the file). Passing "0" as the mode there works fine; since it's not a symlink, we'll just skip the extra checks. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Jeff King	eb12dd0c76	update-index: stat updated files earlier In the update_one(), we check verify_path() on the proposed path before doing anything else. In preparation for having verify_path() look at the file mode, let's stat the file earlier, so we can check the mode accurately. This is made a bit trickier by the fact that this function only does an lstat in a few code paths (the ones that flow down through process_path()). So we can speculatively do the lstat() here and pass the results down, and just use a dummy mode for cases where we won't actually be updating the index from the filesystem. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Jeff King	641084b618	verify_dotfile: mention case-insensitivity in comment We're more restrictive than we need to be in matching ".GIT" on case-sensitive filesystems; let's make a note that this is intentional. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Jeff King	e19e5e66d6	verify_path: drop clever fallthrough We check ".git" and ".." in the same switch statement, and fall through the cases to share the end-of-component check. While this saves us a line or two, it makes modifying the function much harder. Let's just write it out. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Jeff King	41a80924ae	skip_prefix: add case-insensitive variant We have the convenient skip_prefix() helper, but if you want to do case-insensitive matching, you're stuck doing it by hand. We could add an extra parameter to the function to let callers ask for this, but the function is small and somewhat performance-critical. Let's just re-implement it for the case-insensitive version. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Johannes Schindelin	dc2d9ba318	is_{hfs,ntfs}_dotgitmodules: add tests This tests primarily for NTFS issues, but also adds one example of an HFS+ issue. Thanks go to Congyi Wu for coming up with the list of examples where NTFS would possibly equate the filename with `.gitmodules`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Johannes Schindelin	e7cb0b4455	is_ntfs_dotgit: match other .git files When we started to catch NTFS short names that clash with .git, we only looked for GIT~1. This is sufficient because we only ever clone into an empty directory, so .git is guaranteed to be the first subdirectory or file in that directory. However, even with a fresh clone, .gitmodules is not necessarily the first file to be written that would want the NTFS short name GITMOD~1: a malicious repository can add .gitmodul0000 and friends, which sorts before `.gitmodules` and is therefore checked out first. For that reason, we have to test not only for ~1 short names, but for others, too. It's hard to just adapt the existing checks in is_ntfs_dotgit(): since Windows 2000 (i.e., in all Windows versions still supported by Git), NTFS short names are only generated in the <prefix>~<number> form up to number 4. After that, a different prefix is used, calculated from the long file name using an undocumented, but stable algorithm. For example, the short name of .gitmodules would be GITMOD~1, but if it is taken, and all of ~2, ~3 and ~4 are taken, too, the short name GI7EBA~1 will be used. From there, collisions are handled by incrementing the number, shortening the prefix as needed (until ~9999999 is reached, in which case NTFS will not allow the file to be created). We'd also want to handle .gitignore and .gitattributes, which suffer from a similar problem, using the fall-back short names GI250A~1 and GI7D29~1, respectively. To accommodate for that, we could reimplement the hashing algorithm, but it is just safer and simpler to provide the known prefixes. This algorithm has been reverse-engineered and described at https://usn.pw/blog/gen/2015/06/09/filenames/, which is defunct but still available via https://web.archive.org/. These can be recomputed by running the following Perl script: -- snip -- use warnings; use strict; sub compute_short_name_hash ($) { my $checksum = 0; foreach (split('', $_[0])) { $checksum = ($checksum * 0x25 + ord($_)) & 0xffff; } $checksum = ($checksum * 314159269) & 0xffffffff; $checksum = 1 + (~$checksum & 0x7fffffff) if ($checksum & 0x80000000); $checksum -= (($checksum * 1152921497) >> 60) * 1000000007; return scalar reverse sprintf("%x", $checksum & 0xffff); } print compute_short_name_hash($ARGV[0]); -- snap -- E.g., running that with the argument ".gitignore" will result in "250a" (which then becomes "gi250a" in the code). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Jeff King	0fc333ba20	is_hfs_dotgit: match other .git files Both verify_path() and fsck match ".git", ".GIT", and other variants specific to HFS+. Let's allow matching other special files like ".gitmodules", which we'll later use to enforce extra restrictions via verify_path() and fsck. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Jeff King	11a9f4d807	is_ntfs_dotgit: use a size_t for traversing string We walk through the "name" string using an int, which can wrap to a negative value and cause us to read random memory before our array (e.g., by creating a tree with a name >2GB, since "int" is still 32 bits even on most 64-bit platforms). Worse, this is easy to trigger during the fsck_tree() check, which is supposed to be protecting us from malicious garbage. Note one bit of trickiness in the existing code: we sometimes assign -1 to "len" at the end of the loop, and then rely on the "len++" in the for-loop's increment to take it back to 0. This is still legal with a size_t, since assigning -1 will turn into SIZE_MAX, which then wraps around to 0 on increment. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00

... 3 4 5 6 7 ...

51818 Commits