git-commit-vandalism

Author	SHA1	Message	Date
Elijah Newren	0e29222e0c	Documentation: call out commands that nuke untracked files/directories Some commands have traditionally also removed untracked files (or directories) that were in the way of a tracked file we needed. Document these cases. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	94b7f1563a	Comment important codepaths regarding nuking untracked files/dirs In the last few commits we focused on code in unpack-trees.c that mistakenly removed untracked files or directories. There may be more of those, but in this commit we change our focus: callers of toplevel commands that are expected to remove untracked files or directories. As noted previously, we have toplevel commands that are expected to delete untracked files such as 'read-tree --reset', 'reset --hard', and 'checkout --force'. However, that does not mean that other highlevel commands that happen to call these other commands thought about or conveyed to users the possibility that untracked files could be removed. Audit the code for such callsites, and add comments near existing callsites to mention whether these are safe or not. My auditing is somewhat incomplete, though; it skipped several cases: * git-rebase--preserve-merges.sh: is in the process of being deprecated/removed, so I won't leave a note that there are likely more bugs in that script. * contrib/git-new-workdir: why is the -f flag being used in a new empty directory?? It shouldn't hurt, but it seems useless. * git-p4.py: Don't see why -f is needed for a new dir (maybe it's not and is just superfluous), but I'm not at all familiar with the p4 stuff * git-archimport.perl: Don't care; arch is long since dead * git-cvs*.perl: Don't care; cvs is long since dead Also, the reset --hard in builtin/worktree.c looks safe, due to only running in an empty directory. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	56d06fe4aa	unpack-trees: avoid nuking untracked dir in way of locally deleted file Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	1fdd51aa13	unpack-trees: avoid nuking untracked dir in way of unmerged file Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	480d3d6bf9	Change unpack_trees' 'reset' flag into an enum Traditionally, unpack_trees_options->reset was used to signal that it was okay to delete any untracked files in the way. This was used by `git read-tree --reset`, but then started appearing in other places as well. However, many of the other uses should not be deleting untracked files in the way. Change this value to an enum so that a value of 1 (i.e. "true") can be split into two: UNPACK_RESET_PROTECT_UNTRACKED and UNPACK_RESET_OVERWRITE_UNTRACKED. In order to catch accidental misuses (i.e. where folks call it the way they traditionally used to), define the special enum value of UNPACK_RESET_INVALID = 1 which will trigger a BUG(). Modify existing callers so that 'read-tree --reset', 'reset --hard', and 'checkout --force' continue using the UNPACK_RESET_OVERWRITE_UNTRACKED logic, while other callers, including 'am', 'checkout' without --force, 'stash' (though currently dead code; reset always had a value of 0), and numerous callers from rebase/sequencer to reset_head(), will use the new UNPACK_RESET_PROTECT_UNTRACKED value. Also, note that it has been reported that 'git checkout <treeish> <pathspec>' currently also allows overwriting untracked files[1]. That case should also be fixed, but it does not use unpack_trees() and thus is outside the scope of the current changes. [1] https://lore.kernel.org/git/15dad590-087e-5a48-9238-5d2826950506@gmail.com/ Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	1b5f37334a	Remove ignored files by default when they are in the way Change several commands to remove ignored files by default when they are in the way. Since some commands (checkout, merge) take a --no-overwrite-ignore option to allow the user to configure this, and it may make sense to add that option to more commands (and in the case of merge, actually plumb that configuration option through to more of the backends than just the fast-forwarding special case), add little comments about where such flags would be used. Incidentally, this fixes a test failure in t7112. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	c42e0b6409	unpack-trees: make dir an internal-only struct Avoid accidental misuse or confusion over ownership by clearly making unpack_trees_options.dir an internal-only variable. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	04988c8d18	unpack-trees: introduce preserve_ignored to unpack_trees_options Currently, every caller of unpack_trees() that wants to ensure ignored files are overwritten by default needs to: * allocate unpack_trees_options.dir * flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags * call setup_standard_excludes AND then after the call to unpack_trees() needs to * call dir_clear() * deallocate unpack_trees_options.dir That's a fair amount of boilerplate, and every caller uses identical code. Make this easier by instead introducing a new boolean value where the default value (0) does what we want so that new callers of unpack_trees() automatically get the appropriate behavior. And move all the handling of unpack_trees_options.dir into unpack_trees() itself. While preserve_ignored = 0 is the behavior we feel is the appropriate default, we defer fixing commands to use the appropriate default until a later commit. So, this commit introduces several locations where we manually set preserve_ignored=1. This makes it clear where code paths were previously preserving ignored files when they should not have been; a future commit will flip these to instead use a value of 0 to get the behavior we want. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	491a7575f1	read-tree, merge-recursive: overwrite ignored files by default This fixes a long-standing patchwork of ignored files handling in read-tree and merge-recursive, called out and suggested by Junio long ago. Quoting from commit `dcf0c16ef1` ("core.excludesfile clean-up" 2007-11-16): git-read-tree takes --exclude-per-directory=<gitignore>, not because the flexibility was needed. Again, this was because the option predates the standardization of the ignore files. ... On the other hand, I think it makes perfect sense to fix git-read-tree, git-merge-recursive and git-clean to follow the same rule as other commands. I do not think of a valid use case to give an exclude-per-directory that is nonstandard to read-tree command, outside a "negative" test in the t1004 test script. This patch is the first step to untangle this mess. The next step would be to teach read-tree, merge-recursive and clean (in C) to use setup_standard_excludes(). History shows each of these were partially or fully fixed: * clean was taught the new trick in `1617adc7a0` ("Teach git clean to use setup_standard_excludes()", 2007-11-14). * read-tree was primarily used by checkout & merge scripts. checkout and merge later became builtins and were both fixed to use the new setup_standard_excludes() handling in `fc001b526c` ("checkout,merge: loosen overwriting untracked file check based on info/exclude", 2011-11-27). So the primary users were fixed, though read-tree itself was not. * merge-recursive has now been replaced as the default merge backend by merge-ort. merge-ort fixed this by using setup_standard_excludes() starting early in its implementation; see commit `6681ce5cf6` ("merge-ort: add implementation of checkout()", 2020-12-13), largely due to its design depending on checkout() and thus being influenced by the checkout code. 
However, merge-recursive itself was not fixed here, in part because its design meant it had difficulty differentiating between untracked files, ignored files, leftover tracked files that haven't been removed yet due to order of processing files, and files written by itself due to collisions. Make the conversion more complete by now handling read-tree and handling at least the unpack_trees() portion of merge-recursive. While merge-recursive is on its way out, fixing the unpack_trees() portion is easy and facilitates some of the later changes in this series. Note that fixing read-tree makes the --exclude-per-directory option to read-tree useless, so we remove it from the documentation (though we continue to accept it if passed). The read-tree changes happen to fix a bug in t1013. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Elijah Newren	c512d27e78	checkout, read-tree: fix leak of unpack_trees_options.dir Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:38:37 -07:00
Carlo Marcelo Arenas Belón	2d84c4ed57	lazyload.h: use an even more generic function pointer than FARPROC gcc will helpfully raise a -Wcast-function-type warning when casting between functions that might have incompatible return types (ex: GetUserNameExW returns bool which is only half the size of the return type from FARPROC which is long long), so create a new type that could be used as a completely generic function pointer and cast through it instead. Additionally remove the -Wno-incompatible-pointer-types temporary flag added in `27e0c3c` (win32: allow building with pedantic mode enabled, 2021-09-03), as it will no longer be needed. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 13:13:58 -07:00
Jeff King	67985e4e4a	refs: drop "broken" flag from for_each_fullref_in() No callers pass in anything but "0" here. Likewise to our sibling functions. Note that some of them ferry along the flag, but none of their callers pass anything but "0" either. Nor is anybody likely to change that. Callers which really want to see all of the raw refs use for_each_rawref(). And anybody interested in iterating a subset of the refs will likely be happy to use the now-default behavior of showing broken refs, but omitting dangling symlinks. So we can get rid of this whole feature. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	2d653c5036	ref-filter: drop broken-ref code entirely Now that none of our callers passes the INCLUDE_BROKEN flag, we can drop it entirely, along with the code to plumb it through to the for_each_fullref_in() functions. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	1763334caf	ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN Of the ref-filter callers, for-each-ref and git-branch both set the INCLUDE_BROKEN flag (but git-tag does not, which is a weird inconsistency). But now that GIT_REF_PARANOIA is on by default, that produces almost the same outcome for all three. The one exception is that GIT_REF_PARANOIA will omit dangling symrefs. That's a better behavior for these tools, as they would never include such a symref in the main output anyway (they can't, as it doesn't point to an object). Instead they issue a warning to stderr. But that warning is somewhat useless; a dangling symref is a perfectly reasonable thing to have in your repository, and is not a sign of corruption. It's much friendlier to just quietly ignore it. And in terms of robustness, the warning gains us little. It does not impact the exit code of either tool. So while the warning _might_ clue in a user that they have an unexpected broken symref, it would not help any kind of scripted use. This patch converts for-each-ref and git-branch to stop using the INCLUDE_BROKEN flag. That gives them more reasonable behavior, and harmonizes them with git-tag. We have to change one test to adapt to the situation. t1430 tries to trigger all of the REF_ISBROKEN behaviors from the underlying ref code. It uses for-each-ref to do so (because there isn't any other mechanism). That will no longer issue a warning about the symref which points to an invalid name, as it's considered dangling (and we can instead be sure that it's _not_ mentioned on stderr). Note that we do still complain about the illegally named "broken..symref"; its problem is not that it's dangling, but the name of the symref itself is illegal. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	5d1f5b8cd4	repack, prune: drop GIT_REF_PARANOIA settings Now that GIT_REF_PARANOIA is the default, we don't need to selectively enable it for destructive operations. In fact, it's harmful to do so, because it overrides any GIT_REF_PARANOIA=0 setting that the user may have provided (because they're trying to work around some corruption). With these uses gone, we can further clean up the ref_paranoia global, and make it a static variable inside the refs code. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	968f12fdac	refs: turn on GIT_REF_PARANOIA by default The original point of the GIT_REF_PARANOIA flag was to include broken refs in iterations, so that possibly-destructive operations would not silently ignore them (and would generally instead try to operate on the oids and fail when the objects could not be accessed). We already turned this on by default for some dangerous operations, like "repack -ad" (where missing a reachability tip would mean dropping the associated history). But it was not on for general use, even though it could easily result in the spreading of corruption (e.g., imagine cloning a repository which simply omits some of its refs because their objects are missing; the result quietly succeeds even though you did not clone everything!). This patch turns on GIT_REF_PARANOIA by default. So a clone as mentioned above would actually fail (upload-pack tells us about the broken ref, and when we ask for the objects, pack-objects fails to deliver them). This may be inconvenient when working with a corrupted repository, but: - we are better off to err on the side of complaining about corruption, and then provide mechanisms for explicitly loosening safety. - this is only one type of corruption anyway. If we are missing any other objects in the history that _aren't_ ref tips, then we'd behave similarly (happily show the ref, but then barf when we started traversing). We retain the GIT_REF_PARANOIA variable, but simply default it to "1" instead of "0". That gives the user an escape hatch for loosening this when working with a corrupt repository. It won't work across a remote connection to upload-pack (because we can't necessarily set environment variables on the remote), but there the client has other options (e.g., choosing which refs to fetch). As a bonus, this also makes ref iteration faster in general (because we don't have to call has_object_file() for each ref), though probably not noticeably so in the general case. 
In a repo with a million refs, it shaved a few hundred milliseconds off of upload-pack's advertisement; that's noticeable, but most repos are not nearly that large. The possible downside here is that any operation which iterates refs but doesn't ever open their objects may now quietly claim to have X when the object is corrupted (e.g., "git rev-list new-branch --not --all" will treat a broken ref as uninteresting). But again, that's not really any different than corruption below the ref level. We might have refs/heads/old-branch as non-corrupt, but we are not actively checking that we have the entire reachable history. Or the pointed-to object could even be corrupted on-disk (but our "do we have it" check would still succeed). In that sense, this is merely bringing ref-corruption in line with general object corruption. One alternative implementation would be to actually check for broken refs, and then _immediately die_ if we see any. That would cause the "rev-list --not --all" case above to abort immediately. But in many ways that's the worst of all worlds: - it still spends time looking up the objects an extra time - it still doesn't catch corruption below the ref level - it's even more inconvenient; with the current implementation of GIT_REF_PARANOIA for something like upload-pack, we can make the advertisement and let the client choose a non-broken piece of history. If we bail as soon as we see a broken ref, they cannot even see the advertisement. The test changes here show some of the fallout. A non-destructive "git repack -adk" now fails by default (but we can override it). Deleting a broken ref now actually tells the hooks the correct "before" state, rather than a confusing null oid. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	6d751be4b6	refs: omit dangling symrefs when using GIT_REF_PARANOIA Dangling symrefs aren't actually a corruption problem. It's perfectly fine for refs/remotes/origin/HEAD to point to an unborn branch. And in particular, if you are trying to establish reachability, a symref that points nowhere doesn't matter either way. Any ref it could point to will be examined during the rest of the traversal. It's possible that a symref pointing nowhere _could_ be a sign that the ref it was meant to point to was deleted accidentally (e.g., via corruption). But there is no particular reason to think that is true for any given case, and in the meantime, GIT_REF_PARANOIA kicking in automatically for some operations means they'll fail unnecessarily. So let's loosen it just a bit. The new test in t5312 shows off an example that is safe, but currently fails (and no longer does after this patch). Note that we don't do anything if the caller explicitly asked for DO_FOR_EACH_INCLUDE_BROKEN. In that case they may be looking for dangling symrefs themselves, and setting GIT_REF_PARANOIA should not _loosen_ things from what the caller asked for. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	8dccb2244c	refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag When the DO_FOR_EACH_INCLUDE_BROKEN flag is used, we include not only actual corrupt refs (illegal names, missing objects), but also symrefs that point to nothing. This latter is not really a corruption, but just something that may happen normally. For example, the symref at refs/remotes/origin/HEAD may point to a tracking branch which is later deleted. (The local HEAD may also be unborn, of course, but we do not access it through ref iteration). Most callers of for_each_ref() etc, do not care. They don't pass INCLUDE_BROKEN, so don't see it at all. But for those which do pass it, this somewhat-normal state causes extra warnings (e.g., from for-each-ref) or even aborts operations (destructive repacks with GIT_REF_PARANOIA set). This patch just introduces the flag and the mechanism; there are no callers yet (and hence no tests). Two things to note on the implementation: - we actually skip any symref that does not resolve to a ref. This includes ones which point to an invalidly-named ref. You could argue this is a more serious breakage than simple dangling. But the overall effect is the same (we could not follow the symref), as well as the impact on things like REF_PARANOIA (either way, a symref we can't follow won't impact reachability, because we'll see the ref itself during iteration). The underlying resolution function doesn't distinguish these two cases (they both get REF_ISBROKEN). - we change the iterator in refs/files-backend.c where we check INCLUDE_BROKEN. There's a matching spot in refs/packed-backend.c, but we don't need to do anything there. The packed backend does not support symrefs at all. The resulting set of flags might be a bit easier to follow if we broke this down into "INCLUDE_CORRUPT_REFS" and "INCLUDE_DANGLING_SYMREFS". 
But there are a few reasons not to do so: - adding a new OMIT_DANGLING_SYMREFS flag lets us leave existing callers intact, without changing their behavior (and some of them really do want to see the dangling symrefs; e.g., t5505 has a test which expects us to report when a symref becomes dangling) - they're not actually independent. You cannot say "include dangling symrefs" without also including refs whose objects are not reachable, because dangling symrefs by definition do not have an object. We could tweak the implementation to distinguish this, but in practice nobody wants to ask for that. Adding the OMIT flag keeps the implementation simple and makes sure we don't regress the current behavior. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	9aab952e85	refs-internal.h: reorganize DO_FOR_EACH_* flag documentation The documentation for the DO_FOR_EACH_* flags is sprinkled over the refs-internal.h file. We define the two flags in one spot, and then describe them in more detail far away from there, in the definitions of refs_ref_iterator_begin() and ref_iterator_advance_fn(). Let's try to organize this a bit better: - convert the #defines to an enum. This makes it clear that they are related, and that the enum shows the complete set of flags. - combine all descriptions for each flag in a single spot, next to the flag's definition - use the enum rather than a bare int for functions which take the flags. This helps readers realize which flags can be used. - clarify the mention of flags for ref_iterator_advance_fn(). It does not take flags itself, but is meant to depend on ones set up earlier. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	bf708add2e	refs-internal.h: move DO_FOR_EACH_* flags next to each other There are currently two DO_FOR_EACH_* flags, which must not have their bits overlap. Yet they're defined hundreds of lines apart. Let's move them next to each other to make it clear that they are related and are a complete set (which matters if you are adding a new flag and would like to know what the next available bit is). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	5b062e1f79	t5312: be more assertive about command failure When repacking or pruning in a corrupted repository, our tests in t5312 argue that it is OK to complete the operation or bail, as long as we don't actually delete the objects pointed to by the corruption. This isn't a wrong line of reasoning, but the tests are a bit permissive by using test_might_fail. The fact is that we _do_ bail currently, and if we ever stopped doing so, that would be worthy of a human investigating. So let's switch these to test_must_fail. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	078eecbcbe	t5312: test non-destructive repack In t5312, we create a state with a broken ref, and then make sure that destructive repacks don't silently ignore the breakage (where a destructive repack is one that might drop objects). But we don't check the behavior of non-destructive repacks at all (i.e., ones where we'd keep unreachable objects). So let's add a test to confirm the current behavior, which is that they are allowed (i.e., ignoring the breakage and considering any objects it points to as unreachable). This may change in the future, but we'd like for the test suite to alert us to that fact. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:45 -07:00
Jeff King	f805844676	t5312: create bogus ref as necessary Some tests in t5312 create an illegally-named ref, and then see how various operations handle it. But between those operations, we also do some more setup (e.g., repacking), and we are subtly depending on how those setup steps react to the illegal ref. To future-proof us against those behaviors changing, let's instead create and clean up our bogus ref on demand in the tests that need it. This has two small extra advantages: - the tests are more stand-alone; we do not need an extra test to clean up the ref before moving on to other parts of the script - the creation and cleanup is together in one helper function. Because these depend on touching the refs in the filesystem directly, they may need to be tweaked for a world with alternate backends (they have not been noticed so far in the reftable work because with a non-file backend the tests don't fail; they simply become uninteresting noops because the broken ref isn't read at all). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:44 -07:00
Jeff King	2ac0cbc9b0	t5312: drop "verbose" helper t5312 has several uses of the "verbose" helper, as described in `8ad1652418` (t5304: use helper to report failure of "test foo = bar", 2014-10-10). Back then the "-x" trace option for tests was new, and was not as pleasant to use (e.g., some tests failed under "-x", we did not support BASH_XTRACEFD, etc). These days it is clear that "-x" is the preferred way to get extra output, and we don't need to mark up individual tests. Let's get rid of the uses of "verbose" here, as one step toward eradicating it totally. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:44 -07:00
Jeff King	da5e0c6a00	t5600: provide detached HEAD for corruption failures When checking how git-clone behaves when it fails, we simulate some failures by trying to do a clone from a local repository whose objects have been removed. Because these clones use local optimizations, there's a subtle dependency in how the corruption is handled on the sending side. If upload-pack does not show us the broken refs (which it does not currently), then we see only HEAD (which is itself broken), and clone that as a detached HEAD. When we try to write the ref, we notice that we never got the object and bail. But if upload-pack _does_ show us the broken refs (which it may in a future patch), then we'll realize that HEAD is a symref and just write that. You'd think we'd fail when writing out the refs themselves, but we don't; we do a bulk write and skip the connectivity check because of our --local optimizations. For the non-bare case, we do notice the problem when we try to checkout. But for a bare repository, we unexpectedly complete the clone successfully! At first glance this may seem like a bug. But the whole point of those local optimizations is to give up some safety for speed. If you want to be careful, you should be using "--no-local", which would notice that the pack did not transfer sufficient objects. We could do that in these tests, but part of the point is for them to fail at specific moments (and indeed, we have a later test that checks for transport failure). However, we can make this less subtle and future-proof it against changes on the upload-pack side by just having an explicit detached HEAD in the corrupted repo. Now we'll fail as expected during the ref write if any ref _or_ HEAD is corrupt, whether we're --bare or not. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:44 -07:00
Jeff King	e9de7a52a5	t5516: don't use HEAD ref for invalid ref-deletion tests A few tests in t5516 want to assert that we can delete a corrupted ref whose pointed-to object is missing. They do so by using the "main" branch, which is also pointed to by HEAD. This does work, but only because of a subtle assumption about the implementation. We do not block the deletion because of the invalid ref, but we _also_ do not notice that the deleted branch is pointed to by HEAD. And so the safety rule of "do not allow HEAD to be deleted in a non-bare repository" does not kick in, and the test passes. Let's instead use a non-HEAD branch. That still tests what we care about here (deleting a corrupt ref), but without implicitly depending on our failure to notice that we're deleting HEAD. That will future proof the test against that behavior changing. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:44 -07:00
Jeff King	b4724242fa	t7900: clean up some more broken refs The "incremental-repack task" test replaces the object directory with a known state. As a result, some of our refs point to objects that are not included in that state. Commit `3cf5f221be` (t7900: clean up some broken refs, 2021-01-19) cleaned up some of those (that were causing warnings to stderr from the maintenance process). But there are a few more that were missed. These aren't hurting anything for now, but it's certainly an unexpected state to leave the test repository in, and it will become a problem if repack ever gets more picky about broken refs. Let's clean up those additional refs (which are all in refs/remotes, with nothing there that isn't broken), and add an extra "for-each-ref" call to assert that we've got everything. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 12:36:44 -07:00
Ævar Arnfjörð Bjarmason	3e8084f188	http: check CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors Change the error shown when a http.pinnedPubKey doesn't match to point to the http.pinnedPubKey variable added in `aeff8a6121` (http: implement public key pinning, 2016-02-15), e.g.: git -c http.pinnedPubKey=sha256/someNonMatchingKey ls-remote https://github.com/git/git.git fatal: unable to access 'https://github.com/git/git.git/' with http.pinnedPubkey configuration: SSL: public key does not match pinned public key! Before this we'd emit the exact same thing without the " with http.pinnedPubkey configuration". The advantage of doing this is that we're going to get a translated message (everything after the ":" is hardcoded in English in libcurl), and we've got a reference to the git-specific configuration variable that's causing the error. Unfortunately we can't test this easily, as there are no tests that require https:// in the test suite, and t/lib-httpd.sh doesn't know how to set up such tests. See [1] for the start of a discussion about what it would take to have divergent "t/lib-httpd/apache.conf" test setups. #leftoverbits 1. https://lore.kernel.org/git/YUonS1uoZlZEt+Yd@coredump.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 10:58:07 -07:00
Andrzej Hunt	44d2aec6e8	connect: also update offset for features without values parse_feature_value() takes an offset, and uses it to seek past the point in features_list that we've already seen. However if the feature being searched for does not specify a value, the offset is not updated. Therefore if we call parse_feature_value() in a loop on a value-less feature, we'll keep on parsing the same feature over and over again. This usually isn't an issue: there's no point in using next_server_feature_value() to search for repeated instances of the same capability unless that capability typically specifies a value - but a broken server could send a response that omits the value for a feature even when we are expecting a value. Therefore we add an offset update calculation for the no-value case, which helps ensure that loops using next_server_feature_value() will always terminate. next_server_feature_value(), and the offset calculation, were first added in 2.28 in `2c6a403d96` (connect: add function to parse multiple v1 capability values, 2020-05-25). Thanks to Peff for authoring the test. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Andrzej Hunt <andrzej@ahunt.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 10:34:41 -07:00
Ævar Arnfjörð Bjarmason	cfe853e66b	hook-list.h: add a generated list of hooks, like config-list.h Make githooks(5) the source of truth for what hooks git supports, and punt out early on hooks we don't know about in find_hook(). This ensures that the documentation and the C code's idea about existing hooks doesn't diverge. We still have Perl and Python code running its own hooks, but that'll be addressed by Emily Shaffer's upcoming "git hook run" command. This resolves a long-standing TODO item in bugreport.c of there being no centralized listing of hooks, and fixes a bug with the bugreport listing only knowing about 1/4 of the p4 hooks. It didn't know about the recent "reference-transaction" hook either. We could make the find_hook() function die() or BUG() out if the new known_hook() returned 0, but let's make it return NULL just as it does when it can't find a hook of a known type. Making it die() is overly anal, and unlikely to be what we need in catching stupid typos in the name of some new hook hardcoded in git.git's sources. By making this be tolerant of unknown hook names, changes in a later series to make "git hook run" run arbitrary user-configured hook names will be easier to implement. I have not been able to directly test the CMake change being made here. Since `4c2c38e800` (ci: modification of main.yml to use cmake for vs-build job, 2020-06-26) some of the Windows CI has a hard dependency on CMake, this change works there, and is to my eyes an obviously correct use of a pattern established in previous CMake changes, namely: - `061c2240b1` (Introduce CMake support for configuring Git, 2020-06-12) - `709df95b78` (help: move list_config_help to builtin/help, 2020-04-16) - `976aaedca0` (msvc: add a Makefile target to pre-generate the Visual Studio solution, 2019-07-29) The LC_ALL=C is needed because at least in my locale the dash ("-") is ignored for the purposes of sorting, which results in a different order. I'm not aware of anything in git that has a hard dependency on the order, but e.g. the bugreport output would end up using whatever locale was in effect when git was compiled. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 09:44:54 -07:00
Ævar Arnfjörð Bjarmason	07a348e746	hook.c users: use "hook_exists()" instead of "find_hook()" Use the new hook_exists() function instead of find_hook() where the latter was called in boolean contexts. This makes subsequent changes in a series where we further refactor the hook API clearer, as we won't conflate wanting to get the path of the hook with checking for its existence. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 09:44:54 -07:00
Emily Shaffer	330155ed8a	hook.c: add a hook_exists() wrapper and use it in bugreport.c Add a boolean version of the find_hook() function for those callers who are only interested in checking whether the hook exists, not what the path to it is. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 09:44:54 -07:00
Ævar Arnfjörð Bjarmason	5e3aba33da	hook.[ch]: move find_hook() from run-command.c to hook.c Move the find_hook() function from run-command.c to a new hook.c library. This change establishes a stub library that's pretty pointless right now, but will see much wider use with Emily Shaffer's upcoming "configuration-based hooks" series. Eventually all the hook related code will live in hook.[ch]. Let's start that process by moving the simple find_hook() function over as-is. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 09:44:54 -07:00
Johannes Sixt	d2c470f9bc	lazyload.h: fix warnings about mismatching function pointer types Here, GCC warns about every use of the INIT_PROC_ADDR macro, for example: In file included from compat/mingw.c:8: compat/mingw.c: In function 'mingw_strftime': compat/win32/lazyload.h:38:12: warning: assignment to 'size_t (*)(char *, size_t, const char *, const struct tm *)' {aka 'long long unsigned int (*)(char *, long long unsigned int, const char *, const struct tm *)'} from incompatible pointer type 'FARPROC' {aka 'long long int (*)()'} [-Wincompatible-pointer-types] 38 | (function = get_proc_addr(&proc_addr_##function)) | ^ compat/mingw.c:1014:6: note: in expansion of macro 'INIT_PROC_ADDR' 1014 | if (INIT_PROC_ADDR(strftime)) | ^~~~~~~~~~~~~~ (message wrapped for convenience). Insert a cast to keep the compiler happy. A cast is fine in these cases because they are generic function pointer values that have been looked up in a DLL. Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 09:31:59 -07:00
Derrick Stolee	ca267aee15	t3705: test that 'sparse_entry' is unstaged The tests in t3705-add-sparse-checkout.sh check to see how 'git add' behaves with paths outside the sparse-checkout definition. These currently check to see if a given warning is present but not that the index is not updated with the sparse entries. Add a new 'test_sparse_entry_unstaged' helper to be sure 'git add' is behaving correctly. We need to modify setup_sparse_entry to actually commit the sparse_entry file so it exists at HEAD and as an entry in the index, but its exact contents are not staged in the index. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-24 11:43:56 -07:00
Elijah Newren	446cc5544a	t2500: add various tests for nuking untracked files Noting that unpack_trees treats reset=1 & update=1 as license to nuke untracked files, I looked for code paths that use this combination and tried to generate testcases which demonstrated unintentional loss of untracked files and directories. I found several. I also include testcases for `git reset --{hard,merge,keep}`. A hard reset is perhaps the most direct test of unpack_tree's reset=1 behavior, but we cannot make `git reset --hard` preserve untracked files without some migration work. Also, the two commands `checkout --force` (because of the --force) and `read-tree --reset` (because it's plumbing and we need to keep it backward compatible) were left out as we expect those to continue removing untracked files and directories. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-24 09:24:25 -07:00
René Scharfe	8c6b4332b4	packfile: release bad_objects in close_pack() Unusable entries of a damaged pack file are recorded in the oidset bad_objects. Release it when we're done with the pack. This doesn't affect intact packs because an empty oidset requires no allocation. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-24 09:22:46 -07:00
Phillip Wood	2b88fe0603	rebase: fix todo-list rereading `54fd3243da` ("rebase -i: reread the todo list if `exec` touched it", 2017-04-26) sought to reread the todo list after running an exec command only if it had been changed. To accomplish this it checks the stat data of the todo list after running an exec command to see if it has changed. Unfortunately there are two problems: firstly, the implementation is buggy and we actually reread the list after each exec, which is quadratic in the number of commit lookups; secondly, the design is predicated on using nanosecond time stamps, which are not the default. The implementation bug stems from the fact that we write a new todo list to disk before running each command but do not update the stat data to reflect this[1]. The design problem is that it is possible for the user to edit the todo list without changing its size or inode, which means we have to rely on the mtime to tell us if it has changed. Unfortunately, unless git is built with USE_NSEC it is possible for the original and edited list to share the same mtime. Ideally "git rebase --edit-todo" would set a flag that we would then check in sequencer.c. Unfortunately this approach will not work, as there are scripts in the wild that write to the todo list directly without running "git rebase --edit-todo". Instead of relying on stat data this patch simply reads the possibly edited todo list and compares it to the original with memcmp(). This is much faster than reparsing the todo list each time. This patch reduces the time to run git rebase -r -xtrue v2.32.0~100 v2.32.0 which runs 419 exec commands by 6.6%. For comparison fixing the implementation bug in stat based approach reduces the time by a further 1.4% and is indistinguishable from never rereading the todo list. [1] https://lore.kernel.org/git/20191125131833.GD23183@szeder.dev/ Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-24 08:56:28 -07:00
Phillip Wood	dfa8bae5a2	sequencer.c: factor out a function This code is heavily indented and obscures the high level logic within the loop. Let's move it to its own function before modifying it in the next commit. Note that there is a subtle change in behavior if the todo list cannot be reread. Previously todo_list->current was incremented before returning, now it returns immediately. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-24 08:56:27 -07:00
Eric Wong	0d0d8d8a11	doc/technical: update note about core.multiPackIndex MIDX files are used by default since commit `18e449f86b` (midx: enable core.multiPackIndex by default, 2020-09-25) Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-24 08:39:53 -07:00
Ævar Arnfjörð Bjarmason	f53df0bdf6	Makefile: remove an out-of-date comment This comment added in `dfea575017` (Makefile: lazily compute header dependencies, 2010-01-26) has been out of date since `92b88eba9f` (Makefile: use `git ls-files` to list header files, if possible, 2019-03-04), when we did exactly what it tells us not to do and added $(GENERATED_H) to $(OBJECTS) dependencies. The rest of it was also somewhere between inaccurate and outdated, since as of `b8ba629264` (Makefile: fold MISC_H into LIB_H, 2012-06-20) it's not followed by a list of header files, that got moved earlier in the file into LIB_H in `60d24dd255` (Makefile: fold XDIFF_H and VCSSVN_H into LIB_H, 2012-07-06). Let's just remove it entirely, to the extent that we have anything useful to say here the comment on the "USE_COMPUTED_HEADER_DEPENDENCIES" variable a few lines above this change does the job for us. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-23 15:06:47 -07:00
Ævar Arnfjörð Bjarmason	7c81295382	Makefile: don't perform "mv $@+ $@" dance for $(GENERATED_H) Change the "cmd.sh > $@+ && mv $@+ $@" pattern used for generating the config-list.h and command-list.h to just "cmd.sh >$@". This was needed as a guard to ensure that we don't have an empty file if the script failed, but since `7b76d6bf22` (Makefile: add and use the ".DELETE_ON_ERROR" flag, 2021-06-29) GNU make ensures that doesn't happen. There's still a lot of other places in the Makefile where we needlessly use this pattern, but I'm just changing these because I'm about to add a new $(GENERATED_H) target, let's have them all look and act the same way. Even with ".DELETE_ON_ERROR" there is still a point to using the "mv $@+ $@" pattern in some cases, e.g. to ensure that you have a working binary during recompilation (see [1] for the start of a long discussion about that), but that doesn't apply here. Nothing external uses $(GENERATED_H) directly, it's only ever used in the context of the Makefile's own dependency (re-)generation. 1. https://lore.kernel.org/git/8735t93h0u.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-23 15:06:47 -07:00
Ævar Arnfjörð Bjarmason	7c3c0a99cc	Makefile: stop hardcoding {command,config}-list.h Change various places that hardcode the names of these two files to refer to either $(GENERATED_H), or to a new generated-hdrs target. That target is consistent with the *-objs targets I recently added in `029bac01a8` (Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets, 2021-02-23). A subsequent commit will add a new generated hook-list.h. By doing this refactoring we'll only need to add the new file to the GENERATED_H variable, not EXCEPT_HDRS, the vcbuild/README etc. Hardcoding command-list.h there seems to have been a case of copy/paste programming in `976aaedca0` (msvc: add a Makefile target to pre-generate the Visual Studio solution, 2019-07-29). The config-list.h was added later in `709df95b78` (help: move list_config_help to builtin/help, 2020-04-16). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-23 15:06:47 -07:00
Ævar Arnfjörð Bjarmason	ea47e59fe3	Makefile: mark "check" target as .PHONY Fix a bug in `44c9e8594e` (Fix up header file dependencies and add sparse checking rules, 2005-07-03), we never marked the phony "check" target as such. Perhaps we should just remove it, since as of a combination of `912f9980d2` (Makefile: help people who run 'make check' by mistake, 2008-11-11) and `0bcd9ae85d` (sparse: Fix errors due to missing target-specific variables, 2011-04-21) we've been suggesting the user run "make sparse" directly. But under that mode it still does something, as well as directing the user to run "make test" under non-sparse. So let's punt that and narrowly fix the PHONY bug. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-23 15:06:47 -07:00
Ævar Arnfjörð Bjarmason	f188160be9	bundle: remove ignored & undocumented "--verbose" flag In `73c3253d75` (bundle: framework for options before bundle file, 2019-11-10) the "git bundle" command was refactored to use parse_options(). In that refactoring it started understanding the "--verbose" flag before the subcommand, e.g.: git bundle --verbose verify --quiet However, nothing ever did anything with this "verbose" variable, and the change wasn't documented. It appears to have been something that escaped the lab, and wasn't flagged by reviewers at the time. Let's just remove it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-23 15:03:48 -07:00
Junio C Hamano	ddb1055343	The eighth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-23 13:45:03 -07:00
Junio C Hamano	b1b065ee35	Merge branch 'rs/use-xopen-in-index-pack' Code clean-up. * rs/use-xopen-in-index-pack: index-pack: use xopen in init_thread	2021-09-23 13:44:50 -07:00
Junio C Hamano	d1e376d2f9	Merge branch 'kz/revindex-comment-fix' Header comment fix. * kz/revindex-comment-fix: pack-revindex.h: correct the time complexity descriptions	2021-09-23 13:44:49 -07:00
Junio C Hamano	50eb005eb3	Merge branch 'cb/plug-leaks-in-alloca-emu-users' Leakfix. * cb/plug-leaks-in-alloca-emu-users: t0000: avoid masking git exit value through pipes tree-diff: fix leak when not HAVE_ALLOCA_H	2021-09-23 13:44:49 -07:00
Junio C Hamano	f7511fdfbd	Merge branch 'jt/submodule-name-to-gitdir' Code refactoring. * jt/submodule-name-to-gitdir: submodule: extract path to submodule gitdir func	2021-09-23 13:44:49 -07:00

64823 Commits