git-commit-vandalism

Author	SHA1	Message	Date
Vasil Dimov	8cf51561d1	range-diff: fix a crash in parsing git-log output `git range-diff` calls `git log` internally and tries to parse its output. But `git log` output can be customized by the user in their git config and for certain configurations either an error will be returned by `git range-diff` or it will crash. To fix this explicitly set the output format of the internally executed `git log` with `--pretty=medium`. Because that cancels `--notes`, add explicitly `--notes` at the end. Also, make sure we never crash in the same way - trying to dereference `util` which was never created and has remained NULL. It would happen if the first line of `git log` output does not begin with 'commit '. Alternative considered but discarded - somehow disable all git configs and behave as if no config is present in the internally executed `git log`, but that does not seem to be possible. GIT_CONFIG_NOSYSTEM is the closest to it, but even with that we would still read `.git/config`. Signed-off-by: Vasil Dimov <vd@FreeBSD.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 18:32:47 -07:00
Jeff King	4c5971e18a	credential: treat "?" and "#" in URLs as end of host It's unusual to see: https://example.com?query-parameters without an intervening slash, like: https://example.com/some-path?query-parameters or even: https://example.com/?query-parameters but it is a valid end to the hostname (actually "authority component") according to RFC 3986. Likewise for "#". And curl will parse the URL according to the standard, meaning it will contact example.com, but our credential code would ask about a bogus hostname with a "?" in it. Let's make sure we follow the standard, and more importantly ask about the same hosts that curl will be talking to. It would be nice if we could just ask curl to parse the URL for us. But it didn't grow a URL-parsing API until 7.62, so we'd be stuck with fallback code either way. Plus we'd need this code in the main Git binary, where we've tried to avoid having a link dependency on libcurl. But let's at least fix our parser. Moving to curl's parser would prevent other potential discrepancies, but this gives us immediate relief for the known problem, and would help our fallback code if we eventually use curl. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 10:31:03 -07:00
luciano.rocha@booking.com	312cd76130	freshen_file(): use NULL `times' for implicit current-time Update freshen_file() to use a NULL `times', semantically equivalent to the currently setup, with an explicit `actime' and `modtime' set to the "current time", but with the advantage that it works with other files not owned by the current user. Fixes an issue on shared repos with a split index, where eventually a user's operation creates a shared index, and another user will later do an operation that will try to update its freshness, but will instead raise a warning: $ git status warning: could not freshen shared index '.git/sharedindex.bd736fa10e0519593fefdb2aec253534470865b2' Signed-off-by: Luciano Miguel Ferreira Rocha <luciano.rocha@booking.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 09:57:23 -07:00
Junio C Hamano	efe3874640	Git 2.26.1 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAl57uiUACgkQsLXohpav 5ssUdQ/+MINTr0vCUL6PUvdhDI4S3+GZ8tryZm6yysIzUHbngbGiPsG2I6XkZQb4 uCdMLcuFMn7BZ3fbP0tZeX9GwqJu8RnZ1Nt+Eh8uKc51AklS/5w322Xb/gL6xA3p 2zLXwruGiR5wmlbkFaeQmLggrFqFWknXVqGdpJ7ei/pl1RTfmfdVz/tWXUME/Q6/ thTEbW2ru6yMuR/GGHQs2J6xYlE0de/Yidvlcl3cK9aGN0rpKe/di/mZwBp1jWam icfDoncNpinzEtube9ML1/aG1fCy/aGKe7CYxKz0zEuM/ODCOaCqU7eLza03M5mJ KPjZGBrfGHwVDzlqeVp8MS7uvhHB4pA8wivHKrbEpbaeCtSc66EhtXGyyoPGpP77 aqwcgbExUUTZm0BNWEdPkKZ0IeYiXbabDsFmESVQnMKvjRcwLGZv5tfFIA56yCql Px/pkQSbKz07f6WR1P49EUDfLNONNYFpY9Q7LZo8Iw1FfiARaXwYvcb9/Mt2CGW3 VUxUKUXaz8PkYFbmEwSxSznqjA+5ZDeC5Q0Ntzvq0YghmenLO+fC3yjT4HSVyGUE uaxNygzbqMwCDLRhbOv0RC81thfwgLYr4X6ZAD0GKZqaGdQbvseo+vr9TwqsA+U/ 7lPh8FBJjoGsMfqt6mgLzLSVscussuOaNeojDMTHaSXJDt/psFU= =4jfG -----END PGP SIGNATURE----- Sync with v2.26.1	2020-04-13 18:40:10 -07:00
Johannes Schindelin	176a66a748	t: restrict `is_hidden` to be called only on Windows The function won't work anywhere else, so let's mark it as an explicit bug if it is called on a non-Windows platform. Let's also rename the function to avoid cluttering the global namespace with an overly-generic function name. While at it, we also fix the code comment above that function: the lower-case `windows` refers to something different than `Windows`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:24:40 -07:00
Johannes Schindelin	9814d0a4ad	mingw: make test_path_is_hidden more robust This function uses Windows' system tool `attrib` to determine the state of the hidden flag of a file or directory. We should not actually expect the first `attrib.exe` in the PATH to be the one we are looking for. Or that it is in the PATH, for that matter. Let's use the full path to the tool instead. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:23:25 -07:00
Johannes Schindelin	7c2dfca7e8	t: consolidate the `is_hidden` functions The `is_hidden` function can be used (only on Windows) to determine whether a directory or file have their `hidden` flag set. This function is duplicated between two test scripts. It is better to move it into `test-lib-functions.sh` so that it is reused. This patch is best viewed with `--color-moved`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:23:23 -07:00
Martin Ågren	9881b451f3	merge: use skip_prefix to parse config key Instead of using `starts_with()`, the magic number 7, `strlen()` and a fair number of additions to verify the three parts of the config key "branch.<branch>.mergeoptions", use `skip_prefix()` to jump through them more explicitly. We need to introduce a new variable for this (we certainly can't modify `k` just because we see "branch."!). With `skip_prefix()` we often use quite bland names like `p` or `str`. Let's do the same. If and when this function needs to do more prefix-skipping, we'll have a generic variable ready for this. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:21:12 -07:00
Jonathan Tan	0fcb4f6b62	rebase --merge: optionally skip upstreamed commits When rebasing against an upstream that has had many commits since the original branch was created: O -- O -- ... -- O -- O (upstream) \ -- O (my-dev-branch) it must read the contents of every novel upstream commit, in addition to the tip of the upstream and the merge base, because "git rebase" attempts to exclude commits that are duplicates of upstream ones. This can be a significant performance hit, especially in a partial clone, wherein a read of an object may end up being a fetch. Add a flag to "git rebase" to allow suppression of this feature. This flag only works when using the "merge" backend. This flag changes the behavior of sequencer_make_script(), called from do_interactive_rebase() <- run_rebase_interactive() <- run_specific_rebase() <- cmd_rebase(). With this flag, limit_list() (indirectly called from sequencer_make_script() through prepare_revision_walk()) will no longer call cherry_pick_list(), and thus PATCHSAME is no longer set. Refraining from setting PATCHSAME both means that the intermediate commits in upstream are no longer read (as shown by the test) and means that no PATCHSAME-caused skipping of commits is done by sequencer_make_script(), either directly or through make_script_with_merges(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:15:57 -07:00
Elijah Newren	50ed76148a	rebase: fix an incompatible-options error message When the user specifies the apply backend with options that only work with the merge backend, such as git rebase --apply --exec /bin/true HEAD~3 the error message has always been fatal: --exec requires an interactive rebase This error message is misleading and was one of the reasons we renamed the interactive backend to the merge backend. Update the error message to state that these options merely require use of the merge backend. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:15:54 -07:00
Elijah Newren	b9cbd2958f	rebase: reinstate --no-keep-empty Commit `d48e5e21da` ("rebase (interactive-backend): make --keep-empty the default", 2020-02-15) turned --keep-empty (for keeping commits which start empty) into the default. The logic underpinning that commit was: 1) 'git commit' errors out on the creation of empty commits without an override flag 2) Once someone determines that the override is worthwhile, it's annoying and/or harmful to required them to take extra steps in order to keep such commits around (and to repeat such steps with every rebase). While the logic on which the decision was made is sound, the result was a bit of an overcorrection. Instead of jumping to having --keep-empty being the default, it jumped to making --keep-empty the only available behavior. There was a simple workaround, though, which was thought to be good enough at the time. People could still drop commits which started empty the same way the could drop any commits: by firing up an interactive rebase and picking out the commits they didn't want from the list. However, there are cases where external tools might create enough empty commits that picking all of them out is painful. As such, having a flag to automatically remove start-empty commits may be beneficial. Provide users a way to drop commits which start empty using a flag that existed for years: --no-keep-empty. Interpret --keep-empty as countermanding any previous --no-keep-empty, but otherwise leaving --keep-empty as the default. This might lead to some slight weirdness since commands like git rebase --empty=drop --keep-empty git rebase --empty=keep --no-keep-empty look really weird despite making perfect sense (the first will drop commits which become empty, but keep commits that started empty; the second will keep commits which become empty, but drop commits which started empty). However, --no-keep-empty was named years ago and we are predominantly keeping it for backward compatibility; also we suspect it will only be used rarely since folks already have a simple way to drop commits they don't want with an interactive rebase. Reported-by: Bryan Turner <bturner@atlassian.com> Reported-by: Sami Boukortt <sami@boukortt.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:15:52 -07:00
Elijah Newren	1b5735f75c	rebase -i: mark commits that begin empty in todo editor While many users who intentionally create empty commits do not want them thrown away by a rebase, there are third-party tools that generate empty commits that a user might not want. In the past, users have used rebase to get rid of such commits (a side-effect of the fact that the --apply backend is not currently capable of keeping them). While such users could fire up an interactive rebase and just remove the lines corresponding to empty commits, that might be difficult if the third-party tool generates many of them. Simplify this task for users by marking such lines with a suffix of " # empty" in the todo list. Suggested-by: Sami Boukortt <sami@boukortt.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-11 14:15:49 -07:00
Jeff King	348482dede	config: reject parsing of files over INT_MAX While the last few commits have made it possible for the config parser to handle config files up to the limits of size_t, the rest of the code isn't really ready for this. In particular, we often feed the keys as strings into printf "%s" format specifiers. And because the printf family of functions must return an int to specify the result, they complain. Here are two concrete examples (using glibc; we're in uncharted territory here so results may vary): Generate a gigantic .gitmodules file like this: git submodule add /some/other/repo foo { printf '[submodule "' perl -e 'print "a" x 2**31' echo '"]path = foo' } >.gitmodules git commit -m 'huge gitmodule' then try this: $ git show BUG: strbuf.c:397: your vsnprintf is broken (returned -1) The problem is that we end up calling: strbuf_addf(&sb, "submodule.%s.ignore", submodule_name); which relies on vsnprintf(), and that function has no way to report back a size larger than INT_MAX. Taking that same file, try this: git config --file=.gitmodules --list --name-only On my system it produces an output with exactly 4GB of spaces. I confirmed in a debugger that we reach the config callback with the key intact: it's 2147483663 bytes and full of a's. But when we print it with this call: printf("%s%c", key_, term); we just get the spaces. So given the fact that these are insane cases which we have no need to support, the weird behavior from feeding the results to printf even if the code is careful, and the possibility of uncareful code introducing its own integer truncation issues, let's just declare INT_MAX as a limit for parsing config files. We'll enforce the limit in get_next_char(), which generalizes over all sources (blobs, files, etc) and covers any element we're parsing (whether section, key, value, etc). For simplicity, the limit is over the length of the _whole_ file, so you couldn't have two 1GB values in the same file. This should be perfectly fine, as the expected size for config files is generally kilobytes at most. With this patch both cases above will yield: fatal: bad config line 1 in file .gitmodules That's not an amazing error message, but the parser isn't set up to provide specific messages (it just breaks out of the parsing loop and gives that generic error even if see a syntactic issue). And we really wouldn't expect to see this case outside of somebody maliciously probing the limits of the config system. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 14:58:21 -07:00
Jeff King	6a9c235eb4	config: use size_t to store parsed variable baselen Most of the config parsing infrastructure is limited in what it can parse only by the size of memory, because it parses character by character, building up strbufs for keys, values, etc. One exception is the "baselen" value we keep in git_parse_source(), which is an int. That stores the length of the section.subsection base, to which we can then append individual key names (by truncating back to the baselen with strbuf_setlen(), and then appending characters for the key name). But because it's an int, if we see an absurdly long section or subsection, we may overflow the integer, wrapping negative. That negative value is then implicitly cast to a size_t when we pass it to strbuf_setlen(), creating a very large value and triggering a BUG. For example: $ { printf '[foo "' perl -e 'print "a" x 231' echo '"]bar = value' } >huge $ git config --file=huge --list fatal: BUG: strbuf_setlen() beyond buffer While this is obviously a silly case that we don't care about supporting, it's worth fixing it by switching to a size_t for a few reasons: - we should try to avoid hitting BUG assertions at all - avoiding integer truncation or overflow sets a good example and makes it easier to audit the code for more important issues - the BUG outcome is what happens in _this_ instance, because we wrap negative. If we used a 232 subsection, we'd wrap to a small positive value and actually generate wrong output (the subsection of our key would be truncated). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 14:56:45 -07:00
Jeff King	f011a9654d	git_config_parse_key(): return baselen as size_t As with the recent change to parse_config_key(), the best type to return a string length is a size_t, as it won't cause integer truncation for a gigantic key. And as with that change, this is mostly a clarity / hygiene issue for now, as our config parser would choke on such a large key anyway. There are a few ripple effects within the config code, as callers switch to using size_t. I also adjusted a few related variables that iterate over strings. The most unexpected change is that a call to strbuf_addf() had to switch to strbuf_add(). We can't use a size_t with "%.*s", because printf precisions must have type "int" (we could cast, of course, but that would miss the point of using size_t in the first place). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 14:52:22 -07:00
Jeff King	6c7e6963c1	config: drop useless length variable in write_pair() We compute the length of a subset of a string, but then use that length only to feed a "%.*s" printf placeholder for the same string. We can just use "%s" to achieve the same thing. The variable became useless in `cb891a5989` (Use a strbuf for building up section header and key/value pair strings., 2007-12-14), which swapped out a write() which _did_ use the length for a strbuf_addf() call. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 14:47:36 -07:00
Jeff King	f5914f4b6b	parse_config_key(): return subsection len as size_t We return the length to a subset of a string using an "int *" out-parameter. This is fine most of the time, as we'd expect config keys to be relatively short, but it could behave oddly if we had a gigantic config key. A more appropriate type is size_t. Let's switch over, which lets our callers use size_t as appropriate (they are bound by our type because they must pass the out-parameter as a pointer). This is mostly just a cleanup to make it clear this code handles long strings correctly. In practice, our config parser already chokes on long key names (because of a similar int/size_t mixup!). When doing an int/size_t conversion, we have to be careful that nobody was trying to assign a negative value to the variable. I manually confirmed that for each case here. They tend to just feed the result to xmemdupz() or similar; in a few cases I adjusted the parameter types for helper functions to make sure the size_t is preserved. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 14:44:29 -07:00
Jeff King	021ba32a7b	remote: drop auto-strlen behavior of make_branch() and make_rewrite() The make_branch() and make_rewrite() functions can take a NUL-terminated string or a ptr/len pair. They use a sentinel value of "0" for the len to tell the difference between the two. However, when parsing config like: [branch ""] merge = whatever whose key flattens to: branch..merge we might actually have a zero-length branch name. This is obviously nonsense, but the current code would consider it as a NUL-terminated string and use the branch name ".merge". We could use a better sentinel value here (like "-1"), but that gets in the way of moving to size_t, which is a more appropriate type for a ptr/len combo. Let's instead just drop this feature and have the callers (of which there are only two total) use strlen() themselves. This simplifies the code, and lets us move to using size_t. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 14:43:36 -07:00
Nathan Sanders	23eafd924a	mingw: cope with the Isilon network file system On certain network filesystems (currently encountered with Isilon, but in theory more network storage solutions could be causing the same issue), when the directory in question is missing, `raceproof_create_file()` fails with an `ERROR_INVALID_PARAMETER` instead of an `ERROR_PATH_NOT_FOUND`. Since it is highly unlikely that we produce such an error by mistake (the parameters we pass are fairly benign), we can be relatively certain that the directory is missing in this instance. So let's just translate that error automagically. This fixes https://github.com/git-for-windows/git/issues/1345. Signed-off-by: Nathan Sanders <spekbukkem@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 10:34:05 -07:00
마누엘	1f09aed834	interactive: explicitly `fflush` stdout before expecting input At least one interactive command writes a prompt to `stdout` and then reads user input on `stdin`: `git clean --interactive`. If the prompt is left in the buffer, the user will not realize the program is waiting for their input. So let's just flush `stdout` before reading the user's input. Signed-off-by: 마누엘 <nalla@hamal.uberspace.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 10:27:16 -07:00
Johannes Schindelin	08d383f23e	interactive: refactor code asking the user for interactive input There are quite a few code locations (e.g. `git clean --interactive`) where Git asks the user for an answer. In preparation for fixing a bug shared by all of them, and also to DRY up the code, let's refactor it. Please note that most of these callers trimmed white-space both at the beginning and at the end of the answer, instead of trimming only the end (as the caller in `add-patch.c` does). Therefore, technically speaking, we change behavior in this patch. At the same time, it can be argued that this is actually a bug fix. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 10:26:31 -07:00
Johannes Schindelin	3efc128cd5	mingw: help debugging by optionally executing bash with strace MSYS2's strace facility is very useful for debugging... With this patch, the bash will be executed through strace if the environment variable GIT_STRACE_COMMANDS is set, which comes in real handy when investigating issues in the test suite. Also support passing a path to a log file via GIT_STRACE_COMMANDS to force Git to call strace.exe with the `-o <path>` argument, i.e. to log into a file rather than print the log directly. That comes in handy when the output would otherwise misinterpreted by a calling process as part of Git's output. Note: the values "1", "yes" or "true" are not specifying paths, but tell Git to let strace.exe log directly to the console. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 10:21:07 -07:00
Derrick Stolee	8d049e182e	revision: --show-pulls adds helpful merges The default file history simplification of "git log -- <path>" or "git rev-list -- <path>" focuses on providing the smallest set of commits that first contributed a change. The revision walk greatly restricts the set of walked commits by visiting only the first TREESAME parent of a merge commit, when one exists. This means that portions of the commit-graph are not walked, which can be a performance benefit, but can also "hide" commits that added changes but were ignored by a merge resolution. The --full-history option modifies this by walking all commits and reporting a merge commit as "interesting" if it has _any_ parent that is not TREESAME. This tends to be an over-representation of important commits, especially in an environment where most merge commits are created by pull request completion. Suppose we have a commit A and we create a commit B on top that changes our file. When we merge the pull request, we create a merge commit M. If no one else changed the file in the first-parent history between M and A, then M will not be TREESAME to its first parent, but will be TREESAME to B. Thus, the simplified history will be "B". However, M will appear in the --full-history mode. However, suppose that a number of topics T1, T2, ..., Tn were created based on commits C1, C2, ..., Cn between A and M as follows: A----C1----C2--- ... ---Cn----M------P1---P2--- ... ---Pn \ \ \ \ / / / / \ \__.. \ \/ ..__T1 / Tn \ \__.. /\ ..__T2 / \_____________________B \____________________/ If the commits T1, T2, ... Tn did not change the file, then all of P1 through Pn will be TREESAME to their first parent, but not TREESAME to their second. This means that all of those merge commits appear in the --full-history view, with edges that immediately collapse into the lower history without introducing interesting single-parent commits. The --simplify-merges option was introduced to remove these extra merge commits. By noticing that the rewritten parents are reachable from their first parents, those edges can be simplified away. Finally, the commits now look like single-parent commits that are TREESAME to their "only" parent. Thus, they are removed and this issue does not cause issues anymore. However, this also ends up removing the commit M from the history view! Even worse, the --simplify-merges option requires walking the entire history before returning a single result. Many Git users are using Git alongside a Git service that provides code storage alongside a code review tool commonly called "Pull Requests" or "Merge Requests" against a target branch. When these requests are accepted and merged, they typically create a merge commit whose first parent is the previous branch tip and the second parent is the tip of the topic branch used for the request. This presents a valuable order to the parents, but also makes that merge commit slightly special. Users may want to see not only which commits changed a file, but which pull requests merged those commits into their branch. In the previous example, this would mean the users want to see the merge commit "M" in addition to the single- parent commit "C". Users are even more likely to want these merge commits when they use pull requests to merge into a feature branch before merging that feature branch into their trunk. In some sense, users are asking for the "first" merge commit to bring in the change to their branch. As long as the parent order is consistent, this can be handled with the following rule: Include a merge commit if it is not TREESAME to its first parent, but is TREESAME to a later parent. These merges look like the merge commits that would result from running "git pull <topic>" on a main branch. Thus, the option to show these commits is called "--show-pulls". This has the added benefit of showing the commits created by closing a pull request or merge request on any of the Git hosting and code review platforms. To test these options, extend the standard test example to include a merge commit that is not TREESAME to its first parent. It is surprising that that option was not already in the example, as it is instructive. In particular, this extension demonstrates a common issue with file history simplification. When a user resolves a merge conflict using "-Xours" or otherwise ignoring one side of the conflict, they create a TREESAME edge that probably should not be TREESAME. This leads users to become frustrated and complain that "my change disappeared!" In my experience, showing them history with --full-history and --simplify-merges quickly reveals the problematic merge. As mentioned, this option is expensive to compute. The --show-pulls option _might_ show the merge commit (usually titled "resolving conflicts") more quickly. Of course, this depends on the user having the correct parent order, which is backwards when using "git pull master" from a topic branch. There are some special considerations when combining the --show-pulls option with --simplify-merges. This requires adding a new PULL_MERGE object flag to store the information from the initial TREESAME comparisons. This helps avoid dropping those commits in later filters. This is covered by a test, including how the parents can be simplified. Since "struct object" has already ruined its 32-bit alignment by using 33 bits across parsed, type, and flags member, let's not make it worse. PULL_MERGE is used in revision.c with the same value (1u<<15) as REACHABLE in commit-graph.c. The REACHABLE flag is only used when writing a commit-graph file, and a revision walk using --show-pulls does not happen in the same process. Care must be taken in the future to ensure this remains the case. Update Documentation/rev-list-options.txt with significant details around this option. This requires updating the example in the History Simplification section to demonstrate some of the problems with TREESAME second parents. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 09:58:55 -07:00
Jeff King	0220461071	repository: mark the "refs" pointer as private The "refs" pointer in a struct repository starts life as NULL, but then is lazily initialized when it is accessed via get_main_ref_store(). However, it's easy for calling code to forget this and access it directly, leading to code which works _some_ of the time, but fails if it is called before anybody else accesses the refs. This was the cause of the bug fixed by `5ff4b920eb` (sha1-name: do not assume that the ref store is initialized, 2020-04-09). In order to prevent similar bugs, let's more clearly mark the "refs" field as private. In addition to helping future code, the name change will help us audit any existing direct uses. Besides get_main_ref_store() itself, it turns out there is only one. But we know it's OK as it is on the line directly after the fix from `5ff4b920eb`, which will have initialized the pointer. However it's still a good idea for it to model the proper use of the accessing function, so we'll convert it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-09 22:40:48 -07:00
brian m. carlson	edc6dccf81	builtin/receive-pack: use constant-time comparison for HMAC value When we're comparing a push cert nonce, we currently do so using strcmp. Most implementations of strcmp short-circuit and exit as soon as they know whether two values are equal. This, however, is a problem when we're comparing the output of HMAC, as it leaks information in the time taken about how much of the two values match if they do indeed differ. In our case, the nonce is used to prevent replay attacks against our server via the embedded timestamp and replay attacks using requests from a different server via the HMAC. Push certs, which contain the nonces, are signed, so an attacker cannot tamper with the nonces without breaking validation of the signature. They can, of course, create their own signatures with invalid nonces, but they can also create their own signatures with valid nonces, so there's nothing to be gained. Thus, there is no security problem. Even though it doesn't appear that there are any negative consequences from the current technique, for safety and to encourage good practices, let's use a constant time comparison function for nonce verification. POSIX does not provide one, but they are easy to write. The technique we use here is also used in NaCl and the Go standard library and relies on the fact that bitwise or and xor are constant time on all known architectures. We need not be concerned about exiting early if the actual and expected lengths differ, since the standard cryptographic assumption is that everyone, including an attacker, knows the format of and algorithm used in our nonces (and in any event, they have the source code and can determine it easily). As a result, we assume everyone knows how long our nonces should be. This philosophy is also taken by the Go standard library and other cryptographic libraries when performing constant time comparisons on HMAC values. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-09 18:17:27 -07:00
Junio C Hamano	5ff4b920eb	sha1-name: do not assume that the ref store is initialized `c931ba4e` (sha1-name.c: remove the_repo from handle_one_ref(), 2019-04-16) replaced the use of for_each_ref() helper, which works with the main ref store of the default repository instance, with refs_for_each_ref(), which can work on any ref store instance, by assuming that the repository instance the function is given has its ref store already initialized. But it is possible that nobody has initialized it, in which case, the code ends up dereferencing a NULL pointer. Reported-by: Érico Rolim <erico.erc@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-09 17:22:20 -07:00
Martin Ågren	76ba7fa225	config.txt: move closing "----" to cover entire listing Commit `1925fe0c8a` ("Documentation: wrap config listings in "----"", 2019-09-07) wrapped this fairly large block of example config directives in "----". The closing "----" ended up a few lines too early though. Make sure to include the trailing "IncludeIf.onbranch:..." example, too. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-09 11:04:01 -07:00
Johannes Schindelin	7723436149	stash -p: (partially) fix bug concerning split hunks When trying to stash part of the worktree changes by splitting a hunk and then only partially accepting the split bits and pieces, the user is presented with a rather cryptic error: error: patch failed: <file>:<line> error: test: patch does not apply Cannot remove worktree changes and the command would fail to stash the desired parts of the worktree changes (even if the `stash` ref was actually updated correctly). We even have a test case demonstrating that failure, carrying it for four years already. The explanation: when splitting a hunk, the changed lines are no longer separated by more than 3 lines (which is the amount of context lines Git's diffs use by default), but less than that. So when staging only part of the diff hunk for stashing, the resulting diff that we want to apply to the worktree in reverse will contain those changes to be dropped surrounded by three context lines, but since the diff is relative to HEAD rather than to the worktree, these context lines will not match. Example time. Let's assume that the file README contains these lines: We the people and the worktree added some lines so that it contains these lines instead: We are the kind people and the user tries to stash the line containing "are", then the command will internally stage this line to a temporary index file and try to revert the diff between HEAD and that index file. The diff hunk that `git stash` tries to revert will look somewhat like this: @@ -1776,3 +1776,4 We +are the people It is obvious, now, that the trailing context lines overlap with the part of the original diff hunk that the user did not want to stash. Keeping in mind that context lines in diffs serve the primary purpose of finding the exact location when the diff does not apply precisely (but when the exact line number in the file to be patched differs from the line number indicated in the diff), we work around this by reducing the amount of context lines: the diff was just generated. Note: this is not a full fix for the issue. Just as demonstrated in t3701's 'add -p works with pathological context lines' test case, there are ambiguities in the diff format. It is very rare in practice, of course, to encounter such repeated lines. The full solution for such cases would be to replace the approach of generating a diff from the stash and then applying it in reverse by emulating `git revert` (i.e. doing a 3-way merge). However, in `git stash -p` it would not apply to `HEAD` but instead to the worktree, which makes this non-trivial to implement as long as we also maintain a scripted version of `add -i`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-08 12:17:59 -07:00
Johannes Schindelin	121c0d4151	t3904: fix incorrect demonstration of a bug In `7e9e048661` (stash -p: demonstrate failure of split with mixed y/n, 2015-04-16), a regression test for a known breakage that was added to the test script `t3904-stash-patch.sh` that demonstrated that splitting a hunk and trying to stash only part of that split hunk fails (but shouldn't). As expected, it still fails, but for the wrong reason: once the bug is fixed, we would expect stderr to show nothing, yet the regression test expects stderr to show something. Let's fix that by telling that regression test case to expect nothing to be printed to stderr. While at it, also drop the obvious left-over from debugging where the regression test did not mind `git stash -p` to return a non-zero exit status. Of course, the regression test still fails, but this time for the correct reason. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-08 12:17:58 -07:00
Johannes Schindelin	b6852e1979	mingw: do not treat `COM0` as a reserved file name In `4dc42c6c18` (mingw: refuse paths containing reserved names, 2019-12-21), we started disallowing file names that are reserved, e.g. `NUL`, `CONOUT$`, etc. This included `COM<n>` where `<n>` is a digit. Unfortunately, this includes `COM0` but only `COM1`, ..., `COM9` are reserved, according to the official documentation, `COM0` is mentioned in the "NT Namespaces" section but it is explicitly _omitted_ from the list of reserved names: https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions Tests corroborate this: it is totally possible to write a file called `com0.c` on Windows 10, but not `com1.c`. So let's tighten the code to disallow only the reserved `COM<n>` file names, but to allow `COM0` again. This fixes https://github.com/git-for-windows/git/issues/2470. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-08 12:15:51 -07:00
Matthias Aßhauer	a748f3f3dc	mingw: use modern strftime implementation if possible Microsoft introduced a new "Universal C Runtime Library" (UCRT) with Visual Studio 2015. The UCRT comes with a new strftime() implementation that supports more date formats. We link git against the older "Microsoft Visual C Runtime Library" (MSVCRT), so to use the UCRT strftime() we need to load it from ucrtbase.dll using DECLARE_PROC_ADDR()/INIT_PROC_ADDR(). Most supported Windows systems should have recieved the UCRT via Windows update, but in some cases only MSVCRT might be available. In that case we fall back to using that implementation. With this change, it is possible to use e.g. the `%g` and `%V` date format specifiers, e.g. git show -s --format=%cd --date=format:‘%g.%V’ HEAD Without this change, the user would see this error message on Windows: fatal: invalid strftime format: '‘%g.%V’' This fixes https://github.com/git-for-windows/git/issues/2495 Signed-off-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-08 12:15:50 -07:00
Johannes Schindelin	826f0c0df2	subtree: fix build with AsciiDoctor 2 This is a (late) companion for `f6461b82b9` (Documentation: fix build with Asciidoctor 2, 2019-09-15). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-08 12:10:36 -07:00
Emma Brooks	19d097e3d7	format-patch: teach --no-encode-email-headers When commit subjects or authors have non-ASCII characters, git format-patch Q-encodes them so they can be safely sent over email. However, if the patch transfer method is something other than email (web review tools, sneakernet), this only serves to make the patch metadata harder to read without first applying it (unless you can decode RFC 2047 in your head). git am as well as some email software supports non-Q-encoded mail as described in RFC 6531. Add --[no-]encode-email-headers and format.encodeEmailHeaders to let the user control this behavior. Signed-off-by: Emma Brooks <me@pluvano.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 22:37:18 -07:00
Jonathan Tan	95acf11a3d	diff: restrict when prefetching occurs Commit `7fbbcb21b1` ("diff: batch fetching of missing blobs", 2019-04-08) optimized "diff" by prefetching blobs in a partial clone, but there are some cases wherein blobs do not need to be prefetched. In these cases, any command that uses the diff machinery will unnecessarily fetch blobs. diffcore_std() may read blobs when it calls the following functions: (1) diffcore_skip_stat_unmatch() (controlled by the config variable diff.autorefreshindex) (2) diffcore_break() and diffcore_merge_broken() (for break-rewrite detection) (3) diffcore_rename() (for rename detection) (4) diffcore_pickaxe() (for detecting addition/deletion of specified string) Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(), diffcore_break(), and diffcore_rename() to prefetch blobs upon the first read of a missing object. This covers (1), (2), and (3): to cover the rest, teach diffcore_std() to prefetch if the output type is one that includes blob data (and hence blob data will be required later anyway), or if it knows that (4) will be run. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 16:09:29 -07:00
Jonathan Tan	c14b6f83ec	diff: refactor object read Refactor the object reads in diff_populate_filespec() to have the first object read not be in an if/else branch, because in a future patch, a retry will be added to that first object read. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 16:09:29 -07:00
Jonathan Tan	1c37e86ab2	diff: make diff_populate_filespec_options struct The behavior of diff_populate_filespec() currently can be customized through a bitflag, but a subsequent patch requires it to support a non-boolean option. Replace the bitflag with an options struct. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 16:09:29 -07:00
Elijah Newren	1f6965f994	sequencer: honor GIT_REFLOG_ACTION There is a lot of code to honor GIT_REFLOG_ACTION throughout git, including some in sequencer.c; unfortunately, reflog_message() and its callers ignored it. Instruct reflog_message() to check the existing environment variable, and use it when present as an override to action_name(). Also restructure pick_commits() to only temporarily modify GIT_REFLOG_ACTION for a short duration and then restore the old value, so that when we do this setting within a loop we do not keep adding " (pick)" substrings and end up with a reflog message of the form rebase (pick) (pick) (pick) (finish): returning to refs/heads/master Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 15:10:11 -07:00
Damien Robert	acbfae32a3	doc: --recurse-submodules mostly applies to active submodules The documentation refers to "initialized" or "populated" submodules, to explain which submodules are affected by '--recurse-submodules', but the real terminology here is 'active' submodules. Update the documentation accordingly. Some terminology: - Active is defined in gitsubmodules(7), it only involves the configuration variables 'submodule.active', 'submodule.<name>.active' and 'submodule.<name>.url'. The function submodule.c::is_submodule_active checks that a submodule is active. - Populated means that the submodule's working tree is present (and the gitfile correctly points to the submodule repository), i.e. either the superproject was cloned with ` --recurse-submodules`, or the user ran `git submodule update --init`, or `git submodule init [<path>]` and `git submodule update [<path>]` separately which populated the submodule working tree. This does not involve the 3 configuration variables above. - Initialized (at least in the context of the man pages involved in this patch) means both "populated" and "active" as defined above, i.e. what `git submodule update --init` does. The --recurse-submodules option mostly affects active submodules. An exception is `git fetch` where the option affects populated submodules. As a consequence, in `git pull --recurse-submodules` the fetch affects populated submodules, but the resulting working tree update only affects active submodules. In the documentation of `git-pull`, let's distinguish between the fetching part which affects populated submodules, and the updating of worktrees, which only affects active submodules. Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com> Helped-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 13:42:43 -07:00
Damien Robert	4da9e99e6e	doc: be more precise on (fetch\|push).recurseSubmodules The default value also depends on the value of submodule.recurse. Use this opportunity to correct some grammar mistakes in Documentation/config/fetch.txt signaled by Robert P. J. Day. Also mention `fetch.recurseSubmodules` in fetch-options.txt. In git-push.txt, `push.recurseSubmodules` is implicitly mentioned (by explaining how to disable it), so no need to add it there. Lastly add a link to `git-fetch` in `git-pull.txt` to explain the meaning of `--recurse-submodules` there. Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 13:42:43 -07:00
Damien Robert	d09bc51428	doc: explain how to deactivate submodule.recurse completely Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 13:42:43 -07:00
Damien Robert	b3cec57338	doc: document --recurse-submodules for reset and restore Also unify the formulation about --no-recurse-submodules for checkout and switch, which we reuse for restore. And correct the formulation about submodules' HEAD in read-tree, which we reuse in reset. Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 13:42:43 -07:00
Damien Robert	dd0cb7dffa	doc: list all commands affected by submodule.recurse Note that `ls-files` is not affected, even though it has a `--recurse-submodules` option, so list it as an exception too. Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 13:42:43 -07:00
Jeff King	d8410a816b	fast-import: replace custom hash with hashmap.c We use a custom hash in fast-import to store the set of objects we've imported so far. It has a fixed set of 2^16 buckets and chains any collisions with a linked list. As the number of objects grows larger than that, the load factor increases and we degrade to O(n) lookups and O(n^2) insertions. We can scale better by using our hashmap.c implementation, which will resize the bucket count as we grow. This does incur an extra memory cost of 8 bytes per object, as hashmap stores the integer hash value for each entry in its hashmap_entry struct (which we really don't care about here, because we're just reusing the embedded object hash). But I think the numbers below justify this (and our per-object memory cost is already much higher). I also looked at using khash, but it seemed to perform slightly worse than hashmap at all sizes, and worse even than the existing code for small sizes. It's also awkward to use here, because we want to look up a "struct object_entry" from a "struct object_id", and it doesn't handle mismatched keys as well. Making a mapping of object_id to object_entry would be more natural, but that would require pulling the embedded oid out of the object_entry or incurring an extra 32 bytes per object. In a synthetic test creating as many cheap, tiny objects as possible perl -e ' my $bits = shift; my $nr = 2**$bits; for (my $i = 0; $i < $nr; $i++) { print "blob\n"; print "data 4\n"; print pack("N", $i); } ' $bits \| git fast-import I got these results: nr_objects master khash hashmap 2^20 0m4.317s 0m5.109s 0m3.890s 2^21 0m10.204s 0m9.702s 0m7.933s 2^22 0m27.159s 0m17.911s 0m16.751s 2^23 1m19.038s 0m35.080s 0m31.963s 2^24 4m18.766s 1m10.233s 1m6.793s which points to hashmap as the winner. We didn't have any perf tests for fast-export or fast-import, so I added one as a more real-world case. It uses an export without blobs since that's significantly cheaper than a full one, but still is an interesting case people might use (e.g., for rewriting history). It will emphasize this change in some ways (as a percentage we spend more time making objects and less shuffling blob bytes around) and less in others (the total object count is lower). Here are the results for linux.git: Test HEAD^ HEAD ---------------------------------------------------------------------------- 9300.1: export (no-blobs) 67.64(66.96+0.67) 67.81(67.06+0.75) +0.3% 9300.2: import (no-blobs) 284.04(283.34+0.69) 198.09(196.01+0.92) -30.3% It only has ~5.2M commits and trees, so this is a larger effect than I expected (the 2^23 case above only improved by 50s or so, but here we gained almost 90s). This is probably due to actually performing more object lookups in a real import with trees and commits, as opposed to just dumping a bunch of blobs into a pack. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 13:41:24 -07:00
Philippe Blain	a44088435c	pull doc: correct outdated description of an example Since `f269048754` (fetch: opportunistically update tracking refs, 2013-05-11), the underlying `git fetch` in `git pull <remote> <branch>` updates the configured remote-tracking branch for <branch>. However, an example in the 'Examples' section of the `git pull` documentation still states that this is not the case. Correct the description of this example. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-05 15:00:04 -07:00
Philippe Blain	f6a65de621	pull doc: refer to a specific section in 'fetch' doc The documentation for the `<refspec>` parameter in the `git fetch` documentation refers to the section "CONFIGURED REMOTE-TRACKING BRANCHES" in this same documentation page. In the `git pull` documentation, let's also refer specifically to this section instead of just linking to the `git fetch` documentation. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-05 15:00:03 -07:00
Johannes Schindelin	a1aba0c95c	t0007: fix a typo Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-05 14:51:13 -07:00
Elijah Newren	f7139e7cc2	git-rebase.txt: add another hook to the hooks section, and explain more For more discussion about these hooks, their history relative to rebase, and logical consistency between different types of operations, see https://lore.kernel.org/git/CABPp-BG0bFKUage5cN_2yr2DkmS04W2Z9Pg5WcROqHznV3XBdw@mail.gmail.com/ and the links to some threads referenced therein. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-05 14:49:16 -07:00
Đoàn Trần Công Danh	5c5bac173d	Documentation: document merge option --no-gpg-sign Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-03 11:37:22 -07:00
Đoàn Trần Công Danh	4369d3a000	Documentation: merge commit-tree --[no-]gpg-sign Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-03 11:37:22 -07:00
Đoàn Trần Công Danh	9da37fe11e	Documentation: reword commit --no-gpg-sign Merge with --gpg-sign option, and clarify that --no-gpg-sign also override earlier --gpg-sign. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-03 11:37:22 -07:00

1 2 3 4 5 ...

59117 Commits