git-commit-vandalism

Author	SHA1	Message	Date
Jeff King	f3e76ed228	strbuf_readlink: use ssize_t The return type of readlink() is ssize_t, not int. This probably doesn't matter in practice, as it would require a 2GB symlink destination, but it doesn't hurt to be careful. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-24 10:19:29 -07:00
Jeff King	26114c00be	strbuf: use size_t for length in intermediate variables A few strbuf functions store the length of a strbuf in a temporary variable. We should always use size_t for this, as it's possible for a strbuf to exceed an "int" (e.g., a 2GB string on a 64-bit system). This is unlikely in practice, but we should try to behave sensibly on silly or malicious input. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-24 10:19:29 -07:00
Jeff King	c7d017d7e1	reencode_string: use size_t for string lengths The iconv interface takes a size_t, which is the appropriate type for an in-memory buffer. But our reencode_string_* functions use integers, meaning we may get confusing results when the sizes exceed INT_MAX. Let's use size_t consistently. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-24 10:19:29 -07:00
Jeff King	77aa03d6c7	reencode_string: use st_add/st_mult helpers When converting a string with iconv, if the output buffer isn't big enough, we grow it. But our growth is done without any concern for integer overflow. So when we add: outalloc = sofar + insz * 2 + 32; we may end up wrapping outalloc (which is a size_t), and allocating a too-small buffer. We then manipulate it further: outsz = outalloc - sofar - 1; and feed outsz back to iconv. If outalloc is wrapped and smaller than sofar, we'll end up with a small allocation but feed a very large outsz to iconv, which could result in it overflowing the buffer. Can we use this to construct an attack wherein the victim clones a repository with a very large commit object with an encoding header, and running "git log" reencodes it into utf8, causing an overflow? An attack of this sort is likely impossible in practice. "sofar" is how many output bytes we've written total, and "insz" is the number of input bytes remaining. Imagine our input doubles in size as we output it (which is easy to do by converting latin1 to utf8, for example), and that we start with N input bytes. Our initial output buffer also starts at N bytes, so after the first call we'd have N/2 input bytes remaining (insz), and have written N bytes (sofar). That means our next allocation will be (N + N/2 * 2 + 32) bytes, or (2N + 32). We can therefore overflow a 32-bit size_t with a commit message that's just under 2^31 bytes, assuming it consists mostly of "doubling" sequences (e.g., latin1 0xe1 which becomes utf8 0xc3 0xa1). But we'll never make it that far with such a message. We'll be spending 2^31 bytes on the original string. And our initial output buffer will also be 2^31 bytes. Which is not going to succeed on a system with a 32-bit size_t, since there will be other things using the address space, too. The initial malloc will fail. 
If we imagine instead that we can triple the size when converting, then our second allocation becomes (N + 2/3N * 2 + 32), or (7/3N + 32). That still requires two allocations of 3/7 of our address space (6/7 of the total) to succeed. If we imagine we can quadruple, it becomes (5/2N + 32); we need to be able to allocate 4/5 of the address space to succeed. This might start to get plausible. But is it possible to get a 4-to-1 increase in size? Probably if you're converting to some obscure encoding. But since git defaults to utf8 for its output, that's the likely destination encoding for an attack. And while there are 4-character utf8 sequences, it's unlikely that you'd be able to find a single-byte source sequence in any encoding. So this is certainly buggy code which should be fixed, but it is probably not a useful attack vector. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-24 10:19:29 -07:00
Junio C Hamano	b20a3cbb88	Merge branch 'sb/blame-color' into jk/banned-function * sb/blame-color: blame: prefer xsnprintf to strcpy for colors	2018-07-24 09:05:35 -07:00
Jonathan Tan	2b554353a5	fetch: send "refs/tags/" prefix upon CLI refspecs When performing tag following, in addition to using the server's "include-tag" capability to send tag objects (and emulating it if the server does not support that capability), "git fetch" relies upon the presence of refs/tags/* entries in the initial ref advertisement to locally create refs pointing to the aforementioned tag objects. When using protocol v2, refs/tags/* entries in the initial ref advertisement may be suppressed by a ref-prefix argument, leading to the tag object being downloaded, but the ref not being created. Commit `dcc73cf7ff` ("fetch: generate ref-prefixes when using a configured refspec", 2018-05-18) ensured that "refs/tags/" is always sent as a ref prefix when "git fetch" is invoked with no refspecs, but not when "git fetch" is invoked with refspecs. Extend that functionality to make it work in both situations. This also necessitates a change to another test which tested ref advertisement filtering using tag refs - since tag refs are sent by default now, the test has been switched to using branch refs instead. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-24 08:54:17 -07:00
Jonathan Tan	15cfc985e0	t5702: test fetch with multiple refspecs at a time Extend the protocol v2 tests to also test fetches with multiple refspecs specified. This also covers the previously uncovered cases of fetching with prefix matching and fetching by SHA-1. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-24 08:54:16 -07:00
Brandon Williams	bbb19a8b06	fetch-pack: mark die strings for translation Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 15:59:40 -07:00
SZEDER Gábor	1a96638e69	coccinelle: extract dedicated make target to clean Coccinelle's results Sometimes I want to remove only Coccinelle's results, but keep all other build artifacts left after my usual 'make all man' build. This new 'cocciclean' make target will allow just that. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 12:39:42 -07:00
SZEDER Gábor	f57d11728d	coccinelle: put sane filenames into output patches Coccinelle outputs its suggested transformations as patches, whose header looks something like this: --- commit.c +++ /tmp/cocci-output-19250-7ae78a-commit.c Note the lack of 'diff --opts <old> <new>' line, the differing number of path components on the --- and +++ lines, and the nonsensical filename on the +++ line. 'patch -p0' can still apply these patches, as it takes the filename to be modified from the --- line. Alas, 'git apply' can't, because it takes the filename from the +++ line, and then complains about the nonexisting file. Pass the '--patch .' options to Coccinelle via the SPATCH_FLAGS 'make' variable, as it seems to make it generate proper context diff patches, with the header starting with a 'diff ...' line and containing sane filenames. The resulting 'contrib/coccinelle/*.cocci.patch' files then can be applied both with 'git apply' and 'patch' (even without '-p0'). Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 12:38:16 -07:00
SZEDER Gábor	ac1e31d5ca	coccinelle: exclude sha1dc source files from static analysis sha1dc is an external library, that we carry in-tree for convenience or grab as a submodule, so there is no use in applying our semantic patches to its source files. Therefore, exclude sha1dc's source files from Coccinelle's static analysis. This change also makes the static analysis somewhat faster: presumably because of the heavy use of repetitive macro declarations, applying the semantic patches 'array.cocci' and 'swap.cocci' to 'sha1dc/sha1.c' takes over half a minute each on my machine, which amounts to about a third of the runtime of applying these two semantic patches to the whole git source tree. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 12:37:47 -07:00
SZEDER Gábor	7cd3af5437	coccinelle: use $(addsuffix) in 'coccicheck' make target The dependencies of the 'coccicheck' make target are listed with the help of the $(patsubst) make function, which in this case doesn't do any pattern substitution, but only adds the '.patch' suffix. Use the shorter and more idiomatic $(addsuffix) make function instead. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 12:37:28 -07:00
SZEDER Gábor	0c7642562e	coccinelle: mark the 'coccicheck' make target as .PHONY The 'coccicheck' target doesn't create a file with the same name, so mark it as .PHONY. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 12:35:31 -07:00
Johannes Schindelin	0b7d324ee5	t7406: avoid failures solely due to timing issues Regression tests are automated tests which try to ensure a specific behavior. The idea is: if the test case fails, the behavior indicated in the test case's title regressed. If a regression test fails, even occasionally, for any reason other than to indicate the particular regression(s) it tries to catch, it is less useful than when it really only fails when there is a bug in the (non-test) code that needs to be fixed. In the instance of the test case "submodule update --init --recursive from subdirectory" of the script t7406-submodule-update.sh, the exact output of a recursive clone is compared with a pre-generated one. And this is a racy test because the structure of the submodules only guarantees a partial order. The 'none' and the 'rebasing' submodules can be cloned in any order, which means that a mismatch with the hard-coded order does not necessarily indicate a bug in the tested code. See for example: https://git-for-windows.visualstudio.com/git/_build/results?buildId=14035&view=logs To prevent such false positives from unnecessarily costing time when investigating test failures, let's take the exact order of the lines out of the equation by sorting them before comparing them. This test script seems not to have any more test cases that try to verify any specific order in which recursive clones process the submodules, therefore this is the only test case that is changed in this manner. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 12:22:55 -07:00
SZEDER Gábor	0860a7641b	travis-ci: fail if Coccinelle static analysis found something to transform Coccinelle's and in turn 'make coccicheck's exit code only indicates that Coccinelle managed to finish its analysis without any errors (e.g. no unknown --options, no missing files, no syntax errors in the semantic patches, etc.), but it doesn't indicate whether it found any undesired code patterns to transform or not. To find out the latter, one has to look closer at 'make coccicheck's standard output and look for lines like: SPATCH result: contrib/coccinelle/<something>.cocci.patch And this only indicates that there is something to transform, but to see what the suggested transformations are one has to actually look into those '.cocci.patch' files. This makes the automated static analysis build job on Travis CI not particularly useful, because it neither draws our attention to Coccinelle's findings, nor shows the actual findings. Consequently, new topics introducing undesired code patterns graduated to master on several occasions without anyone noticing. The only way to draw attention in such an automated setting is to fail the build job. Therefore, modify the 'ci/run-static-analysis.sh' build script to check all the resulting '.cocci.patch' files, and fail the build job if any of them turns out to be not empty. Include those files' contents, i.e. Coccinelle's suggested transformations, in the build job's trace log, so we'll know why it failed. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 12:08:38 -07:00
SZEDER Gábor	4ab8d1af33	travis-ci: run Coccinelle static analysis with two parallel jobs Currently the static analysis build job runs Coccinelle using a single 'make' job. Using two parallel jobs cuts down the build job's run time from around 10-12mins to 6-7mins, sometimes even under 6mins (there is quite large variation between build job runtimes). More than two parallel jobs don't seem to bring further runtime benefits. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 12:08:36 -07:00
Nguyễn Thái Ngọc Duy	6b5b309f5e	transport-helper.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	68e39e4100	transport.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	259328b731	sha1-file.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	02127c639b	sequencer.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	b73c6e3a0d	replace-object.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	1b5e07bbf0	refspec.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	661558f0a5	refs.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	c60d7697d1	pkt-line.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	42246589b8	object.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	31a55e91bc	exec-cmd.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	cbb46ca78c	environment.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	a80897c1e9	dir.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	d26a328eaf	convert.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	aad6fddb0c	connect.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	a769bfc74f	config.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:10 -07:00
Nguyễn Thái Ngọc Duy	4f5b532d18	commit-graph.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Nguyễn Thái Ngọc Duy	225c62e067	builtin/replace.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Nguyễn Thái Ngọc Duy	f616db6a5c	builtin/pack-objects.c: mark more strings for translation Most of these are straightforward. GETTEXT_POISON does catch the last string in cmd_pack_objects(), but since this is --progress output, it's not supposed to be machine-readable. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Nguyễn Thái Ngọc Duy	5507067dbd	builtin/grep.c: mark strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Nguyễn Thái Ngọc Duy	1d28ff4ce6	builtin/config.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Nguyễn Thái Ngọc Duy	02f3fe5a9a	archive-zip.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Nguyễn Thái Ngọc Duy	d0482e697c	archive-tar.c: mark more strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Nguyễn Thái Ngọc Duy	1a07e59c3e	Update messages in preparation for i18n Many messages will be marked for translation in the following commits. This commit updates some of them to be more consistent and reduce diff noise in those commits. Changes are - keep the first letter of die(), error() and warning() in lowercase - no full stop in die(), error() or warning() if it's single sentence messages - indentation - some messages are turned to BUG(), or prefixed with "BUG:" and will not be marked for i18n - some messages are improved to give more information - some messages are broken down by sentence to be i18n friendly (by the same token, combine multiple warning() into one big string) - the trailing \n is converted to printf_ln if possible, or deleted if not redundant - error_errno() is used instead of explicit strerror() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 11:19:09 -07:00
Stefan Beller	79cb2ebb92	xdiff/histogram: remove tail recursion When running the same reproduction script as the previous patch, it turns out the stack is too small, which can be easily avoided. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-23 10:12:16 -07:00
Brandon Williams	402c47d939	clone: send ref-prefixes when using protocol v2 Teach clone to send a list of ref-prefixes, when using protocol v2, to allow the server to filter out irrelevant references from the ref-advertisement. This reduces wasted time and bandwidth when cloning repositories with a larger number of references. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 15:25:19 -07:00
Stefan Beller	1e83b9bfdd	Documentation/git-interpret-trailers: explain possible values Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 15:23:59 -07:00
SZEDER Gábor	ab29f1b329	t9300: wait for background fast-import process to die after killing it The five new tests added to 't9300-fast-import.sh' in `30e215a65c` (fast-import: checkpoint: dump branches/tags/marks even if object_count==0, 2017-09-28), all with the prefix "V:" in their test description, run 'git fast-import' in the background and then 'kill' it as part of a 'test_when_finished' cleanup command. When this test script is executed with Bash, some or even all of these tests tend to pollute the test script's stderr, and messages about terminated processes end up on the terminal: $ bash ./t9300-fast-import.sh <... snip ...> ok 179 - V: checkpoint helper does not get stuck with extra output /<...>/test-lib-functions.sh: line 388: 28383 Terminated git fast-import $options 0<&8 1>&9 ok 180 - V: checkpoint updates refs after reset ./t9300-fast-import.sh: line 3210: 28401 Terminated git fast-import $options 0<&8 1>&9 ok 181 - V: checkpoint updates refs and marks after commit ok 182 - V: checkpoint updates refs and marks after commit (no new objects) ./test-lib.sh: line 634: line 3250: 28485 Terminated git fast-import $options 0<&8 1>&9 ok 183 - V: checkpoint updates tags after tag ./t9300-fast-import.sh: line 3264: 28510 Terminated git fast-import $options 0<&8 1>&9 After a background child process terminates, its parent Bash process always outputs a message like those above to stderr, even when in non-interactive mode. But how do some of these messages end up on the test script's stderr, why don't we get them from all five tests, and why do they come from different file/line locations? Well, after sending the TERM signal to the background child process, it takes a little while until that process receives the signal and terminates, and then it takes another while until the parent process notices it. 
During this time the parent Bash process is continuing execution, and by the time it notices that its child terminated it might have already left 'test_eval_inner_' and its stderr is not redirected to /dev/null anymore. That's why such a message can appear on the test script's stderr, while other times, when the child terminates fast and/or the parent shell is slow enough, the message ends up in /dev/null, just like any other output of the test does. Bash always adds the file name and line number of the code location it was about to execute when it notices the termination of its child process as a prefix to that message, hence the varying and sometimes totally unrelated location prefixes in those messages (e.g. line 388 in 'test-lib-functions.sh' is 'test_verify_prereq', and I saw such a message pointing to 'say_color' as well). Prevent these messages from appearing on the test script's stderr by 'wait'-ing on the pid of the background 'git fast-import' process after sending it the TERM signal. This ensures that the executing shell's stderr is still redirected when the shell notices the termination of its child process in the background, and that these messages get a consistent file/line location prefix. Note that this is not an issue when the test script is run with Bash and '-v', because then these messages are supposed to go to the test script's stderr anyway, and indeed all of them do; though the sometimes seemingly random file/line prefixes could be confusing still. Similarly, it's not an issue with Bash and '--verbose-log' either, because then all messages go to the log file as they should. Finally, it's not an issue with some other shells (I tried dash, ksh, ksh93 and mksh) even without any of the verbose options, because they don't print messages like these in non-interactive mode in the first place. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 11:15:32 -07:00
Henning Schild	53fc999306	gpg-interface t: extend the existing GPG tests with GPGSM Add test cases to cover the new X509/gpgsm support. Most of them resemble existing ones. They just switch the format to x509 and set the signingkey when creating signatures. Validation of signatures does not need any configuration of git, but it does need gpgsm to be configured to trust the key(-chain). Several of the testcases build on top of existing gpg testcases. The commit ships a self-signed key for committer@example.com and configures gpgsm to trust it. Signed-off-by: Henning Schild <henning.schild@siemens.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-20 08:41:42 -07:00
Stefan Beller	64c4e8bccd	xdiff/xhistogram: move index allocation into find_lcs This fixes a memory issue when recursing a lot, which can be reproduced as seq 1 100000 >one seq 1 4 100000 >two git diff --no-index --histogram one two Before this patch, histogram_diff would call itself recursively before calling free_index, which would mean a lot of memory is allocated during the recursion and only freed afterwards. By moving the memory allocation (and its free call) into find_lcs, the memory is free'd before we recurse, such that memory is reused in the next step of the recursion instead of using new memory. This addresses only the memory pressure, not the run time complexity, that is also awful for the corner case outlined above. Helpful in understanding the code (in addition to the sparse history of this file), was https://stackoverflow.com/a/32367597 which reproduces most of the code comments of the JGit implementation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-19 12:46:03 -07:00
Stefan Beller	c671d4b599	xdiff/xhistogram: factor out memory cleanup into free_index() This will be useful in the next patch as we'll introduce multiple callers. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-19 12:46:01 -07:00
Stefan Beller	282098506f	xdiff/xhistogram: pass arguments directly to fall_back_to_classic_diff By passing the 'xpp' and 'env' argument directly to the function 'fall_back_to_classic_diff', we eliminate an occurrence of the 'index' in histogram_diff, which will prove useful in a bit. While at it, move it up in the file. This will make the diff of one of the next patches more legible. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-19 12:46:00 -07:00
Stefan Beller	626c0b5d39	diff.c: offer config option to control ws handling in move detection Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-19 12:02:54 -07:00
Stefan Beller	ca1f4ae4df	diff.c: add white space mode to move detection that allows indent changes The option of --color-moved has proven to be useful as observed on the mailing list. However when refactoring sometimes the indentation changes, for example when partitioning a function into smaller helper functions the code usually mostly moved around except for a decrease in indentation. To just review the moved code ignoring the change in indentation, a mode to ignore spaces in the move detection as implemented in a previous patch would be enough. However the whole move coloring as motivated in commit `2e2d5ac` (diff.c: color moved lines differently, 2017-06-30), brought up the notion of the reviewer being able to trust the move of a "block". As there are languages such as Python, which depend on proper relative indentation for the control flow of the program, ignoring any white space change in a block would not uphold the promises of `2e2d5ac` that allows reviewers to pay less attention to the inside of a block, as inside the reviewer wants to assume the same program flow. This new mode of white space ignorance will take this into account and will only allow the same white space changes per line in each block. This patch even allows only for the same change at the beginning of the lines. As this is a white space mode, it is made exclusive to other white space modes in the move detection. This patch brings some challenges, related to the detection of blocks. We need a wide net to catch the possible moved lines, but then need to narrow down to check if the blocks are still intact. Consider this example (ignoring block sizes): - A - B - C + A + B + C At the beginning of a block when checking if there is a counterpart for A, we have to ignore all space changes. However at the following lines we have to check if the indent change stayed the same. 
Checking if the indentation change did stay the same, is done by computing the indentation change by the difference in line length, and then assume the change is only in the beginning of the longer line, the common tail is the same. That is why the test contains lines like: - <TAB> A ... + A <TAB> ... As the first line starting a block is caught using a compare function that ignores white spaces unlike the rest of the block, where the white space delta is taken into account for the comparison, we also have to think about the following situation: - A - B - A - B + A + B + A + B When checking if the first A (both in the + and - lines) is a start of a block, we have to check all 'A' and record all the white space deltas such that we can find the example above to be just one block that is indented. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-19 12:02:54 -07:00
Jeff King	da4398d6a0	add core.usereplacerefs config option We can already disable replace refs using a command line option or environment variable, but those are awkward to apply universally. Let's add a config option to do the same thing. That raises the question of why one might want to do so universally. The answer is that replace refs violate the immutability of objects. For instance, if you wanted to cache the diff between commit XYZ and its parent, then in theory that never changes; the hash XYZ represents the total state. But replace refs violate that; pushing up a new ref may create a completely new diff. The obvious "if it hurts, don't do it" answer is not to create replace refs if you're doing this kind of caching. But for a site hosting arbitrary repositories, they may want to allow users to share replace refs with each other, but not actually respect them on the site (because the caching is more important than the replace feature). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-18 15:45:27 -07:00
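
The overflow arithmetic described in the message of commit `77aa03d6c7` can be checked with a small simulation. This is only an illustrative sketch in Python, not git's C code: the names `wrapping_outalloc`, `checked_outalloc` and `second_alloc_factor` are hypothetical, a 32-bit `size_t` is emulated by masking, and the checked variate merely stands in for the idea behind the `st_add`/`st_mult` helpers (refuse to continue rather than wrap) without reproducing their actual implementation.

```python
from fractions import Fraction

SIZE_MAX_32 = 2**32 - 1  # largest value a 32-bit size_t can hold (assumed width)


def wrapping_outalloc(sofar, insz):
    """Unchecked growth formula, truncated the way a 32-bit size_t would be."""
    return (sofar + insz * 2 + 32) & SIZE_MAX_32


def checked_outalloc(sofar, insz):
    """Same formula, but fail loudly instead of silently wrapping."""
    total = sofar + insz * 2 + 32
    if total > SIZE_MAX_32:
        raise OverflowError("size_t overflow while growing output buffer")
    return total


def second_alloc_factor(growth):
    """Second allocation as a multiple of N when output is `growth` times the input.

    After the first call, N output bytes are written and N * (1 - 1/growth)
    input bytes remain, so the next request is N + 2 * N * (1 - 1/growth),
    ignoring the constant 32.
    """
    return 1 + 2 * (1 - Fraction(1, growth))


# Doubling scenario from the message: N input bytes, just under 2^31.
n = 2**31 - 16
sofar, insz = n, n // 2
outalloc = wrapping_outalloc(sofar, insz)     # wraps to a value smaller than sofar
outsz = (outalloc - sofar - 1) & SIZE_MAX_32  # wraps to a very large size

print(outalloc, outsz)
print([str(second_alloc_factor(k)) for k in (2, 3, 4)])
```

With these inputs the unchecked size wraps to 0 while the derived `outsz` exceeds 2^31, which is the "small allocation, very large outsz" situation the message describes; `checked_outalloc(sofar, insz)` raises instead. The factors for growth 2, 3 and 4 come out as 2, 7/3 and 5/2, matching the (2N + 32), (7/3N + 32) and (5/2N + 32) figures in the message.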

52453 Commits