git-commit-vandalism

Author	SHA1	Message	Date
Junio C Hamano	51bb8adbc9	Merge branch 'km/avoid-cp-a' Portability fix. * km/avoid-cp-a: test: fix t7001 cp to use POSIX options	2014-04-16 13:38:55 -07:00
Junio C Hamano	5b713d990d	Merge branch 'km/avoid-bs-in-shell-glob' Portability fix. * km/avoid-bs-in-shell-glob: test: fix t5560 on FreeBSD	2014-04-16 13:38:52 -07:00
Erik Faye-Lund	cb005c1fdf	send-email: recognize absolute path on Windows On Windows, absolute paths might start with a DOS drive prefix, which these two checks failed to recognize. Unfortunately, we cannot simply use the file_name_is_absolute helper in File::Spec::Functions, because Git for Windows has an MSYS-based Perl, where this helper doesn't grok DOS drive-prefixes. So let's manually check for these in that case, and fall back to the File::Spec-helper on other platforms (e.g Win32 with native Perl) Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-16 11:51:16 -07:00
Jeff King	06bdc23b7e	config.c: mark die_bad_number as NORETURN This can help avoid -Wuninitialized false positives in git_config_int and git_config_ulong, as the compiler now knows that we do not return "ret" if we hit the error codepath. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-16 10:21:14 -07:00
Nguyễn Thái Ngọc Duy	39539495ac	index-pack: work around thread-unsafe pread() Multi-threaing of index-pack was disabled with `c0f8654` (index-pack: Disable threading on cygwin - 2012-06-26), because pread() implementations for Cygwin and MSYS were not thread safe. Recent Cygwin does offer usable pread() and we enabled multi-threading with `103d530f` (Cygwin 1.7 has thread-safe pread, 2013-07-19). Work around this problem on platforms with a thread-unsafe pread() emulation by opening one file handle per thread; it would prevent parallel pread() on different file handles from stepping on each other. Also remove NO_THREAD_SAFE_PREAD that was introduced in `c0f8654` because it's no longer used anywhere. This workaround is unconditional, even for platforms with thread-safe pread() because the overhead is small (a couple file handles more) and not worth fragmenting the code. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Tested-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-16 09:29:41 -07:00
Dave Borowitz	d5067112db	Makefile: allow static linking against libcurl This requires more flags than can be guessed with the old-style CURLDIR and related options, so is only supported when curl-config is present. Signed-off-by: Dave Borowitz <dborowitz@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-15 13:01:51 -07:00
Dave Borowitz	61a64fff4f	Makefile: use curl-config to determine curl flags curl-config should always be installed alongside a curl distribution, and its purpose is to provide flags for building against libcurl, so use it instead of guessing flags and dependent libraries. Allow overriding CURL_CONFIG to a custom path to curl-config, to compile against a curl installation other than the first in PATH. Depending on the set of features curl is compiled with, there may be more libraries required than the previous two options of -lssl and -lidn. For example, with a vanilla build of libcurl-7.36.0 on Mac OS X 10.9: $ ~/d/curl-out-7.36.0/lib/curl-config --libs -L/Users/dborowitz/d/curl-out-7.36.0/lib -lcurl -lgssapi_krb5 -lresolv -lldap -lz Use this only when CURLDIR is not explicitly specified, to continue supporting older builds. Signed-off-by: Dave Borowitz <dborowitz@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-15 13:01:49 -07:00
Felipe Contreras	3994e64d77	transport-helper: fix sync issue on crashes When a remote helper crashes while pushing we should revert back to the state before the push, however, it's possible that `git fast-export` already finished its job, and therefore has exported the marks already. This creates a synchronization problem because from that moment on `git fast-{import,export}` will have marks that the remote helper is not aware of and all further commands fail (if those marks are referenced). The fix is to tell `git fast-export` to export to a temporary file, and only after the remote helper has finishes successfully, move to the final destination. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-14 14:03:33 -07:00
Felipe Contreras	852e54bc0f	transport-helper: trivial cleanup It's simpler to store the file names directly, and form the fast-export arguments only when needed, and re-use the same strbuf with a format. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-14 14:03:33 -07:00
Felipe Contreras	0551a06c22	transport-helper: propagate recvline() error pushing It's cleaner, and will allow us to do something sensible on errors later. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-14 13:51:37 -07:00
Felipe Contreras	5931b33e20	remote-helpers: make recvline return an error Instead of exiting directly, make it the duty of the caller to do so. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-14 13:48:33 -07:00
Felipe Contreras	4a1b59c85f	transport-helper: remove barely used xchgline() It's only used once, we can just call the two functions inside directly. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-14 13:48:17 -07:00
Felipe Contreras	59d3924fbb	prompt: fix missing file errors in zsh zsh seems to have a bug while redirecting the stderr of the 'read' command: % read foo 2>/dev/null <foo zsh: no such file or directory: foo Which causes errors to be displayed when certain files are missing. Let's add a convenience function to manually check if the file is readable before calling "read". Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-14 13:27:23 -07:00
Felipe Contreras	7569accf41	remote-bzr: trivial test fix So that the committer is reset properly. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-14 13:25:28 -07:00
Kyle J. McKay	ff7a1c677a	test: fix t5560 on FreeBSD Since `fd0a8c2e` (first appearing in v1.7.0), the t/t5560-http-backend-noserver.sh test has used a backslash escape inside a ${} expansion in order to specify a literal '?' character. Unfortunately the FreeBSD /bin/sh does not interpret this correctly. In a POSIX compliant shell, the following: x='one?two?three' echo "${x#\?}" Would be expected to produce this: two?three When using the FreeBSD /bin/sh instead you get this: one?two?three In fact the FreeBSD /bin/sh treats the backslash as a literal character to match so that this: y='one\two\three' echo "${y#\?}" Produces this unexpected value: wo\three In this case the backslash is not only treated literally, it also fails to defeat the special meaning of the '?' character. Instead, we can use the [...] construct to defeat the special meaning of the '?' character and match it exactly in a way that works for the FreeBSD /bin/sh as well as other POSIX /bin/sh implementations. Changing the example like so: x='one?two?three' echo "${x#*[?]}" Produces the expected output using the FreeBSD /bin/sh. Therefore, change the use of \? to [?] in order to be compatible with the FreeBSD /bin/sh which allows t/t5560-http-backend-noserver.sh to pass on FreeBSD again. Signed-off-by: Kyle J. McKay <mackyle@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-11 13:21:56 -07:00
Kyle J. McKay	00764ca10e	test: fix t7001 cp to use POSIX options Since `11502468` and `04c1ee57` (both first appearing in v1.8.5), the t7001-mv test has used "cp -a" to perform a copy in several of the tests. However, the "-a" option is not required for a POSIX cp utility and some platforms' cp utilities do not support it. The POSIX equivalent of -a is -R -P -p. Change "cp -a" to "cp -R -P -p" so that the t7001-mv test works on systems with a cp utility that only implements the POSIX required set of options and not the "-a" option. Signed-off-by: Kyle J. McKay <mackyle@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-11 13:19:00 -07:00
Yiannis Marangos	426ddeead6	read-cache.c: verify index file before we opportunistically update it Before we proceed to opportunistically update the index (often done by an otherwise read-only operation like "git status" and "git diff" that internally refreshes the index), we must verify that the current index file is the same as the one that we read earlier before we took the lock on it, in order to avoid a possible race. In the example below git-status does "opportunistic update" and git-rebase updates the index, but the race can happen in general. 1. process A calls git-rebase (or does anything that uses the index) 2. process A applies 1st commit 3. process B calls git-status (or does anything that updates the index) 4. process B reads index 5. process A applies 2nd commit 6. process B takes the lock, then overwrites process A's changes. 7. process A applies 3rd commit As an end result the 3rd commit will have a revert of the 2nd commit. When process B takes the lock, it needs to make sure that the index hasn't changed since step 4. Signed-off-by: Yiannis Marangos <yiannis.marangos@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-10 12:27:58 -07:00
Yiannis Marangos	9aa91af036	wrapper.c: add xpread() similar to xread() It is a common mistake to call read(2)/pread(2) and forget to anticipate that they may return error with EAGAIN/EINTR when the system call is interrupted. We have xread() helper to relieve callers of read(2) from having to worry about it; add xpread() helper to do the same for pread(2). Update the caller in the builtin/index-pack.c and the mmap emulation in compat/. Signed-off-by: Yiannis Marangos <yiannis.marangos@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-10 12:18:55 -07:00
Felipe Contreras	880111c11b	completion: fix completing args of aliased "push", "fetch", etc. Some commands need the first word to determine the actual action that is being executed, however, the command is wrong when we use an alias, for example 'alias.p=push', if we try to complete 'git p origin ', the result would be wrong because __git_complete_remote_or_refspec() doesn't know where it came from. So let's override words[1], so the alias 'p' is override by the actual command, 'push'. Reported-by: Aymeric Beaumet <aymeric.beaumet@gmail.com> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-09 14:22:18 -07:00
dequis	62210887f7	remote-bzr: include authors field in pushed commits Tests-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-09 14:20:48 -07:00
Felipe Contreras	5ff569908d	remote-bzr: add support for older versions Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-09 14:20:48 -07:00
Felipe Contreras	867bf7b490	remote-hg: always normalize paths Apparently Mercurial can have paths such as 'foo//bar', so normalize all paths. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-09 14:20:48 -07:00
Felipe Contreras	fe45cfb518	remote-helpers: allow all tests running from any dir Commit `d3243d7` (test-bzr.sh, test-hg.sh: allow running from any dir) allowed the tests to run from any directory, however, it didn't update all the tests. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-09 14:20:47 -07:00
Junio C Hamano	68773ac915	Sync with 1.9.2 * maint: Git 1.9.2 doc/http-backend: missing accent grave in literal mark-up	2014-04-09 12:06:14 -07:00
Junio C Hamano	0bc85abb7a	Git 1.9.2 The second maintenance release for Git 1.9; contains all the fixes that are scheduled to appear in Git 2.0. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-09 12:04:34 -07:00
Junio C Hamano	3c9e56b75c	Merge branch 'jl/nor-or-nand-and' into maint * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"	2014-04-09 12:03:26 -07:00
Junio C Hamano	fbae3d9ace	Merge branch 'cn/fetch-prune-overlapping-destination' into maint * cn/fetch-prune-overlapping-destination: fetch: handle overlaping refspecs on --prune fetch: add a failing test for prunning with overlapping refspecs	2014-04-09 12:02:41 -07:00
Junio C Hamano	aba7af8e67	Merge branch 'mh/update-ref-batch-create-fix' into maint * mh/update-ref-batch-create-fix: update-ref: fail create operation over stdin if ref already exists	2014-04-09 12:01:28 -07:00
Junio C Hamano	b8a30194db	Merge branch 'jk/commit-dates-parsing-fix' into maint * jk/commit-dates-parsing-fix: t4212: loosen far-in-future test for AIX date: recognize bogus FreeBSD gmtime output	2014-04-09 11:59:38 -07:00
Junio C Hamano	693b407077	Merge branch 'jc/fix-diff-no-index-diff-opt-parse' into maint * jc/fix-diff-no-index-diff-opt-parse: diff-no-index: correctly diagnose error return from diff_opt_parse()	2014-04-09 11:59:16 -07:00
Junio C Hamano	efb4ec68b8	Merge commit 'doc/http-backend: missing accent grave in literal mark-up' * commit '5df05146d5cb94628a3dfc53063c802ee1152cec': doc/http-backend: missing accent grave in literal mark-up	2014-04-09 11:45:04 -07:00
Thomas Ackermann	5df05146d5	doc/http-backend: missing accent grave in literal mark-up Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-09 11:43:56 -07:00
Torsten Bögershausen	d813ab970d	utf8.c: partially update to version 6.3 Unicode 6.3 defines more code points as combining or accents. For example, the character "ö" could be expressed as an "o" followed by U+0308 COMBINING DIARESIS (aka umlaut, double-dot-above). We should consider that such a sequence of two codepoints occupies one display column for the alignment purposes, and for that, git_wcwidth() should return 0 for them. Affected codepoints are: U+0358..U+035C U+0487 U+05A2, U+05BA, U+05C5, U+05C7 U+0604, U+0616..U+061A, U+0659..U+065F Earlier unicode standards had defined these as "reserved". Only the range 0..U+07FF has been checked to see which codepoints need to be marked as 0-width while preparing for this commit; more updates may be needed. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-09 10:14:05 -07:00
Kirill Smelkov	22f4c27e68	mingw: activate alloca Both MSVC and MINGW have alloca(3) definitions in malloc.h, so by moving win32-compat alloca.h from compat/vcbuild/include/ to compat/win32/ , which is included by both MSVC and MINGW CFLAGS, we can make alloca() work on both those Windows environments. In MINGW, malloc.h has explicit check for GNUC and if it is so, defines alloca to __builtin_alloca, so it looks like we don't need to add any code to here-shipped alloca.h to get optimum performance. Compile-tested on Windows in MSysGit. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Acked-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-09 10:08:35 -07:00
Junio C Hamano	7bf272cc04	Update draft release notes to 2.0 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-08 12:11:17 -07:00
Junio C Hamano	2d1a5a5856	Merge branch 'maint' * maint: Update draft release notes to 1.9.2	2014-04-08 12:08:59 -07:00
Junio C Hamano	4d7ad08f6a	Update draft release notes to 1.9.2 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-08 12:08:34 -07:00
Junio C Hamano	360f852d24	Merge branch 'mm/status-porcelain-format-i18n-fix' into maint * mm/status-porcelain-format-i18n-fix: status: disable translation when --porcelain is used	2014-04-08 12:07:06 -07:00
Junio C Hamano	86b4c1639c	Merge branch 'bp/commit-p-editor' into maint * bp/commit-p-editor: run-command: mark run_hook_with_custom_index as deprecated merge hook tests: fix and update tests merge: fix GIT_EDITOR override for commit hook commit: fix patch hunk editing with "commit -p -m" test patch hunk editing with "commit -p -m" merge hook tests: use 'test_must_fail' instead of '!' merge hook tests: fix missing '&&' in test	2014-04-08 12:07:06 -07:00
Junio C Hamano	967f8c9184	Merge branch 'jk/pack-bitmap' * jk/pack-bitmap: pack-objects: do not reuse packfiles without --delta-base-offset add `ignore_missing_links` mode to revwalk	2014-04-08 12:00:33 -07:00
Junio C Hamano	d59c12d7ad	Merge branch 'jl/nor-or-nand-and' Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"	2014-04-08 12:00:28 -07:00
Junio C Hamano	9b30a0339d	Merge branch 'mh/update-ref-batch-create-fix' * mh/update-ref-batch-create-fix: update-ref: fail create operation over stdin if ref already exists	2014-04-08 12:00:22 -07:00
Junio C Hamano	b389e04031	Merge branch 'mr/opt-set-ptr' OPT_SET_PTR() implementation was broken on IL32P64 platforms; it turns out that the macro is not used by any real user. * mr/opt-set-ptr: parse-options: remove unused OPT_SET_PTR parse-options: add cast to correct pointer type to OPT_SET_PTR MSVC: fix t0040-parse-options crash	2014-04-08 12:00:17 -07:00
Junio C Hamano	ed15e20ba3	Merge branch 'ib/rev-parse-parseopt-argh' Finishing touch to a new topic scheduled for 2.0. * ib/rev-parse-parseopt-argh: rev-parse: fix typo in example on manpage	2014-04-08 12:00:09 -07:00
Junio C Hamano	48ae20513d	Merge branch 'mr/msvc-link-with-invalidcontinue' * mr/msvc-link-with-invalidcontinue: MSVC: link in invalidcontinue.obj for better POSIX compatibility	2014-04-08 11:59:46 -07:00
Junio C Hamano	b5a52fa6c6	Merge branch 'jc/rev-parse-argh-dashed-multi-words' Make sure that the help text given to describe the "<param>" part of the "git cmd --option=<param>" does not contain SP or _, e.g. "--gpg-sign=<key-id>" option for "git commit" is not spelled as "--gpg-sign=<key id>". * jc/rev-parse-argh-dashed-multi-words: parse-options: make sure argh string does not have SP or _ update-index: teach --cacheinfo a new syntax "mode,sha1,path" parse-options: multi-word argh should use dash to separate words	2014-04-08 11:59:27 -07:00
Junio C Hamano	bdb830c445	Merge branch 'jk/commit-dates-parsing-fix' Finishing touches for portability. * jk/commit-dates-parsing-fix: t4212: loosen far-in-future test for AIX date: recognize bogus FreeBSD gmtime output	2014-04-08 11:59:06 -07:00
Vlad Dogaru	4e49d95ece	git-p4: explicitly specify that HEAD is a revision 'git p4 rebase' fails with the following message if there is a file named HEAD in the current directory: fatal: ambiguous argument 'HEAD': both revision and filename Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' Take the suggestion above and explicitly state that HEAD should be treated as a revision. Signed-off-by: Vlad Dogaru <vdogaru@ixiacom.com> Acked-by: Pete Wyckoff <pw@padd.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-07 15:37:12 -07:00
Kirill Smelkov	7195fbfaf5	combine-diff: speed it up, by using multiparent diff tree-walker directly As was recently shown in "combine-diff: optimize combine_diff_path sets intersection", combine-diff runs very slowly. In that commit we optimized paths sets intersection, but that accounted only for ~ 25% of the slowness, and as my tracing showed, for linux.git v3.10..v3.11, for merges a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). In previous commit, we described the problem in more details, and reworked the diff tree-walker to be general one - i.e. to work in multiple parent case too. Now is the time to take advantage of it for finding paths for combine diff. The implementation is straightforward - if we know, we can get generated diff paths directly, and at present that means no diff filtering or rename/copy detection was requested(), we can call multiparent tree-walker directly and get ready paths. () because e.g. at present, all diffcore transformations work on diff_filepair queues, but in the future, that limitation can be lifted, if filters would operate directly on combine_diff_paths. Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log") and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges") before and after the patch are as follows: linux.git v3.10..v3.11 log log -c log -c --merges before 1.9s 16.4s 15.2s after 1.9s 2.4s 1.1s The result stayed the same. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-07 14:41:49 -07:00
Kirill Smelkov	72441af7c4	tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch hurts performance badly for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could skip recursing into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - \|t\| \|p1\| \|pn\| \|-\| \|--\| ... \|--\| imin = argmin(p1...pn) \| \| \| \| \| \| \|-\| \|--\| \|--\| \|.\| \|. \| \|. \| . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) \| \| v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - \|a\| \|b\| a < b -> a ∉ B -> D(A,B) += +a a↓ \|-\| \|-\| a > b -> b ∉ A -> D(A,B) += -b b↓ \| \| \| \| a = b -> investigate δ(a,b) a↓ b↓ \|-\| \|-\| \|.\| \|.\| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path ) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-04-07 14:40:46 -07:00

... 3 4 5 6 7 ...

36578 Commits